Determining word boundaries when no space exists in text

I am developing a text search feature for a viewer application and I run into PDFs quite often that do not use the space character to delineate word boundaries. For example, a text showing operator with individual glyph positioning will contain strings and positioning information like this:

[(de)15(grees)-262(and)-262(who)-262(w)10(ould)-262(contrib)20(ute)-262(an)]TJ

When the strings are concatenated the result is:

"degreesandwhowouldcontributean"

Without spaces it's not possible to split the string into words based on character information alone. It would appear the only information that could be used to guess word boundaries is the glyph positioning. I have tested the documents in Adobe Reader, and the application is able to correctly determine where the word boundaries are, so it must be doing so by examining the glyph positioning and metrics.
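To make the example concrete, here is a minimal sketch of how the TJ array can be split into its strings and positioning numbers. The `parse_tj` helper and the `TOKEN` pattern are my own illustrative names, and the parser assumes only simple literal strings (no escaped or nested parentheses, no hex strings), so it is not a general PDF tokenizer.

```python
import re

# Matches either a literal string "(...)" or a number inside a TJ array.
# Assumption: no escaped or nested parentheses and no hex strings.
TOKEN = re.compile(r"\(([^()\\]*)\)|(-?\d+(?:\.\d+)?)")

def parse_tj(array_text):
    """Return the array items in order: str for shown text, float for adjustments."""
    items = []
    for match in TOKEN.finditer(array_text):
        text, number = match.groups()
        if text is not None:
            items.append(text)
        else:
            items.append(float(number))
    return items

tj = "[(de)15(grees)-262(and)-262(who)-262(w)10(ould)-262(contrib)20(ute)-262(an)]"
items = parse_tj(tj)

# Concatenating only the strings loses every word boundary.
joined = "".join(item for item in items if isinstance(item, str))
print(joined)
```

The numbers are in thousandths of a unit of text space and are subtracted from the current position, so in horizontal writing a negative value such as -262 widens the gap, while small positive values such as 15 tighten it like kerning.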

My first approach was to get the glyph width of the space character and assume a space is any position advance greater than that width. The problem with this is the case where the font has been subsetted and the 'space' glyph is missing from the font.
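A rough sketch of this first approach, applied to the TJ array above, might look like the following. The function name `split_on_space_width` is hypothetical, and the space width of 250 is only an assumed value for illustration; a real implementation would read it from the font's width table.

```python
def split_on_space_width(items, space_width):
    """Insert a space wherever the gap exceeds the width of the space glyph.

    items: TJ array contents, str for shown text and numbers for adjustments.
    space_width: width of the space glyph in thousandths of a text space
                 unit, or None when the subsetted font has no space glyph.
    """
    if space_width is None:
        # This is the failure case described above.
        raise ValueError("font has no space glyph; heuristic not applicable")
    out = []
    for item in items:
        if isinstance(item, str):
            out.append(item)
        elif -item > space_width:
            # TJ numbers are subtracted from the position, so a negative
            # adjustment is a gap of -item thousandths.
            out.append(" ")
    return "".join(out)

items = ["de", 15, "grees", -262, "and", -262, "who", -262, "w", 10,
         "ould", -262, "contrib", 20, "ute", -262, "an"]

# 250 is an assumed space width, not taken from the actual font.
result = split_on_space_width(items, 250)
print(result)  # degrees and who would contribute an
```

Passing `None` for `space_width` raises an error, which is exactly where this approach stops being usable for subsetted fonts.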

My second approach was to calculate the average glyph width for the font, then assume any text advance greater than 33% of the average glyph width is a space. This works better, but it is still not a reliable general solution.
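The second approach can be sketched the same way. The name `split_on_average_width` is hypothetical, and the glyph widths below are made-up numbers for illustration only, not values from the actual font; the 0.33 ratio is the 33% threshold described above.

```python
def split_on_average_width(items, glyph_widths, ratio=0.33):
    """Insert a space wherever the gap exceeds ratio * average glyph width.

    items: TJ array contents, str for shown text and numbers for adjustments.
    glyph_widths: widths of the glyphs present in the font, in thousandths
                  of a text space unit.
    """
    average = sum(glyph_widths) / len(glyph_widths)
    threshold = ratio * average
    out = []
    for item in items:
        if isinstance(item, str):
            out.append(item)
        elif -item > threshold:
            # A negative TJ adjustment widens the gap by -item thousandths.
            out.append(" ")
    return "".join(out)

items = ["de", 15, "grees", -262, "and", -262, "who", -262, "w", 10,
         "ould", -262, "contrib", 20, "ute", -262, "an"]

# Made-up widths for illustration; the average is 486, so the threshold
# is about 160 thousandths.
widths = [500, 444, 278, 722]
result = split_on_average_width(items, widths)
print(result)  # degrees and who would contribute an
```

Because the threshold depends only on glyphs that are actually in the font, this still works when the space glyph has been removed, but a fixed ratio can misfire on text with wide letter spacing or heavy kerning.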

My question: does Adobe have a standard method for determining word boundaries when space characters are missing?
