Hi all,
I am facing a big problem with text extraction from a PDF file.
Currently I am using CogniView's PDF2XL text extraction tool.
About 95% of the text extracts correctly, but a few characters show up as a box, some as a "?", and some as a dotted circle mark.
Fonts used:
ArialUnicodeMS (Embedded Subset)
Type: TrueType (CID)
Encoding: Identity-H
TimesNewRomanPSMT
Type: TrueType
Actual Font: TimesNewRomanPSMT
Actual Font Type: TrueType
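To narrow down which characters are failing, a small check like the one below could be run on the extracted text. This is only a rough sketch in Python and has nothing to do with PDF2XL itself: the function name report_suspects and the sample string are made up for illustration, and the set of "suspect" characters is my assumption about what the box, replacement mark, and dotted circle are in Unicode.

```python
import unicodedata

# Assumed code points for the symptoms described above:
# U+25A1 (box), U+FFFD (replacement character), U+25CC (dotted circle).
SUSPECT = {"\u25a1", "\ufffd", "\u25cc"}

def report_suspects(text):
    """Return (index, code point, Unicode name) for each suspicious character."""
    hits = []
    for i, ch in enumerate(text):
        # Private Use Area characters (category "Co") often indicate that the
        # font subset has no usable glyph-to-Unicode mapping.
        if ch in SUSPECT or unicodedata.category(ch) == "Co":
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

# Made-up sample standing in for real extracted text.
sample = "Text \u25a1 with \ufffd and \u25cc marks"
for hit in report_suspects(sample):
    print(hit)
```

If most of the hits fall in text set in the embedded ArialUnicodeMS subset, that would suggest the problem is in how that font's glyphs are mapped back to Unicode rather than in the tool's handling of ordinary TrueType text.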
Could anyone please help me resolve this?
Regards
Gilbert.X