Channel: Adobe Community : Popular Discussions - PDF Language and Specifications

Character Positions in a PDF

I already have a program that takes a pdf page and converts it into a jpeg.

I'm trying to write a program that finds a bounding box for each character on the page so that I can then highlight this text on top of the jpeg.

I have been successful in doing this for the core 14 fonts.

Right now I'm looking at TrueType fonts and am confused. I get the widths from the font dictionary's /Widths array and the FontBBox from the FontDescriptor, and the FontBBox looks like this: [-665 -325 2000 1006]. Pretty normal, right? And then I run through the page contents and get commands like this:

BT
/TT2 1 Tf
7.98 0 0 7.98 72 29.04 Tm
/Cs6 cs 0 0 0 scn 0.0008 Tc 9.4883 Tw
(SomeText_) Tj
ET

I then compute the bounding box dimensions for each character like this:

width = charWidth * Tfs * Th (char width from /Widths array)
height = FontBBox.height * Tfs

The problem is that FontBBox.height is 1.331 (that is, 1006 - (-325) = 1331 glyph units, divided by 1000) and Tfs is 1, because it was just set by the Tf command. So I get a height of 1.331, when the actual height is probably around 10.

So there must be some other scaling parameter for TrueType fonts?

Wait a second... would the scale factor happen to be that 7.98 from the text matrix (Tm)?
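
To check that guess, here is a minimal sketch in Python that runs the glyph box through the text matrix as well as Tfs. It assumes the CTM is the identity, horizontal scaling Th is 1, text rise is 0, and glyph space is 1/1000 of text space. The function name char_box and the glyph width of 500 are made up for illustration; they are not from the PDF above.

```python
def char_box(width_units, font_bbox, tfs, tm, th=1.0, rise=0.0):
    """Bounding box (x0, y0, x1, y1) of one glyph placed at the text-matrix origin.

    width_units and font_bbox are in glyph space (thousandths of a text unit).
    tm is the six-number text matrix (a, b, c, d, e, f) from the Tm operator.
    Assumes the CTM is the identity (an assumption, not checked here).
    """
    a, b, c, d, e, f = tm

    # Glyph space -> text space: divide by 1000, then apply Tfs (and Th, rise).
    w = width_units / 1000.0 * tfs * th
    y_lo = font_bbox[1] / 1000.0 * tfs + rise
    y_hi = font_bbox[3] / 1000.0 * tfs + rise

    # Text space -> user space: [x y 1] times the text matrix.
    corners = [(0.0, y_lo), (w, y_lo), (0.0, y_hi), (w, y_hi)]
    pts = [(a * x + c * y + e, b * x + d * y + f) for x, y in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Values from the content stream above; 500 is a hypothetical glyph width.
tm = (7.98, 0.0, 0.0, 7.98, 72.0, 29.04)
bbox = (-665, -325, 2000, 1006)
x0, y0, x1, y1 = char_box(500, bbox, tfs=1.0, tm=tm)
print("height:", y1 - y0)
```

With these numbers the height comes out to 1.331 * 7.98, about 10.6, which matches the "around 10" I expected. So under these assumptions the 7.98 in the Tm operator does look like the missing scale factor.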
