reuse embedded objects like images in pdf

July 9, 2010, 4:34 am

Hello!

I was just wondering whether there is a way of reusing embedded objects (such as images) inside a PDF.

For example, say I have a graphic on the first page and I want to use the same graphic on the third page without storing it twice in the same PDF file.

If this is possible, could you also reuse text or text blocks inside a PDF?

Thanks for your help!

Kind regards,

Johannes

↧

Secure PDF using certificate

September 21, 2010, 9:42 pm

Hi All,

I'm developing a secure PDF using a certificate.

When I open the PDF, Acrobat shows a security message saying: "A digital ID was used to encrypt this document but no digital ID is present to decrypt it. Make sure your digital ID is properly installed or contact the document author."

Any help in resolving this issue is appreciated.

↧

Can't open libHaru-generated PDF with Reader, other readers work.

October 13, 2011, 11:02 am

Our application uses libHaru to export PDF images. These PDFs can be opened using PDFlite on Windows, Preview on Mac and some other readers. However, I have tried several versions of Adobe Reader on both platforms and the files fail to open with the message: "There was an error processing a page. There was a problem reading this document (14)." So it seems that Adobe Reader believes there is a problem with this file. However, from my very limited understanding of PDF structure, I have no idea what might be wrong, so I don't know how to fix it. The file is written unencrypted. It is about as simple a PDF as we can generate: a red-filled rectangle. I would greatly appreciate any suggestions or help.

Here is a sample PDF:

%PDF-1.3

%∑æ≠™

1 0 obj

/Type /Catalog

/Pages 2 0 R

endobj

2 0 obj

/Type /Pages

/Kids [ 4 0 R ]

/Count 1

endobj

3 0 obj

/Producer (Haru Free PDF Library 2.3.0-dev)

endobj

4 0 obj

/Type /Page

/MediaBox [ 0 0 236 207 ]

/Contents 5 0 R

/Resources <<

/ProcSet [ /PDF /Text /ImageB /ImageC /ImageI ]

/Pattern <<

/Type /Pattern

/PatternType 1

/PaintType 2

/TilingType 2

/BBox [ 0 0 100 100 ]

/XStep 100

/YStep 100

/Parent 2 0 R

endobj

5 0 obj

/Length 6 0 R

stream

1 0 0 -1 -118 296 cm

1 w

0 0 0 RG

[] 0 d

1 0.3 0.3 rg

% Rect

119.25 90 m

352.5 90 l

352.5 294.75 l

119.25 294.75 l

119.25 90 l

endstream

endobj

6 0 obj

135

endobj

xref

0 7

0000000000 65535 f

0000000015 00000 n

0000000064 00000 n

0000000123 00000 n

0000000188 00000 n

0000000458 00000 n

0000000647 00000 n

trailer

/Root 1 0 R

/Info 3 0 R

/Size 7

startxref

666

%%EOF

↧

Fill and stroke a rectangle

January 12, 2012, 6:14 am

Hello.

When I write the following content, Adobe Reader displays a red rectangle with a black border.

10 w

100 100 200 150 re

1 0 0 rg %red fill

0 0 0 RG %black stroke

B

But when I use the operators f and S, the red rectangle is displayed without the border.

10 w

100 100 200 150 re

1 0 0 rg %red fill

0 0 0 RG %black stroke

f

S

From a PDF reader's perspective, why is the border not displayed in the second example?

Isn't f+S=B ?
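
A toy model may help with the question above. It assumes the first example ends with B and the second with f followed by S, and it relies on the PDF rule that every path-painting operator paints the current path and then discards it. The function name `paint` and the simplified operator list are illustrative only, not part of any PDF library.

```python
# Toy model of the "current path" rule: each path-painting operator
# (f, S, B) paints the current path and then discards it.

def paint(operators):
    """Return the paint actions performed for a list of operators."""
    path = []      # segments of the current path
    actions = []
    for op in operators:
        if op == "re":
            path.append("rect")          # path construction
        elif op in ("f", "S", "B"):
            if path:                     # an empty path paints nothing
                if op in ("f", "B"):
                    actions.append("fill")
                if op in ("S", "B"):
                    actions.append("stroke")
            path = []                    # painting ends the path
    return actions


print(paint(["re", "B"]))       # ['fill', 'stroke']
print(paint(["re", "f", "S"]))  # ['fill']
```

In this model, f followed by S only strokes if the rectangle is constructed again between the two operators.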

Thank You.

↧

Pages and resources dictionary

December 17, 2012, 1:19 pm

When we read a Page object that has a Resources dictionary, do we also need to look at the Resources of the page's parent(s) and merge them?
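
As a rough sketch of the lookup in question: Resources is an inheritable page attribute, so under the usual reading of the spec a reader takes the page's own dictionary when present and only walks up the Parent chain when it is missing, with no merging. The plain Python dicts below stand in for parsed PDF objects, and `effective_resources` is a hypothetical helper name.

```python
def effective_resources(page):
    """Return the nearest /Resources found on the page or its ancestors."""
    node = page
    while node is not None:
        if "Resources" in node:
            return node["Resources"]   # nearest definition wins, no merging
        node = node.get("Parent")      # climb to the parent Pages node
    return {}                          # nothing defined anywhere


# Hypothetical page tree: the root Pages node defines a font resource.
root = {"Type": "Pages", "Resources": {"Font": {"F1": "font-object"}}}
page_without = {"Type": "Page", "Parent": root}
page_with = {"Type": "Page", "Parent": root, "Resources": {"XObject": {}}}

print(effective_resources(page_without))  # inherited from root
print(effective_resources(page_with))     # the page's own dictionary only
```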

Thank You

↧

Can't construct a valid cross reference stream.

September 27, 2014, 5:38 am

I'm building a PDF generation library from scratch.

Currently I'm having trouble generating a valid cross-reference stream, and I'm totally lost as to why it is invalid.

%PDF-1.7
%µ
0 0 obj
<<
/Pages 1 0 R
/Type /Catalog
>>
endobj
1 0 obj
<<
/Type /Pages
/Kids [2 0 R]
/Count 1
>>
endobj
2 0 obj
<<
/Parent 1 0 R
/Type /Page
/MediaBox [0 0 612 792]
/Contents 3 0 R
>>
endobj
3 0 obj
<<
/Length 0
>>
stream
endstream
endobj
4 0 obj
<<
/Type /XRef
/W [1 2 0]
/Size 6
/Length 16
>>
stream
...
endstream
endobj
startxref
254
%%EOF

The full pdf file can be found here: https://www.dropbox.com/s/mvn0xptf0lasb28/test.pdf?dl=0

According to the spec, a PDF file can consist of only objects, with the exception of the first line (the second is a comment) and the part from startxref onward.

Any tips would be greatly appreciated.

For simplicity, I've added the stream (extracted with a hex editor) below:

0A 01 00 0D 01 00 3E 01 00 77 01 00 CE 01 00 FE 0A

Note that the stream starts and ends with a newline character. There are 17 bytes and the last line ending is not part of the stream length.

The remaining 16 bytes contain 15 bytes of data (the first line ending is ignored, right?):

01 00 0D

01 00 3E

01 00 77

01 00 CE

01 00 FE

As far as I can tell this PDF file's cross reference stream is valid. Any help would be greatly appreciated!
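
The byte layout described above can be checked with a short sketch that splits the 15 data bytes into fields according to /W [1 2 0]. The function name `decode_xref_entries` is made up for illustration. Treating a zero-width first field as type 1 follows the spec's default; treating a zero-width third field as 0 is an assumption of this sketch.

```python
def decode_xref_entries(data, widths):
    """Split decoded xref-stream bytes into (type, field2, field3) tuples."""
    w1, w2, w3 = widths
    row = w1 + w2 + w3
    if len(data) % row:
        raise ValueError("data is not a whole number of entries")
    entries = []
    for i in range(0, len(data), row):
        f1 = data[i:i + w1]
        f2 = data[i + w1:i + w1 + w2]
        f3 = data[i + w1 + w2:i + row]
        entry_type = int.from_bytes(f1, "big") if w1 else 1  # default type 1
        field2 = int.from_bytes(f2, "big")
        field3 = int.from_bytes(f3, "big") if w3 else 0      # assumed default
        entries.append((entry_type, field2, field3))
    return entries


data = bytes.fromhex("01000D 01003E 010077 0100CE 0100FE")
for entry in decode_xref_entries(data, (1, 2, 0)):
    print(entry)
# (1, 13, 0)
# (1, 62, 0)
# (1, 119, 0)
# (1, 206, 0)
# (1, 254, 0)
```

The last decoded offset, 254, matches the startxref value in the posted file. The stream holds five entries.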

↧

Rectangle Subpath?

April 10, 2016, 10:17 pm

Hi everybody. I hope somebody can help me.

I am trying to interpret the subpath entries in my PDF file, so I made the PDF document below with a small table in it. It is just a simple table with one rectangle.

I realize the re operator is for rectangles, but I'm not sure why there are so many. Does the re operator refer to individual lines? There are 8 used here. See the internals of this PDF below.

I just want to know how this should be interpreted. I know the PDF reference says the values stand for (x, y, width, height) and that the x, y coordinates are for the lower-left corner (the origin) of the subpath. But why are there 8 entries here? There's only 1 rectangle. Or does each line represent a rectangle? I don't really know, and the PDF reference doesn't seem to tell me.

/GS1 gs

/TT2 1 Tf

10.98 0 0 10.98 72 758.0003 Tm

0 g

.0007 Tc

0 Tw

(test)Tj

/TT1 1 Tf

1.5628 0 TD

0 Tc

<0003>Tj

66.06 769.52 .48 .48004 re

66.06 769.52 463.2 .48004 re

528.78 769.52 .47998 .48004 re

66.06 751.58 .48 .47998 re

66.06 751.58 463.2 .47998 re

528.78 751.58 .47998 .47998 re

66.06 752.06 .48 17.46 re

528.78 752.06 .47998 17.46 re

↧

PDF password problem

June 17, 2010, 2:30 am

Hi,
I am developing an application in which I create PDF files following the PDF 1.3 specification. Everything is working; for data compression I am using FlateDecode. Now I have to implement password protection. I have read the complete specification on this and want to use the standard security handler, which I have also implemented. I am giving the PDF both a user and an owner password, but I am still not able to open the file.

For the owner password I am using the following steps. If anyone has implemented this, please check them and let me know.

For owner password:
1) I gave "jain"
2) After padding it is
jain(¿N^NuŠAd.NVÿú.....¶Ðh>€/.©þ

For padding I used
0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
0x2F, 0x0C, 0xA9, 0xFE, 0x64, 0x53, 0x69, 0x7A
3)MD5 hash this
output is Qåý†ÜzùÀù 7 Õ¹Î`

4) Get the 5-byte key for the RC4 algorithm
it is Qåý†Ü

5) Get the user password "atul"
6) after padding it is atul(¿N^NuŠAd.NVÿú.....¶Ðh>€/.©þ
7) Call the RC4 algorithm as

CryptAcquireContext (&vCryptProv, NULL, MS_ENHANCED_PROV, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT);

CryptCreateHash (vCryptProv, CALG_MD5, 0, 0, &vHashObj);

CryptHashData (vHashObj, (unsigned char*)pEncKey, 5, 0);

CryptDeriveKey (vCryptProv, CALG_RC4, vHashObj, CRYPT_EXPORTABLE, &vSessionKey);
CryptEncrypt (vSessionKey, 0, TRUE, 0, (unsigned char*)rawdata, pEncMsgLen, rawdatalen);

8) so final owner entry is
õ¶hÈHWVtMº|émÊöó^g(æO¨&LÃ”MYûÜý—

But I don't know whether it is correct or not. If you can, please verify these steps and let me know.
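
For comparison, here is a minimal sketch of the steps listed above as the standard security handler (revision 2, 40-bit key) describes them: pad the owner password, take the first 5 bytes of its MD5 hash, and use those 5 bytes directly as the RC4 key to encrypt the padded user password. The helper names (`pad_password`, `rc4`, `owner_entry`) are illustrative, and RC4 is written out by hand so the sketch needs only the standard library. It is not a verified reproduction of the values in the post.

```python
import hashlib

# 32-byte padding string from the PDF reference (same bytes as in the post).
PAD = bytes([
    0x28, 0xBF, 0x4E, 0x5E, 0x4E, 0x75, 0x8A, 0x41,
    0x64, 0x00, 0x4E, 0x56, 0xFF, 0xFA, 0x01, 0x08,
    0x2E, 0x2E, 0x00, 0xB6, 0xD0, 0x68, 0x3E, 0x80,
    0x2F, 0x0C, 0xA9, 0xFE, 0x64, 0x53, 0x69, 0x7A,
])


def pad_password(password: bytes) -> bytes:
    """Truncate or pad a password to exactly 32 bytes."""
    return (password + PAD)[:32]


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:                         # keystream generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def owner_entry(owner_pw: bytes, user_pw: bytes) -> bytes:
    """Compute the 32-byte /O value for revision 2 of the handler."""
    key = hashlib.md5(pad_password(owner_pw)).digest()[:5]
    return rc4(key, pad_password(user_pw))


print(owner_entry(b"jain", b"atul").hex())
```

One difference may be worth checking: the CryptoAPI calls in the post hash the 5-byte key again with MD5 and derive the RC4 key from that hash, whereas this sketch feeds the 5 bytes to RC4 unchanged.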

↧

Minimal PDF file set

January 18, 2015, 1:05 pm

Hello guys!

I'm trying to figure out the minimal instruction set needed to produce a valid blank PDF file.

https://dl.dropboxusercontent.com/u/1162023/Blank%20PDF.pdf - my best attempt. Still, when opened in Acrobat Reader, it asks the user to save something.

Am I miscounting the xref table values?
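
Since the question is about counting xref values, one way to avoid counting by hand is to let a script record each object's byte offset while writing. The sketch below builds a one-page blank file this way. The function name `build_blank_pdf`, the object layout, and the choice of PDF-1.4 are assumptions of this example. It has not been checked against Acrobat's behaviour and is not claimed to be the smallest possible file.

```python
def build_blank_pdf(width=612, height=792):
    """Return the bytes of a one-page blank PDF with computed xref offsets."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /Resources << >> "
        b"/MediaBox [0 0 %d %d] >>" % (width, height),
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))              # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)                       # byte offset of "xref"
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"           # each entry is exactly 20 bytes
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


with open("blank.pdf", "wb") as handle:
    handle.write(build_blank_pdf())
```

Writing in binary mode matters here, because any newline translation would shift the recorded offsets.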

Thanks in advance!

↧

ObjStm Documentation

February 9, 2011, 11:47 am

Hey folks,

I was wondering if there was any documentation available on ObjStm objects? I've found a number of PDFs that use this object, but have been unable to find documentation on it in the PDF specification or elsewhere.

I'm working with some really large PDFs with a lot of objects, and I'd like to compress them as much as possible. This seems to be one way I could do it. Any other suggestions are appreciated.

Thanks!

↧

Batch processing the document language

May 8, 2013, 8:17 am

Hi,

I'm trying to batch process a load of PDFs so that they have a document language.

I've looked into using JavaScript, but I can't find a variable that corresponds to it. I know it's in the end stream data.

Any suggestions?

Thanks

↧

Change the image under Form XObject to the first layer

December 29, 2013, 5:33 am

Hi there,

The right column is the file before structure optimization: all 5139 images are encapsulated in one form (fm0).

The left column shows the structure in which all the images have been moved to the first layer under XObject.

The change from right to left makes a difference in specific environments, for example:

1) Acrobat Pro 9 can open the file after the change.

2) Some workflows, like ONYX Thrive, can handle it.

The form content is actually not repeated, so I think this is a wrong application of Form XObject, leading to nested XObjects.

I am not a programmer, so I would like to know whether it's possible to optimize the structure with a high-level application such as Acrobat Pro.

If I have to use some low-level tools to make the change, please comment on which one would be easiest for me.

Many thanks

Kevin

↧

Decompressing a cross reference stream with Params.

August 28, 2014, 9:14 pm

I'm losing my mind trying to decompress this stream. I've asked everyone and am not having much luck with it. This is a cross reference stream with decode params.

I have used an external library for decompressing this and for some reason, the xref streams just cannot be decoded. I noticed that software designed to decompress pdf files such as Quick Pdf, VeryPdf etc will decompress all streams but these. So there must be something extra about the output of these that I don't understand.

A common error message is that the stream cannot be displayed because it contains binary data. Very odd. Any clues?

25 0 obj

<</DecodeParms<</Columns 4/Predictor 12>>/Filter/FlateDecode/ID[<6647557224A6C102A60F6D82BB22C18D><AA383B5CF85B7F4BACB9D502B93 343E9>]/Index[10 20]/Info 9 0 R/Length 64/Prev 23381/Root 11 0 R/Size 30/Type/XRef/W[1 2 1]>>stream

hÞbbd ``b`Š ~@‚ñ `Ù $Øù€ W P°~.............compressed stream data.

endstream

endobj
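
The dictionary above names two decoding stages: FlateDecode, then a predictor (/Predictor 12 with /Columns 4). A minimal sketch of both stages follows, assuming one byte per column and handling only the PNG filter types None (0) and Up (2). The function names are hypothetical, and `raw` stands for the exact bytes between stream and endstream, which cannot be recovered from the text pasted in the post.

```python
import zlib


def undo_png_predictor(data: bytes, columns: int) -> bytes:
    """Reverse PNG row filtering for filter types 0 (None) and 2 (Up)."""
    row_len = columns + 1                 # one filter-type byte per row
    if len(data) % row_len:
        raise ValueError("data is not a whole number of rows")
    prev = bytes(columns)                 # the row above the first is zeros
    out = bytearray()
    for start in range(0, len(data), row_len):
        filter_type = data[start]
        row = data[start + 1:start + row_len]
        if filter_type == 0:              # None: bytes stored as-is
            cur = bytes(row)
        elif filter_type == 2:            # Up: add the byte from the row above
            cur = bytes((b + p) & 0xFF for b, p in zip(row, prev))
        else:
            raise NotImplementedError("filter type %d" % filter_type)
        out += cur
        prev = cur
    return bytes(out)


def decode_xref_stream(raw: bytes, columns: int = 4) -> bytes:
    """Inflate the stream, then undo the predictor."""
    return undo_png_predictor(zlib.decompress(raw), columns)
```

The result is still binary: with /W [1 2 1], each 4-byte row is a type byte, a 2-byte field and a 1-byte field rather than readable text. That may be why viewers report that the stream contains binary data.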

↧

Changing creation date on adobe acrobat 9

November 5, 2014, 3:23 pm

I've got a folder full of old PDF files at work. They are all labeled DD-MM-YY to signify the date they were made, but the files were originally in Word format, and when they were converted to PDF they all took the conversion day as their creation date. I'm looking for a way to change the creation date on these hundreds of PDFs to match the date in the file name. Is there an easy way to do this? I have to print the documents and organize them chronologically, but I also need the creation dates to correspond with the documents. I'm not good with scripting or programming, but I am great at following step-by-step directions. I would greatly appreciate help with this, since it is a somewhat time-sensitive project.

Thank you.

↧

CIDFont and PDF

November 11, 2014, 6:20 am

Hi,

Thanks to lrosenth for the help with my question about support for CID-keyed Type 1 fonts in PDF.

I will make sure that I read the ISO PDF spec; it is clearer.

Sadly, I have another question:

I have a valid CIDFont (it consists of a readable PS part and also binary data which I believe is _parts_ of a CFF font program, not the whole program).

The readable PS part has info like byte offsets to CharStrings and also the table that maps CID values to font/glyph (CIDMap).

The FreeType library will render glyphs from this font.

The binary parts appear to be the CIDMap, CharStrings, etc., but not the usual CFF header stuff, i.e. just the bits you need.

My question: am I right that you can't put such a CIDFont in a PDF file as an embedded font program (as part of a composite CIDFontType0)?

From my reading of the spec, a composite font (Type0) in PDF references a CIDFont dict that doesn't have any of the info that the PS CIDFont contains (such as byte offsets to charstring data, etc.), and so it needs the complete CFF program, specified as /FontFile3 in the font descriptor.

regards

JLM

↧

I want to understand TJ command with array string

March 9, 2015, 12:19 am

In one of my PDFs, the content stream contains this code:

/F2 8.5 Tf

1 0 0 -1 0 7.0295 Tm

[IS, 12, B, 4, N: 978-1-449-32914-3] TJ

0 -16.2 Td

[[LS, 12, I]] TJ

Adobe displays the text block as:

ISBN: 978-1-449-32914-3

[LSI]

I want to understand the significance of the numbers inside [IS, 12, B, 4, N: 978-1-449-32914-3]. What do 12 and 4 stand for, given that the end result is ISBN: 978-1-449-32914-3?
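
As a sketch of how a TJ array is usually read: strings are shown in order, and each number is a position adjustment in thousandths of a text-space unit that is subtracted from the current horizontal position, so a positive number tightens the gap (kerning). The Python list below is a hand-made stand-in for the array in the post, and both function names are illustrative.

```python
def tj_adjustment(number, font_size, horizontal_scale=1.0):
    """Horizontal displacement, in text-space units, caused by a TJ number."""
    return -number / 1000.0 * font_size * horizontal_scale


def tj_text(array):
    """Concatenate only the string elements of a TJ array."""
    return "".join(item for item in array if isinstance(item, str))


array = ["IS", 12, "B", 4, "N: 978-1-449-32914-3"]
print(tj_text(array))            # ISBN: 978-1-449-32914-3
print(tj_adjustment(12, 8.5))    # roughly -0.102 units before "B"
print(tj_adjustment(4, 8.5))     # roughly -0.034 units before "N"
```

With the 8.5-unit font size from the Tf line, the adjustments are about a tenth of a unit or less, which is why the rendered text still reads as one word.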

↧

I downloaded Cativate 9 but it is in spanish. how can i change it to English?

September 8, 2015, 7:59 am

My Captivate download is in Spanish. How can I change it to English?

↧

Th operator and horizontal scaling?

September 9, 2015, 9:06 pm

I'm trying to find an example of a PDF with horizontal scaling in one of its Tm operators.

The other thing I want to find is a PDF file that uses a Th operator.

I'm thinking that the reason I can't find anything on this is that the Th operator is used for creating PDF files from scratch rather than for the automated writing of a PDF from any file (using Adobe writer).

My understanding is that if I have a Tm like this: 12 10 20 36 0 99

and I have a Text Font operator showing this:

/F2 1

then the horizontal scaling would be 20 x 1 = 20.

But if this is the case, then why would a PDF file contain a Th operator? Maybe they don't. I haven't found an example of one yet.

Also, are my assumptions about horizontal scaling above correct?

My colleague and I are trying to work this out so we can incorporate it into the PDF parsing tool we're writing for use in our workplace.
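
For reference, in the PDF specification Th is a text-state parameter (set by the Tz operator as a percentage), not an operator itself. It enters the text rendering matrix together with the font size, the text matrix Tm and the CTM. The sketch below computes that matrix for the numbers in the post, assuming Th = 100% and an identity CTM. The helper names are illustrative, and reading the scale as the length of the first row is one common convention rather than the only one.

```python
import math


def to3x3(m):
    """Expand a 6-number PDF matrix (a b c d e f) to 3x3 form."""
    a, b, c, d, e, f = m
    return [[a, b, 0], [c, d, 0], [e, f, 1]]


def matmul(x, y):
    """Multiply two 3x3 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def text_rendering_matrix(tm, font_size, h_scale=1.0, rise=0.0,
                          ctm=(1, 0, 0, 1, 0, 0)):
    """Trm = [Tfs*Th 0 0; 0 Tfs 0; 0 Trise 1] x Tm x CTM."""
    params = [[font_size * h_scale, 0, 0], [0, font_size, 0], [0, rise, 1]]
    return matmul(matmul(params, to3x3(tm)), to3x3(ctm))


def horizontal_scale(trm):
    """Length of the transformed horizontal unit vector."""
    return math.hypot(trm[0][0], trm[0][1])


trm = text_rendering_matrix((12, 10, 20, 36, 0, 99), font_size=1)
print(trm[0][:2])                # [12.0, 10.0]
print(horizontal_scale(trm))     # about 15.62
```

In this reading the horizontal extent comes from the first two numbers of Tm (12 and 10); the third number (20) belongs to the second row, which governs the vertical direction.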

↧

Constructing PDF using PDF language

December 30, 2015, 10:37 am

If I want to construct a PDF through scripting, can I just write PDF syntax in a text file (as specified in the PDF reference manual) and save it as a .pdf file? Is there an alternative approach?

↧

cross reference stream confusion (PDF 32000-1:2008)

August 19, 2016, 8:32 am

I am trying to understand how the /Index value works in cross-reference stream dictionaries. According to Table 17 in section 7.5.8.2, the default value for this key is the array [0 Size]. The default makes sense; my confusion is about the non-default case. The table states that the value should consist of:

"An array containing a pair of integers for each subsection in this section"

Suppose I have two subsections: the first with s1 objects, starting with object n1, and the second with s2 objects, starting with object n2. Would I encode the xref table (with s1+s2 rows) into a single xref stream, and then have the /Index value consist of the array [n1 s1 n2 s2]?
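
Under that reading, the rows of the single stream are assigned to object numbers by walking the (first, count) pairs in order. A small sketch, with a hypothetical helper name and made-up example numbers:

```python
def object_numbers(index, size):
    """Map each xref-stream row to its object number using /Index."""
    if index is None:
        index = [0, size]                 # default value of /Index
    if len(index) % 2:
        raise ValueError("/Index must hold (first, count) pairs")
    numbers = []
    for first, count in zip(index[0::2], index[1::2]):
        numbers.extend(range(first, first + count))
    return numbers


# Two subsections: 2 objects starting at 0, then 3 objects starting at 7.
print(object_numbers([0, 2, 7, 3], size=10))   # [0, 1, 7, 8, 9]
```

Row i of the decoded stream then describes the i-th object number in that list, so the stream carries s1+s2 rows in total.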

Thanks very much for any help.

Patch

↧