Hello everyone.
I'm trying to work out exactly what is and isn't included in a PDF stream, and I'm still confused. I'll paste a section of the ISO 32000-1 PDF reference below.
I'm not sure, but these statements appear to contradict each other.
So I have a stream whose dictionary specifies a Length of 2215 bytes in its compressed form.
There is a carriage return and a line feed at the start and end of the stream data falling between the 'stream' and 'endstream' keywords.
So my data looks like this: stream CR LF Data Data Data CR LF endstream (where CR = carriage return and LF = line feed).
Before I remove the CR and LF from each end of the data, the total size of the stream is 2217 bytes (everything between the 'stream' and 'endstream' keywords). From the first paragraph below, it appears that I should be reading only the data between the carriage return and line feed pairs at each end, which brings the compressed size down to 2213 bytes (not 2215, as the stream's Length specifies).
If I follow the second paragraph, from Table 5 in relation to the stream Length, it appears that only the carriage return and line feed at the end of the stream are removed. So the stream to be decompressed would look like this: CR LF Data Data Data. This does match the Length specified for that stream, which is 2215 bytes.
When decompressing a stream, what should and shouldn't be included? Should I cut the CR and LF from the start, from the end, or from both? Note the red bolded section below: "lie between the end-of-line marker". I assume this means not inclusive, like saying "stand between those two people" (which doesn't mean stand on those two people and centre yourself). Yet the green bolded area in the second section doesn't mention the initial whitespace?
Perhaps this is what it means. The first whitespace character after the 'stream' keyword and the whitespace character preceding the 'endstream' keyword are ignored so the stream looks like this:
Original Stream Data before removing whitespace: CR LF Data Data Data CR LF
Actual Stream data to be decompressed (whitespace removed): LF Data Data Data CR
That last option produces a stream of 2215 bytes as well.
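To make the arithmetic behind these readings concrete, here is a minimal sketch. The payload is a hypothetical stand-in of 2213 zero bytes (only its length matters), and the dictionary keys are just labels for the interpretations discussed above; nothing here is taken from the actual file.

```python
# Hypothetical stand-in for the compressed payload; only its length matters.
EOL = b"\r\n"
payload = b"\x00" * 2213

# Everything between the 'stream' and 'endstream' keywords.
between_keywords = EOL + payload + EOL

candidates = {
    "everything between the keywords": between_keywords,
    "strip CR LF at both ends": between_keywords[2:-2],
    "strip only the trailing CR LF": between_keywords[:-2],
    "strip one byte at each end": between_keywords[1:-1],
}

sizes = {name: len(data) for name, data in candidates.items()}
for name, size in sizes.items():
    print(f"{size:5d}  {name}")
```

This only reproduces the byte counts (2217, 2213, 2215 and 2215); it does not say which reading is the correct one.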
Thanks
Under 'Stream Objects - General'
The keyword stream that follows the stream dictionary shall be followed by an end-of-line marker consisting of either a CARRIAGE RETURN and a LINE FEED or just a LINE FEED, and not by a CARRIAGE RETURN alone. The sequence of bytes that make up a stream lie between the end-of-line marker following the stream keyword and the endstream keyword; the stream dictionary specifies the exact number of bytes. There should be an end-of-line marker after the data and before endstream; this marker shall not be included in the stream length.
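Read literally, the passage above could be sketched as follows. This is only one possible reading, not a definitive parser: `read_stream_data` is a hypothetical helper, the sample bytes are made up, and the check of what follows the data assumes that a lone CR, a lone LF, or CR LF may appear before endstream.

```python
def read_stream_data(buf: bytes, length: int) -> bytes:
    """Return `length` bytes starting right after the EOL that follows 'stream'."""
    start = buf.index(b"stream") + len(b"stream")

    # The keyword must be followed by CR LF or by LF alone, never by CR alone.
    if buf[start:start + 2] == b"\r\n":
        start += 2
    elif buf[start:start + 1] == b"\n":
        start += 1
    else:
        raise ValueError("'stream' is not followed by CR LF or LF")

    data = buf[start:start + length]
    if len(data) != length:
        raise ValueError("buffer is shorter than the declared length")

    # An optional EOL may precede 'endstream'; it is not counted in the length.
    rest = buf[start + length:]
    for eol in (b"\r\n", b"\n", b"\r"):
        if rest.startswith(eol):
            rest = rest[len(eol):]
            break
    if not rest.startswith(b"endstream"):
        raise ValueError("declared length does not end at 'endstream'")
    return data


# Hypothetical object body, not taken from the file in question.
sample = b"<< /Length 5 >>\nstream\r\nHello\r\nendstream"
print(read_stream_data(sample, 5))  # b'Hello'
```

Under this reading, neither the EOL after the stream keyword nor the one before endstream is part of the data, so a declared length that is too short or too long fails the final check.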
AND
From table 5 in relation to the stream Length.
(Required) The number of bytes from the beginning of the line following the keyword stream to the last byte just before the keyword endstream. (There may be an additional EOL marker, preceding endstream, that is not included in the count and is not logically part of the stream data.) See 7.3.8.2, "Stream Extent", for further discussion.