Hi, I am a total newbie to PDF development. I am working on an MS SQL Server procedure that looks for a specific keyword in an uploaded PDF. The uploaded PDF is stored in a binary column of a database table. The T-SQL functions seem to work when looking for keywords, but I want to further optimize the query by excluding all PDFs in which text search is not possible. For lack of a better term, I call these "Image PDFs", as opposed to those whose text can be searched, which I call "Text PDFs".
Is there any place or metadata in the file content itself that can help me differentiate an image PDF from a text PDF? I hope I have explained the problem clearly.
Thanks in advance for any leads.
Aamir.