michaelifer
New member
The Adobe PDF Reference explains that a PDF file can be described in terms of the following four aspects:
1. Objects: a PDF document is a data structure composed of a small set of basic data types.
2. File (physical) structure: determines how objects are stored in a PDF file, how they are accessed, and how they are updated. This structure is independent of the semantics of the objects.
3. Document structure: explains how the basic object types are used to represent the components of a PDF document, for example pages, images, fonts, annotations, etc.
4. Content streams: a PDF content stream contains a sequence of instructions that describe the appearance of a page or other graphical entity.
The page object is the most important object in a PDF. It holds the information needed to display the page, such as the fonts used, the content (text, images, etc.) and the size of the page. Some of this information can be read directly from the page object, but much of the actual data is stored in other objects that it refers to. The page's content is held in a stream object; the length of the stream (in bytes) must be given either directly or as a reference to another object that contains an integer giving the stream length. See the following figure:
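As a rough illustration of the stream length rule described above, here is a minimal Python sketch. The sample bytes, object numbers and the helper names find_object, stream_length and stream_bytes are all made up for this example, and the scanning is deliberately naive (it assumes generation 0 and that the stream data never contains the word "endobj"), so treat it as a sketch rather than a real parser.

```python
import re

# Hypothetical sample objects, invented for this sketch.
# Object 4 gives its /Length indirectly (via object 5); object 6 gives it directly.
SAMPLE = (
    b"4 0 obj\n<< /Length 5 0 R >>\nstream\nBT /F1 12 Tf (Hi) Tj ET\nendstream\nendobj\n"
    b"5 0 obj\n23\nendobj\n"
    b"6 0 obj\n<< /Length 11 >>\nstream\n0 0 m 1 1 l\nendstream\nendobj\n"
)

def find_object(data: bytes, num: int) -> bytes:
    """Return the bytes between 'num 0 obj' and the next 'endobj' (naive scan)."""
    pattern = rb"(?<!\d)" + str(num).encode() + rb"\s+0\s+obj(.*?)endobj"
    m = re.search(pattern, data, re.S)
    if m is None:
        raise KeyError(num)
    return m.group(1)

def stream_length(data: bytes, num: int) -> int:
    """Read /Length, following an indirect reference ('5 0 R') if there is one."""
    body = find_object(data, num)
    m = re.search(rb"/Length\s+(\d+)(\s+(\d+)\s+R)?", body)
    if m is None:
        raise ValueError("stream dictionary has no /Length")
    if m.group(2):
        # Indirect: the first number is an object number; that object holds the integer.
        return int(find_object(data, int(m.group(1))).strip())
    return int(m.group(1))

def stream_bytes(data: bytes, num: int) -> bytes:
    """Return exactly /Length bytes following the 'stream' keyword and its line ending."""
    body = find_object(data, num)
    start = body.index(b"stream") + len(b"stream")
    if body[start:start + 2] == b"\r\n":
        start += 2
    elif body[start:start + 1] == b"\n":
        start += 1
    return body[start:start + stream_length(data, num)]
```

Calling stream_bytes(SAMPLE, 4) follows the reference to object 5 to learn the length, while stream_bytes(SAMPLE, 6) reads the length straight from the dictionary.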

PDF document logical structure
As a structured file format, a PDF document is made up of units called "objects". Every object has a numeric label (its object number), so it can be referenced by other objects. Objects do not need to appear in any particular order inside the file; the order can be arbitrary. For example, in a PDF file with 3 pages, the object for page 3 may appear before the object for page 1. The only benefit of storing objects in order is that the file becomes easier to read. Unless you read the PDF structure in a text editor, you do not need to be concerned about the order.
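To make the point about arbitrary order concrete, here is a small Python sketch. The fragment below is invented for illustration (it is not a complete PDF), and index_objects and page_order are hypothetical helper names. It indexes objects by their number, so the position in the file stops mattering, and then reads the page order from the /Kids array of the page tree.

```python
import re

# Hypothetical fragment: the object for the third page comes first in the file.
FRAGMENT = (
    b"7 0 obj\n<< /Type /Page /Parent 3 0 R >>\nendobj\n"
    b"3 0 obj\n<< /Type /Pages /Kids [5 0 R 6 0 R 7 0 R] /Count 3 >>\nendobj\n"
    b"5 0 obj\n<< /Type /Page /Parent 3 0 R >>\nendobj\n"
    b"6 0 obj\n<< /Type /Page /Parent 3 0 R >>\nendobj\n"
)

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.S)

def index_objects(data: bytes) -> dict:
    """Map (object number, generation) -> object body, in whatever order they appear."""
    return {
        (int(m.group(1)), int(m.group(2))): m.group(3).strip()
        for m in OBJ_RE.finditer(data)
    }

def page_order(index: dict, pages_num: int) -> list:
    """Return the page object numbers in document order, as listed in /Kids."""
    body = index[(pages_num, 0)]
    kids = re.search(rb"/Kids\s*\[(.*?)\]", body, re.S).group(1)
    return [int(num) for num, _gen in re.findall(rb"(\d+)\s+(\d+)\s+R", kids)]

index = index_objects(FRAGMENT)
file_order = [num for num, _gen in index]   # order of appearance in the file
logical_order = page_order(index, 3)        # order of pages in the document
```

Here file_order starts with object 7 while logical_order lists it last, which is the situation the paragraph describes. A real reader would normally use the cross-reference table instead of scanning the whole file.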
The file trailer (Trailer) identifies the number of the root object and the position of the cross-reference table; by looking up that table, the catalog object (Catalog) can be found. The catalog is the root object of the PDF document and refers to the document outline (Outlines) and the page tree object (Pages). The outline object points to the bookmark tree of the PDF file; the page tree object (Pages) records the number of pages in the file and refers to each page object.
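A minimal sketch of reading the trailer in Python might look like the following. The tail bytes, the offset 187 and the function name read_trailer are assumptions made up for this example. It only handles the classic layout with a "trailer" keyword; files that use cross-reference streams (PDF 1.5 and later) have no such keyword and would need different handling.

```python
import re

# Hypothetical end of a PDF file; the numbers are invented for this sketch.
TAIL = (
    b"trailer\n"
    b"<< /Size 6 /Root 1 0 R >>\n"
    b"startxref\n"
    b"187\n"
    b"%%EOF\n"
)

def read_trailer(data: bytes):
    """Return (root object number, root generation, xref table offset)."""
    # The last 'startxref' is followed by the byte offset of the xref table.
    sx = data.rfind(b"startxref")
    if sx < 0:
        raise ValueError("no startxref keyword found")
    xref_offset = int(data[sx + len(b"startxref"):].split()[0])

    # The trailer dictionary sits between 'trailer' and 'startxref'.
    tr = data.rfind(b"trailer", 0, sx)
    if tr < 0:
        raise ValueError("no trailer keyword found")
    root = re.search(rb"/Root\s+(\d+)\s+(\d+)\s+R", data[tr:sx])
    if root is None:
        raise ValueError("trailer has no /Root entry")
    return int(root.group(1)), int(root.group(2)), xref_offset

root_num, root_gen, xref_offset = read_trailer(TAIL)
```

With the root object number and the cross-reference offset in hand, the catalog can be located and from there the Outlines and Pages objects.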

PDF parsing process
Reviewing the detailed explanation above, the PDF parsing process can be summarized simply, as in the following illustration:
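The steps described above (trailer, cross-reference table, catalog, page tree, page, content stream) can also be sketched in Python. Everything here is an assumption for illustration: build_sample assembles a tiny hypothetical one-page file so that the offsets are consistent, and the helpers parse_xref, read_object, ref and first_page_content are made-up names. The sketch assumes a classic xref table, a direct /Length, a single /Kids level and LF line endings.

```python
import re

def build_sample() -> bytes:
    """Assemble a tiny hypothetical one-page PDF, computing the xref offsets."""
    content = b"BT /F1 12 Tf (Hi) Tj ET"
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)

def parse_xref(data: bytes, xref_pos: int) -> dict:
    """Map object number -> byte offset for in-use entries of a classic xref table."""
    lines = data[xref_pos:].splitlines()
    if lines[0].strip() != b"xref":
        raise ValueError("no xref table at the given offset")
    offsets, i = {}, 1
    while i < len(lines) and lines[i].strip() != b"trailer":
        first, count = map(int, lines[i].split())
        for k in range(count):
            off, _gen, kind = lines[i + 1 + k].split()
            if kind == b"n":
                offsets[first + k] = int(off)
        i += 1 + count
    return offsets

def read_object(data: bytes, offsets: dict, num: int) -> bytes:
    """Return the body of object 'num' by jumping to its offset."""
    start = offsets[num]
    end = data.index(b"endobj", start)
    header = re.match(rb"(\d+)\s+(\d+)\s+obj", data[start:end])
    return data[start + header.end():end].strip()

def ref(body: bytes, key: bytes) -> int:
    """Object number of an indirect reference such as '/Pages 2 0 R'."""
    return int(re.search(rb"/" + key + rb"\s+(\d+)\s+\d+\s+R", body).group(1))

def first_page_content(data: bytes) -> bytes:
    """Walk trailer -> xref -> Catalog -> Pages -> first Page -> content stream."""
    xref_pos = int(re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", data).group(1))
    offsets = parse_xref(data, xref_pos)
    trailer = data[data.rindex(b"trailer"):]
    catalog = read_object(data, offsets, ref(trailer, b"Root"))
    pages = read_object(data, offsets, ref(catalog, b"Pages"))
    first_kid = re.search(rb"/Kids\s*\[\s*(\d+)\s+\d+\s+R", pages)
    page = read_object(data, offsets, int(first_kid.group(1)))
    stream_obj = read_object(data, offsets, ref(page, b"Contents"))
    length = int(re.search(rb"/Length\s+(\d+)", stream_obj).group(1))
    start = stream_obj.index(b"stream") + len(b"stream\n")
    return stream_obj[start:start + length]
```

Running first_page_content(build_sample()) returns the drawing instructions of the single page. Real files add complications this sketch ignores, such as compressed streams, nested page trees and incremental updates.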
