# Architecture
Quadrant IntegrityLens processes documents through a pipeline of extraction, analysis, and reporting.

```mermaid
flowchart TD
    A[PDF Input] --> B{Text layer OK?}
    B -- Yes --> C[Embedded Text Extraction<br/>~0.2s]
    B -- No --> D[OCR with PaddleOCR<br/>~25s]
    C --> E[Markdown with Page Markers]
    D --> E
    E --> F[Parse Structure<br/>Pages + Headings]
    F --> G[Run Scanners Concurrently]
    G --> H[Annotate Findings<br/>Page, Heading, Section]
    H --> I[Sort by Position]
    I --> J[Terminal Display]
    I --> K[PDF Report]
```

- Text Extraction — how PDFs are converted to analysable text
- Analysis — how scanners process the text and produce findings
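
The analysis half of the flowchart (parse structure, run scanners concurrently, annotate findings, sort by position) can be illustrated with a minimal Python sketch. This is not IntegrityLens's actual code: the `<!-- page N -->` marker syntax, the `Finding` class, and the `parse_structure`, `annotate`, `scan_word`, and `analyse` helpers are all hypothetical names chosen for illustration, and `asyncio` is only one plausible way to run scanners concurrently.

```python
import asyncio
import re
from dataclasses import dataclass

# Hypothetical page-marker syntax; the real marker format is an assumption.
PAGE_MARKER = re.compile(r"^<!-- page (\d+) -->$")
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


@dataclass
class Finding:
    offset: int          # character offset of the match in the Markdown text
    message: str
    page: int = 0
    heading: str = ""


def parse_structure(markdown):
    """Return (line_start_offset, page, heading) for every line, in order."""
    index = []
    offset, page, heading = 0, 1, ""
    for line in markdown.splitlines(keepends=True):
        stripped = line.strip()
        if m := PAGE_MARKER.match(stripped):
            page = int(m.group(1))
        elif m := HEADING.match(stripped):
            heading = m.group(1)
        index.append((offset, page, heading))
        offset += len(line)
    return index


def annotate(finding, index):
    """Attach the page and heading of the line that contains the finding."""
    for start, page, heading in index:
        if start > finding.offset:
            break
        finding.page, finding.heading = page, heading
    return finding


async def scan_word(markdown, word):
    """Toy scanner (illustrative only): report every occurrence of `word`."""
    pattern = re.escape(word)
    return [Finding(m.start(), f"found {word!r}") for m in re.finditer(pattern, markdown)]


async def analyse(markdown, words):
    index = parse_structure(markdown)
    # Run all scanners concurrently, then merge their results.
    batches = await asyncio.gather(*(scan_word(markdown, w) for w in words))
    findings = [annotate(f, index) for batch in batches for f in batch]
    # Sort by position so output follows document order.
    return sorted(findings, key=lambda f: f.offset)
```

Calling `asyncio.run(analyse(text, ["beta", "alpha"]))` on extracted Markdown returns findings in document order, each carrying its page number and nearest preceding heading, whichever scanner produced it.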