<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Architecture on Quadrant IntegrityLens</title><link>https://docs.integritylens.quadrant.tools/architecture/</link><description>Recent content in Architecture on Quadrant IntegrityLens</description><generator>Hugo</generator><language>en</language><atom:link href="https://docs.integritylens.quadrant.tools/architecture/index.xml" rel="self" type="application/rss+xml"/><item><title>Text Extraction</title><link>https://docs.integritylens.quadrant.tools/architecture/extraction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.integritylens.quadrant.tools/architecture/extraction/</guid><description>&lt;h1 id="text-extraction"&gt;Text Extraction&lt;a class="anchor" href="#text-extraction"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Quadrant IntegrityLens chooses an extraction path based on the type of PDF,
balancing speed against accuracy.&lt;/p&gt;
&lt;div class="book-steps"&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;h2 id="embedded-text-fast-path"&gt;Embedded text (fast path)&lt;a class="anchor" href="#embedded-text-fast-path"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Most PDFs created from word processors have an embedded text layer. Extracting this text is fast (about 0.2 seconds) and produces high-quality results. Most student submissions take this path.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h2 id="broken-text-layer-detection"&gt;Broken text layer detection&lt;a class="anchor" href="#broken-text-layer-detection"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Some PDFs, particularly those generated by LaTeX, have a text layer that contains garbled characters. Quadrant IntegrityLens detects this by checking for specific Unicode indicators (standalone diaeresis characters) that signal a broken text layer, and then falls back to OCR automatically. No manual intervention is needed.&lt;/p&gt;</description></item><item><title>Analysis</title><link>https://docs.integritylens.quadrant.tools/architecture/analysis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.integritylens.quadrant.tools/architecture/analysis/</guid><description>&lt;h1 id="analysis"&gt;Analysis&lt;a class="anchor" href="#analysis"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;After text extraction, Quadrant IntegrityLens parses the document structure and runs
scanners concurrently.&lt;/p&gt;
&lt;div class="book-steps"&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;h2 id="structure-parsing"&gt;Structure parsing&lt;a class="anchor" href="#structure-parsing"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The extracted Markdown text is parsed to identify &lt;strong&gt;page boundaries&lt;/strong&gt; (from &lt;code&gt;&amp;lt;!-- page N --&amp;gt;&lt;/code&gt; markers), &lt;strong&gt;headings&lt;/strong&gt; (Markdown headings at any level), and &lt;strong&gt;sections&lt;/strong&gt; (text between headings). This structure allows each finding to be annotated with a precise location: page number, heading, and surrounding section text.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;h2 id="concurrent-scanning"&gt;Concurrent scanning&lt;a class="anchor" href="#concurrent-scanning"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;All enabled scanners run concurrently on the full text. Each scanner is independent and focuses on a specific type of AI indicator. Scanners declare which languages they support; when you set &lt;code&gt;--language&lt;/code&gt;, only scanners that support that language run. Language-independent scanners (Unicode and structural) always run.&lt;/p&gt;</description></item></channel></rss>