Structural Pattern Scanners
These scanners detect formatting patterns and textual artifacts that are characteristic of AI-generated content. They run regardless of the selected language.
Label lists (label-list)
Detects AI-style lists where each item starts with a bold label followed by a description:
- **Clarity:** The text should be easy to understand.
- **Brevity:** Keep sentences short.
- **Accuracy:** Facts must be verified.
Students writing lists typically use plain bullet points without bold labels.
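As a rough illustration of the check described above, the following sketch counts list items that open with a bold label. The regular expression and the name `count_label_items` are assumptions made for this example and are not taken from the scanner itself.

```python
import re

# Hypothetical pattern: a bullet or number marker, then a bold label whose
# colon sits either inside the bold span ("**Label:**") or directly after
# it ("**Label**:"), then at least one character of description.
LABEL_ITEM = re.compile(
    r"^\s*(?:[-*+]|\d+[.)])\s+\*\*[^*\n]+?(?::\*\*|\*\*\s*:)\s+\S"
)


def count_label_items(text: str) -> int:
    """Count list items that start with a bold label and a description."""
    return sum(1 for line in text.splitlines() if LABEL_ITEM.match(line))
```

A real scanner would presumably require several such items in a row before reporting a finding; that threshold is not stated here, so the sketch only counts.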
Bold overuse (bold-overuse)
Detects paragraphs with excessive **bold** markup (three or more bold
phrases per paragraph). AI models, especially ChatGPT, tend to bold key
terms throughout their output.
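A minimal sketch of the per-paragraph count, assuming paragraphs are separated by blank lines and that only `**...**` markup counts as bold. The function name and the splitting rule are illustrative assumptions; only the threshold of three comes from the description above.

```python
import re

# Matches **bold** spans whose content neither starts nor ends with whitespace.
BOLD = re.compile(r"\*\*(?=\S)(.+?)(?<=\S)\*\*")
BOLD_THRESHOLD = 3  # "three or more bold phrases per paragraph"


def paragraphs_with_bold_overuse(text: str) -> list[tuple[int, int]]:
    """Return (paragraph_index, bold_count) for each paragraph over the threshold."""
    paragraphs = re.split(r"\n\s*\n", text)
    hits = []
    for index, paragraph in enumerate(paragraphs):
        count = len(BOLD.findall(paragraph))
        if count >= BOLD_THRESHOLD:
            hits.append((index, count))
    return hits
```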
Thematic breaks (thematic-break)
Detects Markdown horizontal rules (---, ***, ___). These appear when
AI-generated Markdown is pasted into a document without cleanup.
Confidence: Medium for one occurrence, High for multiple.
Emoji (emoji)
Detects emoji characters in academic text. Handles complex emoji sequences (ZWJ sequences, skin tone modifiers) correctly.
Confidence: High — emoji have no place in academic submissions.
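To show what treating a complex sequence as one emoji can mean in practice, here is a simplified sketch. The code point ranges are an assumption chosen for brevity and cover only part of the Unicode emoji set (flags and keycap sequences, for example, are left out); the scanner's actual tables are not documented here.

```python
import re

# Assumed, deliberately incomplete ranges: the main pictograph blocks plus
# the older Miscellaneous Symbols and Dingbats blocks.
_BASE = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
_TONE = "[\U0001F3FB-\U0001F3FF]"  # skin tone modifiers
# One unit: base character, optional variation selector, optional skin tone.
_UNIT = f"{_BASE}\uFE0F?{_TONE}?"
# Units joined by ZERO WIDTH JOINER (U+200D) form a single emoji sequence.
EMOJI_SEQUENCE = re.compile(f"{_UNIT}(?:\u200D{_UNIT})*")


def find_emoji(text: str) -> list[str]:
    """Return each emoji sequence in the text as one string."""
    return [match.group(0) for match in EMOJI_SEQUENCE.finditer(text)]
```

With this grouping, a thumbs-up with a skin tone modifier or a family joined by ZWJ characters is reported once rather than once per code point.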
Skipping heading levels (skipping-heading-levels)
Detects when Markdown headings skip levels (e.g., jumping from # to ###
without ##). This is a structural artifact of AI-generated outlines.
Confidence: Medium for one skip, High for multiple.
Rule of three (rule-of-three)
Detects the AI tendency to present items in groups of exactly three. When a disproportionate number of lists in a document have exactly three items, this pattern triggers.
The scanner calculates the ratio of three-item lists to all lists and triggers based on the ratio and absolute count.
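The ratio-plus-count trigger might be sketched as follows. The thresholds `min_ratio=0.6` and `min_count=3` are placeholders invented for this example, since the actual values are not given above; the list parsing is also simplified to runs of consecutive single-line items.

```python
import re

LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\S")


def list_lengths(text: str) -> list[int]:
    """Return the item count of each run of consecutive list-item lines."""
    lengths, current = [], 0
    for line in text.splitlines():
        if LIST_ITEM.match(line):
            current += 1
        elif current:
            lengths.append(current)
            current = 0
    if current:
        lengths.append(current)
    return lengths


def rule_of_three_triggers(text: str, min_ratio: float = 0.6, min_count: int = 3) -> bool:
    """Trigger when enough lists, in share and in number, have exactly three items.

    Both thresholds are hypothetical defaults, not the scanner's real values.
    """
    lengths = list_lengths(text)
    if not lengths:
        return False
    threes = sum(1 for n in lengths if n == 3)
    return threes >= min_count and threes / len(lengths) >= min_ratio
```

Requiring an absolute count alongside the ratio keeps a document with a single three-item list (ratio 1.0) from triggering.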
ChatGPT artifacts (chatgpt-artifacts)
Detects residual markup from ChatGPT, Grok, and Copilot that students forgot to remove:
- turn0search0, turn0search1 (ChatGPT web search markers)
- oaicite: references
- <tool_call> and similar XML-like tags
- Grok-specific markers
Confidence: High — these are definitive proof of AI tool usage.
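A sketch of how the markers named above could be matched with regular expressions. The pattern names and exact expressions are assumptions; Grok-specific markers are omitted because their form is not specified here.

```python
import re

# Hypothetical pattern table covering only the markers listed above.
ARTIFACT_PATTERNS = {
    "web-search-marker": re.compile(r"turn\d+search\d+"),
    "oaicite": re.compile(r"\boaicite:"),
    "tool-call-tag": re.compile(r"</?tool_call>"),
}


def find_artifacts(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) for every artifact found."""
    found = []
    for name, pattern in ARTIFACT_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((name, match.group(0)))
    return found
```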
Chatbot phrases (chatbot-phrases)
Detects collaborative language typical of chatbot conversations:
- “I hope this helps”
- “Feel free to ask”
- “As an AI language model”
- “Sure! Here is…”
The scanner matches both English and German variants of these phrases.
Confidence: High — these phrases directly reveal chatbot interaction.
Placeholder text (placeholder-text)
Detects unfilled template placeholders left behind from AI-generated text:
- [Describe your...]
- INSERT_URL
- [TODO]
- [Your name here]
Confidence: High — these indicate an AI template was used without complete customisation.
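One way to match the placeholders listed above is a single regular expression. This is a sketch: generalising `INSERT_URL` to any upper-case `INSERT_...` token is an assumption made for the example, and the bracket patterns cover only the forms shown.

```python
import re

PLACEHOLDER = re.compile(
    # Bracketed template prompts, matched case-insensitively.
    r"\[(?i:todo|describe your[^\]]*|your name here)\]"
    # Upper-case INSERT_ tokens such as INSERT_URL (assumed generalisation).
    r"|\bINSERT_[A-Z][A-Z_]*\b"
)


def find_placeholders(text: str) -> list[str]:
    """Return every unfilled placeholder found in the text."""
    return [match.group(0) for match in PLACEHOLDER.finditer(text)]
```

Restricting the bracket patterns to known prompts avoids flagging ordinary bracketed content such as numeric citations like [1].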
Colon-bullet cascade (colon-bullet-cascade)
Detects the exposition pattern where a prose line ending in a colon is immediately followed by a bulleted or numbered list, repeated several times across the document. AI models organise explanations this way by default: every idea is introduced with a short “prompt line:” and unpacked as a bullet list.
A single colon-to-list handoff is a perfectly normal writing move. Three or more across a document start to form a structural tell, and the signal strengthens with every further handoff.
Confidence: Low for 3–4 handoffs, Medium for 5–7, High for 8+.
AI meta-disclaimer signature (meta-disclaimer-signature)
Detects opener and closer boilerplate that AI chat assistants produce reflexively around their response bodies. Users routinely copy-paste these artefacts into documents together with the actual content:
- “Certainly! Here is…” / “Sure! Below is…”
- “I hope this helps” / “Happy to help”
- “Let me know if you need…” / “Feel free to ask…”
- “As an AI language model…”
The scanner covers both English and German variants, because students frequently paste English chat responses into German documents unchanged.
Confidence: High when a match sits in the first or last 300 characters of the document (where such artefacts overwhelmingly appear) or when two or more matches occur; Medium for a single body-only match.