Structural Pattern Scanners

These scanners detect formatting patterns and textual artifacts that are characteristic of AI-generated content. They run regardless of the selected language.

Label lists (label-list)

Detects AI-style lists where each item starts with a bold label followed by a description:

  • **Clarity:** The text should be easy to understand.
  • **Brevity:** Keep sentences short.
  • **Accuracy:** Facts must be verified.

Students writing lists typically use plain bullet points without bold labels.
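As a rough illustration of how such items might be recognised, the sketch below counts list lines that open with a bold label. The pattern `LABEL_ITEM` and the function `count_label_items` are hypothetical names chosen for this example, and the accepted bullet markers are an assumption; the scanner's actual rules may differ.

```python
import re

# Assumed shape of a "label" item: a bullet or number marker, then a bold
# label whose colon sits either inside (**Label:**) or just after (**Label**:)
# the bold markers, then the description.
LABEL_ITEM = re.compile(
    r"^\s*(?:[-*+•]|\d+[.)])\s+"
    r"(?:\*\*[^*\n]+:\*\*|\*\*[^*\n]+\*\*:)"
    r"\s+\S"
)


def count_label_items(text: str) -> int:
    """Count list items that start with a bold label and a colon."""
    return sum(1 for line in text.splitlines() if LABEL_ITEM.match(line))


sample = (
    "- **Clarity:** The text should be easy to understand.\n"
    "- **Brevity**: Keep sentences short.\n"
    "- Plain bullet without a label.\n"
    "- **Bold** but no colon."
)
label_count = count_label_items(sample)
```

Only the first two items count here: the third has no bold label, and the fourth is bold without a colon.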

Bold overuse (bold-overuse)

Detects paragraphs with excessive **bold** markup (three or more bold phrases per paragraph). AI models, especially ChatGPT, tend to bold key terms throughout their output.
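The per-paragraph count can be sketched as follows. The threshold of three comes from the description above; everything else (splitting paragraphs on blank lines, the `BOLD` pattern, the function name `overbolded_paragraphs`) is an assumption made for illustration.

```python
import re

# A bold phrase: text wrapped in ** with no whitespace directly inside the markers.
BOLD = re.compile(r"\*\*(?=\S)(.+?)(?<=\S)\*\*")
BOLD_THRESHOLD = 3  # "three or more bold phrases per paragraph"


def overbolded_paragraphs(text: str) -> list[int]:
    """Return the indexes of paragraphs that reach the bold threshold."""
    paragraphs = re.split(r"\n\s*\n", text)  # assumption: blank lines separate paragraphs
    return [
        index
        for index, paragraph in enumerate(paragraphs)
        if len(BOLD.findall(paragraph)) >= BOLD_THRESHOLD
    ]


sample = (
    "The **model** uses **attention** and **tokens**.\n\n"
    "Only **one** bold phrase here."
)
flagged = overbolded_paragraphs(sample)
```

In this sample only the first paragraph (index 0) is flagged.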

Thematic breaks (thematic-break)

Detects Markdown horizontal rules (---, ***, ___). These appear when AI-generated Markdown is pasted into a document without cleanup.

Confidence: Medium for one occurrence, High for multiple.
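A minimal sketch of this check, assuming CommonMark-style rules for what counts as a horizontal rule (up to three leading spaces, three or more identical `-`, `*` or `_` characters, optional spaces between them). The names are hypothetical, and the sketch deliberately ignores edge cases such as a `---` line that acts as a setext heading underline or delimits YAML front matter.

```python
import re
from typing import Optional

# Three or more of the same marker character on a line of their own.
THEMATIC_BREAK = re.compile(
    r"^ {0,3}([-*_])(?:[ \t]*\1){2,}[ \t]*$", re.MULTILINE
)


def thematic_break_confidence(text: str) -> Optional[str]:
    """Map the number of horizontal rules to the confidence levels above."""
    count = sum(1 for _ in THEMATIC_BREAK.finditer(text))
    if count == 0:
        return None
    return "medium" if count == 1 else "high"


single = thematic_break_confidence("Intro\n\n---\n\nBody")
multiple = thematic_break_confidence("a\n\n***\n\nb\n\n___\n")
```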

Emoji (emoji)

Detects emoji characters in academic text. Handles complex emoji sequences (ZWJ sequences, skin tone modifiers) correctly.

Confidence: High — emoji have no place in academic submissions.
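To show why sequence handling matters, here is a simplified sketch that groups a base emoji with an optional variation selector, an optional skin tone modifier, and any further emoji attached by a zero-width joiner, so that a family emoji or a thumbs-up with a skin tone counts once. The code point ranges are a rough approximation chosen for this example, not the full Unicode emoji property set (flags, for instance, are not covered), and all names are hypothetical.

```python
import re

# Approximate ranges only: main pictograph blocks plus Misc Symbols/Dingbats.
BASE = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
SKIN_TONE = "[\U0001F3FB-\U0001F3FF]"
VS16 = "\uFE0F"  # variation selector requesting emoji presentation
ZWJ = "\u200D"   # zero-width joiner that glues emoji into one sequence

UNIT = f"{BASE}{VS16}?{SKIN_TONE}?"
EMOJI_SEQUENCE = re.compile(f"{UNIT}(?:{ZWJ}{UNIT})*")


def find_emoji(text: str) -> list[str]:
    """Return each emoji sequence in the text as a single string."""
    return EMOJI_SEQUENCE.findall(text)


thumbs_up = "\U0001F44D\U0001F3FD"                     # thumbs up + medium skin tone
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man ZWJ woman ZWJ girl
found = find_emoji(f"Great work {thumbs_up} from the {family} team!")
```

Without the joiner and modifier handling, the same text would report five separate emoji instead of two.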

Skipping heading levels (skipping-heading-levels)

Detects when Markdown headings skip levels (e.g., jumping from # to ### without ##). This is a structural artifact of AI-generated outlines.

Confidence: Medium for one skip, High for multiple.
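The check can be sketched by collecting heading levels in document order and counting every step that goes more than one level deeper. The sketch assumes ATX-style headings (`#` to `######`) and does not exclude `#` lines inside fenced code blocks; the function names are hypothetical.

```python
import re
from typing import Optional

# ATX heading: up to three leading spaces, 1-6 hashes, then whitespace or end of line.
ATX_HEADING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+|$)", re.MULTILINE)


def count_heading_skips(text: str) -> int:
    """Count transitions that jump more than one level deeper."""
    levels = [len(match.group(1)) for match in ATX_HEADING.finditer(text)]
    return sum(1 for prev, cur in zip(levels, levels[1:]) if cur > prev + 1)


def heading_skip_confidence(text: str) -> Optional[str]:
    """Medium for one skip, high for multiple, as described above."""
    skips = count_heading_skips(text)
    if skips == 0:
        return None
    return "medium" if skips == 1 else "high"


outline = "# Title\n\n### Deep\n\n## Back\n\n#### Deeper\n"
skips = count_heading_skips(outline)
```

Moving back up the hierarchy (from `###` to `##`) is not a skip; only the jumps from `#` to `###` and from `##` to `####` are counted.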

Rule of three (rule-of-three)

Detects the AI tendency to present items in groups of exactly three. When a disproportionate number of lists in a document have exactly three items, this pattern triggers.

The scanner calculates the ratio of three-item lists to all lists and triggers based on the ratio and absolute count.
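The ratio calculation might look like the sketch below. The thresholds `min_ratio` and `min_count` are placeholder values invented for this example, since the actual trigger values are not stated here. The list parsing is also simplified: a list is taken to be a run of consecutive item lines, so loose lists with blank lines between items would be split.

```python
import re

LIST_ITEM = re.compile(r"^\s*(?:[-*+•]|\d+[.)])\s+\S")


def list_lengths(text: str) -> list[int]:
    """Return the item count of each run of consecutive list lines."""
    lengths, run = [], 0
    for line in text.splitlines():
        if LIST_ITEM.match(line):
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def rule_of_three(text: str, min_ratio: float = 0.5, min_count: int = 3) -> bool:
    """Trigger on both the share and the absolute number of three-item lists.

    min_ratio and min_count are hypothetical defaults, not documented values.
    """
    lengths = list_lengths(text)
    if not lengths:
        return False
    threes = sum(1 for length in lengths if length == 3)
    return threes >= min_count and threes / len(lengths) >= min_ratio


document = "\n\n".join(["- a\n- b\n- c"] * 3 + ["- x\n- y"])
triggered = rule_of_three(document)
```

Requiring an absolute count as well as a ratio keeps a document with a single three-item list (ratio 1.0) from triggering.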

ChatGPT artifacts (chatgpt-artifacts)

Detects residual markup from ChatGPT, Grok, and Copilot that students forgot to remove:

  • turn0search0, turn0search1 (ChatGPT web search markers)
  • oaicite: references
  • <tool_call> and similar XML-like tags
  • Grok-specific markers

Confidence: High — these are definitive proof of AI tool usage.
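A sketch of pattern-based detection for the markers named above. Only the three markers spelled out in the list are covered; the Grok-specific markers and other XML-like tags are left out because their exact form is not given here. The pattern names and the function `find_artifacts` are hypothetical.

```python
import re

# One pattern per artifact family listed above (illustrative, not exhaustive).
ARTIFACT_PATTERNS = {
    "web-search-marker": re.compile(r"turn\d+search\d+"),
    "oaicite": re.compile(r"oaicite:"),
    "tool-call-tag": re.compile(r"</?tool_call>"),
}


def find_artifacts(text: str) -> list[tuple[str, str]]:
    """Return (pattern name, matched text) pairs, grouped by pattern."""
    hits = []
    for name, pattern in ARTIFACT_PATTERNS.items():
        hits.extend((name, match.group(0)) for match in pattern.finditer(text))
    return hits


hits = find_artifacts("See the report turn0search0 and [oaicite:2] for details.")
```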

Chatbot phrases (chatbot-phrases)

Detects collaborative language typical of chatbot conversations:

  • “I hope this helps”
  • “Feel free to ask”
  • “As an AI language model”
  • “Sure! Here is…”

The scanner matches both English and German variants of these phrases.

Confidence: High — these phrases directly reveal chatbot interaction.
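Phrase matching of this kind can be sketched with a single case-insensitive alternation. The list below contains only the four English examples given above; German variants would presumably be added to the same list, but their exact wording is not given here, so none are invented. Collapsing whitespace before matching is an assumption, made so that phrases broken across lines are still found.

```python
import re

CHATBOT_PHRASES = [
    "I hope this helps",
    "Feel free to ask",
    "As an AI language model",
    "Sure! Here is",
]

PHRASE_RE = re.compile(
    "|".join(re.escape(phrase) for phrase in CHATBOT_PHRASES), re.IGNORECASE
)


def find_chatbot_phrases(text: str) -> list[str]:
    """Return matched phrases in document order, ignoring case and line breaks."""
    collapsed = " ".join(text.split())  # assumption: treat any whitespace run as one space
    return [match.group(0) for match in PHRASE_RE.finditer(collapsed)]


phrases = find_chatbot_phrases("Sure! Here is the essay.\n\nI hope this\nhelps.")
```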

Placeholder text (placeholder-text)

Detects unfilled template placeholders left behind from AI-generated text:

  • [Describe your...]
  • INSERT_URL
  • [TODO]
  • [Your name here]

Confidence: High — these indicate an AI template was used without complete customisation.
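The four examples above suggest two families of placeholder: bracketed prompts and upper-case `INSERT_` tokens. The sketch below generalises from those examples, which is an assumption; the real scanner may recognise a different or larger set. Bracketed prompts are matched case-insensitively, while `INSERT_` tokens must be upper-case so that ordinary identifiers such as `insert_url` are not flagged.

```python
import re

PLACEHOLDER = re.compile(
    # Bracketed prompts such as [TODO], [Describe your...], [Your name here]
    r"\[(?i:todo|describe your[^\]\n]*|your [a-z ]+ here)\]"
    # Upper-case template tokens such as INSERT_URL
    r"|\bINSERT_[A-Z][A-Z_]*\b"
)


def find_placeholders(text: str) -> list[str]:
    """Return every unfilled placeholder found in the text."""
    return PLACEHOLDER.findall(text)


found = find_placeholders(
    "Dear [Your name here], see INSERT_URL and [TODO]. Array[0] is fine."
)
```

Ordinary bracket use such as `Array[0]` is left alone because the bracket content does not match any of the prompt shapes.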

Colon-bullet cascade (colon-bullet-cascade)

Detects the exposition pattern where a prose line ending in a colon is immediately followed by a bulleted or numbered list, repeated several times across the document. AI models organise explanations this way by default: every idea is introduced with a short “prompt line:” and unpacked as a bullet list.

A single colon-to-list handoff is a perfectly normal writing move. The scanner only starts counting at three handoffs across a document, and the pattern becomes a stronger structural tell as the count grows.

Confidence: Low for 3–4 handoffs, Medium for 5–7, High for 8+.
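The counting and the confidence bands can be sketched as follows. The bands are taken from the line above. Treating "immediately followed" as "the next non-blank line is a list item", and excluding list items that themselves end in a colon, are assumptions of this sketch; the function names are hypothetical.

```python
import re
from typing import Optional

LIST_ITEM = re.compile(r"^\s*(?:[-*+•]|\d+[.)])\s+\S")


def count_handoffs(text: str) -> int:
    """Count prose lines ending in a colon whose next non-blank line is a list item."""
    lines = [line for line in text.splitlines() if line.strip()]
    return sum(
        1
        for prev, cur in zip(lines, lines[1:])
        if prev.rstrip().endswith(":")
        and not LIST_ITEM.match(prev)
        and LIST_ITEM.match(cur)
    )


def cascade_confidence(handoffs: int) -> Optional[str]:
    """Low for 3-4 handoffs, medium for 5-7, high for 8 or more."""
    if handoffs >= 8:
        return "high"
    if handoffs >= 5:
        return "medium"
    if handoffs >= 3:
        return "low"
    return None


block = "The key factors are:\n- cost\n- speed\n"
handoffs = count_handoffs(block * 5)
```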

AI meta-disclaimer signature (meta-disclaimer-signature)

Detects opener and closer boilerplate that AI chat assistants produce reflexively around their response bodies. Users routinely copy-paste these artefacts into documents together with the actual content:

  • “Certainly! Here is…” / “Sure! Below is…”
  • “I hope this helps” / “Happy to help”
  • “Let me know if you need…” / “Feel free to ask…”
  • “As an AI language model…”

The scanner covers both English and German variants, because students frequently paste English chat responses into German documents unchanged.

Confidence: High when a match sits in the first or last 300 characters of the document (where such artefacts overwhelmingly appear) or when two or more matches occur; Medium for a single body-only match.
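The position-dependent confidence rule can be sketched like this. The 300-character window and the two-match rule come from the line above. The phrase list holds only the English examples listed earlier, and reading "sits in" as "starts within the first 300 characters or ends within the last 300" is an assumption of this sketch; all names are hypothetical.

```python
import re
from typing import Optional

EDGE_WINDOW = 300  # characters at the start and end of the document

BOILERPLATE = re.compile(
    r"certainly! here is|sure! below is|i hope this helps|happy to help"
    r"|let me know if you need|feel free to ask|as an ai language model",
    re.IGNORECASE,
)


def meta_disclaimer_confidence(text: str) -> Optional[str]:
    """High for an edge match or for two or more matches, medium for one body match."""
    matches = list(BOILERPLATE.finditer(text))
    if not matches:
        return None
    near_edge = any(
        match.start() < EDGE_WINDOW or match.end() > len(text) - EDGE_WINDOW
        for match in matches
    )
    if near_edge or len(matches) >= 2:
        return "high"
    return "medium"


body = "x " * 400  # 800 characters of filler standing in for real content
opener = meta_disclaimer_confidence("Certainly! Here is the essay. " + body)
buried = meta_disclaimer_confidence(body + "I hope this helps" + body)
```

One consequence of this reading is that any match in a document shorter than 300 characters is rated high, since the whole text lies inside the edge window.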