Goal: Make the AI proofreader produce atomic, phrase-level edits in the style of a real human proofreader, not sentence-level rewrites. The clause skeleton the student wrote is preserved; only the errors are touched.
Pipeline: Constrained Claude prompt → streaming NDJSON (one correction object per line, pushed over Server-Sent Events as Claude emits them) → per-correction minimal-diff trim (strips any prefix or suffix words shared by the original and the replacement) → JavaScript validator (drops spans that cross sentence boundaries, break word boundaries, or exceed 4 tokens) → live letter render. The default result below was pre-computed on a sample letter. Paste your own letter and the corrections stream in one by one, with the first byte typically arriving in 2–4 seconds.
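As a rough illustration of the NDJSON stage, here is a minimal sketch of an incremental line parser. It assumes the stream arrives as arbitrary text chunks (for instance, the data of successive SSE events) and that each complete line is one JSON correction object. The name `createNdjsonParser` and the policy of skipping unparseable lines are assumptions made for this example, not details taken from the pipeline itself.

```javascript
// Hypothetical helper: buffers streamed text and calls onCorrection
// once for every complete line that parses as JSON.
function createNdjsonParser(onCorrection) {
  let buffer = "";

  // Parse one line; blank or malformed lines are skipped (assumed policy).
  function emit(line) {
    const trimmed = line.trim();
    if (trimmed === "") return;
    try {
      onCorrection(JSON.parse(trimmed));
    } catch (err) {
      // Not valid JSON: ignore rather than abort the whole stream.
    }
  }

  return {
    // Feed the next chunk of text as it arrives.
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // the last piece may be an incomplete line
      lines.forEach(emit);
    },
    // Call once when the stream ends to handle a final unterminated line.
    flush() {
      const rest = buffer;
      buffer = "";
      emit(rest);
    },
  };
}
```

Because a chunk can end in the middle of an object, the parser only emits a correction once its terminating newline has been seen, which is what lets the UI render corrections one at a time.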
The prompt enforces small-span output through four rules. Full text below — copy and test it yourself on any OET letter.
Even when the LLM picks a small span, the span often still contains words that appear unchanged in the correction, e.g. bilateral ears-ringing → bilateral tympanic membranes. The word "bilateral" appears in both, so it doesn't need to be part of the correction. This post-processor strips those shared prefix and suffix words so the span contains only the words that actually change: ears-ringing → tympanic membranes. It runs after the LLM call and before the validator. Words are compared with strict equality, so a capitalisation change still counts as a change.
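The trim described above can be sketched as follows. This is a minimal illustration, not the page's actual post-processor: the function name `trimSharedWords` and the `corrected_text` field are hypothetical, words are assumed to be whitespace-separated, and character offsets into the letter are ignored.

```javascript
// Hypothetical sketch: strip leading and trailing words that are
// identical (strict equality) in the original and the correction.
function trimSharedWords(originalText, correctedText) {
  const a = originalText.split(/\s+/).filter(Boolean);
  const b = correctedText.split(/\s+/).filter(Boolean);

  // Count shared words at the start.
  let start = 0;
  while (start < a.length && start < b.length && a[start] === b[start]) {
    start++;
  }

  // Count shared words at the end, without overlapping the shared prefix.
  let endA = a.length;
  let endB = b.length;
  while (endA > start && endB > start && a[endA - 1] === b[endB - 1]) {
    endA--;
    endB--;
  }

  return {
    original_text: a.slice(start, endA).join(" "),
    corrected_text: b.slice(start, endB).join(" "),
  };
}

// Example from the text: "bilateral" is shared, so it is stripped.
const trimmed = trimSharedWords(
  "bilateral ears-ringing",
  "bilateral tympanic membranes"
);
console.log(trimmed.original_text + " → " + trimmed.corrected_text);
```

One caveat of this sketch: a pure insertion or deletion leaves one side empty after trimming, so a real implementation would need to decide whether to keep a context word in that case.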
After trim, this validator runs before render. It drops any correction whose original_text crosses a sentence boundary, starts or ends mid-word, or exceeds the token budget. Dropped corrections are silently filtered — not shown in the UI, only surfaced as a count in the metrics strip. This pre-render check directly addresses the "broken tracked changes" and "duplicated text" issues from the brief.
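A minimal sketch of the three checks might look like the following. It is an illustration under stated assumptions rather than the page's actual validator: the function names and reason strings are hypothetical, the span is located by its first occurrence in the letter, a sentence boundary is approximated as terminal punctuation followed by whitespace (so an abbreviation such as "Dr. Smith" would be wrongly rejected), and tokens are counted as whitespace-separated words.

```javascript
const MAX_TOKENS = 4; // token budget stated in the pipeline description

// Returns a reason string if the correction should be dropped, else null.
function validateCorrection(letter, correction) {
  const span = correction.original_text;
  const start = span ? letter.indexOf(span) : -1; // assumes first occurrence
  if (start === -1) return "not_found";
  const end = start + span.length;

  // 1. Sentence boundary: terminal punctuation, whitespace, then more text.
  if (/[.!?]\s+\S/.test(span)) return "crosses_sentence";

  // 2. Word boundary: the span must not start or end inside a word.
  const isWordChar = (ch) => ch !== undefined && /[A-Za-z0-9]/.test(ch);
  if (isWordChar(letter[start - 1]) && isWordChar(letter[start])) return "mid_word";
  if (isWordChar(letter[end - 1]) && isWordChar(letter[end])) return "mid_word";

  // 3. Token budget.
  if (span.trim().split(/\s+/).length > MAX_TOKENS) return "too_long";

  return null;
}

// Keeps valid corrections and reports only a count of dropped ones.
function filterCorrections(letter, corrections) {
  const kept = corrections.filter((c) => validateCorrection(letter, c) === null);
  return { kept, droppedCount: corrections.length - kept.length };
}
```

Returning only `droppedCount` mirrors the behaviour described above, where dropped corrections never reach the UI and appear only as a number in the metrics strip.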
The sample you see above is not hand-authored. The pipeline was run once on the COPD letter with a single Claude Sonnet 4.6 API call. The JSON response was saved and is loaded here as the default. The Try Your Own Letter button calls the same endpoint live.
The validator also demonstrates itself: one of the LLM's 21 corrections ("time. we" → "time. We") technically crosses a sentence boundary, and the validator flags it, an example of the pipeline catching a real LLM slip.