Closed-loop QA preprint
medRxiv preprint
Production clinical AI documentation QA across 13 hospital sites, 42 tracked physician complaints, and 1,089 optimization iterations.
This is production evidence from a closed-loop QA system that runs against real physician feedback, tracked complaints, and production notes.
ScribeBench
Public benchmark
A clinical documentation fidelity benchmark for narrative quality, source fidelity, leak detection, dangerous fabrication, and rubric-based review.
It puts the hard part of ambient documentation in public view: did the system preserve delivered care, or did it write something that sounds right and is not true?
ed-chest-pain-analytics
Public Stanford MCiM control repo
Chest pain risk support work built around ambient encounter data, differential framing, physician-authored decision traces, AIMI presentation, and AMIA 2026 podium abstract #15227.
It reflects how I think about clinical support systems: useful signal early, explicit uncertainty, and no pretense that the model is the physician.
clinical-ai-learning-os
Private progress system, public shell
A Stanford MCiM learning and proof system for clinical AI, model validation, research inventory, quizzes, and post-graduation technical depth.
It is the operating system behind the public portfolio: source tracking, proof packets, technical gaps, and the research center Andrew asked me to use.
clinical-nlp-patterns
Public architecture notes
Patterns for clinical documentation systems, extraction-first prompting, validation design, and QA loops.
This is the public technical layer behind a lot of my writing about chart safety, constraint design, and evidence grounding.
diag2icd10
Private research/software with Geanderson Santos
Diagnosis-to-ICD-10 code selection using California HCAI frequency data, synthetic EM/HM examples, frequency-aware retrieval, constrained LLM selection, and error analysis.
Large clinical label spaces are where simple RAG breaks. The work separates retrieval misses from selector errors and treats coding as an evaluation problem.
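A minimal sketch of that split, not the project's actual code: it assumes each evaluated case records the reference code, the candidate list the retriever returned, and the code the selector picked. The names `Prediction`, `classify`, and `error_breakdown` are hypothetical, and the ICD-10 codes are illustrative only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Prediction:
    gold: str              # reference ICD-10 code for the diagnosis text
    candidates: list[str]  # codes returned by the retriever (assumed shape)
    selected: str          # code chosen by the constrained selector


def classify(p: Prediction) -> str:
    """Attribute one case to the stage that failed, if any."""
    if p.selected == p.gold:
        return "correct"
    if p.gold not in p.candidates:
        # The selector never saw the right code, so the miss is upstream.
        return "retrieval_miss"
    # The right code was offered and the selector picked something else.
    return "selector_error"


def error_breakdown(preds: list[Prediction]) -> Counter:
    """Count outcomes per category across an evaluation set."""
    return Counter(classify(p) for p in preds)


if __name__ == "__main__":
    # Illustrative cases only; not results from the project.
    cases = [
        Prediction("I21.9", ["I21.9", "I20.9"], "I21.9"),
        Prediction("I21.9", ["I20.9", "R07.9"], "I20.9"),
        Prediction("I21.9", ["I21.9", "I20.9"], "I20.9"),
    ]
    print(error_breakdown(cases))
```

Keeping the two failure types apart matters because they call for different fixes: a retrieval miss points at the index or the frequency weighting, while a selector error points at the prompt or the constraints.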
sayvant-sqs-framework
Private framework artifact
Structured quality-scoring framework for clinical AI documentation systems, used as the methods layer under closed-loop QA, ScribeBench, and benchmark work.
Clinical documentation AI needs explicit scoring rules before it can claim chart quality. The framework makes the evaluation target inspectable.
sayvant-em-benchmark
Private benchmark artifact
Emergency medicine documentation benchmark artifact for multi-dimensional scoring of clinical AI notes.
Emergency medicine notes have high variation, time pressure, and billing risk. The benchmark keeps that setting visible instead of hiding it inside generic documentation examples.
human-in-the-loop procedural guidance
Preprint staging
Connected video laryngoscopy workflow for airway training, with computer vision observations, event telemetry, generated reports, JSON exports, and trainer-confirmed safety fields.
This connects the device work to clinical AI without pretending the AI is autonomous. The human confirms safety-critical fields.
pediatric-vl-cea
Private manuscript buildout
Pediatric video laryngoscopy cost-effectiveness analysis with manuscript draft, field-data strategy, and budget-impact appendix path.
Device adoption has to pencil out for the buyer and still improve patient care. Cost-effectiveness is part of the clinical argument.