ARISE

MAST

About

The Medical AI Superintelligence Test (MAST) is an independent effort run by the ARISE AI Research Network to curate the most robust and realistic clinical benchmarks for measuring the performance of medical AI. MAST exists to ensure that AI entering healthcare is rigorously tested, independently validated, and held to the highest clinical standards before it reaches patients.

Our Mission

To establish an open, evidence-based evaluation framework that holds medical AI to the highest clinical standards, so that deployed systems help rather than harm patients. We believe rigorous, independent benchmarking is the foundation of safe AI adoption in healthcare.

Our Approach

We evaluate AI systems the way medicine evaluates treatments: with blinded assessments, expert panels, standardized rubrics, and transparent methodology. Every benchmark in the MAST suite is designed by board-certified physicians, validated against clinical consensus, and built to resist data contamination and shortcut learning.

Team

MAST is developed by a multidisciplinary team of clinicians, AI researchers, biostatisticians, and medical educators from ARISE, an independent academic collaborative spanning Stanford Medicine, Harvard Medical School, and partner institutions.

The MAST Steering Committee provides strategic direction, approves new evaluation domains, and sets the weighting methodology for the composite score. Committee members are drawn from leading academic medical centers and represent diverse clinical specialties. Membership rotates on a three-year cycle, and members must disclose all potential conflicts of interest upon appointment and annually thereafter.

The annotation workforce consists of board-certified physicians across 10 medical specialties who undergo standardized training on scoring rubrics before participating in evaluations. Technical infrastructure is maintained by a dedicated engineering team responsible for the evaluation pipeline, data security, and leaderboard operations.

Independence and Funding

MAST does not accept direct funding from AI companies for benchmark development or evaluation. Institutional funding sources are disclosed publicly and reviewed annually.

Source                   Type            Period          Purpose
Stanford Medicine        Institutional   2024–Present    Core research infrastructure and personnel
Harvard Medical School   Institutional   2024–Present    Clinical validation and annotation support
NIH/NIDDK                Federal Grant   2024–2026       Benchmark development and data curation

Conflict of Interest Policy

ARISE maintains strict conflict of interest policies to protect the integrity of MAST evaluations. The following rules apply to all team members, advisors, and collaborators involved in the benchmark process:

  • AI companies cannot fund or sponsor specific benchmark evaluations or influence evaluation scheduling.
  • Team members with financial affiliations to any evaluated AI company must recuse themselves from scoring that company's submissions.
  • All advisory board members must disclose potential conflicts of interest, which are reviewed annually and published on our website.
  • Evaluation rubrics and scoring criteria are locked before any model submission is evaluated and cannot be modified retroactively.
  • An external audit of our evaluation process is conducted annually by an independent academic review committee.

Data Use Policy

During evaluation, model providers submit their systems through our controlled API pipeline. MAST does not share benchmark cases with model providers before or after evaluation. All evaluation data is processed in a secure environment, and model outputs are stored only for the duration needed to complete scoring.

De-identified clinical cases used in the benchmark are sourced from existing institutional research datasets with appropriate IRB approvals. No patient-identifiable information is included in any benchmark case. Model providers' API keys and system configurations are handled under standard data protection protocols and are not retained after evaluation completion.

Contact

For questions about MAST, our methodology, or our transparency practices:

Contact Us