Governance

MAST is governed by an independent academic structure that ensures rigorous, unbiased evaluation of medical AI systems. This page outlines our governance model, team composition, and conflict of interest policies.

Overview

MAST's governance model separates strategic oversight from day-to-day operations, with clear accountability at every level.

All governance decisions — including benchmark selection, scoring methodology, and publication policy — are made through a transparent, consensus-driven process. No single institution or individual holds unilateral authority over MAST outcomes.

Steering Committee

The MAST Steering Committee provides strategic direction and ensures the benchmark maintains the highest standards of scientific rigor and clinical relevance. Committee members are drawn from leading academic medical centers and represent diverse clinical specialties.

The committee meets quarterly to review benchmark performance, approve new evaluation domains, and address any methodological concerns. All committee deliberations are documented and key decisions are published in our transparency reports.

Committee membership is rotated on a three-year cycle to ensure fresh perspectives and prevent institutional capture. Members must disclose all potential conflicts of interest upon appointment and annually thereafter.

Team Members

The MAST operational team is composed of clinician-scientists, AI researchers, biostatisticians, and medical educators from institutions across the ARISE Network. Each team member brings specialized expertise essential to maintaining evaluation quality.

Our annotation workforce consists of board-certified physicians across 10 medical specialties who undergo standardized training on our scoring rubrics before participating in evaluations. All annotators are credentialed and their qualifications are verified independently.

Technical infrastructure is maintained by a dedicated engineering team responsible for the evaluation pipeline, data security, and leaderboard operations.

Methodology and Weighting

MAST composite scores are calculated using a weighted aggregation across constituent benchmarks. Weights are determined by the Steering Committee based on the clinical importance, task diversity, and methodological maturity of each benchmark component.
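
In schematic form, and assuming a simple linear aggregation (the exact functional form is not specified on this page), the composite score for a model with per-benchmark scores $s_i$ and Committee-assigned weights $w_i$ can be written as:

$$S_{\text{composite}} = \sum_{i=1}^{n} w_i\, s_i, \qquad \sum_{i=1}^{n} w_i = 1.$$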

The current weighting framework prioritizes patient safety metrics above diagnostic accuracy, reflecting the principle that harm avoidance is the foundational requirement for clinical AI deployment. Specific weight allocations are published alongside each benchmark release.
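
As a concrete illustration, the minimal Python sketch below computes such a linear composite. The benchmark names, weights, and scores are hypothetical placeholders chosen so that patient safety carries the largest weight; they are not MAST's published allocations.

```python
# Hedged sketch of a linear weighted composite; the names and numbers
# below are hypothetical, not MAST's published weight allocations.

def composite_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Aggregate per-benchmark scores (0-100 scale) into one composite score."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)

# Illustrative weights: patient safety prioritized above diagnostic accuracy.
weights = {"patient_safety": 0.40, "diagnostic_accuracy": 0.35, "communication": 0.25}
scores = {"patient_safety": 92.0, "diagnostic_accuracy": 81.5, "communication": 88.0}

print(round(composite_score(scores, weights), 1))  # -> 87.3
```

Under this toy allocation, a safety regression moves the composite more than an equal-sized accuracy regression, which is the behavior the weighting framework is meant to encode.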

Weighting decisions are reviewed annually and may be adjusted as new clinical evidence emerges or as the benchmark suite expands. Any changes to the weighting methodology are announced at least 60 days before taking effect, with a public comment period for stakeholder feedback.

Disclosures and Conflicts

All individuals involved in MAST governance, evaluation, and operations are required to disclose any financial relationships, advisory roles, or equity holdings with AI companies whose products are or may be evaluated by the benchmark.

Disclosed conflicts are reviewed by an independent ethics officer. Individuals with material conflicts are recused from scoring, methodology decisions, or publication review related to the conflicted entity. Recusals are documented and reported in our annual transparency disclosures.

MAST does not accept direct funding from AI companies for benchmark development or evaluation operations. Institutional funding sources are disclosed publicly and reviewed annually to ensure no indirect conflicts compromise evaluation integrity.
