Before Institutions Trust an AI Patient, They Should Ask What Sits Behind It

Why the next phase of AI simulation in healthcare education will be judged by reliability, fairness and educational impact, not just realism.

By Ahmed Sharaf, CEO MedAscend | 18 May 2026 | 8 to 10 min read

Key takeaways

The question is no longer whether an AI patient can hold a conversation. It is whether the system around it can be trusted, defended and evaluated.
Modality should match the learning objective. Voice and video serve different educational purposes; neither is inherently better.
Reliability, fairness, governance, curriculum alignment, assessment defensibility and educational impact are the six things institutions are really evaluating.
Bias evaluation, evidence depth and post-launch model monitoring still need more work across the whole sector.

Innovation is not just about what we can build.

It is about what we should build. Who it serves. Who it might exclude. Who it might harm. And who remains accountable when it goes wrong.

That question matters deeply in healthcare education. A simulated patient is not simply a chatbot with a clinical backstory. It is a learning environment. It shapes how students ask questions, interpret risk, practise empathy, receive feedback and understand what “good” clinical performance looks like.

As generative AI becomes easier to build with, the barrier to creating conversational tools has fallen dramatically. What once required years of engineering resource can now often be assembled far more quickly. That is exciting, but it also creates a serious challenge for institutions.

The question is no longer whether an AI system can hold a conversation. The question is whether it can be trusted.

For a real solution, institutions need answers to a tight set of questions:

Can it respond consistently across questioning styles?
Can it assess fairly across learner cohorts?
Can it support different learners equitably?
Can it align with local curricula and marking standards?
Can educators understand and oversee how feedback is generated?
Can institutions evidence that the solution improves learning, not just activity?

These are the questions that should define the next phase of AI-enabled clinical simulation.

The confidence problem in AI simulation

Figure: Educators reviewing learner feedback in a Clinical Skills Centre. The infrastructure around an AI patient is what institutions ultimately evaluate.

Healthcare education has always had a simulation bottleneck.

Students need repeated opportunities to practise communication, clinical reasoning, escalation, documentation and professional judgement. Educators need scalable ways to deliver that practice without compromising educational quality. Traditional simulation with actors, faculty observers and dedicated spaces is valuable, but it is expensive, time-limited and difficult to scale.

AI simulation can help address that bottleneck. It can allow learners to practise repeatedly, receive feedback quickly and encounter a wider range of clinical scenarios. The wider evidence base for virtual patient simulation is promising. A systematic review and meta-analysis of virtual patient simulations in health professions education found that, compared with traditional education, virtual patients produced similar knowledge outcomes and better skills outcomes, while also highlighting variation in study design and implementation.[1]

That distinction matters. The direction of travel is promising, but the sector should avoid mistaking momentum for settled evidence.

Institutions do not need AI systems that simply sound impressive in a demo. They need systems they can defend to students, faculty, regulators and, ultimately, patients.

NICE's evidence standards framework for digital health technologies was created to help evaluators and decision-makers assess whether digital health technologies are likely to offer benefits to users and the health and care system. It also makes clear that meeting evidence standards is not the same as formal NICE endorsement or regulatory approval.[2]

For AI simulation in healthcare education, that should be the mindset: not “does this feel exciting?”, but “has this been designed, tested and governed in a way that an institution can responsibly adopt?”

Evaluation framework

What institutions are really evaluating

Six areas that consistently come up in conversations with medical schools, NHS education teams and healthcare programme directors.

Reliability

Consistent patient behaviour and feedback across questioning styles and learner cohorts.

Fairness

Equitable performance across accents, dialects, demographics and communication styles.

Governance

Clear data, safety, escalation and oversight controls educators can defend.

Curriculum alignment

Mapped to local learning outcomes, assessment standards and disciplinary context.

Assessment defensibility

Scores tied to a transparent rubric and traceable to evidence in the consultation.

Educational impact

Evidence that the system improves learning, not just increases activity.

The realism question needs more nuance

One of the easiest mistakes in simulation is to equate realism with educational value.

A simulation can look realistic and still be poorly aligned to the learning objective. Equally, a simple simulation can be highly effective if it targets the right skill, at the right level, for the right learner.

Hamstra and colleagues challenged the traditional emphasis on fidelity as physical resemblance, arguing that educational effectiveness is better understood through concepts such as transfer of learning, learner engagement and suspension of disbelief.[3]

The goal is not always to make a simulated patient look, sound or behave with maximum realism. The goal is to select the level of realism that supports the intended learning outcome.

This is where the conversation around voice and video often becomes too simplistic.

Voice is not “less serious” because it lacks visuals. Video is not “better” simply because it looks more realistic. Both modalities can be educationally valuable when used for the right objective.

Modality

Modality should match the learning objective

Voice simulation thrives when the goal is

History-taking structure
Consultation flow
Question phrasing
Verbal reasoning
Repeated practice at scale
Early confidence building
Remote or asynchronous practice

Video simulation thrives when the goal involves

Breaking bad news
Distressed relatives
Mental health risk assessment
Delirium or confusion
Capacity assessment
De-escalation
Telemedicine realism
Recognising hesitation or emotional distress

The responsible question is not whether voice or video is better. It is what level of realism is educationally necessary for the skill being trained.

What institutions should expect from AI-enabled simulation

Before adopting an AI simulation solution, institutions should look beyond surface-level conversation quality.

A useful assurance framework should cover at least six areas:

Educational alignment: mapped to your learning outcomes, not generic ones.
Response reliability: consistent behaviour across questioning styles.
Assessment defensibility: scores traceable to rubric and transcript.
Governance and safety: clear data, escalation and oversight controls.
Fairness and equity: equivalent experience across accents and demographics.
Evidence generation: proof the solution improves learning over time.

The NHS Digital Technology Assessment Criteria brings together recognised good practice across areas such as clinical safety, data protection, technical security, interoperability, usability and accessibility. Although educational simulation platforms are not identical to clinical digital health technologies, institutions are increasingly expecting the same mindset: structured assurance, not informal promises.

Curriculum embedding is the real adoption challenge

Most AI simulation tools can hold a conversation. Very few are built to slot into the way a programme actually teaches and assesses. Curriculum embedding is where adoption usually breaks down, and where MedAscend is designed to be a complete solution rather than a bolt-on.

A simulation platform that does not map to your curriculum is a tool. A platform that does is a solution.

In practice, MedAscend helps programmes embed AI simulation by giving educators the controls they need without engineering support:

Map every scenario to your learning outcomes, year of study and discipline.
Embed your own marking rubric so feedback reflects your standards, not the vendor's.
Align with existing OSCE blueprints and station structures, including timed circuits.
Author scenarios in-house with AI autofill, so faculty own the content.
Track cohort and station-level analytics against curriculum domains.
Support multi-discipline programmes across medicine, nursing, pharmacy, PA, paramedicine and allied health from one workspace.

The result is a solution that fits inside the curriculum, rather than asking the curriculum to reshape itself around the technology.

Fairness cannot be an afterthought

Bias in AI simulation is not only a technical issue. It is an educational and ethical issue.

If a system responds less accurately to certain accents, first languages or speech patterns, some learners may receive a poorer experience. If feedback models reward one communication style while penalising another, assessment may become unfair. If patient personas are not designed carefully, simulation can reinforce stereotypes rather than challenge them.

The WHO's guidance on AI for health emphasises that ethics and human rights must be placed at the centre of AI design, deployment and use. It also highlights the need for governance that holds stakeholders accountable to healthcare workers and the communities affected by these technologies.[4]

Healthcare education should apply the same standard.

21 months of pre-launch work

Five core problems we focused on before launch

Founded: February 2024
Market launch: November 2025
21 months of pre-launch development, co-design, evaluation and iteration

1. Data quality
Volume alone is not enough. The dataset became part of the infrastructure needed to move beyond an impressive conversation and towards a reliable learning system, supporting response consistency, disclosure behaviour, feedback specificity and assessment logic.
5,000,000+ real consultation data points, curated for structure, not just volume.

2. Educator alignment
Co-design sessions with educators and institutions reshaped the product. Schools needed educator-controlled scenarios, curriculum alignment, marking frameworks, analytics, governance and feedback that reflected their own standards, across medicine, nursing, pharmacy, physician associate and allied health programmes.

3. Assessment reliability
An inter-rater reliability study compared MedAscend's AI feedback engine with real OSCE examiners across tested domains. The aim is not to replace examiners. It is to make formative feedback close enough to human standards to support repeated practice at scale.

4. Product iteration through failure
Iteration focused on disclosure control, actionable feedback, scoring alignment with rubrics, and a clearer separation between what a learner missed and what was never reasonably elicited. These were not minor issues. They were the work.

5. Evidence generation
Three research papers are currently in peer review. Until publication, those findings are described as work in peer review, not settled evidence. We have also run funded pilots, including pilots funded by us, to remove financial barriers to early institutional evaluation.
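For readers who want a concrete picture of what an inter-rater reliability comparison like the one in item 3 can involve, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic for two raters. It is an illustration only: the article does not say which statistic the MedAscend study used, and the function name, the "pass"/"fail" bands and the sample ratings are all hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical labels.

    Illustrative sketch only; not the method used in any specific study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items")

    n = len(rater_a)

    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for eight formative stations (made-up data).
examiner = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
ai_engine = ["pass", "pass", "fail", "fail", "fail", "fail", "pass", "pass"]

kappa = cohens_kappa(examiner, ai_engine)
print(f"Cohen's kappa: {kappa:.2f}")
```

Raw percentage agreement can look high simply because most learners pass; a chance-corrected statistic guards against that. For ordinal or continuous rubric scores, a weighted kappa or an intraclass correlation would usually be a better fit than this unweighted version.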

What we have not solved yet

A credible conversation about AI simulation should not pretend that every challenge has been solved.

Algorithmic bias in health AI is well documented: a widely cited study found that an algorithm used to manage the care of millions of patients systematically under-referred Black patients relative to equally sick White patients, producing unequal outcomes across populations.[5] Although AI simulation is not the same as a clinical decision algorithm, the principle transfers: if a system influences learning, feedback or progression, institutions should expect transparency about bias, generalisability and subgroup performance.

Honest assessment

What still needs more work across the sector

Bias and subgroup evaluation

More formal subgroup evaluation across learner demographics, accents, dialects, first-language differences and socioeconomic context is needed across the whole sector, not just one company.

Evidence depth and long-term outcomes

Early pilots can show engagement and acceptability. The sector still needs stronger evidence on objective performance, retention, transfer into clinical settings and long-term educational impact.

Model drift and post-launch monitoring

AI systems change, underlying models update and user behaviour shifts. A simulation system that performs well at launch still needs structured monitoring after launch.

These are not reasons to avoid AI simulation. They are reasons to adopt it carefully and evaluate it continuously.
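As a rough illustration of the subgroup evaluation described above, the sketch below groups feedback scores by a learner attribute and reports the largest gap between subgroup means. Everything here is an assumption for illustration: the record fields (`accent_group`, `score`), the function name and the sample data are hypothetical, and a real evaluation would also need adequate sample sizes, confidence intervals and appropriate consent for collecting demographic data.

```python
from collections import defaultdict
from statistics import mean


def subgroup_score_gap(records, group_key, score_key="score"):
    """Return the mean score per subgroup and the largest gap between subgroups.

    Illustrative sketch; field names are hypothetical.
    """
    scores_by_group = defaultdict(list)
    for record in records:
        scores_by_group[record[group_key]].append(record[score_key])

    if not scores_by_group:
        raise ValueError("No records to evaluate")

    means = {group: mean(scores) for group, scores in scores_by_group.items()}
    largest_gap = max(means.values()) - min(means.values())
    return means, largest_gap


# Made-up scores for the same scenario, grouped by a hypothetical accent label.
sample = [
    {"accent_group": "A", "score": 70},
    {"accent_group": "A", "score": 80},
    {"accent_group": "B", "score": 60},
    {"accent_group": "B", "score": 70},
]

group_means, gap = subgroup_score_gap(sample, "accent_group")
print(group_means, gap)
```

Running the same check on a schedule, for example after each underlying model update, is one simple way to fold subgroup evaluation into post-launch monitoring. A gap on its own does not prove bias, but it tells reviewers where to look first.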

Working with educators, not around them

Why these problems need to be solved with educators

The most important solutions in AI simulation will not be built by companies alone. They will be built with the educators responsible for teaching, assessing and protecting learners.

Learner proximity

Built from the problem students face: too few opportunities to practise, too little detailed feedback.

Educator co-design

Shaped with clinicians, educators and institutions, not delivered to them as a finished product.

Institution-defined rubrics

Marking frameworks and learning outcomes set by the institution, not by the vendor.

Funded pilots and real-world evaluation

Removing financial barriers so institutions can generate real usage and feedback data.

Continuous feedback loops

Iteration driven by educator input, learner outcomes and post-deployment observation.

Governance and oversight

Clear controls around data, scenarios, escalation, safeguarding and review.

The next phase

Next month, we are launching MedAscend's new AI engine, built on this work and designed to improve the reliability of patient responses, assessment and feedback across both voice and video simulation.

The focus is not novelty for its own sake. It is delivering a solution institutions can actually adopt: consistency, disclosure control, rubric alignment, feedback specificity, modality choice and educator oversight, without pretending that one format, one scenario type or one assessment model fits every context.

Learners do not just need an AI that sounds human. They need feedback they can trust. They need responses that remain consistent across different questioning styles. They need assessment that reflects the standards used by real educators.

The standard

The future of AI simulation will not be defined by who adds AI fastest. It will be defined by who can build systems that are reliable, fair, governed, educationally aligned and capable of improving learning without widening inequity.

Not: “Can this AI patient talk?”

But: “Can this system support the learning objective, produce defensible feedback, protect learners, reduce bias, give educators control and generate evidence over time?”

That is the standard institutions should expect. It is also the standard companies in this space should hold themselves to.

References

Sources cited in this article

1. Kononowicz AA, Woodham LA, Edelbring S, Stathakarou N, Davies D, Saxena N, et al. Virtual patient simulations in health professions education: systematic review and meta-analysis by the Digital Health Education Collaboration. Journal of Medical Internet Research. 2019;21(7):e14676.
2. NICE. Evidence standards framework for digital health technologies.
3. Hamstra SJ, Brydges R, Hatala R, Zendejas B, Cook DA. Reconsidering fidelity in simulation-based training. Academic Medicine. 2014;89(3):387–392.
4. World Health Organization. Ethics and governance of artificial intelligence for health: WHO guidance. 2021.
5. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–453.
