Clinical Data Accuracy, HITRUST Certified

Clinical accuracy & AI model validation

The xCures Clinical Clarity Engine (the Engine) uses two complementary large language model (LLM) approaches to extract and structure clinical data from heterogeneous medical records. Both approaches are validated against clinically trained human reviewers before deployment, with real-time monitoring and logging in place.

How the Models Work

Schema-based extraction

Schema-based extraction applies named entity recognition (NER) and relation extraction (RE) to unstructured clinical documents (notes, discharge summaries, scanned records), converting free text into FHIR R4 and OHDSI-normalized structured data. Each element retains a link to its verbatim source text for full traceability.
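One way to picture the link between an extracted element and its source text is to store character offsets with each element and check that the quoted span really sits at that position. The sketch below is illustrative only: the `ExtractedElement` class, its field names, and the sample note are hypothetical and are not the Engine's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedElement:
    """Hypothetical record for one extracted field; not the Engine's real schema."""
    field: str
    value: str
    document_id: str
    start: int      # character offset where the verbatim span begins
    end: int        # character offset just past the end of the span
    verbatim: str   # exact source text the value was taken from


def is_traceable(element: ExtractedElement, source_text: str) -> bool:
    """Return True when the stored span matches the source text exactly."""
    return source_text[element.start:element.end] == element.verbatim


# Made-up example note and span.
note = "Patient started metformin 500 mg twice daily in March."
span = "metformin 500 mg"
start = note.index(span)

element = ExtractedElement(
    field="medication",
    value="metformin 500 mg",
    document_id="note-001",
    start=start,
    end=start + len(span),
    verbatim=span,
)

print(is_traceable(element, note))
```

A check like this fails as soon as the stored span and the document drift apart, which is what makes an element auditable after the fact.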

Checklist-based assertion

Checklist-based assertion answers specific clinical questions (e.g., “What was the patient’s cancer stage at initial pathological diagnosis?”) across the full longitudinal record using retrieval-augmented generation (RAG), returning structured outputs with source citations and evidence-hierarchy rules to resolve conflicting documentation. Checklists use structured clinical data and the clinical context of natural language in medical records to generate patient-level assertions.
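An evidence-hierarchy rule can be sketched as a priority lookup over document types. The ranking, the `Evidence` class, and the sample values below are assumptions made for illustration; the actual rules used by the Engine are not described here.

```python
from dataclasses import dataclass

# Hypothetical ranking: a lower number means higher evidentiary priority.
DOCUMENT_PRIORITY = {
    "pathology_report": 0,
    "operative_note": 1,
    "clinic_note": 2,
}


@dataclass(frozen=True)
class Evidence:
    document_id: str
    document_type: str
    value: str


def resolve(candidates: list[Evidence]) -> Evidence:
    """Pick the candidate from the highest-priority document type.

    Unknown document types sort last. Ties keep the earliest candidate,
    because min() returns the first minimal item. An empty list raises
    ValueError.
    """
    lowest = len(DOCUMENT_PRIORITY)
    return min(candidates, key=lambda e: DOCUMENT_PRIORITY.get(e.document_type, lowest))


# Two made-up documents that disagree about stage.
candidates = [
    Evidence("doc-17", "clinic_note", "Stage II"),
    Evidence("doc-04", "pathology_report", "Stage IIIA"),
]
chosen = resolve(candidates)
print(chosen.value, "cited from", chosen.document_id)
```

Returning the whole `Evidence` record rather than just the value keeps the citation attached to the answer.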

Validated performance

The table below reflects past human-validated performance for the four extractors and checklists listed, measured against clinically trained reviewers using a random 10% audit with third-reviewer arbitration. These results are based on a retrospective analysis of a defined historical dataset and do not guarantee future performance.

Extractor / Checklist	Accuracy	Precision	Recall	F1 Score
Medications	95.7%	97.5%	95.0%	96.3%
Surgical Procedures	96.6%	97.7%	98.8%	98.2%
Cancer Diagnosis	98.2%	98.7%	99.4%	99.0%
Lines of Therapy	97.0%	95.4%	99.8%	97.6%

Deployment threshold: We strive to achieve accuracy and precision scores of ≥95% before any extractor or checklist enters production.
Source: Stuhlmiller TJ et al. “A Scalable Method for Validated Data Extraction from Electronic Health Records with Large Language Models.” Submitted for peer review, 2026. Full methods, supplemental tables, and raw counts available on request.
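The deployment target can be read as a simple check on two of the four metrics. The sketch below applies it to the accuracy and precision figures from the table above; the `meets_threshold` helper is a hypothetical name, and because the threshold is described as a goal, a real release decision may involve more than this one comparison.

```python
THRESHOLD = 95.0  # percent, taken from the stated deployment target

# Accuracy and precision (percent) copied from the table above.
results = {
    "Medications": {"accuracy": 95.7, "precision": 97.5},
    "Surgical Procedures": {"accuracy": 96.6, "precision": 97.7},
    "Cancer Diagnosis": {"accuracy": 98.2, "precision": 98.7},
    "Lines of Therapy": {"accuracy": 97.0, "precision": 95.4},
}


def meets_threshold(scores: dict[str, float], threshold: float = THRESHOLD) -> bool:
    """Return True when both accuracy and precision reach the threshold."""
    return scores["accuracy"] >= threshold and scores["precision"] >= threshold


for name, scores in results.items():
    status = "meets target" if meets_threshold(scores) else "below target"
    print(f"{name}: {status}")
```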

How we validate

Validation is not a one-time exercise. Every extractor follows this lifecycle:

Human Review

Clinically trained human reviewers independently assess extracted outputs against source documents.

Classification

Reviewers classify each field as a true positive (TP), true negative (TN), false positive (FP), or false negative (FN). Discrepancies between reviewers are adjudicated by a third reviewer with access to clinical experts.

Hallucination detection

Only explicitly stated, verifiable extractions are counted as True Positives. Correct inferences not present verbatim in the source document are counted as errors.
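The four reviewer labels map onto the reported metrics through the standard confusion-matrix formulas. The sketch below uses made-up counts, and it assumes that a correct but unsupported inference is tallied as a false positive; the text says only that such inferences count as errors, so that specific bucket is an assumption.

```python
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion counts.

    Raises ZeroDivisionError if a denominator is zero (for example, no
    positive predictions at all).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Made-up counts. Under the strict rule, one correct-but-inferred value is
# counted as an error (assumed here to be a false positive).
strict = metrics(tp=90, tn=5, fp=3, fn=2)

# The same review if that inferred value had been accepted as a true positive.
lenient = metrics(tp=91, tn=5, fp=2, fn=2)

for name, value in strict.items():
    print(f"{name}: strict={value:.3f} lenient={lenient[name]:.3f}")
```

Comparing the two runs shows the practical effect of the rule: the strict count can only lower the reported scores, never raise them.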

Edge cases

Errors are captured as edge cases and used to iteratively refine prompts, retrieval parameters, and conflict-resolution rules.

Version control

Extraction models are version-controlled and support rollback. A/B testing across prompts, models, and hyperparameters guides ongoing refinement.

Known limitations

OCR quality: Accuracy of extraction from scanned or faxed documents depends on the quality of the documents being processed. Degraded scans may introduce errors or omissions that are unavoidable whether a human or system is reading the documents.
Semantic search coverage gaps: Both checklists and schema LLMs rely on semantic search and do not pull from all patient documents. Only the top N semantically matched documents are used, meaning relevant information in lower-ranked documents could be missed.
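The coverage gap follows directly from how top-N retrieval works: anything ranked below the cutoff never reaches the model. The sketch below shows this with toy two-dimensional vectors and cosine similarity. The embeddings, document names, and the choice of cosine similarity are all assumptions for illustration, not a description of the Engine's retrieval stack.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_n(query: list[float], docs: dict[str, list[float]], n: int) -> list[str]:
    """Return the names of the n documents most similar to the query."""
    ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
    return ranked[:n]


# Toy embeddings; real embeddings have hundreds of dimensions or more.
docs = {
    "pathology_report": [0.9, 0.1],
    "clinic_note": [0.8, 0.3],
    "faxed_referral": [0.5, 0.6],
}
query = [1.0, 0.0]

selected = top_n(query, docs, n=2)
missed = [name for name in docs if name not in selected]

print("used:", selected)
print("not consulted:", missed)
```

If the only mention of a relevant fact sits in a document that falls below the cutoff, no amount of downstream reasoning can recover it, which is why N is a tunable retrieval parameter.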

Security & compliance

The xCures Engine is HITRUST e1 certified and operates on AWS infrastructure. The xCures HITRUST certification inherits selected controls assessed in the AWS HITRUST r2 certification. Additionally, xCures periodically reviews the AWS ISO 27001 certification and SOC 2 attestation report to validate ongoing adherence to xCures compliance and security requirements.

Certifications and attestations

The xCures Clinical Clarity Engine is HITRUST e1 certified (an upgrade to HITRUST i1 certification is in progress). It operates on AWS infrastructure, which holds the following certifications and attestations:

HITRUST r2 (certification)
ISO 27001:2022 (certification)
SOC 2 Type 2 (attestation)

As part of the HITRUST certification process, xCures can inherit selected AWS HITRUST r2 controls and include them as evidence in its own HITRUST certification program.

Security controls

Control	Detail
Encryption	In transit (TLS) and at rest
Access Controls	RBAC with Least Privilege and Minimum Necessary principles, MFA, and SSO
Audit Logging	Immutable logs via AWS + Datadog, SIEM monitoring
Data Deletion	In accordance with contractual requirements

Data governance & interoperability standards

Clinical data extracted by xCures is mapped to established healthcare standards to support interoperability, traceability to its source, and compatibility with downstream workflows.

Standards compliance

Standard	Detail
FHIR R4	All extracted data mapped to FHIR R4 resources for downstream interoperability
OHDSI / OMOP	Concepts normalized to OHDSI Standardized Vocabularies (SNOMED, LOINC, RxNorm)
mCODE	Oncology-specific data elements aligned to HL7 Minimal Common Oncology Data Elements profile
HIPAA	Compliance validated through an annual HIPAA evaluation and the HITRUST certification program

Data provenance

Every extracted data element is anchored to its source document. This means:

Structured data fields carry provenance metadata identifying the originating clinical document
Checklist-based assertions include citations listing the specific documents consulted and the document hierarchy used (e.g., pathology report prioritized over clinic note for cancer staging)
Source documents are preserved in the form they were received by the xCures Engine from the provider (CCDA, XML, PDF, image) alongside the normalized structured output
Full CRUD (Create, Read, Update, Delete) audit logs are maintained across all data access and modification events

Frequently asked

Is xCures HIPAA compliant?

Yes. Operating as your business associate, xCures® and the xCures Clinical Clarity Engine are HIPAA compliant. xCures conducts an annual HIPAA evaluation to validate compliance and performs further assessments as part of its HITRUST certification program.

Is xCures HITRUST certified?

Yes. The xCures Clinical Clarity Engine received HITRUST e1 certification in 2025, and in 2026 it is being upgraded to HITRUST i1 certification. Separately, it operates on AWS infrastructure, which holds HITRUST r2 and ISO 27001:2022 certifications and maintains a SOC 2 Type 2 attestation.

How is xCures extraction accuracy validated?

Every extractor is validated against clinically trained human reviewers before deployment. Reviewers classify each field as a true or false positive or negative, with discrepancies resolved by a third reviewer. Only explicitly stated, verifiable extractions count as correct, so inferred data is treated as an error.

How does xCures keep outputs traceable to the source?

Every extracted data element is anchored to its source document. Structured fields carry provenance metadata, and checklist assertions cite the specific documents used. Source documents are preserved in their original form alongside the structured output.

Where is xCures data hosted and how is it secured?

xCures operates on AWS infrastructure with encryption in transit and at rest. Access follows role-based controls with least-privilege and minimum-necessary principles, MFA, and SSO. Immutable audit logs run through AWS and Datadog with SIEM monitoring.

Contact

xCures trust & transparency