GammaMetric — Acquisition-State Reliability for Imaging AI

Live Engine Output

Real inputs. Real outputs.
No mock data.

Three studies run through the sensitivity engine. Each score is what the API returns — your system decides what to do with it.

■ RED — Immediate Alert Siemens SOMATOM · 5.0mm · B40f · 8.2 mGy

Est. Relative Sensitivity 56.2% · baseline 78.2% · CI [51.2–61.2%]

Degradation −22pp versus validated baseline

Drivers

Slice thickness 5.0mm −13.2pp
CTDIvol 5.0 mGy −2.3pp

Diameter Uncertainty

Mean shift ↑ +1.7mm · 95% CI width 9.9mm — nodule sizes may be overestimated under these conditions

Score returned: RED. Nodules in the 3–6mm range are about 28% less likely (in relative terms) to be detected under current acquisition conditions. Diameter measurements may be overestimated by 1.7mm on average. Your system decides what to surface.
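The diameter figures in this example can be turned into a rough interval around a reported nodule size. The sketch below is illustrative only: it assumes the 9.9mm CI width is symmetric around the bias-corrected value, which the output above does not state, and the function name is hypothetical rather than part of the API.

```python
# Figures from the RED example above. Treating the interval as symmetric
# around the bias-corrected value is an assumption made for illustration.
MEAN_SHIFT_MM = 1.7   # reported mean diameter shift (overestimation)
CI_WIDTH_MM = 9.9     # reported full width of the 95% CI

def diameter_interval(measured_mm: float) -> tuple[float, float, float]:
    """Return (bias-corrected estimate, lower bound, upper bound) in mm."""
    corrected = measured_mm - MEAN_SHIFT_MM
    half_width = CI_WIDTH_MM / 2
    lower = max(0.0, corrected - half_width)  # a diameter cannot be negative
    upper = corrected + half_width
    return corrected, lower, upper

corrected, lower, upper = diameter_interval(6.0)
print(f"corrected {corrected:.2f} mm, 95% CI [{lower:.2f}, {upper:.2f}] mm")
```

For a nodule reported at 6.0mm, the interval under these assumptions spans most of the 3–6mm range, which is why the size uncertainty matters for that group.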

Example downstream action — hospital-side email notification built on the RED score

[Image: RED Acquisition Reliability Warning alert]

■ YELLOW — Daily Digest GE Revolution · 3.75mm · B30f · 7.5 mGy

Est. Relative Sensitivity 68.9% · baseline 78.2% · CI [63.9–73.9%]

Degradation −9.3pp versus validated baseline

Drivers

Slice thickness 3.75mm −8.1pp
Dose 7.5 mGy −1.2pp

Score returned: YELLOW. Acquisition conditions are moderately outside the validated envelope. Your system decides whether to flag or surface the result normally.

■ GREEN — No Action Philips IQon · 1.25mm · B30f · 9.0 mGy

Est. Relative Sensitivity 77.7% · baseline 78.2% · CI [72.7–82.7%]

Degradation −0.5pp within normal range

Drivers

Dose 9.0 mGy −0.5pp

Score returned: GREEN. Acquisition parameters within the characterized validation envelope. Proceed normally.

Note — These are real outputs from the live engine. Sensitivity deltas are derived from 154-case LIDC-IDRI perturbation experiments (arXiv:2603.26785). Baseline: MONAI RetinaNet, LUNA16-trained, v1.0.0.
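The relationship between the absolute drop (in percentage points) and the relative drop quoted for the RED study can be checked from the three examples above. This is a small arithmetic sketch using only the published baseline and estimates; the function name is illustrative and is not part of the API.

```python
BASELINE = 78.2  # validated baseline sensitivity (%), from the examples above

# Estimated sensitivity (%) for the three example studies.
studies = {"RED": 56.2, "YELLOW": 68.9, "GREEN": 77.7}

def degradation(estimated: float, baseline: float = BASELINE) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute_pp = baseline - estimated
    relative_pct = 100.0 * absolute_pp / baseline
    return absolute_pp, relative_pct

for tier, estimated in studies.items():
    absolute_pp, relative_pct = degradation(estimated)
    print(f"{tier}: {absolute_pp:.1f}pp absolute, {relative_pct:.1f}% relative")
```

For the RED study, the 22.0pp absolute drop corresponds to a relative drop of about 28%, matching the figure quoted in that example.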

View Sample Detectability Report →

Capabilities

One engine. Two buyers.
Same acquisition-state signal.

The underlying engine is shared. AI vendors use it per-scan before surfacing a result. Health systems use it site-wide for governance and protocol QA. Same physics. Different surfaces.

AI Vendors — Available Now

Reliability API

Per-scan reliability score (GREEN / YELLOW / RED) via REST API or DICOM webhook — your system decides how to use it
Physics-grounded sensitivity prediction from acquisition metadata — slice thickness, kernel, dose, reconstruction state
Pixel-based acquisition fingerprinting: detects conditions that standard DICOM metadata cannot expose, such as FBP versus iterative reconstruction, for which ConvolutionKernel reads identically (AUC 0.995 on independent phantom validation)
Detection-aware scoring: the tier is elevated when AI confidence falls in the acquisition-sensitive regime and the acquisition does not match validated conditions
Per-nodule comparability scoring across prior and current reconstruction conditions
Full audit log — every study classified and timestamped
PDF reliability report on demand — suitable for post-market surveillance documentation
Based on published research: arXiv:2603.26785, under review at Academic Radiology

Request a Demo →

Health Systems — Available Now

Site Reliability Monitoring

Passive Orthanc DICOM listener — every study classified automatically, no workflow change
Per-study reliability record — acquisition parameters, sensitivity estimate, full audit trail
Automated site reliability report — acquisition trends, protocol drift, sensitivity impact over time
Designed for post-market surveillance, Joint Commission QA, and CHAI governance programs
Answers the question regulators are starting to ask: is your AI performing as validated at this site, with these protocols?
Diameter uncertainty quantification — mean shift and 95% CI per acquisition state

Site dashboard — coming soon

Request a Pilot →

Also Available — Free Tool

CT Dose Analytics & Leapfrog Reporting

Leapfrog Section 8B, ACR DIR benchmarking, protocol outlier detection. Free at dose.gammametric.com.

Try It Free →

Case Studies

The research behind the work.

Two analyses showing exactly what GammaMetric measures — and what it finds.

Protocol Optimization

Your Protocols Are Costing You on Three Fronts Simultaneously

Dose compliance. Image quality. AI performance. Most CT protocol reviews address one. This analysis shows how the parameters interact — and which ones actually matter.

5mm slice thickness: −13.2pp AI sensitivity loss
Soft reconstruction kernel: −10.5pp AI sensitivity loss
mAs reduction: only −4pp — the least destructive lever
Leapfrog compliance and AI performance are different problems

Read the case study →

AI Validation

How Your AI Degrades After Deployment

Post-deployment validation of a CT lung nodule detection algorithm across six real-world imaging perturbations. Based on LIDC-IDRI (154 cases). Methodology: arXiv:2603.26785.

Baseline sensitivity: 84.8% under reference protocol
Combined perturbation: ~65–68%, a gap of roughly 17–20pp
Effect most pronounced in the 3–6mm nodule range
Vendor benchmarks do not reflect site-specific conditions

Read the case study →

The Problem

AI models are validated in one imaging environment
and deployed into another.

The gap between validation conditions and real-world deployment is where AI performance quietly degrades — and where accountability belongs to whoever ships the model.

01

DICOM Metadata Is Not Enough

ConvolutionKernel reads identically for FBP and iterative reconstruction on major scanners. Slice thickness varies across sites. Dose drifts without notice. Standard pipelines are blind to acquisition conditions that materially affect model performance.

02

Vendor Benchmarks Don't Reflect Deployment

Performance submitted for FDA clearance is tested at controlled dose levels and standard protocols. Real-world sites run lower doses, thicker slices, and varied reconstruction. The gap between "cleared" and "deployed" has rarely been measured until now.

03

Failures Get Blamed on the Model

When acquisition conditions push a study outside the validated envelope, the AI result is unreliable — but the model gets blamed. GammaMetric quantifies which studies are operating outside that envelope before a result is surfaced.

Integration

One API call.
Before you surface a result.

GammaMetric runs passively in your pipeline. Every study gets scored before your AI result is surfaced — your system decides what to do with the signal.

01

Send the Study

Forward DICOM headers (or pixel data) to the GammaMetric API via webhook or REST. Only acquisition parameters and pixel patches are used — no PHI transmitted, no image storage.
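One way to keep PHI out of the request is to whitelist acquisition fields before anything leaves the site. The sketch below is a hedged illustration of that idea. The keywords are standard DICOM attribute keywords, but the whitelist, the payload shape, and the function name are assumptions, not the documented request format.

```python
import json

# Standard DICOM keywords for acquisition attributes. The choice of fields
# and the JSON layout are assumptions made for this illustration.
ACQUISITION_KEYWORDS = (
    "Manufacturer",
    "ManufacturerModelName",
    "SliceThickness",
    "ConvolutionKernel",
    "CTDIvol",
)

def build_payload(header: dict) -> str:
    """Keep only whitelisted acquisition fields and serialize them as JSON."""
    acquisition = {k: header[k] for k in ACQUISITION_KEYWORDS if k in header}
    return json.dumps({"acquisition": acquisition}, sort_keys=True)

# Example header as a plain dict; the patient values are placeholders.
header = {
    "PatientName": "DOE^JANE",   # PHI: dropped by the whitelist
    "PatientID": "12345",        # PHI: dropped by the whitelist
    "Manufacturer": "SIEMENS",
    "SliceThickness": 5.0,
    "ConvolutionKernel": "B40f",
    "CTDIvol": 8.2,
}
payload = build_payload(header)
print(payload)
```

A whitelist is used rather than a blacklist so that any attribute not explicitly listed, including private tags, is excluded by default.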

02

Get a Reliability Score

Each study is scored against your model's characterized acquisition envelope. Slice thickness, kernel, dose, and reconstruction state are all assessed. You get GREEN / YELLOW / RED plus a sensitivity delta.

03

Your System Decides

Show the result. Suppress it. Flag it. Route it to secondary review. GammaMetric returns the signal — you own the decision. No clinical logic baked in, no radiologist-facing UI required.
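Because the decision stays on your side, the routing logic lives in your own code. A minimal sketch of such a policy follows. The action names and the mapping are hypothetical examples of a site policy, not behavior defined by GammaMetric.

```python
from enum import Enum

class Tier(Enum):
    GREEN = "GREEN"
    YELLOW = "YELLOW"
    RED = "RED"

# Hypothetical site policy: each site chooses its own actions.
POLICY = {
    Tier.GREEN: "show",
    Tier.YELLOW: "flag",
    Tier.RED: "secondary_review",
}

def route(score: str) -> str:
    """Map a returned score to a downstream action; unknown scores fail safe."""
    try:
        return POLICY[Tier(score.upper())]
    except ValueError:
        # An unrecognized score is treated like the most cautious tier.
        return "secondary_review"

for score in ("GREEN", "YELLOW", "RED"):
    print(score, "->", route(score))
```

Keeping the policy in a single table makes it straightforward to review and change without touching the integration code.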

On Demand

PDF Reliability Report

Generate a 7-page site-specific reliability report from any study — acquisition profile, sensitivity degradation analysis, and prioritized protocol recommendations. Suitable for post-market surveillance documentation.

Request a Demo →

Also Available

CT Dose Analytics

Self-serve CT dose monitoring at dose.gammametric.com. Leapfrog Section 8B reporting, ACR DIR benchmarking, drift alerts. Free to use.

Try It Free →

Report Contents

Everything your quality
program needs

Compliance

Leapfrog Section 8B Reporting

Median DLP for routine head and abdomen-pelvis CT across all five Leapfrog pediatric age groups (<1, 1–4, 5–9, 10–14, 15–17) — formatted and ready for Section 8B reference.
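The grouping behind this table can be sketched as follows: bin each pediatric exam by age, then take the median DLP per exam type and age group. The record layout and sample values below are made up for illustration, and the code is a sketch of the calculation, not the report generator itself.

```python
from collections import defaultdict
from statistics import median
from typing import Optional

# The five pediatric age groups, as half-open intervals [low, high) in years.
AGE_GROUPS = [(0, 1, "<1"), (1, 5, "1-4"), (5, 10, "5-9"),
              (10, 15, "10-14"), (15, 18, "15-17")]

def age_group(age_years: float) -> Optional[str]:
    """Return the pediatric age-group label, or None for adults."""
    for low, high, label in AGE_GROUPS:
        if low <= age_years < high:
            return label
    return None

def median_dlp_by_group(exams):
    """exams: iterable of (age_years, exam_type, dlp_mgy_cm) tuples."""
    grouped = defaultdict(list)
    for age, exam_type, dlp in exams:
        label = age_group(age)
        if label is not None:  # adult exams are excluded from this table
            grouped[(exam_type, label)].append(dlp)
    return {key: median(values) for key, values in grouped.items()}

# Illustrative records only; not real patient or facility data.
exams = [
    (0.5, "head", 200.0), (0.8, "head", 260.0),
    (3, "head", 300.0), (3, "head", 340.0), (4, "head", 500.0),
    (12, "abdomen-pelvis", 210.0), (40, "head", 900.0),
]
for key, value in sorted(median_dlp_by_group(exams).items()):
    print(key, value)
```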

Benchmarking

ACR DIR Benchmark Comparisons

Your facility's dose percentiles compared against ACR Dose Index Registry national reference levels. Clear status flags — Excellent, Acceptable, or Above Benchmark — for every body region.

Quality

Outlier Detection

Automatic identification of exams with unusually high DLP — repeat acquisitions, wrong protocols, or multi-phase studies — with transparent methodology notes for your physics team.
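The detection method is not specified here. As one common possibility, high-DLP outliers can be flagged with an interquartile-range (Tukey) fence. The sketch below assumes that rule and a multiplier of 1.5; both are illustrative assumptions rather than the methodology used in the report.

```python
from statistics import quantiles

def high_dlp_outliers(dlps, k=1.5):
    """Return values above Q3 + k * IQR (upper Tukey fence)."""
    q1, _, q3 = quantiles(dlps, n=4)  # quartiles of the DLP distribution
    upper_fence = q3 + k * (q3 - q1)
    return [value for value in dlps if value > upper_fence]

# Illustrative DLP values (mGy·cm) for a single protocol; not real data.
dlps = [300, 310, 320, 330, 340, 350, 360, 1200]
print(high_dlp_outliers(dlps))
```

Only the upper fence is checked because the stated goal is to find unusually high DLP, such as repeat acquisitions or multi-phase studies.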

Optimization

Protocol Observations

Physicist observations on protocol consistency, scanner variability, and dose reduction opportunities — useful context for your quality improvement program beyond compliance reporting.

Trend

Dose Trends Across Reporting Period

Dose trends visualized across your full reporting period. Identify protocol changes, scanner drift, or technologist variability — supporting ongoing QA program development beyond Leapfrog season.

Deliverable

Professional PDF Report

Publication-quality output with percentile tables, benchmark charts, methodology documentation, and your facility name — suitable for quality committee presentation or Leapfrog submission reference.

Context

The performance gap
is measurable

GammaMetric's own pilot study quantifies how acquisition variability affects imaging AI — and why protocol optimization matters beyond compliance.

1 in 6

Patients who receive a different AI-derived Lung-RADS follow-up recommendation between full-dose and quarter-dose reconstructions of the same scan (n=183, LIDC-IDRI; replicated on real projection-domain data, AAPM Mayo).

~19pp

Sensitivity drop at 5mm slice thickness versus standard. The gap between your protocol and the vendor's validation conditions is rarely measured.

0.995 AUC

Domain separability between FBP and iterative reconstruction on phantom data — with identical DICOM ConvolutionKernel tags. Standard metadata pipelines cannot detect this condition. Pixel analysis can.

Pricing

Built for vendors.
Priced per site.

AI monitoring is the primary product. CT dose analytics runs alongside it, free.

AI Reliability Monitoring

Contact
for Pricing

terms vary by site — reach out to discuss

Per-scan reliability score via REST API or DICOM webhook
Physics-grounded sensitivity prediction — slice thickness, kernel, dose, reconstruction state
Pixel-based acquisition fingerprinting — detects conditions invisible to DICOM metadata
Detection-aware scoring — confidence-aware tier elevation on acquisition mismatch
PDF reliability report on demand — suitable for post-market surveillance documentation
Full audit log — every study classified and timestamped
Based on published research — defensible for regulatory review

Request a Demo →

CT Dose Analytics

$0

free tool · always available

Self-serve at dose.gammametric.com
Adult CT DLP percentiles — all body regions
Pediatric CT — all five Leapfrog age strata
ACR DIR national benchmark comparisons
Drift alerts and QA acknowledgment workflow
Physicist-reviewed reports available — $1,500/facility/year

Try It Free →

FAQ

Common questions

What data format do you accept?

CSV exports from Radimetrics (Bayer), DoseWatch (GE), or any dose monitoring system. Manual PACS query exports are also accepted. Common column naming conventions are auto-detected. Non-standard formats are welcome — format mapping is handled before analysis begins.
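Column auto-detection of this kind is often done by matching normalized header names against a table of known aliases. The sketch below shows the general idea. The alias lists are hypothetical and do not reflect the actual column names exported by Radimetrics, DoseWatch, or any other system.

```python
import csv
import io

# Hypothetical alias table: canonical field -> accepted header spellings.
ALIASES = {
    "dlp": {"dlp", "total dlp", "dlp (mgy-cm)", "dlp_mgy_cm"},
    "age": {"age", "patient age", "age (years)"},
    "exam": {"exam", "exam description", "study description", "protocol"},
}

def detect_columns(fieldnames):
    """Map canonical field names to the matching header in the file."""
    mapping = {}
    for name in fieldnames:
        key = name.strip().lower()
        for canonical, aliases in ALIASES.items():
            if key in aliases:
                mapping[canonical] = name
    return mapping

# Illustrative export with one made-up row.
sample = "Study Description,Patient Age,Total DLP\nCT HEAD,7,412.5\n"
reader = csv.DictReader(io.StringIO(sample))
columns = detect_columns(reader.fieldnames)
print(columns)
```

Headers that match no alias are simply left out of the mapping, which is where manual format mapping would take over.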

Is my data secure?

De-identifying patient data before sending is strongly recommended — the analysis only requires dose metrics, exam descriptions, and patient age. No PHI is needed or requested. Raw data files are not retained after analysis is complete.

What Leapfrog section does this cover?

Section 8B: Pediatric Computed Tomography (CT) Radiation Dose. This requires reporting median DLP for routine head and abdomen-pelvis CT across five pediatric age groups. Reports provide exactly those data points, plus adult CT analysis as a value-add for your quality program.

Is this a replacement for a medical physicist?

No. Every report includes review by a diagnostic medical physicist, but final interpretation, regulatory compliance, and clinical protocols remain the responsibility of your institution and its qualified physics staff. GammaMetric is a reporting and analytics service, not a substitute for physics oversight.

How are benchmarks determined?

Dose percentiles are compared against ACR Dose Index Registry (DIR) national reference levels. The comparison values are maintained and updated by a diagnostic medical physicist to reflect current national practice.

What does an AI validation engagement look like?

This is a standalone validation engagement, separate from the continuous monitoring product. You provide de-identified DICOM data or model outputs across your facility's acquisition conditions (dose levels, slice thicknesses, protocols). GammaMetric applies systematic degradation and inference to quantify sensitivity loss and failure modes under each condition. The result is delivered as a physicist-reviewed report with methodology documentation suitable for quality committee or regulatory review.

What does the API actually return?

A per-scan reliability score (GREEN / YELLOW / RED), estimated sensitivity delta relative to your model's validation baseline, acquisition state characterization (dose, slice thickness, kernel, reconstruction conditions), and any pixel-fingerprinting findings the DICOM metadata did not expose. Your system decides what to do — show the AI result, suppress it, flag it, or route it. Every scored study is logged with its full parameter set for audit trail purposes.
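On the client side, the fields described above could be parsed into a small typed record. The sketch below assumes a JSON response; every field name in it is hypothetical, since the response schema is not given here. The example values are taken from the YELLOW study shown earlier.

```python
import json
from dataclasses import dataclass, field

@dataclass
class ReliabilityResult:
    score: str                      # "GREEN", "YELLOW", or "RED"
    sensitivity_delta_pp: float     # change versus the validation baseline
    acquisition: dict = field(default_factory=dict)
    fingerprint_findings: list = field(default_factory=list)

def parse_result(body: str) -> ReliabilityResult:
    """Parse a response body; field names here are assumed, not documented."""
    data = json.loads(body)
    score = data["score"]
    if score not in {"GREEN", "YELLOW", "RED"}:
        raise ValueError(f"unexpected score: {score!r}")
    return ReliabilityResult(
        score=score,
        sensitivity_delta_pp=float(data["sensitivity_delta_pp"]),
        acquisition=data.get("acquisition", {}),
        fingerprint_findings=data.get("fingerprint_findings", []),
    )

# Example body built from the YELLOW study above; the layout is an assumption.
body = json.dumps({
    "score": "YELLOW",
    "sensitivity_delta_pp": -9.3,
    "acquisition": {"slice_thickness_mm": 3.75, "kernel": "B30f", "ctdi_vol_mgy": 7.5},
    "fingerprint_findings": [],
})
result = parse_result(body)
print(result.score, result.sensitivity_delta_pp)
```

Rejecting unknown scores at parse time keeps an unexpected value from silently falling through downstream routing.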

You know when your scanners are failing.
Do you know when your AI is?

See it
working.
