Comparison of Postmarketing Findings vs the Initial Clinical Validation Findings of a Thyroid Nodule Gene Expression Classifier: A Systematic Review and Meta-analysis
Valderrabano P, Hallanger-Johnson JE, Thapa R, Wang X, McIver B
Importance
In the United States, the most widely used molecular test for evaluating cytologically indeterminate thyroid nodules is the Afirma gene expression classifier (GEC).
Objective
To evaluate the GEC's diagnostic performance using a novel approach that assesses whether the findings of the initial validation study are consistent with the results of postmarketing studies.
Data Sources
PubMed was systematically searched from inception through October 26, 2017, using the terms "gene expression classifier" or "Afirma" or "GEC" and "thyroid."
Study Selection
Studies included were those in which the GEC diagnostic performance could be calculated on consecutively resected cytologically indeterminate thyroid nodules.
Data Extraction and Synthesis
Two observers independently assessed study eligibility and risk of bias using the quality assessment tool for observational cohort and cross-sectional studies of the National Heart, Lung, and Blood Institute. Summary data were extracted by a reviewer and reviewed independently by another. Study authors were contacted if missing data were needed. Data were pooled using a random-effects model. PRISMA and MOOSE guidelines were followed.
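The abstract does not say which random-effects estimator was used. As a minimal sketch, assuming the common DerSimonian-Laird method applied to per-study log diagnostic odds ratios, pooling could look like the following. The function names and the continuity correction are illustrative choices, not taken from the study.

```python
import math

def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its variance from one study's 2x2 table.

    Assumption: a 0.5 continuity correction is added to every cell only
    when at least one cell is zero.
    """
    cells = [tp, fp, fn, tn]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    estimate = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return estimate, variance

def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    k = len(estimates)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2
```

Exponentiating the pooled estimate and the bounds `pooled ± 1.96 * se` gives a pooled DOR with an approximate 95% CI on the original scale.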
Main Outcomes and Measures
Evaluation of the linear correlation between the benign call rate (BCR) and the positive predictive value (PPV).
Results
Of the 137 retrieved titles, 19 (13.9%) were included, comprising a total of 2568 thyroid nodules. Based on a simulation using the sensitivity and specificity reported in the initial validation study, the observed BCR and PPV values in postmarketing studies would have to be explained by different underlying prevalence rates of cancer (15% vs 30%), which is an impossible event. Furthermore, the overall correlation between BCR and PPV for independent studies fell outside the PPV 95% CI of the initial validation study (95% CI, 0.17-0.32) at the BCR of pooled independent studies (0.45) and was just at the limit of the BCR 95% CI of the initial validation study (95% CI, 0.32-0.45) at the PPV of pooled independent studies (0.45). The diagnostic performance was statistically significantly better for atypia or follicular lesions of undetermined significance (diagnostic odds ratio [DOR], 5.67; 95% CI, 4.23-7.60) compared with follicular neoplasms (DOR, 2.24; 95% CI, 1.45-3.47).
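The simulation argument above can be sketched from the standard definitions of BCR and PPV. With fixed sensitivity and specificity, each observed value implies one cancer prevalence, so the two implied prevalences should agree. The sensitivity (0.92) and specificity (0.52) below are assumed inputs, since this abstract does not state them, and should be replaced with the values from the initial validation study. The pooled BCR and PPV (both 0.45) are taken from the text.

```python
def benign_call_rate(prevalence, sensitivity, specificity):
    """Fraction of nodules called benign: false negatives plus true negatives."""
    return prevalence * (1 - sensitivity) + (1 - prevalence) * specificity

def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of suspicious calls that are truly malignant."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

def prevalence_from_bcr(bcr, sensitivity, specificity):
    """Invert benign_call_rate to find the prevalence implied by a BCR."""
    return (specificity - bcr) / (sensitivity + specificity - 1)

def prevalence_from_ppv(ppv, sensitivity, specificity):
    """Invert positive_predictive_value to find the prevalence implied by a PPV."""
    numerator = ppv * (1 - specificity)
    return numerator / (sensitivity * (1 - ppv) + numerator)

# Assumed test characteristics (not stated in this abstract).
SENS, SPEC = 0.92, 0.52
# Pooled postmarketing values reported in the abstract.
POOLED_BCR, POOLED_PPV = 0.45, 0.45

p_from_bcr = prevalence_from_bcr(POOLED_BCR, SENS, SPEC)
p_from_ppv = prevalence_from_ppv(POOLED_PPV, SENS, SPEC)
print(f"Prevalence implied by BCR: {p_from_bcr:.3f}")
print(f"Prevalence implied by PPV: {p_from_ppv:.3f}")
```

With these assumed inputs, the implied prevalences come out near 16% (from the BCR) and 30% (from the PPV), in line with the 15% vs 30% contrast reported above. A single population cannot have both prevalences at once, which is the inconsistency the authors describe.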
Conclusions and Relevance
The findings suggest that the initial validation study cohort was not representative of the populations in whom the GEC has been used, calling into question its reported diagnostic performance, including its negative predictive value.
JAMA Otolaryngol Head Neck Surg. 2019; doi: 10.1001/jamaoto.2019.1449