Potential Limitations of Clinical Criteria for the Diagnosis of Idiopathic Pulmonary Fibrosis/Cryptogenic Fibrosing Alveolitis
Peckham R.M. (a) · Shorr A.F. (b) · Helman Jr D.L. (b)
Departments of (a) Internal Medicine and (b) Pulmonary and Critical Care Medicine, Walter Reed Army Medical Center, Washington, D.C., USA
Background: The need to perform surgical lung biopsy (SLB) in all cases of suspected idiopathic pulmonary fibrosis/cryptogenic fibrosing alveolitis (IPF/CFA) is controversial. The American Thoracic Society (ATS) and the European Respiratory Society (ERS) recently endorsed explicit clinical criteria for the diagnosis of IPF/CFA in the absence of SLB. Prior studies evaluating clinical criteria for the diagnosis of IPF/CFA have been limited in that either they were performed by clinicians with expertise in the diagnosis of IPF/CFA or they did not utilize explicit diagnostic criteria. We investigated the accuracy of the ATS/ERS criteria when applied in a general pulmonary medicine setting.
Objectives: To determine the interobserver variability of clinical criteria for the diagnosis of IPF/CFA.
Methods: This was a retrospective, blinded evaluation by three board-certified pulmonary physicians without extensive experience in the evaluation of IPF/CFA, performed at a United States Army tertiary care academic medical center. Patients referred for surgical lung biopsy as part of a diagnostic evaluation of interstitial lung disease (ILD) were evaluated. The physicians reviewed high-resolution computed tomography scans of the chest (HRCT) and clinical data for each patient. The physicians were blinded to all other data and to the opinions of other study participants. Employing the histologic presence of usual interstitial pneumonia (UIP) coupled with appropriate clinical findings as the gold standard for a diagnosis of IPF/CFA, we determined the accuracy and interobserver variability for a diagnosis of IPF/CFA based on HRCT alone and based on the ATS/ERS clinical criteria.
Results: The sensitivity and positive predictive value for an HRCT diagnosis of IPF/CFA were 71% each, while specificity and negative predictive value were 67% each. For the ATS/ERS criteria, sensitivity, specificity, positive predictive value and negative predictive value were 71, 75, 77 and 69%, respectively. The interobserver variability, expressed as a kappa coefficient, was 0.59 for HRCT and 0.53 for the ATS/ERS criteria.
Conclusions: Both HRCT and the ATS/ERS clinical criteria may lead to misdiagnosis of patients with ILD. Further studies are needed to fully characterize the accuracy of these tests when applied in a routine pulmonary medicine practice setting.
Copyright © 2004 S. Karger AG, Basel
Cryptogenic fibrosing alveolitis (CFA), also known as idiopathic pulmonary fibrosis (IPF), is a distinct interstitial lung disease (ILD) of unknown etiology characterized histologically by the pattern of usual interstitial pneumonia (UIP). The gold standard for diagnosis of IPF/CFA requires appropriate clinical findings coupled with a surgical lung biopsy (SLB) demonstrating UIP. For a variety of reasons, not all patients with suspected IPF/CFA undergo SLB. Clinicians have instead diagnosed such patients on clinical grounds alone. Recent efforts directed at determining the value of a clinical diagnosis of IPF/CFA absent SLB culminated in joint publication of explicit clinical criteria by the American Thoracic Society (ATS) and the European Respiratory Society (ERS) (table 1).
Table 1. ATS/ERS criteria for diagnosis of IPF/CFA in absence of surgical lung biopsy
To date, practitioners recognized for their expertise in the area of ILD have performed much of the research investigating the utility of clinical criteria for the diagnosis of IPF/CFA. Raghu et al., using SLB as a gold standard, demonstrated that the sensitivity and specificity of their predetermined clinical criteria were 62 and 97%, respectively. A subsequent multicenter study utilizing expert opinion rather than predefined clinical criteria achieved an overall sensitivity and specificity for a clinical diagnosis of IPF/CFA of 72 and 84%, respectively. In part because of the relatively low incidence of IPF/CFA outside large referral centers, the ATS/ERS criteria for a clinical diagnosis of IPF/CFA have not been validated in the community setting. We therefore conducted an investigation to help define the utility of the ATS/ERS criteria when applied by pulmonologists in general practice.
Materials and Methods
This retrospective evaluation of cases of ILD was performed at a United States Army teaching hospital. We identified potential cases from an institutional operative database of patients who had undergone SLB (either an open lung biopsy or a video-assisted thoracoscopic procedure) for evaluation of ILD between the years 1995 and 2000. A single investigator reviewed the medical records for each patient and recorded, in a standardized fashion, data pertinent to the diagnostic evaluation of IPF/CFA. Available high-resolution thoracic computed tomography (HRCT) scans from the preoperative period were obtained as well. Fifteen cases were excluded because complete clinical or radiographic data were unavailable. Six (40%) of these fifteen patients had a SLB revealing UIP histology, two had sarcoidosis, two had end-stage fibrosis, one had acute interstitial pneumonia (AIP), one had non-specific interstitial pneumonia (NSIP), one had respiratory bronchiolitis interstitial lung disease (RBILD), one had Wegener’s granulomatosis, and one had a biopsy revealing only atelectasis and emphysematous changes. In addition, two cases were excluded because IPF/CFA was not a diagnostic consideration at the time of the SLB. One of these was a 32-year-old male with rapid onset of respiratory symptoms whose SLB revealed lymphangitic spread of adenocarcinoma. The other was an 85-year-old male without respiratory complaints who had a known history of chronic lymphocytic leukemia (CLL) and whose SLB revealed CLL.
The Department of Clinical Investigation at Walter Reed Army Medical Center approved this study. No extramural funding was utilized.
Three board-certified pulmonologists independently reviewed each case; none of the three had participated in the case identification process. All three physicians practiced in a tertiary care referral setting, but none had extensive experience with ILD. All evaluators were familiar with the ATS/ERS criteria for a clinical diagnosis of IPF/CFA. Evaluators initially reviewed the HRCT and recorded whether or not the scan supported a diagnosis of IPF/CFA. HRCT interpretation was based upon each pulmonologist’s knowledge of the radiographic features characteristic of IPF/CFA as delineated by the ATS/ERS criteria. After recording their decision, the pulmonologists next reviewed clinical data, presented in a standardized format, for each case. The clinical data were similar to those available to the expert reviewers in Hunninghake’s project and included: patient age; gender; presence and duration of dyspnea; presence and duration of cough; history of fever, weight loss, myalgias, arthralgias, rash or arthritis; past medical history; tobacco history; history of exposure to agents known to cause interstitial lung disease; presence or absence of rales on lung examination; presence of rash or synovitis on examination; presence or absence of digital clubbing; room air pulse oximetry reading and/or arterial blood gas; pulmonary function test (PFT) results; and results of bronchoscopy (i.e. results of bronchoalveolar lavage and transbronchial biopsy). Based upon the HRCT and patient data, evaluators determined whether or not the patient met ATS/ERS criteria for a clinical diagnosis of IPF/CFA (table 1). We did not require the reviewers to specify which of the ATS/ERS criteria were or were not present in each case, nor did we mandate that the reviewers specify which HRCT features consistent with a diagnosis of IPF/CFA were present or absent.
For each case, we blinded the evaluators to SLB results, any prior interpretation of radiographic studies, and to the opinions of other study participants. Evaluators were also unaware of the incidence of UIP/IPF within the cohort.
Upon completion of the review process, we recorded the radiographic and clinical diagnoses for each patient. When the clinicians disagreed as to the presence or absence of IPF/CFA, we used the majority opinion to determine the group’s final diagnosis. We utilized the histologic presence of UIP in the absence of known etiologies for ILD as our gold standard for a diagnosis of IPF/CFA. Histologic diagnoses were obtained by reviewing each patient’s medical record. Pathologists at our institution utilized accepted histopathologic criteria when identifying the pattern of UIP; specifically, a heterogeneous appearance at low magnification and the presence of scattered fibroblastic foci interspersed with deposits of collagen or honeycombing. When the histologic diagnosis was unclear, cases were forwarded to the Armed Forces Institute of Pathology for expert review. We then determined, by application of a standard two-by-two table, the sensitivity, specificity, accuracy, and negative and positive predictive values for each evaluator and for the group as a whole. Further, we assessed the degree of agreement between the three observers by calculating a kappa statistic. Ninety-five percent confidence intervals are reported where appropriate. All analyses were performed using SPSS 9.0 statistical software (SPSS, Chicago, Ill., USA).
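The majority-opinion rule and the two-by-two tabulation described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the four example cases are hypothetical and are not study data, and the original analysis was run in SPSS rather than Python.

```python
from collections import Counter


def majority_call(calls):
    """Return the call made by most readers (True = IPF/CFA).

    Requires an odd number of readers so that a tie cannot occur,
    as with the three readers described in the text.
    """
    if len(calls) % 2 == 0:
        raise ValueError("an odd number of readers is needed to avoid ties")
    return sum(calls) > len(calls) / 2


def two_by_two(group_calls, biopsy_uip):
    """Tally group calls against the biopsy-based gold standard."""
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for called, truth in zip(group_calls, biopsy_uip):
        if called and truth:
            counts["tp"] += 1      # called IPF/CFA, biopsy shows UIP
        elif called and not truth:
            counts["fp"] += 1      # called IPF/CFA, biopsy shows something else
        elif truth:
            counts["fn"] += 1      # not called IPF/CFA, biopsy shows UIP
        else:
            counts["tn"] += 1      # not called IPF/CFA, biopsy agrees
    return counts


# Hypothetical example: one tuple of three reader calls per case.
reader_calls = [
    (True, True, False),
    (False, False, False),
    (True, False, False),
    (True, True, True),
]
# Hypothetical gold standard (True = UIP on surgical lung biopsy).
biopsy_uip = [True, False, True, False]

group_calls = [majority_call(c) for c in reader_calls]
table = two_by_two(group_calls, biopsy_uip)
print(group_calls)
print(dict(table))
```

Requiring an odd number of readers mirrors the three-reader design, in which a two-to-one split always yields a group diagnosis.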
Results
Twenty-six cases were evaluated in our study. As shown in table 2, the mean ± SD age of the cohort was 61.3 ± 11.9 years. There were 18 males (69.2%). The majority of subjects were Caucasian. Eighty-eight percent of the group were current or former smokers at the time of their SLB.
Table 2. Demographic characteristics
Fourteen individuals (53.8%) had a SLB revealing UIP and a clinical history consistent with IPF/CFA. Of the remaining patients, five had NSIP, two had sarcoidosis, two had neoplastic disease, one had end-stage fibrosis, one had cryptogenic organizing pneumonia (COP) (formerly known as bronchiolitis obliterans organizing pneumonia or BOOP), and one had RBILD. For our group, the point estimates and 95% confidence intervals for sensitivity, specificity, positive predictive value, and negative predictive value for a diagnosis of IPF/CFA based on HRCT alone were 71% (51–92%), 67% (39–86%), 71% (51–92%), and 67% (39–86%), respectively. There were four false positives and four false negatives, yielding an overall accuracy of 69% (50–83%) (table 3).
Table 3. Test characteristics
As also shown in table 3, application of the ATS/ERS criteria resulted in a sensitivity and specificity of 71% (51–92%) and 75% (47–92%), respectively. When applying the ATS/ERS criteria, the positive and negative predictive values were 77% (50–92%) and 69% (42–87%), respectively. Using the ATS/ERS criteria, the group had four false-negative and three false-positive results. The accuracy of the ATS/ERS clinical criteria was therefore 73% (54–86%). Differences in operating characteristics between HRCT and the ATS/ERS clinical criteria were not statistically significant.
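The point estimates reported above follow from the two-by-two counts implied by the text: 14 patients with IPF/CFA and 12 without, with four false negatives for both tests, four false positives for HRCT, and three for the ATS/ERS criteria. The sketch below recomputes them. The interval function is an assumption: the text does not say which confidence interval method was used, so a Wilson score interval is shown for illustration only, and it may not reproduce every interval reported.

```python
from math import sqrt


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard two-by-two test characteristics, as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Counts reconstructed from the text (14 IPF/CFA, 12 non-IPF/CFA).
hrct = diagnostic_metrics(tp=10, fp=4, fn=4, tn=8)
ats_ers = diagnostic_metrics(tp=10, fp=3, fn=4, tn=9)

for name, metrics in (("HRCT", hrct), ("ATS/ERS", ats_ers)):
    summary = ", ".join(f"{k} {v:.0%}" for k, v in metrics.items())
    print(f"{name}: {summary}")

# Example interval: HRCT specificity, 8 true negatives among 12 non-IPF cases.
low, high = wilson_interval(8, 12)
print(f"HRCT specificity 95% CI (Wilson, assumed): {low:.0%} to {high:.0%}")
```

With cells this small, a single reclassified case shifts sensitivity by about 7 percentage points and specificity by about 8, which is consistent with the wide intervals reported.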
Using the HRCT interpretation, two individuals with NSIP and one with COP were falsely diagnosed as having IPF/CFA; the observers disagreed only in the case of COP. Using the ATS/ERS criteria, two cases of NSIP were misdiagnosed as IPF/CFA; the observers agreed in all three cases. A patient with a histologic finding of end-stage honeycomb lung was misclassified by both HRCT and the ATS/ERS criteria.
The observers concurred on the interpretation of 18 of the 26 HRCT scans. When applying the ATS/ERS criteria, they agreed in 17 of the 26 cases. Thus, interobserver variability, expressed as a kappa statistic, was 0.59 for HRCT and 0.53 for the ATS/ERS criteria (the difference between the two kappa statistics was not statistically significant).
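For readers unfamiliar with multi-rater agreement, the sketch below computes Fleiss' kappa for three readers making a binary call. Two things here are assumptions rather than facts from the study: the text does not state which multi-rater kappa was calculated, and the per-case ratings are invented. The invented table only matches the reported pattern of 18 unanimous HRCT reads out of 26, with positive and negative calls assumed to be evenly balanced.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a list of per-case category counts.

    Each row gives how many raters chose each category for one case,
    e.g. (n_positive, n_negative). Every case must have the same number
    of raters. Undefined (division by zero) if all ratings fall in a
    single category.
    """
    n_cases = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every case needs the same number of raters")
    n_categories = len(table[0])

    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in table) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement within each case, then averaged over cases.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_observed = sum(p_case) / n_cases
    # Agreement expected by chance from the category shares.
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings: 18 unanimous cases and 8 two-to-one splits.
hypothetical_hrct = (
    [(3, 0)] * 9    # all three readers call IPF/CFA
    + [(0, 3)] * 9  # all three readers call not IPF/CFA
    + [(2, 1)] * 4  # two of three call IPF/CFA
    + [(1, 2)] * 4  # one of three calls IPF/CFA
)

kappa = fleiss_kappa(hypothetical_hrct)
print(f"Fleiss' kappa for the hypothetical table: {kappa:.2f}")
```

Under these assumptions the hypothetical table gives a kappa of about 0.59. That shows the reported HRCT value is at least compatible with 18 unanimous reads, but it does not establish how the study's figure was derived.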
Discussion
Overall, our study highlights the potential limitations of both radiographic and clinical diagnoses of IPF/CFA. Importantly, we evaluated the utility of the ATS/ERS criteria for a clinical diagnosis of IPF/CFA as applied by pulmonologists in a general pulmonary practice setting as opposed to those with recognized mastery in the field of ILD. Our study demonstrates a number of important findings. First, both the ATS/ERS criteria and HRCT performed modestly compared to SLB. Next, application of the ATS/ERS criteria led to misclassification of more than one quarter of patients with IPF/CFA in our cohort. Also, the ATS/ERS criteria led to a false-positive diagnosis of IPF/CFA in three (25%) patients with non-IPF ILD. Two of these patients had NSIP and one had end-stage fibrosis. Arguably, end-stage fibrosis is at the end of the continuum of histologic change for many forms of ILD, and this patient may very well have had UIP/IPF. However, even discounting the case of end-stage fibrosis, two patients (16.7%) with a potentially treatable form of ILD would have received a false diagnosis of IPF/CFA. Finally, our data suggest HRCT and the ATS/ERS clinical criteria (which incorporate HRCT) perform similarly in terms of sensitivity and specificity for diagnosis of IPF/CFA. While the addition of ATS/ERS clinical variables to HRCT interpretation may allow improved specificity for a diagnosis of IPF/CFA, our study did not demonstrate a statistically significant difference. The sensitivity of the HRCT diagnosis in our study was similar to that reported in prior studies; however, our specificity was somewhat lower [3, 4]. With regard to interobserver variability, neither HRCT (k = 0.59) nor the ATS/ERS criteria (k = 0.53) achieved more than moderate agreement. Our findings mirror the results reported by Hunninghake et al., who demonstrated an agreement within their radiology core of k = 0.54 and within their clinical core of k = 0.59.
As a comparison, investigators have demonstrated good to excellent interobserver agreement for the diagnosis of pulmonary embolism by CT angiography (k = 0.72–0.96), for the determination of size of pulmonary nodules by spiral chest CT scanning (k = 0.61–0.81) and for the diagnosis of cystic lung disease by HRCT (k = 0.77–1.0) [6, 7, 8, 9, 10, 11]. The relatively high degree of variability in a clinical diagnosis of IPF/CFA may have implications in evaluating results of clinical trials that involve patients with only clinical diagnoses of IPF/CFA. If investigators at different institutions are responsible for enrolling patients into a multicenter trial for treatment of IPF/CFA, inconsistencies in the implementation of these criteria may result in a situation where a patient enrolled at one center might have been excluded at another. This could introduce bias into these prospective studies.
Our study has several limitations. We utilized a relatively small sample size. Although the percentage of the cohort who had UIP/IPF was similar to that in prior studies, the remaining diagnoses fell into only seven categories [3, 4]. A larger sample with a greater number of diagnoses other than IPF/CFA might have altered our results. The retrospective nature of our study may also have introduced bias. We were seeking to validate the ATS/ERS criteria in a community-type rather than an ILD referral-type setting. Ideally one would prefer a prospective evaluation of the ATS/ERS criteria in a community setting; however, the relative rarity of IPF/CFA makes it impossible to evaluate a sufficient number of patients within a sufficient period of time outside of large referral centers. Our evaluators could not interview or examine the patients; rather, they had access only to data extracted from a chart review. To address this limitation, we attempted to focus on clear and objective end-points in order to minimize the introduction of a sampling bias. The format that we utilized for data presentation to our evaluators was similar to that used in the Hunninghake project. We derived our cohort from a group of patients specifically referred for SLB, raising the potential for referral bias. Clinicians at our institution likely diagnosed other ILD patients without referring them for SLB; inclusion of such patients with more classical clinical presentations may have improved either test’s characteristics. We utilized pulmonologists’ interpretations of the HRCT scans as opposed to radiologists’ interpretations because we believe this scenario mimics the more common clinical practice in the community setting. We do, however, recognize that some practitioners place a greater emphasis on formal HRCT interpretations by radiologists.
Finally, just as earlier studies were limited in that they were performed by ILD experts, our study was performed at a tertiary care teaching center and our results may not fully apply to the medical community outside an academic setting.
The ability of clinicians to accurately diagnose patients suspected of having IPF/CFA carries significant implications in terms of prognosis, choice and timing of therapy and for future research regarding investigational treatment options. Our study suggests the ATS/ERS criteria for a clinical diagnosis of IPF/CFA, when applied by practitioners in a routine pulmonary practice setting, may not be an acceptable alternative to SLB. A larger prospective study could help to further define the utility of the ATS/ERS criteria in such a clinical setting.
Andrew F. Shorr, MD
Walter Reed Army Medical Center
Department of Pulmonary and Critical Care Medicine
6900 Georgia Ave. NW, Washington DC 20307-5001 (USA)
Tel. +1 202 782 6745, Fax +1 202 782 9032, E-Mail email@example.com
Disclaimer: The opinions expressed herein are not to be construed as official or as reflecting the policy of either the Department of Defense or the Department of the Army.
Received: April 22, 2003
Accepted after revision: September 9, 2003
Number of Print Pages: 5
Number of Figures: 0, Number of Tables: 3, Number of References: 11
Respiration (International Review of Thoracic Diseases)
Vol. 71, No. 2, Year 2004 (Cover Date: March-April 2004)
ISSN: 0025–7931 (print), 1423–0356 (Online)