Abstract
Background: The real accuracy of computed tomographic colonography (CTC) is still unknown. Objective: To perform a meta-analysis of the diagnostic accuracy of CTC for the detection of polyps and colorectal tumors. Methods: Selection of studies: Studies assessing the accuracy of CTC for the detection of colorectal polyps and tumors were selected. Data synthesis: Meta-analyses combining sensitivities, specificities and likelihood ratios (LRs) for the diagnosis of polyps and colorectal tumors were carried out. Results: Forty-seven studies, providing data on 10,546 patients, were included. Overall per-polyp sensitivity of CTC was 66% (64–68%), for polyps 6–9 mm in size it was 59% (56–61%), and 76% (73–79%) for polyps larger than 9 mm. Overall per-patient sensitivity was 69% (66–72%), for polyps 6–9 mm 60% (56–65%), and 83% (70–85%) for lesions larger than 9 mm. Overall CTC specificity was 83% (81–84%). Positive and negative LRs were 2.9 (1.8–4) and 0.38 (0.27–0.53), respectively; for polyps 6–9 mm in size, they were 3.8 (2.5–5.7) and 0.4 (0.27–0.59), and 12.3 (7.7–19.4) and 0.19 (0.12–0.3) for polyps larger than 9 mm. Conclusion: CTC is highly specific for the detection of colorectal polyps and tumors. Some studies reported high sensitivities, but the results of the studies were highly heterogeneous, while the studied variables explained only part of this discrepancy.
Introduction
Colorectal cancer (CRC) is the second most common cause of cancer-related death in the western world [1]. The average lifetime incidence of CRC is 6% and this figure is even higher for those with a family history of colorectal neoplasia or other predisposing conditions [1]. The majority of malignant tumors of the colon and rectum arise from preexisting adenomatous polyps [2]. The fact that the progression from adenoma to cancer may take up to 10 years provides an opportunity for early detection of those premalignant lesions with the use of various screening techniques.
Despite scientific evidence that the screening for CRC can reduce mortality, individuals at risk of developing this disease remain largely underscreened [3]. A major limitation to the widespread implementation of CRC screening is the lack of a single test that meets all the requirements for screening. A diversity of competing and imperfect tests for CRC screening exists, many of which are required in combination. For many years, barium examination and endoscopy have been the only diagnostic methods for evaluating diseases of the colon [4]. However, in the last 15–20 years, spiral computed tomography (CT) has also been shown to be an essential tool in radiological evaluation of the gastrointestinal tract.
CT colonography (CTC) or virtual colonoscopy was first described in 1994 by Vining et al. [5]. Since then, there have been numerous advances in the CTC technique, which have the potential to improve the accuracy of the test. Modern CT scanners can obtain very thin image slices in a short time with low radiation exposure [6]. Investigators have developed new methods of stool tagging and electronic colon cleansing to reduce the false-negative and false-positive rates of CTC [7]. Image interpretation can now be enhanced with polyp detection software and three-dimensional (3-D) reconstruction of the bowel lumen, and radiologists are receiving specific training in CTC [8, 9, 10].
CTC has been recommended as a less-invasive alternative to conventional colonoscopy (CC) for CRC screening [11]; however, the true accuracy of CTC for the diagnosis of colorectal polyps or tumors is still unknown, as conflicting results have been reported by different authors, and the heterogeneity of these results remains insufficiently explained.
The purpose of this study was to perform a systematic review and a meta-analysis of studies evaluating accuracy of CTC compared with CC for detecting colorectal polyps and tumors.
Methods
Study Identification and Selection
Bibliographical searches were performed, up to January 2009, in MEDLINE and EMBASE electronic databases, looking for the following words (all fields): ‘virtual colonoscopy’, ‘virtual colonography’, ‘computed tomography colonography’, ‘computed tomography colonoscopy’, ‘CT colonoscopy’ or ‘CT colonography’. The title and the abstract of potentially relevant studies and review articles were screened for appropriateness before retrieval of the full articles. Inclusion criteria were a prospective, blinded design (in which results of CTC were interpreted independently of findings on colonoscopy or during surgery), enrollment of adult patients who were to undergo CTC after a full bowel preparation, followed by complete colonoscopy or surgery, and use of at least a single detector scanner, with colon insufflation by air or carbon dioxide. In case of multiple articles from the same institution, the dates for study inclusion were evaluated to ensure that there were no overlaps of the patients. Studies that had evaluated computer-aided detection systems were excluded.
Study Quality
The quality of the studies was assessed using the QUADAS (‘Quality Assessment of Diagnostic Accuracy Studies’) tool [12]. This is the first tool for the assessment of the quality of diagnostic accuracy studies which has been systematically developed and is evidence based [12]. The tool is based on the 14 items summarized in table 1, each of which should be answered with ‘yes’, ‘no’ or ‘unclear’. The tool does not incorporate a (global) quality score; the reasons for this are justified in detail in the original paper describing the QUADAS tool [12]. Among these reasons, the most noteworthy is that quality scores ignore the fact that the importance of individual items and the direction of potential biases associated with these items may vary according to the context in which they are applied. Therefore, the application of quality scores, with no consideration of the individual quality items, may dilute or entirely miss potential associations.
Data Abstraction
The following variables were abstracted from the original studies into a predefined data extraction form (table 2): characteristics of the study (year, reference standard, type of contrast used, colonic preparation), patients (number of patients, risk of CRC), scanner (type of viewer, type of contrast, collimation, reconstruction thickness, milliamperes), and study quality. True positives, false positives, false negatives, and true negatives for CTC were recorded (tables 3, 4). If data could not be extracted or calculated from the manuscript with confidence, none were entered.
End Points and Definitions
The primary end points were the sensitivity and specificity of CTC. These were either quoted directly in the studies or were extractable from analysis of the true positives, true negatives, false positives, and false negatives on a per-patient and per-polyp basis.
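The extraction of these end points from a 2 × 2 table can be sketched as follows. This is a minimal illustration only; the class name `TwoByTwo` and the counts are hypothetical and are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of CTC results against the reference standard (hypothetical helper)."""
    tp: int  # CTC positive, reference positive
    fp: int  # CTC positive, reference negative
    fn: int  # CTC negative, reference positive
    tn: int  # CTC negative, reference negative

    def sensitivity(self) -> float:
        # Proportion of patients (or polyps) with disease that CTC detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of disease-free patients that CTC correctly called negative.
        return self.tn / (self.tn + self.fp)


# Illustrative counts only (assumed, not study data).
example = TwoByTwo(tp=45, fp=10, fn=15, tn=130)
print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
```

Note that specificity is only defined on a per-patient basis here, since a per-polyp analysis has no natural count of true negatives.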
Data Synthesis
The sensitivity, specificity, positive and negative likelihood ratios (LRs), and their corresponding 95% confidence intervals were calculated for each study. LRs state how many times more likely particular test results are in patients with disease than in those without disease. LRs can be used to adapt the results of a study to your patients. The LR expresses the relative odds for occurrence of a specific test result in a patient with polyps as opposed to a patient without polyps. Using Bayes’ theorem, the posttest odds that the patient has the disease are estimated by multiplying the pretest odds by the LR. Positive LRs >10 and negative LRs <0.1 have been noted as providing convincing diagnostic evidence, whereas those >5 and <0.2 give strong diagnostic evidence [13]. To calculate LRs, in the event that one of the cells of the cross table contained a zero value, 0.5 was added to all the cells.
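The LR calculation, including the 0.5 correction for zero cells, might look like the sketch below. The function name `likelihood_ratios` and all counts are assumptions made for illustration; the software actually used may differ in detail.

```python
def likelihood_ratios(tp: float, fp: float, fn: float, tn: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) for one 2 x 2 table."""
    # Continuity correction: if any cell is zero, add 0.5 to every cell.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical study without zero cells.
lr_pos, lr_neg = likelihood_ratios(tp=45, fp=10, fn=15, tn=130)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")

# Hypothetical study with no false positives; the correction keeps LR+ finite.
lr_pos_zero, lr_neg_zero = likelihood_ratios(tp=20, fp=0, fn=5, tn=50)
print(f"LR+ = {lr_pos_zero:.2f}, LR- = {lr_neg_zero:.2f}")
```

Without the correction, a study with no false positives would have a specificity of 100% and an undefined positive LR.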
The heterogeneity of all indexes was evaluated by the graphic examination of the forest plots, and statistically through a homogeneity test based on the χ2 test. Due to the low power of this test, a minimum cutoff p value of 0.1 was established as a threshold of homogeneity, lower values indicating heterogeneity. In addition, the I2 statistic was calculated to assess the impact of heterogeneity on the results. This statistic describes the percentage of the variability in effect estimates that is due to heterogeneity rather than sampling error (chance). A value >50% may be considered substantial heterogeneity [14].
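As an illustration of the I2 statistic, the sketch below computes Cochran's Q with inverse-variance weights and derives I2 from it. This is one common formulation and is assumed here; the function name and the input values are hypothetical. The p value of the homogeneity test would be obtained by comparing Q with a χ2 distribution with k − 1 degrees of freedom, which is omitted to keep the example free of external libraries.

```python
def cochran_q_and_i2(effects: list[float], variances: list[float]) -> tuple[float, int, float]:
    """Return (Q, degrees of freedom, I2 in percent) for k study estimates."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the fixed-effect pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variability beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical effect estimates (for example, log LRs) with equal variances.
q, df, i2 = cochran_q_and_i2([0.2, 1.0, 1.8], [0.04, 0.04, 0.04])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

In this made-up example the estimates differ far more than their variances would predict, so I2 lies well above the 50% threshold.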
Meta-analyses were performed combining the sensitivities, specificities, and LRs of the individual studies in the corresponding pooled indexes. LRs were pooled using a random effects model (DerSimonian and Laird). As a ‘threshold effect’ was not detected (by the Spearman test and the examination of the plot of sensitivity and specificity on a ROC plane), summary receiver operating characteristic curves were not constructed [15]. Meta-analysis was conducted using Meta-DiSc© for Windows Version 1.1.1 software [16].
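The random effects pooling of LRs can be sketched as below, following the usual DerSimonian and Laird moment estimator applied to log-transformed positive LRs. This is an illustration, not the Meta-DiSc implementation: the function names, the standard variance formula for a log LR, and the three study tables are all assumptions.

```python
import math


def log_positive_lr(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (log LR+, approximate variance) for one study without zero cells."""
    log_lr = math.log((tp / (tp + fn)) / (fp / (fp + tn)))
    variance = 1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn)
    return log_lr, variance


def dersimonian_laird(effects: list[float], variances: list[float]) -> tuple[float, float, float]:
    """Return (pooled effect, standard error, tau^2) under a random effects model."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical (tp, fp, fn, tn) tables for three studies.
studies = [(45, 10, 15, 130), (30, 5, 20, 95), (60, 20, 10, 160)]
logs = [log_positive_lr(*s) for s in studies]
pooled, se, tau2 = dersimonian_laird([y for y, _ in logs], [v for _, v in logs])
pooled_lr = math.exp(pooled)
ci_low, ci_high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled LR+ = {pooled_lr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), tau2 = {tau2:.3f}")
```

Pooling is done on the log scale because the LR is a ratio; the pooled value and its confidence limits are exponentiated at the end.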
Subanalyses
Subgroup analyses were done by polyp size categories (6–9 mm and larger than 9 mm), colonic preparation, use of fecal tagging (yes or no), collimation width and reconstruction interval (in millimeters), type of scanner (single-detector, multidetector or mixed), imaging technique (2-D imaging with 3-D confirmation only when a lesion was observed or always 3-D imaging), radiation dose, and risk of CRC. Analyses for polyps smaller than 6 mm were not done because the majority of authors did not report ‘diminutive’ lesions.
Results
With the aforementioned search strategy, we initially identified 1,798 articles in MEDLINE and EMBASE. The title and abstract of potentially relevant studies and review articles were screened for appropriateness before retrieval of the full articles. Articles that did not meet inclusion criteria were excluded as reported in the flowchart (fig. 1). Finally, 47 studies in total were included in the meta-analysis.
We included 47 prospective studies involving 10,546 patients that compared CTC to the reference standard of CC or surgery [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]. Table 2 shows detailed information from individual studies. The average number of participants in a study was 224. Sixteen studies used single-detector scanners, 27 used multidetector scanners and 4 used both single and multidetector scanners. CC was the gold standard in 44 studies and surgery was the gold standard in 3 of them. Twenty-four studies used 2-D imaging, with 3-D imaging on selected slices at the discretion of the radiologist; 20 studies used both 2-D and 3-D imaging and 2 studies used fly-through imaging with 2-D reconstruction. Ten studies employed sodium phosphate for bowel preparation and 21 employed polyethylene glycol. Six studies used fecal tagging and 12 intravenous contrast. The average collimation was 3.7 mm and the average reconstruction interval was 2 mm. Six of the studies were done in an average-risk population and the others in a high-risk population. The average radiation intensity employed was 114 mA.
Sensitivity of CTC
Per-polyp sensitivity for CTC for polyps larger than 6 mm varied from 28 to 100%. The overall pooled sensitivity for CTC was 66% [95% confidence interval (95% CI): 64–68%] (fig. 2). The sensitivity increased progressively as polyp size increased: it was 59% (95% CI 56–61%; range 16–90%) for polyps 6–9 mm, and 76% (95% CI 73–79%; range 50–100%) for polyps larger than 9 mm (fig. 3). Each of these analyses was statistically heterogeneous (I2 > 50%), and most of the variance was attributable to between-study heterogeneity.
Per-patient sensitivity for CTC varied from 24 to 100% (fig. 4). The overall pooled sensitivity for CTC was 69% (95% CI 66–72%). Sensitivity increased progressively as polyp size increased: it was 60% (95% CI 56–65%) for patients with polyps of 6–9 mm (range 20–91%), and 83% (95% CI 70–85%; range 46–100%) for patients with polyps larger than 9 mm (fig. 5). Each of these analyses was statistically heterogeneous (I2 >50%), and most of the variance was attributable to between-study heterogeneity.
Subanalyses
(1) Use of Phospho-soda for bowel preparation: Mean sensitivity of studies that used Phospho-soda was 83.3% (95% CI 79–87%) (I2 = 73%). Sensitivity of the 16 studies that did not use Phospho-soda was 62% (95% CI 58–66%) (I2 = 93%) (fig. 6).
(2) Fecal tagging: Sensitivity of the studies using fecal tagging was 88% (95% CI 84–91%) (I2 <50%). On the other hand, sensitivity of studies without fecal tagging was 59% (95% CI 56–63%) (I2 = 91%) (fig. 7).
(3) Collimation width: Sensitivity of studies that employed a collimation thinner than 5 mm was 72% (95% CI 68–76%) (I2 = 89%), while with a collimation of 5 mm or thicker it was 65% (95% CI 60–70%) (I2 = 95%) (fig. 8).
(4) Reconstruction thickness: Sensitivity of studies with reconstruction thinner than 3 mm was 64% (95% CI 60–68%) (I2 = 90%), while with reconstruction of 3 mm or thicker it was 58% (95% CI 49–67%) (I2 = 87%).
(5) Reconstruction mode: Studies that used 2-D imaging with confirmation by 3-D imaging only when considered necessary yielded a sensitivity of 64% (95% CI 60–67%) (I2 = 90%), whereas studies that always used 3-D imaging had a pooled sensitivity of 83% (95% CI 78–87%) (I2 = 84%).
(6) Radiation: Studies that employed a low radiation dose (<100 mA) reported an overall sensitivity of 63% (95% CI 60–67%) (I2 = 95%), which is lower than that calculated for studies that used a higher radiation dose (>100 mA) (79%; 95% CI 75–83%) (I2 = 75%) (fig. 9).
(7) Population risk: Studies in populations at high risk of polyps or CRC reported a sensitivity of 65% (95% CI 61–68%) (I2 = 94%), whereas the sensitivity of studies in average-risk populations was 82% (95% CI 77–87%) (I2 = 83%).
(8) Other variables: We did not find differences in other variables analyzed, including quality of the studies.
Specificity of CTC
Overall CTC specificity was 83% (95% CI 81–84%) (I2 = 89%). Specificity improved as polyp size increased, and the heterogeneity decreased within each stratum. For patients with polyps 6–9 mm in size, specificity was 90% (95% CI 89–91%) (I2 = 21%), and increased to 92% (95% CI 92–93%) (I2 = 62%) for polyps larger than 9 mm (fig. 10).
Likelihood Ratios
Overall positive and negative LRs were 2.9 (1.8–4) and 0.38 (0.27–0.53). Positive and negative LRs for polyps between 6 and 9 mm were 3.8 (2.5–5.7) and 0.4 (0.27–0.59). For polyps larger than 9 mm in size, positive and negative LRs were 12.3 (7.7–19.4) and 0.19 (0.12–0.3).
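To illustrate what these pooled LRs imply for an individual patient, the sketch below applies the pretest-odds-times-LR rule to the values reported for polyps larger than 9 mm. The pretest probability of 10% is a purely hypothetical assumption chosen for the example and is not an estimate from this meta-analysis; the function name is likewise illustrative.

```python
def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a posttest probability using an LR."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio  # Bayes' theorem in odds form
    return posttest_odds / (1.0 + posttest_odds)


ASSUMED_PRETEST = 0.10  # hypothetical prevalence of polyps larger than 9 mm

# Pooled LRs for polyps larger than 9 mm, as reported above.
after_positive = posttest_probability(ASSUMED_PRETEST, 12.3)
after_negative = posttest_probability(ASSUMED_PRETEST, 0.19)
print(f"after a positive CTC: {after_positive:.1%}")
print(f"after a negative CTC: {after_negative:.1%}")
```

Under this assumed pretest probability, a positive CTC raises the probability of a large polyp to roughly 58%, and a negative CTC lowers it to about 2%.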
Discussion
Our meta-analysis showed that CTC is highly specific, but the reported sensitivities for CTC varied widely, even for larger polyps. Any screening method recommended for general use must be demonstrated to be highly and consistently sensitive in a variety of settings [64].
Our analysis revealed some factors that accounted for the wide range of sensitivities. First, scanners that used thinner collimation had higher sensitivity. Second, the mode of imaging also appeared to be important: studies using fly-through technology had higher sensitivity. However, this finding must be interpreted with caution because it was based on data of only two studies. Until the study by Pickhardt et al. [35], all published CTC studies had employed a primary 2-D evaluation of the data, with 3-D endoluminal evaluation limited to problem solving and lesion confirmation. Recent advances in workstation software have transformed 3-D endoluminal navigation of the colon from a cumbersome, time-consuming technique to one that can be performed relatively efficiently. Investigational studies currently in progress are evaluating the relative value of 2-D and 3-D image review [4].
Third, bowel preparation seems to be another important factor influencing accuracy. Studies using sodium phosphate reported better results. In most CTC trials, the investigators have used the bowel preparation usually prescribed by the gastroenterologists. The most common bowel preparations prescribed are polyethylene glycol solution or sodium phosphate plus bisacodyl. With both preparations residual fluid may be left in the colon at the time of examination, but polyethylene glycol solution, in particular, tends to produce a large amount of residual colonic fluid, which can obscure a large portion of the colon wall and hide polyps [65]. This problem can be reduced by adding oral iodinated and barium contrast agents to the bowel preparation. We also found higher sensitivities reported in studies using fecal tagging. Additionally, the high-density residual fluid and stool can be electronically removed from the images [66], but this technique is not yet widely available.
A few large multicenter studies evaluating the accuracy of CTC have been performed, with differing results. Recently, the American College of Radiology Imaging Network carried out a multicenter trial in 2,600 asymptomatic adults, reporting that CTC identified 90% of polyps larger than 10 mm [17]. Johnson et al. [37] published a study involving 703 patients, and reported a sensitivity of 46% for polyps larger than 1 cm and a specificity from 95 to 98% for polyps of the same size. Pickhardt et al. [35] published a multicenter trial of 1,233 asymptomatic patients, reporting per-polyp sensitivities for CTC of 85.7, 92.6, and 92.2% with respect to adenomas of at least 6 mm, 8 mm or at least 10 mm in diameter; the specificity was 96%. Cotton et al. [28] published a comparative trial of 615 patients, reporting a sensitivity of 55% for the identification of patients with at least one lesion larger than 10 mm and a specificity above 90%. Finally, Rockey et al. [25] published a study of 614 symptomatic patients reporting a sensitivity of CTC of 59% for lesions larger than 10 mm.
Four meta-analyses have been published before ours [64, 67, 68, 69]. Mulhall et al. [64] reported sensitivities of 48, 70 and 85% for polyps smaller than 6 mm, 6–9 mm and larger than 9 mm, respectively. It was also found that sensitivity of CTC was very heterogeneous. The authors found that collimation thickness, mode of imaging and type of detector could explain a small amount of the discrepancy. In that meta-analysis it was concluded that CTC cannot be recommended for general use until the source of heterogeneity is more clearly explained. The last meta-analysis was performed by Rosman and Korsten [69] in 2007. Thirty studies were included, reporting pooled per-patient sensitivities of 82, 63 and 56% for polyps larger than 10 mm, 5–10 mm and smaller than 5 mm, respectively. Our meta-analysis has some advantages over those published previously. First, we included a greater number of studies and a greater number of patients. Second, we performed more detailed subgroup analyses, trying to identify the variables that have more influence on the accuracy of CTC. The use of fecal tagging, the use of sodium phosphate for bowel cleansing, the thickness of collimation and the mode of imaging seemed to be the variables that most strongly determined diagnostic accuracy. The radiation dose employed in CTC also appeared to have some influence on diagnostic accuracy, although we could not evaluate the effect of the effective dose because we did not have sufficient information for calculating this value. Finally, we calculated LRs, which are very important estimators of the accuracy of a diagnostic test. In this respect, we found that CTC provides convincing diagnostic evidence for polyps larger than 9 mm, and gives strong diagnostic evidence for polyps of 6–9 mm in size.
Overall sensitivity of CTC for detecting polyps smaller than 10 mm has been poor. However, the importance of these polyps has been debated [70, 71]. Since it is not clear whether small polyps should be removed by polypectomy, some radiologists have decided not to report polyps smaller than 5 mm in size and to recommend CT follow-up of polyps in the 6- to 9-mm range by repeating CTC at 1–2 years’ intervals [72].
The risks associated with systemic radiation of asymptomatic people remain unknown and are very important in a screening test. A recent report estimated that about 1% of all cancer deaths in the United States are now related to medical radiation [73]. Extending the risk of radiation to asymptomatic people raises important ethical issues. Some countries do not permit the use of CT scanning in asymptomatic people. The risk of cancer induced by radiation as a result of a CTC study is 0.14% in a 50-year-old patient [74]. This is similar to the rate of perforation during screening colonoscopies [74], but cancer would be expected to have a higher mortality than colonoscopic perforation. Lower-dose radiation protocols for CTC have been developed. Liedenbaum et al. [75] evaluated the use of effective doses in CTC and they concluded that although the number of CTC protocols with dose modulation had increased substantially since 2004, no significant decrease in effective dose was found.
CTC had been accepted by institutions for patients with incomplete CC and for patients that for some reason cannot undergo CC [76], but in previous assessments of the performance of CTC, the American Cancer Society concluded that data were insufficient to recommend screening with CTC for average-risk individuals [77]. Based on the accumulation of evidence since that time, the expert panel has concluded that there are sufficient data now to include CTC as an acceptable option for CRC screening [11].
The implementation of CTC will require quality metrics to be defined and implemented in clinical practice. In 2005, the ACR Practice Guideline for the Performance of Computed Tomography Colonography in Adults was published, encompassing the techniques, quality control, clinical uses, training and communication of results of CTC [78]. An update of these guidelines is planned following publication of the results of the ACRIN CT screening trial [11].
The management of CTC findings is an important part of a CTC screening program. Nowadays the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, the American College of Radiology and the ESGAR recommend that all patients with one or more polyps ≥10 mm or 3 or more polyps ≥6 mm should be referred for colonoscopy [11, 79]. Such polyps must be removed if found at CC because of the risk, albeit low, of advanced neoplasia [80]. Patients with one or two lesions of 6–9 mm should be recommended interval surveillance after up to 3 years depending on several factors [11, 79]. Patients who decline CC or who are not good candidates for CC should be offered surveillance with CTC. At this time, optimal management of patients whose largest polyp is 6 mm detected on CTC is uncertain, because the risk of advanced features in these polyps is very low (1.7%) [80]. It is recommended to repeat exams every 5 years if the initial CTC is negative for significant polyps until further studies are completed and are able to provide additional guidelines [11].
In conclusion, CTC offers high sensitivity and specificity for the detection of CRC and polyps larger than 10 mm in size. It is the method of choice for incomplete colonoscopies and for patients that for some reason cannot undergo CC. The wide range of sensitivities that have been reported and cannot be completely explained must be resolved. The use of CTC in routine screening should be subject to quality control with prescribed examination standards.
Acknowledgment
CIBEREHD is funded by the Instituto de Salud Carlos III.