The Utility of the Congenital Pulmonary Airway Malformation-Volume Ratio in the Assessment of Fetal Echogenic Lung Lesions: A Systematic Review

Although relatively uncommon, the incidence of fetal echogenic lung lesions – a heterogeneous group of anomalies that includes congenital pulmonary airway malformations (CPAM) and bronchopulmonary sequestrations (BPS) – has increased recently. Two decades ago, the CPAM-volume ratio (CVR) was first described as a tool to predict the development of hydrops, with this outcome found to be unlikely in fetuses with CVRs of ≤1.6 cm2. Since then, no clear international consensus has evolved as to the optimal CVR thresholds for the prediction of fetal/neonatal outcomes. This systematic review aimed to assess all original research studies that reported on the predictive utility of the CVR. Potentially relevant papers were identified through searching for citations of the paper that originally described the CVR, in addition to keyword searches of electronic databases. Fifty-two original research papers were included in the final review. Of these, 34 used the CVR for descriptive purposes only, 5 assessed the validity of established thresholds in different populations, and 13 proposed new thresholds. The evidence identified in this review would suggest that a threshold much lower than 1.6 cm2 is likely to be of greater utility in most populations for many outcomes of perinatal relevance. For neonatal outcomes (mostly respiratory compromise at birth), a CVR on the initial ultrasound scan ranging from 0.5 to 1.0 cm2 appears to have the greatest predictive value. Although a number of studies concurred that 1.6 cm2 was a useful threshold for the prediction of hydrops, many others were unable to assess this due to the rarity of this complication. For this reason, thresholds as low as 0.4 cm2 may be more useful for the prediction of a broader range of fetal concerns, including mediastinal shift and fluid collections. Further large-scale studies are required to determine the true utility of this well-established index.

Fetal echogenic lung lesions (ELL) represent a heterogeneous group of congenital anomalies, including congenital pulmonary airway malformation (CPAM), bronchopulmonary sequestration (BPS), congenital lobar emphysema, bronchogenic cysts and bronchial atresia, with CPAM and BPS accounting for the majority of cases [1,2]. Although these lesions are uncommon overall, advances in ultrasound technology have resulted in an apparent increase in their frequency [3], with CPAM now occurring at a rate of 0.94 per 10,000 live births [4]. There is considerable diversity in the literature regarding nomenclature and classification systems for fetal ELL [5-8], a situation made more complex by postnatal taxonomies that require a histopathological diagnosis [9,10] even though many neonates born with such lesions do not have them excised [11,12].
CPAM is a developmental malformation of the lower respiratory tract due to failure of maturation of bronchiolar structures during the pseudoglandular stage of lung development, resulting in overgrowth of the terminal bronchioles without corresponding alveoli. The lesion communicates with the tracheobronchial tree and derives its blood supply from the pulmonary arteries [13,14]. BPS results from the formation of a supernumerary nonfunctioning lung bud that is either located within the normal lung tissue (intralobar) or invested in its own separate pleura (extralobar). A systemic blood supply is present in both subtypes. Hybrid lesions displaying characteristics of both CPAM and BPS have been described sonographically and histopathologically [13]. Congenital high airway obstruction syndrome (CHAOS) is also commonly associated with echogenic fetal lungs, but is a discrete pathological entity that has a more predictably poor prognosis than other ELL [6]. Other lesions, such as bronchogenic cysts, are even less common.
These lesions may be associated with mass effect, pleural effusions, hydrops fetalis, fetal demise, or neonatal morbidity and mortality secondary to respiratory embarrassment. It can be difficult to differentiate ELL antenatally, even when MRI is employed in addition to ultrasound [15,16]. The final histopathological diagnosis, when obtained, is often different to the presumptive antenatal categorisation [17-19]. What matters to parents is accurate information regarding the likely outcome of a pregnancy affected by ELL, rather than a precise pathological diagnosis. To this end, a range of sonographic predictors has been suggested, with the most commonly utilised being the CPAM-volume ratio (CVR): the ratio of the volume of the lung lesion (calculated as a prolate ellipsoid, i.e., height × length × width × 0.52) to the head circumference, to normalise for gestation (Fig. 1). This ratio was first described by Liechty et al. [20] in abstract form in 1999. Its first mention in a peer-reviewed journal was by Crombleholme et al. [21] in 2002, in a series in which a CVR threshold of 1.6 cm2 was found to predict the later development of hydrops. Numerous subsequent studies have assessed the utility of this and other CVR thresholds in predicting a range of outcomes, both for lesions presumed to be CPAM and for those deemed to be BPS or of other origin [22]. This paper presents a systematic review of all publications of original research that have reported on the utility of the CVR in predicting the course and outcomes of fetal ELL, and seeks to summarise current evidence for the optimal clinical application of this sonographic index.
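The CVR calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from any of the reviewed studies: the function name `cpam_volume_ratio` and the measurements in the example are hypothetical. With all lengths in centimetres, the lesion volume is in cm3 and the resulting ratio is in cm2.

```python
def cpam_volume_ratio(height_cm, length_cm, width_cm, head_circumference_cm):
    """Return the CVR: prolate-ellipsoid lesion volume / head circumference."""
    measurements = (height_cm, length_cm, width_cm, head_circumference_cm)
    if min(measurements) <= 0:
        raise ValueError("all measurements must be positive")
    # Lesion volume approximated as height x length x width x 0.52 (cm3).
    lesion_volume_cm3 = height_cm * length_cm * width_cm * 0.52
    # Dividing by head circumference (cm) normalises for gestation.
    return lesion_volume_cm3 / head_circumference_cm


# Hypothetical measurements, for illustration only.
cvr = cpam_volume_ratio(height_cm=3.0, length_cm=4.0, width_cm=2.5,
                        head_circumference_cm=22.0)
print(f"CVR = {cvr:.2f}")  # prints "CVR = 0.71"
```

In this invented example the lesion volume is 15.6 cm3 and the head circumference 22 cm, giving a CVR of about 0.71 cm2.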

Search Strategy and Selection Criteria
This study was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [23]. A protocol was developed using the PRISMA-P guidelines [24], and eligibility and exclusion criteria were agreed prior to study commencement. Studies were considered eligible if they reported fetal ultrasound diagnosis of ELL and assessment thereof using the CVR [21]. Despite its sonographic presentation as echogenic lungs, CHAOS cases were excluded from the analysis as they represent a different pathological entity with a more predictable prognosis.
Fig. 1. Image of the three orthogonal measurements of an ELL that permit calculation of the CPAM-volume ratio. Reproduced without modification from Hellmund et al. [44].
Fetal Diagn Ther 2020;47:171-181 DOI: 10.1159/000502841
Publications that did not describe antenatal assessment or postnatal outcomes, or those in which the diagnosis was only obtained after birth, were not eligible for inclusion. Additionally, only studies with a primary ultrasound diagnosis were included, whilst those relying exclusively on other technologies such as MRI were considered ineligible. To minimise the risk of publication bias, case series with fewer than 10 discrete cases were also excluded.
To identify papers of relevance, all of the publications that have cited the original paper by Crombleholme et al. [21] (2002) were reviewed, along with the reference lists of these publications. Additionally, a systematic literature search was undertaken using the PubMed, Medline, Embase, Cochrane Collaboration, and Google Scholar databases, commencing from 2002 (when the CVR was first reported). The key search terms were: "congenital cystic adenomatoid malformation," "CCAM," "bronchopulmonary" AND "sequestration," "echogenic lung lesions," "congenital pulmonary airway malformation," "CPAM," "congenital lung lesion," and "fetal lung masses." Titles and abstracts from electronic searches were scrutinised independently by three reviewers (E.A., K.L.R., and S.C.K.) and assessed for eligibility, and full-text articles of all eligible studies were obtained. Disagreements were resolved by discussion between the three review authors; if consensus could not be reached, the opinion of a fourth author (R.P.-D.) was sought. Similarly, data extraction was performed by two authors (E.A. and S.C.K.), and any discrepancies were addressed by a joint re-evaluation of the article with the other authors.

Data Extraction
The strict study inclusion criteria ensured that papers providing ambiguous or insufficient data regarding antenatal or postnatal management were excluded, as were those pertaining solely to postnatal diagnosis and management of congenital lung malformations.
The basic characteristics of each study were recorded, including study type, study period, number of patients, and inclusion and exclusion criteria. Each study was categorised with respect to the way in which it utilised the CVR: (1) for descriptive purposes only; (2) to determine the utility of a previously proposed threshold in a specific population; and (3) to determine a new CVR threshold for specific fetal and neonatal outcomes.
For papers in categories 2 and 3, additional information was extracted and recorded, including the outcomes predicted by each CVR threshold, the timing of CVR measurement (at initial referral vs. serially), and, where relevant, the process used to determine the CVR threshold when one was proposed.
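The extraction scheme described above could be represented as a simple record type. The sketch below is illustrative only and is not the authors' extraction form: the names `CvrUse` and `StudyRecord`, the field names, and the example entry are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CvrUse(Enum):
    """How a study used the CVR (the three categories in the text)."""
    DESCRIPTIVE = 1          # (1) descriptive purposes only
    EVALUATES_EXISTING = 2   # (2) tests a previously proposed threshold
    PROPOSES_NEW = 3         # (3) proposes a new threshold


@dataclass
class StudyRecord:
    # Basic characteristics recorded for every study.
    citation: str
    study_type: str
    study_period: str
    n_patients: int
    cvr_use: CvrUse
    # Additional items recorded only for categories 2 and 3.
    outcomes_predicted: List[str] = field(default_factory=list)
    cvr_timing: Optional[str] = None        # e.g. "initial" or "serial"
    threshold_method: Optional[str] = None  # how a new threshold was derived

    def needs_threshold_details(self) -> bool:
        """True when the additional category 2/3 items apply."""
        return self.cvr_use is not CvrUse.DESCRIPTIVE


# Hypothetical entry, for illustration only (not a real study).
example = StudyRecord(
    citation="Hypothetical et al.",
    study_type="retrospective cohort",
    study_period="not stated",
    n_patients=40,
    cvr_use=CvrUse.PROPOSES_NEW,
    outcomes_predicted=["hydrops"],
    cvr_timing="initial",
    threshold_method="ROC-derived",
)
print(example.needs_threshold_details())  # prints "True"
```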

Risk of Bias Assessment
For each individual study, the risk of bias and study quality were assessed qualitatively, including representativeness of the sample, prospective collection of data according to a pre-established protocol, endpoints appropriate to the aim of the study, unbiased assessment of the study endpoint, follow-up period appropriate to the aim of the study, loss to follow-up less than 5%, and inclusion of consecutive patients [25].

Statistical Analysis
Descriptive statistics were used to report on the outcome of specific studies where appropriate.

Results
The search strategy based on citations of the original study [21] and reference lists thereof yielded 346 potentially relevant papers. The search of electronic databases yielded 2,502 published works of potential relevance. Following assessment for eligibility and exclusion of duplicates, a total of 52 papers were included in the review [2,16,21,22,26-73], representing 3,190 cases of ELL. The overall search process is summarised in Figure 2.
Of the 52 original papers reporting on the CVR, 34 [2, 16, 26, 28, 29, 31-34, 36, 37, 39, 43, 45, 47-62, 65, 68, 70, 73] did so in a descriptive capacity only: i.e., the ratio was used to describe the baseline characteristics of study populations, or to determine eligibility for inclusion in the study, or to evaluate the effect of interventions such as thoracoamniotic shunting and maternal steroid administration, without proposing or evaluating a threshold for the prediction of outcome.
Five papers [22,30,41,67,71] reported solely on evaluations of existing thresholds. Four used the threshold of 1.6 cm2 for the development of hydrops as initially proposed by Crombleholme et al. [21] (2002), and 1 employed the threshold of 2.0 cm2 for the need for neonatal surgery as proposed by Cass et al. [27] (2011). Four papers proposing new thresholds also evaluated the utility of the 1.6 cm2 threshold [27,38,66,72]. All of these papers are summarised in Table 1.
Thirteen papers [21,27,35,38,40,42,44,46,63,64,66,69,72] proposed CVR thresholds for the prediction of a range of fetal and neonatal outcomes. Five papers, including the original study by Crombleholme et al. [21], assessed fetal outcomes, including hydrops, hydrothorax, heart failure, mediastinal shift, need for fetal intervention, and fetal demise. Seven studies addressed neonatal complications, largely respiratory compromise and the need for respiratory support after birth, but also including nursery admission, need for emergent surgery or ECMO, and death. Two studies presented a composite endpoint of fetal and neonatal outcomes that included, inter alia, perinatal death, hydrops, and the need for respiratory support after birth. These studies are all summarised in Table 2, including the population studied, the timing of CVR measurement, and the specific outcomes for which thresholds were proposed.

Change in CVR over Time
Six studies explicitly assessed the natural history of the change in CVR over time [35,38,40,63,67,69]. In the largest series published to date, the French MALFPULM cohort study of 176 pregnancies affected by fetal lung lesions, first assessed at a mean of 25.8 weeks [35], the CVR was noted to decrease in a constant fashion in cystic/mixed lesions, whereas in hyperechoic lesions, the CVR increased until 27 weeks and declined thereafter. Ehrenberg-Buchner et al. [38] (2013) found that the CVR of larger prenatal lung lesions increased to a peak at around 26 weeks, whereas that of smaller lesions stayed constant or decreased slightly, resulting in a peak mean CVR for the whole cohort of 0.89 cm2 at 26.3 weeks, declining to a final average of 0.58 cm2 at a mean of 30 weeks' gestation. Similarly, Tuzovic et al. [69] (2019) observed that the mean CVR decreased until 32 weeks and then stayed relatively constant in their cohort of 53 fetuses with congenital lung lesions. In contrast, the cohort of fetuses with congenital lung lesions studied by Feghali et al. [40] (2015) demonstrated an increase in the mean CVR until 32 weeks' gestation, which then decreased if the infant was destined for regular nursery care. Infants who required neonatal intensive care had started with much larger CVRs (mean 1.92 vs. 0.31 cm2), which gradually decreased with advancing gestation.
In a study limited to BPS, Riley et al. [63] (2018) found that the gestational age associated with the highest mean CVR for all BPSs was 26 ± 1 weeks with subsequent decrease in mean CVR thereafter. Extralobar BPSs were less likely than intralobar BPSs to decrease in CVR or become isoechoic from initial to final evaluation (71 vs. 94% of lesions, respectively). Finally, Stoiber et al. [67] (2017) identified differential growth patterns between presumed CPAM and BPS in the same cohort. CPAM cases reached a higher mean volume at a mean of 6 weeks after diagnosis (30 weeks). After reaching the peak volume, the CPAM volume decreased in 11 of 24 (46%), while 5 (25%) remained stable and 5 (25%) grew until delivery. BPS decreased or disappeared in 6 of 10 (60%) cases and 2 remained stable, while 1 increased in size until delivery and the other did not have growth assessed.
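Several of the findings above are expressed as the gestational age at which the mean CVR peaks. As a rough sketch of how such a summary might be derived from serial measurements, the following groups scans by gestational week and reports the week with the highest mean CVR. The function name `peak_mean_cvr` and the scan data are hypothetical, and the reviewed studies may have used different methods (for example, smoothing or per-fetus trajectories).

```python
from collections import defaultdict
from statistics import mean


def peak_mean_cvr(measurements):
    """Return (week, mean_cvr) for the gestational week with the highest mean CVR.

    measurements: iterable of (gestational_week, cvr) pairs pooled across fetuses.
    """
    by_week = defaultdict(list)
    for week, cvr in measurements:
        by_week[week].append(cvr)
    if not by_week:
        raise ValueError("no measurements supplied")
    week_means = {week: mean(values) for week, values in by_week.items()}
    peak_week = max(week_means, key=week_means.get)
    return peak_week, week_means[peak_week]


# Hypothetical serial scans (week, CVR in cm2), for illustration only.
scans = [(22, 0.5), (22, 0.75), (26, 1.0), (26, 0.75), (30, 0.5), (30, 0.25)]
week, cvr = peak_mean_cvr(scans)
print(week, cvr)  # prints "26 0.875"
```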

Risk of Bias Assessment
The 18 papers that form the basis of this review were all of sufficient quality to permit inclusion. Their populations were well defined, and the observations performed thereupon were clearly protocolised and performed consistently. The outcomes and timeframes for assessment thereof were appropriate, and there was limited loss to follow-up given the short-term endpoints employed by most studies. All studies but one, however, were retrospective in nature, and the national referral status of the institutions in which some studies were performed would likely have resulted in a preponderance of more severe disease states in their cohorts.

Discussion/Conclusion
In many centres, calculation of the CVR has become an integral aspect of the sonographic assessment of an echogenic fetal lung lesion, with guidelines now commonly suggesting its use [19]. Although the original study in which this index was proposed was elegant in design and execution [21], it is striking to observe the heterogeneous outcomes of studies undertaken since then, to the extent that a clearly defined threshold above which a poor outcome can reasonably be expected, or conversely below which a favourable outcome might be anticipated, cannot easily be defined.
There are many reasons as to why this might be the case. The studies themselves clearly demonstrate substantial heterogeneity, with populations that differ in institution(s) of origin, lesion studied and timing of CVR assessment, and with varied approaches to the calculation of predictive thresholds (absolute vs. AUC-derived), thereby precluding meta-analysis of their findings. Furthermore, ultrasound technology has advanced substantially in the time since Crombleholme's original paper [21], with considerable improvements in image resolution. Additionally, more pregnant women are having more scans, increasing the opportunity for the identification of lesions that previously would have gone unrecognised [3]. It is thus likely that we are now identifying a greater proportion of lesions of lower perinatal significance, highlighted by the relative infrequency of hydrops and generally favourable outcomes in more recent studies conducted in cohorts derived from broader populations [35,38,46,72] than those found at national referral centres for fetal intervention [21,26,60].
For this reason, leaving aside its heterogeneity, the evidence presented in this review might suggest that a threshold much lower than 1.6 cm2 is likely to be of greater utility in most populations for many outcomes of perinatal relevance. For neonatal outcomes such as respiratory symptoms, need for surgery, nursery admission, ECMO, or death, a CVR on the initial ultrasound scan ranging from 0.5 cm2 [69] to 1.0 cm2 [38] appears to have the greatest predictive value (i.e., thresholds above which these outcomes are more likely to occur in neonates), with most studies [40,42,72] tending toward the lower rather than upper end of this spectrum. Similarly, the two studies that evaluated composite fetal/neonatal outcomes (including perinatal death, hydrops, and the need for postnatal respiratory support) found thresholds of 0.91 and 0.45 cm2 to have the highest negative predictive values (i.e., thresholds below which these composite outcomes did not occur) when derived from the initial ultrasound scan [44,46]. However, given that these alternative thresholds are largely based on relatively small retrospective studies, caution needs to be exercised in their application, as their true clinical meaning is yet to be tested in prospective studies of adequate size.
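To make the notions of positive and negative predictive value at a given CVR threshold concrete, the sketch below computes both from paired CVR measurements and outcomes. It is an illustration under stated assumptions, not a reproduction of any study's analysis: the function name `predictive_values` and the eight data points are invented, and a test is treated as positive when the CVR is strictly above the threshold.

```python
def predictive_values(cvrs, adverse_outcomes, threshold):
    """Return (PPV, NPV) for the rule "CVR > threshold predicts an adverse outcome".

    Either value is None when it is undefined (no test-positive or no
    test-negative cases).
    """
    tp = fp = tn = fn = 0
    for cvr, adverse in zip(cvrs, adverse_outcomes):
        if cvr > threshold:      # test positive
            if adverse:
                tp += 1
            else:
                fp += 1
        else:                    # test negative
            if adverse:
                fn += 1
            else:
                tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical cohort, for illustration only.
cvrs = [0.2, 0.3, 0.4, 0.6, 0.8, 1.2, 1.7, 2.1]
adverse = [False, False, False, False, True, False, True, True]
ppv, npv = predictive_values(cvrs, adverse, threshold=0.5)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # prints "PPV = 0.60, NPV = 1.00"
```

In this invented cohort no fetus with a CVR at or below 0.5 cm2 had an adverse outcome, so the NPV is 1.0 even though the PPV is modest, which mirrors how a low threshold can be more useful for excluding outcomes than for predicting them.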
It is not surprising that CVR thresholds for the outcome of hydrops are relatively high, ranging up to 2.2 cm2 [66], but even here one study found a lower threshold of 0.75 cm2 [63] on the maximum rather than initial CVR assessment, although admittedly this was in a cohort limited to BPS. The fetal outcome of 'compression' used by the MALFPULM investigators (technically a composite of mediastinal shift, eversion of the diaphragm, hydrothorax, ascites, polyhydramnios or hydrops) may be more clinically useful than hydrops alone, and was unlikely with a CVR < 0.4 cm2 on initial assessment in this large contemporary cohort [35].
The patterns of lesion growth as assessed by serial CVR measurement also varied among studies, although generally lesions were seen to grow until the late second or early third trimester, and then plateau or decrease slightly in size. The highest specificities of any CVR assessment in any study were those of the final assessments in the cohort of Ehrenberg-Buchner et al. [38] (2013), in keeping with the intuitive concept that the largest lesions at the end of [...]
Many of the studies identified in this review sought to compare the capacity of the sonographically derived CVR to predict various fetal and neonatal outcomes with other indices and variables, such as MRI-derived CVR [39], cardiomediastinal shift angle [66], cardiac axis and cardiac position [69], and absolute measures of lesion volume by ultrasound [42] and MRI [16,73]. None has conclusively demonstrated sufficient superiority to the CVR to merit its replacement as a key component of the sonographic assessment of fetal ELL.
The disparate findings of studies included in this review invite the question of how a more precise estimate of the CVR's predictive utility might be achieved. Notwithstanding their other variations, studies included in this review were consistent without exception in the method employed for assessment of the CVR. As a result, an individual patient data meta-analysis (IPD-MA) could be one means by which potentially greater value might be extracted from extant data. However, a significant limitation of the individual studies, and thus any meta-analysis thereof, is that the vast majority are retrospective in nature, increasing the chance of case selection bias [74]. For this reason, further prospective studies, ideally at a national level like the MALFPULM cohort [35], would be very valuable. The requisite infrastructure for these register-type projects should, of course, be deployed for other fetal anomalies as well, enhancing their value and helping to justify the cost of their establishment. Such studies should use consistent nomenclature, assess the CVR at pre-specified gestations, use internationally agreed definitions for diagnoses such as hydrops, and ensure that any and all "core outcome" sets of relevance to these populations are employed (such as those developed by the Core Outcomes in Women's and Newborn Health [CROWN] initiative [75]).
In the meantime, it would not be inappropriate for clinical practice guidelines, and the institutions to which they apply, to adopt a more nuanced approach to the use of the CVR for echogenic fetal lung lesions, rather than the longstanding 1.6 cm2 threshold generally employed to inform patient counselling and determine frequency of surveillance. Although this review has shown that this original threshold is not inappropriate for the prediction of hydrops, lower thresholds, once adequately assessed in prospective studies, may be of greater utility for the exclusion of other fetal and neonatal concerns that are more frequent in incidence, and thus potentially of greater relevance to fetal medicine practitioners and their patients alike.
Abbreviations: PPV, positive predictive value; NPV, negative predictive value; CPAM, congenital pulmonary airway malformation; BPS, bronchopulmonary sequestration; SD, standard deviation; CI, confidence interval; AUC, area under the receiver operating characteristic curve; aOR, adjusted odds ratio; ECMO, extracorporeal membrane oxygenation.