Introduction
Obtaining precise growth and dating equations from ultrasound fetal measurements of crown-rump length (CRL) during the first trimester is of great importance, especially for the diagnosis of growth restriction. The time window for screening is during and at the boundary of the first trimester, for a gestational age (GA) between 11 weeks + 0 days and 13 weeks + 6 days (CRL between 45 and 84 mm) [1]. To date, the most widely accepted reference for CRL according to fetal age (FA) (FA = GA – 14 days) was proposed by Robinson [2] in 1973, based on measurements taken in an Australian population. Regardless of their validity, Robinson’s data were obtained using ultrasound technologies different from the modern ultrasound devices now routinely employed. Given the great potential impact of ultrasound measurements on medical decisions taken in the aneuploidy diagnosis process, it is crucial to obtain more precise knowledge of CRL variations when using current ultrasound technologies. Our study of CRL variation during the first trimester in a French general population complements the results of Salomon et al. [3] regarding the second and third trimesters.
Since Robinson [2], several CRL curves have been proposed. Hadlock et al. [4] published an equation that integrates data from vaginal measurements, and Verwoerd-Dikkeboom et al. [5] proposed an equation obtained with data stemming from three-dimensional acquisitions. In 2008, Verbrug et al. [6] and in 2010, Papaioannou et al. [7] established new charts based on multiethnic samples taken by multiple operators with various devices. In 2010, Pexsters et al. [8] established a multi-operator curve on a population sample close to that used in our study. The present work provides a first-trimester external validation of these recent charts and proposes the use of our equations in routine fetal aneuploidy screening. Moreover, our work offers, for the first time, the modeling of standard deviations, which is a prerequisite step towards the definition of normal fetal growth parameters.
Materials and Methods
Data
This work was conducted in total agreement with French ethical principles.
Ultrasonic measurements from 402 in vitro fertilization (IVF) pregnancies were performed by a single operator (B.B.) in Nice and Monaco scan centers between February 1989 and September 2010. In this sample, non-viable embryos were excluded. There were 78 (19.4%) twin and 8 (1.99%) triplet pregnancies. In case of a multiple pregnancy, only 1 embryo measurement, chosen at random, was retained. Comparisons of CRL between single and multiple pregnancies were performed using the Wilcoxon and Chow tests. Wilcoxon’s signed-rank test compares two samples and tests whether the population means differ [9]. This was possible since the FA distributions were similar between groups. However, since CRL clearly depends on FA, the Chow test was more suitable as it compares the modeling of CRL as a function of FA and checks whether the same regression can be applied to both samples or whether two different models are needed [9,10,11,12].
The general population sample consisted of 2,123 spontaneous pregnancies examined in one center (Lambersart) by a single operator (M.C.) between January 2007 and February 2011. Pregnancies from IVF were excluded. To determine the date of fecundation as accurately as possible, we only took into account the pregnancies of women who had regular periods with a clearly known and stated cycle duration, who knew the date of their last menstruation and who had not used contraception for at least 3 months before pregnancy. As it is well known that CRL measurements are less accurate on large embryos, only CRL measurements <85 mm were entered. A total of 513 pregnancies complied with our selection criteria and were scored. As ovulation occurs 14 days before the next menstruation, and not necessarily in the middle of the menstrual cycle (because ovulation may be delayed [13]), day 0 of fetal life was determined as 14 days before the next estimated menstruation. In case of a multiple pregnancy, the measurement of only 1 embryo was recorded.
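The dating rule described above (day 0 of fetal life placed 14 days before the next expected menstruation) can be illustrated with a minimal sketch. The function names, the example dates and the 32-day cycle are hypothetical and are not taken from the study data.

```python
from datetime import date, timedelta


def estimated_conception_date(last_period: date, cycle_length_days: int) -> date:
    """Day 0 of fetal life: 14 days before the next expected menstruation."""
    next_period = last_period + timedelta(days=cycle_length_days)
    return next_period - timedelta(days=14)


def fetal_age_days(last_period: date, cycle_length_days: int, scan_date: date) -> int:
    """Fetal age (FA) in days at the time of the ultrasound examination."""
    return (scan_date - estimated_conception_date(last_period, cycle_length_days)).days


# Hypothetical example: a woman with a regular 32-day cycle.
example_last_period = date(2010, 3, 1)
example_scan = date(2010, 5, 10)
example_day0 = estimated_conception_date(example_last_period, 32)
example_fa = fetal_age_days(example_last_period, 32, example_scan)
print("estimated day 0:", example_day0, "| FA at scan (days):", example_fa)
```

With a 28-day cycle this rule reduces to FA = GA – 14 days; with longer cycles, day 0 is shifted later by the extra cycle length.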
As several examinations were sometimes conducted during follow-up visits, only the measurement taken during the first examination was included in the dataset. Both IVF (B.B. sample) and spontaneous (M.C. sample) pregnancy measures were performed with devices from the same manufacturer: Kretz/General Electric. IVF measurements (B.B. sample) were collected with different versions of Kretz Voluson 530, 730, Expert, General Electric E8 (from February 16, 1989, to September 17, 2010) and spontaneous pregnancy measures (M.C. sample) with Kretz Voluson Expert V5 (from January 2, 2006, to January 1, 2007) and General Electric E8 (from January 2, 2008, to December 23, 2010). The measures were taken and recorded using MEDIS PHINX software (http://www.isisphinx.com/), which allows a precise collection of the clinical circumstances during the ultrasound examination.
The method for measuring embryonic lengths was optimized for every examination, taking into account the size of the embryo and the prevailing technical conditions. For a fetal size <20 mm, the neck-rump length was measured [8]. The measurements were taken as close to a sagittal section as possible (this criterion is less important for small embryos). In most cases, a sagittal section was obtained using a 2D scan via vaginal or abdominal access. In case of difficulties, the sagittal section was obtained using a 3D acquisition [6] (fig. 1). For CRL <45 mm, the measurement used was almost always that of an embryo in flexion. For greater lengths, we looked for a neutral position, with the presence of a triangle of amniotic fluid between the chin and the breastbone.
Fig. 1
New first-term CRL equations: ultrasound pictures illustrating embryonic length measurements. Sagittal sections of an 8-mm (a) and a 38-mm embryo (b). c Sagittal section of the 38-mm embryo built with triplane mode from a 3D acquisition.
Fig. 2
New first-term CRL equations. Ultrasound picture illustrating quality criteria for CRL <45 mm (19 mm; a) and >45 mm (59 mm; b).
The quality criteria defined for the CRL measurements were as follows:
For CRL <45 mm (fig. 2a)
(1) Sagittal section: visibility of the fourth ventricle and of the origin of the umbilical cord, followed by the addition of the rachis and the genital tuber.
(2) Favorable placement of calipers: crown and rump were well defined, without edge contact, and the calipers were placed on the outer side of the interface (greatest length). Note that at this age, the greatest length is obtained by placing the cephalic caliper at the crown-neck junction [8].
(3) Zoom: the image of the fetus filled at least 3/4 of the screen.
For CRL >45 mm (fig. 2b)
(1) Fetus in neutral position: presence of a triangle of amniotic fluid between the chin and the breastbone.
(2) Sagittal section: visibility of the hard palate or the nasal bone, the rachis and the genital tuber (fig. 2).
(3) Favorable placement of calipers: as in the previous case.
(4) Zoom: as in the previous case.
To clean the data, we used non-parametric tolerance limits [14,15]. For a Gaussian-distributed parameter, the consensus reference values are usually those between –2SD and +2SD (95% of the whole population); for a non-Gaussian-distributed parameter, as is probably the case here, there is no consensus range [16]. The 3rd and 97th, the 5th and 95th, or the 10th and 90th percentiles are chosen as reference values according to the biological or physiological parameter under investigation, thus excluding 6, 10 or 20% of the population with extreme values [17,18,19]. We performed a preliminary robust regression, which is not sensitive to outliers and provides weights that were used to clean the data. This led us to exclude 8.4% of the pregnancies with extreme values, 3.9% above and 4.5% below.
Mathematical Methods
Regressions
Several regression models were tested. Numerical tests, based on comparisons of R2 statistics, led us to growth equations where CRL was described by polynomials of degree 2 in FA. Fisher’s significance tests for the highest degree coefficients were performed and showed significant coefficients. The choice of degree 2 polynomials avoided overfitting of the model to the data and was in agreement with the settings used by Robinson [2], Verwoerd-Dikkeboom et al. [5] or Pexsters et al. [8]. Papaioannou et al. [7] used a polynomial of degree 2 in FA to describe the square root of the CRL.
Whereas classical least-square estimators assume a constant variance of the residuals, the discrepancy between predictions and observations clearly depended on the FA at the time of measure in our case. Thus, we corrected the heteroskedasticity to account for the different fluctuations of the residuals and to obtain an equation for the latter. Therefore, we applied generalized least-square regressions [9,20]. The modeling of the standard deviation was very important since our purpose was to further define Z-scores and normal parameters.
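The heteroskedasticity correction can be sketched as a two-step (feasible) weighted least-squares fit: an ordinary fit, a quadratic model of the squared residuals as a function of FA, then a refit with weights equal to the inverse of the modeled variance. This is only an illustration under stated assumptions: the exact generalized least-square procedure of the study is not reproduced, the helper names are hypothetical, and the data are synthetic.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_quadratic(x, y, w=None):
    """Weighted least-squares fit of y = c0 + c1*x + c2*x^2 (normal equations)."""
    w = w or [1.0] * len(x)
    rows = [[1.0, xi, xi * xi] for xi in x]
    a = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows)) for k in range(3)] for j in range(3)]
    b = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y)) for j in range(3)]
    return solve(a, b)


def predict(c, xi):
    return c[0] + c[1] * xi + c[2] * xi * xi


def heteroskedastic_fit(x, y, var_floor=1e-6):
    """Return (mean coefficients, variance coefficients) from a two-step fit."""
    ols = fit_quadratic(x, y)                                   # step 1: ordinary fit
    sq_res = [(yi - predict(ols, xi)) ** 2 for xi, yi in zip(x, y)]
    var_coefs = fit_quadratic(x, sq_res)                        # step 2: variance model
    weights = [1.0 / max(predict(var_coefs, xi), var_floor) for xi in x]
    return fit_quadratic(x, y, weights), var_coefs              # step 3: weighted refit


# Synthetic illustration only (not the study data): the noise grows with FA.
rng = random.Random(0)
fa = [float(d) for d in range(25, 86)]
crl = [-4.0 - 0.18 * d + 0.015 * d * d + rng.gauss(0.0, 0.05 * d) for d in fa]
mean_coefs, var_coefs = heteroskedastic_fit(fa, crl)
print("mean coefficients:", mean_coefs)
print("variance coefficients:", var_coefs)
```

The variance coefficients returned by the second step play the role of the σ² equations reported with the growth equations; the floor on the modeled variance is a practical safeguard added for this sketch.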
Our dataset exhibited outliers due to, among other factors, imprecision in the fecundation time, which could compromise the quality of the regression results. The robust regression methods we used are designed to be insensitive to the presence of outliers [12,21]. We used the lmRob function from the robust package for R, available on CRAN [22]. The study was completed by modeling the variance of the residuals over time. In addition, the observations corresponding to CRL <45 or >45 mm were reweighted so that these two periods had equivalent importance in the regression.
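To illustrate why a robust fit is less affected by outliers, the sketch below uses a Huber M-estimator computed by iteratively reweighted least squares. This is a deliberately simpler stand-in and not the estimator implemented by lmRob; the tuning constant 1.345, the helper names and the synthetic data with one gross outlier are assumptions made for the example.

```python
import statistics


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_quadratic(x, y, w=None):
    """Weighted least-squares fit of y = c0 + c1*x + c2*x^2."""
    w = w or [1.0] * len(x)
    rows = [[1.0, xi, xi * xi] for xi in x]
    a = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows)) for k in range(3)] for j in range(3)]
    b = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y)) for j in range(3)]
    return solve(a, b)


def predict(c, xi):
    return c[0] + c[1] * xi + c[2] * xi * xi


def huber_fit(x, y, k=1.345, n_iter=50, scale_floor=1e-6):
    """Huber M-estimate of a quadratic, by iteratively reweighted least squares."""
    coefs = fit_quadratic(x, y)  # start from ordinary least squares
    for _ in range(n_iter):
        res = [yi - predict(coefs, xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for Gaussian errors.
        scale = max(statistics.median(abs(r) for r in res) / 0.6745, scale_floor)
        # Observations with large residuals receive weights below 1.
        weights = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in res]
        coefs = fit_quadratic(x, y, weights)
    return coefs


# Synthetic illustration: an exact quadratic with one gross outlier at FA = 55.
fa = [float(d) for d in range(25, 86, 2)]
truth = [-4.0 - 0.18 * d + 0.015 * d * d for d in fa]
crl = truth[:]
crl[15] += 30.0
ols_error = abs(predict(fit_quadratic(fa, crl), 55.0) - truth[15])
robust_error = abs(predict(huber_fit(fa, crl), 55.0) - truth[15])
print("error at FA = 55 d, ordinary fit:", ols_error)
print("error at FA = 55 d, robust fit:", robust_error)
```

The final weights can also be inspected to flag suspicious observations, which is the idea behind using robust-regression weights for data cleaning.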
A similar methodology was used to provide dating equations, where FA is predicted as a function of CRL.
Finally, we also considered mixtures of regressions and looked for breakpoints in the fetal growth. This provides an alternative to growth equations currently found in the literature, in an attempt to better understand the observed curvature of the data. We used the FlexMix package [23] to separate the sample into two groups on which regressions are estimated.
Validation of the Results
We considered two criteria to compare the results provided by the different regressions. The cross-validation revealed methods with good predictive power; the study of the differences between predictions and measurements provided information on the accuracy of prediction.
The idea behind cross-validation was to separate the observations into two samples: the first one, the training sample, was used to obtain the regression formula that expressed the explained variable as a function of the explanatory variables; the explanatory variables of the second sample, the test sample, were used in these regressions to provide predictions of the explained variable. These predictions were compared with the true measurements. We chose a leave-one-out cross-validation that uses test samples of size 1 and repeats the procedure by successively excluding each observation from the training data. We did this for each regression that had been proposed (heteroskedastic, robust, Robinson’s curve) and ranked the methods according to the number of times their predictions were the most accurate.
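The leave-one-out ranking can be sketched as follows: each observation is removed in turn, every candidate model is refitted on the remaining data, and the model whose prediction is closest to the removed measurement scores one win. In this illustration the competing models are simply polynomials of degree 1 and 2 fitted by ordinary least squares; the function names and the synthetic data are assumptions, not the regressions compared in the study.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_poly(x, y, degree):
    """Ordinary least-squares polynomial fit; coefficients in increasing order."""
    n = degree + 1
    rows = [[xi ** p for p in range(n)] for xi in x]
    a = [[sum(r[j] * r[k] for r in rows) for k in range(n)] for j in range(n)]
    b = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(n)]
    return solve(a, b)


def predict_poly(c, xi):
    return sum(cj * xi ** j for j, cj in enumerate(c))


def loo_wins(x, y, degrees):
    """Count, for each degree, how often its leave-one-out prediction is the closest."""
    wins = {d: 0 for d in degrees}
    for i in range(len(x)):
        x_train, y_train = x[:i] + x[i + 1:], y[:i] + y[i + 1:]  # test sample of size 1
        errors = {
            d: abs(predict_poly(fit_poly(x_train, y_train, d), x[i]) - y[i])
            for d in degrees
        }
        wins[min(errors, key=errors.get)] += 1
    return wins


# Synthetic illustration only (not the study data).
rng = random.Random(1)
fa = [float(d) for d in range(25, 86)]
crl = [-4.0 - 0.18 * d + 0.015 * d * d + rng.gauss(0.0, 2.0) for d in fa]
print("wins per polynomial degree:", loo_wins(fa, crl, (1, 2)))
```

Dividing each count by the number of observations gives percentages of the kind reported in the cross-validation tables.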
To ensure that the methods did not perform too badly when they did not provide the most accurate predictions, we also studied the distribution of the deviations between the predictions and the measurements.
Results
Growth Equations from the IVF Pregnancies
Of the 402 pregnancies, there were 78 (19.40%) twin pregnancies and 8 (1.99%) triplet pregnancies. Preliminary tests [Wilcoxon and Chow; for more information, see online electronic supplementary materials (ESM) at www.karger.com?doi=10.1159/000339272] showed that multiple pregnancies did not need to be separated from single pregnancies and that the measurements made in Nice and Monaco could be pooled. Similarly, it was verified that the changes in machines during the 21 years of measurements did not have a significant impact.
Heteroskedastic and robust regressions were performed, and the resulting regressions were cross-validated (table 1). Overall, for the growth equations, the robust regression gave the best prediction in 64.56% of cases:
$$\mathrm{CRL} = -3.3108 - 0.2087\,\mathrm{FA} + 1.5250 \times 10^{-2}\,\mathrm{FA}^2 \qquad (1)$$
with R² = 73.79%
$$\sigma^2 = 46.2354 - 2.0194\,\mathrm{FA} + 0.0230\,\mathrm{FA}^2$$
Tabulated values are given in the ESM.
We used robust tests that generalize the Wald test for linear hypotheses [12]. The null hypotheses that the coefficients in equation 1 were equal to those of Robinson, Pexsters or Verwoerd-Dikkeboom were all rejected (p < 0.01).
The cross-validation (table 1) indicated that CRL predicted with our model were closer to the true values in 64.56, 67.72, 63.68 and 54.56% of the cases compared with Robinson’s predictions, Papaioannou’s curve, Pexsters’ curve and Verwoerd-Dikkeboom’s curve, respectively. Prediction errors with equation 1 also had a smaller standard deviation (3.43 mm).
Growth Equation from the Spontaneous Pregnancies
Using the same methodology as above, we performed heteroskedastic and robust regressions with the CRL measures from spontaneous pregnancies.
$$\mathrm{CRL} = -4.1212 - 0.1824\,\mathrm{FA} + 0.0148\,\mathrm{FA}^2 \qquad (2)$$
with R² = 78.48%
$$\sigma^2 = -53.1054 + 2.5634\,\mathrm{FA} - 0.0189\,\mathrm{FA}^2$$
The 95% confidence interval (CI) of CRL was also determined:
$$\mathrm{CI} = \left[\,-4.1212 - 0.1824\,\mathrm{FA} + 0.0148\,\mathrm{FA}^2 \pm \kappa \left(-53.1054 + 2.5634\,\mathrm{FA} - 0.0189\,\mathrm{FA}^2\right)^{1/2}\right]$$
The constant κ was calibrated so that 95% of the observations were included in the confidence interval, which gave κ = 1.85.
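As a worked illustration, the interval above can be evaluated numerically from the published coefficients of equation 2 and its variance model. The function names are hypothetical, and the Z-score shown is the usual standardized deviation (observation minus predicted mean, divided by the modeled SD), given here as an assumption about how such scores would be derived from these equations.

```python
import math

MEAN_COEFS = (-4.1212, -0.1824, 0.0148)    # equation 2: CRL (mm) as a function of FA (days)
VAR_COEFS = (-53.1054, 2.5634, -0.0189)    # variance model associated with equation 2
KAPPA = 1.85                               # calibration constant reported in the text


def quadratic(coefs, fa):
    c0, c1, c2 = coefs
    return c0 + c1 * fa + c2 * fa * fa


def crl_sd(fa):
    """Modeled standard deviation of CRL at a given FA."""
    variance = quadratic(VAR_COEFS, fa)
    if variance <= 0:
        # The quadratic variance model is only positive over a limited FA range.
        raise ValueError("variance model is not positive at this FA")
    return math.sqrt(variance)


def crl_interval(fa, kappa=KAPPA):
    """Return (lower bound, predicted mean, upper bound) of CRL in mm."""
    mean = quadratic(MEAN_COEFS, fa)
    half_width = kappa * crl_sd(fa)
    return mean - half_width, mean, mean + half_width


def z_score(fa, crl):
    """Standardized deviation of a measured CRL from the predicted mean."""
    return (crl - quadratic(MEAN_COEFS, fa)) / crl_sd(fa)


low, mean, high = crl_interval(70)
print(f"FA = 70 d: CRL = {mean:.2f} mm, interval [{low:.2f}, {high:.2f}] mm")
```

Because the published coefficients are rounded, values computed this way may differ slightly from those quoted in the text and in the ESM tables.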
This regression 2 was compared to Robinson, Papaioannou, Pexsters and Verwoerd-Dikkeboom’s equations and to equation 1 (obtained with IVFs). Robust Wald-type tests [12] showed that with the variance in our observations, our equation 2 could be considered as significantly different from Robinson’s regression (p = 0.0261). The comparison with Robinson’s growth equation showed that the latter leads to an overestimation of FA when the CRL measurements are <45 mm and an underestimation for CRL between 45 and 84 mm as detailed below. The equations of Pexsters, Verwoerd-Dikkeboom and equation 1 were not statistically different from equation 2 (p = 0.0768, 0.3969 and 0.8189, respectively), although the cross-validation and the study of the errors between predictions and observations (table 2) indicated that our equation 2 provided more accurate predictions.
Using spontaneous pregnancy data, for FA = 28 days, the spontaneous pregnancy equation 2 predicted CRL = 2.39 mm while the IVF equation 1 gave CRL = 2.80 mm, a difference of 0.41 mm between the two equations. Robinson’s curve predicted CRL = 3.23 mm (+0.84 mm) for this same FA (28 days), equivalent to an overestimation of 1 day, Papaioannou’s curve predicted CRL = 1.86 mm (–0.53 mm), Pexsters’ curve predicted CRL = 1.16 mm (–1.23 mm) and Verwoerd-Dikkeboom’s curve predicted CRL = 2.51 mm (+0.12 mm).
For FA = 70 days, the spontaneous pregnancy equation 2 predicted CRL = 55.77 mm. We found a difference of 1.03 mm with the IVF equation 1, –1.00 mm with Robinson’s curve, –11.11 mm with Papaioannou’s curve, –2.03 mm with Pexsters’ curve and –0.35 mm with Verwoerd-Dikkeboom’s curve.
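The comparison between the two growth equations at FA = 28 and 70 days can be reproduced approximately from the published coefficients. This is a minimal sketch with hypothetical names; since the coefficients are rounded, the computed values agree with those quoted above only to within about 0.1 to 0.2 mm.

```python
EQ1_IVF = (-3.3108, -0.2087, 1.5250e-2)     # equation 1 (IVF pregnancies)
EQ2_SPONTANEOUS = (-4.1212, -0.1824, 0.0148)  # equation 2 (spontaneous pregnancies)


def predicted_crl(coefs, fa_days):
    """CRL in mm predicted by a degree-2 growth equation at a given FA in days."""
    c0, c1, c2 = coefs
    return c0 + c1 * fa_days + c2 * fa_days ** 2


for fa_days in (28, 70):
    ivf = predicted_crl(EQ1_IVF, fa_days)
    spontaneous = predicted_crl(EQ2_SPONTANEOUS, fa_days)
    print(
        f"FA = {fa_days} d: eq. 1 = {ivf:.2f} mm, eq. 2 = {spontaneous:.2f} mm, "
        f"difference = {ivf - spontaneous:+.2f} mm"
    )
```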
For CRL <20 mm, no significant difference was noticed between the IVF equation 1 and the spontaneous pregnancy equation 2 (p = 0.8713). In this range, the mean difference between the predictions of equations 1 and 2 was 0.43 mm (median of 0.42 mm and standard deviation of 0.039 mm, p < 0.05), which corresponds to less than 1 day difference for FA. Thus, during this period, the dates predicted by these equations were either similar or differed negligibly (see dating tables in ESM). However, for CRL ≥20 mm, the deviation between equations 1 and 2 slightly increased. Graphically (fig. 3b), the IVF equation 1 overestimated the CRL.
Fig. 3
New first-term growth equations: The data from spontaneous pregnancies are represented. A figure with the IVF data is presented in the ESM. a Predictions of the CRL values with equations 1 and 2 (from our spontaneous pregnancy data). b Predictions of the CRL values using equation 2 and using Papaioannou’s, Pexsters’, Verwoerd-Dikkeboom’s, and Robinson’s models. The 95% confidence interval associated with equation 2, with κ = 1.96, is drawn in dashed black lines.
A t-test showed that, for the spontaneous pregnancies, the proportion of overestimations by the IVF equation 1 was statistically higher than when the spontaneous pregnancy equation 2 was used (p = 0.0330). The mean difference between the CRL predictions of equations 1 and 2 was 0.94 mm, with a median of 0.94 mm and a standard deviation of 0.223 mm.
Dating Equations: Predicting FA with the Ultrasound Measure of CRL
Today, the determination of FA from the ultrasound measure of CRL is based on Robinson’s equation [24] or on the equation by Papaioannou et al. [7] (see equations in the ESM). Like these authors, we opted to build a mathematical model from our data predicting FA from CRL. Indeed, in a stochastic setting, inverting the equations linking measures of CRL to FA may not be optimal. We built mathematical models from our data using the same regression methods and validation procedures as described in the previous sections.
Using the IVF data, we obtained the IVF dating equation for predicting FA from CRL:
For the pregnancies from spontaneous conceptions, the heteroskedastic regression was shown to be more accurate when compared with the equation obtained by robust regression (with a cross-validation and a study of the prediction error, data not shown). These spontaneous pregnancy dating equations were:
and showed relatively small heteroskedasticity.
Fisher’s tests with heteroskedastic corrections showed that the coefficients of equation 4 and those of Robinson [24] were statistically different (p = 0.0321). The difference between the coefficients of the spontaneous pregnancy dating equation 4 and the IVF dating equation 3 was not statistically significant (p = 0.9943). Papaioannou et al. [7] used a polynomial of degree 2 to express FA as a function of CRL and CRL², and thus the coefficients could not be compared with those of dating equation 4.
We compared the spontaneous pregnancy dating equation 4 with the IVF dating equation 3, Robinson’s equation and Papaioannou’s equation (table 3). This comparison showed that the spontaneous pregnancy dating equation 4 and Papaioannou’s equation had similar predictive power and equivalent precision in the predictions. Both outperformed Robinson’s equation, but the latter still provided correct predictions for CRL >45 mm, with an average error of 3.5 days (against 3 days for equation 4 and Papaioannou’s equation; fig. 4). When we compared equation 4 and Robinson’s dating model, we obtained an average discrepancy of 2.99 mm and a standard deviation of 1.94 for CRL <45 mm (mean: –0.39 mm, SD 0.32 for CRL of 45–84 mm). Thus, Robinson’s dating model was more accurate for CRL of 45–84 mm than for CRL <45 mm.
Fig. 4
New first-term CRL equations: predictions of FA (time since fecundation) from the measurements of the CRL using the equations found with various methods. We represented data from the sample of spontaneous pregnancies.
Optimal Period to Predict FA with CRL
From equation 3, the confidence interval of the FA predicted from the CRL was narrowest at CRL = 23.5950 mm, corresponding to FA = 49.38 days. The CRL standard deviation at this time was estimated to be 1.8292 mm. This corresponds to the period where there is a trade-off between measuring greater lengths and the adequate positioning of the embryo. Around FA = 50 days, the embryo is indeed quite curved in shape and motionless.
Breakpoint
Fig. 5
New first-term CRL equations: Estimation of mixed linear regression with the FlexMix package, with the two clusters identified. IVF data are used.
As the graphical representation of CRL by FA (fig. 3a) suggested a breakpoint in fetal growth, instead of using polynomials of degree 2 to model CRL as a function of FA, we aimed to reproduce the curvature of the growth curve by modeling a breakpoint between two subperiods. We used the package FlexMix [23]. Two groups were obtained (fig. 5) that may be modeled with the following equations:
$$\mathrm{CRL} = -21.15 + 0.7642\,\mathrm{FA} + 2.820 \times 10^{-3}\,\mathrm{FA}^2$$
$$\mathrm{CRL} = -28.1408 + 0.7106\,\mathrm{FA} + 7.364 \times 10^{-3}\,\mathrm{FA}^2$$
with R² statistics of 99.15 and 97.36%, respectively. Both equations indicated a breakpoint at FA = 45.56 days. The method distinguished two samples of 211 and 359 observations, respectively, in which FA differed, the first sample being composed of small FA. After the breakpoint, we observed an acceleration of the growth regime. This change in the rate of fetal growth may be related to improved fetal perfusion, which corresponds to the beginning of the maternal blood connection.
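One way to check the reported breakpoint is to compute where the two fitted curves cross. The sketch below assumes that the quadratic coefficient of the first equation is 2.820 × 10⁻³ (the same order of magnitude as in the second equation); under this assumption the two curves intersect at the reported FA, at a CRL of about 19.5 mm. The names used are hypothetical.

```python
import math

EARLY = (-21.15, 0.7642, 2.820e-3)    # first group (small FA); 10^-3 factor is an assumption
LATE = (-28.1408, 0.7106, 7.364e-3)   # second group (larger FA)


def quadratic(coefs, fa):
    c0, c1, c2 = coefs
    return c0 + c1 * fa + c2 * fa * fa


def crossing_points(p, q):
    """FA values at which two degree-2 growth curves predict the same CRL."""
    c, b, a = (pi - qi for pi, qi in zip(p, q))  # difference curve: c + b*FA + a*FA^2
    discriminant = b * b - 4 * a * c
    if a == 0 or discriminant < 0:
        return []
    root = math.sqrt(discriminant)
    return sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))


# Keep only the physically meaningful (positive) crossing.
breakpoint_fa = [fa for fa in crossing_points(EARLY, LATE) if fa > 0][0]
breakpoint_crl = quadratic(EARLY, breakpoint_fa)
print(f"curves cross at FA = {breakpoint_fa:.2f} d, CRL = {breakpoint_crl:.2f} mm")
```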
Discussion
Robinson’s curve remains a historical benchmark that is still widely used, even though there have been many other studies concerning these measurements since 1973. In view of diagnosis of possible growth restriction, for example, our first purpose was to propose new growth equations as precise and robust as possible for modeling CRL as a function of FA using modern equipment and following the standardized procedures for ultrasound measurements. Careful statistical comparison with equations found in the literature was carried out.
The strength of our study was the use of a general population where ultrasound measurements had been performed by a single operator using devices from the same manufacturer: Kretz/General Electric. Moreover, dedicated software specifically developed for this work was used for the daily collection of data. Measurements were optimized by the choice of the best method, transabdominal or transvaginal ultrasound, for each examination. Difficult measurements benefitted from 3D reconstructions. Moreover, we used an explicit methodology to clean our data. The statistical accuracy of our equations was established by cross-validation and by studying the errors of the different predictions. Comparisons between the various predictions showed that for CRL between 45 and 85 mm, the differences between older and more recent charts were small. Also, the recent equations benefitted from larger datasets for the initial period collected by vaginal access. Our new equations allow a chronological follow-up for embryos with CRL <45 mm. The beginning of the curve is much more accurate, for example when considering the predicted size 28 days after conception. Additionally, we used a large IVF dataset where the date of fecundation was precisely known as a reference. Again, these data were obtained from a single operator using Kretz/General Electric equipment.
Certain limitations need to be taken into account regarding the present study. The first limitation concerns the potential variability arising from the use of different ultrasound devices over a 21-year period for the IVF data. Although we performed stringent statistical controls on the results obtained from different devices, we cannot completely exclude the effect of this variability. The second limitation is that, although none of the patients referred to M.C. reported a miscarriage or other problems in early pregnancy, we cannot exclude that some pregnancies that were abnormal but not recognized as such at the time of ultrasound examination were kept in the dataset. Thus, in order to have ‘normal values’, we excluded extreme values that may be regarded as abnormal values.
We obtained growth equations (equations 1 and 2) and dating equations (equations 3 and 4) for the IVF data and spontaneous pregnancies. Moreover, we have provided models for the evolution of SD, which allows confidence intervals and Z-scores to be built, paving the way for work now in progress towards the possible ultrasound detection of abnormal fetal growth. In addition, we showed that the best time to predict FA is at an FA of ∼50 days, when the embryo measures around 24 mm.
For CRL <20 mm there was no difference between the equations built on IVF data and those built on spontaneous pregnancy data. This objectively suggests that the age at which an embryo of 1–2 mm becomes visible in the gestational sac, as well as the embryonic growth for fetuses between 1 and 20 mm, are on average the same for the two types of pregnancies. For CRL ≥20 mm, the small difference between both models increased suggesting that IVF embryos could have a slightly higher growth rate than spontaneous pregnancy embryos, as was already reported [25,26].
Finally, models where CRL is expressed as a polynomial of degree 2 are convenient for practical purposes. However, a model with a breakpoint showed that there is a change in the growth regime around day 46, i.e. when CRL is around 20 mm. This could correspond to the beginning of the maternal connection.
Acknowledgments
We are grateful to Christian Duroux for his contribution to the optimization of the obstetric part of the MEDISPHINX software. Prior to this work, M.C. and V.C.T. supervised the internship trainings of Anthony Cousien and Anthony Létendart. We thank Dr. Laurent Salomon and Florian Odor for interesting discussions on this project. The English was edited by Dr. Omolade Alao and Ms. Diala Abu Awad.
Disclosure Statement
None of the authors has a conflict of interest.
