Radiogenomic Associations in Clear Cell Renal Cell Carcinoma: An Exploratory Study

Introduction: This study investigates how quantitative texture analysis can be used to non-invasively identify novel radiogenomic correlations with clear cell renal cell carcinoma (ccRCC) biomarkers. Methods: The Cancer Genome Atlas-Kidney Renal Clear Cell Carcinoma open-source database was used to identify 190 sets of patient genomic data that had corresponding multiphase contrast-enhanced CT images in The Cancer Imaging Archive. A total of 2,824 radiomic features spanning fifteen texture families were extracted from CT images using a custom-built MATLAB software package. Robust radiomic features with strong inter-scanner reproducibility were selected. Random forest, AdaBoost, and elastic net machine learning (ML) algorithms evaluated the ability of the selected radiomic features to predict the presence of 12 clinically relevant molecular biomarkers identified from the literature. ML analysis was repeated with cases stratified by stage (I/II vs. III/IV) and grade (1/2 vs. 3/4). Ten-fold cross validation was used to evaluate model performance. Results: Before stratification by tumor grade and stage, radiomics predicted the presence of several biomarkers with weak discrimination (AUC 0.60–0.68). Once stratified, radiomics predicted KDM5C, SETD2, PBRM1, and mTOR mutation status with acceptable to excellent predictive discrimination (AUC 0.70–0.86). Conclusions: Radiomic texture analysis can potentially identify a variety of clinically relevant biomarkers in patients with ccRCC and may have prognostic implications.


Introduction
In 2023, there will be an estimated 81,800 new cases and 14,890 deaths due to renal cancer in the US [1]. Approximately 90-95% of these neoplasms are renal cell carcinoma (RCC), with 16% presenting with regional spread and another 16% presenting with distant metastasis [2,3]. Clear cell RCC (ccRCC) is the most common subtype, accounting for 70-80% of RCC [4].
Recent research into the biochemical and genetic basis of ccRCC has led to the discovery of new targeted therapeutic agents and the development of prognostic risk stratifications [5][6][7][8][9][10]. Key mutations implicated in ccRCC oncogenesis include von Hippel-Lindau tumor suppressor (VHL), BRCA1-associated protein 1 (BAP1), SET domain-containing 2 (SETD2), polybromo 1 (PBRM1), and lysine (K)-specific demethylase 5C (KDM5C) genes [11]. With the development of The Cancer Genome Atlas-Kidney Renal Clear Cell Carcinoma (TCGA-KIRC), a publicly available database of ccRCC multiomics data, many novel biomarkers associated with angiogenic and tumorigenic phenotypes have been elucidated [12]. The recent wide-scale availability of RNA expression data has enabled the identification of subsets of cancer cells expressing surface proteins targetable by precision therapeutics, such as tyrosine kinase inhibitors and anti-PD-L1 immunotherapy. Studies of these genomic datasets have used clustering algorithms correlated with outcomes data to develop validated genetic "signatures" that classify tumors into high- and low-risk prognostic groups (e.g., ClearCode34) [5,13].
Despite these advances, the clinical application of molecular profiling in ccRCC has been limited [14]. Barriers include poor reproducibility, nonavailability of tissue specimens, and intratumor heterogeneity causing sampling variability [15]. Furthermore, tumor cells with similar genotypes may produce different phenotypes [16]. Quantitative analysis of medical imaging (e.g., contrast-enhanced computed tomography, CECT), commonly referred to as radiomics, provides a noninvasive method of tumor assessment with the potential for determination of molecular biomarker status [17,18]. Radiomics is a multi-step process that extracts the spatial distribution of signal intensities and pixel interrelationships from clinical imaging studies. By analyzing standard-of-care images, radiomics may improve clinical decision-making by providing a comprehensive, macroscopic characterization that complements molecular diagnostics.
Using either qualitative image assessments or quantitative radiomics analyses to investigate associations with molecular data, radiogenomics has been used as the basis for new models of cancer diagnosis and prognosis. Prior studies identified breast cancer imaging phenotypes associated with likelihood of recurrence, identified non-small cell lung cancer phenotypes predictive of treatment response, and predicted overall survival in glioblastoma [19][20][21]. In ccRCC, prior CT-based radiogenomic studies have predicted mutations of genes such as BAP1 and PBRM1, calculated a prognostic score, or predicted treatment response to bevacizumab in patients with metastatic disease [22][23][24][25][26].
In this study, machine learning algorithms evaluated the performance of radiomic signatures as a surrogate marker of various significant genetic biomarkers including DNA mutations, RNA expression profiles, prognostic markers, and frameshift mutation burden status. Because unstratified data only weakly discriminated biomarkers, stratified sub-analyses were performed to evaluate performance in either low (Fuhrman grade 1/2) or high pathological grade (Fuhrman grade 3/4) and either low (stage I/II) or high (stage III/IV) TNM stage, as these are prognostic factors.

Biomarker Selection and Analysis
A literature search was conducted on PubMed using MeSH terms "clear cell renal cell carcinoma AND genomics OR frameshift mutations OR epigenomics OR proteomics" to identify significant molecular biomarkers in ccRCC (shown in Fig. 1). After a comprehensive literature review, twelve clinically relevant biomarkers were selected for inclusion, based on a combination of pathophysiological significance, clinical relevance, and availability for analysis from TCGA-KIRC [12]. Descriptions of these biomarkers can be found in Table 1.
Patients' genomic data were obtained from the TCGA-KIRC database. Corresponding single and multiphase CT images were downloaded from The Cancer Imaging Archive [41]. Mutation status information was available on the TCGA web portal. Somatic mutations that were noted to have oncogenic potential and therapeutic implications were included and further classified by type of mutation. Normalized RNA expression data were used to classify patients as high or low expressors of three sets of genes associated with angiogenesis (Angio), tumor immunity (Teff), and myeloid inflammation [39]. High expression of each gene set was defined as greater than the median expression of the study sample. Whole-exome sequencing data were used to quantify indel burden. High indel burden was defined as 10 or more frameshift indels [40]. All data were publicly accessible and de-identified and thus were exempt from Institutional Review Board (IRB) review.
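The two dichotomization rules described above (a median split for gene-set expression and a fixed threshold for indel burden) can be illustrated with a minimal Python sketch. The function names and the input values are hypothetical and chosen only for illustration; they are not study data, and the study's own processing may have differed in detail.

```python
from statistics import median

# "10 or more frameshift indels" counts as high indel burden.
INDEL_THRESHOLD = 10


def label_high_expression(scores):
    """Return True for samples whose score is strictly above the cohort median."""
    cutoff = median(scores)
    return [s > cutoff for s in scores]


def label_high_indel_burden(indel_counts):
    """Return True for samples with at least INDEL_THRESHOLD frameshift indels."""
    return [n >= INDEL_THRESHOLD for n in indel_counts]


# Hypothetical values for illustration only (not study data).
angio_scores = [0.2, 1.5, 0.9, 2.1, 0.4]
indel_counts = [3, 10, 27, 9, 0]

angio_high = label_high_expression(angio_scores)
indel_high = label_high_indel_burden(indel_counts)
```

Because the expression cutoff is the median of the study sample itself, the high/low labels are cohort-relative: the same tumor could be labeled differently in a cohort with a different expression distribution.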

Image Segmentation
Under the supervision of an experienced, fellowship-trained abdominal imaging radiologist (redacted), two of the authors (redacted) independently and manually segmented renal tumor voxels from surrounding voxels as 3D regions of interest on a Synapse 3D workstation (Fujifilm Medical Systems U.S.A., Stamford, CT). Cases with poor consensus were adjudicated by the experienced radiologist (redacted). The number of available CT phases per patient varied from two to four (precontrast, corticomedullary, nephrographic, and excretory). When multiphase data were available, the nephrographic phase was used as the target for co-registering the other phases. All cases were reviewed by a fellowship-trained radiologist, and each series was categorized by contrast phase. To avoid bias, no additional image preparation steps such as normalization, resizing, or denoising were performed. Custom MATLAB® code was used to extract voxel data corresponding to the region of interest. 2D analysis was conducted on the image that provided the largest tumor area in each phase and imaging plane. 3D analysis was conducted on the whole tumor volume.

Radiomics Analysis
Our comprehensive radiomics panel comprises 2,824 radiomic metrics derived from fifteen different texture methods (Table 2; additional details are provided in online suppl. Table 1; for all online suppl. material, see https://doi.org/10.1159/000530719). MATLAB® (Mathworks, Natick, MA, USA) software was used to implement this previously reported radiomics framework, which has been benchmarked to the Image Biomarker Standardization Initiative standards [42][43][44][45]. A subset of radiomic metrics with high intra- and inter-scanner reproducibility and repeatability was identified; it included 352 features from nine texture methods and formed the radiomic inputs to our robust model.

Machine Learning Classification
A schematic of our radiogenomic analysis is shown in Figure 2. As this is an exploratory study, three machine learning (ML) algorithms were used to evaluate the ability of radiomic features to predict biomarkers: random forest (RF), Real AdaBoost, and elastic net [46][47][48].
For all three classifiers, 10-fold cross validation was used to evaluate model performance. ROC curves were constructed using the combined predicted probabilities from the 10 testing datasets. Area under the curve (AUC) with a 95% confidence interval was used to assess prediction accuracy. Within each iteration, we applied 5-fold cross validation to determine the final prediction model before scoring the 10% independent testing sample. This 10% was excluded from the learning phase to avoid data leakage. For RF and AdaBoost, the Gini impurity index was used as the loss function. For elastic net, the predicted residual sum of squares (CVPRESS) was used to select candidate predictors as well as the final model. Initial analysis was performed with the entire cohort (n = 190). Sensitivity analyses were also conducted by grade 1/2 (n = 70) versus grade 3/4 (n = 120) and by stage I/II (n = 114) versus stage III/IV (n = 76). Heatmaps were used to visualize AUC patterns, where AUC ≥ 0.90 was considered outstanding discrimination, 0.80 ≤ AUC < 0.90 excellent, 0.70 ≤ AUC < 0.80 acceptable, and 0.60 ≤ AUC < 0.70 weak [49]. Confusion matrices were generated, and heatmaps for sensitivity and specificity were also constructed.

Table 1 (partial row): Increased RFS compared to BAP1 [29]; patients with gain-of-function benefit from anti-VEGF and anti-PD1 therapy [30]; increased benefit from anti-PD1 monotherapy in loss-of-function mutations [31]

Table 1 (partial row): Indels: Frameshift insertion-and-deletion mutations; increased indel burden in tumor specimens associated with increased immune gene expression, and significantly associated with immunotherapy response [40]
RFS, recurrence-free survival; PFS, progression-free survival; OS, overall survival.

Table 2 (partial): Texture methods
Neighborhood gray tone difference matrix - 2D
GLSZM3D*: Gray level size zone matrix - 3D
GLSZM2D*: Gray level size zone matrix - 2D
GLRLM3D*: Gray level run length matrix - 3D
GLRLM2D*: Gray level run length matrix - 2D
LTE3D: Laws' texture energy measures - 3D
LTE2D: Laws' texture energy measures - 2D
Discrete cosine transform - 2D
*Indicates features identified in the robust model.
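As a minimal illustration of the evaluation described under Machine Learning Classification, the sketch below pools held-out predictions across folds, computes a single AUC from the pooled scores, and maps it to the discrimination bands stated there. The fold contents are hypothetical, the function names are ours, and the label returned for AUC below 0.60 ("none") is an assumption, since the text does not name that band. Confidence intervals and model fitting are omitted.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def discrimination_band(auc):
    """Map an AUC to the bands used for the heatmaps."""
    if auc >= 0.90:
        return "outstanding"
    if auc >= 0.80:
        return "excellent"
    if auc >= 0.70:
        return "acceptable"
    if auc >= 0.60:
        return "weak"
    return "none"  # assumption: the text does not label AUC < 0.60


# Hypothetical held-out (label, predicted probability) pairs from two folds.
folds = [
    ([1, 0, 0], [0.8, 0.3, 0.6]),
    ([1, 1, 0], [0.7, 0.4, 0.2]),
]

# Pool the test-fold predictions before computing one ROC/AUC.
pooled_labels = [y for fold_labels, _ in folds for y in fold_labels]
pooled_scores = [s for _, fold_scores in folds for s in fold_scores]

pooled_auc = auc_from_scores(pooled_labels, pooled_scores)
pooled_band = discrimination_band(pooled_auc)
```

Pooling the held-out predictions yields one ROC curve per biomarker rather than ten per-fold curves, which matters when a stratum contains only a handful of mutation-positive cases and individual folds may contain none.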

Results
Details of model performance, including AUC, sensitivity, and specificity with the corresponding 95% confidence intervals, are presented in online supplementary Table 2. In general, the robust model showed better predictive performance than the full model.

Patient Cohort
TCGA-KIRC included 537 cases, of which 237 had corresponding CT images in TCIA. Of the 237 cases, 47 were excluded because tumor size was smaller than 1 cm or because image quality was suboptimal on qualitative evaluation by an experienced radiologist. The remaining 190 patients were included in the study. Summary patient characteristics are shown in Table 3. Most of our patients were male (72.8%), Caucasian (92.0%), and had localized disease (85.8%).
Certain mutation frequencies were low in stratified groups, such as KDM5C in low grade (n = 2) and high stage (n = 1) (Table 4). Mutation status information and RNA expression data were available on the TCGA web portal for all 190 patients. ClearCode34 classification was available for 141 patients (ccA: 73 and ccB: 68), and indel data were available for 100 patients.

Stratified Analysis by Fuhrman Grade
In stratified analysis of grade 1/2 samples (shown in Fig. 3-5d), RF and AdaBoost predicted KDM5C mutations with acceptable discrimination (AUC 0.76, sensitivity 0.5, specificity 0.75 and AUC 0.76, sensitivity 0.5, specificity 0.68, respectively). Elastic net and AdaBoost also predicted PBRM1 mutations with acceptable discrimination (AUC 0.75, sensitivity 0.71, specificity 0.72 and AUC 0.72, sensitivity 0.59, specificity 0.62, respectively). In grade 3/4 samples (shown in Fig. 3-5e), RF predicted KDM5C mutations with acceptable discrimination (AUC 0.74, sensitivity 0.57, specificity 0.68) as well as several biomarkers with weak or acceptable discrimination, including VHL, BAP1, and mTOR mutations, high Angio and Teff expression, and high indel counts. AdaBoost predicted KDM5C mutations, mTOR mutations, and Angio expression with acceptable discrimination (AUC 0.70, sensitivity 0.57, specificity 0.58; AUC 0.74, sensitivity 0.71, specificity 0.71; and AUC 0.71, sensitivity 0.63, specificity 0.65, respectively) as well as several biomarkers with weak or acceptable discrimination, including VHL and BAP1 mutations, high Teff expression, and high indel counts.
In the heatmaps, gray indicates exclusion from the analysis due to insufficient frequency (i.e., KDM5C in high-stage analysis). AUC <0.5 indicates an unstable finding that occurs when accuracy is low and the predicted outcome variable is skewed, e.g., extremely low mutation frequencies.

Radiomic Features of Importance
Preliminary analyses of radiomic texture methods are shown in Figure 6. Each molecular marker's ranked radiomic features were examined within the ML model that achieved the highest predictive accuracy. These features represent the variables most critical to the model for a given classification. Methods were evaluated as the percentage of features within each method that were classified as a variable of importance (VOI).
Several biomarkers that were predicted with acceptable discrimination also had greater than 10% of features in most methods classified as a VOI, including KDM5C in stage I/II, grade 1/2, and grade 3/4 patients. SETD2 had greater than 10% of features in most methods classified as a VOI for unstratified and stratified data. Conversely, some biomarkers, such as PBRM1 and mTOR, had only a few methods with greater than 10% of features identified as a VOI.

Discussion
Although several predictive biomarkers have been discovered in ccRCC, their clinical utility is currently limited due to the need for invasive tissue biopsy, intratumoral genetic heterogeneity, and lack of available assays for molecular characterization. Radiomics can potentially fill this gap by complementing these tissue-based biomarkers with a noninvasive approach that uses imaging the patient already undergoes, without additional tests, while also potentially overcoming the limitations of tissue sampling and tumor heterogeneity. Our work furthers these goals by showing significant associations between radiomics and several key biomarkers with potential prognostic utility in ccRCC.
While the unstratified data showed only weak predictive performance across the ML models, both the RF and AdaBoost models predicted KDM5C mutation status with acceptable discrimination once the data were stratified into stage I/II, grade 1/2, and grade 3/4. This finding is clinically relevant because KDM5C mutations carry a poor prognosis: tumors from patients with KDM5C-mutated ccRCC demonstrated features of chromatin disruption and genomic rearrangement [50]. Moreover, these findings support the use of radiomics as a noninvasive method of identifying clinically relevant molecular markers.
Both RF and AdaBoost predicted SETD2 mutation status with excellent predictive discrimination in the stage I/II stratum; elastic net also achieved acceptable predictive discrimination in this stratum. SETD2-mutated ccRCCs are associated with poor overall survival, so SETD2 status could be used clinically to predict survival [51]. RF and AdaBoost predicted PBRM1 mutation status with acceptable predictive discrimination in the stage III/IV and grade 1/2 strata, and elastic net also achieved acceptable predictive discrimination in the grade 1/2 stratum. PBRM1 is the second most commonly mutated gene in ccRCC and may serve as both a prognostic and predictive tumor biomarker in ccRCC [30].
AdaBoost predicted mTOR mutation status with acceptable predictive discrimination when stratified in stage III/IV and grade 3/4, as well as Angio RNA expression with acceptable predictive discrimination when stratified in grade 3/4. mTOR has the potential to become a target for immune checkpoint inhibitor therapy in ccRCC [8]. Angio RNA expression is associated with improved PFS in patients receiving anti-VEGF therapy alone [39].
In addition to supporting the ability of radiomics to identify clinical markers, we have also shown that the radiomic features used in the ML models have clinical relevance. Many radiomic features highlighted as VOIs act as direct measures of pixel heterogeneity and likely represent the phenotypic variance found in ccRCC tumors. For instance, VHL mutations are known to promote angiogenesis, resulting in the heterogeneous appearance and enhancement of ccRCC on CECT. This phenotypic characteristic is measured quantitatively in radiomic analysis as variance in individual pixel intensities, resulting in differences in radiomic parameters like kurtosis and skewness. Although several radiomics methods are evaluated here, first-order characterizations (e.g., histogram analysis) account only for data distribution characteristics and ignore the spatial information of the data. Higher-order characterizations account for both spatial and data distribution characteristics (e.g., neighbor methods such as the gray-level co-occurrence method) and in some cases also provide multi-level hierarchical characteristics (e.g., wavelet analysis). Further evaluation is needed to understand the mechanisms that link molecular genotype to a specific radiomic method or imaging phenotype. Our study is a preliminary exploration, which makes it challenging to characterize the complex mechanistic pathways between genomics and radiomics at such an early stage. Despite this limitation, the approach we present offers advantages over tissue-based biomarker research, the current standard for cancer diagnosis, in that it is noninvasive, comparatively less expensive, and easy to add to longitudinal studies. Likewise, because radiomics assesses the whole tumor, it is not subject to the sampling bias that arises because a tumor is spatially heterogeneous, with diverse subpopulations, and because tumor cells metastasize.
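The distinction drawn above between first-order and higher-order texture descriptors can be made concrete with a small Python sketch. It is illustrative only: the two 4x4 "images" are hypothetical, the function names are ours, and the second-order measure is a simplified co-occurrence contrast for horizontal neighbors rather than the full feature set used in the study. The two images contain exactly the same gray levels in different spatial arrangements.

```python
from statistics import fmean


def skewness_and_kurtosis(values):
    """First-order descriptors: they depend only on the intensity histogram."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    m4 = fmean([(v - m) ** 4 for v in values])
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def cooccurrence_contrast(image):
    """Second-order descriptor: mean squared gray-level difference between
    horizontally adjacent pixels (co-occurrence contrast at offset (0, 1))."""
    diffs = [
        (row[j] - row[j + 1]) ** 2
        for row in image
        for j in range(len(row) - 1)
    ]
    return fmean(diffs)


def flatten(image):
    return [v for row in image for v in row]


# Hypothetical images with identical histograms (eight 0s and eight 3s each).
blocky = [[0, 0, 3, 3]] * 4                      # two homogeneous halves
checker = [[0, 3, 0, 3], [3, 0, 3, 0]] * 2       # fine-grained alternation

first_order_blocky = skewness_and_kurtosis(flatten(blocky))
first_order_checker = skewness_and_kurtosis(flatten(checker))
contrast_blocky = cooccurrence_contrast(blocky)
contrast_checker = cooccurrence_contrast(checker)
```

Skewness and kurtosis are identical for the two images because the histograms match, whereas the co-occurrence contrast differs (3.0 versus 9.0), which is the sense in which neighbor-based methods capture spatial heterogeneity that histogram analysis cannot.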
Additionally, while radiomic signatures may not be possible for all mutations, an integrated biomarker combining both molecular and radiomic features may be feasible. The temporal change in the tumor's radiomic profile would be worth exploring in future studies. Furthermore, our study has conducted a more thorough evaluation of the utility of radiogenomics than has previously been reported. Historically, research in radiogenomics has identified associations between qualitative features identified by radiologists (e.g., necrosis, growth pattern, calcifications) and mutational status [52,53]. More recently, radiogenomic studies in ccRCC using quantitative radiomics analysis focused on optimizing algorithms for accurate prediction of single mutations [25,26,54]. Along with mutational status, other studies have also correlated CT imaging texture features with microRNA expression (n = 20) and Fuhrman grade (n = 131) [54,55]. In contrast, our study uses a sample of 190 patients, which is larger than that of most other studies. Furthermore, the utility of radiogenomics across a larger variety of biomarkers was explored, and data augmentation techniques that are known to introduce bias and artificially high AUCs were avoided. By evaluating various biomarkers, radiomic features, and machine learning methods, this study provides a more comprehensive evaluation of radiomics as a potential noninvasive surrogate marker in ccRCC. Our study has also detected stronger imaging-genetic associations after stratifying by grade and stage. This finding suggests that the pattern of association between genetic and imaging features differs between high and low tumor grades and stages, particularly given that these associations were diluted in the unstratified model. This finding is supported by the VOI heatmap, which reveals very different patterns of VOI in predicting genetic markers between high/low grades and stages.
Another important finding is that sensitivity analysis showed better predictive discrimination from the robust model than from the full model. A larger proportion of features in the robust model were classified as VOIs compared to the full model. Given the heterogeneity of scanners and imaging settings in the TCGA-KIRC cohort, this result is possibly explained by the behavior of the two feature sets during the 5-fold cross validation used for variable selection: features in the full model could carry inconsistent signals and weaker associations with genetic markers, and may thus have been cancelled out during cross validation.
Our study attempted to eliminate bias whenever possible. The segmentations were performed under the supervision of a fellowship-trained abdominal radiologist. In our prior work, we conducted an interobserver reliability analysis with three radiologists. As previously reported, each radiologist segmented the tumor margins independently for 15 subjects. A two-way mixed intraclass correlation with absolute agreement was used to evaluate the reliability of the radiomics results, as obtained using an RF classifier, despite the differences in segmentation contours. We also conducted a sensitivity analysis with all three machine learning classifiers for predicting both grade and stage using the robust and full models.
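For readers unfamiliar with the agreement statistic mentioned above, the following Python sketch computes a two-way, absolute-agreement intraclass correlation from a subjects-by-readers table using the standard ANOVA mean squares. It assumes the single-measure form, often written ICC(A,1); the text does not state whether the single-measure or average-measure form was used, and the ratings shown are hypothetical.

```python
from statistics import fmean


def icc_absolute_agreement(ratings):
    """Single-measure, two-way, absolute-agreement ICC.

    ratings: list of n subjects, each a list of k reader values.
    """
    n, k = len(ratings), len(ratings[0])
    grand = fmean([v for row in ratings for v in row])
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean([row[j] for row in ratings]) for j in range(k)]

    # Two-way ANOVA sums of squares (subjects, readers, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values: two readers, three subjects.
identical = [[1, 1], [2, 2], [3, 3]]   # readers agree exactly
offset = [[1, 2], [2, 3], [3, 4]]      # second reader is always 1 higher

icc_identical = icc_absolute_agreement(identical)
icc_offset = icc_absolute_agreement(offset)
```

The offset example shows why absolute agreement is the stricter choice: the two readers rank the subjects identically, yet the systematic difference between them lowers the ICC from 1 to about 0.67.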
There are several limitations to this study. First, compilation of biomarkers from samples was limited by availability of data, most notably whole-exome sequencing data. Several samples in our dataset had CECT imaging available but no DNA mutation data. Similarly, ClearCode34 lacked risk classification data for several samples. Second, beyond lack of genomic data, poor image quality led to difficulties with co-registration of CECT tumor and kidney segmentations, limiting potential cases. Third, our sample set primarily consisted of Caucasian and male patients, which may limit the generalizability of our results to other populations. Fourth, there was significant variability in imaging protocols. Standardizing the protocols and CT scanners used for CECT would help control for this variability and may improve the discrimination of the radiogenomic correlation. Lastly, the low frequency of multiple biomarkers in our sample may affect the validity of our results, especially when stratified by pathological subtypes. For example, when stratified by grade 1/2 and 3/4, KDM5C mutation status has only 2 positive cases in the grade 1/2 stratum. Fortunately, the AUCs for strata with low case counts all fell to the null value and therefore did not inflate the type 1 error. A stratum with a relatively high case count (e.g., grade 3/4) could have a higher AUC after stratification, which can be considered noise reduction when a minority pathological subtype is removed. The stratified analyses could inflate both the type 1 and type 2 error rates. However, given that this was an exploratory study of the feasibility of ML-based radiogenomic prediction of biomarker status, the stratified analyses can still contribute to the science if the results are interpreted cautiously. Future studies will validate these results in real-world patients, exploring representative cases in which applying our results will aid in detailed patient-level assessments.
Also, we trained our model on an entire tumor with the result on focal biopsy as output, without data on its location. Therefore, our training data may be polluted by results that do not correspond to the mutation (scanner characteristics present but area missed by the biopsy). In such a scenario, it would be helpful to perform some denoising with a pretraining on all the training data, delete the data with the highest variance, and retrain the model on "unpolluted" data. However, such statistical approaches of removing outliers may also result in removal of tumor heterogeneity. Therefore, more robust approaches that assess outliers on physics- or biology-based constraints are warranted. Future studies will also utilize larger, more diverse sample sets with consistent imaging and genomic data available to ensure an adequate number of rare mutations for optimal analysis. Subsequent studies could focus on correlating radiogenomic findings with clinical outcomes and responses to systemic therapy in advanced stage renal cancer.
In conclusion, our work focuses on the quantitative analysis of medical imaging, commonly referred to as radiomics. This approach allows for noninvasive tumor assessment with the potential for determination of molecular biomarker status. When stratified by stage (I/II vs. III/IV) and grade (1/2 vs. 3/4), radiomics predicted KDM5C, SETD2, PBRM1, and mTOR mutation status with acceptable to excellent predictive discrimination (AUC 0.70 to 0.86) in ccRCC. This is relevant because radiomic texture analysis can potentially identify a variety of clinically relevant biomarkers in patients with ccRCC and may be used in addition to biopsy results to predict prognosis.

Statement of Ethics
All data were publicly accessible and de-identified, and thus the study was exempt from Institutional Review Board (IRB) review and informed consent was not required.

Conflict of Interest Statement
Haris Zahoor reports honoraria and advisory board activity for Exelixis and Bayer. Ulka Vaishampayan reports research support from BMS and Merck and consulting activity and honoraria from Merck, Exelixis, Bayer, AAA, Pfizer, BMS, and Aveo. Vinay A. Duddalwar reports consulting fees from Radmetrix, medical advisory board activity for DeepTek Inc., and grant funding from Samsung Healthcare. The other authors have no relevant financial interests or disclosures.

Funding Sources
This work was supported in part by funding from the Radiological Society of North America (RSNA) and in part by funding from the Southern California Clinical and Translational Science Institute (SC-CTSI).

Author Contributions
Derek H. Liu, Komal A. Dani, Bino A. Varghese, Haris Zahoor, Steven Y. Cen, and Vinay A. Duddalwar participated in conception or design of the work, data collection, data analysis and interpretation, drafting the article, critical revision of the article, and final approval of the version to be published. Sharath S Reddy, Xiaomeng Lei, and Natalie L. Demirjian participated in conception or design of the work, data collection, data analysis and interpretation, and final approval of the version to be published. Darryl H. Hwang participated in data collection, data analysis and interpretation, critical revision of the article, and final approval of the version to be published. Suhn K. Rhie participated in data analysis and interpretation, critical revision of the article, and final approval of the version to be published. Felix Yap participated in conception or design of the work, drafting the article, critical revision of the article, and final approval of the version to be published. David Quinn, Imran Siddiqi, Manju Aron, Ulka Vaishampayan, and Inderbir S. Gill participated in conception or design of the work, critical revision of the article, and final approval of the version to be published.

Data Availability Statement
Publicly available datasets were used in this study. These can be found in The Cancer Genome Atlas-Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) at http://doi.org/10.1016/j.celrep.2018.03.075, reference number 12. Further inquiries can be directed to the corresponding author.