Research Article: Machine Learning Analysis of Individual Tumor Lesions in Four Metastatic Colorectal Cancer Clinical Studies: Linking Tumor Heterogeneity to Overall Survival

Date Published: March 16, 2020

Publisher: Springer International Publishing

Author(s): Diego Vera-Yunca, Pascal Girard, Zinnia P. Parra-Guillen, Alain Munafo, Iñaki F. Trocóniz, Nadia Terranova.


Total tumor size (TS) metrics used in TS models in oncology do not consider tumor heterogeneity, which could help to better predict drug efficacy. We analyzed individual target lesions (iTLs) of patients with metastatic colorectal carcinoma (mCRC) to determine differences in TS dynamics by using the ClassIfication Clustering of Individual Lesions (CICIL) methodology. Results from subgroup analyses comparing genetic mutations and TS metrics were assessed and applied to survival analyses. Data from four mCRC clinical studies were analyzed (1781 patients, 6369 iTLs). CICIL was used to assess differences in lesion TS dynamics within a tissue (intra-class) or across different tissues (inter-class). First, lesions were automatically classified based on their location. Cross-correlation coefficients (CCs) determined if each pair of lesions followed similar or opposite dynamics. Finally, CCs were grouped by using the K-means clustering method. Heterogeneity in tumor dynamics was lower in the intra-class analysis than in the inter-class analysis for patients receiving cetuximab. More tumor heterogeneity was found in KRAS mutated patients compared to KRAS wild-type (KRASwt) patients and when using sum of longest diameters versus sum of products of diameters. Tumor heterogeneity quantified as the median patient’s CC was found to be a predictor of overall survival (OS) (HR = 1.44, 95% CI 1.08–1.92), especially in KRASwt patients. Intra- and inter-tumor tissue heterogeneities were assessed with CICIL. Derived metrics of heterogeneity were found to be a predictor of OS time. Considering differences between lesions’ TS dynamics could improve oncology models in favor of a better prediction of OS.
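The last two CICIL steps summarized above (pairwise CCs, then K-means grouping of the CCs) can be sketched in a few lines of Python. This is a minimal illustration, not the published implementation. It assumes that the CC is the zero-lag normalized cross-correlation (Pearson coefficient) of two lesion series measured at the same visits, and it uses a plain one-dimensional K-means with k = 2. All function names and lesion values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def zero_lag_cc(x, y):
    """Zero-lag normalized cross-correlation (Pearson r) of two equal-length series.

    Assumption: this stands in for the CC used by CICIL, whose exact
    definition is not given in the text above.
    """
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # flat series: treated as uncorrelated (assumption)
    return sum(a * b for a, b in zip(dx, dy)) / denom


def pairwise_ccs(lesions):
    """Map each lesion pair to its CC; lesions is {lesion_id: [sizes per visit]}."""
    return {
        (i, j): zero_lag_cc(lesions[i], lesions[j])
        for i, j in combinations(sorted(lesions), 2)
    }


def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm on scalars (k >= 2); centers start evenly spaced from min to max."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * c / (k - 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each CC to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Recompute each center as the mean of its members (keep it if empty).
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical patient: two liver lesions shrink, one lung lesion grows (sizes in mm).
lesions = {
    "liver_1": [30, 24, 20, 18],
    "liver_2": [22, 18, 15, 14],
    "lung_1": [12, 13, 15, 18],
}
ccs = pairwise_ccs(lesions)
labels, centers = kmeans_1d(list(ccs.values()), k=2)
patient_median_cc = median(ccs.values())  # patient-level summary of the CCs
```

In this toy patient the two liver lesions shrink together, so their CC is close to 1, while both pairs involving the growing lung lesion have negative CCs. K-means therefore places the liver-liver pair in one cluster and the two liver-lung pairs in the other.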

Partial Text

Model-Informed Drug Discovery and Development (MID3) (1) has proven useful for improving drug development in several areas, including oncology and the modeling of tumor size (TS) (2,3). TS is often expressed as the sum of longest diameters (SLD) of the individual tumor lesions (iTLs) that are measurable and defined as “target lesions” at baseline, as described by the Response Evaluation Criteria in Solid Tumors (RECIST) (4). Each patient presents multiple iTLs, which can be primary or metastatic and located in several organs or tissues. This means that, at each assessment visit, all of a patient's tumor lesions, regardless of their location and status, are reduced to a single SLD value, also called the total TS. The iTLs are assessed throughout the clinical study; the SLD is derived at each time point and then categorized to quantify the tumor response to the treatment (4). Observed or model-derived TS metrics, such as early tumor shrinkage (ETS, the relative reduction of total TS at certain time points) or time to tumor growth, have been shown to be predictors of overall survival (OS) (5).
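The following sketch shows how the SLD collapses several lesions into one value per visit, and how ETS can be derived from it. It is only an illustration under stated assumptions: every lesion is measured at every visit, the first visit is the baseline, and the function names and diameters are hypothetical.

```python
def sld_per_visit(lesion_diameters):
    """Sum of longest diameters at each visit.

    lesion_diameters: {lesion_id: [longest diameter in mm, one per visit]}.
    Assumes all lesions share the same visit schedule.
    """
    return [sum(visit) for visit in zip(*lesion_diameters.values())]


def early_tumor_shrinkage(sld, visit_index):
    """Relative reduction of total TS at a given visit versus baseline (visit 0)."""
    baseline = sld[0]
    return (baseline - sld[visit_index]) / baseline


# Hypothetical patient: the liver lesion shrinks while the lung lesion grows.
diameters = {
    "liver_1": [30, 24, 20],
    "lung_1": [12, 13, 15],
}
sld = sld_per_visit(diameters)        # [42, 37, 35]
ets = early_tumor_shrinkage(sld, 1)   # (42 - 37) / 42
```

Here the SLD falls at every visit even though the lung lesion is growing, which is the kind of lesion-level difference that a total TS metric cannot show.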

For the CICIL analysis, 1781 patients from the four previously described clinical studies were included: 1127 patients from CRYSTAL, 271 patients from APEC, 61 patients from Study 045, and 322 patients from OPUS. Separate CICIL analyses were performed for each study by considering different subsets of data: (i) all patients, (ii) patients receiving cetuximab, and (iii) KRASwt patients receiving cetuximab. All patients from APEC and Study 045 received cetuximab. Figure 2 shows a tree diagram with the number of patients considered at each stage of this work.

Fig. 2. Tree diagram showing the distribution of patients across different subsets of CICIL analyses. Analyses on different subsets of patients were performed: (i) all patients, (ii) all patients receiving cetuximab either as monotherapy or in combination (cetuximab arm), and (iii) KRASwt patients receiving cetuximab. All APEC patients were KRASwt. *Not all patients were able to enter the CICIL analysis: those with fewer than two tumor size (TS) assessments or with missing organ information were excluded.

Although it enriches data analysis in oncology, lesion heterogeneity has often been ignored in TS modeling, as reflected by the limited number of published works in this field (9,10). This is because accounting for differences between organs or tissues, as well as for intra-tumor heterogeneity within an anatomic area, requires models and methods that are complex and computationally expensive. Thus, being able to efficiently assess and quantify the heterogeneity in iTL TS dynamics prior to any modeling analysis is extremely valuable to inform and guide the best modeling strategies. The recently proposed CICIL methodology addresses this objective by exploiting ML techniques to inform subsequent modeling steps (11). This approach provides a user-friendly and automated framework to quantify heterogeneity in tumor dynamics at different levels: (i) between iTLs within an organ and (ii) between lesions located in different tissues. Results can be used to make data-driven modeling decisions, for example, about the use of total TS versus iTL TS, and about the level (within or across tissues) of tumor heterogeneity to be accounted for in the model.
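The two levels of heterogeneity described above correspond to two sets of lesion pairs: pairs within the same tissue and pairs across tissues. A minimal sketch of that split is given below. It assumes each lesion already carries a tissue label, and the function name and lesion identifiers are hypothetical rather than part of CICIL itself.

```python
from itertools import combinations


def split_pairs_by_class(lesion_tissue):
    """Split all lesion pairs into intra-class and inter-class pairs.

    lesion_tissue: {lesion_id: tissue label}, assumed to be assigned beforehand.
    Returns (intra, inter): pairs in the same tissue, pairs in different tissues.
    """
    intra, inter = [], []
    for a, b in combinations(sorted(lesion_tissue), 2):
        if lesion_tissue[a] == lesion_tissue[b]:
            intra.append((a, b))
        else:
            inter.append((a, b))
    return intra, inter


# Hypothetical patient with two liver lesions and one lung lesion.
tissues = {"liver_1": "liver", "liver_2": "liver", "lung_1": "lung"}
intra_pairs, inter_pairs = split_pairs_by_class(tissues)
```

Dynamics can then be compared separately within `intra_pairs` and within `inter_pairs`, which mirrors the choice between modeling heterogeneity within or across tissues.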

The CICIL outcome obtained from a large dataset of tumor measurements was assessed with respect to different factors (genetic mutations, tumor metrics), and its direct link with a clinical endpoint was quantified. Comparisons between KRASwt and KRASmut patients indicated less heterogeneity in tumor lesion dynamics in the KRASwt subgroup, which is known to respond well to cetuximab treatment. The method used to measure lesion TS did not lead to different results, except in the single-study analysis of CRYSTAL, where an apparent heterogeneity was compensated and reduced when the perpendicular diameter was included to obtain the SOPD. All tested models suggested an increased risk of death with increased heterogeneity, especially in KRASwt patients. For KRASmut patients, heterogeneity in TS dynamics appeared to have a smaller impact on risk.



