Date Published: March 7, 2019
Publisher: Public Library of Science
Author(s): Loïc Duron, Daniel Balvay, Saskia Vande Perre, Afef Bouchouicha, Julien Savatovsky, Jean-Claude Sadik, Isabelle Thomassin-Naggara, Laure Fournier, Augustin Lecler, Yong Fan.
To assess the influence of gray-level discretization on inter- and intra-observer reproducibility of texture radiomics features on clinical MR images.
We studied two independent MRI datasets of 74 lacrimal gland tumors and 30 breast lesions from two different centers. Two pairs of readers performed three two-dimensional delineations for each dataset. Texture features were extracted using two radiomics software packages (Pyradiomics and an in-house software). Reproducible features were selected using a combination of intra-class correlation coefficient (ICC) and concordance and coherence coefficient (CCC) with 0.8 and 0.9 as thresholds, respectively. We tested six absolute and eight relative gray-level discretization methods and analyzed the distribution and highest number of reproducible features obtained for each discretization. We also analyzed the number of reproducible features extracted from computer-simulated delineations representative of inter-observer variability.
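The selection step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that CCC denotes Lin's concordance correlation coefficient, it applies only the 0.9 CCC threshold (the ICC criterion is omitted), and the names `concordance_correlation` and `select_reproducible` are hypothetical.

```python
import numpy as np


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two sets of
    measurements of the same feature (assumed definition of CCC)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    # Population (ddof=0) variances and covariance, as in Lin's definition.
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mean_x) * (y - mean_y))
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def select_reproducible(features_a, features_b, threshold=0.9):
    """Keep the feature names whose CCC between two delineations reaches
    the threshold. Each argument maps a feature name to one value per
    lesion (hypothetical data layout)."""
    return [
        name
        for name in features_a
        if concordance_correlation(features_a[name], features_b[name]) >= threshold
    ]
```

In this sketch a feature that is perfectly correlated between two readers but systematically shifted is still penalized, because the squared difference of the means appears in the denominator.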
The gray-level discretization method had a direct impact on texture feature reproducibility, independent of observers, software or method of delineation (simulated vs. human). The absolute discretization consistently provided statistically significantly more reproducible features than the relative discretization. Varying the bin number of relative discretization led to statistically significantly more variable results than varying the bin size of absolute discretization.
When inter-observer reproducibility of MRI texture radiomics features is considered, an absolute discretization should be favored to allow the extraction of the highest number of potential candidates for new imaging biomarkers. Whichever method is chosen, it should be systematically documented to allow replicability of results.
Medical imaging is progressively shifting from conventional visual image analysis to quantitative personalized medicine thanks to the recent development of data-driven analysis methods such as radiomics. Radiomics is the high-throughput mining of quantitative features from medical images, which enables data to be extracted and applied within clinical-decision support systems to improve diagnostic, prognostic, and predictive accuracy. The potential of radiomics-based phenotyping in precision medicine is encouraging [2–4], but the diversity of implementations of the radiomics pipeline and the absence of widespread standards lead to highly variable approaches and potentially non-replicable results. More specifically, one step of the radiomics process is feature reduction, which decreases data dimensionality and allows imaging features to be correlated with outcome. Feature reduction can be performed in a number of ways, but a frequently used method is to select the most reproducible features, on the hypothesis that reproducibility is a mandatory quality for an imaging biomarker derived from the radiomics process. This feature-reduction step therefore affects the results of a radiomics study, and may lead to discarding a highly informative feature because it is not reproducible enough to be used in clinical routine. However, there are currently no recommendations or standards for this step.
The results of experiments 1, 2, 3 and 4 are detailed in Tables 1, 2, 3 and 4, respectively.
We assessed the influence of gray-level discretization methods on the inter- and intra-observer reproducibility of radiomics texture features in two independent clinical MR datasets. Texture feature reproducibility was shown to depend substantially on the gray-level discretization method. The absolute discretization (FBS) method was found to provide a higher number of reproducible features than the relative discretization (FBN) method. The relative discretization was also found to give more variable results when the bin number was varied than the absolute discretization did when the bin size was varied.