2019-06-18
Poster presented at the European Human Genetics Conference, Gothenburg, Sweden in 2019.
Open the poster in a new window and check out the abstract.
Introduction: RNA-seq has become a standard method for transcriptome profiling of cancer tissues. Many studies have investigated gene expression to identify marker genes for diagnostic and/or prognostic purposes. However, generalising an expression pattern from a limited number of genes may be inaccurate. Instead of using a specific subset of genes, we propose using the whole RNA-seq profile to improve the diagnosis and classification of Non-Small-Cell Lung Cancer (NSCLC) through a user-friendly platform.
Materials and Methods: A machine-learning algorithm (Random Forest) that learns from RNA-seq profiles of NSCLC tissues (tumour and non-tumour samples) collected from 123 patients was applied to discriminate between histological subtypes of NSCLC. The method was implemented in the R environment, and the web platform was built with the Shiny package.
Results: The web platform currently lets users provide their own RNA-seq library; it then automatically fits a Random Forest model based on the expressed genes and returns a diagnosis along with its false discovery rate. The prototype was assessed and cross-validated. The best results were obtained when full RNA-seq profiles were provided.
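The workflow described above (take a user library, keep its expressed genes, fit a forest on those genes, return a call with an error estimate) can be illustrated with a toy sketch. This is not the poster's R/Shiny implementation: it is plain Python, it uses bagged one-split decision stumps as a simplified stand-in for full Random Forest trees, the cohort is synthetic, the gene names `G1` to `G4` and the labels `ADC`/`SCC` are placeholders, and the out-of-bag error shown here is only a rough proxy, not the false discovery rate or the cross-validation reported on the poster.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(X, y, genes):
    """Best single-gene threshold split (lowest weighted Gini) among `genes`."""
    best = None
    for g in genes:
        vals = sorted({row[g] for row in X})
        for a, b in zip(vals, vals[1:]):
            t = (a + b) / 2  # midpoint between neighbouring observed values
            left = [lab for row, lab in zip(X, y) if row[g] <= t]
            right = [lab for row, lab in zip(X, y) if row[g] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, g, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no split possible: always predict the majority class
        maj = Counter(y).most_common(1)[0][0]
        return (genes[0], float("inf"), maj, maj)
    return best[1:]

def fit_forest(X, y, genes, n_trees=200, seed=0):
    """Bagging plus a random gene subset per stump (sqrt of the gene count)."""
    rng = random.Random(seed)
    n, k = len(X), max(1, int(len(genes) ** 0.5))
    forest = []
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        stump = fit_stump([X[i] for i in bag], [y[i] for i in bag],
                          rng.sample(genes, k))
        forest.append((stump, set(bag)))
    return forest

def predict(forest, profile):
    """Majority vote; also returns the fraction of stumps that agree."""
    votes = Counter(l if profile[g] <= t else r for (g, t, l, r), _ in forest)
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)

def oob_error(forest, X, y):
    """Error rate using, for each sample, only stumps that did not train on it."""
    wrong = total = 0
    for i, (row, lab) in enumerate(zip(X, y)):
        votes = Counter(l if row[g] <= t else r
                        for (g, t, l, r), bag in forest if i not in bag)
        if votes:
            total += 1
            wrong += votes.most_common(1)[0][0] != lab
    return wrong / total if total else float("nan")

def toy_cohort(n_per_class=20, seed=1):
    """Synthetic training data: G1/G2 separate the classes, G3/G4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (m1, m2) in (("ADC", (10, 2)), ("SCC", (2, 10))):
        for _ in range(n_per_class):
            X.append({"G1": rng.gauss(m1, 1), "G2": rng.gauss(m2, 1),
                      "G3": rng.gauss(5, 1), "G4": rng.gauss(5, 1)})
            y.append(label)
    return X, y

X, y = toy_cohort()
library = {"G1": 9.4, "G2": 2.3, "G3": 0.0, "G4": 5.1}  # hypothetical user library
expressed = [g for g, v in library.items() if v > 0]    # G3 is not expressed
forest = fit_forest(X, y, expressed)
label, support = predict(forest, library)
err = oob_error(forest, X, y)
print(label, round(support, 2), round(err, 2))
```

Refitting on the genes expressed in the submitted library, rather than on a fixed marker panel, is the design choice this sketch is meant to highlight; a production version would use full trees and a properly validated error estimate.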
Conclusions: We propose our method as a universal approach to using whole RNA-seq data for diagnostic purposes in a simple and straightforward manner.
The study was funded by the Polish National Centre for Research and Development, the MOBIT project (STRATEGMED2/266484/2/NCBR/2015).