Fig. 1 | Genome Medicine

Fig. 1

From: Novel personalized pathway-based metabolomics models reveal key metabolic pathways for breast cancer diagnosis

The workflow of the pathway-based metabolomics data analysis. Step 1: conversion from metabolite-level to pathway-level metabolomics data. The input data comprise the master file containing pathway-metabolite mapping information, the metabolomics profiling data, and the normal/tumor classification vector. The metabolite-level data are transformed to pathway-level data by the Pathifier algorithm. The output of Pathifier is the pathway dysregulation score matrix, in which each score measures the deregulation of a specific pathway in a specific sample. Step 2: model construction. Qualified COH plasma samples are split 80/20 into training and holdout testing data. Correlation-based feature selection (CFS) is used for feature selection, and a logistic regression model is used for classification. Tenfold cross-validation (10-fold CV) is applied together with CFS feature selection on the plasma training data set. Two models are constructed: an all-stage diagnostic model and an early-stage diagnostic model. Step 3: model evaluation. Model performance is assessed using receiver operating characteristic (ROC) curves and several metrics, including AUC, MCC, sensitivity, specificity, and F1-statistic.
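
The 80/20 split in Step 2 and the metrics named in Step 3 can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names (`split_80_20`, `evaluate`, `auc`), the stratification by class, the label coding (1 = tumor, 0 = normal), and the fixed random seed are all assumptions made for illustration. Pathifier scoring, CFS, and logistic regression fitting are not reproduced here.

```python
import math
import random


def split_80_20(labels, seed=0):
    """Hypothetical stratified split: 80% of each class for training, 20% held out.

    Returns two sorted lists of sample indices (train, holdout).
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(0.8 * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)


def evaluate(y_true, y_pred):
    """Sensitivity, specificity, F1, and MCC from binary labels (1 = tumor, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "f1": f1, "mcc": mcc}


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count as one half).

    Assumes at least one positive and one negative sample.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, `evaluate` would be applied to the predicted classes on the holdout samples and `auc` to the predicted tumor probabilities; the pairwise form of AUC is quadratic in the number of samples, which is acceptable for a sketch but not for large cohorts.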
