mingjiewang/hcc_proteome

hcc_proteome

Code for analyzing serum proteome data from an HBV-infected HCC cohort.

The scripts and folders in this repository, and what each one does, are described below:

  1. 1.sampleCluster.R

    Preprocessing of samples and exploratory clustering and visualization.

    In the code, the prefixes group_ and timepoint_ mark the analyses of the group data (group.Rdat) and of the sequential samples across time points (timepoint.Rdat), respectively. This naming is used consistently.

  2. 2.heatmap_filterHighVar.R

    Heatmap visualization and exploratory filtering of features that vary strongly between groups or time points.

  3. 3.filter_marker.R

    Feature protein selection using machine learning methods such as RF and LASSO.

  4. 4.timepoint_trend_test.R

    Trend testing in sequential samples for feature protein selection.

  5. other_ML_methods.R

    Selection and validation of protein molecules using other machine learning methods (SVM-RFE, ElasticNet, XGBoost).

  6. ROC.R

    ROC analysis to validate the classification performance of the selected proteins.

  7. ./data/

    Contains the data for between-group comparisons and the time-series data from sequential samples. Both can be read directly with R's load() function. The data include protein quantification data, protein annotation data, and sample group information. The protein quantification data are truncated to the first 100 rows; please contact the author for the complete data.
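As a minimal sketch of reading the data with load(): the helper below loads an .Rdat file into its own environment so the objects it contains can be listed before use. The helper name load_rdat and the path data/group.Rdat are assumptions made for illustration; the names of the objects stored inside the files are not documented here, so the sketch only lists them.

```r
# Hypothetical helper (not part of the repository): load an .Rdat file into
# a separate environment instead of the global workspace.
load_rdat <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  env <- new.env()
  load(path, envir = env)
  env
}

# Assumed location of the group data; adjust to the actual repository layout.
group_path <- file.path("data", "group.Rdat")
if (file.exists(group_path)) {
  group_env <- load_rdat(group_path)
  print(ls(group_env))  # names of the objects stored in the file
}
```

Loading into a separate environment avoids silently overwriting objects that already exist in the session.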

The related research article is currently under submission. If you use our code, please cite the article. If you have any questions, please contact the author.
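For readers who want a feel for the kind of analysis described above, the following base-R sketch runs a variance filter, a hierarchical clustering of samples, and a rank-based AUC on simulated data. It is not the code from the repository scripts: the matrix, the group labels "A" and "B", the cutoff of 10 proteins, and the auc function are all assumptions chosen for illustration.

```r
set.seed(42)

# Simulated log-intensity matrix: 50 hypothetical proteins x 12 samples
n_prot <- 50
groups <- rep(c("A", "B"), each = 6)
expr <- matrix(rnorm(n_prot * 12), nrow = n_prot,
               dimnames = list(paste0("prot", seq_len(n_prot)),
                               paste0("s", 1:12)))
# Shift the first five proteins upward in group B
expr[1:5, groups == "B"] <- expr[1:5, groups == "B"] + 3

# Keep the 10 proteins with the largest variance across samples
row_var <- apply(expr, 1, var)
top <- names(sort(row_var, decreasing = TRUE))[1:10]

# Hierarchical clustering of the samples on the filtered proteins
hc <- hclust(dist(t(expr[top, ])), method = "average")
clusters <- cutree(hc, k = 2)

# Rank-based AUC: the Mann-Whitney U statistic divided by n_pos * n_neg
auc <- function(score, label, positive) {
  pos <- label == positive
  r <- rank(score)
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

print(table(clusters, groups))
print(auc(expr["prot1", ], groups, positive = "B"))
```

The rank-based AUC avoids an extra package dependency; a dedicated ROC package would be the more usual choice when full curves and confidence intervals are needed.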
