Unsupervised Learning
Principal component analysis (PCA) and K-means clustering
Prerequisites
BioNT Applied Machine Learning for Biological Data
Module 1: Python Numpy and Pandas
Participants should already have the skills introduced in the lessons listed above, or equivalent skills.
Time
2 hours and 30 minutes
Objectives
Demonstrate the use of unsupervised learning for drug sensitivity analysis.
Walk through an example workflow of PCA and K-means clustering on a test dataset (drug sensitivity patterns across patients) for patient stratification.
Note
ML use-case
Drug sensitivity scores: 50 drugs and 25 patients
Unsupervised learning (PCA and clustering) analysis will:
Transform the high-dimensional drug sensitivity data into a lower-dimensional dataset that captures the most significant variance and patterns
Group patients into distinct strata based on similarities in their overall drug sensitivity patterns
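The two steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the lesson's actual analysis: the matrix `X` is synthetic random data (25 patients by 50 drugs, with two artificially planted patient groups), the choice of 2 principal components and 2 clusters is an assumption, and the `kmeans` helper is a hypothetical, simplified implementation of Lloyd's algorithm. In practice, scikit-learn's `PCA` and `KMeans` classes would typically be used instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the lesson data (assumption): 25 patients (rows) x 50 drugs (columns).
# Two artificial patient groups are planted so the clustering has structure to find.
n_patients, n_drugs = 25, 50
group = np.array([0] * 12 + [1] * 13)
centers = rng.normal(0.0, 2.0, size=(2, n_drugs))
X = centers[group] + rng.normal(0.0, 0.5, size=(n_patients, n_drugs))

# Standardize each drug (column) to zero mean and unit variance.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via singular value decomposition of the standardized matrix.
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)            # variance fraction per component
n_components = 2                                 # assumed for illustration
scores = U[:, :n_components] * S[:n_components]  # patient coordinates, shape (25, 2)


def kmeans(points, k, n_iter=100, seed=0):
    """Simplified Lloyd's algorithm: assign to nearest centroid, recompute centroids."""
    local_rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct, randomly chosen points.
    centroids = points[local_rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Group patients into strata using their coordinates in the reduced space.
labels, centroids = kmeans(scores, k=2)

print("Explained variance ratio (PC1, PC2):", explained_ratio[:2].round(3))
print("Patients per cluster:", np.bincount(labels, minlength=2))
```

Patients are placed in rows and drugs in columns because the goal is to stratify patients, so each patient is one sample described by 50 drug sensitivity features. Clustering on the PCA scores rather than on the raw matrix is one common design choice; clustering the standardized data directly is also possible.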
Dataset
Imputed Drug Sensitivities:
This data was imputed for TCGA-BRCA patients using a model trained on cancer cell line gene expression and the corresponding in vitro drug response measurements.
Source: Cancer drug sensitivity prediction from routine histology images