Unsupervised Learning

Principal component analysis (PCA) and K-means clustering

Prerequisites

  • BioNT Applied Machine Learning for Biological Data

    • Module 1: Python Numpy and Pandas

Participants should already have the skills introduced in the lessons listed above, or equivalent skills.

Time

2 hours and 30 minutes

Objectives

  • Demonstrate the use of unsupervised learning for drug sensitivity analysis.

  • Walk through an example workflow of PCA and K-means clustering on a test dataset (drug sensitivity patterns across patients) for patient stratification.

Note

ML use-case

  • Drug sensitivity scores: 50 drugs and 25 patients

  • Unsupervised learning (PCA and clustering) analysis will

    1. Transform the high-dimensional drug sensitivity data into a lower-dimensional dataset that captures the most significant variance and patterns

    2. Group patients into distinct strata based on similarities in their overall drug sensitivity patterns
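
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the lesson's notebook. The data are synthetic (randomly generated, not the test dataset). The choice of 2 principal components and 3 clusters, and the helper names `pca` and `kmeans`, are assumptions made for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the lesson's data (NOT the real dataset):
# 25 patients (rows) x 50 drugs (columns), drawn around 3 hypothetical group profiles.
n_patients, n_drugs, n_groups = 25, 50, 3
true_group = np.arange(n_patients) % n_groups
group_profiles = rng.normal(0.0, 3.0, size=(n_groups, n_drugs))
scores = group_profiles[true_group] + rng.normal(0.0, 1.0, size=(n_patients, n_drugs))


def pca(X, n_components):
    """Project the rows of X onto the top principal components using SVD."""
    Xc = X - X.mean(axis=0)                      # centre each drug (column)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # fraction of variance per component
    return Xc @ Vt[:n_components].T, explained[:n_components]


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: returns one cluster label per row, plus centroids."""
    local_rng = np.random.default_rng(seed)
    centroids = X[local_rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance of every point to every centroid, then nearest-centroid assignment.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Step 1: reduce 50 drug dimensions to 2 principal components per patient.
components, explained_ratio = pca(scores, n_components=2)
# Step 2: group patients by their position in the reduced space.
labels, centroids = kmeans(components, k=3)

print(components.shape)   # (25, 2): one 2-D point per patient
print(explained_ratio)    # variance fraction captured by each component
print(labels)             # cluster index assigned to each patient
```

In practice, libraries such as scikit-learn provide `PCA` and `KMeans` classes that cover the same two steps. With real data, the number of components and clusters would normally be chosen by inspecting the explained variance and a cluster-quality measure rather than fixed in advance.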

Dataset

download test dataset

Notebook
