Objective: Fine-tune CT-CLIP, a state-of-the-art pre-trained Vision Transformer model, to classify the presence of 18 different abnormalities in 3D CT scans.

Methodology & Pipeline

The pipeline has five steps, from raw data to a trained model. Here is what the code does at each stage:

1. Data Cleaning & Preparation:

2. Data Sampling (for Efficient Prototyping):

3. Model Loading & Fine-Tuning (Transfer Learning):

4. Handling Severe Class Imbalance:

5. Training & Validation Loop:
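
The outline does not say how the class imbalance in step 4 is handled or which loss the loop in step 5 minimizes. One common option for multi-label problems is to up-weight the positive term of binary cross-entropy for each class by its negative-to-positive ratio, similar in spirit to the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`. The sketch below illustrates that idea in plain Python. It assumes 0/1 label vectors with one entry per abnormality, and the function names (`pos_weights`, `weighted_bce`, `softplus`) are hypothetical, not taken from the project code.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pos_weights(labels):
    """Per-class weight = (#negatives / #positives) over the training labels.

    labels: list of rows, each a list of 0/1 flags, one per class.
    """
    n_samples = len(labels)
    n_classes = len(labels[0])
    weights = []
    for c in range(n_classes):
        n_pos = sum(row[c] for row in labels)
        n_neg = n_samples - n_pos
        # Assumption: fall back to 1.0 when a class has no positives,
        # so the weight stays finite.
        weights.append(n_neg / n_pos if n_pos > 0 else 1.0)
    return weights


def weighted_bce(logits, targets, weights):
    """Mean weighted binary cross-entropy for one sample across classes."""
    total = 0.0
    for z, y, w in zip(logits, targets, weights):
        log_p = -softplus(-z)      # log(sigmoid(z))
        log_not_p = -softplus(z)   # log(1 - sigmoid(z))
        # Only the positive term is scaled by the class weight.
        total += -(w * y * log_p + (1 - y) * log_not_p)
    return total / len(logits)


# Toy example with 3 classes instead of 18 (illustrative data only).
toy_labels = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
toy_weights = pos_weights(toy_labels)  # [3.0, 3.0, 1.0]
toy_loss = weighted_bce([0.0, 0.0, 0.0], toy_labels[0], toy_weights)
```

With this weighting, a missed positive for a rare class costs more than a false alarm, which counteracts the tendency of the model to predict "absent" for every rare abnormality. In a real training loop the weights would be computed once from the training split and reused at every step.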