Deep Learning-Driven Classification of NSCLC Histological Subtypes Using PET and CT Imaging / Fatih Aksu, 2026 Mar 05. 37th cycle
Deep Learning-Driven Classification of NSCLC Histological Subtypes Using PET and CT Imaging
AKSU, FATIH
2026-03-05
Abstract
Lung cancer remains the leading cause of cancer-related deaths worldwide, with non-small cell lung cancer (NSCLC) accounting for approximately 85% of all cases. Among its subtypes, adenocarcinoma (ADC) and squamous cell carcinoma (SQC) are the most prevalent, each associated with distinct prognoses and treatment pathways. Accurate histological subtype classification is thus critical for effective personalized therapy. However, current standard procedures rely on invasive biopsies, which may be unsuitable or risky for certain patients. Moreover, biopsy samples are often limited in size or affected by tumor heterogeneity, leading to diagnostic uncertainty. These limitations have motivated the search for non-invasive, imaging-based methods to predict histological subtypes. To address this challenge, this thesis explores a series of deep learning strategies that leverage radiological imaging, particularly CT and PET scans, to develop robust and accurate classification systems. The work systematically addresses key barriers in this domain, including limited dataset sizes, data privacy concerns, and the challenge of effectively integrating multimodal information.

The first part of the thesis investigates whether triplet networks, a type of metric learning model, can overcome the challenge of learning from small datasets, a common issue in medical imaging owing to privacy concerns, annotation costs, and limited patient availability. Unlike conventional deep networks that rely on softmax classifiers and learn directly from individual samples, triplet networks learn by modeling the relationships between samples, specifically by comparing an anchor to both a similar (positive) and a dissimilar (negative) example. Because each sample can take part in many triplets, this relational learning framework effectively increases the number of informative training instances and enhances the model's ability to generalize.
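The anchor/positive/negative comparison described above can be illustrated with a minimal sketch. It assumes a hinge-style triplet loss with Euclidean distance and exhaustive triplet enumeration; the abstract does not state which loss variant, distance, or mining strategy the thesis uses, and the function names, margin, and toy embeddings below are hypothetical.

```python
import math
from itertools import product


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an assumption made for illustration.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def enumerate_triplets(labels):
    """List all (anchor, positive, negative) index triples.

    The positive shares the anchor's label (and is a different sample);
    the negative has a different label.
    """
    triplets = []
    for a, p, n in product(range(len(labels)), repeat=3):
        if a != p and labels[a] == labels[p] and labels[a] != labels[n]:
            triplets.append((a, p, n))
    return triplets


# Toy 2-D embeddings standing in for network outputs (hypothetical values).
embeddings = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0), (3.0, 5.0)]
labels = ["ADC", "ADC", "SQC", "SQC"]

triplets = enumerate_triplets(labels)
losses = [triplet_loss(embeddings[a], embeddings[p], embeddings[n]) for a, p, n in triplets]

# A triplet that violates the margin: the "positive" is far, the "negative" is near.
violating_loss = triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0))
```

In this toy setting four samples already yield eight triplets, which is the sense in which relational training multiplies the number of training instances. The well-separated embeddings all give zero loss, whereas the deliberately violating triplet gives a positive loss that training would push down.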
On a dataset of 87 NSCLC patients, triplet networks significantly outperformed conventional softmax-based models, demonstrating their ability to learn discriminative representations even under limited data conditions. Nevertheless, increasing the amount of training data remains critical for further enhancing model performance, and acquiring additional data from external sources is often hindered by concerns related to patient privacy, data ownership, and regulatory compliance. To overcome these challenges, the second part of the thesis introduces a federated learning framework combined with triplet loss. This approach enables multiple clinical institutions to collaboratively train a model without sharing patient data, thereby preserving privacy. Comparative experiments demonstrate that this federated learning method outperforms both isolated local training and federated models trained with conventional softmax loss, highlighting its potential to improve generalization while maintaining data confidentiality.

Having addressed the issues of small datasets and data sharing, the thesis then turns to multimodal learning, aiming to further improve classification performance by combining structural information from CT images with metabolic information from PET images. The third part introduces MINT, a novel Multi-stage INTermediate fusion architecture. MINT fuses PET and CT features at multiple stages of the feature extraction process, allowing the network to capture complementary cues across different levels of abstraction while preserving spatial correlations. The model is benchmarked against unimodal baselines, early and late fusion strategies, and the only other existing intermediate fusion method for this task. MINT achieves the highest performance across all comparisons, validating the advantage of intermediate fusion in leveraging multimodal imaging data for subtype classification.
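The general idea of fusing two modalities at several stages of feature extraction can be sketched on toy 1-D signals. This is not the MINT architecture, whose layers and fusion operator are not described in the abstract. The pairwise-averaging stage, the element-wise sum used as the fusion operator, and all names and values below are assumptions chosen only to keep the example small.

```python
def stage(features):
    """Toy feature-extraction stage: average adjacent pairs, halving the length."""
    return [(features[i] + features[i + 1]) / 2 for i in range(0, len(features) - 1, 2)]


def fuse(ct_feat, pet_feat):
    """Toy fusion operator (assumed): element-wise sum of aligned features.

    Position i in the output combines position i of both modalities,
    so the spatial correspondence between CT and PET is kept.
    """
    return [c + p for c, p in zip(ct_feat, pet_feat)]


def multi_stage_fusion(ct, pet, num_stages=2):
    """Run both branches stage by stage and collect one fused feature per stage."""
    fused_per_stage = []
    for _ in range(num_stages):
        ct, pet = stage(ct), stage(pet)
        fused_per_stage.append(fuse(ct, pet))
    return fused_per_stage


# Hypothetical 1-D "scans" of equal length, standing in for aligned CT and PET volumes.
ct_signal = [1.0, 3.0, 5.0, 7.0]
pet_signal = [0.0, 2.0, 0.0, 2.0]

fused = multi_stage_fusion(ct_signal, pet_signal)
```

Here `fused` holds one combined feature per stage, from a finer level (two positions) to a coarser one (a single position). A late fusion scheme would, by contrast, combine only the final outputs of the two branches, and an early fusion scheme would combine the raw inputs before any stage is applied.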
Finally, while each of the previous contributions focused on developing specialized models, the final part of the thesis explores the potential of foundation models in this domain. These large-scale pretrained models have shown promise in generalizing across tasks with minimal fine-tuning. The study evaluates three 3D medical foundation models trained on diverse datasets and compares their performance against specialized architectures developed specifically for NSCLC subtype classification. On a dataset of 714 patients, all foundation models outperformed the task-specific baselines, highlighting their potential for enabling accurate and data-efficient solutions in clinical imaging tasks.

Collectively, the contributions of this thesis address critical challenges in building non-invasive, accurate, and scalable systems for NSCLC histological subtype classification. By integrating strategies for learning from limited data, protecting privacy, fusing multimodal information, and harnessing pretrained models, the proposed approaches pave the way toward more accessible and precise diagnostic tools in lung cancer care.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| PhD_Aksu_Fatih.pdf | Open access | Doctoral thesis | Creative Commons | 2.82 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


