In today’s data-rich world, multimodal learning has emerged as a powerful approach to integrate and process diverse data sources—such as text, images, and structured information—enabling machine learning systems to make more accurate and robust predictions. This PhD thesis focuses on advancing multimodal learning by addressing key challenges in feature extraction, transfer learning, and multimodal fusion, with applications in rumour detection on social media and in healthcare. The thesis first explores how to design and evaluate feature sets that effectively capture multimodal information. Through extensive experiments on social media datasets, the research demonstrates that combining features from user behavior, network interactions, and content significantly enhances rumour detection performance. Next, the study examines the role of transfer learning in multimodal systems, revealing both its potential and its limitations in improving model generalizability when applied to new, unseen datasets. To further advance multimodal integration, this work introduces a novel fusion approach that dynamically selects the most relevant modality for each task. Evaluations on a Twitter dataset show that this method outperforms traditional early, joint, and late fusion strategies. The thesis then extends the fusion method to the healthcare domain, applying it to predict Long-COVID pulmonary sequelae from multimodal clinical data, demonstrating the flexibility and effectiveness of the proposed techniques in a critical healthcare context. While the proposed methods make significant strides, the research also identifies several limitations, such as the challenges of handling complex data representations and the need for larger datasets. These insights point to promising avenues for future research, including deep learning-based feature extraction, more sophisticated transfer learning techniques, and extending the multimodal approach to other high-impact domains.
This thesis contributes to the growing body of work in multimodal learning, offering novel solutions to enhance both accuracy and generalizability in real-world applications.
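The fusion strategies contrasted in the abstract can be illustrated with a toy sketch. This is hypothetical code, not the thesis implementation: the feature vectors, scores, and confidence values are made up, and the selection rule (pick the single most confident modality per sample) is only one plausible reading of "dynamically selects the most relevant modality".

```python
# Toy contrast of multimodal fusion strategies (illustrative only).
# Values and the confidence-based selection rule are assumptions,
# not taken from the thesis.

def early_fusion(text_feats, image_feats):
    # Early fusion: concatenate per-modality feature vectors
    # into one input for a single downstream classifier.
    return text_feats + image_feats

def late_fusion(text_score, image_score):
    # Late fusion: train one classifier per modality,
    # then combine their output scores (here, by averaging).
    return (text_score + image_score) / 2

def selective_fusion(scores, confidences):
    # Dynamic modality selection: for each sample, keep only the
    # prediction of the modality judged most reliable.
    best_modality = max(confidences, key=confidences.get)
    return scores[best_modality]

combined = early_fusion([1.0, 2.0], [3.0])   # [1.0, 2.0, 3.0]
averaged = late_fusion(0.75, 0.25)           # 0.5
selected = selective_fusion(
    {"text": 0.9, "image": 0.4},
    {"text": 0.8, "image": 0.3},
)                                            # 0.9 (text wins)
```

The key design difference: early and late fusion always use every modality, whereas per-sample selection can ignore a noisy or uninformative modality entirely, which is the behavior the abstract credits for outperforming the traditional strategies.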
Representation and Multimodal Learning in Social Media and Healthcare / Luisa Francini, 2024 Nov 28. 36th cycle, Academic Year 2022/2023.
File | Type | License | Size | Format
---|---|---|---|---
PhD_Francini_Luisa.pdf | Doctoral thesis (open access) | Creative Commons | 2.23 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.