
Representation and Multimodal Learning in Social Media and Healthcare / Luisa Francini, 2024 Nov 28. 36th cycle, Academic Year 2022/2023.

Representation and Multimodal Learning in Social Media and Healthcare

FRANCINI, LUISA
2024-11-28

Abstract

In today’s data-rich world, multimodal learning has emerged as a powerful approach to integrate and process diverse data sources—such as text, images, and structured information—enabling machine learning systems to make more accurate and robust predictions. This PhD thesis focuses on advancing multimodal learning by addressing key challenges in feature extraction, transfer learning, and multimodal fusion, with applications in rumour detection on social media and healthcare. The thesis first explores how to design and evaluate feature sets that effectively capture multimodal information. Through extensive experiments on social media datasets, the research demonstrates that combining features from user behavior, network interactions, and content significantly enhances rumour detection performance. Next, the study examines the role of transfer learning in multimodal systems, revealing both its potential and its limitations in improving model generalizability when applied to new, unseen datasets. To further advance multimodal integration, this work introduces a novel fusion approach that dynamically selects the most relevant modality for each task. Evaluations on a Twitter dataset show that this method outperforms traditional early, joint, and late fusion strategies. The thesis then extends the fusion method to the healthcare domain, applying it to predict Long-COVID pulmonary sequelae using multimodal clinical data, demonstrating the flexibility and effectiveness of the proposed techniques in a critical healthcare context. While the proposed methods have made significant strides, the research also identifies several limitations, such as the challenges of handling complex data representations and the need for larger datasets. These insights point to promising avenues for future research, including exploring deep learning-based feature extraction, more sophisticated transfer learning techniques, and expanding the multimodal approach to other high-impact domains. 
This thesis contributes to the growing body of work in multimodal learning, offering novel solutions to enhance both accuracy and generalizability in real-world applications.
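The fusion idea central to the abstract — dynamically selecting the most relevant modality for each example rather than combining all of them — can be sketched as follows. This is a minimal illustrative gate, not the thesis's actual method; the function names, the linear gating rule, and the modality names are assumptions for exposition only.

```python
def select_modality(features_by_modality, gate_weights):
    """Pick, for each example, the modality whose gated score is highest.

    features_by_modality: {modality name: list of feature vectors, one per example}
    gate_weights: {modality name: weight vector scoring that modality's features}
    Returns (list of selected feature vectors, list of chosen modality names).
    """
    names = sorted(features_by_modality)
    n_examples = len(features_by_modality[names[0]])
    fused, chosen = [], []
    for i in range(n_examples):
        # Score each modality's features for example i with a simple dot product.
        scores = {
            m: sum(w * x for w, x in zip(gate_weights[m], features_by_modality[m][i]))
            for m in names
        }
        # Keep only the highest-scoring modality for this example.
        best = max(names, key=lambda m: scores[m])
        fused.append(features_by_modality[best][i])
        chosen.append(best)
    return fused, chosen
```

In contrast with early fusion (concatenating all modalities up front) or late fusion (averaging per-modality predictions), a per-example selection like this lets different inputs be handled by different modalities.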
Files in this record:

File: PhD_Francini_Luisa.pdf
Access: open access
Type: Doctoral thesis
License: Creative Commons
Size: 2.23 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12610/83163