Micro-level rumour detection on Twitter / Rosa Sicilia, 2020 Mar 20. 32nd cycle

Micro-level rumour detection on Twitter

SICILIA, ROSA
2020-03-20

Abstract

Recent years have witnessed a drastic change in information diffusion, which has become increasingly immediate and effortless thanks to social media, allowing not only certified and professional press practitioners but also ordinary end users to share news content at little to no cost. Despite the clear advantages of this phenomenon, the absence of systematic control and moderation on these platforms easily leads to the spread of unreliable information. Such information is usually referred to as a rumour: an unverified and instrumentally relevant statement in circulation. To prevent treacherous information from having social consequences, researchers have devoted considerable effort to studying automatic systems able to recognize rumours. Most of this work focuses on macro-level analysis, i.e. the detection system treats as a rumour a news item carried by a set of microblog posts rather than by an individual post. However, a micro-level analysis that considers individual posts could be of major interest in specific domains, such as health, where a finer investigation is often needed. On these grounds, in this thesis we investigate machine learning methods to detect rumours at the micro-level, and we apply our research to two real-world test cases. To this end, we investigate four main directions: the collection and annotation of two datasets, the design of the feature set, the introduction of a novel feature selection approach, and the investigation of how knowledge can be transferred among different topics. First, we present two Twitter datasets on trending health topics, manually labelled into three classes: rumour; nonrumour, i.e. referenced news; and unknown, i.e. posts that belong to neither the rumour nor the nonrumour class. Second, we design a novel feature set comprising both descriptors drawn from the literature and descriptors newly conceived for the micro-level task, which describe influence potential and network characteristics. Third, we explore the influence of feature selection on this specific problem, proposing a novel filter algorithm that relies on a rule-based topology framework, which characterizes the feature space with the aim of reducing the number of samples in unreliable configurations. Testing this third approach on the two health datasets, we obtain promising results, reaching an accuracy as high as 96.8%. As a further step in micro-level research, we also tackle the problem of knowledge transfer among different topic domains. To this end, we present a novel hybrid transfer learning approach that exploits the rule-based topology framework used for feature selection. Comparing this novel method with state-of-the-art techniques on our two datasets, we obtain results that show the validity of the method and the potential of transfer learning for rumour detection.
Micro-level; Rumour detection; Twitter; health-related topics; space characterization; transfer
Files in this item:
File: DT_245_SiciliaRosa.pdf
Access: open access
Type: Doctoral thesis
License: Creative Commons
Size: 7.01 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12610/68737