Purpose
This study empirically investigates how the COVID-infodemic manifests differently across languages
and countries. It focuses on the topical and temporal features of COVID-19-related misinformation
in five countries.
Design/methodology/approach
COVID-related misinformation was retrieved from 4,487 fact-checked
articles. A novel approach to cross-lingual topic extraction was applied, using the rectr algorithm,
which relies on aligned word embeddings. To examine how the COVID-infodemic interplays with
the pandemic, a time series analysis was used to construct and compare their temporal development.
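As a rough illustration of why aligned word embeddings help here, the following minimal sketch compares short texts written in different languages once their words live in one shared vector space. It is not the rectr implementation: the three-dimensional vectors, the `ALIGNED` table and the helper names `doc_vector` and `cosine` are all hypothetical and were invented for this example.

```python
import math

# Hypothetical toy vectors in a shared, already aligned 3-d space.
# Real aligned embeddings have hundreds of dimensions; these values are invented.
ALIGNED = {
    "en": {
        "vaccine": [0.90, 0.10, 0.00],
        "virus": [0.80, 0.30, 0.10],
        "election": [0.00, 0.20, 0.90],
    },
    "de": {
        "impfstoff": [0.88, 0.12, 0.02],
        "virus": [0.79, 0.31, 0.10],
        "wahl": [0.05, 0.15, 0.92],
    },
}


def doc_vector(tokens, lang):
    """Average the aligned vectors of the known tokens in one document."""
    vecs = [ALIGNED[lang][t] for t in tokens if t in ALIGNED[lang]]
    if not vecs:
        raise ValueError("no known tokens in document")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


en_health = doc_vector(["vaccine", "virus"], "en")
de_health = doc_vector(["impfstoff", "virus"], "de")
de_politics = doc_vector(["wahl"], "de")

# Documents on the same topic should be closer than documents on different
# topics, regardless of the language they were written in.
same_topic = cosine(en_health, de_health)
other_topic = cosine(en_health, de_politics)
```

Because both languages share one space, a single topic model can in principle be fitted to all documents at once instead of one model per language.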
Findings
The cross-lingual topic model reveals the topical characteristics of each country. At the
aggregate level, health misinformation represents only a small portion of the COVID-infodemic. The time
series results indicate that, for most countries, the infodemic curve fluctuates with the epidemic curve. In this
study, this form of infodemic is referred to as a “point-source infodemic”. A second type, the
“continuous infodemic”, is seen in India and the United States (US). In those two countries, the infodemic is
predominantly caused by political misinformation; its temporal distribution appears to be largely unrelated to
the epidemic development.
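The distinction between the two infodemic types can be illustrated with a simple correlation between a weekly misinformation series and a weekly case series. This is only a sketch under stated assumptions: the Pearson correlation stands in for the time series analysis, whose exact form is not given here, and all counts below are invented toy data rather than results from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented weekly counts, for illustration only.
cases = [10, 40, 120, 300, 220, 90, 30, 15]      # epidemic curve
point_source = [2, 9, 25, 61, 44, 17, 7, 3]      # rises and falls with cases
continuous = [20, 22, 19, 21, 20, 22, 19, 21]    # roughly flat over time

r_point_source = pearson(cases, point_source)
r_continuous = pearson(cases, continuous)
```

In this toy setting, a coefficient close to 1 corresponds to the point-source pattern, while a coefficient close to 0 corresponds to the continuous pattern, where misinformation volume does not track the epidemic curve.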
Originality/value
Despite the growing attention given to misinformation research, existing scholarship is
dominated by single-country or monolingual research. This study takes a cross-national and cross-lingual
comparative approach to investigate the problem of online misinformation. This paper demonstrates how the
technological barrier of cross-lingual topic analysis can be overcome with aligned word-embedding algorithms.