Exploring the topics, sentiments and hate speech in the Spanish information environment

Authors

Alejandro Buitrago López ([email protected]) ^M
Javier Pastor-Galindo ([email protected]) ^M
José A. Ruipérez-Valiente ([email protected]) ^M

Affiliation

^M Department of Information and Communications Engineering, University of Murcia, Spain

Abstract

In the digital era, the internet and social media have revolutionized communication but also facilitated the spread of hate speech and disinformation, provoking radicalization, polarization, and toxicity. This phenomenon is particularly concerning for reputable news sources, as it can undermine public trust and contribute to social discord. To characterize the content surrounding these sources, this paper analyzes the topics, sentiments, and prevalence of hate in 337,807 messages (website comments and tweets) responding to news from five Spanish media outlets (La Vanguardia, ABC, El País, El Mundo, and 20 Minutos) in January 2021.

Organization

The repository is organized as follows:

notebooks/: contains the Jupyter notebooks used in the study.
- topics_media: characterize the five Spanish media outlets.
- Data-exploration: explore data and perform sentiment analysis.
- topic_categories: classify topics into categories.
- topic_modelling: main notebooks with topic modelling.
- categories_analysis: analyze hate and sentiment for each category.
data/: contains the data.
- Human-in-the-loop-topic-labeling: topics with the entire process of labeling.
- topics_media: topics with distribution for each media outlet.
- humantag: topics with human tag and category.
- keywords: topics with keywords.
- id_topics_categories_sentiment: extension of the dataset with topics, categories, and sentiment, with ids referring to the original dataset.
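
To illustrate the kind of per-category aggregation that categories_analysis is described as performing, here is a minimal sketch using only the Python standard library. The field names (`id`, `category`, `sentiment`, `hate`) and the sample rows are assumptions made for this example. They are not the actual schema of the files in data/, and the notebooks may use different tooling.

```python
from collections import Counter, defaultdict

# Hypothetical rows: field names and values are illustrative assumptions,
# not the schema of the repository's data files.
rows = [
    {"id": 1, "category": "politics", "sentiment": "negative", "hate": True},
    {"id": 2, "category": "politics", "sentiment": "neutral", "hate": False},
    {"id": 3, "category": "sports", "sentiment": "positive", "hate": False},
    {"id": 4, "category": "sports", "sentiment": "negative", "hate": False},
]


def summarize_by_category(rows):
    """Group rows by category and compute size, hate rate, and sentiment counts."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["category"]].append(row)

    summary = {}
    for category, items in grouped.items():
        summary[category] = {
            "n": len(items),
            # Share of messages in this category flagged as hate.
            "hate_rate": sum(r["hate"] for r in items) / len(items),
            # Number of messages per sentiment label.
            "sentiments": Counter(r["sentiment"] for r in items),
        }
    return summary


summary = summarize_by_category(rows)
for category, stats in sorted(summary.items()):
    print(category, stats["n"], stats["hate_rate"], dict(stats["sentiments"]))
```

With real data, the rows would come from the released files rather than an inline list. The grouping logic stays the same once the actual column names are substituted.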

Note

The full dataset is not accessible.

License

This project is licensed under the MIT License.

Acknowledgements

This work has been partially funded by the strategic project CDL-TALENTUM from the Spanish National Institute of Cybersecurity (INCIBE), by the Recovery, Transformation, and Resilience Plan (Next Generation EU), and by the University of Murcia through an FPU contract.

We thank the Hatemedia project (PID2020-114584GB-I00), financed by MCIN/AEI/10.13039/501100011033, for providing the dataset used in this work.
