In this repository we provide the data set for the corpus of Hate speech in Online Comments from German Newspapers (HOCON34k). Please see the corpus website for more information about the corpus.
- datasets: Folder that contains the data set
- expert-agreement-annotation.ipynb: Jupyter notebook (Python) for reproducing the calculation and optimization of the interrater reliability
- environment.yml: File for creating an Anaconda environment, which contains the packages required to execute the code.
As a prerequisite, please make sure that you have installed a current version of the Python distribution Anaconda. To create and activate the environment, execute the following commands:
conda env create --file=environment.yml
conda activate hatespeech-dataset-hocon34k
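
For readers unfamiliar with the interrater reliability mentioned in the file list, here is a minimal, dependency-free sketch of one common agreement measure, Cohen's kappa for two annotators. This is only an illustration: the metric actually used in expert-agreement-annotation.ipynb may differ, and the function name `cohen_kappa` and the toy labels are hypothetical and not taken from the data set.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    Illustrative helper only; not necessarily the measure used in the notebook.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list.")

    n = len(labels_a)

    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: derived from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy labels (1 = hate speech, 0 = no hate speech); made up for illustration.
annotator_1 = [1, 0, 0, 1, 0, 1, 0, 0]
annotator_2 = [1, 0, 1, 1, 0, 0, 0, 0]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

A value of 1 indicates perfect agreement, while a value near 0 indicates agreement no better than chance.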