<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>HUMOR: a Crowd-Annotated Spanish Corpus for Humor Analysis</title>
<meta name="description"
content="Crowd-annotated corpus of 27k tweets written in Spanish, labeled by humor and funniness (1 to 5) value, created for Humor Analysis and Natural Language Processing.">
<meta name="author" content="Santiago Castro, Luis Chiruzzo, Aiala Rosá, Diego Garat, Guillermo Moncecchi"/>
<meta name="viewport" content="width=device-width, initial-scale=1">
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Dataset",
"name": "HUMOR",
"description": "Crowd-annotated corpus of 27k tweets written in Spanish, labeled by humor and funniness (1 to 5) value, created for Humor Analysis and Natural Language Processing.",
"creator": "Santiago Castro, Luis Chiruzzo, Aiala Rosá, Diego Garat, and Guillermo Moncecchi",
"distribution": {
"@type": "DataDownload",
"encodingFormat": "CSV",
"contentUrl": "https://pln-fing-udelar.github.io/humor/annotations_by_tweet.csv"
},
"datePublished": "2018-07-20"
}
</script>
<meta property="og:type" content="website"/>
<meta property="og:site_name" content="HUMOR: a Crowd-Annotated Spanish Corpus for Humor Analysis"/>
<meta property="og:image" content="https://pln-fing-udelar.github.io/humor/og.png"/>
<meta property="og:image:height" content="630"/>
<meta property="og:image:width" content="1200"/>
<meta property="og:title" content="HUMOR: a Crowd-Annotated Spanish Corpus for Humor Analysis"/>
<meta property="og:description"
content="Crowd-annotated corpus of 27k tweets written in Spanish, labeled by humor and funniness (1 to 5) value, created for Humor Analysis and Natural Language Processing."/>
<meta property="og:url" content="https://pln-fing-udelar.github.io/humor/"/>
<meta property="fb:app_id" content="1887710507982042"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:site" content="@PLN_UdelaR"/>
<meta name="twitter:creator" content="@PLN_UdelaR"/>
<link href="index.css" rel="stylesheet">
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-34392230-8"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'UA-34392230-8');
</script>
</head>
<body>
<header>
<h1>HUMOR</h1>
<p id="subtitle">A Crowd-Annotated Spanish Corpus for Humor Analysis</p>
</header>
<div id="authors-affiliations">
<p>
<a href="https://github.com/bryant1410">Santiago Castro</a>, Luis Chiruzzo, Aiala Rosá, Diego Garat, and
<a href="https://www.fing.edu.uy/~gmonce/">Guillermo Moncecchi</a>
</p>
<p>
<a href="https://www.fing.edu.uy/inco/grupos/pln/">Grupo de Procesamiento de Lenguaje Natural</a> (NLP Group),
<a href="https://udelar.edu.uy/">Universidad de la República</a> — Uruguay
</p>
</div>
<p>A crowd-annotated corpus of 27k tweets written in Spanish, each labeled as humorous or not and rated for funniness
on a scale of 1 to 5, created for Humor Analysis and Natural Language Processing.</p>
<a href="https://github.com/pln-fing-udelar/humor" target="_blank" class="github-corner"
aria-label="View source on GitHub">
<svg width="80" height="80" viewbox="0 0 250 250"
style="fill:#151513; color:#fff; position: absolute; top: 0; border: 0; right: 0;" aria-hidden="true">
<path d="M0,0 L115,115 L130,115 L142,142 L250,250 L250,0 Z"></path>
<path
d="M128.3,109.0 C113.8,99.7 119.0,89.6 119.0,89.6 C122.0,82.7 120.5,78.6 120.5,78.6 C119.2,72.0 123.4,76.3 123.4,76.3 C127.3,80.9 125.5,87.3 125.5,87.3 C122.9,97.6 130.6,101.9 134.4,103.2"
fill="currentColor" style="transform-origin: 130px 106px;" class="octo-arm"></path>
<path
d="M115.0,115.0 C114.9,115.1 118.7,116.5 119.8,115.4 L133.7,101.6 C136.9,99.2 139.9,98.4 142.2,98.6 C133.8,88.0 127.5,74.4 143.8,58.0 C148.5,53.4 154.0,51.2 159.7,51.0 C160.3,49.4 163.2,43.6 171.4,40.1 C171.4,40.1 176.1,42.5 178.8,56.2 C183.1,58.6 187.2,61.8 190.9,65.4 C194.5,69.0 197.7,73.2 200.1,77.6 C213.8,80.2 216.3,84.9 216.3,84.9 C212.7,93.1 206.9,96.0 205.4,96.6 C205.1,102.4 203.0,107.8 198.3,112.5 C181.9,128.9 168.3,122.5 157.7,114.1 C157.9,116.9 156.7,120.9 152.7,124.9 L141.0,136.5 C139.8,137.7 141.6,141.9 141.8,141.8 Z"
fill="currentColor" class="octo-body">
</path>
</svg>
</a>
<a href="http://www.aclweb.org/anthology/W18-3502">Paper</a>
<h2>Downloads</h2>
<p>The dataset consists of the following two files:</p>
<ul>
<li><a href="annotations.csv">Annotations CSV</a></li>
<li><a href="tweets.csv">Tweets CSV</a></li>
</ul>
<p>Aggregated version (one row per tweet with the sum of annotations for each category): <a
href="annotations_by_tweet_all.csv">All Annotations by Tweet CSV</a></p>
<p>Aggregated version, excluding annotations from annotators who did not pass the test tweets: <a
href="annotations_by_tweet.csv"><b>Annotations by Tweet CSV</b></a>. This is the version used by the <a
href="https://www.fing.edu.uy/inco/grupos/pln/haha/">HAHA task on Humor Recognition and Funniness Detection</a>.
</p>
<h2>Citation</h2>
<p>If you publish work that uses this dataset, please cite it as follows:</p>
<pre><code>@inproceedings{castro2018,
title={A Crowd-Annotated Spanish Corpus for Humor Analysis},
author={Castro, Santiago and Chiruzzo, Luis and Ros{\'a}, Aiala and Garat, Diego and Moncecchi, Guillermo},
booktitle={Proceedings of SocialNLP 2018, The 6th International Workshop on Natural Language Processing for Social Media},
year={2018}
}</code></pre>
<h2>Slides</h2>
<a href="slides.pdf">SocialNLP 2018 @ ACL slides</a>
</body>
</html>