-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathmanifest.yml
60 lines (59 loc) · 1.78 KB
/
manifest.yml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
toolforge: 1.0
container: kdvh6tkl
type: tool
environment:
size: medium
parameters:
- type: int
minimum: 1
maximum: 5
default: 1
name: MinNgramLength
description: |
This tool returns ngrams in a range of lengths. How many words should be
in the shortest ngrams the tool returns? A 1-gram is one word; a 2-gram
is two words; and so on.
required: true
- type: int
minimum: 1
maximum: 5
default: 3
name: MaxNgramLength
description: |
This tool returns ngrams in a range of lengths. How many words should be
in the longest ngrams the tool returns? A 1-gram is one word; a 2-gram
is two words; and so on. This value must be no less than the value given
in the MinNgramLength parameter.
required: true
- type: string
domain:
type: pattern
pattern: .{1,80}
name: TextColumnName
description: |
Within the Data spreadsheet input, what is the name of the column
containing the text to analyze? The column name must match exactly.
required: true
inputs:
- name: Data
description: |
The spreadsheet containing the text to analyze. The spreadsheet must
contain a column with a name that matches the value given in
`TextColumnName` exactly.
extensions:
- txt
- csv
- xls
- xlsx
outputs:
- name: Ngrams
description: |
The ngram data, sorted by frequency descending. If there are more than
one million unique ngrams, the only the first one million are shown. The
output will contain the following columns:
* `ngram` -- An ngram found in the text
* `length` -- The length of this ngram, in tokens
* `count` -- The number of times this ngram appeared in the given corpus
extensions:
- csv
- xlsx