Update manpage to 0.966 behavior etc. #50

Open: wants to merge 10 commits into base: master
2 changes: 1 addition & 1 deletion dic.html
@@ -83,7 +83,7 @@ <h2>Entry format (non-conjugating words)</h2>
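<p>From left to right, the fields are</p>

<pre>
-surface form,left context ID,right context ID,cost,part of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation form,conjugation type,base form,reading,pronunciation
+surface form,left context ID,right context ID,cost,part of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation type,conjugation form,base form,reading,pronunciation
</pre>
<p>in this order.</p>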

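As a rough illustration of the corrected field order in dic.html, a dictionary entry could be split into named fields as sketched below. The English field names, the `parse_entry` helper, and the sample entry (including its IDs and cost) are assumptions made for illustration and are not part of MeCab. The sketch also assumes IPA-dictionary-style rows with exactly 13 comma-separated fields.

```python
import csv
from io import StringIO

# Field order after this change: conjugation type comes before conjugation form.
# These English names are illustrative labels, not identifiers used by MeCab.
FIELDS = [
    "surface", "left_id", "right_id", "cost",
    "pos", "pos_sub1", "pos_sub2", "pos_sub3",
    "conj_type", "conj_form", "base_form", "reading", "pronunciation",
]


def parse_entry(line: str) -> dict:
    """Split one dictionary CSV line into a dict keyed by FIELDS."""
    row = next(csv.reader(StringIO(line)))
    if len(row) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(row)}")
    entry = dict(zip(FIELDS, row))
    # The context IDs and the cost are integers in the CSV.
    for key in ("left_id", "right_id", "cost"):
        entry[key] = int(entry[key])
    return entry


# Hypothetical sample entry; a non-conjugating word carries "*" in both
# conjugation fields.
sample = "東京,1293,1293,3003,名詞,固有名詞,地域,一般,*,*,東京,トウキョウ,トーキョー"
entry = parse_entry(sample)
```

Keeping the order in a single `FIELDS` list means a future reordering like the one in this diff only has to be made in one place.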
4 changes: 2 additions & 2 deletions index.html
@@ -196,7 +196,7 @@ <h2><a name="news">What's New</a></h2>
<ul>
<li><strong>2013-02-18</strong> MeCab 0.996<br>
<ul>
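-<li>Fixed a problem where linking against こiconv failed because of a defect in the configure script
+<li>Fixed a problem where linking against iconv failed because of a defect in the configure script
<li>Added a feature that assigns costs and left/right context IDs to a user-dictionary CSV file and generates a new CSV file
<li>Added the Lattice::set_result() method, which builds a Lattice from an analysis result. It can be used, for example, to create stubs for unit tests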
</ul>
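@@ -443,7 +443,7 @@ <h3><a name="parse">Trying a first parse</a></h3>
From left to right, the fields are </p>

<pre>
-surface form\tpart of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation form,conjugation type,base form,reading,pronunciation
+surface form\tpart of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation type,conjugation form,base form,reading,pronunciation
</pre>

<p>in this order.</p>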
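The output layout documented in index.html (a surface form, a tab, then comma-separated features) can be read back with a few lines of code. The following is a minimal sketch, assuming the default output format with the nine IPA-dictionary-style features listed above. The `parse_output_line` helper and the English feature names are hypothetical labels introduced here, not MeCab identifiers.

```python
# Feature order after this change: conjugation type precedes conjugation form.
FEATURE_NAMES = [
    "pos", "pos_sub1", "pos_sub2", "pos_sub3",
    "conj_type", "conj_form", "base_form", "reading", "pronunciation",
]


def parse_output_line(line: str):
    """Return (surface, features) for one output line, or None if the
    line has no tab (for example the EOS marker)."""
    surface, tab, feature_str = line.rstrip("\n").partition("\t")
    if not tab:
        return None
    # zip() stops at the shorter sequence, so lines with fewer features
    # (as can happen for unknown words) simply yield fewer keys.
    features = dict(zip(FEATURE_NAMES, feature_str.split(",")))
    return surface, features


sample = "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ"
surface, features = parse_output_line(sample)
```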
10 changes: 5 additions & 5 deletions learn.html
@@ -312,7 +312,7 @@ <h3>rewrite.def</h3>
<pre>
[unigram rewrite]
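-# Strip reading and pronunciation, and use POS 1,2,3,4, conjugation form, conjugation type, base form, reading
+# Strip reading and pronunciation, and use POS 1,2,3,4, conjugation type, conjugation form, base form, reading
*,*,*,*,*,*,*,* $1,$2,$3,$4,$5,$6,$7,$8
# Ignore the reading when it is missing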
*,*,*,*,*,*,* $1,$2,$3,$4,$5,$6,$7,*
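@@ -597,11 +597,11 @@ <h2><a name="eval">Evaluation</a></h2>

<p>The -l option specifies which feature levels are used for evaluation.
<ul>
-<li>-l 0: evaluates using only feature 0.
-<li>-l 4: evaluates using features 0 through 4
+<li>-l 0: evaluates word-segmentation accuracy.
+<li>-l 4: evaluates using features 1 (the first) through 4
<li>-l -1: evaluates using the features of all levels
-<li>-l "0 1 2" shows three evaluations: feature 0, features 0 through 1, and features 0 through 4.
-<li>-l "0 1 -1" shows three evaluations: feature 0, features 0 through 1, and all levels.
+<li>-l "0 1 4" shows three evaluations: segmentation, feature 1, and features 1 through 4.
+<li>-l "0 1 -1" shows three evaluations: segmentation, feature 1, and all levels.
</ul>

<h2><a name="retrain">Retraining</a></h2>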
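The corrected description of the -l levels can be read as "which part of each token is compared". The sketch below only illustrates that reading and is not the evaluation tool's implementation. The helpers `parse_levels` and `comparison_key` are hypothetical, and including the surface form in every key is an assumption.

```python
def parse_levels(spec: str) -> list:
    """Turn an -l argument such as "0 1 -1" into a list of integer levels."""
    return [int(tok) for tok in spec.split()]


def comparison_key(surface: str, features: list, level: int) -> tuple:
    """Return the part of a token that would be compared at a given level.

    level 0  -> surface only (segmentation accuracy)
    level N  -> surface plus features 1..N (1 is the first feature)
    level -1 -> surface plus every feature
    """
    if level == 0:
        return (surface,)
    if level < 0:
        return (surface, *features)
    return (surface, *features[:level])


feats = ["名詞", "一般", "*", "*", "*", "*", "すもも", "スモモ", "スモモ"]
keys = [comparison_key("すもも", feats, lv) for lv in parse_levels("0 1 -1")]
```

Two tokens would then count as matching at a level when their keys at that level are equal.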
156 changes: 122 additions & 34 deletions mecab.html
@@ -1,85 +1,173 @@
Content-type: text/html
Content-type: text/html; charset=UTF-8

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML><HEAD><TITLE>Man page of MECAB</TITLE>
</HEAD><BODY>
<H1>MECAB</H1>
Section: MeCab (1)<BR>Updated: July 2006<BR><A HREF="#index">Index</A>
Section: User Commands (1)<BR>Updated: February 2019<BR><A HREF="#index">Index</A>
<A HREF="/cgi-bin/man/man2html">Return to Main Contents</A><HR>

<A NAME="lbAB">&nbsp;</A>
<H2>NAME</H2>

mecab - manual page for mecab of 0.92
mecab - Yet Another Part-of-Speech and Morphological Analyzer
<A NAME="lbAC">&nbsp;</A>
<H2>SYNOPSIS</H2>

<B>mecab</B>

[<I>options</I>] <I>files</I>
[<I>,options/</I>] <I>,files/</I>
<A NAME="lbAD">&nbsp;</A>
<H2>DESCRIPTION</H2>

MeCab: Yet Another Part-of-Speech and Morphological Analyzer
MeCab is a morphological analysis system. It reads unsegmented text, such as
Japanese sentences, from the standard input, segments it into morpheme
sequences, and writes them to the standard output together with additional
information (pronunciation, semantic information, etc.).
<A NAME="lbAE">&nbsp;</A>
<H2>COPYRIGHT</H2>

Copyright &#169; 2001-2006 Taku Kudo
<BR>
<H2>OPTIONS</H2>

Copyright &#169; 2004-2006 Nippon Telegraph and Telephone Corporation
<DL COMPACT>
<DT><B>-r</B>, <B>--rcfile</B>=<I>FILE</I><DD>
<DT><B>-r</B>, <B>--rcfile</B>=<I>,FILE/</I><DD>
use FILE as resource file
<DT><B>-d</B>, <B>--dicdir</B>=<I>DIR</I><DD>
<DT><B>-d</B>, <B>--dicdir</B>=<I>,DIR/</I><DD>
set DIR as a system dicdir
<DT><B>-u</B>, <B>--userdic</B>=<I>FILE</I><DD>
<DT><B>-u</B>, <B>--userdic</B>=<I>,FILE/</I><DD>
use FILE as a user dictionary
<DT><B>-l</B>, <B>--lattice-level</B>=<I>INT</I><DD>
lattice information level (default 0)
<DT><B>-l</B>, <B>--lattice-level</B>=<I>,INT/</I><DD>
lattice information level (DEPRECATED)
<DT><B>-D</B>, <B>--dictionary-info</B><DD>
show dictionary information and exit
<DT><B>-O</B>, <B>--output-format-type</B>=<I>,TYPE/</I><DD>
set output format type (SEE OUTPUT FORMAT)
<DT><B>-a</B>, <B>--all-morphs</B><DD>
output all morphs (default false)
<DT><B>-O</B>, <B>--output-format-type</B>=<I>TYPE</I><DD>
set output format type (wakati,none,...)
output all morphs(default false)
<DT><B>-N</B>, <B>--nbest</B>=<I>,INT/</I><DD>
output N best results (default 1)
<DT><B>-p</B>, <B>--partial</B><DD>
partial parsing mode
<DT><B>-F</B>, <B>--node-format</B>=<I>STR</I><DD>
partial parsing mode (default false)
<DT><B>-m</B>, <B>--marginal</B><DD>
output marginal probability (default false)
<DT><B>-M</B>, <B>--max-grouping-size</B>=<I>,INT/</I><DD>
maximum grouping size for unknown words (default 24)
<DT><B>-F</B>, <B>--node-format</B>=<I>,STR/</I><DD>
use STR as the user-defined node format
<DT><B>-U</B>, <B>--unk-format</B>=<I>STR</I><DD>
use STR as the user-defined unk format
<DT><B>-B</B>, <B>--bos-format</B>=<I>STR</I><DD>
use STR as the user-defined bos format
<DT><B>-E</B>, <B>--eos-format</B>=<I>STR</I><DD>
use STR as the user-defined eos format
<DT><B>-b</B>, <B>--input-buffer-size</B>=<I>INT</I><DD>
set input buffer size (default BUF_SIZE)
<DT><B>-U</B>, <B>--unk-format</B>=<I>,STR/</I><DD>
use STR as the user-defined unknown node format
<DT><B>-B</B>, <B>--bos-format</B>=<I>,STR/</I><DD>
use STR as the user-defined beginning-of-sentence format
<DT><B>-E</B>, <B>--eos-format</B>=<I>,STR/</I><DD>
use STR as the user-defined end-of-sentence format
<DT><B>-S</B>, <B>--eon-format</B>=<I>,STR/</I><DD>
use STR as the user-defined end-of-NBest format
<DT><B>-x</B>, <B>--unk-feature</B>=<I>,STR/</I><DD>
use STR as the feature for unknown word
<DT><B>-b</B>, <B>--input-buffer-size</B>=<I>,INT/</I><DD>
set input buffer size (default 8192)
<DT><B>-P</B>, <B>--dump-config</B><DD>
dump MeCab parameters
<DT><B>-C</B>, <B>--allocate-sentence</B><DD>
allocate new memory for input sentence
<DT><B>-N</B>, <B>--nbest</B>=<I>INT</I><DD>
output N best results (default 1)
<DT><B>-t</B>, <B>--theta</B>=<I>FLOAT</I><DD>
<DT><B>-t</B>, <B>--theta</B>=<I>,FLOAT/</I><DD>
set temperature parameter theta (default 0.75)
<DT><B>-o</B>, <B>--output</B>=<I>FILE</I><DD>
<DT><B>-c</B>, <B>--cost-factor</B>=<I>,INT/</I><DD>
set cost factor (default 700)
<DT><B>-o</B>, <B>--output</B>=<I>,FILE/</I><DD>
set the output file name
<DT><B>-v</B>, <B>--version</B><DD>
show the version and exit.
<DT><B>-h</B>, <B>--help</B><DD>
show this help and exit.
</DL>
<A NAME="lbAF">&nbsp;</A>
<H2>OUTPUT FORMAT</H2>

<P>
The default output format and the output formats selectable with the
<B>-O</B> option argument are defined in the resource file.
There are also a few special hard-coded formats.
<P>
<DL COMPACT>
<DT><B>&quot;&quot;</B> (null string)<DD>
disable the format setting of the resource file. This is required to set a
user-defined format from the command line.
<DT><B>wakati</B><DD>
output each node separated by a space
<DT><B>dump</B><DD>
dump all node data in one line
<DT><B>none</B><DD>
no output
<P>
</DL>
<P>

See &lt;<A HREF="https://taku910.github.io/mecab/format.html">https://taku910.github.io/mecab/format.html</A>&gt; for details of format
definition.
<P>
<A NAME="lbAG">&nbsp;</A>
<H2>DICTIONARY</H2>

<P>
See
<DL COMPACT>
<DT>&bull;<DD>
&lt;<A HREF="https://taku910.github.io/mecab/learn.html">https://taku910.github.io/mecab/learn.html</A>&gt;
<DT>&bull;<DD>
&lt;<A HREF="https://taku910.github.io/mecab/dic-detail.html">https://taku910.github.io/mecab/dic-detail.html</A>&gt;
</DL>
<P>

for details of preparation and updating of the mecab dictionary.
<P>
<A NAME="lbAH">&nbsp;</A>
<H2>EXAMPLE</H2>

<P>
Output the reading in katakana, with UniDic installed and fully configured.
<P>
<BR>&nbsp;&nbsp;&nbsp;&nbsp;$&nbsp;mecab&nbsp;-O&nbsp;&quot;&quot;&nbsp;-F&quot;%pS%f[9]&quot;&nbsp;-U&quot;%M&quot;&nbsp;-E&quot;\n&quot;&nbsp;&lt;input_file
<P>
Output the writing in hiragana, with UniDic installed and fully configured.
<P>
<BR>&nbsp;&nbsp;&nbsp;&nbsp;$&nbsp;mecab&nbsp;-O&nbsp;&quot;&quot;&nbsp;-F&quot;%pS%f[6]&quot;&nbsp;-U&quot;%M&quot;&nbsp;-E&quot;\n&quot;&nbsp;&lt;input_file&nbsp;\
<BR>

<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|nkf&nbsp;--hiragana
<P>
Output the reading in katakana, with IPADIC installed and fully configured.
<P>
<BR>&nbsp;&nbsp;&nbsp;&nbsp;$&nbsp;mecab&nbsp;-O&nbsp;yomi&nbsp;&lt;input_file
<P>
<A NAME="lbAI">&nbsp;</A>
<H2>COPYRIGHT</H2>

Copyright(C) 2001-2012 Taku Kudo
<BR>

Copyright(C) 2004-2008 Nippon Telegraph and Telephone Corporation
<A NAME="lbAJ">&nbsp;</A>
<H2>SEE ALSO</H2>

Full documentation at: &lt;<A HREF="https://taku910.github.io/mecab/">https://taku910.github.io/mecab/</A>&gt;
<P>

<HR>
<A NAME="index">&nbsp;</A><H2>Index</H2>
<DL>
<DT><A HREF="#lbAB">NAME</A><DD>
<DT><A HREF="#lbAC">SYNOPSIS</A><DD>
<DT><A HREF="#lbAD">DESCRIPTION</A><DD>
<DT><A HREF="#lbAE">COPYRIGHT</A><DD>
<DT><A HREF="#lbAE">OPTIONS</A><DD>
<DT><A HREF="#lbAF">OUTPUT FORMAT</A><DD>
<DT><A HREF="#lbAG">DICTIONARY</A><DD>
<DT><A HREF="#lbAH">EXAMPLE</A><DD>
<DT><A HREF="#lbAI">COPYRIGHT</A><DD>
<DT><A HREF="#lbAJ">SEE ALSO</A><DD>
</DL>
<HR>
This document was created by
<A HREF="/cgi-bin/man/man2html">man2html</A>,
using the manual pages.<BR>
Time: 15:16:13 GMT, July 09, 2006
Time: 06:47:11 GMT, February 24, 2019
</BODY>
</HTML>
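The first command in the EXAMPLE section of the updated man page can also be assembled programmatically. The following is a minimal sketch, assuming `mecab` is on `PATH` and a suitable dictionary is configured. The helpers `build_mecab_args` and `run_mecab` are hypothetical names introduced here. Only the flags shown in the man page (`-O ""`, `-F`, `-U`, `-E`) are used.

```python
import shlex
import subprocess


def build_mecab_args(node_format: str, unk_format: str, eos_format: str) -> list:
    """Build an argument list mirroring the man page example.

    -O "" disables the resource-file format so that the -F/-U/-E formats
    given on the command line take effect.
    """
    return [
        "mecab",
        "-O", "",
        "-F" + node_format,
        "-U" + unk_format,
        "-E" + eos_format,
    ]


def run_mecab(text: str, args: list) -> str:
    """Feed text to mecab on standard input and return its standard output.

    Not called in this sketch; it needs a working MeCab installation.
    """
    result = subprocess.run(
        args, input=text, capture_output=True, text=True, check=True
    )
    return result.stdout


# "\\n" passes a literal backslash followed by "n", as -E"\n" does in the shell.
katakana_args = build_mecab_args("%pS%f[9]", "%M", "\\n")
printable = shlex.join(katakana_args)  # shell-quoted form, useful for logging
```

Passing the arguments as a list avoids shell quoting problems with characters such as `%`, `[` and `\` in the format strings.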
2 changes: 1 addition & 1 deletion mecab/doc/dic.html
@@ -83,7 +83,7 @@ <h2>Entry format (non-conjugating words)</h2>
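<p>From left to right, the fields are</p>

<pre>
-surface form,left context ID,right context ID,cost,part of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation form,conjugation type,base form,reading,pronunciation
+surface form,left context ID,right context ID,cost,part of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation type,conjugation form,base form,reading,pronunciation
</pre>
<p>in this order.</p>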

2 changes: 1 addition & 1 deletion mecab/doc/index.html
@@ -443,7 +443,7 @@ <h3><a name="parse">Trying a first parse</a></h3>
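From left to right, the fields are </p>

<pre>
-surface form\tpart of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation form,conjugation type,base form,reading,pronunciation
+surface form\tpart of speech,POS subcategory 1,POS subcategory 2,POS subcategory 3,conjugation type,conjugation form,base form,reading,pronunciation
</pre>

<p>in this order.</p>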
2 changes: 1 addition & 1 deletion mecab/doc/learn.html
@@ -312,7 +312,7 @@ <h3>rewrite.def</h3>
<pre>
[unigram rewrite]
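-# Strip reading and pronunciation, and use POS 1,2,3,4, conjugation form, conjugation type, base form, reading
+# Strip reading and pronunciation, and use POS 1,2,3,4, conjugation type, conjugation form, base form, reading
*,*,*,*,*,*,*,* $1,$2,$3,$4,$5,$6,$7,$8
# Ignore the reading when it is missing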
*,*,*,*,*,*,* $1,$2,$3,$4,$5,$6,$7,*