Commit

docs: make documentation corresponding to the current branch.
mgolosova committed Sep 4, 2018
1 parent eec5452 commit 132b720
Showing 79 changed files with 7,961 additions and 1,838 deletions.
22 changes: 18 additions & 4 deletions Docs/build/html/_modules/index.html
@@ -37,12 +37,26 @@ <h1>All modules for which code is available</h1>
<li><a href="pyDKB/common/exceptions.html">pyDKB.common.exceptions</a></li>
<li><a href="pyDKB/common/hdfs.html">pyDKB.common.hdfs</a></li>
<li><a href="pyDKB/common/json_utils.html">pyDKB.common.json_utils</a></li>
<li><a href="pyDKB/dataflow/dkbID.html">pyDKB.dataflow.dkbID</a></li>
<li><a href="pyDKB/dataflow/communication/consumer.html">pyDKB.dataflow.communication.consumer</a></li>
<ul><li><a href="pyDKB/dataflow/communication/consumer/Consumer.html">pyDKB.dataflow.communication.consumer.Consumer</a></li>
<li><a href="pyDKB/dataflow/communication/consumer/FileConsumer.html">pyDKB.dataflow.communication.consumer.FileConsumer</a></li>
<li><a href="pyDKB/dataflow/communication/consumer/HDFSConsumer.html">pyDKB.dataflow.communication.consumer.HDFSConsumer</a></li>
<li><a href="pyDKB/dataflow/communication/consumer/StreamConsumer.html">pyDKB.dataflow.communication.consumer.StreamConsumer</a></li>
</ul><li><a href="pyDKB/dataflow/communication/messages.html">pyDKB.dataflow.communication.messages</a></li>
<li><a href="pyDKB/dataflow/communication/producer.html">pyDKB.dataflow.communication.producer</a></li>
<ul><li><a href="pyDKB/dataflow/communication/producer/FileProducer.html">pyDKB.dataflow.communication.producer.FileProducer</a></li>
<li><a href="pyDKB/dataflow/communication/producer/HDFSProducer.html">pyDKB.dataflow.communication.producer.HDFSProducer</a></li>
<li><a href="pyDKB/dataflow/communication/producer/Producer.html">pyDKB.dataflow.communication.producer.Producer</a></li>
<li><a href="pyDKB/dataflow/communication/producer/StreamProducer.html">pyDKB.dataflow.communication.producer.StreamProducer</a></li>
</ul><li><a href="pyDKB/dataflow/communication/stream.html">pyDKB.dataflow.communication.stream</a></li>
<ul><li><a href="pyDKB/dataflow/communication/stream/InputStream.html">pyDKB.dataflow.communication.stream.InputStream</a></li>
<li><a href="pyDKB/dataflow/communication/stream/OutputStream.html">pyDKB.dataflow.communication.stream.OutputStream</a></li>
<li><a href="pyDKB/dataflow/communication/stream/Stream.html">pyDKB.dataflow.communication.stream.Stream</a></li>
<li><a href="pyDKB/dataflow/communication/stream/exceptions.html">pyDKB.dataflow.communication.stream.exceptions</a></li>
</ul><li><a href="pyDKB/dataflow/dkbID.html">pyDKB.dataflow.dkbID</a></li>
<li><a href="pyDKB/dataflow/exceptions.html">pyDKB.dataflow.exceptions</a></li>
<li><a href="pyDKB/dataflow/messages.html">pyDKB.dataflow.messages</a></li>
<li><a href="pyDKB/dataflow/stage/AbstractProcessorStage.html">pyDKB.dataflow.stage.AbstractProcessorStage</a></li>
<li><a href="pyDKB/dataflow/stage/AbstractStage.html">pyDKB.dataflow.stage.AbstractStage</a></li>
<li><a href="pyDKB/dataflow/stage/processors.html">pyDKB.dataflow.stage.processors</a></li>
<li><a href="pyDKB/dataflow/stage/ProcessorStage.html">pyDKB.dataflow.stage.ProcessorStage</a></li>
</ul>

</div>
66 changes: 64 additions & 2 deletions Docs/build/html/_modules/pyDKB/common/hdfs.html
@@ -40,6 +40,8 @@ <h1>Source code for pyDKB.common.hdfs</h1><div class="highlight"><pre>
<span class="kn">import</span> <span class="nn">subprocess</span>
<span class="kn">import</span> <span class="nn">select</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">posixpath</span> <span class="k">as</span> <span class="nn">path</span>
<span class="kn">import</span> <span class="nn">tempfile</span>

<span class="kn">from</span> <span class="nn">.</span> <span class="k">import</span> <span class="n">HDFSException</span>

@@ -103,13 +105,24 @@ <h1>Source code for pyDKB.common.hdfs</h1><div class="highlight"><pre>
<span class="s2">&quot;Error message: </span><span class="si">%s</span><span class="se">\n</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="n">err</span><span class="p">))</span></div>


<div class="viewcode-block" id="movefile"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.movefile">[docs]</a><span class="k">def</span> <span class="nf">movefile</span><span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="n">dest</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; Move local file to HDFS. &quot;&quot;&quot;</span>
<span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="n">fname</span><span class="p">):</span>
<span class="n">putfile</span><span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="n">dest</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">os</span><span class="o">.</span><span class="n">remove</span><span class="p">(</span><span class="n">fname</span><span class="p">)</span>
<span class="k">except</span> <span class="ne">OSError</span><span class="p">,</span> <span class="n">err</span><span class="p">:</span>
<span class="n">sys</span><span class="o">.</span><span class="n">stderr</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="s2">&quot;(WARN) Failed to remove local copy of HDFS file&quot;</span>
<span class="s2">&quot; (</span><span class="si">%s</span><span class="s2">): </span><span class="si">%s</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="n">err</span><span class="p">))</span></div>


<div class="viewcode-block" id="getfile"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.getfile">[docs]</a><span class="k">def</span> <span class="nf">getfile</span><span class="p">(</span><span class="n">fname</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; Download file from HDFS.</span>

<span class="sd"> Return value: file name (without directory)</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="n">cmd</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&quot;hadoop&quot;</span><span class="p">,</span> <span class="s2">&quot;fs&quot;</span><span class="p">,</span> <span class="s2">&quot;-get&quot;</span><span class="p">,</span> <span class="n">fname</span><span class="p">]</span>
<span class="n">name</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">basename</span><span class="p">(</span><span class="n">fname</span><span class="p">)</span>
<span class="n">name</span> <span class="o">=</span> <span class="n">basename</span><span class="p">(</span><span class="n">fname</span><span class="p">)</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">proc</span> <span class="o">=</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">Popen</span><span class="p">(</span><span class="n">cmd</span><span class="p">,</span>
<span class="n">stdin</span><span class="o">=</span><span class="n">subprocess</span><span class="o">.</span><span class="n">PIPE</span><span class="p">,</span>
@@ -124,6 +137,32 @@ <h1>Source code for pyDKB.common.hdfs</h1><div class="highlight"><pre>
<span class="k">return</span> <span class="n">name</span></div>


<div class="viewcode-block" id="File"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.File">[docs]</a><span class="k">def</span> <span class="nf">File</span><span class="p">(</span><span class="n">fname</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; Get and open temporary local copy of HDFS file</span>

<span class="sd"> Return value: open file object (TemporaryFile).</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="n">cmd</span> <span class="o">=</span> <span class="p">[</span><span class="s2">&quot;hadoop&quot;</span><span class="p">,</span> <span class="s2">&quot;fs&quot;</span><span class="p">,</span> <span class="s2">&quot;-cat&quot;</span><span class="p">,</span> <span class="n">fname</span><span class="p">]</span>
<span class="n">tmp_file</span> <span class="o">=</span> <span class="n">tempfile</span><span class="o">.</span><span class="n">TemporaryFile</span><span class="p">()</span>
<span class="k">try</span><span class="p">:</span>
<span class="n">proc</span> <span class="o">=</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">Popen</span><span class="p">(</span><span class="n">cmd</span><span class="p">,</span>
<span class="n">stdin</span><span class="o">=</span><span class="n">subprocess</span><span class="o">.</span><span class="n">PIPE</span><span class="p">,</span>
<span class="n">stderr</span><span class="o">=</span><span class="n">subprocess</span><span class="o">.</span><span class="n">PIPE</span><span class="p">,</span>
<span class="n">stdout</span><span class="o">=</span><span class="n">tmp_file</span><span class="p">)</span>
<span class="n">check_stderr</span><span class="p">(</span><span class="n">proc</span><span class="p">)</span>
<span class="n">tmp_file</span><span class="o">.</span><span class="n">seek</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="k">except</span> <span class="p">(</span><span class="n">subprocess</span><span class="o">.</span><span class="n">CalledProcessError</span><span class="p">,</span> <span class="ne">OSError</span><span class="p">),</span> <span class="n">err</span><span class="p">:</span>
<span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">err</span><span class="p">,</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">CalledProcessError</span><span class="p">):</span>
<span class="n">err</span><span class="o">.</span><span class="n">cmd</span> <span class="o">=</span> <span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span>
<span class="n">tmp_file</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="k">raise</span> <span class="n">HDFSException</span><span class="p">(</span><span class="s2">&quot;Failed to get file from HDFS: </span><span class="si">%s</span><span class="se">\n</span><span class="s2">&quot;</span>
<span class="s2">&quot;Error message: </span><span class="si">%s</span><span class="se">\n</span><span class="s2">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="n">fname</span><span class="p">,</span> <span class="n">err</span><span class="p">))</span>
<span class="k">if</span> <span class="n">tmp_file</span><span class="o">.</span><span class="n">closed</span><span class="p">:</span>
<span class="k">return</span> <span class="kc">None</span>

<span class="k">return</span> <span class="n">tmp_file</span></div>


<div class="viewcode-block" id="listdir"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.listdir">[docs]</a><span class="k">def</span> <span class="nf">listdir</span><span class="p">(</span><span class="n">dirname</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="s1">&#39;a&#39;</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; List files and/or subdirectories of HDFS directory.</span>

@@ -172,7 +211,7 @@ <h1>Source code for pyDKB.common.hdfs</h1><div class="highlight"><pre>

<span class="c1"># We need to return only the name of the file or subdir</span>
<span class="n">filename</span> <span class="o">=</span> <span class="n">line</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span>
<span class="n">filename</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">basename</span><span class="p">(</span><span class="n">filename</span><span class="p">)</span>
<span class="n">filename</span> <span class="o">=</span> <span class="n">basename</span><span class="p">(</span><span class="n">filename</span><span class="p">)</span>
<span class="k">if</span> <span class="n">line</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s1">&#39;d&#39;</span><span class="p">:</span>
<span class="n">subdirs</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">filename</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">line</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s1">&#39;-&#39;</span><span class="p">:</span>
@@ -186,6 +225,29 @@ <h1>Source code for pyDKB.common.hdfs</h1><div class="highlight"><pre>
<span class="n">result</span> <span class="o">=</span> <span class="n">subdirs</span>

<span class="k">return</span> <span class="n">result</span></div>


<div class="viewcode-block" id="basename"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.basename">[docs]</a><span class="k">def</span> <span class="nf">basename</span><span class="p">(</span><span class="n">path</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; Return file name without path. &quot;&quot;&quot;</span>
<span class="k">if</span> <span class="n">path</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">path</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">return</span> <span class="n">path</span><span class="o">.</span><span class="n">basename</span><span class="p">(</span><span class="n">path</span><span class="p">)</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span></div>


<div class="viewcode-block" id="dirname"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.dirname">[docs]</a><span class="k">def</span> <span class="nf">dirname</span><span class="p">(</span><span class="n">path</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; Return dirname without filename. &quot;&quot;&quot;</span>
<span class="k">if</span> <span class="n">path</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">path</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">return</span> <span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">path</span><span class="p">)</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span></div>


<div class="viewcode-block" id="join"><a class="viewcode-back" href="../../../pyDKB/pyDKB.common.hdfs.html#pyDKB.common.hdfs.join">[docs]</a><span class="k">def</span> <span class="nf">join</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">filename</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot; Join path and filename. &quot;&quot;&quot;</span>
<span class="k">if</span> <span class="n">path</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">path</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">if</span> <span class="n">filename</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">filename</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span>
<span class="k">return</span> <span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">filename</span><span class="p">)</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span></div>
</pre></div>
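
The `File` helper added in this diff streams the output of `hadoop fs -cat` into a `tempfile.TemporaryFile` and rewinds it before returning. A minimal, self-contained sketch of that pattern is given here; it is not the pyDKB implementation. The function name `open_command_output` is hypothetical, the Python interpreter stands in for the `hadoop` CLI so the example runs without a cluster, and a return-code check replaces pyDKB's `check_stderr`, whose body is not shown in this diff.

```python
import subprocess
import sys
import tempfile


def open_command_output(cmd):
    """Run `cmd` and return its stdout as an open, rewound temporary file.

    Hypothetical helper, for illustration only.
    """
    tmp_file = tempfile.TemporaryFile()
    try:
        proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, stdout=tmp_file)
        # communicate() waits for the process to finish; stdout goes straight
        # to tmp_file, so only stderr is captured here.
        _, err = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError("Command failed (%s): %s"
                               % (" ".join(cmd), err.decode(errors="replace")))
    except (OSError, RuntimeError):
        # Do not leak the temporary file when the command fails.
        tmp_file.close()
        raise
    # Rewind so the caller reads from the beginning.
    tmp_file.seek(0)
    return tmp_file


# Stand-in for ["hadoop", "fs", "-cat", fname] (assumption: no cluster here).
cmd = [sys.executable, "-c", "print('hello from HDFS stand-in')"]
result = open_command_output(cmd)
content = result.read().decode().strip()
result.close()
print(content)
```

Because the file comes from `tempfile.TemporaryFile`, it is removed automatically once it is closed, so the caller only has to close the returned object.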

</div>
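
As rendered in this diff, `basename`, `dirname` and `join` each take a parameter named `path`, which shadows the module alias from `import posixpath as path`; inside those functions `path.basename(path)` would be looked up on the string argument rather than on the module. A minimal sketch of the same three helpers that sidesteps the name clash follows. The parameter names `hdfs_path` and `dirpath` are illustrative and not taken from pyDKB.

```python
import posixpath


def basename(hdfs_path):
    """Return the file name without its directory part."""
    if hdfs_path is None:
        hdfs_path = ''
    return posixpath.basename(hdfs_path).strip()


def dirname(hdfs_path):
    """Return the directory part without the file name."""
    if hdfs_path is None:
        hdfs_path = ''
    return posixpath.dirname(hdfs_path).strip()


def join(dirpath, filename):
    """Join a directory path and a file name with '/'."""
    if dirpath is None:
        dirpath = ''
    if filename is None:
        filename = ''
    return posixpath.join(dirpath, filename).strip()


print(basename('/user/dkb/data/file.json'))  # file.json
print(dirname('/user/dkb/data/file.json'))   # /user/dkb/data
print(join('/user/dkb', 'file.json'))        # /user/dkb/file.json
```

Using `posixpath` rather than `os.path` keeps HDFS paths separated by `/` regardless of the platform the client runs on.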

0 comments on commit 132b720
