<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>a.Smiley</title>
<link href="asmiley.github.io/atom.xml" rel="self"/>
<link href="asmiley.github.io/"/>
<updated>2015-02-01T22:56:19-05:00</updated>
<id>asmiley.github.io</id>
<author>
<name>Alex Smiley</name>
<email>[email protected]</email>
</author>
<entry>
<title>Natural Language Processing in the Cloud</title>
<link href="asmiley.github.io/social-media-mining/2015/02/01/nltk-in-the-cloud/"/>
<updated>2015-02-01T00:00:00-05:00</updated>
<id>asmiley.github.io/social-media-mining/2015/02/01/nltk-in-the-cloud</id>
<content type="html"><p>In recent years cloud-based IDEs have become increasingly popular. Services like <a href="https://koding.com/R/asmiley">Koding</a> and <a href="https://c9.io/">Cloud9</a> provide web-based development environments that make it easy to access your projects wherever you are and even collaborate with other users in real time. Both services give you your own virtual machine (VM) running Ubuntu, preconfigured with Python, Node.js, Ruby, PHP, and more, so that you can jump in and start being productive immediately.</p>
<p><a href="http://www.nltk.org/">NLTK</a>, the Natural Language Toolkit, is a collection of natural language processing libraries and programs for Python. I&rsquo;ll show you how simple it is to install NLTK on your VM so you can get started learning about natural language processing without having to configure it on multiple devices. In addition, I&rsquo;ll walk through installing the NLTK Data packages used in <a href="http://www.nltk.org/book/">Natural Language Processing with Python</a> so that you can follow along with this great resource.</p>
<p>We&rsquo;ll be using Koding because that&rsquo;s the service I&rsquo;m more familiar with and because it provides the most free disk space of the cloud IDEs that I&rsquo;ve investigated. One important caveat: Koding&rsquo;s free accounts are limited to 3 GB of disk space. While this is enough to get started with NLTK and download the data libraries that we&rsquo;ll use to follow along with the book, you may run into problems trying to install additional packages down the road.</p>
<h3>1. Set up your Koding account</h3>
<p>To begin, you&rsquo;ll have to set up your <a href="https://koding.com/R/asmiley">Koding</a> account. Head on over to their site, register, and complete the email confirmation process. Once this is complete you should be able to log into your workspace. The first time you log in you will be prompted to initialize your VM. This process may take a minute or two. </p>
<p>Take a moment to acquaint yourself with the Koding interface. On the left-hand side you&rsquo;ll find menus, settings, and the file tree for your workspace. On the right side you&rsquo;ll see the text editor and below that is the terminal. We&rsquo;ll be working in the terminal to get NLTK installed.</p>
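<p>Since disk space is the main constraint on a free account, it can be handy to check how much is left before installing anything. Here&rsquo;s a minimal sketch you could paste into the <code>python</code> prompt in the terminal. It relies on the standard library&rsquo;s <code>os.statvfs</code>, which is Unix-only (that should be fine on the Ubuntu VM). <code>free_gigabytes</code> is an illustrative helper name, not part of any library.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">import os

def free_gigabytes(path):
    """Return the free space, in binary gigabytes, on the filesystem holding path."""
    stats = os.statvfs(path)  # Unix-only call; assumed available on the Ubuntu VM
    # f_bavail = blocks available to non-root users, f_frsize = bytes per block
    return stats.f_bavail * stats.f_frsize / float(1024 ** 3)

print("Free space: {0:.2f} GB".format(free_gigabytes(os.path.expanduser("~"))))</code></pre></div>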
<h3>2. Install pip</h3>
<p>pip is a package manager for Python that will make it extremely easy to install NLTK and other Python packages. pip does not come pre-installed on our VM, so we&rsquo;ll have to go down to our terminal and enter:</p>
<div class="highlight"><pre><code class="language-ruby" data-lang="ruby">sudo apt-get install python-pip</code></pre></div>
<p>You&rsquo;ll be asked if you want to continue. Type <code>Y</code> and press <code>Enter</code>.
<br />
<br /></p>
<p><img src="/assets/images/nltk-koding/pip-confirmation.jpg" alt="Confirm pip download">
<br /></p>
<h3>3. Install NLTK</h3>
<p>Now that pip has been installed, we can install NLTK with a single command in our terminal. Go ahead and enter:</p>
<div class="highlight"><pre><code class="language-ruby" data-lang="ruby">sudo pip install -U nltk</code></pre></div>
<p>As the install runs you&rsquo;ll probably receive a few warnings. You can ignore these. </p>
<p>To test that NLTK has been properly installed, enter <code>python</code> in your terminal. Once Python is running, enter: </p>
<div class="highlight"><pre><code class="language-ruby" data-lang="ruby">import nltk</code></pre></div>
<p>If no errors are returned, NLTK has been successfully installed. You should see something like this:
<br />
<br /></p>
<p><img src="/assets/images/nltk-koding/nltk-test.jpg" alt="Test that NLTK has been installed correctly">
<br /></p>
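<p>If you&rsquo;d like a slightly more explicit check than a silent import, the following minimal sketch prints the installed version, or a hint when the import fails. It assumes only that NLTK exposes a <code>__version__</code> attribute; <code>message</code> is just an illustrative variable name.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"># Report the installed NLTK version, or a hint if the import fails.
try:
    import nltk
    message = "NLTK version: {0}".format(nltk.__version__)
except ImportError:
    message = "NLTK is not installed; try re-running the pip command above."

print(message)</code></pre></div>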
<h3>4. Download NLTK Data packages</h3>
<p>NLTK comes with a number of data sources for us to work with. However, we&rsquo;ll have to download them before we can start playing around with them. In your terminal, now that you have imported NLTK into Python, enter:</p>
<div class="highlight"><pre><code class="language-ruby" data-lang="ruby">nltk.download()</code></pre></div>
<p>This will display the NLTK Downloader interface. It should look something like this:
<br />
<br />
<img src="/assets/images/nltk-koding/downloader.jpg" alt="NLTK Data download interface">
<br />
At this point, type <code>d</code> and press <code>Enter</code>. It will ask for the identifier of the package you want to download. We want to download the resources used in <a href="http://www.nltk.org/book/">Natural Language Processing with Python</a>, so we&rsquo;ll type <code>book</code> and press <code>Enter</code>. Finally, to exit the downloader, type <code>q</code> and press <code>Enter</code> one more time.</p>
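<p>If you&rsquo;d rather skip the interactive menu, <code>nltk.download</code> also accepts a package identifier directly. Here&rsquo;s a minimal sketch, assuming the same <code>book</code> identifier used above; <code>download_book_data</code> and <code>COLLECTION_ID</code> are hypothetical names chosen for illustration.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">COLLECTION_ID = "book"  # the identifier typed at the downloader prompt above

def download_book_data():
    """Download the book collection without going through the interactive menu."""
    import nltk  # imported here so the helper can be defined before NLTK is loaded
    return nltk.download(COLLECTION_ID)</code></pre></div>
<p>Calling <code>download_book_data()</code> should fetch the same files as the menu-driven steps, so you only need one of the two approaches.</p>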
<p>Now, before we can actually do anything with these texts we&rsquo;ll have to import them into Python. To do this, enter:</p>
<div class="highlight"><pre><code class="language-ruby" data-lang="ruby">from nltk.book import *</code></pre></div>
<p>You should see:
<br />
<br />
<img src="/assets/images/nltk-koding/downloaded.jpg" alt="Terminal once the NLTK Data book library has been downloaded">
<br /></p>
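<p>As a quick first experiment, here is a hedged sketch based on the opening examples of the book: it counts the tokens in <code>text1</code> (Moby Dick) and shows each occurrence of a word in context. <code>first_look</code> is an illustrative name, and the function returns <code>False</code> rather than crashing if NLTK or its data is missing.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">def first_look(word="monstrous"):
    """Print the size of text1 and a concordance for word; return False if NLTK or its data is unavailable."""
    try:
        from nltk.book import text1  # Moby Dick; this import loads the book's sample texts
    except (ImportError, LookupError):
        return False
    print("text1 contains {0} tokens".format(len(text1)))
    text1.concordance(word)  # shows each occurrence with its surrounding context
    return True

first_look()</code></pre></div>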
<h3>That&rsquo;s it!</h3>
<p>You now have NLTK installed on your own VM in the cloud and can access it on any computer. Have fun learning about natural language processing with Python! </p>
<p>If you run into any problems or have any additional questions, feel free to leave a comment below.
<br /></p>
</content>
</entry>
</feed>