<!DOCTYPE html>
<html>
<head>
<link rel="canonical" href="https://hardmath123.github.io/socket-science.html"/>
<link rel="stylesheet" type="text/css" href="/static/base.css"/>
<title>It's not socket science: Part I - Comfortably Numbered</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<link rel="alternate" type="application/rss+xml" title="Comfortably Numbered" href="/feed.xml" />
<!--
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script>
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$']]}
});
</script>
-->
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.css" integrity="sha384-Um5gpz1odJg5Z4HAmzPtgZKdTBHZdw8S29IecapCSB31ligYPhHQZMIlWLYQGVoc" crossorigin="anonymous">
<script defer src="https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.js" integrity="sha384-YNHdsYkH6gMx9y3mRkmcJ2mFUjTd0qNQQvY9VYZgQd7DcN7env35GzlmFaZ23JGp" crossorigin="anonymous"></script>
<script defer src="https://cdn.jsdelivr.net/npm/[email protected]/dist/contrib/auto-render.min.js" integrity="sha384-vZTG03m+2yp6N6BNi5iM4rW4oIwk5DfcNdFfxkk9ZWpDriOkXX8voJBFrAO7MpVl" crossorigin="anonymous"></script>
<script>
document.addEventListener("DOMContentLoaded", function() {
renderMathInElement(document.body, {
// customised options
// • auto-render specific keys, e.g.:
delimiters: [
{left: '$$', right: '$$', display: true},
{left: '$', right: '$', display: false},
{left: '\\begin{align}', right: '\\end{align}', display: true},
{left: '\\(', right: '\\)', display: false},
{left: '\\[', right: '\\]', display: true}
],
// • rendering keys, e.g.:
throwOnError : false
});
});
</script>
</head>
<body>
<header id="header">
<script src="static/main.js"></script>
<div>
<a href="/"><span class="left-word">Comfortably</span> <span class="right-word">Numbered</span></a>
</div>
</header>
<article id="postcontent" class="centered">
<section>
<h1>It's not socket science: Part I</h1>
<center><em><p>A hands-on introduction to networking.</p>
</em></center>
<h4>Saturday, November 29, 2014 · 6 min read</h4>
<p>I like protocols. The Internet is like being at a party, and trying to have a
conversation with the person across the room by passing post-it notes. Except
you can only fit a couple of words onto a post-it note (of which you have, of
course, a limited supply). And people take as long as they want to pass along
the note. Or they could just forget about it. Some of them might read the
notes, others may replace your notes with their own. And the person across the
room only speaks Finnish.</p>
<p>Despite these hostile conditions, the Internet works. It works because we have
protocols—rules that computers in a network obey so that they can all get
along.</p>
<p>And you can understand these protocols. It’s not rocket science: it’s socket
science! (I promise that was the only pun in this post.)</p>
<p>Protocols fit onto other protocols. The lowest-level protocol you should really
care about is TCP: the Transmission Control Protocol. TCP handles taking a
large message, dividing it among many post-it notes, and then reassembling the
message at the other end. If some notes get lost along the way, TCP sends
replacements. Each post-it is called a “packet”.</p>
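<p>Here is a toy sketch of that split-and-reassemble idea. It is not how the kernel implements TCP: the names <code>split_into_packets</code> and <code>reassemble</code> are made up for illustration, and real TCP adds acknowledgements, retransmission, and flow control on top.</p>

```python
import random

def split_into_packets(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Cut a message into (sequence number, chunk) pairs of at most `size` bytes."""
    return [(offset, message[offset:offset + size])
            for offset in range(0, len(message), size)]

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put chunks back in order by sequence number, ignoring duplicates."""
    unique = dict(packets)  # a duplicated packet overwrites its identical twin
    return b"".join(unique[seq] for seq in sorted(unique))

message = b"Hello from across the room!"
packets = split_into_packets(message, 8)
random.shuffle(packets)        # notes may arrive in any order
packets.append(packets[0])     # and one of them may show up twice
assert reassemble(packets) == message
```

<p>Numbering each chunk by its byte offset is what lets the receiver sort the notes and notice gaps.</p>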
<p>Of course, TCP fits on top of another protocol, the Internet Protocol (IP, as
in IP Address), which handles even messier things like ensuring a packet gets
passed on from its source to its destination. There are other protocols that
live on IP: UDP is like TCP, except it doesn’t care whether packets get there.
If you’re writing a video conferencing service, you don’t need to ensure that
each packet makes it, because by the time a lost packet could be resent it would be out of date. So you use UDP.</p>
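<p>As a rough illustration, this sketch sends a single UDP datagram between two sockets on the same machine. The helper name <code>udp_roundtrip</code> is hypothetical; the point is that there is no connection step and nothing is resent if the datagram goes missing.</p>

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram to a receiver on this machine and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP never promises delivery, so don't wait forever
        sender.sendto(payload, receiver.getsockname())  # no handshake, just send
        data, _addr = receiver.recvfrom(4096)
        return data
```

<p>On the loopback interface the datagram will almost always arrive; across a real network the <code>recvfrom</code> call could simply time out.</p>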
<p>TCP is handled at the kernel level, so when you send out a message, it’s
wrapped in TCP automatically. In fact, you need administrator privileges on
UNIX to send out “raw” packets (there are occasionally reasons to do this).</p>
<p>To create a TCP connection, we use <em>sockets</em>. Most languages provide socket
bindings: I’ll use Python’s API (which is very similar to C’s), but Node.js’s
<code>net</code> module does the same thing.</p>
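<p>A minimal sketch of what those bindings look like in Python: a one-shot echo server and a client talking over the loopback interface. The function names are invented for this example, and it assumes Python 3.8 or newer for <code>socket.create_server</code>.</p>

```python
import socket
import threading

def echo_once(server: socket.socket) -> None:
    """Accept one connection and send back whatever the client wrote."""
    conn, _addr = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))

def tcp_echo(message: bytes) -> bytes:
    """Start a throwaway echo server, send it a message, and return the reply."""
    with socket.create_server(("127.0.0.1", 0)) as server:  # port 0: any free port
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=2) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
        return reply
```

<p>Notice that neither side mentions packets: the program writes bytes into the socket and the kernel takes care of the TCP part.</p>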
<p>A quick way to get a socket working is to use <code>netcat</code> (it’s called <code>nc</code> on
most UNIX shells). There’s also <code>telnet</code>, but telnet listens for its own
protocol (for instance, if your connection sends a specific string, <code>telnet</code>
will automatically send back your screen width).</p>
<p>Alternatively, you can use <code>ncat</code>, which comes bundled with <code>nmap</code>. I prefer
<code>ncat</code>; I’ll explain why in a little bit. This command-line utility is pretty
much a UNIX stream that sends out stdin by TCP, and writes incoming messages to
stdout.</p>
<p>Once you have TCP working, you can do all sorts of stuff, because now message
length and integrity have been abstracted away. For example, you explore the web
by abiding by the HyperText Transfer Protocol (HTTP, as in <a href="http://something">http://something</a>).</p>
<p>In fact, it’s worth trying right now. HTTP has the concept of a “request” and a
“response”. HTTP requests look sort of like <code>GET /index.html HTTP/1.1</code>. GET is
the “method”; you can also POST, PUT, or DELETE (or even
<a href="http://en.wikipedia.org/wiki/Hyper_Text_Coffee_Pot_Control_Protocol">BREW</a>).</p>
<p><code>/index.html</code> is the path (the stuff you would type after <code>www.google.com</code> in
the address bar), and <code>HTTP/1.1</code> is the protocol (you could, in theory, have
another protocol running—HTTP 2.0 is being drafted as I write this).</p>
<p>Let’s do it. Open up a shell and try <code>nc google.com 80</code>. You’re now connected
to Google. Try sending it <code>GET /index.html HTTP/1.1</code>. You’ll need to hit
“enter” <em>twice</em> (it’s part of the protocol!).</p>
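<p>To make the “enter twice” rule concrete, here is a sketch of the bytes that go over the wire. The helper <code>build_request</code> is hypothetical, and the <code>Host</code> header is an addition of this example: HTTP/1.1 requires it, even though some servers tolerate a request without one.</p>

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 request as raw bytes."""
    lines = [
        f"{method} {path} HTTP/1.1",  # the request line: method, path, protocol
        f"Host: {host}",              # assumption: added here because HTTP/1.1 requires it
        "",                           # the empty line that ends the headers...
        "",                           # ...which is the second "enter"
    ]
    return "\r\n".join(lines).encode("ascii")

print(build_request("GET", "/index.html", "google.com"))
```

<p>Each line ends in a carriage return plus newline, and the blank line tells the server that the headers are finished.</p>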
<p>You’ll be greeted by a huge mess of symbols, which is the HTML code that makes
up the Google homepage. Note that the connection doesn’t close, so you can send
another request if you want. In fact, let’s do that: if you only want to check
out the protocol, you can send a <code>HEAD</code> request, which is identical to <code>GET</code>
except that the server sends back only the headers, not the message body. In the real world, <code>HEAD</code> is
useful to efficiently check if a file exists on a server. If you try <code>HEAD
/index.html HTTP/1.1</code>, you get:</p>
<pre><code>HTTP/1.1 200 OK
Date: Sat, 29 Nov 2014 19:40:02 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=67a496862b9f3c29:FF=0:TM=1417290002:LM=1417290002:S=8UjQDBRWYSa1y9tA; expires=Mon, 28-Nov-2016 19:40:02 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=NnwRLRx4JVz-x3lWFTSxzV_ZxLi_TLVmbw8oDifyhzT2iuWwQ0mVveS15bE8jI28kI-p8cMIEXmmwDmwlxojTY07azz6XzcmeRD7mHerDLuVjPwjV180AxNqWBHqJrfp; expires=Sun, 31-May-2015 19:40:02 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic,p=0.02
Transfer-Encoding: chunked
</code></pre><p>This looks messy, but it really isn’t. You can see how the protocol works: you
start with the protocol name and <code>200 OK</code>, which is the <em>response code</em>. You
are probably familiar with another response code, <code>404 Not Found</code>.</p>
<p>Then each line begins with some header, a colon, and then information. For
instance, you get the date, you get the content type (<code>text/html</code>), etc.</p>
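<p>The structure is regular enough to pull apart in a few lines. This is only a sketch (the function name <code>parse_response_head</code> is made up, and a real client would use a proper HTTP library), run against a shortened copy of the response above.</p>

```python
def parse_response_head(head: str) -> tuple[str, int, str, dict[str, list[str]]]:
    """Split a response head into protocol version, status code, reason, and headers."""
    status_line, *header_lines = head.strip().splitlines()
    version, code, reason = status_line.split(" ", 2)
    headers: dict[str, list[str]] = {}
    for line in header_lines:
        name, _, value = line.partition(":")  # split on the first colon only
        # A header such as Set-Cookie can repeat, so keep a list per name.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return version, int(code), reason, headers

sample = """HTTP/1.1 200 OK
Date: Sat, 29 Nov 2014 19:40:02 GMT
Content-Type: text/html; charset=ISO-8859-1
Server: gws
"""
version, code, reason, headers = parse_response_head(sample)
print(version, code, reason, headers["content-type"])
```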
<p>The <code>Set-Cookie</code> headers instruct the browser to save those values in a local file.
When the web browser sends further requests, the protocol instructs it to send
the saved cookies as a part of the request. This lets websites track you—and
is the reason Gmail keeps you logged in even when you close the window.</p>
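<p>Roughly, the round trip looks like this. The sketch uses Python’s standard <code>http.cookies</code> module; the helper <code>cookie_header_for</code> and the cookie value are invented for illustration, and a real browser also checks the domain, path, and expiry before sending anything back.</p>

```python
from http.cookies import SimpleCookie

def cookie_header_for(set_cookie_value: str) -> str:
    """Turn the value of a Set-Cookie header into the Cookie header a client sends back."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)  # parses the name, value, and attributes like path
    pairs = [f"{name}={morsel.value}" for name, morsel in jar.items()]
    return "Cookie: " + "; ".join(pairs)

# Hypothetical cookie, much shorter than the ones Google sent above.
print(cookie_header_for("session=abc123; path=/; HttpOnly"))
```

<p>Only the name and value travel back to the server; attributes such as <code>path</code> and <code>expires</code> are instructions for the browser.</p>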
<p>So far, so good. One thing that may have bothered you was the <code>80</code> you typed
as an argument for <code>nc</code>. That’s the <em>port number</em>. The idea is that a computer
can serve multiple websites by having multiple active sockets. To allow this,
TCP has a port argument: your computer has 65,536 ports and it delivers packets
to the right one.</p>
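<p>A small sketch of that idea: one machine, one IP address, two listening sockets told apart only by their port numbers. The function name <code>two_listeners</code> is hypothetical, and it assumes Python 3.8 or newer for <code>socket.create_server</code>.</p>

```python
import socket

def two_listeners() -> tuple[int, int]:
    """Open two listening sockets on the same address and report their ports."""
    with socket.create_server(("127.0.0.1", 0)) as first, \
         socket.create_server(("127.0.0.1", 0)) as second:
        # Asking for port 0 lets the OS hand out any free port.
        return first.getsockname()[1], second.getsockname()[1]

port_a, port_b = two_listeners()
print(port_a, port_b)  # two different numbers, each below 2**16 == 65536
```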
<p>As I said, <code>ncat</code> comes bundled with <code>nmap</code>. <code>nmap</code> is a <em>port scanner</em>, a
program that probes the ports of a computer to see if anything is listening
(this is one of those places where raw sockets make things much more
efficient). Running port scans lets an attacker find vulnerable programs
running, and then exploit them (for instance, test servers or outdated services
that have known security issues are easy targets).</p>
<p>Don’t run port scans on computers you don’t own. <code>nmap</code> is designed to be used
by professional network security people, who keep huge sites like Google up and
running safely.</p>
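<p>To see the basic idea without bothering anyone, here is a sketch that checks ports on your own machine only. It makes a full TCP connection to each port, which is the slow, simple approach rather than the raw-socket techniques <code>nmap</code> can use; the function name <code>open_ports</code> is made up for this example.</p>

```python
import socket

def open_ports(ports, host="127.0.0.1", timeout=0.2):
    """Return the ports on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            probe.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if probe.connect_ex((host, port)) == 0:
                found.append(port)
    return found

print(open_ports(range(8000, 8010)))  # whatever happens to be listening locally
```

<p>The default host is the loopback address on purpose: leave it that way unless the target is a machine you own.</p>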
<p><code>80</code> is the conventional port for HTTP, but you can serve a website on any
port. To access it from a web browser, you append the port after the domain
name, like <code>http://example.com:81/index.html</code>.</p>
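<p>Python’s <code>urllib.parse</code> can show how a client picks the port out of a URL. This is a sketch: <code>port_for</code> and <code>DEFAULT_PORTS</code> are names invented here, and the table only covers the two schemes browsers assume by default.</p>

```python
from urllib.parse import urlsplit

# Ports a browser assumes when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}

def port_for(url: str) -> int:
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]

print(port_for("http://example.com:81/index.html"))
print(port_for("http://example.com/index.html"))
```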
<p>The other thing that may have bothered you was how the computer knew who
<code>google.com</code> was. The answer to that is another protocol: the DNS protocol,
which is used to ask a DNS server to resolve a domain name (like Google.com)
into an IP address. You can try this with the <code>host</code> command:</p>
<pre><code>$ host google.com
google.com has address 74.125.239.135
google.com has address 74.125.239.129
google.com has address 74.125.239.134
google.com has address 74.125.239.131
google.com has address 74.125.239.133
google.com has address 74.125.239.142
google.com has address 74.125.239.128
google.com has address 74.125.239.137
google.com has address 74.125.239.136
google.com has address 74.125.239.130
google.com has address 74.125.239.132
google.com has IPv6 address 2607:f8b0:4005:800::1009
google.com mail is handled by 30 alt2.aspmx.l.google.com.
google.com mail is handled by 40 alt3.aspmx.l.google.com.
google.com mail is handled by 10 aspmx.l.google.com.
google.com mail is handled by 20 alt1.aspmx.l.google.com.
google.com mail is handled by 50 alt4.aspmx.l.google.com.
</code></pre><p>Once you choose an IP address, the protocol lets you track down that computer
and establish a connection.</p>
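<p>The same lookup is available from code. This sketch asks the operating system’s resolver via <code>socket.getaddrinfo</code>, restricted to IPv4 for brevity; the function name <code>resolve</code> is hypothetical, and the addresses you get for a real domain will differ from the 2014 output above.</p>

```python
import socket

def resolve(name: str) -> list[str]:
    """Return the IPv4 addresses the system resolver gives for a host name."""
    infos = socket.getaddrinfo(name, None,
                               family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})

# Usage (needs a network connection): resolve("google.com")
print(resolve("127.0.0.1"))  # a literal address resolves to itself, no DNS query needed
```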
<p>Now, you’re often told to always use HTTPS, because it’s secure. You can
probably already tell how insecure HTTP is: any guest at the party can read
your post-it packets and know everything.</p>
<p>A fun thing to try is to run <code>tcpdump</code>: it’ll dump packets from your computer
as they’re sent out or received (you may like the <code>-X</code> option). Mess around
with the options a bit, and you can read the raw contents of HTTP packets as
you surf the web. You’ll need to be an administrator to run it, but if you
think about it, that’s probably a good thing.</p>
<p>Anyhow, back to HTTPS: it’s just HTTP, except sitting on top of another
protocol called SSL (or TLS—it’s sort of complicated). SSL handles finding an
encryption scheme that both you and the server agree is secure, negotiating a
shared secret key, and then sending encrypted messages. It also lets you
authenticate people by passing around certificates that are cryptographically
signed by authorities.</p>
<p>HTTPS runs on port 443, which is the other default port that your browser
doesn’t need to be told. You can try the above HTTP fun on port 443: most
websites will get mad at you and kill the connection.</p>
<p>This is the reason I like <code>ncat</code>: the <code>--ssl</code> option wraps your connection in
the SSL negotiations and encrypts what you send (you can’t viably do this
manually). Try <code>ncat --ssl google.com 443</code>: things should work as normal now,
but <code>tcpdump</code> will show you gibberish.</p>
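<p>In Python, the equivalent wrapping is done by the standard <code>ssl</code> module. The sketch below is one way to send a <code>HEAD</code> request over TLS; the helper names are invented, and the <code>Host</code> and <code>Connection: close</code> headers are additions of this example so that the server knows which site is meant and hangs up when it is done.</p>

```python
import socket
import ssl

def make_tls_context() -> ssl.SSLContext:
    """A context that verifies the server's certificate and host name."""
    return ssl.create_default_context()

def https_head(host: str, path: str = "/") -> str:
    """Send a HEAD request over TLS and return the raw response text."""
    request = f"HEAD {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    context = make_tls_context()
    with socket.create_connection((host, 443), timeout=5) as raw:
        # wrap_socket runs the handshake; after that, everything sent is encrypted.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(request.encode("ascii"))
            chunks = []
            while chunk := tls.recv(4096):  # read until the server closes the connection
                chunks.append(chunk)
    return b"".join(chunks).decode("iso-8859-1")

# Usage (needs a network connection): print(https_head("google.com"))
```

<p>The request is the same plain text as before; only the socket underneath it has changed.</p>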
<p>At this point we’ve cleared almost all the hurdles in our initial party analogy,
so I’m going to take a break.</p>
<p>In <a href="socket-science-2.html">Part II</a> of this series, we’ll explore some more
protocols and write clients using Python. We’ll talk about how protocols are
established, and why it’s important that it works the way it does right now.</p>
<p>In Part III (yet-to-be-written), we’ll explore four recent showstopper
exploits, all of which make plenty of sense once you understand protocols: Goto
Fail, Heartbleed, Shellshock, and POODLE.</p>
</section>
<div id="comment-breaker">◊ ◊ ◊</div>
</article>
<footer id="footer">
<div>
<ul>
<li><a href="https://github.com/kach">
Github</a></li>
<li><a href="feed.xml">
Subscribe (RSS feed)</a></li>
<li><a href="https://twitter.com/hardmath123">
Twitter</a></li>
<li><a href="https://creativecommons.org/licenses/by-nc/3.0/deed.en_US">
CC BY-NC 3.0</a></li>
</ul>
</div>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-46120535-1', 'hardmath123.github.io');
ga('require', 'displayfeatures');
ga('send', 'pageview');
</script>
</footer>
</body>
</html>