-
Notifications
You must be signed in to change notification settings - Fork 0
/
II.PRE-PRODUCTION.txt
159 lines (91 loc) · 6.77 KB
/
II.PRE-PRODUCTION.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
II. Pre-Production: Laying the Foundation.
* * * * *
Eucalyptus, in a minimal proof-of-concept, is simple.
Eucalyptus, running in production, is complicated.
Why? Because Eucalyptus is a distributed system that manages virtual compute and storage resources across a network. And managing distributed systems is complicated. It's not necessarily hard -- <em>if you do the right things to manage that complexity</em>.
That's what this whole guide is about. This part, in particular, is about doing the right things <em>before</em> you install Eucalyptus.
Let's start with reviewing what the components do.
* * * * *
A. Component Review
(Be sure to review the official documentation can be found at: https://www.eucalyptus.com/docs/eucalyptus/3.4/index.html#install-guide/euca_components.html)
Here's a brief review of what the different Eucalyptus components do, and why they're important.
1. The Cloud Contoller (CLC)
The Cloud Controller (CLC for short) is the brain of Eucalyptus. When you ask Eucalyptus to do something -- "hey, give me five instances, please" -- you're making a web services request to the CLC. All service requests go through the CLC, and state information about all Eucalyptus services live in a database on the CLC. Which means that keeping the CLC happy -- and most importantly, keeping the CLC's database happy -- is extremely important.
As of Eucalyptus 4.0, the CLC consists of two logical components: the CLC Frontend, which is the listener for all web services; and the CLC Backend, which contains the CLC database and the JDBC listeners. These components can now be put on different physical hardware, and multiple CLC Frontends can be built to serve a single CLC Backend.
When building CLC components for your production environment, you will want to set aside plenty of storage and plenty of memory. A slow CLC won't affect your cloud instances themselves, but it will affect your ability to send commands to those instances. If the CLC is juggling state information about 1000 instances in swap space, it'll take longer to get the five instances you asked for -- maybe a lot longer.
Thus, in general, CLC Frontends should be optimized to serve web traffic, and the CLC Backend should be optimized for database performance. We'll discuss this more
[TODO: Something here about keeping the Java heap appropriately sized -- do we have scoping guides here? Chris recommends setting Java heap to be exactly half of the physical memory available on the CLC. Need to vet that with Ken et al.]
2. The Node Controller (NC)
The Node Controller (NC for short) runs the actual cloud instances. The more instances you need, the more NCs you will need.
When building NCs for your production environment, memory and processing power are key. Storage is also important, but how much you needs depends upon your projected balance of usage between instance storage and elastic block storage. We'll talk more about storage optimization later.
Another key thing to be aware of: instances on an NC are all competing for processor time. If an NC has ten instances, and all ten instances have the CPU pegged, all ten instances will suffer degraded performance. For this reason, it's important to understand the performance characteristics of workloads likely to run on your cloud. We'll talk more about instance placement optimization strategies later.
3. The Cluster Controller (CC)
The Cluster Controller is a bit tricky;
[FIXME]
4. The Storage Controller (SC)
[FIXME]
5. The Object Storage Gateway (OSG)
[FIXME]
Now let's review how the components fit together into various reference architectures.
* * * * *
B. Reference Architectures
This is really just a pointer to this set of exceptional guidelines for Eucalyptus reference architectures:
https://eucalyptus.atlassian.net/wiki/display/EUCA/Eucalyptus+Reference+Architectures
[FIXME]
Now let's talk more about storage in Eucalyptus.
* * * * *
C. Building your storage strategy.
There are three fundamental types of storage you'll need to consider for your Eucalyptus installation:
* Object storage is managed by Walrus [FIXME, what's the name of that new thing?], and is used to store images and snapshots, and can be used by Eucalyptus users for their own object storage needs.
* Block storage is managed by the Storage Controller (SC), and is used for persistent storage in the form of EBS volumes.
* Ephemeral storage is part of the instance itself, and lives on the Node Controller (NC) where the instance runs. When the instance shuts down, the ephemeral storage goes away.
[FIXME: integrate this: https://www.eucalyptus.com/blog/2013/03/18/eucalyptus-walrus-storage-considerations]
[FIXME: more content here]
Now let's talk more about the network environment around a Eucalyptus installation.
* * * * *
D. Building your network strategy.
[FIXME, way later, this is probably last]
Now let's talk more about how you should be monitoring your Eucalyptus cloud.
* * * * *
E. Building your monitoring strategy.
[FIXME]
Now let's talk about how you should manage your Eucalyptus configuration.
* * * * *
F. Building your configuration management strategy.
[FIXME]
Now let's talk about setting up a backup strategy for Eucalyptus.
* * * * *
G. Building your backup strategy.
[FIXME]
Finally, let's talk about testing your Eucalyptus installation before you roll it out to users.
* * * * *
H. Acceptance Testing: Why it Matters and How To Do It.
[FIXME]
* * * * *
By now, you should have a cloud up and running. Let's talk about how to manage your cloud once it's in production.
* * * *
A. For five nines at the cloud level, build five nines beneath.
B. Reviewing the reference architectures.
1. A brief overview of what the components do.
2. A brief overview of capacity planning.
3. Pointers to reference architectures.
C. Building your storage strategy.
1. The difference between Walrus/S3 and EBS.
2. The importance of a SAN for heavy EBS usage.
a. Steal Zach's email from 3/4/14: "Storage Controller - Main Bottlenecks?"
3. The importance of a gateway for heavy S3 usage.
D. Building your network strategy.
1. Reviewing the network modes and what features are needed in each.
2. Discussion of VLANs.
3. Discussion of DNS.
4. Discussion of multipath and why to get it right the first time.
5. The network tomography tool.
E. Building your monitoring strategy.
F. Building your config management strategy.
1. Why a config management system for Euca is essential
2. Briefly: Difference between config files and Java DB properties
G. Building your backup strategy.
1. The importance of backing up the database.
H. Acceptance Testing: Why it Matters and How To Do It.
1. Micro-QA.
2. eutester.