-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial ASG group check to avoid adding nodes to cluster without also removing a node #52
Conversation
…an Desired Capacity to allow for ReattachOldInstance step to complete, meaning clusters are not left with an extra node after the failed completion. Move all checks which use the retrieved instance information to the new autoScalingGroupCheck to avoid extra calls to AWS. Put totalRunningExecutions in new step, since this is the new first step.
@@ -38,12 +38,3 @@ export function describeAsg(asgName: string): Promise<AutoScaling.Types.AutoScal | |||
const params = { AutoScalingGroupNames: [ asgName ] }; | |||
return retry(() => awsAutoscaling.describeAutoScalingGroups(params).promise(), `describing ASG ${asgName}`, 5) | |||
} | |||
|
|||
export function getDesiredCapacity(asgInfo: AutoScalingGroupsType): number { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed this because I think it's clearer that we're returning a single ASG with the singleASG
function, and DesiredCapacity is a simple property of an AutoScalingGroup
@@ -8,20 +8,6 @@ const awsEc2 = new AWS.EC2(); | |||
|
|||
type InstanceFilter = (instances: Instance[]) => Instance; | |||
|
|||
export function getInstances(asgName: string): Promise<string[]> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer used
return new Promise<ClusterStatusResponse>((resolve, reject) => { | ||
Promise.all([ | ||
describeAsg(event.asgName), | ||
getClusterHealth(event.oldestElasticsearchNode.ec2Instance.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this to autoScalingGroupCheck
since we have access to ASG info here so can avoid the extra call
const asg = event.asgName | ||
const instances: string[] = event.instanceIds | ||
console.log(`Searching for oldest node in ${asg}`) | ||
const specificInstance = await getSpecificInstance(instances, findOldestInstance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSpecificInstance requires another call to AWS, so we might as well keep this in a separate step
getInstances(asg) | ||
]) | ||
.then(([runningExecutions, instances]: [number, string[]]) => { | ||
if (runningExecutions !== 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to autoScalingGroupCheck
src/autoScalingGroupCheck.ts
Outdated
export function singleASG(groups: AutoScalingGroups): AutoScalingGroup { | ||
const groupNumber = groups.length; | ||
if (groupNumber !== 1) { | ||
throw new Error(`Expected single ASG, but got ${groupNumber}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it okay to throw an error from the function, or should this be reflected in the return type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, not a big fan in general of throwing in js. But unless you want to make an error type and redo error checking- this is your best bet.
Here we introduce an initial step which checks that ASG MaxSize is greater than Desired Capacity to allow for ReattachOldInstance step to complete, meaning clusters are not left with an extra node after the failed completion. This is in response to issue #51
I've moved all checks which use the retrieved instance information to the new autoScalingGroupCheck to avoid extra calls to AWS. The totalRunningExecutions is also now in the new step, since this is the new first step.
cc @AWare I've tried to follow your advice for managing async failure, but I'm not sure I've done this properly. If you get a minute to take a look I'd be grateful.
TODO: