Handle dash for xff, and region id starting the path #712
@@ -119,6 +120,16 @@ Resources:
});
};

const findIp = (xff, ip) => {
if (xff === '-') {
This is the bug that made all the hashed IPs the same: the xff comes through most of the time as '-', which is not blank, but is also not an IP. This dash was being used as the IP instead of the client IP.
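For illustration, a minimal sketch of how a helper like findIp might treat the dash. Only the first two lines of findIp appear in the diff, so the rest of the body here is an assumption, not the code that was merged:

```javascript
// Hypothetical completion of findIp; only its signature and the
// `xff === '-'` check are visible in the diff above.
const findIp = (xff, ip) => {
  // a missing or '-' X-Forwarded-For means "no value": fall back to the client ip
  if (!xff || xff === '-') {
    return ip;
  }
  // otherwise take the leftmost non-empty, non-dash entry
  const parts = xff.split(',').map(s => s.trim()).filter(s => s && s !== '-');
  return parts[0] || ip;
};
```

With this shape, a row whose xff is '-' hashes the client IP rather than the literal dash, so listener ids no longer collapse to a single value.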
Oof - good catch. I think I had something similar in the counts-lambda, but apparently forgot about it here.
@@ -146,6 +157,12 @@ Resources:
// podcast id and episode guid (only works for dovetail3-cdn requests)
const datas = mappedRows.filter(data => {
const parts = data['cs-uri-stem'].split('/').filter(s => s);

// if the path starts with a region like usw2, shift that off
if (parts[0] && parts[0].match(/^[a-z][a-z0-9\-]+$/)) {
This was the other bug: we have requests with an AWS region name prefix, like /usw2/.
These were all getting filtered out.
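As a rough sketch of the prefix handling, here is the same check pulled out into a standalone function. The name stripRegion, the parts.shift() body, and the example paths are assumptions for illustration; the diff only shows the regex test:

```javascript
// Hypothetical helper: split a cs-uri-stem into path parts and drop a
// leading region segment such as "usw2" if one is present.
const stripRegion = (uriStem) => {
  const parts = uriStem.split('/').filter(s => s);
  // same pattern as the diff: a lowercase letter, then lowercase letters, digits, or dashes
  if (parts[0] && parts[0].match(/^[a-z][a-z0-9\-]+$/)) {
    parts.shift();
  }
  return parts;
};

// Example paths are made up for illustration
console.log(stripRegion('/usw2/1234/some-guid/file.mp3'));
console.log(stripRegion('/1234/some-guid/file.mp3'));
```

Note that the pattern matches any all-lowercase first segment, not only region names, so this relies on the real first segment (the podcast id) never looking like that, for example by starting with a digit.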
@@ -163,8 +180,7 @@ Resources:
// calculate listener_ids
datas.forEach(data => {
// use leftmost XFF or IP
const xffParts = (data['x-forwarded-for'] || '').split(',').map(s => s.trim()).filter(s => s);
const leftMostIp = xffParts[0] || data['c-ip'];
This comparison was picking the dash value ('-') of the xff over the client IP.
👍 looks good to me. Onwards to coordinating how to deploy this.
@@ -146,6 +157,12 @@ Resources:
// podcast id and episode guid (only works for dovetail3-cdn requests)
Not related directly to this change but ...
For re-processing purposes, it may be useful to also have this lambda log what S3 input file it's processing, and how many rows it had. Just above this line somewhere:
console.info(`Read ${rows.length} rows from s3://${Bucket}/${Key}`);
fixes #714