Process registered voter data #10

Open
1 task
conorgil opened this issue Sep 2, 2020 · 1 comment
@conorgil
Member

conorgil commented Sep 2, 2020

We purchased and downloaded the registered voter data for Philadelphia. The data is uploaded to our Google Drive here (I gave you both access).

The download came with a doc that explains the file format, column headers, etc.

Acceptance Criteria:

  • The registered voter data is processed and loaded into a DB so that our web service can look up the mailing addresses of registered voters to mail them postcards.

I can imagine a few heuristics for who we should send postcards to, so I think our DB should include more than the name and mailing address of each voter. A few ideas:

  1. Choose randomly from voters who have participated in an election more recently than X date.
  2. Choose randomly from voters who have not participated in an election more recently than X date.
  3. Should we contact inactive voters, or only active voters?
  4. Should we avoid contacting voters who registered to vote very recently? Someone who just registered is probably more likely to already know they can vote by mail this year than someone who registered years ago and has not heard the news.
  5. Other? What are your ideas?
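
The heuristics above could be expressed as filters over a richer voter record, roughly like the sketch below. Every field and function name here (`VoterRecord`, `Criteria`, `ChooseRandom`, and so on) is an assumption for illustration; the real columns are whatever the file-format doc that came with the download describes.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// VoterRecord is a hypothetical shape, not the actual column layout.
type VoterRecord struct {
	ID             string
	Name           string
	MailingAddress string
	Active         bool
	Registered     time.Time
	LastVoted      time.Time // zero value means no recorded vote
}

// Criteria turns heuristics 1-4 into optional knobs (all names assumed).
type Criteria struct {
	VotedAfter       *time.Time // 1: voted more recently than X
	NotVotedAfter    *time.Time // 2: has not voted more recently than X
	ActiveOnly       bool       // 3: skip inactive voters
	RegisteredBefore *time.Time // 4: skip very recent registrations
}

// Matches reports whether v passes every knob that is set.
func (c Criteria) Matches(v VoterRecord) bool {
	if c.ActiveOnly && !v.Active {
		return false
	}
	if c.VotedAfter != nil && !v.LastVoted.After(*c.VotedAfter) {
		return false
	}
	if c.NotVotedAfter != nil && v.LastVoted.After(*c.NotVotedAfter) {
		return false
	}
	if c.RegisteredBefore != nil && !v.Registered.Before(*c.RegisteredBefore) {
		return false
	}
	return true
}

// ChooseRandom returns up to n matching voters in seeded random order.
func ChooseRandom(voters []VoterRecord, c Criteria, n int, seed int64) []VoterRecord {
	var pool []VoterRecord
	for _, v := range voters {
		if c.Matches(v) {
			pool = append(pool, v)
		}
	}
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(pool), func(i, j int) { pool[i], pool[j] = pool[j], pool[i] })
	if n > len(pool) {
		n = len(pool)
	}
	return pool[:n]
}

// date is a small helper for building UTC dates.
func date(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	cutoff := date(2018, time.January, 1)
	voters := []VoterRecord{
		{ID: "a", Active: true, LastVoted: date(2019, time.November, 5)},
		{ID: "b", Active: true, LastVoted: date(2016, time.November, 8)},
		{ID: "c", Active: false, LastVoted: date(2019, time.November, 5)},
	}
	picked := ChooseRandom(voters, Criteria{ActiveOnly: true, VotedAfter: &cutoff}, 10, 1)
	for _, v := range picked {
		fmt.Println(v.ID)
	}
}
```

Keeping each heuristic as an optional field means we can decide later which ones to turn on without reloading the data, as long as the DB stores registration date, last-voted date, and active status alongside name and address.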
@conorgil conorgil added this to the Postcards milestone Sep 2, 2020
@ravenac95
Collaborator

ravenac95 commented Sep 4, 2020

So I think there are two aspects here that are important:

  1. What's the interface to this DB in code?
  2. How do we pseudo-randomly choose people and how do we store them in the DB?

I'm only going to address (1) because I think (2) should be encapsulated in (1).

Proposal for the DB interface, for whoever writes the DB:

// pseudo Go code coming up

// Voter object
type Voter struct {
  id      string
  name    string
  address string
}

// Retrieve random voters
// Returns a slice of Voter objects
voters := voterDb.chooseRandomVoters()

// Get a single voter
voter := voters[0]

// Commit this voter so they're not chosen again
voterDb.confirmPostcardSentToVoter(voter)
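
To make the proposal concrete, here is one way the interface might look as compilable Go, with an in-memory implementation that could back unit tests. This is only a sketch: the `VoterDB` interface name, the `n` parameter, the error returns, and the `memoryDB` type are assumptions added for illustration, not something we have agreed on.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// Voter mirrors the proposed object; fields are exported for use across packages.
type Voter struct {
	ID      string
	Name    string
	Address string
}

// VoterDB is the only thing callers see; how voters are chosen and stored
// stays hidden behind it.
type VoterDB interface {
	ChooseRandomVoters(n int) ([]Voter, error)
	ConfirmPostcardSentToVoter(v Voter) error
}

// memoryDB is a hypothetical in-memory implementation, e.g. for tests.
type memoryDB struct {
	voters []Voter
	sent   map[string]bool
	rng    *rand.Rand
}

func newMemoryDB(voters []Voter, seed int64) *memoryDB {
	return &memoryDB{
		voters: voters,
		sent:   map[string]bool{},
		rng:    rand.New(rand.NewSource(seed)),
	}
}

// ChooseRandomVoters returns up to n voters that have not been confirmed yet.
func (db *memoryDB) ChooseRandomVoters(n int) ([]Voter, error) {
	var pool []Voter
	for _, v := range db.voters {
		if !db.sent[v.ID] {
			pool = append(pool, v)
		}
	}
	db.rng.Shuffle(len(pool), func(i, j int) { pool[i], pool[j] = pool[j], pool[i] })
	if n > len(pool) {
		n = len(pool)
	}
	return pool[:n], nil
}

var errUnknownVoter = errors.New("unknown voter")

// ConfirmPostcardSentToVoter marks a voter so they are not chosen again.
func (db *memoryDB) ConfirmPostcardSentToVoter(v Voter) error {
	for _, known := range db.voters {
		if known.ID == v.ID {
			db.sent[v.ID] = true
			return nil
		}
	}
	return errUnknownVoter
}

func main() {
	var db VoterDB = newMemoryDB([]Voter{
		{ID: "1", Name: "Example Voter One", Address: "1 Example St"},
		{ID: "2", Name: "Example Voter Two", Address: "2 Example St"},
	}, 42)

	voters, err := db.ChooseRandomVoters(1)
	if err != nil || len(voters) == 0 {
		fmt.Println("no voters available:", err)
		return
	}
	if err := db.ConfirmPostcardSentToVoter(voters[0]); err != nil {
		fmt.Println("confirm failed:", err)
		return
	}
	fmt.Println("confirmed postcard for voter", voters[0].ID)
}
```

Because callers only depend on `VoterDB`, a DynamoDB-backed implementation could replace `memoryDB` later without touching the worker code.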

In terms of how we store this, I think the easiest thing to do is process all the data and throw it into DynamoDB in some pre-sorted order. Then we just scan from the beginning of the DB and move data between tables. The interface above should still be flexible enough that we could switch to a more random selection later by changing the underlying implementation. There's a small likelihood of double sending to someone, though we will combat that by ensuring only a single process of this worker is running at a given time. For now, hacky but keeping user data safe is the quickest way to the end (our own costs are our own burden).
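
The scan-and-move idea can be sketched with an in-memory stand-in for the two tables. Nothing below calls DynamoDB; the `scanDB` type, its method names, and sorting by voter ID are all assumptions made only to show the flow, including where the double-send window comes from.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type Voter struct{ ID, Name, Address string }

// scanDB stands in for two tables: pending (kept in sorted order) and sent.
type scanDB struct {
	mu      sync.Mutex // serialises access, mirroring the single-worker rule
	pending []Voter
	sent    map[string]Voter
}

// newScanDB loads voters in a fixed, pre-sorted order (here: by ID, as an assumption).
func newScanDB(voters []Voter) *scanDB {
	p := append([]Voter(nil), voters...)
	sort.Slice(p, func(i, j int) bool { return p[i].ID < p[j].ID })
	return &scanDB{pending: p, sent: map[string]Voter{}}
}

// NextBatch scans from the beginning of pending without removing anything.
func (db *scanDB) NextBatch(n int) []Voter {
	db.mu.Lock()
	defer db.mu.Unlock()
	if n > len(db.pending) {
		n = len(db.pending)
	}
	return append([]Voter(nil), db.pending[:n]...)
}

// Confirm moves a voter from pending to sent and reports whether it moved.
func (db *scanDB) Confirm(id string) bool {
	db.mu.Lock()
	defer db.mu.Unlock()
	for i, v := range db.pending {
		if v.ID == id {
			db.sent[id] = v
			db.pending = append(db.pending[:i], db.pending[i+1:]...)
			return true
		}
	}
	return false
}

func main() {
	db := newScanDB([]Voter{{ID: "b"}, {ID: "a"}, {ID: "c"}})

	// If the worker stops after mailing but before Confirm, the next scan
	// returns the same voter again: that is the double-send window.
	first := db.NextBatch(1)
	retry := db.NextBatch(1)
	fmt.Println(first[0].ID == retry[0].ID)

	db.Confirm(first[0].ID)
	fmt.Println(len(db.pending), len(db.sent))
}
```

Running a single worker process does not close that window entirely, but it does mean two workers never scan the same head of the table at the same time.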
