Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we be storing normalized known speaker names? #343

Open
reefdog opened this issue Mar 17, 2020 · 0 comments
Open

Should we be storing normalized known speaker names? #343

reefdog opened this issue Mar 17, 2020 · 0 comments
Assignees
Labels
conversation Let's talk about this

Comments

@reefdog
Copy link
Collaborator

reefdog commented Mar 17, 2020

While reviewing PR #342, these lines made me 🤔. But since it's not the fault of that PR, I'm creating this issue for discussion.

A speaker is just a person, but we have two "speaker" concepts: a claim's speaker (currently not a distinct model, just speaker-prefixed fields on Claim) and KnownSpeaker.

The former stores speaker's names in full, no first/last split. The latter stores first and last in separate fields. This reflects the different data sources for each: web scraping and Google Sheet syncing, respectively.

That means that to compare the two, we have to first coerce the known speaker fields to look like what we expect the claim speaker's full name will look like, and then compare. There could/should be a JavaScript utility function we use to centralize this coercion.

What triggered me about the PR, though, is we're writing raw SQL, and thus couldn't use a JS utility function anyway; instead, we have to bake in an assumption about how to combine known speaker name fields to match a normalized claim speaker name. This strikes me as a little dangerous.

Now, are we likely ever going to change how names are composed? No. But should we have to remember that we're doing this composition in raw SQL in a newsletter file? I don't think so.

Short of being able to replace our raw SQL with a Sequelize Query (which we determined it's too complex for), should KnownSpeaker just combine first/last names into a single full name field during sync? Are we gaining anything by keeping them separate, given every other part of our system will be dealing with full names?

@reefdog reefdog added the conversation Let's talk about this label Mar 17, 2020
@reefdog reefdog removed their assignment Oct 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conversation Let's talk about this
Projects
None yet
Development

No branches or pull requests

2 participants