Should we be storing normalized known speaker names? #343

reefdog · 2020-03-17T19:44:21Z

While reviewing PR #342, these lines made me 🤔. But since it's not the fault of that PR, I'm creating this issue for discussion.

A speaker is just a person, but we have two "speaker" concepts: a claim's speaker (currently not a distinct model, just speaker-prefixed fields on Claim) and KnownSpeaker.

The former stores speaker's names in full, no first/last split. The latter stores first and last in separate fields. This reflects the different data sources for each: web scraping and Google Sheet syncing, respectively.

That means that to compare the two, we have to first coerce the known speaker fields to look like what we expect the claim speaker's full name will look like, and then compare. There could/should be a JavaScript utility function we use to centralize this coercion.

What triggered me about the PR, though, is we're writing raw SQL, and thus couldn't use a JS utility function anyway; instead, we have to bake in an assumption about how to combine known speaker name fields to match a normalized claim speaker name. This strikes me as a little dangerous.

Now, are we likely ever going to change how names are composed? No. But should we have to remember that we're doing this composition in raw SQL in a newsletter file? I don't think so.

Short of being able to replace our raw SQL with a Sequelize Query (which we determined it's too complex for), should KnownSpeaker just combine first/last names into a single full name field during sync? Are we gaining anything by keeping them separate, given every other part of our system will be dealing with full names?

The text was updated successfully, but these errors were encountered:

reefdog added the conversation Let's talk about this label Mar 17, 2020

reefdog assigned reefdog and slifty Mar 17, 2020

reefdog removed their assignment Oct 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should we be storing normalized known speaker names? #343

Should we be storing normalized known speaker names? #343

reefdog commented Mar 17, 2020 •

edited

Loading

Should we be storing normalized known speaker names? #343

Should we be storing normalized known speaker names? #343

Comments

reefdog commented Mar 17, 2020 • edited Loading

reefdog commented Mar 17, 2020 •

edited

Loading