Division title importer/exporter #1765

Merged (5 commits) on Mar 21, 2024

ajparsons (Contributor) commented Feb 12, 2024

This PR adds division_io.py - a click CLI to handle importing and exporting division data.

This handles bulk imports of division title updates (e.g. from Parliament's Commons Votes API) and exports parquet tables of divisions as a basic API to feed twfy-votes.

There is an adjustment to the division table schema: a new title_priority field. It tracks the origin of the current title and stops automated updates from overriding manual ones. Existing titles with a yes_text are retrospectively given 'Manual' priority.
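
The override rule described above could be sketched roughly as follows. This is a minimal illustration, not the code in division_io.py: the priority names other than 'Manual', their ordering, the `Division` dataclass and `apply_title_update` are all hypothetical. Only the `title_priority` field and the idea that manual titles should not be overridden by automated ones come from this PR.

```python
from dataclasses import dataclass

# Hypothetical priority levels; a higher number wins. The PR only confirms
# that a 'Manual' level exists and should not be overridden by automated data.
PRIORITY_RANK = {"Unknown": 0, "Automated": 1, "Manual": 2}


@dataclass
class Division:
    division_id: str
    title: str
    title_priority: str = "Unknown"


def apply_title_update(division: Division, new_title: str, source_priority: str) -> bool:
    """Apply a title update unless the source ranks below the current title.

    Returns True when the update was applied, False when it was skipped.
    """
    if PRIORITY_RANK[source_priority] < PRIORITY_RANK[division.title_priority]:
        # e.g. an automated import must not clobber a manually written title
        return False
    division.title = new_title
    division.title_priority = source_priority
    return True


# Usage: an automated title is accepted first, then a manual one replaces it,
# after which a later automated update is ignored.
example = Division("example-1", "Original title")
apply_title_update(example, "Title from an automated source", "Automated")
apply_title_update(example, "Hand-written title", "Manual")
apply_title_update(example, "Newer automated title", "Automated")  # skipped
```

Storing the origin alongside the title means the importer can make this decision per row without needing a separate audit table.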

For the moment, this uses a requirements.txt file for the required Python packages. When the server is updated, I'll open a new PR to tidy up the Python tooling in general. I'm unsure for the moment whether the morningupdate commands need to run in a venv, or whether system packages can just be updated in the short term.

Commit (ajparsons): Populate the PHP setting from an env var.

ajparsons (Contributor, Author) commented Feb 20, 2024

I've added a venv creation process to deploy.bash.

I've also added some packages to the packages file for future Docker use, but these are all present on the server anyway, so this should be fine.

dracos (Member) left a comment

Tiny things is all!

bin/deploy.bash (outdated, resolved)
db/0023-add-division-title-priority.sql (outdated, resolved)
scripts/division_io.py (outdated, resolved)
ajparsons (Contributor, Author) commented

The issues above are resolved. I've also quickly updated the packages now that we're not stuck on an old pandas version because of Python 3.7 (4905383).

- Update existing data to be 'Manual' when it has a yes_text.
- Export parquet dumps of divisions and votes
- Ingest manual and automated updates of titles from remote sources
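
As a rough illustration of the first item in this list, the backfill of existing rows might look something like the sketch below. It uses an in-memory SQLite database purely so the example is self-contained; that is an assumption for illustration, not a statement about the project's database. The table name `divisions`, the columns other than `yes_text` and `title_priority`, the 'Unknown' default and the function names are likewise hypothetical.

```python
import sqlite3


def backfill_title_priority(conn: sqlite3.Connection) -> int:
    """Mark rows that already have a yes_text as 'Manual'; return rows changed."""
    cur = conn.execute(
        "UPDATE divisions SET title_priority = 'Manual' "
        "WHERE yes_text IS NOT NULL AND yes_text <> ''"
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    """Build a tiny hypothetical divisions table, run the backfill, return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE divisions ("
        " division_id TEXT PRIMARY KEY,"
        " division_title TEXT,"
        " yes_text TEXT,"
        " title_priority TEXT NOT NULL DEFAULT 'Unknown')"
    )
    conn.executemany(
        "INSERT INTO divisions (division_id, division_title, yes_text) VALUES (?, ?, ?)",
        [
            ("a", "Hand-edited title", "voted for the motion"),  # has a yes_text
            ("b", "Imported title", None),                       # no yes_text
            ("c", "Imported title", ""),                         # empty yes_text
        ],
    )
    backfill_title_priority(conn)
    rows = conn.execute(
        "SELECT division_id, title_priority FROM divisions ORDER BY division_id"
    ).fetchall()
    conn.close()
    return rows
```

Only the row with a non-empty `yes_text` is promoted; the others keep the default priority, so later automated imports can still update them.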

Requirements.txt until server python is updated
 - IN - division titles from Parliament where they exist
 - OUT - a dump of the divisions table (for twfy-votes).
@ajparsons ajparsons merged commit 9bc5f66 into master Mar 21, 2024
7 of 8 checks passed
2 participants