Initialize Duckdb Without Huggingface #1891

Open
5 tasks
Tracked by #1890
ryanfchase opened this issue Jan 11, 2025 · 0 comments
Labels
Complexity: Small; p-feature: data; P-feature: Map; Role: Frontend (React front end work); Size: 1pt (can be done in 6 hours); Time sensitive (this ticket should be completed ASAP)

Overview

We need to create a version of our app that initializes the map without loading data from HuggingFace, because we are in the process of migrating data ingestion to the Socrata API.

Developer Info

We are NOT merging this into develop. This is meant to be a staging branch for the Blank Map feature ONLY

  • I understand

Action Items

  • designate a feature branch (e.g. staging-feature-blank-map)
  • manually define our requests table on application load (see createRequestsTable)
  • configure the data population to use a config variable to determine its source (see proposed constant, DATA_SOURCE)
  • (optional) create the table only once, so that exactly 1 table exists (as opposed to a table for every year)
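
One way to approach the optional "create the table only once" item is to wrap the initializer so that repeated calls reuse the first result. The sketch below is only an illustration under stated assumptions: `once` and `createRequestsTableOnce` are hypothetical names, the `createRequestsTable` body is a stand-in rather than the existing function in `MapContainer`, the `id` column is a placeholder, and `conn` is assumed to expose an async `query(sql)` method.

```javascript
// Wrap an async initializer so that it runs at most once; later calls
// return the same promise instead of creating another table.
function once(initFn) {
  let pending = null;
  return (...args) => {
    if (pending === null) {
      pending = initFn(...args);
    }
    return pending;
  };
}

// Hypothetical stand-in for createRequestsTable. `conn` is assumed to expose
// an async query(sql) method; the single `id` column is a placeholder only.
async function createRequestsTable(conn) {
  await conn.query('CREATE TABLE IF NOT EXISTS requests (id VARCHAR);');
}

const createRequestsTableOnce = once(createRequestsTable);

// Demo with a fake connection that records the SQL it receives.
const demoCalls = [];
const demoConn = { query: async (sql) => { demoCalls.push(sql); } };

Promise.all([
  createRequestsTableOnce(demoConn),
  createRequestsTableOnce(demoConn),
]).then(() => {
  console.log(`CREATE TABLE statements issued: ${demoCalls.length}`);
});
```

Note that this wrapper also caches a rejected promise, so a failed first attempt would need separate retry handling. `CREATE TABLE IF NOT EXISTS` gives a second layer of protection at the SQL level.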

Resources/Instructions

Notes

  • currently we rely on parquet files (e.g. 2024.parquet) from HuggingFace to define the requests table
    • obtain the table columns and field types and use them to define the requests table
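
As a rough illustration of defining the table inline, the helper below builds a `CREATE TABLE` statement from a list of column names and types. This is a sketch, not the project's schema: `buildCreateTableSql` is a hypothetical helper, and the column names and types in `EXAMPLE_COLUMNS` are placeholders that would need to be replaced with the columns read from an existing parquet file.

```javascript
// Hypothetical column list: the real names and types should be copied from
// the schema of an existing parquet file (e.g. 2024.parquet), not from here.
const EXAMPLE_COLUMNS = [
  { name: 'RequestId', type: 'VARCHAR' },
  { name: 'CreatedDate', type: 'TIMESTAMP' },
  { name: 'Latitude', type: 'DOUBLE' },
  { name: 'Longitude', type: 'DOUBLE' },
];

// Build a CREATE TABLE statement from a list of { name, type } pairs.
// IF NOT EXISTS makes the statement safe to run more than once.
function buildCreateTableSql(tableName, columns) {
  if (columns.length === 0) {
    throw new Error('At least one column is required');
  }
  const columnSql = columns
    .map(({ name, type }) => `  "${name}" ${type}`)
    .join(',\n');
  return `CREATE TABLE IF NOT EXISTS ${tableName} (\n${columnSql}\n);`;
}

console.log(buildCreateTableSql('requests', EXAMPLE_COLUMNS));
```

To obtain the real columns and types, running `DESCRIBE SELECT * FROM '2024.parquet';` in DuckDB against one of the current files should list them.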

Resources

Code References

  • createRequestsTable
    • path: components > Map > index.jsx::MapContainer, L69
    • note: the SQL on L77 will need to be modified. We will not be using datasetFileName; instead, we'll simply define the table inline
  • CONSTANTS.js
    • path: components > common > CONSTANTS.js
  • proposed object: DATA_SOURCE
    • HUGGING_FACE: load the data from HuggingFace (this is the usual way we load the application)
    • SOCRATA: load the data directly via a Socrata API call (this is yet to be implemented)
export const DATA_SOURCE = {
  'HUGGING_FACE': 0,
  'SOCRATA': 1,
};
  • updateHfDataset.py
    • path: scripts > updateHfDataset.py
    • note: use this as a reference for how we are handling timestampformat
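
The proposed DATA_SOURCE constant could drive data population roughly as follows. This is a minimal sketch under assumptions: `ACTIVE_DATA_SOURCE`, `selectLoader`, and the `loaders` object are hypothetical names, and both loaders are stubs that perform no real requests (the Socrata path is not yet implemented).

```javascript
// Mirrors the proposed constant from this issue.
const DATA_SOURCE = {
  HUGGING_FACE: 0,
  SOCRATA: 1,
};

// Hypothetical config variable selecting the active source.
const ACTIVE_DATA_SOURCE = DATA_SOURCE.SOCRATA;

// Pick the loader for the given source. `loaders` is a hypothetical object
// mapping each DATA_SOURCE value to an async function that fills the table.
function selectLoader(source, loaders) {
  const loader = loaders[source];
  if (typeof loader !== 'function') {
    throw new Error(`No loader registered for data source: ${source}`);
  }
  return loader;
}

// Stub loaders for illustration only; neither performs a real request.
const loaders = {
  [DATA_SOURCE.HUGGING_FACE]: async () => 'loaded from HuggingFace parquet',
  [DATA_SOURCE.SOCRATA]: async () => 'loaded from Socrata API',
};

selectLoader(ACTIVE_DATA_SOURCE, loaders)().then(console.log);
```

Keeping the choice in one lookup means the staging branch can switch sources by changing a single constant, and an unknown value fails loudly instead of silently loading nothing.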
@ryanfchase ryanfchase added this to the 04 - Map Page milestone Jan 11, 2025
@ryanfchase ryanfchase self-assigned this Jan 11, 2025
@github-project-automation github-project-automation bot moved this to New Issue Approval in P: 311: Project Board Jan 11, 2025
@ryanfchase ryanfchase added the Time sensitive and ready for prioritization labels and removed the draft label Jan 12, 2025
@ryanfchase ryanfchase removed their assignment Jan 12, 2025
@ryanfchase ryanfchase moved this from New Issue Approval to Prioritized Backlog in P: 311: Project Board Jan 12, 2025