Change output of individual scrapers to JSON #11

Open
cameron-toy opened this issue Feb 3, 2020 · 1 comment

Comments

@cameron-toy
Collaborator

Currently, each module outputs a CSV string containing only the scraped data. Making that data one field of a JSON string, with errors, timestamps, and other metadata in the remaining fields, would allow for better logging and error handling.
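A minimal sketch of what such a wrapper could look like, assuming the existing CSV string is kept as-is under a `data` key. The function name `wrap_scraper_output` and the field names `scraper`, `timestamp`, `errors`, and `data` are illustrative assumptions, not names taken from the project:

```python
import json
from datetime import datetime, timezone


def wrap_scraper_output(csv_data, errors=None, scraper="unknown"):
    """Wrap a scraper's CSV string in a JSON envelope with metadata.

    All field names here are hypothetical; adjust them to whatever
    the logging side ends up expecting.
    """
    envelope = {
        "scraper": scraper,  # which module produced the data
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "errors": list(errors or []),  # error messages collected during the scrape
        "data": csv_data,  # the original CSV string, unchanged
    }
    return json.dumps(envelope)
```

A caller could then log or inspect the metadata without parsing the CSV, for example `json.loads(wrap_scraper_output(csv_string, errors=["timeout on page 3"], scraper="courses"))["errors"]`.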

@austinsilveria
Contributor

I had a conversation with the data team yesterday, and we will be storing the data in the database through SQLAlchemy object mapper classes. We ran through an example of storing an AudioSampleMetaData object, which can be seen here: calpoly-csai/api#35

Our use case is very similar to what was done in that PR, so I imagine we will be building JSON representations of each scraped object (Course, Club, ...) so that each one can be mapped to its respective SQLAlchemy entity (Courses source code).

The logging solution proposed in this issue would integrate well with how we are going to store the data. We could build the wrapped object as follows, then save the List[Course] in bulk through one API call:

CoursesData {
    errors: List[Error]
    timestamp: Timestamp
    data: List[Course]
}
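As a rough sketch, the wrapper above could be expressed with Python dataclasses and serialized to JSON in one step. The fields on `Course` and `Error` below are placeholders chosen for illustration (the real ones would follow the SQLAlchemy entities), and `to_json` is a hypothetical helper, not an existing function in the project:

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Course:
    # Placeholder fields; the real ones would mirror the Courses entity.
    department: str
    number: int
    name: str


@dataclass
class Error:
    # Placeholder: a single human-readable message per error.
    message: str


@dataclass
class CoursesData:
    errors: List[Error] = field(default_factory=list)
    # ISO 8601 string so the object serializes to JSON without a custom encoder.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    data: List[Course] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclasses inside the lists.
        return json.dumps(asdict(self))
```

With this shape, a scraper would build one `CoursesData` per run and send `to_json()` as the body of the single bulk API call, leaving the mapping to SQLAlchemy entities on the receiving side.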
