Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add finished_at timestamp to datasets #462

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

dale-wahl
Copy link
Member

super simple read last line from log. only using it in the dataset results details at the moment.

using it for the sort feature is not trivial unfortunately since we are reading from the log instead of a database column.

possibly it would be better to use something like this in a migrate script to update the database instead? actually, i immediately like that idea more. tell me why not.

@dale-wahl
Copy link
Member Author

Switched to added finished_at timestamp that is created when DataSet.finish() is called. Created migrate script to fill in finished_at for existing datasets using the last log entry. In testing on my existing 4CAT instances, I did find examples of datasets without logs as well as some logs where the final line does not contain a timestamp.

The non-existant log files were from either a failed 4CAT import (the 4CAT import changes the dataset key and, somehow, lost the log file in the failure) and a deprecated analysis that left a hanging dataset (weird dev case I think). The unable to parse were all related to failed datasets that were marked as failures but with long errors as the final log entry.

@dale-wahl dale-wahl marked this pull request as ready for review December 5, 2024 16:14
@dale-wahl dale-wahl requested a review from stijn-uva December 5, 2024 16:14
@dale-wahl dale-wahl changed the title get last_update timestamp from log; display in dataset results Add finished_at timestamp to datasets Dec 5, 2024
Copy link
Member

@stijn-uva stijn-uva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of nitpicky, but could we change the column name to timestamp_finished? I think we use timestamp_something for all columns containing timestamps (in any case I've tried to do that), and consistency is nice.

Looks good to me otherwise.

@dale-wahl
Copy link
Member Author

Changed to timestamp_finished.

This PR has a migrate script and thus needs a VERSION change. Do we auto run migrate on restarts now? If not we'll want to release on merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants