Commit
* Broker updates (#108)
* Updates to Docker setup
* Testing adding event harvesters
* Updated parsing of events to be strict to align with marshmallow version 3
* Removed unnecessary debug message
* Added keywords to ADS harvester
* Updated to Python 3.8 and latest Invenio packages
* Added metadata endpoint for metadata query
* Updated Pipfile.lock and install using it
* Updated metadata queries to ignore case for journals and allow simple query strings on keywords
* Fixed mistake
* Added automatic harvesting after event is finished
* Added error monitoring for harvesters and events
* Skip retrying tasks and publish error
* Fixed simple query syntax
* Fixed bug with extra definition of query filter
* Fixed bug where target scheme was overwritten due to duplicate variable names
* Updated error message with payload on failed validation
* Moved error handling to cover the full harvesting process
* Moved error handling to cover the full harvesting process
* Re-added retries on failed harvest and event tasks
* Added harvest task monitoring
* Updated error catching on DOI harvester
* Updated harvester monitoring to add which harvester has been run
* Added error monitoring on updating metadata
* Added error monitoring on updating metadata
* Added aggregations on metadata search on the number of software cited
* Updated the reruns to have a progressive increase in time for each retry
* Added slackclient to write status reports every week
* Moved error catching to CLI so the API functions transfer the error to the harvesters
* Updated Pipfile.lock
* Updated metadata search to only show aggregated Target results

Co-authored-by: Mattias Wångblad <[email protected]>

* Added VS Code files to gitignore
* Added Azure dev Docker Compose for Let's Encrypt SSL
* Updated path for letsencrypt to account for symlinks
* Updated Azure Dev HAProxy config to enable Let's Encrypt SSL certificate
* Added additional Docker Compose file for Azure dev instance to enable SSL through Let's Encrypt (#110)
* Added Azure dev Docker Compose for Let's Encrypt SSL
* Updated path for letsencrypt to account for symlinks
* Updated Azure Dev HAProxy config to enable Let's Encrypt SSL certificate
* Initial compose file
* Broker updates2 (#111)
* Upgraded to Elasticsearch v7
* Updated formatting for Slack monitoring messages
* Added a GitHub harvester
* Updated pagination for metadata search
* Added mappings for ES7
* Fixed parsing bugs in GitHub harvester
* Fixed issues with adding GitHub relations in Zenodo harvester
* Made sure to order the metadata aggregations correctly on count
* Updated Slack notifications to send to correct channel
* Fixed 2 more bugs in parsing of GitHub URLs
* Fixed bug where the providers were converted to dictionaries twice
* Added check to ensure it won't fail if GitHub tags don't exist
* Updated payload on error monitoring to always include the identifiers
* Fixed bug mixing up source and target in GitHub relations
* Added possibility to have a GitHub token for more queries per hour
* Increased rate limit for logged-in users to allow more events per hour
* Fixed bug setting GitHub API token
* Fixed bug setting GitHub API token
* Fixed bug setting GitHub API token
* Added message in case there is nothing to report on monitoring
* Added CLI commands to trigger monitor report
* Added unique key requirement on groupm2m subgroup_id column
* Fixed bug on monitoring reporting
* Fixed another monitoring report bug on getting the errors to dict
* Split the error monitoring into chunks because of Slack limitations
* Added payload to error report
* Fixed typo in error report
* Fixed bug where GitHub URL ending with / would cause error
* Added rollbacks to make sure event reporting is fixed
* Fixed bug on rolling back DB errors
* Fixed bug which caused monitoring message to fail on some occasions
* Added CLI to rerun failed events
* Added possibility to rerun specific ID
* Removed erroneous import
* Fixed typo
* Made ID optional in event rerun CLI
* Added rerun CLI to harvester
* Fixed bug on parsing GitHub release ID
* Updated names of harvesters to be able to rerun them from CLI
* Added the unique key on subgroup in groupm2m

Co-authored-by: Mattias Wångblad <[email protected]>

* Changed over to production SSL certificates
* Initial Docker Compose production script
* Keeping all networking in links rather than networks
* Typo fix
* Changed networking for "elastic" and "bridge"
* Troubleshooting Elasticsearch cluster
* es debug
* es debug
* es docker debug
* es debug
* es debug
* es debug
* es debug
* es debug
* Fixed networking for es
* Aas deployment (#112)
* Initial compose file
* Changed over to production SSL certificates
* Initial Docker Compose production script
* Keeping all networking in links rather than networks
* Typo fix
* Changed networking for "elastic" and "bridge"
* Troubleshooting Elasticsearch cluster
* es debug
* es debug
* es docker debug
* es debug
* es debug
* es debug
* es debug
* es debug
* Fixed networking for es
* Added keyword and journal filter to relation search
* Added monitoring and harvesting to documentation and fixed some typos in the naming and comments
* Set specific version of RabbitMQ and increased timeout on acknowledging to 1 week to not time out on retries
* Added memory limits to RabbitMQ
* Broker updates (#115)
* Removed retry of tasks after 1 week since that could fill up RabbitMQ channels
* Aas deployment (#114)
* Initial compose file
* Changed over to production SSL certificates
* Initial Docker Compose production script
* Keeping all networking in links rather than networks
* Typo fix
* Changed networking for "elastic" and "bridge"
* Troubleshooting Elasticsearch cluster
* es debug
* es debug
* es docker debug
* es debug
* es debug
* es debug
* es debug
* es debug
* Fixed networking for es
* Added memory limits to RabbitMQ
* Added a 1s sleep on harvesters to avoid query overload
* Changed retry of tasks to only once after 10min and added cronjob to rerun last two days of errors every night
* Possibility to only rerun errors between timestamps on CLI
* Added link between error_monitoring, event and harvester_event to avoid having double creation of errors
* Fixed inconsistency in import
* Added check on GitHub link on event parsing
* Allow null values of LicenseUrl on event JSON parsing
* Lowered acknowledge time to 30min
* Refactored to avoid circular references
* Removed unused imports
* Refactored to avoid circular references when checking GitHub events
* Another move to avoid circular references
* Fixed bug pointing to wrong rerun function
* Small fixes on the GitHub validation
* Changed error reporting to only take errors that have not successfully been rerun

Co-authored-by: mattiasw <[email protected]>

* Broker updates (#116)
* Removed retry of tasks after 1 week since that could fill up RabbitMQ channels
* Aas deployment (#114)
* Initial compose file
* Changed over to production SSL certificates
* Initial Docker Compose production script
* Keeping all networking in links rather than networks
* Typo fix
* Changed networking for "elastic" and "bridge"
* Troubleshooting Elasticsearch cluster
* es debug
* es debug
* es docker debug
* es debug
* es debug
* es debug
* es debug
* es debug
* Fixed networking for es
* Added memory limits to RabbitMQ
* Added a 1s sleep on harvesters to avoid query overload
* Changed retry of tasks to only once after 10min and added cronjob to rerun last two days of errors every night
* Possibility to only rerun errors between timestamps on CLI
* Added link between error_monitoring, event and harvester_event to avoid having double creation of errors
* Fixed inconsistency in import
* Added check on GitHub link on event parsing
* Allow null values of LicenseUrl on event JSON parsing
* Lowered acknowledge time to 30min
* Refactored to avoid circular references
* Removed unused imports
* Refactored to avoid circular references when checking GitHub events
* Another move to avoid circular references
* Fixed bug pointing to wrong rerun function
* Small fixes on the GitHub validation
* Changed error reporting to only take errors that have not successfully been rerun
* Make sure that the error messages aren't larger than Slack limits
* Added unique constraint on groupm2m subgroup since it's never intended to have multiple of them

Co-authored-by: mattiasw <[email protected]>

* Update AUTHORS.rst: added new authors
* Update Dockerfile: adding ignore Pipfile to only use Pipfile.lock for Docker install
* Added ignore Pipfile to Dockerfile
* Added Flask version to Pipfile
* Added specific Invenio versions to ensure compatibility

Co-authored-by: mattiaswangblad <[email protected]>
Co-authored-by: Mattias Wångblad <[email protected]>
Co-authored-by: mattiasw <[email protected]>