
Improve GitHub Actions Testing Workflows - Stop Using Docker #1589

Open · omad opened this issue May 31, 2024 · 2 comments
omad commented May 31, 2024

We currently use Docker builds inside GitHub Actions for running automated tests. It "works", but is very complicated and very fragile, as recently evidenced by multiple days of work on #1587 and #1588.

The world has moved on since this workflow was created: it's now possible to get up-to-date binaries of all the geospatial and scientific Python libraries we use, from several different sources, without having to depend on system libraries or recompile anything.

As far as I can tell, the Docker images built from datacube-core aren't being used anywhere; it's only downstream software like Alchemist, Statistician, Tools, OWS and Explorer where Docker images are used. They've all ended up going in somewhat divergent directions, due to their differing dependencies and requirements. While consolidation would be good, I don't think the setup in GHA here would be a sensible place to start.

In brief, I think we should:

  • Ditch Docker in GH Actions.
  • Switch to a build matrix of Python versions crossed with pip and conda.
  • Rely on the binary wheels available from PyPI for pip installs.
  • Rely on the binary packages available from conda-forge for conda environments.
  • Follow the GH Actions advice for running a PostgreSQL service container.
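A minimal sketch of what such a workflow could look like. Job names, Python/Postgres versions, the `[test]` extra, and the `DATACUBE_DB_URL` variable are all illustrative assumptions, not the project's actual configuration; the service-container options follow the GH Actions PostgreSQL guidance mentioned above:

```yaml
# .github/workflows/test.yml -- hypothetical sketch, not a tested config
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12"]
        package-manager: [pip, conda]
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        # wait until the database is actually accepting connections
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4

      # pip leg: binary wheels straight from PyPI
      - if: matrix.package-manager == 'pip'
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - if: matrix.package-manager == 'pip'
        run: pip install -e .[test]   # '[test]' extra is an assumption

      # conda leg: binary packages from conda-forge
      - if: matrix.package-manager == 'conda'
        uses: conda-incubator/setup-miniconda@v3
        with:
          python-version: ${{ matrix.python-version }}
          channels: conda-forge

      - run: pytest
        env:
          # hypothetical connection string for the service container
          DATACUBE_DB_URL: postgresql://postgres:postgres@localhost:5432/postgres
```

The matrix gives six jobs (3 Python versions × 2 package managers) that run in parallel, which is where most of the speed-up over a serial Docker build would come from.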

I think it will be much faster, simpler, and more comprehensive.

omad added the tests label May 31, 2024

pjonsson commented Jun 2, 2024

The linked PRs are making 3 upgrades at once:

  1. Ubuntu LTS version upgrade (happens once every two years)
  2. Python version from 3.10 to 3.12 (one 3.x release per year, so two years' worth of upgrades)
  3. GDAL version from 3.8 to 3.9 (happens frequently, GDAL moves fast)

This time it was GDAL that mandated all the upgrades at once. Switching to running directly on the GitHub CI runners will shift the thing mandating Ubuntu LTS upgrades from GDAL to GitHub, but it will still be an external decision.

The third bullet goes against psycopg2's recommendation for production use (https://www.psycopg.org/docs/install.html#psycopg-vs-psycopg-binary).
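For context, the distinction the psycopg2 docs draw is between its two PyPI distributions. A sketch of the two install paths (no project-specific assumptions beyond the package names themselves):

```shell
# psycopg2-binary ships a pre-built wheel with libpq bundled in.
# Convenient for CI, but the psycopg2 docs advise against relying
# on it in production because the bundled libraries can clash with
# other extensions linking the system libpq.
pip install psycopg2-binary

# The plain psycopg2 sdist compiles against the system libpq,
# which requires pg_config and a C toolchain at install time --
# exactly the kind of system dependency the proposal wants to avoid.
pip install psycopg2
```

So a pip-only CI matrix would likely use the binary wheel, which is fine for tests but diverges slightly from a production install built from source.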

Having the database in a GitHub service container will give a test environment that differs from local setups, so it might reduce the reproducibility of CI issues compared to a containerized setup.

I'm not advocating any particular technical direction, just asking that the causes, consequences, and costs of each direction be considered.

@Kirill888 commented

The current Docker-based workflow is really slow. It looks like it's building the Docker image from scratch, without any cache, then pushing it to ghcr.io just to download it right back in the next step, spending a couple of minutes on each side. We shouldn't need to push and pull at all, even if we keep building from scratch without a cache. Or better, just stop using Docker where it hurts rather than helps.
