Improve GitHub Actions Testing Workflows - Stop Using Docker #1589

omad · 2024-05-31T04:22:29Z

We currently use Docker builds inside GitHub Actions to run automated tests. It "works", but it is very complicated and very fragile, as recently evidenced by multiple days of work on #1587 and #1588.

The world has moved on since this workflow was created. It's now possible to get up-to-date binaries of all the geospatial and scientific Python libraries that we use from several different sources, without having to depend on system libraries or recompile anything.

As far as I can tell, the Docker images built from datacube-core aren't being used anywhere. It's only downstream software like Alchemist, Statistician, Tools, OWS and Explorer where Docker images are used. They've all ended up going in somewhat divergent directions, due to their differing dependencies and requirements. While consolidation would be good, I don't think the setup in GHA here would be a sensible place to start.

In brief, I think we should:

- Ditch Docker in GH Actions.
- Switch to a build matrix of Python versions crossed with pip and conda.
- Rely on the binary wheels available from PyPI for pip installs.
- Rely on binary packages available from conda-forge for conda environments.
- Follow the GH Actions advice for running a PostgreSQL service container.

I think it will be much faster, simpler, and more comprehensive.
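A rough sketch of what the proposal above could look like as a workflow. This is not the project's actual configuration: the job name, the Python versions, the PostgreSQL image tag and credentials, the `test` extras name, the conda package list, and the use of the standard libpq `PG*` environment variables to reach the database are all assumptions for illustration.

```yaml
# Hypothetical workflow sketch: a Python-version x installer matrix with a
# PostgreSQL service container and no Docker build step.
name: tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    defaults:
      run:
        # Login shell so the conda environment is activated in each step.
        shell: bash -el {0}
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12"]   # assumed version range
        installer: [pip, conda]
    services:
      postgres:
        image: postgres:16                         # assumed image tag
        env:
          POSTGRES_PASSWORD: postgres              # placeholder credential
        ports:
          - 5432:5432
        # Wait until the database accepts connections before running steps.
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4

      # pip branch: binary wheels from PyPI.
      - uses: actions/setup-python@v5
        if: matrix.installer == 'pip'
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install with pip
        if: matrix.installer == 'pip'
        run: pip install -e ".[test]"              # extras name is an assumption

      # conda branch: binary packages from conda-forge.
      - uses: conda-incubator/setup-miniconda@v3
        if: matrix.installer == 'conda'
        with:
          miniforge-version: latest
          python-version: ${{ matrix.python-version }}
      - name: Install with conda
        if: matrix.installer == 'conda'
        run: |
          conda install -y gdal psycopg2 pytest    # illustrative package list
          pip install --no-deps -e .

      - name: Run tests
        run: pytest
        env:
          # Standard libpq variables; how the test suite actually reads its
          # database settings is an assumption.
          PGHOST: localhost
          PGPORT: 5432
          PGUSER: postgres
          PGPASSWORD: postgres
          PGDATABASE: postgres
```

With `fail-fast: false`, a failure in one combination does not cancel the others, which keeps the matrix useful for spotting version-specific breakage.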


pjonsson · 2024-06-02T21:12:49Z

The linked PRs are making 3 upgrades at once:

- Ubuntu LTS version upgrade (happens once every two years)
- Python version from 3.10 to 3.12 (one 3.x release per year, so two years' worth of upgrades)
- GDAL version from 3.8 to 3.9 (happens frequently, GDAL moves fast)

This time it was GDAL that mandated all the upgrades at once. Switching to running directly on the GitHub CI runners will shift the thing mandating Ubuntu LTS upgrades from GDAL to GitHub, but it will still be an external decision.

The third bullet (relying on binary wheels from PyPI) goes against psycopg2's recommendation for production use (https://www.psycopg.org/docs/install.html#psycopg-vs-psycopg-binary).

Running the database as a GitHub service container means the CI environment will differ from the local test environment, which might make CI issues harder to reproduce than with a containerized setup.

I'm not advocating any particular technical direction, just asking that the causes, consequences, and costs of each direction are considered.

Kirill888 · 2024-06-21T06:05:05Z

The current Docker-based workflow is really slow. It looks like it builds the Docker image from scratch, without any cache, and then pushes it to ghcr.io just to download it right back in the next step, spending a couple of minutes on each side. At the very least we should not need to push and pull, even if we keep building from scratch without a cache. Or better, just stop using Docker where it hurts rather than helps.
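If the Docker build is kept, one way to avoid the push/pull round trip might look like the job fragment below. It is only a sketch: the `datacube-tests:ci` tag and the `pytest` command are hypothetical, and it assumes the image is built and used within a single job, with layers cached through the GitHub Actions cache backend.

```yaml
# Hypothetical job fragment: build once, keep the image on the runner,
# and reuse cached layers instead of rebuilding from scratch.
steps:
  - uses: actions/checkout@v4
  - uses: docker/setup-buildx-action@v3
  - uses: docker/build-push-action@v5
    with:
      context: .
      push: false                  # do not upload to ghcr.io
      load: true                   # load the image into the local Docker daemon
      tags: datacube-tests:ci      # hypothetical local tag
      cache-from: type=gha         # reuse layers from the Actions cache
      cache-to: type=gha,mode=max
  - name: Run tests in the freshly built image
    run: docker run --rm datacube-tests:ci pytest   # assumed test command
```

The trade-off is that a loaded image only exists on the runner that built it, so later jobs cannot pull it; that is fine as long as building and testing happen in the same job.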

omad added the tests label May 31, 2024
