We currently use Docker builds inside GitHub Actions for running automated tests. It "works", but is very complicated and very fragile, as recently evidenced by multiple days of work on #1587 and #1588.
The world has moved on since this workflow was created: it's now possible to get up-to-date binaries of all the geospatial and scientific Python libraries that we use, from several different sources, without having to depend on system libraries or recompile anything.
As far as I can tell, the Docker images built from datacube-core aren't being used anywhere; it's only downstream software like Alchemist, Statistician, Tools, OWS and Explorer where Docker images are used. They've all ended up going in somewhat divergent directions due to their differing dependencies and requirements. While consolidation would be good, I don't think the setup in GHA here would be a sensible place to start.
In brief, I think we should:
Ditch Docker in GH Actions.
Switch to a build matrix of Python versions crossed with pip and conda.
Rely on the binary wheels available from PyPI for pip installs.
Rely on binary packages available from conda-forge for conda environments.
I think it will be much faster, simpler, and more comprehensive.
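As a rough illustration of that matrix, something along these lines could replace the Docker build. This is a minimal sketch, not a real workflow: the job layout, the `test` extra, and the `conda-environment.yml` / `odc-tests` names are assumptions.

```yaml
name: tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12"]
        installer: [pip, conda]
    defaults:
      run:
        shell: bash -el {0}   # login shell so the conda environment is activated
    steps:
      - uses: actions/checkout@v4

      # pip leg: binary wheels straight from PyPI, nothing to compile
      - uses: actions/setup-python@v5
        if: matrix.installer == 'pip'
        with:
          python-version: ${{ matrix.python-version }}
      - if: matrix.installer == 'pip'
        run: pip install -e .[test]   # assumes a "test" extra that pulls in pytest etc.

      # conda leg: binary packages from conda-forge
      - uses: conda-incubator/setup-miniconda@v3
        if: matrix.installer == 'conda'
        with:
          python-version: ${{ matrix.python-version }}
          channels: conda-forge
          activate-environment: odc-tests            # hypothetical environment name
          environment-file: conda-environment.yml    # hypothetical file listing the conda-forge deps
      - if: matrix.installer == 'conda'
        run: pip install -e . --no-deps

      - run: pytest
```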
For context, the upgrades that all had to land together this time were:
Ubuntu LTS version upgrade (happens once every two years)
Python version from 3.10 to 3.12 (one 3.x release per year, so two years worth of upgrades)
GDAL version from 3.8 to 3.9 (happens frequently, GDAL moves fast)
This time it was GDAL that mandated all the upgrades at once. Switching to running directly on the GitHub CI runners will shift what mandates the Ubuntu LTS upgrades from GDAL to GitHub, but it will still be an external decision.
Having the database in a GitHub service container means the local test environment will differ from CI, which might make CI issues harder to reproduce compared to a fully containerized setup.
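For reference, the database would presumably be declared as a service container on the test job, roughly like the sketch below (the image tag, credentials and database name are placeholders). Running the same image locally with `docker run` would keep the local and CI environments reasonably close.

```yaml
    # inside the test job
    services:
      postgres:
        image: postgres:16   # or a postgis image, if the tests need PostGIS
        env:
          POSTGRES_USER: odc_test
          POSTGRES_PASSWORD: odc_test
          POSTGRES_DB: odc_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
```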
I'm not advocating any particular technical direction, just asking that the causes, consequences, and costs of each direction be considered.
The current Docker-based workflow is also really slow. It looks like it builds the Docker image from scratch, without any cache, then pushes it to ghcr.io just to download it right back in the next step, spending a couple of minutes on each side. At the very least we shouldn't need to push and pull, even if we keep building from scratch without a cache. Or better, just stop using Docker where it hurts rather than helps.
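If the Docker build is kept at all, the round trip through ghcr.io could at least be dropped by loading the image straight onto the runner and using the GitHub Actions cache backend, roughly like this sketch (the tag name and test command are illustrative):

```yaml
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          context: .
          load: true                    # keep the image on the runner, no push/pull
          tags: datacube-tests:local
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - run: docker run --rm datacube-tests:local pytest
```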