Add coffea 2024 and coffea 0.7.x images #1

oshadura · 2024-02-01T15:27:56Z

Thanks @nsmith- for the starting point: https://github.com/CoffeaTeam/docker-coffea-base/blob/coffea2023

The goal is to merge the https://github.com/CoffeaTeam/docker-coffea-base and https://github.com/CoffeaTeam/docker-coffea-dask repositories and make it possible to maintain both coffea versions (CalVer and legacy 0.7.x) as separate images.

I propose naming the legacy 0.7.x image coffeateam/coffea-basev0 and the CalVer image coffeateam/coffea-dask. I also added two variants of each image, one per Linux distro: coffeateam/coffea-basev0-ubuntu, coffeateam/coffea-basev0-alma8 and coffeateam/coffea-dask-ubuntu, coffeateam/coffea-dask-alma8.
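
To make the naming proposal concrete, here is a minimal sketch of how a build script might map a coffea version and a distro to an image name. The function name `image_for` and the rule "a leading 0 means legacy, a four-digit year means CalVer" are assumptions for illustration, not something taken from the CI in this PR.

```python
import re

# Image names and distro suffixes as proposed in the comment above.
LEGACY_IMAGE = "coffeateam/coffea-basev0"
CALVER_IMAGE = "coffeateam/coffea-dask"


def image_for(coffea_version: str, distro: str) -> str:
    """Return the image name for a coffea version and a Linux distro.

    `image_for` is a hypothetical helper, not part of any coffea tooling.
    """
    match = re.match(r"(\d+)\.", coffea_version)
    if match is None:
        raise ValueError(f"unrecognized coffea version: {coffea_version!r}")
    major = int(match.group(1))
    if major == 0:
        # Assumption: 0.x means the legacy 0.7.x line.
        base = LEGACY_IMAGE
    elif major >= 2000:
        # Assumption: a leading year (e.g. 2024) means a CalVer release.
        base = CALVER_IMAGE
    else:
        raise ValueError(f"cannot classify coffea version: {coffea_version!r}")
    return f"{base}-{distro}"
```

For example, `image_for("0.7.22", "alma8")` would select the legacy alma8 image, and any `2024.x.x` version would select the CalVer one.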

The next step would be to slim the image and add ML libraries as a separate image (?).

nsmith- · 2024-02-01T16:15:45Z

Can I suggest a slightly different directory structure?

	renamed:    alma8/Dockerfile -> alma8.dockerfile
	renamed:    alma8/environment.yaml -> environment.yaml
	renamed:    ubuntu/Dockerfile -> ubuntu.dockerfile
	deleted:    ubuntu/environment.yaml

so that we have a common environment.yaml for all OS versions? The CI file will need a tiny change as well

oshadura · 2024-02-01T16:19:28Z

@nsmith- sure! I will try tomorrow before you are awake. Any other suggestions? Maybe we need some other flavor? What should we do with ML libraries? Add an additional environment.yaml?

nsmith- · 2024-02-01T16:32:51Z

Regarding image dependencies and naming, personally I'm in favor of something like

coffea-base
- coffea calver
- dask (not distributed)
coffea-dask-v0
- coffea 0.7.x
- dask
- distributed
coffea-lpc (FROM coffea-base)
- distributed
- lpcjobqueue
coffea-eaf (FROM coffea-base)
- distributed
- other stuff for FNAL EAF?
coffea-casa (FROM coffea-base)
- distributed
- dask-gateway
- any extra casa stuff
coffea-casa-v0 (FROM coffea-dask-v0)
- distributed
- any extra casa stuff
coffea-vine (FROM coffea-base)
- taskvine
- whatever else

and potentially other AF variants. @btovar @mapsacosta @mcremone @rpsimeon34 may have opinions as well.
I don't see a strong case for having multiple base OS images. Maybe I am missing something but since our ~whole stack is coming from conda the base OS is somewhat irrelevant as far as I can tell?
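
One way to read the layering proposed above is as a small parent/child table. The sketch below encodes only the packages named explicitly in the list (the open-ended entries such as "any extra casa stuff" are left out) and resolves what each image would end up containing. The names `IMAGES` and `resolved_packages` are hypothetical.

```python
# Each image maps to (parent image or None, packages added on top of the parent).
IMAGES = {
    "coffea-base": (None, ["coffea calver", "dask"]),
    "coffea-dask-v0": (None, ["coffea 0.7.x", "dask", "distributed"]),
    "coffea-lpc": ("coffea-base", ["distributed", "lpcjobqueue"]),
    "coffea-eaf": ("coffea-base", ["distributed"]),
    "coffea-casa": ("coffea-base", ["distributed", "dask-gateway"]),
    "coffea-casa-v0": ("coffea-dask-v0", ["distributed"]),
    "coffea-vine": ("coffea-base", ["taskvine"]),
}


def resolved_packages(image: str) -> list[str]:
    """Return the packages in an image, including those inherited via FROM."""
    parent, extras = IMAGES[image]
    packages = resolved_packages(parent) if parent else []
    for pkg in extras:
        # Skip packages the parent already provides.
        if pkg not in packages:
            packages.append(pkg)
    return packages
```

With this table, `resolved_packages("coffea-casa-v0")` adds nothing new on top of coffea-dask-v0, because the parent already ships distributed.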

nsmith- · 2024-02-01T16:35:26Z

This does leave up in the air where the ML packages go in the stack. Part of me thinks it should be AF-dependent, so that AFs that support GPUs can ensure they have the correct CUDA versions.

nsmith- · 2024-02-01T17:06:20Z

This may also be of interest to @kondratyevd

kondratyevd · 2024-02-01T17:16:40Z

Thanks @nsmith- - at Purdue AF we provide only a single Docker image for users, and manage environments using conda, so probably at the moment this is not super relevant for us, unless I'm missing some important caveats

nsmith- · 2024-02-01T17:22:34Z

> at Purdue AF we provide only a single Docker image for users

can you describe/link that image? That implies users' full conda environments must be shipped to workers?

kondratyevd · 2024-02-01T17:31:03Z

The image is here: https://github.com/PurdueAF/purdue-af/blob/master/jupyterhub/docker/cmsaf-alma8/Dockerfile

To use conda envs with Dask, users must place their environments in a shared storage volume (Purdue's "Depot" storage is the standard work area for local Purdue users and is mounted on all worker nodes; we are working on a similar solution for external users). Then, the path to the conda env must be specified in the Dask Gateway setup.

For local Python imports it's similar - user's files should be in shared storage and PYTHONPATH can be modified in Dask Gateway setup to point to the user's analysis framework.
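
As a rough sketch of the two settings described above: the option names `conda_env` and `env`, the helper name, and all paths are hypothetical, since cluster option names are defined by each Dask Gateway deployment and the Purdue names are not given in this thread.

```python
import os


def gateway_worker_settings(conda_env, analysis_dir, existing_pythonpath=""):
    """Collect the per-user settings that would be passed to Dask Gateway.

    The keys "conda_env" and "env" are assumed option names, not the
    actual Purdue AF configuration.
    """
    # Put the user's analysis framework first so it wins on import.
    parts = [analysis_dir]
    if existing_pythonpath:
        parts.append(existing_pythonpath)
    return {
        "conda_env": conda_env,
        "env": {"PYTHONPATH": os.pathsep.join(parts)},
    }


# Hypothetical paths on a shared volume mounted on all worker nodes.
settings = gateway_worker_settings(
    "/depot/example/envs/my-analysis",
    "/depot/example/my-framework",
)

# With dask-gateway installed and a gateway configured, settings like these
# could be forwarded as cluster options, for example:
#   from dask_gateway import Gateway
#   cluster = Gateway().new_cluster(**settings)
```

Both paths have to be visible from the workers, which is why shared storage is a precondition in this setup.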

nsmith- · 2024-02-01T18:59:15Z

Thanks for the input @kondratyevd. It is also in line with a discussion I had with Burt: the user server (notebook/lab) image often needs extensive AF customization, and essentially they are only interested in conda install coffea or in a common environment.yaml file for optional extras (most notably, xrootd). This makes me wonder if we are better off supplying just a set of conda environment files? For workers an image may be useful because it can be cached more effectively.
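
To illustrate what "a set of conda environment files" with optional extras could look like, here is a minimal sketch that renders an environment file as text. The helper name, the environment name, and the pinned version are illustrative assumptions; only coffea and xrootd come from the discussion.

```python
def render_environment(name, coffea_spec, extras=()):
    """Render a minimal conda environment file as a string.

    `render_environment` is a hypothetical helper for illustration only.
    """
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - {coffea_spec}",
    ]
    # Optional extras (for example xrootd) are appended after coffea.
    lines += [f"  - {pkg}" for pkg in extras]
    return "\n".join(lines) + "\n"


# A base file and a variant with the xrootd extra; the pin is only an example.
base_env = render_environment("coffea-v0", "coffea=0.7.21")
xrootd_env = render_environment("coffea-v0-xrootd", "coffea=0.7.21", ["xrootd"])
```

An AF could then build its own notebook image from the variant it needs, while worker images stay a separate question.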

kondratyevd · 2024-02-01T19:08:41Z

Isn't this (a set of conda files) already mostly provided, with just a few optional dependencies? I think our users didn't have any issues with just conda install coffea==0.7.21, and similarly for v2023/2024. Xrootd is the only exception that I remember.

oshadura · 2024-02-15T16:54:02Z

@nsmith- @lgray Now only one combination is failing: python 3.11 + coffea 0.7.22 with:

2024-02-15 16:26:20,860 - distributed.worker - WARNING - Compute Failed
Key:       MyProcessor-31e87f80bfa2fb4d21e30f667186a569
Function:  MyProcessor
args:      ((WorkItem(dataset='dimuon', filename='https://github.com/CoffeaTeam/coffea/raw/master/tests/samples/nano_dimuon.root', treename='Events', entrystart=0, entrystop=40, fileuuid=b'\xa2\x10\xa3\xf86H\x11\xea\xa2\x9f\xf5\xb5\\\x90\xbe\xef', usermeta={}), b'\x04"M\x18h@-\x00\x00\x00\x00\x00\x00\x00v,\x00\x00\x00R\x80\x05\x95"\x00\x01\x00\xf0\x13\x8c\x0btest_dimuon\x94\x8c\x0bMyProcessor\x94\x93\x94)\x81\x94.\x00\x00\x00\x00'))
kwargs:    {}
Exception: 'TypeError("__class__ assignment: \'NanoEventsArray\' object layout differs from \'Array\'")'

https://github.com/CoffeaTeam/af-images/actions/runs/7918852148/job/21618260586

oshadura · 2024-02-15T16:54:54Z

Otherwise, I can disable it for now, and then if everything else is ok, we can finally push these images.
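
Disabling a single failing combination amounts to excluding one cell from the build matrix. The sketch below shows the idea in plain Python. The version lists are placeholders (only python 3.11 and coffea 0.7.22 come from this thread), and the actual CI matrix in the repository may be organized differently.

```python
from itertools import product

# Placeholder version lists; only "3.11" and "0.7.22" are taken from the thread.
PYTHON_VERSIONS = ["3.10", "3.11"]
COFFEA_VERSIONS = ["0.7.22", "2024.2.0"]

# Combinations known to fail and therefore skipped for now.
EXCLUDED = {("3.11", "0.7.22")}


def build_matrix():
    """Return all (python, coffea) pairs except the excluded ones."""
    return [
        (py, coffea)
        for py, coffea in product(PYTHON_VERSIONS, COFFEA_VERSIONS)
        if (py, coffea) not in EXCLUDED
    ]
```

Keeping the exclusions in one explicit set makes it easy to re-enable a combination once the underlying error is fixed.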

oshadura · 2024-02-16T09:26:36Z

I am going to fix the failing image in the next pull request and add the other combinations there as well. The goal is to test the first batch of images on coffea-casa today.

nsmith- self-requested a review February 1, 2024 16:10

oshadura force-pushed the add-skeleton-images branch from f87d9c3 to 84e13f9 Compare February 1, 2024 16:11

oshadura marked this pull request as draft February 1, 2024 16:14

oshadura mentioned this pull request Feb 1, 2024

dask_jobqueue in CalVer Images CoffeaTeam/docker-coffea-base#96

Closed

nsmith- mentioned this pull request Feb 1, 2024

Create environment from environment.yml CoffeaTeam/docker-coffea-base#73

Open

oshadura added 6 commits February 9, 2024 14:41

Adding two docker images that will provide images for coffea 0.7.x patch releases

9c6e30a

Adding two docker images that will provide images for coffea calver releases: 2024.x.x

0d23db3

Fix typo

b50e159

Checking if CI is working

59e3557

Add forgotten fi

7511c2a

Reshuffle images according Nick's proposal

3d0f4ea

oshadura force-pushed the add-skeleton-images branch 3 times, most recently from c742a37 to 8f46ade Compare February 9, 2024 13:58

Update GH actions to support detection v0 versus calever version coffea

7631365

and build relevant image instead

oshadura force-pushed the add-skeleton-images branch 4 times, most recently from afee57c to 57a8ee1 Compare February 9, 2024 14:56

Fix typo in name of yaml file

78175b5

oshadura added 6 commits February 9, 2024 16:04

Remove pin on version of uproot

691b492

Add basic coffeav0 dimuon pytest

bd17570

Fix for coffee 0.7.x error in unit test - ModuleNotFoundError: Uproot 4.x can only be used with Awkward 1.x; you have Awkward 2.6.1

81e1271

Pytorch 1.x is not available for python 3.11+

2aca4ce

├─ pytorch [1.12.0|1.12.1] would require 40.92 │ └─ python >=3.10,<3.11.0a0 , which can be installed;

Add test ADL1 for coffeacalver

cc48992

Switch test dimuon for coffeav0 to test Dask executor

86c2a33

oshadura force-pushed the add-skeleton-images branch from d4f8d4f to 489c723 Compare February 13, 2024 13:49

oshadura added 3 commits February 13, 2024 16:27

Update GH actions to run different marked functions

fc3460e

Register markers for pytest

382dbb0

Update tests accordingly

4e7609e

oshadura force-pushed the add-skeleton-images branch from 489c723 to 4e7609e Compare February 13, 2024 15:28

oshadura added 4 commits February 13, 2024 16:53

Disable vons-proxy-init test (fail at alma8)

b35956a

Merge tests together since they share the same processor

7e26faa

Match the fastjet version to last without awk2 (it doesnt work for coffea 0.7.x)

2a6a7eb

Try for now only alma8 builds

7fd5c48

oshadura force-pushed the add-skeleton-images branch from e139352 to 7fd5c48 Compare February 14, 2024 15:12

oshadura added 4 commits February 14, 2024 18:04

Try to separate coffea with dask-historgram and without for pytest

89df2f7

Pythpon 3.9 doesnt work anymore on alma8

ee4377a

Update tests

96ec188

Update coffea 2022.02.1 due bug in processing files

4a353dd

oshadura added 4 commits February 16, 2024 09:37

remove python 3.11 for coffea 0.7.x

2e77629

TypeError: __class__ assignment: 'NanoEventsArray' object layout differs from 'Array'

Update to almalinux8 instead, we have a ready docker repo for it

4ad7f7e

Tag with the most recent python tag available in matrix is also available as latest image tag

a6059d3

Rename coffea-basev0 as acoffea-base (we are going to reuse existing repository in dockerhub)

1fa5050

oshadura marked this pull request as ready for review February 16, 2024 09:23

Fix broken tag env

9503386

oshadura merged commit 33ad9fd into CoffeaTeam:main Feb 16, 2024
3 of 4 checks passed
