diff --git a/.readthedocs.yaml b/.readthedocs.yaml new file mode 100644 index 00000000..6725ddff --- /dev/null +++ b/.readthedocs.yaml @@ -0,0 +1,17 @@ +# Read the Docs configuration file +# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details + +version: 2 + +build: + os: ubuntu-22.04 + tools: + python: "3.12" + commands: + - asdf plugin add uv + - asdf install uv latest + - asdf global uv latest + - uv venv + - uv pip install .[dev] + - .venv/bin/python -m sphinx -T -b html -d docs/_build/doctrees -D + language=en docs $READTHEDOCS_OUTPUT/html diff --git a/README.md b/README.md index 6f4ed946..015611e0 100644 --- a/README.md +++ b/README.md @@ -3,323 +3,19 @@ # flint -A pirate themed toy ASKAP-RACS pipeline. + -Yarrrr-Harrrr fiddly-dee! - -Capn' Flint - Credit: DALLE 3 - -## Installation - -Provided an appropriate environment installation should be as simple as a -`pip install`. - -However, on some systems there are interactions with `casacore` and building -`python-casacore` appropriately. Issues have been noted when interacting with -large measurement sets across components with different `casacore` versions. -This seems to happen even across container boundaries (i.e. different versions -in containers might play a role). The exact cause is not at all understood, but -it appears to be related to the version of `python-casacore`, `numpy` and -whether pre-built wheels are used. - -In practise it might be easier to leverage `conda` to install the appropriate -`boost` and `casacore` libraries. - -A helpful script below may be of use. 
- -``` - -BRANCH="main" # replace this with appropriate branch or tag -DIR="flint_${BRANCH}" -PYVERSION="3.12" - -mkdir "${DIR}" || exit -cd "${DIR}" || exit - - -git clone git@github.com:tjgalvin/flint.git && \ - cd flint && \ - git checkout "${BRANCH}" - -conda create -y -n "${DIR}" python="${PYVERSION}" && \ - source /home/$(whoami)/.bashrc && \ - conda activate "${DIR}" && \ - conda install -y -c conda-forge boost casacore && \ - PIP_NO_BINARY="python-casacore" pip install -e . -``` - -This may set up an appropriate environment that is compatible with the -containers currently being used. - -### The error - -The error looked something like this: - -``` -An unhandled exception occurred: FiledesIO::read /path/to/data.ms/table.mf - read returned a bad value -``` - -## About - -This `flint` package is trying to get a minimum start-to-finish calibration and -imaging workflow written for `RACS` style ASKAP data. `python` functions are -used to do the work, and `prefect` is used to orchestrate their usage into a -larger pipeline. - -Most of the `python` routines have a CLI that can be used to test them in a -piecewise sense. These entry points are installed as programs available on the -command line. They are listed below with a brief description: - -- `flint_skymodel`: derives a sky-model using a reference catalogue suitable to - perform bandpass calibration against. Note that it is not "science quality" as - it assumes an ideal primary beam response and the reference catalogues do not - incorporate spectral information. -- `flint_aocalibrate`: Performs amplitude and phase calibration against a - sky-model, intended for bandpass calibration, and leverage's Andre Offringa's - `calibrate` program. -- `flint_flagger`: Performs basic flagging on an input measurement set. -- `flint_bandpass`: A small workflow to bandpass calibrate ASKAP measurement - sets that have observed PKS B1934-638. 
-- `flint_ms`: Utility functions related to inspecting and pre-processing an - ASKAP measurement set. -- `flint_wsclean`: Uses `wsclean` to image and clean an ASKAP measurement set - with pre-defined options. -- `flint_gaincal`: Uses the `casa` task `gaincal` and `applysolutions` to - perform self-calibration of an ASKAP measurement set. -- `flint_convol`: Convols a collection of images to a common resolution. -- `flint_yandalinmos`: Will co-add a collection of images of a single field - together, optionally including holography measurements. -- `flint_config`: The beginnings of a configuration-based scheme to specify - options throughout a workflow. -- `flint_aegean`: Simple interface to execute BANE and aegean against a provided - image. These tools are expected to be packaged in a singularity container. -- `flint_validation_plot`: Create a simple, quick look figure that expresses the - key quality statistics of an image. It is intended to be used against a full - continuum field image, but in-principal be used for a per beam image. -- `flint_potato`: Attempt to peel out known sources from a measurement set using - [potatopeel](https://gitlab.com/Sunmish/potato/-/tree/main). Criteria used to - assess which sources to peel is fairly minimumal, and at the time of writing - only the reference set of sources packaged within `flint` are - considered. -`flint_archive`: Operations around archiving and copying final - data products into place. -`flint_catalogue`: Download reference catalogues - that are expected by `flint` - -The following commands use the `prefect` framework to link together individual -tasks together (outlined above) into a single data-processing pipeline. - -- `flint_flow_bandpass_calibrate`: Executes a prefect flow run that will - calibrate a set of ASKAP measurement sets taken during a normal bandpass - observation sequence. 
-- `flint_flow_continuum_pipeline`: Performs bandpass calibration, solution - copying, imaging, self-calibration and mosaicing. -- `flint_flow_subtract_cube_pipeline`: Subtract a continuum model and image the - residual data. - -## Sky-model catalogues - -The `flint_skymodel` command will attempt to create an in-field sky-model for a -particular measurement set using existing source catalogues and an idealised -primary beam response. 'Supported' catalogues are those available through -`flint_catalogue download`. Note this mode has not be thoroughly tested and may -not be out-of-date relative to how the `flint_flow_continuum_pipeline` operates. -In the near future this may be expanded. - -If calibrating a bandpass (i.e. `1934-638`) `flint` will use the packaged source -model. At the moment this is only provided for `calibrate`. - -## About ASKAP Measurement Sets +A pirate themed ASKAP pipeline. -Some of the innovative components of ASKAP and the `yandasoft` package have -resulted in measurement sets that are not immediately inline with external -tools. Measurement sets should first be processed with -[fixms](https://github.com/AlecThomson/FixMS). Be careful -- most (all) `flint` -tasks don't currently do this automatically. Be aware, me hearty. - -## Containers - -At the moment this toy pipeline uses `singularity` containers to use compiled -software that are outside the `python` ecosystem. For the moment there are no -'supported' container packaged with this repository -- sorry! - -In a nutshell, the containers used throughout are passed in as command line -arguments, whose context should be enough to explain what it is expecting. At -the time of writing there are six containers for: - -- calibration: this should contain `calibrate` and `applysolutions`. These are - tools written by Andre Offringa. -- flagging: this should contain `aoflagger`, which is installable via a - `apt install aoflagger` within ubuntu. -- imaging: this should contain `wsclean`. 
This should be at least version 3. At - the moment a modified version is being used (which implements a - `-force-mask-round` option). -- source finding: `aegeam` is used for basic component catalogue creation. It is - not intedended to be used to produce final source catalogues, but to help - construct quick-look data products. A minimal set of `BANE` and `aegean` - options are used. -- source peeling: `potatopeel` is a package that uses `wsclean`, `casa` and a - customisable rule set to peel out troublesome annoying objects. Although it is - a python installable and importable package, there are potential conflicts - with the `python-casacore` modules that `flint` uses. See - [potatopeel's github repository for more information](https://gitlab.com/Sunmish/potato/-/tree/main) -- linear mosaicing: The `linmos` task from `yandasoft` is used to perform linear - mosaicing. Importanting this `linmos` is capable of using the ASKAP primary - beam responses characterised through holography. `yandasoft` docker images - [are available from the CSIRO dockerhub page.](https://hub.docker.com/r/csirocass/askapsoft). -- self-calibration: `casa` is used to perform antenna-based self-calibration. - Specifically the tasks `gaincal`, `applysolutions`, `cvel` and `mstransform` - are used throughout this process. Careful selection of an appropriate CASA - version should be made to keep the `casacore` library in compatible state with - other components. Try the `docker://alecthomson/casa:ks9-5.8.0` image. - -## Configuration based settings - -Most settings within `flint` are stored in immutable option classes, e.g. -`WSCleanOptions`, `GainCalOptions`. Once they such an option class has been -created, any new option values may only be set by creating a new instance. In -such cases there is an appropriate `.with_options` method that might be of use. 
-This 'nothing changes unless explicitly done so' was adopted early as a way to -avoid confusing when moving to a distributed multi-node execution environment. - -The added benefit is that it has defined very clear interfaces into key stages -throughout `flint`s calibration and imaging stages. The `flint_config` program -can be used to create template `yaml` file that lists default values of these -option classes that are expected to be user-tweakable, and provides the ability -to change values of options throughout initial imaging and subsequent rounds of -self-calibration. - -In a nutshell, the _currently_ supported option classes that may be tweaked -through this template method are: - -- `WSCleanOptions` (shorthand `wsclean`) -- `GainCalOptions` (shorthand `gaincal`) -- `MaskingOptions` (shorthand `masking`) -- `ArchiveOptions` (shorthand `archive`) -- `BANEOptions` (shorthand `bane`) -- `AegeanOptions` (shorthand `aegean`) - -All attributed supported by these options may be set in this template format. -Not that these options would have to be retrieved within a particular flow and -passed to the appropriate functions - they are not (currently) automatically -accessed. - -The `defaults` scope sets all of the default values of these classes. The -`initial` scope overrides the default imaging `wsclean` options to be used with -the first round of imaging _before self-calibration_. - -The `selfcal` scope contains a key-value mapping, where an `integer` key relates -the options to that specific round of masking, imaging and calibration options -for that round of self-calibration. Again, options set here override the -corresponding options defined in the `defaults` scope. - -`flint_config` can be used to generate a template file, which can then be -tweaked. The template file uses YAML to define scope and settings. So, use the -YAML standard when modifying this file. There are primitive verification -functions to ensure the modified template file is correctly form. 
- -## CLI Configuration file - -To help manage (and avoid) long CLI calls to configure `flint`, most command -line options may be dumped into a new-line delimited text file which can then be -set as the `--cli-config` option of some workflows. See the `configargparse` -python utility to read up on more on how options may be overridden if specified -in both the text file and CLI call. - -## Validation Plots - -The validation plots that are created are simple and aim to provide a quality -assessment at a quick glance. An RMS image and corresponding source component -catalogue are the base data products derived from the ASKAP data that are -supplied to the routine. - -`flint` requires a set of reference catalogues to be present for some stages of -operation, the obvious being the validation plots described above. In some -computing environments (e.g. HPC) network access to external services are -blocked. To avoid these issues `flint` has a built in utility to download the -reference catalogues it expected from vizier and write them to a specified user -directory. See: - -> `flint_catalogue download --help` - -The parent directory that contains these cataloguues should be provided to the -appropriate tasks when appropriate. - -In the current `flint` package these catalogues (and their expected columns) -are: - -- ICRF - -``` -Catalogue( - survey="ICRF", - file_name="ICRF.fits", - freq=1e9, - ra_col="RAJ2000", - dec_col="DEJ2000", - name_col="ICRF", - flux_col="None", - maj_col="None", - min_col="None", - pa_col="None", - vizier_id="I/323/icrf2", -) -``` - -- NVSS - -``` -Catalogue( - survey="NVSS", - file_name="NVSS.fits", - name_col="NVSS", - freq=1.4e9, - ra_col="RAJ2000", - dec_col="DEJ2000", - flux_col="S1.4", - maj_col="MajAxis", - min_col="MinAxis", - pa_col="PA", - vizier_id="VIII/65/nvss", -) -``` - -- SUMSS +Yarrrr-Harrrr fiddly-dee! 
-``` -Catalogue( - survey="SUMSS", - file_name="SUMSS.fits", - freq=8.43e8, - ra_col="RAJ2000", - dec_col="DEJ2000", - name_col="Mosaic", - flux_col="St", - maj_col="dMajAxis", - min_col="dMinAxis", - pa_col="dPA", - vizier_id="VIII/81B/sumss212", -) -``` +Capn' Flint - Credit: DALLE 3 -- RACS-LOW +## Documentation -``` -Catalogue( - file_name="racs-low.fits", - survey="RACS-LOW", - freq=887.56e6, - ra_col="RAJ2000", - dec_col="DEJ2000", - name_col="GID", - flux_col="Ftot", - maj_col="amaj", - min_col="bmin", - pa_col="PA", - vizier_id="J/other/PASA/38.58/gausscut", -) -``` + -The known filename is used to find the appropriate catalogue and its full path, -and are appropriately named when using the `flint_catalogue download` tool. +Full documentation is provided on [ReadtheDocs](https://readthedocs.io/). ## Contributions diff --git a/_static b/_static new file mode 120000 index 00000000..44822ed5 --- /dev/null +++ b/_static @@ -0,0 +1 @@ +docs/_static \ No newline at end of file diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 00000000..1a6f9a7f --- /dev/null +++ b/docs/README.md @@ -0,0 +1,306 @@ +[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/tjgalvin/flint/main.svg)](https://results.pre-commit.ci/latest/github/tjgalvin/flint/main) +[![codecov](https://codecov.io/github/tjgalvin/flint/graph/badge.svg?token=7ZEKJ78TBZ)](https://codecov.io/github/tjgalvin/flint) + +# flint + +A pirate themed ASKAP pipeline. + +Yarrrr-Harrrr fiddly-dee! + +Capn' Flint - Credit: DALLE 3 + +## Installation + +Provided an appropriate environment installation should be as simple as a +`pip install`. + +However, on some systems there are interactions with `casacore` and building +`python-casacore` appropriately. Issues have been noted when interacting with +large measurement sets across components with different `casacore` versions. +This seems to happen even across container boundaries (i.e. 
different versions +in containers might play a role). The exact cause is not at all understood, but +it appears to be related to the version of `python-casacore`, `numpy` and +whether pre-built wheels are used. + +In practice it might be easier to leverage `conda` to install the appropriate +`boost` and `casacore` libraries. + +A helpful script below may be of use. + +```bash +BRANCH="main" # replace this with appropriate branch or tag +DIR="flint_${BRANCH}" +PYVERSION="3.12" + +mkdir "${DIR}" || exit +cd "${DIR}" || exit + + +git clone git@github.com:tjgalvin/flint.git && \ + cd flint && \ + git checkout "${BRANCH}" + +conda create -y -n "${DIR}" python="${PYVERSION}" && \ + source /home/$(whoami)/.bashrc && \ + conda activate "${DIR}" && \ + conda install -y -c conda-forge boost casacore && \ + PIP_NO_BINARY="python-casacore" pip install -e . +``` + +This may set up an appropriate environment that is compatible with the +containers currently being used. + +## About + +This `flint` package is trying to get a minimum start-to-finish calibration and +imaging workflow written for `RACS` style ASKAP data. `python` functions are +used to do the work, and `prefect` is used to orchestrate their usage into a +larger pipeline. + +Most of the `python` routines have a CLI that can be used to test them in a +piecewise sense. These entry points are installed as programs available on the +command line. They are listed below with a brief description: + +- `flint_skymodel`: Derives a sky-model using a reference catalogue suitable to + perform bandpass calibration against. Note that it is not "science quality" as + it assumes an ideal primary beam response and the reference catalogues do not + incorporate spectral information. +- `flint_aocalibrate`: Performs amplitude and phase calibration against a + sky-model, intended for bandpass calibration, and leverages Andre Offringa's + `calibrate` program. +- `flint_flagger`: Performs basic flagging on an input measurement set.
+- `flint_bandpass`: A small workflow to bandpass calibrate ASKAP measurement + sets that have observed PKS B1934-638. +- `flint_ms`: Utility functions related to inspecting and pre-processing an + ASKAP measurement set. +- `flint_wsclean`: Uses `wsclean` to image and clean an ASKAP measurement set + with pre-defined options. +- `flint_gaincal`: Uses the `casa` task `gaincal` and `applysolutions` to + perform self-calibration of an ASKAP measurement set. +- `flint_convol`: Convolves a collection of images to a common resolution. +- `flint_yandalinmos`: Will co-add a collection of images of a single field + together, optionally including holography measurements. +- `flint_config`: The beginnings of a configuration-based scheme to specify + options throughout a workflow. +- `flint_aegean`: Simple interface to execute BANE and aegean against a provided + image. These tools are expected to be packaged in a singularity container. +- `flint_validation_plot`: Create a simple, quick look figure that expresses the + key quality statistics of an image. It is intended to be used against a full + continuum field image, but in principle can be used for a per beam image. +- `flint_potato`: Attempt to peel out known sources from a measurement set using + [potatopeel](https://gitlab.com/Sunmish/potato/-/tree/main). The criteria used + to assess which sources to peel are fairly minimal, and at the time of writing + only the reference set of sources packaged within `flint` is considered. +- `flint_archive`: Operations around archiving and copying final data products + into place. +- `flint_catalogue`: Download reference catalogues that are expected by `flint`. + +The following commands use the `prefect` framework to link individual +tasks (outlined above) together into a single data-processing pipeline. + +- `flint_flow_bandpass_calibrate`: Executes a prefect flow run that will + calibrate a set of ASKAP measurement sets taken during a normal bandpass + observation sequence.
+- `flint_flow_continuum_pipeline`: Performs bandpass calibration, solution + copying, imaging, self-calibration and mosaicing. +- `flint_flow_subtract_cube_pipeline`: Subtract a continuum model and image the + residual data. + +## About ASKAP Measurement Sets + +Some of the innovative components of ASKAP and the `yandasoft` package have +resulted in measurement sets that are not immediately in line with external +tools. Measurement sets should first be processed with +[fixms](https://github.com/AlecThomson/FixMS). Be careful -- most (all) `flint` +tasks don't currently do this automatically. Be aware, me hearty. + +## Containers + +At the moment this toy pipeline uses `singularity` containers to use compiled +software that is outside the `python` ecosystem. For the moment there are no +'supported' containers packaged with this repository -- sorry! + +In a nutshell, the containers used throughout are passed in as command line +arguments, whose context should be enough to explain what each is expecting. At +the time of writing there are seven containers for: + +- calibration: this should contain `calibrate` and `applysolutions`. These are + tools written by Andre Offringa. +- flagging: this should contain `aoflagger`, which is installable via + `apt install aoflagger` within Ubuntu. +- imaging: this should contain `wsclean`. This should be at least version 3. At + the moment a modified version is being used (which implements a + `-force-mask-round` option). +- source finding: `aegean` is used for basic component catalogue creation. It is + not intended to be used to produce final source catalogues, but to help + construct quick-look data products. A minimal set of `BANE` and `aegean` + options is used. +- source peeling: `potatopeel` is a package that uses `wsclean`, `casa` and a + customisable rule set to peel out troublesome objects.
Although it is + a python installable and importable package, there are potential conflicts + with the `python-casacore` modules that `flint` uses. See + [potatopeel's GitLab repository for more information](https://gitlab.com/Sunmish/potato/-/tree/main). +- linear mosaicing: The `linmos` task from `yandasoft` is used to perform linear + mosaicing. Importantly, this `linmos` is capable of using the ASKAP primary + beam responses characterised through holography. `yandasoft` docker images + [are available from the CSIRO dockerhub page](https://hub.docker.com/r/csirocass/askapsoft). +- self-calibration: `casa` is used to perform antenna-based self-calibration. + Specifically the tasks `gaincal`, `applysolutions`, `cvel` and `mstransform` + are used throughout this process. Careful selection of an appropriate CASA + version should be made to keep the `casacore` library in a compatible state with + other components. Try the `docker://alecthomson/casa:ks9-5.8.0` image. + +## Configuration based settings + +Most settings within `flint` are stored in immutable option classes, e.g. +`WSCleanOptions`, `GainCalOptions`. Once such an option class has been +created, any new option values may only be set by creating a new instance. In +such cases there is an appropriate `.with_options` method that might be of use. +This 'nothing changes unless explicitly done so' approach was adopted early as a way to +avoid confusion when moving to a distributed multi-node execution environment. + +The added benefit is that it has defined very clear interfaces into key stages +throughout `flint`'s calibration and imaging stages. The `flint_config` program +can be used to create a template `yaml` file that lists default values of these +option classes that are expected to be user-tweakable, and provides the ability +to change values of options throughout initial imaging and subsequent rounds of +self-calibration.
+ +In a nutshell, the _currently_ supported option classes that may be tweaked +through this template method are: + +- `WSCleanOptions` (shorthand `wsclean`) +- `GainCalOptions` (shorthand `gaincal`) +- `MaskingOptions` (shorthand `masking`) +- `ArchiveOptions` (shorthand `archive`) +- `BANEOptions` (shorthand `bane`) +- `AegeanOptions` (shorthand `aegean`) + +All attributes supported by these options may be set in this template format. +Note that these options would have to be retrieved within a particular flow and +passed to the appropriate functions - they are not (currently) automatically +accessed. + +The `defaults` scope sets all of the default values of these classes. The +`initial` scope overrides the default imaging `wsclean` options to be used with +the first round of imaging _before self-calibration_. + +The `selfcal` scope contains a key-value mapping, where an `integer` key relates +the masking, imaging and calibration options to that specific round of +self-calibration. Again, options set here override the +corresponding options defined in the `defaults` scope. + +`flint_config` can be used to generate a template file, which can then be +tweaked. The template file uses YAML to define scope and settings. So, use the +YAML standard when modifying this file. There are primitive verification +functions to ensure the modified template file is correctly formed. + +## CLI Configuration file + +To help manage (and avoid) long CLI calls to configure `flint`, most command +line options may be dumped into a new-line delimited text file which can then be +set as the `--cli-config` option of some workflows. See the `configargparse` +python utility to read up more on how options may be overridden if specified +in both the text file and CLI call. + +## Validation Plots + +The validation plots that are created are simple and aim to provide a quality +assessment at a quick glance.
An RMS image and corresponding source component +catalogue are the base data products derived from the ASKAP data that are +supplied to the routine. + +`flint` requires a set of reference catalogues to be present for some stages of +operation, the most obvious being the validation plots described above. In some +computing environments (e.g. HPC) network access to external services is +blocked. To avoid these issues `flint` has a built-in utility to download the +reference catalogues it expects from vizier and write them to a specified user +directory. See: + +> `flint_catalogue download --help` + +The parent directory that contains these catalogues should be provided to the +relevant tasks when required. + +In the current `flint` package these catalogues (and their expected columns) +are: + +- ICRF + +```python +Catalogue( + survey="ICRF", + file_name="ICRF.fits", + freq=1e9, + ra_col="RAJ2000", + dec_col="DEJ2000", + name_col="ICRF", + flux_col="None", + maj_col="None", + min_col="None", + pa_col="None", + vizier_id="I/323/icrf2", +) +``` + +- NVSS + +```python +Catalogue( + survey="NVSS", + file_name="NVSS.fits", + name_col="NVSS", + freq=1.4e9, + ra_col="RAJ2000", + dec_col="DEJ2000", + flux_col="S1.4", + maj_col="MajAxis", + min_col="MinAxis", + pa_col="PA", + vizier_id="VIII/65/nvss", +) +``` + +- SUMSS + +```python +Catalogue( + survey="SUMSS", + file_name="SUMSS.fits", + freq=8.43e8, + ra_col="RAJ2000", + dec_col="DEJ2000", + name_col="Mosaic", + flux_col="St", + maj_col="dMajAxis", + min_col="dMinAxis", + pa_col="dPA", + vizier_id="VIII/81B/sumss212", +) +``` + +- RACS-LOW + +```python +Catalogue( + file_name="racs-low.fits", + survey="RACS-LOW", + freq=887.56e6, + ra_col="RAJ2000", + dec_col="DEJ2000", + name_col="GID", + flux_col="Ftot", + maj_col="amaj", + min_col="bmin", + pa_col="PA", + vizier_id="J/other/PASA/38.58/gausscut", +) +``` + +The known filename is used to find the appropriate catalogue and its full path, +and the files are appropriately named
when using the `flint_catalogue download` tool. + +## Contributions + +Contributions are welcome! Please do submit a pull request or issue if you spot +something you would like to address. diff --git a/docs/logo.jpeg b/docs/_static/logo.jpeg similarity index 100% rename from docs/logo.jpeg rename to docs/_static/logo.jpeg diff --git a/docs/about.md b/docs/about.md new file mode 100644 index 00000000..9989fc34 --- /dev/null +++ b/docs/about.md @@ -0,0 +1,55 @@ +## About + +This `flint` package is trying to get a minimum start-to-finish calibration and +imaging workflow written for `RACS` style ASKAP data. `python` functions are +used to do the work, and `prefect` is used to orchestrate their usage into a +larger pipeline. + +Most of the `python` routines have a CLI that can be used to test them in a +piecewise sense. These entry points are installed as programs available on the +command line. They are listed below with a brief description: + +- `flint_skymodel`: Derives a sky-model using a reference catalogue suitable to + perform bandpass calibration against. Note that it is not "science quality" as + it assumes an ideal primary beam response and the reference catalogues do not + incorporate spectral information. +- `flint_aocalibrate`: Performs amplitude and phase calibration against a + sky-model, intended for bandpass calibration, and leverages Andre Offringa's + `calibrate` program. +- `flint_flagger`: Performs basic flagging on an input measurement set. +- `flint_bandpass`: A small workflow to bandpass calibrate ASKAP measurement + sets that have observed PKS B1934-638. +- `flint_ms`: Utility functions related to inspecting and pre-processing an + ASKAP measurement set. +- `flint_wsclean`: Uses `wsclean` to image and clean an ASKAP measurement set + with pre-defined options. +- `flint_gaincal`: Uses the `casa` task `gaincal` and `applysolutions` to + perform self-calibration of an ASKAP measurement set.
+- `flint_convol`: Convolves a collection of images to a common resolution. +- `flint_yandalinmos`: Will co-add a collection of images of a single field + together, optionally including holography measurements. +- `flint_config`: The beginnings of a configuration-based scheme to specify + options throughout a workflow. +- `flint_aegean`: Simple interface to execute BANE and aegean against a provided + image. These tools are expected to be packaged in a singularity container. +- `flint_validation_plot`: Create a simple, quick look figure that expresses the + key quality statistics of an image. It is intended to be used against a full + continuum field image, but in principle can be used for a per beam image. +- `flint_potato`: Attempt to peel out known sources from a measurement set using + [potatopeel](https://gitlab.com/Sunmish/potato/-/tree/main). The criteria used + to assess which sources to peel are fairly minimal, and at the time of writing + only the reference set of sources packaged within `flint` is considered. +- `flint_archive`: Operations around archiving and copying final data products + into place. +- `flint_catalogue`: Download reference catalogues that are expected by `flint`. + +The following commands use the `prefect` framework to link individual +tasks (outlined above) together into a single data-processing pipeline. + +- `flint_flow_bandpass_calibrate`: Executes a prefect flow run that will + calibrate a set of ASKAP measurement sets taken during a normal bandpass + observation sequence. +- `flint_flow_continuum_pipeline`: Performs bandpass calibration, solution + copying, imaging, self-calibration and mosaicing. +- `flint_flow_subtract_cube_pipeline`: Subtract a continuum model and image the + residual data.
diff --git a/docs/conf.py b/docs/conf.py index 7d724b3d..a639f494 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -7,16 +7,21 @@ # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information from __future__ import annotations +import importlib.metadata + project = "flint" -copyright = "2023, Tim Galvin" +copyright = "2025, Tim Galvin" author = "Tim Galvin" -release = "0.0.1" +version = release = importlib.metadata.version("flint") # -- General configuration --------------------------------------------------- # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration + extensions = [ + "myst_parser", "sphinx.ext.autodoc", + "sphinx_autodoc_typehints", "sphinx.ext.doctest", "sphinx.ext.intersphinx", "sphinx.ext.todo", @@ -26,8 +31,21 @@ "sphinx.ext.viewcode", "sphinx.ext.githubpages", "sphinx.ext.napoleon", + "sphinx_copybutton", + "autoapi.extension", ] +myst_enable_extensions = [ + "colon_fence", +] + +autoapi_type = "python" +autoapi_dirs = ["../flint"] +autoapi_member_order = "groupwise" +autoapi_keep_files = False +autoapi_root = "autoapi" +autoapi_add_toctree_entry = True + # Napoleon settings napoleon_google_docstring = True napoleon_numpy_docstring = True @@ -41,6 +59,7 @@ napoleon_use_param = True napoleon_use_rtype = True +source_suffix = [".rst", ".md"] templates_path = ["_templates"] exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] @@ -48,6 +67,6 @@ # -- Options for HTML output ------------------------------------------------- # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output -html_theme = "alabaster" +html_theme = "furo" # html_theme = "sphinx_rtd_theme" html_static_path = ["_static"] diff --git a/docs/config.md b/docs/config.md new file mode 100644 index 00000000..b0fcdccc --- /dev/null +++ b/docs/config.md @@ -0,0 +1,62 @@ +# Configuration + +## Configuration based settings + +Most settings within `flint` are stored in immutable option classes, e.g. 
+`WSCleanOptions`, `GainCalOptions`. Once such an option class has been
+created, any new option values may only be set by creating a new instance. In
+such cases there is an appropriate `.with_options` method that might be of use.
+This 'nothing changes unless explicitly done so' approach was adopted early to
+avoid confusion when moving to a distributed multi-node execution environment.
+
+The added benefit is that it has defined very clear interfaces into key stages
+throughout `flint`'s calibration and imaging stages. The `flint_config` program
+can be used to create a template `yaml` file that lists default values of these
+option classes that are expected to be user-tweakable, and provides the ability
+to change values of options throughout initial imaging and subsequent rounds of
+self-calibration.
+
+In a nutshell, the _currently_ supported option classes that may be tweaked
+through this template method are:
+
+- `WSCleanOptions` (shorthand `wsclean`)
+- `GainCalOptions` (shorthand `gaincal`)
+- `MaskingOptions` (shorthand `masking`)
+- `ArchiveOptions` (shorthand `archive`)
+- `BANEOptions` (shorthand `bane`)
+- `AegeanOptions` (shorthand `aegean`)
+
+All attributes supported by these options may be set in this template format.
+Note that these options would have to be retrieved within a particular flow and
+passed to the appropriate functions - they are not (currently) automatically
+accessed.
+
+The `defaults` scope sets all of the default values of these classes. The
+`initial` scope overrides the default imaging `wsclean` options to be used with
+the first round of imaging _before self-calibration_.
+
+The `selfcal` scope contains a key-value mapping, where each `integer` key ties
+a set of masking, imaging and calibration options to that specific round of
+self-calibration. Again, options set here override the corresponding options
+defined in the `defaults` scope.
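
The immutable option classes and the scope overrides described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not `flint`'s implementation: `SketchImagingOptions`, its `size` and `niter` fields, the `resolve_round` helper and the nested dictionary layout are all hypothetical. Only the `with_options` name and the `defaults`, `initial` and `selfcal` scopes come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class SketchImagingOptions:
    """Hypothetical stand-in for an immutable option class."""

    size: int = 4096  # hypothetical field
    niter: int = 1000  # hypothetical field

    def with_options(self, **kwargs: Any) -> SketchImagingOptions:
        # Build a new instance; the original is never modified
        return replace(self, **kwargs)


def resolve_round(config: dict[str, Any], round_no: int | None) -> SketchImagingOptions:
    """Layer scope overrides on top of the defaults scope.

    round_no=None selects the (assumed) `initial` scope, an integer
    selects that round of the (assumed) `selfcal` mapping.
    """
    options = SketchImagingOptions(**config.get("defaults", {}).get("wsclean", {}))
    if round_no is None:
        overrides = config.get("initial", {}).get("wsclean", {})
    else:
        overrides = config.get("selfcal", {}).get(round_no, {}).get("wsclean", {})
    return options.with_options(**overrides)


# Assumed layout, mirroring what a parsed YAML template might look like
config = {
    "defaults": {"wsclean": {"size": 8192}},
    "initial": {"wsclean": {"niter": 500}},
    "selfcal": {1: {"wsclean": {"niter": 2000}}},
}

initial = resolve_round(config, None)
round_one = resolve_round(config, 1)
round_two = resolve_round(config, 2)  # no overrides, so defaults apply
```

Because each call returns a fresh instance, options resolved for one round cannot leak into another, which is the property the immutable design is meant to guarantee.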
+
+`flint_config` can be used to generate a template file, which can then be
+tweaked. The template file uses YAML to define scope and settings. So, use the
+YAML standard when modifying this file. There are primitive verification
+functions to ensure the modified template file is correctly formed.
+
+## CLI Configuration file
+
+To help manage (and avoid) long CLI calls to configure `flint`, most command
+line options may be dumped into a new-line delimited text file which can then be
+set as the `--cli-config` option of some workflows. See the `configargparse`
+python utility to read more on how options may be overridden if specified in
+both the text file and CLI call.
+
+## Configuration schema
+
+TODO: Document the config file schema
+
+## Configuration examples
+
+TODO: Add configuration examples
diff --git a/docs/data.md b/docs/data.md
new file mode 100644
index 00000000..941ad3e6
--- /dev/null
+++ b/docs/data.md
@@ -0,0 +1,61 @@
+# Data
+
+## Sky-model catalogues
+
+The `flint_skymodel` command will attempt to create an in-field sky-model for a
+particular measurement set using existing source catalogues and an idealised
+primary beam response. Supported catalogues are those available through
+`flint_catalogue download`. Note this mode has not been thoroughly tested and
+may be out-of-date relative to how the `flint_flow_continuum_pipeline` operates.
+In the near future this may be expanded.
+
+If calibrating a bandpass (i.e. `1934-638`) `flint` will use the packaged source
+model. At the moment this is only provided for `calibrate`.
+
+## About ASKAP Measurement Sets
+
+Some of the innovative components of ASKAP and the `yandasoft` package have
+resulted in measurement sets that are not immediately in line with external
+tools. Measurement sets should first be processed with
+[fixms](https://github.com/AlecThomson/FixMS). Be careful -- most (all) `flint`
+tasks don't currently do this automatically. Be aware, me hearty.
+
+(containers)=
+
+## Containers
+
+At the moment this pipeline uses `singularity` containers to run compiled
+software that is outside the `python` ecosystem.
+
+:::{attention}
+For the moment there are no 'supported' containers packaged within this
+repository -- sorry!
+:::
+
+In a nutshell, the containers used throughout are passed in as command line
+arguments, whose context should be enough to explain what each is expecting. At
+the time of writing there are seven containers for:
+
+- calibration: this should contain `calibrate` and `applysolutions`. These are
+  tools written by Andre Offringa.
+- flagging: this should contain `aoflagger`, which is installable via an
+  `apt install aoflagger` within ubuntu.
+- imaging: this should contain `wsclean`. This should be at least version 3. At
+  the moment a modified version is being used (which implements a
+  `-force-mask-round` option).
+- source finding: `aegean` is used for basic component catalogue creation. It is
+  not intended to be used to produce final source catalogues, but to help
+  construct quick-look data products. A minimal set of `BANE` and `aegean`
+  options is used.
+- source peeling: `potatopeel` is a package that uses `wsclean`, `casa` and a
+  customisable rule set to peel out troublesome objects. Although it is a python
+  installable and importable package, there are potential conflicts with the
+  `python-casacore` modules that `flint` uses. See
+  [potatopeel's GitLab repository for more information](https://gitlab.com/Sunmish/potato/-/tree/main).
+- linear mosaicing: The `linmos` task from `yandasoft` is used to perform linear
+  mosaicing. Importantly, this `linmos` is capable of using the ASKAP primary
+  beam responses characterised through holography. `yandasoft` docker images
+  [are available from the CSIRO dockerhub page](https://hub.docker.com/r/csirocass/askapsoft).
+- self-calibration: `casa` is used to perform antenna-based self-calibration.
+  Specifically the tasks `gaincal`, `applycal`, `cvel` and `mstransform` are
+  used throughout this process. Careful selection of an appropriate CASA version
+  should be made to keep the `casacore` library in a compatible state with other
+  components. Try the `docker://alecthomson/casa:ks9-5.8.0` image.
diff --git a/docs/examples.md b/docs/examples.md
new file mode 100644
index 00000000..5229cecf
--- /dev/null
+++ b/docs/examples.md
@@ -0,0 +1,3 @@
+# Examples
+
+TODO: Add examples for new users.
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 00000000..fe7f3184
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,24 @@
+# Flint
+
+```{include} ../README.md
+:start-after:
+```
+
+## Table of contents
+
+```{toctree}
+:maxdepth: 2
+installation.md
+quickstart.md
+about.md
+data.md
+config.md
+validation.md
+examples.md
+```
+
+## Indices and tables
+
+- {ref}`genindex`
+- {ref}`modindex`
+- {ref}`search`
diff --git a/docs/index.rst b/docs/index.rst
deleted file mode 100644
index 705f3c98..00000000
--- a/docs/index.rst
+++ /dev/null
@@ -1,29 +0,0 @@
-.. flint documentation master file, created by
-   sphinx-quickstart on Sun Jul 16 18:05:01 2023.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root toctree directive.
-
-Welcome to flint's documentation!
-=================================
-
-A pirate themed toy ASKAP-RACS pipeline.
-
-Yarrrr-Harrrr fiddly-dee!
-
-.. image:: logo.jpeg
-  :width: 400
-  :alt: Capn' Flint - Credit: DALLE 3
-
-
-.. toctree::
-   :maxdepth: 2
-   :caption: Contents:
-
-
-
-Indices and tables
-==================
-
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
diff --git a/docs/installation.md b/docs/installation.md
new file mode 100644
index 00000000..5832f46d
--- /dev/null
+++ b/docs/installation.md
@@ -0,0 +1,45 @@
+# Installation
+
+Provided an appropriate environment, installation should be as simple as a
+`pip install`.
+
+However, on some systems there are interactions with `casacore` and building
+`python-casacore` appropriately. Issues have been noted when interacting with
+large measurement sets across components with different `casacore` versions.
+This seems to happen even across container boundaries (i.e. different versions
+in containers might play a role). The exact cause is not at all understood, but
+it appears to be related to the version of `python-casacore`, `numpy` and
+whether pre-built wheels are used.
+
+In practice it might be easier to leverage `conda` to install the appropriate
+`boost` and `casacore` libraries.
+
+A helpful script below may be of use.
+
+```bash
+BRANCH="main" # replace this with appropriate branch or tag
+DIR="flint_${BRANCH}"
+PYVERSION="3.12"
+
+mkdir "${DIR}" || exit
+cd "${DIR}" || exit
+
+
+git clone git@github.com:tjgalvin/flint.git && \
+    cd flint && \
+    git checkout "${BRANCH}"
+
+conda create -y -n "${DIR}" python="${PYVERSION}" && \
+    source /home/$(whoami)/.bashrc && \
+    conda activate "${DIR}" && \
+    conda install -y -c conda-forge boost casacore && \
+    PIP_NO_BINARY="python-casacore" pip install -e .
+```
+
+This may set up an appropriate environment that is compatible with the
+containers currently being used.
+
+:::{attention}
+For the moment there are no 'supported' containers packaged within this
+repository -- sorry!
+:::
+
+See [containers](#containers) for more information.
diff --git a/docs/quickstart.md b/docs/quickstart.md
new file mode 100644
index 00000000..74810363
--- /dev/null
+++ b/docs/quickstart.md
@@ -0,0 +1,3 @@
+# Quickstart
+
+TODO: Add a quickstart guide for new users.
diff --git a/pyproject.toml b/pyproject.toml
index 16069912..2f9b80f8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -68,6 +68,11 @@ ruff = "^0.1.12"
 pytest = "^7.4.0"
 pytest-cov = "*"
 sphinx = "*"
+myst-parser = "*"
+sphinx-autodoc-typehints = "*"
+sphinx-copybutton = "*"
+sphinx-autoapi = "*"
+furo = "*"
 
 [tool.poetry.extras]
 dev = [
@@ -79,6 +84,11 @@ dev = [
     "pytest",
     "pytest-cov",
     "sphinx",
+    "myst-parser",
+    "sphinx-autodoc-typehints",
+    "sphinx-copybutton",
+    "sphinx-autoapi",
+    "furo"
 ]
 
 [tool.poetry.scripts]
diff --git a/validation.md b/validation.md
new file mode 100644
index 00000000..00075ec9
--- /dev/null
+++ b/validation.md
@@ -0,0 +1,96 @@
+# Validation
+
+The validation plots that are created are simple and aim to provide a quality
+assessment at a quick glance. An RMS image and corresponding source component
+catalogue are the base data products derived from the ASKAP data that are
+supplied to the routine.
+
+`flint` requires a set of reference catalogues to be present for some stages of
+operation, the obvious being the validation plots described above. In some
+computing environments (e.g. HPC) network access to external services is
+blocked. To avoid these issues `flint` has a built-in utility to download the
+reference catalogues it expects from vizier and write them to a user-specified
+directory. See:
+
+> `flint_catalogue download --help`
+
+The parent directory that contains these catalogues should be provided to the
+tasks that require them.
+
+In the current `flint` package these catalogues (and their expected columns)
+are:
+
+- ICRF
+
+```python
+Catalogue(
+    survey="ICRF",
+    file_name="ICRF.fits",
+    freq=1e9,
+    ra_col="RAJ2000",
+    dec_col="DEJ2000",
+    name_col="ICRF",
+    flux_col="None",
+    maj_col="None",
+    min_col="None",
+    pa_col="None",
+    vizier_id="I/323/icrf2",
+)
+```
+
+- NVSS
+
+```python
+Catalogue(
+    survey="NVSS",
+    file_name="NVSS.fits",
+    name_col="NVSS",
+    freq=1.4e9,
+    ra_col="RAJ2000",
+    dec_col="DEJ2000",
+    flux_col="S1.4",
+    maj_col="MajAxis",
+    min_col="MinAxis",
+    pa_col="PA",
+    vizier_id="VIII/65/nvss",
+)
+```
+
+- SUMSS
+
+```python
+Catalogue(
+    survey="SUMSS",
+    file_name="SUMSS.fits",
+    freq=8.43e8,
+    ra_col="RAJ2000",
+    dec_col="DEJ2000",
+    name_col="Mosaic",
+    flux_col="St",
+    maj_col="dMajAxis",
+    min_col="dMinAxis",
+    pa_col="dPA",
+    vizier_id="VIII/81B/sumss212",
+)
+```
+
+- RACS-LOW
+
+```python
+Catalogue(
+    file_name="racs-low.fits",
+    survey="RACS-LOW",
+    freq=887.56e6,
+    ra_col="RAJ2000",
+    dec_col="DEJ2000",
+    name_col="GID",
+    flux_col="Ftot",
+    maj_col="amaj",
+    min_col="bmin",
+    pa_col="PA",
+    vizier_id="J/other/PASA/38.58/gausscut",
+)
+```
+
+The known filename is used to find the appropriate catalogue and its full path;
+the files are named appropriately by the `flint_catalogue download` tool.
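
The lookup described in the final paragraph of `validation.md` could look roughly like the sketch below. It is only an illustration: `SketchCatalogue`, `KNOWN_CATALOGUES` and `catalogue_path` are hypothetical names, not part of `flint`. Only the `survey` and `file_name` values are taken from the definitions listed in the diff.

```python
from __future__ import annotations

from pathlib import Path
from typing import NamedTuple


class SketchCatalogue(NamedTuple):
    """Hypothetical, cut-down version of a catalogue definition."""

    survey: str
    file_name: str


# Survey names and file names copied from the definitions listed above
KNOWN_CATALOGUES = {
    "ICRF": SketchCatalogue(survey="ICRF", file_name="ICRF.fits"),
    "NVSS": SketchCatalogue(survey="NVSS", file_name="NVSS.fits"),
    "SUMSS": SketchCatalogue(survey="SUMSS", file_name="SUMSS.fits"),
    "RACS-LOW": SketchCatalogue(survey="RACS-LOW", file_name="racs-low.fits"),
}


def catalogue_path(survey: str, reference_directory: Path) -> Path:
    """Join the parent directory with the known file name of a survey.

    Raises KeyError if the survey is not one of the known catalogues.
    """
    catalogue = KNOWN_CATALOGUES[survey.upper()]
    return reference_directory / catalogue.file_name


nvss_path = catalogue_path("nvss", Path("/data/reference_catalogues"))
```

Keying the lookup on the survey name means callers only ever pass the parent directory, which matches the advice to hand that directory to the tasks that need it.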