Feature MIVOT (Model Instance in VOTable) #497
I have left some generic comments that may help going forward with this PR.
Besides that, the big-picture comment would be that I feel there are a lot of methods/properties/attributes. Are all of those necessary for the users, or could the code be significantly simplified?
pyvo/mivot/utils/xml_utils.py
Outdated
@author: laurentmichel
"""
from lxml import etree
This is only a test file, so it should not matter that much, but I recall that the question of whether to use lxml or defusedxml was raised in connection with this mivot work.
Also, if either of the new dependencies is being used, it needs to be added properly as a dependency (whether it will be a mandatory or an optional one is open for discussion, but at least it has to be added as a test dependency).
Unfortunately, lxml is used by the model_viewer module, which is unavoidable. So whatever parser we use, we will have to add a new dependency. See the Slack thread.
There is a long discussion here about lxml vs defusedxml.
We have replaced the lxml XPath queries with code based on the built-in xml parser, so we do not need the lxml dependency anymore.
Unfortunately, the Python xml parser is not safe either. It must be replaced with defusedxml if possible.
Our code has been set up to work with defusedxml if available, and to keep working with the built-in xml parser otherwise.
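The optional-dependency pattern described in this comment can be sketched as follows. This is only an illustration and not the code from the PR: the alias `ET`, the flag `HAS_DEFUSEDXML`, and the helper `parse_mivot_snippet` are names made up for the example.

```python
# Prefer the hardened parser when it is installed; otherwise fall back
# to the standard library parser, which is not safe against malicious XML.
try:
    from defusedxml import ElementTree as ET
    HAS_DEFUSEDXML = True
except ImportError:
    from xml.etree import ElementTree as ET
    HAS_DEFUSEDXML = False


def parse_mivot_snippet(xml_text):
    """Parse an XML string with whichever parser was selected above (hypothetical helper)."""
    return ET.fromstring(xml_text)


root = parse_mivot_snippet('<INSTANCE dmtype="coords:SpaceSys"/>')
print(root.tag, root.get("dmtype"), "defusedxml" if HAS_DEFUSEDXML else "stdlib")
```

Both branches return standard `xml.etree` elements, so the calling code does not need to know which parser was used.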
Thank you for the review of this big code chunk.
We're well aware that we're arriving with a code package that far exceeds the usual PR size.
Then add a |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #497      +/-   ##
==========================================
+ Coverage   80.38%   81.24%   +0.86%
==========================================
  Files          52       69      +17
  Lines        6189     7129     +940
==========================================
+ Hits         4975     5792     +817
- Misses       1214     1337     +123
☔ View full report in Codecov by Sentry.
After some testing done with the Vizier people, using the service they have deployed, it turns out that a few things must be fixed:
In addition, a documentation page must be written with code examples.
Done (fork CI passed)
@astropy/coordinators - please do add the contributors from this PR to the org, so CI will run here. And it would be nice to have a better, more automated way to do this.
Sure, but there are a lot of conversations to comb through here, can you please give me the usernames? Also, such requests can also be made via https://github.com/astropy/astropy-project/blob/main/.github/ISSUE_TEMPLATE/github-admin.yaml in the future. Thanks!
p.s. We tried to automate but it's broken. Maybe @mwcraig would have time to fix it at some point.
Aren't they in the commits list? (and sorry, I don't know where/how to add them to the org myself)
There are 103 commits. I cannot quickly figure it out without combing through them.
I think @somilia is the only one that needs to be added -- it is the fact that the person who opened the PR isn't part of the org that causes the issue IIRC.
Looks like @lmichel is the only other committer...
Thanks! I added both to
e8114d2
to
ec38dfa
Hello,
Thank you for adding us.
I still have a "3 workflows awaiting approval" notice on #497
Regards
LM
Ah, those approvals. That is controlled by GitHub, not the org. As a rule to deter spammers, GitHub says workflows do not automatically run unless you have a merged commit in the repo already. https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks By default, all first-time contributors require approval to run workflows.
To make things easier for everyone (e.g., not having to wait around for such approvals in this PR), you can run the test suite locally instead if you are unsure. And hopefully when this gets merged, your next PR will no longer require such approvals. Good luck! p.s.
For which the workaround suggested by GH was to add the contributor to the org. Maybe they changed policies since that discussion a while back. ps: yes, the size is one of the reasons why it takes this long to get the reviews/etc going.
On 8 Feb 2024 at 19:05, Brigitta Sipőcz ***@***.***> wrote:
By default, all first-time contributors require approval to run workflows.
For which the workaround suggested by GH was to add the contributor to the org. Maybe they changed policies since that discussion a while back.
Anyway, no more commits until the review.
ps: yes, the size is one of the reasons why it takes this long to get the reviews/etc going
I'm aware of the size, but if I split the PR, I would open a second one right after, so there is no gain.
Well, I did add lmichel to astropy-contributors but it didn't seem to help. I dunno if the addition has to be done before this PR was opened or the workaround assumption was wrong. 🤷 I guess another workaround is for lmichel to do a very trivial fix somewhere else in this repo and get that PR merged first. (Unproven.)
That also worked before. But as Matt said above, it might need to be Somia who does this, as they opened the PR (though I would be surprised if one of us maintainers pushing to it at that commit wouldn't trigger the CI). Anyway, at this point I would say wait for the review, and then we try to wrap it up as smoothly as possible. No point in trying to split it up into multiple PRs at this point, etc. And a rebase and some targeted squashing will clean up the history significantly, too, but I would wait with that until after the content review, too.
FWIW I also added somilia to the same astropy-contributors team at the same time as lmichel. Anyways, sorry I couldn't help. If you need to rerun CI, feel free to ping me. I am usually responsive when I am working (USA New York time).
They both contributed to the big feature PR to core in astropy/astropy#15390, so maybe some policy to automatically add people to the org who contribute significantly would have prevented this issue. (Note: I do not advocate for inviting everyone who adds a single-character doc fix to the org, but here we are talking about a significant enhancement (to core astropy) that has a follow-up (this PR) that runs into technical gates, as people are automatically put into newcomer bins while they are not at all newcomers to the org.)
epoc_propagation removed (postponed for a next PR) and various style fixes
Super tiny comments after the rebase. I'll push a fix for these along with adding back the changelog entry.
pyvo/conftest.py
Outdated
try:
    PYTEST_HEADER_MODULES['defusedxml'] = 'defusedxml'
except (NameError, KeyError):
    pass
No need for try/except
- try:
-     PYTEST_HEADER_MODULES['defusedxml'] = 'defusedxml'
- except (NameError, KeyError):
-     pass
+ PYTEST_HEADER_MODULES['defusedxml'] = 'defusedxml'
if __name__ == '__main__':
    pytest.main()
No need for these in any of the files
CI failure is unrelated and I'll fix it separately.
class. Comments and documentation have been updated. Public methods have been renamed to make the API easier to understand.
the coverage level - add some tests
does not support all MIVOT features and which ones are not supported
…passing volint 3) re-wording 4) spurious files removed 5) remove unused stuff from vocabulary
…test file renamed
…3) replace NotImplementedException with built-in error 4) logging message to debug level 5) type corrected 6) Mivot_Viewer (_)instance renamed as dm_instance 7) docstrings 8) similar Xpath methods refactored 9) table_name renamed 10) docstring 11) Replace @Property when get_ methods 12) MivotViewer imported from mivot._init__ 13) 3 exception classes 14) add a __repr__ for MivotInstance 14) docstring completed 15) Activate MIVOT feature at package import 16) json.load documentation 17) tests against MivotException types and messages 18) Use @pytest.mark.skipif 19) Astropy version checked at module loading 20) MivotViewer automatically select the resource 21) Class JSONEncoder renamed 22) Add a slim mode to the class dict generator 23) Documentation 24) code style 25) doc highlight on dict serialization 26) doc string indentation fixed
+ pm + parallax + Rv + errors + correlations)
We'll hold off on merging pending a final update to CHANGES.rst.
Mentions of MANGO and EpochPropagation removed since these features have been postponed.
CI failure is unrelated
Processing VO Model Annotations
Introduction
Model Instances in VOTables (MIVOT) defines a syntax for mapping VOTable data to any model serialized in VO-DML (Virtual Observatory Data Modeling Language).
This annotation schema acts as a bridge between data and models. It associates both column/parameter metadata and VOTable data with data model elements (such as classes, attributes, types, etc.). It also complements VOTable data or metadata that may be missing from the table, e.g., coordinate system descriptions or curation tracing.
The data model elements are organized in an independent annotation block complying with the MIVOT XML schema, which is added as an extra resource above the TABLE element. The MIVOT syntax allows a data structure to be described as a hierarchy of classes. It can also represent relationships and compositions between them. Furthermore, it can construct data model objects by aggregating instances from different tables of the VOTable.
Usage
The API provides different levels of model views on the last data row read. These levels are described below.
The lowest levels are model agnostic.
They provide tools to browse dynamically generated model instances. Understanding the model elements is the responsibility of the end user.
The highest level (4) is based on the MANGO draft model. It was designed to address the EpochPropagation use case raised at the 2023 South Spring Interop.
The end-user API returns a model view on the last data row read; this usage corresponds to levels 3 and 4 described below.
You can access each value of the model object, e.g.:
>>> print(row_view.Coordinate_coosys.PhysicalCoordSys_frame.spaceRefFrame.value)
ICRS
The model view is a dynamically generated Python object whose field names are derived from the dmroles of the MIVOT elements. There is no checking against the model structure at this level.
In this pull request, JOIN and dynamic references are not implemented. We stay focused on simpler patterns: the MIVOT block maps a single data table, together with the coordinate systems in the GLOBALS.
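As a rough illustration of how field names could be derived from dmroles, here is a minimal sketch. It is not the PR's implementation: the naming rule in `field_name` is an assumption inferred only from the `row_view` example above, and `build_view` and the sample `row` mapping are hypothetical.

```python
from types import SimpleNamespace


def field_name(dmrole, is_leaf):
    """Hypothetical naming rule inferred from the row_view example."""
    name = dmrole.split(":", 1)[-1]        # drop the model prefix, e.g. "coords:"
    if is_leaf:
        return name.rsplit(".", 1)[-1]     # keep only the role part for leaf attributes
    return name.replace(".", "_")          # "Coordinate.coosys" becomes "Coordinate_coosys"


def build_view(node):
    """Turn a nested {dmrole: ...} mapping into an attribute-style object."""
    view = SimpleNamespace()
    for dmrole, child in node.items():
        if "value" in child:
            # Leaf attribute: expose its properties (dmtype, value, ...) as fields.
            setattr(view, field_name(dmrole, is_leaf=True), SimpleNamespace(**child))
        else:
            # Nested instance: recurse.
            setattr(view, field_name(dmrole, is_leaf=False), build_view(child))
    return view


# Illustrative annotation content for one row (not taken from a real VOTable).
row = {
    "coords:Coordinate.coosys": {
        "coords:PhysicalCoordSys.frame": {
            "coords:SpaceFrame.spaceRefFrame": {"dmtype": "ivoa:string", "value": "ICRS"},
        },
    },
}

row_view = build_view(row)
print(row_view.Coordinate_coosys.PhysicalCoordSys_frame.spaceRefFrame.value)
```

Because the object is built purely from the roles found in the annotation, nothing here validates the structure against a model, which matches the "no checking" caveat above.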
PyVO Implementation
The implementation relies on Astropy's modules for reading and writing annotations (PR#15390), available from astropy 6.0, which make it possible to get MIVOT blocks from VOTables and to set them into VOTables. We use this new Astropy feature to retrieve the MIVOT block.
This implementation is built in 4 levels, reflecting the level of abstraction relative to the XML block.
Level 1:
ModelViewerLevel1
Provide the MIVOT block as it is in the VOTable: no references are resolved. The MIVOT block is provided as an XML tree.
Level 2:
ModelViewerLevel2
Provide access to an XML tree whose structure matches the model view of the current row. The internal references have been resolved (by the _get_model_view() function of ModelViewerLevel1). The attribute values have been set with the actual data values. This XML element is intended to be used as a basis for building any objects. The level 2 output can be browsed using XPath queries, allowing users to retrieve MIVOT elements by their @dmrole or @dmtype. At this level, the MIVOT block must still be handled as an XML element.
Level 3:
ModelViewerLevel3
ModelViewerLevel3 generates, from the level 1 output, a nested dictionary representing the entire XML INSTANCE with its hierarchy.
Level 4:
MivotClass
From this dictionary, we build a ~pyvo.mivot.viewer.mivot_class.MivotClass object, which is a dictionary containing only the essential information used to process data. MivotClass basically stores all XML objects in its attribute dictionary __dict__.
Level stacking
The more levels are stacked, the more elements are hidden:
(e.g. get_attribute_by_role(coords:LonLatPoint.longitude))
(e.g. EpochPosition.longitude.value)
(e.g. EpochPosition.get_sky_coord())
Mivot package content:
Utils sub-package
This package contains modules that make it easier to handle XML elements and dictionaries, as well as the logger setup.
Seeker sub-package
AnnotationSeeker provides a set of tools for extracting mapping sub-blocks to retrieve XML elements.
RessourceSeeker provides a set of getters for tables.
TableIterator is a simple wrapper that iterates over table rows.
Feature sub-package
This package contains features such as StaticReferenceResolver, which is used to resolve references (replacing REFERENCE elements with the referenced objects, set with the roles of the REFERENCEs). Future features to be added to this sub-package include DynamicReference and Join.
Exception sub-package
This package contains exception classes related to MIVOT.
Viewer sub-package
This package contains all the levels of ModelViewer described above, as well as the MivotClass file.
Test strategy:
The test module contains one pytest file per Python module, and the datasets used for the tests are located in the 'test/data' directory. The tests check for values, intermediate objects (dictionaries), errors, and thrown exceptions.
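To make the three kinds of checks concrete (values, intermediate dictionaries, raised exceptions), here is a self-contained sketch in the spirit of such a test. It is not taken from the PR's test suite: the XML is a simplified, namespace-free stand-in for a resolved INSTANCE, the role names are illustrative, and `get_attribute_by_role`, `to_dict`, and `MissingRoleError` are hypothetical helpers written for this example. Plain `assert` statements are used so that it runs without pytest.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a resolved MIVOT INSTANCE (illustrative only).
XML = """
<INSTANCE dmtype="mango:EpochPosition">
  <ATTRIBUTE dmrole="mango:EpochPosition.longitude" dmtype="ivoa:RealQuantity" unit="deg" value="10.5"/>
  <ATTRIBUTE dmrole="mango:EpochPosition.latitude" dmtype="ivoa:RealQuantity" unit="deg" value="-3.2"/>
  <INSTANCE dmrole="mango:EpochPosition.coordSys" dmtype="coords:SpaceSys">
    <ATTRIBUTE dmrole="coords:SpaceFrame.spaceRefFrame" dmtype="ivoa:string" value="ICRS"/>
  </INSTANCE>
</INSTANCE>
"""


class MissingRoleError(KeyError):
    """Hypothetical error raised when no element carries the requested dmrole."""


def get_attribute_by_role(root, dmrole):
    """Return the first ATTRIBUTE element with the given dmrole, using an XPath query."""
    found = root.find(f".//ATTRIBUTE[@dmrole='{dmrole}']")
    if found is None:
        raise MissingRoleError(dmrole)
    return found


def to_dict(instance):
    """Convert an INSTANCE element into a nested dictionary keyed by dmrole."""
    result = {"dmtype": instance.get("dmtype")}
    for child in instance:
        if child.tag == "ATTRIBUTE":
            result[child.get("dmrole")] = {
                key: val for key, val in child.attrib.items() if key != "dmrole"
            }
        elif child.tag == "INSTANCE":
            result[child.get("dmrole")] = to_dict(child)
    return result


root = ET.fromstring(XML)

# 1) Value check on an element retrieved by its role.
assert get_attribute_by_role(root, "mango:EpochPosition.longitude").get("value") == "10.5"

# 2) Check on the intermediate dictionary.
as_dict = to_dict(root)
assert as_dict["mango:EpochPosition.coordSys"]["coords:SpaceFrame.spaceRefFrame"]["value"] == "ICRS"

# 3) Check that an unknown role raises the expected exception.
try:
    get_attribute_by_role(root, "mango:EpochPosition.parallax")
except MissingRoleError:
    pass
else:
    raise AssertionError("expected MissingRoleError")
```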
EDIT:
A few changes have been made recently:
The get_next_row_view() function now moves to the next row, returning the MivotClass with the new data row.
astropy.coordinates.sky_coordinate.SkyCoord containing only the data available in the row.