DM-45263: Add new tap_schema module #90

Merged 31 commits on Oct 16, 2024
c1a366c
Add function to find the Table of a Column object
JeremyMcCormick Jul 18, 2024
41980b0
Add method on Schema to find an object by ID and type
JeremyMcCormick Jul 19, 2024
d973f2f
Add validator for arraysize to Column
JeremyMcCormick Sep 13, 2024
96e7b58
Make a few improvements to connection handling and statement execution
JeremyMcCormick Sep 3, 2024
d3fd81b
Add methods for creating a Schema from a resource or stream
JeremyMcCormick Sep 3, 2024
6db4915
Add several exception types to sphinx nitpick_ignore
JeremyMcCormick Sep 11, 2024
5fedbf1
Add tests of Schema utility methods
JeremyMcCormick Sep 4, 2024
94e9c47
Add tests.utils module with test utility functions
JeremyMcCormick Sep 3, 2024
03e6c32
Add lsst-resources dependency
JeremyMcCormick Aug 30, 2024
dfac8d0
Add YAML file representing the standard TAP_SCHEMA tables
JeremyMcCormick Aug 21, 2024
2a5f465
Include YAML files in the schemas dir with project packaging
JeremyMcCormick Aug 15, 2024
82e7980
Add initial version of tap_schema module
JeremyMcCormick Aug 14, 2024
d225f1d
Add tap_schema module to API documentation
JeremyMcCormick Sep 9, 2024
743347c
Add a YAML file for testing nonstandard TAP_SCHEMA names
JeremyMcCormick Sep 3, 2024
c18d7bf
Add YAML file with simple schema for testing TAP_SCHEMA
JeremyMcCormick Aug 14, 2024
4b850df
Add tests of the tap_schema module
JeremyMcCormick Aug 14, 2024
d9eb40a
Add tests of the tap_schema module within a Postgres environment
JeremyMcCormick Aug 14, 2024
ece2c67
Add load-tap-schema command to cli
JeremyMcCormick Aug 21, 2024
db66635
Add Makefile target for installing dependencies from requirements.txt
JeremyMcCormick Aug 30, 2024
f2c37c5
Add LSST pipelines to intersphinx projects
JeremyMcCormick Sep 3, 2024
9661d84
Add tests.utils to API documentation
JeremyMcCormick Sep 3, 2024
e782674
Print the type of the engine in error message if it is not recognized
JeremyMcCormick Sep 4, 2024
07894e1
Remove usage of context manager for connection
JeremyMcCormick Sep 4, 2024
3bd3c4c
Add cli test of load-tap-schema command
JeremyMcCormick Sep 4, 2024
8358086
Remove trailing slashes from intersphinx links
JeremyMcCormick Sep 4, 2024
a33db35
Add a few utility functions for checking the database engine
JeremyMcCormick Sep 23, 2024
f2e4db8
Fix typing of find_object_by_id function
JeremyMcCormick Oct 3, 2024
d44cd58
Add news fragment
JeremyMcCormick Oct 3, 2024
42c8317
Improve the validation function which sets default arraysize values
JeremyMcCormick Oct 3, 2024
9456140
Remove some now unnecessary handling of arraysize from tap module
JeremyMcCormick Oct 3, 2024
d49165d
Correct the types table record for text so it maps to VOTable char
JeremyMcCormick Oct 4, 2024
3 changes: 3 additions & 0 deletions Makefile
@@ -20,6 +20,9 @@ print_target:
build:
@uv pip install --force-reinstall --no-deps -e .

deps:
@uv pip install --upgrade -r requirements.txt

docs:
@rm -rf docs/dev/internals docs/_build
@tox -e docs
3 changes: 3 additions & 0 deletions docs/changes/DM-45263.feature.rst
@@ -0,0 +1,3 @@
Added a new ``tap_schema`` module designed to deprecate and eventually replace the ``tap`` module.
This module provides utilities for translating a Felis schema into a TAP_SCHEMA representation.
The command ``felis load-tap-schema`` can be used to activate this functionality.
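
To make "translating a schema into a TAP_SCHEMA representation" concrete, here is a minimal standard-library sketch. It is not the `felis.tap_schema` implementation: the input dictionary, the `to_tap_schema_rows` helper, and the three cut-down tables are assumptions made for illustration. Only the table and column names (`schemas`, `tables`, `columns`, `schema_name`, `table_name`, `column_name`, `datatype`) follow the standard TAP_SCHEMA layout.

```python
import sqlite3

# Hypothetical, heavily simplified schema; a real Felis schema has more fields.
schema = {
    "name": "demo",
    "description": "Example schema",
    "tables": [
        {
            "name": "stars",
            "description": "Star catalog",
            "columns": [
                {"name": "id", "datatype": "long", "description": "Primary key"},
                {"name": "ra", "datatype": "double", "description": "Right ascension"},
            ],
        }
    ],
}


def to_tap_schema_rows(schema: dict) -> dict:
    """Flatten a nested schema into rows for three TAP_SCHEMA-like tables."""
    rows = {"schemas": [], "tables": [], "columns": []}
    rows["schemas"].append((schema["name"], schema.get("description")))
    for table in schema["tables"]:
        # TAP_SCHEMA conventionally stores schema-qualified table names.
        qualified = f"{schema['name']}.{table['name']}"
        rows["tables"].append((schema["name"], qualified, "table", table.get("description")))
        for column in table["columns"]:
            rows["columns"].append(
                (qualified, column["name"], column["datatype"], column.get("description"))
            )
    return rows


# Cut-down stand-ins for the TAP_SCHEMA tables, created in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE schemas (schema_name TEXT, description TEXT);
    CREATE TABLE tables (schema_name TEXT, table_name TEXT, table_type TEXT, description TEXT);
    CREATE TABLE columns (table_name TEXT, column_name TEXT, datatype TEXT, description TEXT);
    """
)

rows = to_tap_schema_rows(schema)
conn.executemany("INSERT INTO schemas VALUES (?, ?)", rows["schemas"])
conn.executemany("INSERT INTO tables VALUES (?, ?, ?, ?)", rows["tables"])
conn.executemany("INSERT INTO columns VALUES (?, ?, ?, ?)", rows["columns"])

column_names = [r[0] for r in conn.execute("SELECT column_name FROM columns ORDER BY column_name")]
```

The nested schema becomes one row per schema, table, and column, which is the shape a TAP service reads its metadata from.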
26 changes: 17 additions & 9 deletions docs/dev/internals.rst
@@ -7,33 +7,41 @@ Python API
.. automodapi:: felis.datamodel
:include-all-objects:

.. automodapi:: felis.metadata
.. automodapi:: felis.db.dialects
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.tap
.. automodapi:: felis.db.sqltypes
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.types
:include-all-objects:

.. automodapi:: felis.db.dialects
.. automodapi:: felis.db.utils
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.db.sqltypes
.. automodapi:: felis.db.variants
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.db.utils
.. automodapi:: felis.metadata
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.db.variants
.. automodapi:: felis.tap
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.tap_schema
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.tests.postgresql
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.tests.utils
:include-all-objects:
:no-inheritance-diagram:

.. automodapi:: felis.types
:include-all-objects:
7 changes: 5 additions & 2 deletions docs/documenteer.toml
@@ -21,6 +21,8 @@ nitpick_ignore = [
["py:class", "sqlalchemy.orm.decl_api.Base"],
["py:class", "sqlalchemy.engine.mock.MockConnection"],
["py:class", "pydantic.main.BaseModel"],
["py:exc", "pydantic.ValidationError"],
["py:exc", "yaml.YAMLError"]
]
nitpick_ignore_regex = [
# Bug in autodoc_pydantic.
@@ -29,5 +31,6 @@ nitpick_ignore_regex = [
python_api_dir = "dev/internals"

[sphinx.intersphinx.projects]
python = "https://docs.python.org/3/"
sqlalchemy = "https://docs.sqlalchemy.org/en/latest/"
python = "https://docs.python.org/3"
sqlalchemy = "https://docs.sqlalchemy.org/en/latest"
lsst = "https://pipelines.lsst.io/v/weekly"
2 changes: 1 addition & 1 deletion docs/user-guide/datatypes.rst
@@ -74,7 +74,7 @@ The following table shows these mapping:
+-----------+---------------+----------+------------------+--------------+
| unicode | NVARCHAR | NVARCHAR | VARCHAR | unicodeChar |
+-----------+---------------+----------+------------------+--------------+
| text | TEXT | LONGTEXT | TEXT | uncodeChar |
| text | TEXT | LONGTEXT | TEXT | char |
+-----------+---------------+----------+------------------+--------------+
| binary | BLOB | LONGBLOB | BYTEA | unsignedByte |
+-----------+---------------+----------+------------------+--------------+
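
As a rough illustration of the corrected mapping, the sketch below looks up a VOTable datatype from a type name. It covers only the three rows visible in this hunk. The `FELIS_TO_VOTABLE` dictionary and the `votable_datatype` helper are hypothetical names, not part of Felis.

```python
# Only the rows visible in this diff hunk; the full table has more entries.
FELIS_TO_VOTABLE = {
    "unicode": "unicodeChar",
    "text": "char",  # this PR replaces the misspelled "uncodeChar" entry with "char"
    "binary": "unsignedByte",
}


def votable_datatype(felis_type: str) -> str:
    """Return the VOTable datatype for a type name, or raise if it is unknown."""
    try:
        return FELIS_TO_VOTABLE[felis_type]
    except KeyError:
        raise ValueError(f"No VOTable mapping known for type {felis_type!r}") from None
```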
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -26,7 +26,8 @@ dependencies = [
"click >= 7",
"pyyaml >= 6",
"pydantic >= 2, < 3",
"lsst-utils"
"lsst-utils",
"lsst-resources"
]
requires-python = ">=3.11.0"
dynamic = ["version"]
@@ -55,7 +56,7 @@ zip-safe = true
license-files = ["COPYRIGHT", "LICENSE"]

[tool.setuptools.package-data]
"felis" = ["py.typed"]
"felis" = ["py.typed", "schemas/*.yaml"]

[tool.setuptools.dynamic]
version = { attr = "lsst_versions.get_lsst_version" }
97 changes: 83 additions & 14 deletions python/felis/cli.py
@@ -23,22 +23,21 @@

from __future__ import annotations

import io
import logging
from collections.abc import Iterable
from typing import IO

import click
import yaml
from pydantic import ValidationError
from sqlalchemy.engine import Engine, create_engine, make_url
from sqlalchemy.engine.mock import MockConnection
from sqlalchemy.engine.mock import MockConnection, create_mock_engine

from . import __version__
from .datamodel import Schema
from .db.utils import DatabaseContext
from .db.utils import DatabaseContext, is_mock_url
from .metadata import MetaDataBuilder
from .tap import Tap11Base, TapLoadingVisitor, init_tables
from .tap_schema import DataLoader, TableManager

__all__ = ["cli"]

@@ -107,7 +106,7 @@
dry_run: bool,
output_file: IO[str] | None,
ignore_constraints: bool,
file: IO,
file: IO[str],
) -> None:
"""Create database objects from the Felis file.

@@ -133,8 +132,7 @@
Felis file to read.
"""
try:
yaml_data = yaml.safe_load(file)
schema = Schema.model_validate(yaml_data, context={"id_generation": ctx.obj["id_generation"]})
schema = Schema.from_stream(file, context={"id_generation": ctx.obj["id_generation"]})
url = make_url(engine_url)
if schema_name:
logger.info(f"Overriding schema name with: {schema_name}")
@@ -261,7 +259,7 @@
tap_keys_table: str,
tap_key_columns_table: str,
tap_schema_index: int,
file: io.TextIOBase,
file: IO[str],
) -> None:
"""Load TAP metadata from a Felis file.

@@ -304,8 +302,7 @@
The data will be loaded into the TAP_SCHEMA from the engine URL. The
tables must have already been initialized or an error will occur.
"""
yaml_data = yaml.load(file, Loader=yaml.SafeLoader)
schema = Schema.model_validate(yaml_data)
schema = Schema.from_stream(file)

tap_tables = init_tables(
tap_schema_name,
@@ -345,6 +342,79 @@
tap_visitor.visit_schema(schema)


@cli.command("load-tap-schema", help="Load metadata from a Felis file into a TAP_SCHEMA database")
@click.option("--engine-url", envvar="FELIS_ENGINE_URL", help="SQLAlchemy Engine URL")
@click.option("--tap-schema-name", help="Name of the TAP_SCHEMA schema in the database")
@click.option(
    "--tap-tables-postfix", help="Postfix which is applied to standard TAP_SCHEMA table names", default=""
)
@click.option("--tap-schema-index", type=int, help="TAP_SCHEMA index of the schema in this environment")
@click.option("--dry-run", is_flag=True, help="Execute dry run only. Does not insert any data.")
@click.option("--echo", is_flag=True, help="Print out the generated insert statements to stdout")
@click.option("--output-file", type=click.Path(), help="Write SQL commands to a file")
@click.argument("file", type=click.File())
@click.pass_context
def load_tap_schema(
    ctx: click.Context,
    engine_url: str,
    tap_schema_name: str,
    tap_tables_postfix: str,
    tap_schema_index: int,
    dry_run: bool,
    echo: bool,
    output_file: str | None,
    file: IO[str],
) -> None:
    """Load TAP metadata from a Felis file.

    Parameters
    ----------
    engine_url
        SQLAlchemy Engine URL.
    tap_tables_postfix
        Postfix which is applied to standard TAP_SCHEMA table names.
    tap_schema_index
        TAP_SCHEMA index of the schema in this environment.
    dry_run
        Execute dry run only. Does not insert any data.
    echo
        Print out the generated insert statements to stdout.
    output_file
        Output file for writing generated SQL.
    file
        Felis file to read.

    Notes
    -----
    The TAP_SCHEMA database must already exist or the command will fail. This
    command will not initialize the TAP_SCHEMA tables.
    """
    url = make_url(engine_url)
    engine: Engine | MockConnection
    if dry_run or is_mock_url(url):
        engine = create_mock_engine(url, executor=None)
    else:
        engine = create_engine(engine_url)
    mgr = TableManager(
        engine=engine,
        apply_schema_to_metadata=False if engine.dialect.name == "sqlite" else True,
        schema_name=tap_schema_name,
        table_name_postfix=tap_tables_postfix,
    )

    schema = Schema.from_stream(file, context={"id_generation": ctx.obj["id_generation"]})

    DataLoader(
        schema,
        mgr,
        engine,
        tap_schema_index=tap_schema_index,
        dry_run=dry_run,
        print_sql=echo,
        output_path=output_file,
    ).load()
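
The `--dry-run`, `--echo`, and `--output-file` options suggest a loader that can record generated SQL without executing it. The standard-library sketch below shows one way such routing could work. It is an assumption for illustration only: `StatementSink` is a hypothetical class, and the diff does not show how `DataLoader` handles these options internally.

```python
import io
from dataclasses import dataclass, field
from typing import IO, Callable, List, Optional


@dataclass
class StatementSink:
    """Hypothetical helper: send SQL to an executor, to text outputs, or both."""

    execute: Callable[[str], None]      # called for each statement unless dry_run is set
    dry_run: bool = False               # when True, nothing is executed
    print_sql: bool = False             # when True, echo each statement to stdout
    output: Optional[IO[str]] = None    # optional stream that receives each statement
    seen: List[str] = field(default_factory=list)

    def emit(self, statement: str) -> None:
        """Record the statement, write it to any outputs, then execute if allowed."""
        self.seen.append(statement)
        if self.print_sql:
            print(statement)
        if self.output is not None:
            self.output.write(statement + ";\n")
        if not self.dry_run:
            self.execute(statement)


# Usage: in a dry run the statement is written out but never executed.
executed: List[str] = []
buffer = io.StringIO()
sink = StatementSink(execute=executed.append, dry_run=True, output=buffer)
sink.emit("INSERT INTO tap_schema.schemas (schema_name) VALUES ('demo')")
```

Keeping execution separate from the outputs lets the same code path serve real loads, dry runs, and SQL export.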


@cli.command("validate", help="Validate one or more Felis YAML files")
@click.option(
"--check-description", is_flag=True, help="Check that all objects have a description", default=False
@@ -372,7 +442,7 @@
check_redundant_datatypes: bool,
check_tap_table_indexes: bool,
check_tap_principal: bool,
files: Iterable[io.TextIOBase],
files: Iterable[IO[str]],
) -> None:
"""Validate one or more felis YAML files.

@@ -406,9 +476,8 @@
file_name = getattr(file, "name", None)
logger.info(f"Validating {file_name}")
try:
data = yaml.load(file, Loader=yaml.SafeLoader)
Schema.model_validate(
data,
Schema.from_stream(
file,
context={
"check_description": check_description,
"check_redundant_datatypes": check_redundant_datatypes,