Update dependency cognite-extractor-utils to v7 #59

@renovate renovate bot commented Feb 20, 2024

This PR contains the following updates:

Package | Change | Age | Adoption | Passing | Confidence
cognite-extractor-utils | ^4.0.0 -> ^7.0.0 | age | adoption | passing | confidence

Release Notes

cognitedata/python-extractor-utils (cognite-extractor-utils)

v7.5.0

Added
  • File processing utils

v7.4.9

Added
  • CastableInt class that represents an integer to be used in config schema definitions. Unlike a field of type int, which must be a number in the yaml file, a field of this type can be either a string or a number.
  • PortNumber class that represents a valid port number to be used in config schema definitions. Just like CastableInt it can be a string or a number in the yaml file. This allows for example setting a port number using an environment variable.
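
To illustrate the behaviour described for v7.4.9, here is a minimal stand-alone sketch. It does not use the library's own classes: `cast_int` and `parse_port` are hypothetical helpers that only mimic the description above (accept a string or a number), and the 0 to 65535 range check is an assumption about what "valid port number" means.

```python
from typing import Union


def cast_int(value: Union[str, int]) -> int:
    """Hypothetical helper: accept an int, or a string that holds an int."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("booleans are not accepted as integers")
    if isinstance(value, (int, str)):
        return int(value)  # raises ValueError for strings like "abc"
    raise TypeError(f"cannot cast {type(value).__name__} to int")


def parse_port(value: Union[str, int]) -> int:
    """Hypothetical helper: cast first, then check an assumed 0-65535 range."""
    port = cast_int(value)
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# A value substituted from an environment variable arrives as a string,
# while a literal in the yaml file arrives as a number.
port_from_env = parse_port("8080")
port_from_yaml = parse_port(8080)
```

Both calls yield the same integer, which is the practical point of the new types: a port set through an environment variable no longer fails schema validation just because it is a string.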

v7.4.8

Fixed
  • Fix file upload when private link is used

v7.4.7

Added
  • Configuration for ignore regexp pattern

v7.4.6

Fixed
  • Fix file metadata update

v7.4.5

Removed
  • Remove warnings from data models file uploads

v7.4.4

Changed
  • Updated cognite SDK version.

v7.4.3

Fixed
  • Regression: Reverting change related to file_meta parameter in IOUploadQueue

v7.4.2

Added
  • Added support for AWS file upload

v7.4.1

Changed
  • Updated cognite sdk version

v7.4.0

Added
  • Upload to Core DM/Classic file.

v7.3.0

Changed
  • Use httpx to upload files to CDF instead of the python SDK.
    May improve performance on windows.

v7.2.3

Added
  • Add additional validation to cognite config before creating a cognite client,
    to provide better error messages when configuration is obviously wrong.

v7.2.2

Fixed
  • Produce a config error when missing token-url and tenant, instead of eventually
    producing an OAuth 2 MUST utilize https error when getting the token.

v7.2.1

Changed
  • Reformat log messages to not have newlines
Fixed
  • Fixed using the keyvault tag in remote config.

v7.2.0

Fixed
  • Fixed an issue with the retry decorator where functions would not be
    called at all if the cancellation token was set. This resulted in errors
    with for example upload queues.
Added
  • An upload queue for data model instances.
  • A new type of state store that stores hashes of ingested items. This can be
    used to detect changed RAW rows or data model instances.
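
The idea behind a hash-based state store can be sketched without the library: keep one digest per item ID and treat an item as changed when its digest differs from the stored one. The class name `HashStateSketch` and its methods are hypothetical, and hashing canonical JSON with SHA-256 is an assumption, not necessarily what the library does.

```python
import hashlib
import json
from typing import Any, Dict


class HashStateSketch:
    """Hypothetical in-memory store mapping item IDs to content digests."""

    def __init__(self) -> None:
        self._hashes: Dict[str, str] = {}

    @staticmethod
    def _digest(data: Dict[str, Any]) -> str:
        # Sort keys so that logically equal dicts give the same digest.
        canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def has_changed(self, external_id: str, data: Dict[str, Any]) -> bool:
        """True if the item is new or differs from the stored digest."""
        return self._hashes.get(external_id) != self._digest(data)

    def set_state(self, external_id: str, data: Dict[str, Any]) -> None:
        """Record the digest of the item as it was ingested."""
        self._hashes[external_id] = self._digest(data)


store = HashStateSketch()
row = {"key": "row-1", "columns": {"temperature": 21.5}}

first_seen = store.has_changed("row-1", row)   # never ingested before
store.set_state("row-1", row)
unchanged = store.has_changed("row-1", row)    # same content as stored

row["columns"]["temperature"] = 22.0
modified = store.has_changed("row-1", row)     # content differs now
```

An extractor built this way would only re-upload rows for which `has_changed` returns true, and call `set_state` after a successful upload.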

v7.1.6

Changed
  • Update cognite-sdk version to 7.43.3

v7.1.5

Fixed
  • Fixed an issue preventing retries in file uploads from working properly
Added
  • File external ID when logging failed file uploads

v7.1.4

Fixed
  • Fixed a race condition in state stores and uploaders where a shutdown could result in corrupted state stores.

v7.1.3

Fixed
  • Update type hints for the time series upload queue to allow status codes

v7.1.2

Fixed
  • cognite_exceptions() did not properly retry file uploads

v7.1.1

Fixed
  • Enhancement of 7.0.5: more use cases covered (to avoid repeatedly fetching a new token).
  • When using remote config, the full local idp-authentication is now injected (some fields were missing).

v7.1.0

Added
  • The file upload queue is now able to stream files larger than 5GiB.

v7.0.5

Fixed
  • The background thread ConfigReloader now caches the CogniteClient to avoid repeatedly fetching a new token.

v7.0.4

Fixed
  • Max parallelism in the file upload queue can now properly be set to values larger than max_workers in the ClientConfig object.
  • Storing states with the state store will lock the state store. This fixes an issue where iterating through a changing dict could cause issues.

v7.0.3

Fixed
  • Fix file size upper limit.

v7.0.2

Added
  • Support for files without content.

v7.0.1

Fixed
  • Ensure that CancellationToken.wait(timeout) only waits for at most timeout, even if it is notified in that time.

v7.0.0

Changed
  • The file upload queues have changed behaviour.

    • Instead of waiting to upload until a set of conditions is met, it starts
      uploading immediately.
    • The upload() method now acts more like a join, waiting on all the
      uploads in the queue to complete before returning.
    • A call to add_to_upload_queue when the queue is full will hang until
      the queue is no longer full before returning, instead of triggering an
      upload and hanging until everything is uploaded.
    • The queues must now be set up with a max size. The max upload
      latency is removed. As long as you use the queue as a context manager
      (i.e., using with FileUploadQueue(...) as queue:) you should not have to
      change anything in your code. The behaviour of the queue will change, it
      will most likely be much faster, but it will not require any changes from
      you as a user of the queue.
  • threading.Event has been replaced globally with CancellationToken. The
    interfaces are mostly compatible, though CancellationToken does not have a
    clear method. The compatibility layer is deprecated.

    • Replace calls to is_set with the property is_cancelled.
    • Replace calls to set with the method cancel.
    • All methods which took threading.Event now take CancellationToken.
      You can use create_child_token to create a token that can be canceled
      without affecting its parent token, this is useful for creating stoppable
      sub-modules that are stopped if a parent module is stopped. Notably,
      calling stop on an upload queue no longer stops the parent extractor,
      this was never intended behavior.
Removed
  • The deprecated middleware module has been removed.
  • set_event_on_interrupt has been replaced with
    CancellationToken.cancel_on_interrupt.
Added
  • You can now use Path as a type in your config files.
  • CancellationToken as a better abstraction for cancellation than
    threading.Event.
Migration guide

To migrate from version 6.* to 7, you need to update how you interact with
cancellation tokens. The type has now changed from Event to
CancellationToken, so make sure to update all of your type hints etc. There is
a compatibility layer for the CancellationToken class, so that it has the same
methods as an Event (except for clear()) which means it should act as a
drop-in replacement for now. This compatibility layer is deprecated, and will be
removed in version 8.
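
The parent/child semantics described in the release notes can be sketched with a small stand-in class. `TokenSketch` is hypothetical and exists only for illustration. It borrows the names `is_cancelled`, `cancel`, `create_child_token` and `wait` from the notes above, but the real CancellationToken may be implemented differently.

```python
import threading
from typing import List, Optional


class TokenSketch:
    """Hypothetical stand-in for CancellationToken, for illustration only."""

    def __init__(self) -> None:
        self._event = threading.Event()
        self._children: List["TokenSketch"] = []

    @property
    def is_cancelled(self) -> bool:
        # Replaces Event.is_set() in the old interface.
        return self._event.is_set()

    def cancel(self) -> None:
        # Replaces Event.set(); cancellation flows down to children only.
        self._event.set()
        for child in self._children:
            child.cancel()

    def create_child_token(self) -> "TokenSketch":
        child = TokenSketch()
        self._children.append(child)
        if self.is_cancelled:
            child.cancel()  # a child of a cancelled parent starts cancelled
        return child

    def wait(self, timeout: Optional[float] = None) -> bool:
        # Returns True if cancelled before the timeout expired.
        return self._event.wait(timeout)


root = TokenSketch()
worker = root.create_child_token()
other = root.create_child_token()

worker.cancel()  # stops only this sub-module
worker_only = (worker.is_cancelled, root.is_cancelled, other.is_cancelled)

root.cancel()  # stops everything below the root
after_root = (root.is_cancelled, other.is_cancelled)
```

Cancelling the child leaves the parent and its sibling running, while cancelling the parent stops every child. This matches the note that calling stop on an upload queue no longer stops the parent extractor.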

If you are using file upload queues, read the entry in the Changed section.
You will most likely not need to change your code, but how the queue behaves has
changed for this version.
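
The new queue behaviour (uploads start immediately, adding blocks while the queue is full, and upload() acts as a join) can be sketched with the standard library alone. `BoundedUploadSketch` and `fake_upload` are hypothetical names. This is an illustration of the described semantics under those assumptions, not the library's implementation.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import Callable, List


class BoundedUploadSketch:
    """Hypothetical queue: uploads start at once, add blocks when full."""

    def __init__(self, upload_fn: Callable[[str], None], max_queue_size: int) -> None:
        self._upload_fn = upload_fn
        self._slots = threading.BoundedSemaphore(max_queue_size)
        self._pool = ThreadPoolExecutor(max_workers=max_queue_size)
        self._pending: List[Future] = []

    def add_to_upload_queue(self, item: str) -> None:
        self._slots.acquire()  # blocks only while the queue is full
        self._pending.append(self._pool.submit(self._run, item))

    def _run(self, item: str) -> None:
        try:
            self._upload_fn(item)
        finally:
            self._slots.release()  # free the slot even if the upload failed

    def upload(self) -> None:
        # Acts like a join: wait for everything submitted so far.
        pending, self._pending = self._pending, []
        wait(pending)
        for future in pending:
            future.result()  # re-raise any upload error

    def __enter__(self) -> "BoundedUploadSketch":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.upload()
        self._pool.shutdown(wait=True)


uploaded: List[str] = []
lock = threading.Lock()


def fake_upload(name: str) -> None:
    # Stand-in for a real upload; just records the file name.
    with lock:
        uploaded.append(name)


with BoundedUploadSketch(fake_upload, max_queue_size=2) as queue:
    for name in ["a.csv", "b.csv", "c.csv"]:
        queue.add_to_upload_queue(name)
```

Used as a context manager, the caller never calls upload() explicitly, which is why code written this way should not need changes when moving to version 7.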

v6.4.1

Changed
  • File upload queues now reuse a single thread pool across runs instead of
    creating a new one each time upload() is called.

v6.4.0

Added
  • Option to specify retry exceptions as a dictionary instead of a tuple. Values
    should be a callable determining whether a specific exception object should
    be retried or not. Example:

    @retry(
        exceptions = {ValueError: lambda x: "Invalid" not in str(x)}
    )
    def func() -> None:
        value = some_function()

        if value is None:
            raise ValueError("Could not retrieve value")

        if not_valid(value):
            raise ValueError(f"Invalid value: {value}")
  • Templates for common retry scenarios. For example, if you're using the
    requests library, you can do

    retry(exceptions = request_exceptions())
Changed
  • Default parameters in retry have changed to be less aggressive. Retries will
    apply backoff by default, and give up after 10 retries.
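
To make the dictionary form concrete, here is a minimal stand-alone sketch of a retry decorator that consults a per-exception predicate and applies exponential backoff. `retry_sketch` and its parameters (`max_attempts`, `delay`, `backoff`) are hypothetical. The library's own retry decorator has its own signature and defaults, which this sketch does not try to reproduce.

```python
import time
from functools import wraps
from typing import Any, Callable, Dict, Type


def retry_sketch(
    exceptions: Dict[Type[Exception], Callable[[Exception], bool]],
    max_attempts: int = 10,
    delay: float = 0.01,
    backoff: float = 2.0,
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Hypothetical decorator: retry only when the predicate returns True."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            pause = delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except tuple(exceptions) as exc:
                    should_retry = any(
                        isinstance(exc, exc_type) and predicate(exc)
                        for exc_type, predicate in exceptions.items()
                    )
                    if not should_retry or attempt == max_attempts:
                        raise
                    time.sleep(pause)
                    pause *= backoff  # wait longer before each new attempt

        return wrapper

    return decorator


calls = {"count": 0}
attempts = {"count": 0}
rule = {ValueError: lambda x: "Invalid" not in str(x)}


@retry_sketch(exceptions=rule, delay=0.001)
def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("Could not retrieve value")  # retried
    return "ok"


@retry_sketch(exceptions=rule, delay=0.001)
def always_invalid() -> None:
    attempts["count"] += 1
    raise ValueError("Invalid value: 42")  # predicate says: do not retry


result = flaky()
try:
    always_invalid()
except ValueError:
    pass
```

The transient error is retried until the call succeeds, while the "Invalid value" error is raised on the first attempt because the predicate rejects it.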

v6.3.2

Added
  • Aliases for keyvault config to align with dotnet utils

v6.3.1

Fixed
  • Improved the state store retry behavior to handle both fundamental
    and wrapped network connection errors.

v6.3.0

Added
  • Added support to retrieve secrets from Azure Keyvault.

v6.2.2

Added
  • Added an optional security-categories attribute to the cognite config
    section.

v6.2.1

Fixed
  • Fixed a type hint in the post_upload_function for upload queues.

v6.2.0

Added
  • Added IOFileUploadQueue as a base class of both FileUploadQueue and BytesUploadQueue.
    This is an upload queue for functions that produce BinaryIO to CDF Files.

v6.1.1

Fixed
  • Correctly handle equality comparison of TimeIntervalConfig objects.

v6.1.0

Added
  • Added ability to specify dataset under which metrics timeseries are created

v6.0.2

Fixed
  • Improved the state store retry behavior to handle connection errors

v6.0.1

Fixed
  • Fixed iter method on the state store to return an iterator

v6.0.0

Changed
  • cognite-sdk to v7

v5.5.1

Added
  • Added iter method on the state store to return the keys of the local state dict

v5.5.0

Added
  • Added load_yaml_dict to configtools.loaders.
Fixed
  • Fixed getting the config type when !env was used in the config file.

v5.4.3

Added
  • Added len method on the state store to return the length of the local state dict

v5.4.2

Fixed
  • Fix on find_dotenv call

v5.4.1

Changed
  • Update cognite-sdk version to 6.24.0

v5.4.0

Fixed
  • Fixed the type hint for the retry decorator. The list of exception types
    must be given as a tuple, not an arbitrary iterable.
  • Fixed retries for sequence upload queue.
  • Sequence upload queue reported number of distinct sequences it had rows
    for, not the number of rows. That is now changed to number of rows.
  • When the sequence upload queue uploaded, it always reported 0 rows uploaded
    because of a bug in the logging.
Removed
  • Latency metrics for upload queues.

v5.3.0

Added
  • Added support for queuing assets upload

v5.2.1

Changed
  • Timestamps before 1970 are no longer filtered out, to align with changes to
    the timeseries API.

v5.2.0

Changed
  • The event upload queue now upserts events. If creating an event fails due
    to the event already existing, it will be updated instead.
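
The create-then-update fallback described here can be sketched against an in-memory store. `FakeEventStore`, `DuplicateError` and `upsert` are hypothetical names used only for illustration. The real queue talks to CDF through the Cognite SDK and is not shown here.

```python
from typing import Any, Dict


class DuplicateError(Exception):
    """Hypothetical error raised when an event already exists."""


class FakeEventStore:
    """Hypothetical in-memory stand-in for the events API."""

    def __init__(self) -> None:
        self.events: Dict[str, Dict[str, Any]] = {}

    def create(self, external_id: str, payload: Dict[str, Any]) -> None:
        if external_id in self.events:
            raise DuplicateError(external_id)
        self.events[external_id] = payload

    def update(self, external_id: str, payload: Dict[str, Any]) -> None:
        self.events[external_id] = payload


def upsert(store: FakeEventStore, external_id: str, payload: Dict[str, Any]) -> str:
    """Try to create the event; fall back to an update if it exists."""
    try:
        store.create(external_id, payload)
        return "created"
    except DuplicateError:
        store.update(external_id, payload)
        return "updated"


event_store = FakeEventStore()
first = upsert(event_store, "evt-1", {"description": "first"})
second = upsert(event_store, "evt-1", {"description": "second"})
```

The second call hits the duplicate path and overwrites the stored event instead of failing.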

v5.1.0

Added
  • Support for connection parameters

v5.0.1

Changed
  • Upload queue size limit now triggers an upload when the size has reached
    the limit, not when it exceeded the limit.

v5.0.0

Removed
  • Legacy authentication through API keys has been removed throughout the code
    base.

  • A few deprecated modules (authentication, prometheus_logging) have been
    deleted.

Changed
  • uploader and configtools have been changed from one module to a package
    of multiple modules. The content has been re-exported to preserve
    compatibility, so you can still do

    from cognite.extractorutils.configtools import load_yaml, TimeIntervalConfig
    from cognite.extractorutils.uploader import TimeSeriesUploadQueue

    But now, you can also import from the submodules directly:

    from cognite.extractorutils.configtools.elements import TimeIntervalConfig
    from cognite.extractorutils.configtools.loaders import load_yaml
    from cognite.extractorutils.uploader.time_series import TimeSeriesUploadQueue

    This has first and foremost been done to improve the codebase and make it
    easier to continue to develop.

  • Updated the version of the Cognite SDK to version 6. Refer to the
    changelog and migration guide for the SDK for details on the changes it
    entails for users.

  • Several small single-function modules have been removed and the content has
    been moved to the catch-all util module. This includes:

    • The add_extraction_pipeline decorator from the extraction_pipelines
      module

    • The throttled_loop generator from the throttle module

    • The retry decorator from the retry module

Added
  • Support for audience parameter in idp-authentication
Migration guide

The deletion of API keys and the legacy OAuth2 implementation should not affect
your extractors or your usage of the utils unless you were depending on the old
OAuth implementation directly and not through configtools or the base classes.

To update to version 5 of extractor-utils, you need to

  • Change where you import a few things.

    • Change from

      from cognite.extractorutils.extraction_pipelines import add_extraction_pipeline

      to

      from cognite.extractorutils.util import add_extraction_pipeline
    • Change from

      from cognite.extractorutils.throttle import throttled_loop

      to

      from cognite.extractorutils.util import throttled_loop
    • Change from

      from cognite.extractorutils.retry import retry

      to

      from cognite.extractorutils.util import retry
  • Consult the migration guide for the Cognite SDK version 6 for details on
    the changes it entails for users.

The changes in this version are only breaking for your usage of the utils. Any
extractor you have written will not be affected by the changes, meaning you do
not need to bump the major version for your extractors.


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot requested a review from a team as a code owner February 20, 2024 11:13