Schema inference: Shape::widen() #1126

Merged
merged 11 commits into from
Aug 4, 2023

Commits on Aug 4, 2023

  1. feature: Implement Shape::widen which enables widening a Shape to fit a provided `AsNode`. This sets the groundwork for performantly keeping track of a running inferred schema in the combiner.
    
    Of note, the schemas "inferred" by `widen()` are maximally strict:
    * By default, newly inferred objects have `additionalProperties: false`.
    * Object fields initially have `required: true` until we encounter a document missing that field, at which point the field is downgraded to `required: false`.
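
The "maximally strict" behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ObjSketch`, its fields, and the choice to represent a document as a list of field names are all hypothetical. It also assumes that a field first seen in a later document is treated as not required, since earlier documents lacked it.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for an inferred object schema.
#[derive(Debug, Default, PartialEq)]
pub struct ObjSketch {
    /// Field name -> whether every document seen so far contained it.
    pub required: BTreeMap<String, bool>,
    /// Never set to true here: inferred objects stay closed to unknown fields.
    pub additional_properties: bool,
    /// False until the first document has been observed.
    seen_any: bool,
}

impl ObjSketch {
    /// Widen the sketch to fit one document, given as its field names.
    pub fn widen(&mut self, doc_fields: &[&str]) {
        // A known field that is absent from this document is downgraded.
        for (name, req) in self.required.iter_mut() {
            if !doc_fields.contains(&name.as_str()) {
                *req = false;
            }
        }
        // A new field is required only if no earlier document lacked it,
        // which (in this sketch) means it appeared in the first document.
        let first = !self.seen_any;
        for f in doc_fields {
            self.required.entry(f.to_string()).or_insert(first);
        }
        self.seen_any = true;
    }
}
```

Widening with `["a", "b"]` and then `["a", "c"]` leaves `a` required, while `b` and `c` end up optional; `additional_properties` remains `false` throughout.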
    
    The next piece of work is implementing the stubbed-out `enforce_field_count_limits`. With that, we should have everything we need to implement the running inferred schema and emit it to the ops logs. Also of note: the `reduce: flow-inferred-schema-merge` reduction annotation implementation can and should also use `enforce_field_count_limits`, since intra-transaction documents may stay within the limits while inter-transaction documents exceed them, and we care about limiting both cases.
    jshearer committed Aug 4, 2023
    Commit 8e01e11
  2. PR review feedback:

    * Factor out `ObjShape::widen`
    * Refactor to be zero-cost by default
    * Ensure ObjShape.properties stays sorted
    * Get rid of the helper function `Shape::widen_inner`
    * Set `is_required` for new fields properly based on whether they have always been present or not
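
One common way to keep a property list sorted while inserting is a binary search that doubles as the insertion point. The sketch below assumes properties live in a `Vec` ordered by name; `PropSketch` and `find_or_insert` are hypothetical names and not taken from the PR.

```rust
/// Hypothetical, pared-down property entry.
#[derive(Debug, PartialEq)]
pub struct PropSketch {
    pub name: String,
    pub is_required: bool,
}

/// Return the entry for `name`, inserting it at its sorted position if absent.
/// `is_required` is only used when a new entry is created.
pub fn find_or_insert<'a>(
    props: &'a mut Vec<PropSketch>,
    name: &str,
    is_required: bool,
) -> &'a mut PropSketch {
    let idx = match props.binary_search_by(|p| p.name.as_str().cmp(name)) {
        // Already present: reuse the existing entry untouched.
        Ok(i) => i,
        // Absent: `i` is exactly where it must go to keep the Vec sorted.
        Err(i) => {
            props.insert(
                i,
                PropSketch {
                    name: name.to_string(),
                    is_required,
                },
            );
            i
        }
    };
    &mut props[idx]
}
```

Because every insertion goes through the search, later lookups can rely on the ordering without re-sorting.
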
    jshearer committed Aug 4, 2023
    Commit 8e2e6ce
  3. Commit 80e620a
  4. Commit db38eec
  5. More updates from PR review:

    * Infer string formats following similar logic to `is_required`: the first string gets inferred, subsequent ones get checked, and after a non-conforming string causes the format to drop off, don't re-infer it
    * Reduce some nesting in `ObjShape::widen`
    * Widen array min and max length
    * Correctly detect `integer` and `fractional` types
    * Tests
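
The format rule in the first bullet (infer once, check afterwards, never re-infer after a drop) behaves like a small three-state machine. The sketch below is an illustration under assumptions: `Fmt`, `FormatState`, and the two toy formats are hypothetical, and a first string with no recognizable format is treated as already dropped.

```rust
use std::net::Ipv4Addr;

/// Two toy formats, standing in for whatever formats the real code detects.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Fmt {
    Integer,
    Ipv4,
}

impl Fmt {
    /// Guess a format for a string, if any applies.
    fn detect(s: &str) -> Option<Fmt> {
        if !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit()) {
            Some(Fmt::Integer)
        } else if s.parse::<Ipv4Addr>().is_ok() {
            Some(Fmt::Ipv4)
        } else {
            None
        }
    }

    /// Does `s` conform to this specific format?
    fn validates(self, s: &str) -> bool {
        Fmt::detect(s) == Some(self)
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FormatState {
    /// No string observed yet.
    Unseen,
    /// Every string so far conforms to this format.
    Inferred(Fmt),
    /// A string failed to conform; the format is gone for good.
    Dropped,
}

impl FormatState {
    pub fn widen(&mut self, s: &str) {
        *self = match *self {
            // Only the very first string is allowed to infer a format.
            FormatState::Unseen => match Fmt::detect(s) {
                Some(f) => FormatState::Inferred(f),
                None => FormatState::Dropped,
            },
            // Later strings are checked against the inferred format.
            FormatState::Inferred(f) if f.validates(s) => FormatState::Inferred(f),
            // A non-conforming string, or an earlier drop, is terminal.
            _ => FormatState::Dropped,
        };
    }
}
```

Making `Dropped` a distinct state from `Unseen` is what prevents a later conforming string from re-inferring the format.
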
    jshearer committed Aug 4, 2023
    Commit c320f0e
  6. Commit a26b89d
  7. fix: Recur in Shape::enforce_field_count_limits() even when we're already in a location that's getting squashed, because we also need to ensure that the newly-squashed `additionalProperties` isn't _also_ excessively large
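
To see why the recursion has to continue past a squash, consider the simplified model below. It is a sketch under assumptions: `Node` is a hypothetical shape with only properties and an optional `additional` schema, and the limit is a per-object property count, which may differ from how the real `enforce_field_count_limits` counts fields.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal object shape.
#[derive(Debug, Default, PartialEq)]
pub struct Node {
    pub properties: BTreeMap<String, Node>,
    pub additional: Option<Box<Node>>,
}

impl Node {
    /// Union `other` into `self`, recursively.
    fn merge(&mut self, other: Node) {
        for (k, v) in other.properties {
            self.properties.entry(k).or_default().merge(v);
        }
        if let Some(a) = other.additional {
            self.additional
                .get_or_insert_with(Default::default)
                .merge(*a);
        }
    }

    pub fn enforce_limit(&mut self, limit: usize) {
        if self.properties.len() > limit {
            // Squash: fold every explicit property into `additional`.
            let mut squashed = self.additional.take().map(|b| *b).unwrap_or_default();
            for (_, child) in std::mem::take(&mut self.properties) {
                squashed.merge(child);
            }
            self.additional = Some(Box::new(squashed));
        }
        // Recurse unconditionally. After a squash, `additional` holds the
        // union of all children and may itself exceed the limit.
        for child in self.properties.values_mut() {
            child.enforce_limit(limit);
        }
        if let Some(a) = self.additional.as_mut() {
            a.enforce_limit(limit);
        }
    }
}
```

With three parents whose children total four distinct names and a limit of 2, the root is squashed first, and the merged `additional` is then squashed in turn because it holds four properties.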
    jshearer committed Aug 4, 2023
    Commit 3d5bf01
  8. Commit e53ec23
  9. Commit 612df07
  10. Commit 9d636ee
  11. Update logic to handle patternProperties, and to widen explicit properties first, even if `additionalProperties` is defined.
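
The lookup order this commit describes might look roughly like the sketch below. All names are hypothetical. Pattern properties are modelled as plain string prefixes, because the Rust standard library has no regex support, and placing patterns between explicit properties and `additionalProperties` is an assumption borrowed from JSON Schema semantics.

```rust
/// Which sub-schema a field name resolves to for widening.
#[derive(Debug, PartialEq)]
pub enum Target<'a> {
    /// An explicit entry in `properties`.
    Property(&'a str),
    /// A matching pattern (here: a prefix standing in for a regex).
    Pattern(&'a str),
    /// The catch-all `additionalProperties` schema.
    Additional,
    /// Nothing matched; a new explicit property would be created.
    New,
}

/// Hypothetical view of just the keys of an object shape.
pub struct ObjKeys {
    pub properties: Vec<String>,
    pub pattern_prefixes: Vec<String>,
    pub has_additional: bool,
}

impl ObjKeys {
    pub fn resolve<'a>(&'a self, field: &str) -> Target<'a> {
        // Explicit properties win, even when `additionalProperties` exists.
        if let Some(p) = self.properties.iter().find(|p| p.as_str() == field) {
            return Target::Property(p.as_str());
        }
        // Then the first matching pattern (simplification: first match only).
        if let Some(p) = self
            .pattern_prefixes
            .iter()
            .find(|p| field.starts_with(p.as_str()))
        {
            return Target::Pattern(p.as_str());
        }
        // Only unmatched names fall through to the catch-all.
        if self.has_additional {
            Target::Additional
        } else {
            Target::New
        }
    }
}
```
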
    jshearer committed Aug 4, 2023
    Commit c895a0e