From 4756830f587a32d4e9f6ab69b137d5881db20f85 Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Tue, 31 Oct 2023 13:04:24 +1300 Subject: [PATCH 1/5] update charter to align with #47 --- charter.md | 15 ++++----- notes/arrays.md | 0 notes/non-structural.md | 27 +++++++++++++++++ notes/required-info.md | 34 +++++++++++++++++++++ notes/subschemas.md | 67 +++++++++++++++++++++++++++++++++++++++++ 5 files changed, 136 insertions(+), 7 deletions(-) create mode 100644 notes/arrays.md create mode 100644 notes/non-structural.md create mode 100644 notes/required-info.md create mode 100644 notes/subschemas.md diff --git a/charter.md b/charter.md index ab1cb10..820db3e 100644 --- a/charter.md +++ b/charter.md @@ -1,13 +1,14 @@ # JSON Schema IDL charter + The purpose of this charter is to give a brief introduction to the project, the SIG, and the community how they all relate. ## Section 0: Guiding Principles -Because of the nature of JSON Schema, you don't define the data structure but the validation rules that data should comply with, trying to interpret these rules definitions is a pain point for many. Providing a clear description of how this can be achieved would help not only the developer that uses JSON Schema in this area, but also helps standardize the behavior improving the user experience. +Because of the nature of JSON Schema (constraints, not data definition), trying to interpret these rules definitions is a pain point for many. Providing a clear description of how this can be achieved would standardize the behavior improving the user experience and thus help any developers working with JSON Schema. ## Section 1: Scope -Help and clarify how JSON Schema can be interpreted from validation rules to data definition. This extends to how those data definitions can be represented in any programming language. To accommodate the JSON Schema spec and the interpretation process a code generation vocabulary will be created. 
+Help and clarify how data definition and JSON Schema's validation rules can interact. This includes how those validation rules can represent code in any programming language. To accommodate the JSON Schema spec and the interpretation process, a code generation vocabulary will be created. Implementations are neither in nor out of scope for the project, but it is not a high prioritization. @@ -21,23 +22,23 @@ The JSON Schema organization will force this project to comply with the organiza ## Section 3: The Special interest group -The SIG members are no different from the regular community besides it is their responsibility to understand all points of view of the community and push the project forward. +The SIG members are no different from the regular community besides it is their responsibility to understand all points of view of the community and push the project forward. The community can get involved by jumping into discussions in issues, pull requests, etc, as long as the [Code of Conduct](./CODE_OF_CONDUCT.md) and [git workflow](./git_workflow.md) is adhered to. - SIG memberships are not time-limited. There is no maximum size of the SIG. It is expected that the SIG members actively participate in discussions and maintain the repository, otherwise they will remain a contributor. Any SIG members that do not actively participate in discussions or maintain the repository within 30 days will be removed from the SIG list and return to being contributors. The SIG group is currently made up of the following individuals. Feel free to submit an issue requested to be added. -- Jonas Lagoni ([Senior Software Engineer at Postman, working on AsyncAPI](https://www.linkedin.com/in/jonaslagoni/)) +- [Greg Dennis](https://github.com/gregsdennis) +- [Jonas Lagoni](https://github.com/jviotti) - [Jason Desrosiers](https://github.com/jdesrosiers) ## Section 4: Definitions Definitions that should be clarified to align meaning. -- **Validation rules**, i.e. 
a JSON Schema as such `{type: string}` define that data should validate against a string, it does not define that the data is a string. For small validation rules, there is almost no difference, but with more complex ones it becomes apparent. -- **Data definition**, i.e. it defines the exact structure of the data. +- **Validation rules**, i.e. a JSON Schema such as `{ "type": "string" }` defines that data, as represented in the JSON data model, should be a string; it does not define that the data must be a string in programming languages. For example, `"2023-10-31"` is a string in JSON; however, this value represents a date, for which there will likely be a dedicated type separate from a string in many programming languages. For small validation rules, there is almost no difference, but with more complex ones it becomes apparent. +- **Data definition**, i.e. it defines the exact structure of the data. This is often how programming languages build types. These type systems are generally considerably more extensive than the JSON data model. diff --git a/notes/arrays.md b/notes/arrays.md new file mode 100644 index 0000000..e69de29 diff --git a/notes/non-structural.md b/notes/non-structural.md new file mode 100644 index 0000000..6bf697e --- /dev/null +++ b/notes/non-structural.md @@ -0,0 +1,27 @@ +For non-structural details, many languages support (either through built-in mechanisms or third-party libraries) some way to annotate a type to indicate what range of values are valid for an object's members. In JSON Schema, these are represented with keywords like + +- `minimum` +- `maxLength` +- `multipleOf` +- `uniqueItems` +- etc. + +However, the way to represent these in language code varies greatly. + +Translating these requirements to the language model allows validation logic that's already in the language to be invoked. However it also promotes using that validation logic (which may differ from JSON Schema's intent) over using JSON Schema directly. 
This may cause disparity between the validations from the language and JSON Schema. + +Unless we can ensure that the code annotations provide the same validation that an equivalent JSON Schema would, I'm not convinced including them is a good idea. + +--- + +Another aspect of this is schemas which have these requirements at the top-level: + +```jsonc +{ + "type": "array", + "items": { /* ... */ }, + "minItems": 4 +} +``` + +Usually these types of constraints are expressed on object members, not on types. Does it make sense to say that an instance of an array must have at least 4 items, or that a lone integer must be more than 50? (I can see a case for `minimum: 0` being used to identify an unsigned integer type, though.) \ No newline at end of file diff --git a/notes/required-info.md b/notes/required-info.md new file mode 100644 index 0000000..2a615fc --- /dev/null +++ b/notes/required-info.md @@ -0,0 +1,34 @@ +When generating a schema from a type, we generally have all of the information we need. (I'm having trouble thinking of a type declaration that doesn't have enough information to define a schema.) + +However, code generation sometimes requires some annotative keywords. For example, + +```json +{ + "type": "object", + "properties": { + "foo": { "type": "integer" }, + "bar": { "type": "string" } + } +} +``` + +provides enough information to describe the structure of a type, but a type usually needs a name. + +Some types don't need this, though. For an array of objects, + +```jsonc +{ + "type": "array", + "items": { /* ... */ } +} +``` + +most languages have one or more built-in collection types that can handle this, e.g. `List` in .NET or `NSArray` in Objective-C, or even a simple array (which probably every usable language has). + +For custom objects, though, in order to generate type code correctly, there should be a minimum amount of information present. 
Specifically, + +- `title` should hold the type name +- `description` could map to a comment or other in-code documentation about the type +- any other ideas? + +We also need to identify when these should be required. Can that be represented in a meta-schema, or will that be up to the generator to decide? \ No newline at end of file diff --git a/notes/subschemas.md b/notes/subschemas.md new file mode 100644 index 0000000..fb77afa --- /dev/null +++ b/notes/subschemas.md @@ -0,0 +1,67 @@ +Subschemas and nested types appear everywhere. Generally, a subschema will represent a new type. + +```json +{ + "type": "object", + "properties": { + "foo": { "type": "integer" }, + "bar": { + "type": "object", + "properties": { + "numbers": { + "type": "array", + "items": { "type": "number" } + } + } + } + } +} +``` + +For this schema, we have four types being represented: + +- the top-level object, with properties `foo` and `bar` +- an integer +- an object at `/properties/bar` +- a number + +(#45 discusses built-in types, so we'll leave that out of this issue.) + +Specifically, we need to focus on the two custom objects: the top-level and whatever `bar` is. + +I would expect a generator to create two types from this schema. Are there any restrictions that people can see? + +--- + +What happens when a subschema is duplicated? + +```json +{ + "type": "object", + "properties": { + "foo": { "type": "integer" }, + "bar": { + "type": "object", + "properties": { + "numbers": { + "type": "array", + "items": { "type": "number" } + } + } + }, + "baz": { + "type": "object", + "properties": { + "numbers": { + "type": "array", + "items": { "type": "number" } + } + } + } + } +} +``` + +Here, `/properties/bar` and `/properties/baz` are identical. Is this the author's intent, or are these two semantically (business rules) different yet functionally (shape/structure) identical? Do we need a way to discern this? Maybe #50 can help identify author intent. 
For example, if the subschemas have the same `title`, it was intended that they're the same type. + +(`$ref`'d subschemas are obviously the same type.) \ No newline at end of file From e832b8eec544450d41751efe56c8d14eca0b3f49 Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Tue, 31 Oct 2023 13:10:25 +1300 Subject: [PATCH 2/5] update readme --- README.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 73cf115..b27afb0 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ [![Financial Contributors on Open Collective](https://opencollective.com/json-schema/all/badge.svg?label=financial+contributors)](https://opencollective.com/json-schema)

- Help and clarify how JSON Schema can be interpreted from validation rules to data definition. This extends to how those data definitions can be represented in any programming language + Help and clarify how data definition and JSON Schema's validation rules can interact. This includes how those validation rules can represent code in any programming language.

@@ -17,16 +17,20 @@ ## Status While the project is only first getting started, the initial agenda is: -1. Define a processing model for interpreting JSON Schema to data definitions. - * Define code generation vocabulary as needed, while the processing model is being defined +1. Identify programming language paradigms and determine how they might interact with JSON Schema. +2. Identify areas where JSON Schema is insufficiently expressive to handle these language paradigms and define a new vocabulary to fill the gaps. +3. Define a processing model for translating between data definition and JSON Schema. ## Charter + To help explain the process, the project, the SIG and the community, in terms of how they all relate, please refer to the [charter](./charter.md). ## Code of Conduct + Respect each other. Choose empathy over judgement. Act according to the [Code of Conduct](./CODE_OF_CONDUCT.md). ## Getting involved + If you want to review changes, creating documentation, help form suggestions or take charge in figuring out a specific task, we welcome any way you want to get involved. If you find an issue you would like to work on you can drop a comment, ask questions or discuss the approaches. Or if you see we are missing specific things feel free to create new issues. @@ -36,9 +40,11 @@ We have a dedicated slack channel `#vocab-idl`- in the JSON Schema slack, join t You can use this channel if you don't know where to get started, discuss specifics of issues, get updates, or in general hang out. ## Contributing + In case contributions are for changing content on git, please refer to [git workflow](./git_workflow.md). Any contributions that changes the content of this repository MUST go through pull requests. 
+ ## Contributors ✨ Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)): From 37bcc06920dd8de9fd2d7227cc2d35538ecf24cb Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Tue, 31 Oct 2023 13:12:30 +1300 Subject: [PATCH 3/5] use org-level code of conduct --- CODE_OF_CONDUCT.md | 51 ---------------------------------------------- README.md | 2 +- 2 files changed, 1 insertion(+), 52 deletions(-) delete mode 100644 CODE_OF_CONDUCT.md diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md deleted file mode 100644 index 2f0b7de..0000000 --- a/CODE_OF_CONDUCT.md +++ /dev/null @@ -1,51 +0,0 @@ - -> Copied mostly from the [AsyncAPI Git-workflow](https://github.com/asyncapi/.github/blob/master/git-workflow.md) - -# Contributor Covenant Code of Conduct - -## Our Pledge - -In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation. 
- -## Our Standards - -Examples of behavior that contributes to creating a positive environment include: - -* Using welcoming and inclusive language -* Being respectful of differing viewpoints and experiences -* Gracefully accepting constructive criticism -* Focusing on what is best for the community -* Showing empathy towards other community members - -Examples of unacceptable behavior by participants include: - -* The use of sexualized language or imagery and unwelcome sexual attention or advances -* Trolling, insulting/derogatory comments, and personal or political attacks -* Public or private harassment -* Publishing others' private information, such as a physical or electronic address, without explicit permission -* Other conduct which could reasonably be considered inappropriate in a professional setting - -## Our Responsibilities - -Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. - -Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. - -## Scope - -This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. 
- -## Enforcement - -Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project lead report@jsonschema.dev. - -The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. - -Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. - -## Attribution - -This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version] - -[homepage]: http://contributor-covenant.org -[version]: http://contributor-covenant.org/version/1/4/ diff --git a/README.md b/README.md index b27afb0..866d917 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ To help explain the process, the project, the SIG and the community, in terms of ## Code of Conduct -Respect each other. Choose empathy over judgement. Act according to the [Code of Conduct](./CODE_OF_CONDUCT.md). +Respect each other. Choose empathy over judgement. Act according to the [Code of Conduct](https://github.com/json-schema-org/.github/blob/main/CODE_OF_CONDUCT.md). 
## Getting involved From b18778486904cf9ecd2ddf87a41c77777e33640e Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Tue, 31 Oct 2023 13:12:55 +1300 Subject: [PATCH 4/5] use org-level code of conduct (charter) --- charter.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/charter.md b/charter.md index 820db3e..b17a684 100644 --- a/charter.md +++ b/charter.md @@ -24,7 +24,7 @@ The JSON Schema organization will force this project to comply with the organiza The SIG members are no different from the regular community besides it is their responsibility to understand all points of view of the community and push the project forward. -The community can get involved by jumping into discussions in issues, pull requests, etc, as long as the [Code of Conduct](./CODE_OF_CONDUCT.md) and [git workflow](./git_workflow.md) is adhered to. +The community can get involved by jumping into discussions in issues, pull requests, etc, as long as the [Code of Conduct](https://github.com/json-schema-org/.github/blob/main/CODE_OF_CONDUCT.md) and [git workflow](./git_workflow.md) is adhered to. SIG memberships are not time-limited. There is no maximum size of the SIG. It is expected that the SIG members actively participate in discussions and maintain the repository, otherwise they will remain a contributor. 
From 2af57da85c6d762f21cca6600eaa161082eb8d6f Mon Sep 17 00:00:00 2001 From: Greg Dennis Date: Tue, 31 Oct 2023 13:22:14 +1300 Subject: [PATCH 5/5] remove notes --- .gitignore | 4 ++- notes/arrays.md | 0 notes/non-structural.md | 27 ----------------- notes/required-info.md | 34 --------------------- notes/subschemas.md | 67 ----------------------------------------- 5 files changed, 3 insertions(+), 129 deletions(-) delete mode 100644 notes/arrays.md delete mode 100644 notes/non-structural.md delete mode 100644 notes/required-info.md delete mode 100644 notes/subschemas.md diff --git a/.gitignore b/.gitignore index ccc9fd9..8414047 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1,3 @@ -*.DS_Store \ No newline at end of file +*.DS_Store + +notes/ \ No newline at end of file diff --git a/notes/arrays.md b/notes/arrays.md deleted file mode 100644 index e69de29..0000000 diff --git a/notes/non-structural.md b/notes/non-structural.md deleted file mode 100644 index 6bf697e..0000000 --- a/notes/non-structural.md +++ /dev/null @@ -1,27 +0,0 @@ -For non-structural details, many languages support (either through built-in mechanisms or third-party libraries) some way to annotate a type to indicate what range of values are valid for an object's members. In JSON Schema, these are represented with keywords like - -- `minimum` -- `maxLength` -- `multipleOf` -- `uniqueItems` -- etc. - -However, the way to represent these in language code varies greatly. - -Translating these requirements to the language model allows validation logic that's already in the language to be invoked. However it also promotes using that validation logic (which may differ from JSON Schema's intent) over using JSON Schema directly. This may cause disparity between the validations from the language and JSON Schema. - -Unless we can ensure that the code annotations provide the same validation that an equivalent JSON Schema would, I'm not convinced including them is a good idea. 
- ---- - -Another aspect of this is schemas which have these requirements at the top-level: - -```jsonc -{ - "type": "array", - "items": { /* ... */ }, - "minItems": 4 -} -``` - -Usually these types of constraints are expressed on object members, not on types. Does it make sense to say that an instance of an array must have at least 4 items, or that a lone integer must be more than 50? (I can see a case for `minimum: 0` being used to identify an unsigned integer type, though.) \ No newline at end of file diff --git a/notes/required-info.md b/notes/required-info.md deleted file mode 100644 index 2a615fc..0000000 --- a/notes/required-info.md +++ /dev/null @@ -1,34 +0,0 @@ -When generating a schema from a type, we generally have all of the information we need. (I'm having trouble thinking of a type declaration that doesn't have enough information to define a schema.) - -However, code generation sometimes requires some annotative keywords. For example, - -```json -{ - "type": "object", - "properties": { - "foo": { "type": "integer" }, - "bar": { "type": "string" } - } -} -``` - -provides enough information to describe the structure of a type, but a type usually needs a name. - -Some types don't need this, though. For an array of objects, - -```jsonc -{ - "type": "array", - "items": { /* ... */ } -} -``` - -most languages have one or more built-in collection types that can handle this, e.g. `List` in .NET or `NSArray` in Objective-C, or even a simple array (which probably every usable language has). - -For custom objects, though, in order to generate type code correctly, there should be a minimum amount of information present. Specifically, - -- `title` should hold the type name -- `description` could map to a comment or other in-code documentation about the type -- any other ideas? - -We also need to identify when these should be required. Can that be represented in a meta-schema, or will that be up to the generator to decide? 
\ No newline at end of file diff --git a/notes/subschemas.md b/notes/subschemas.md deleted file mode 100644 index fb77afa..0000000 --- a/notes/subschemas.md +++ /dev/null @@ -1,67 +0,0 @@ -Subschemas and nested types appear everywhere. Generally, a subschema will represent a new type. - -```json -{ - "type": "object", - "properties": { - "foo": { "type": "integer" }, - "bar": { - "type": "object", - "properties": { - "numbers": { - "type": "array", - "items": { "type": "number" } - } - } - } - } -} -``` - -For this schema, we have four types being represented: - -- the top-level object, with properties `foo` and `bar` -- an integer -- an object at `/properties/bar` -- a number - -(#45 discusses built-in types, so we'll leave that out of this issue.) - -Specifically, we need to focus on the two custom objects: the top-level and whatever `bar` is. - -I would expect a generator to create two types from this schema. Are there any restrictions that people can see? - ---- - -What happens when a subschema is duplicated? - -```json -{ - "type": "object", - "properties": { - "foo": { "type": "integer" }, - "bar": { - "type": "object", - "properties": { - "numbers": { - "type": "array", - "items": { "type": "number" } - } - } - }, - "baz": { - "type": "object", - "properties": { - "numbers": { - "type": "array", - "items": { "type": "number" } - } - } - } - } -} -``` - -Here, `/properties/bar` and `/properties/baz` are identical. Is this the author's intent, or are these two semantically (business rules) different yet functionally (shape/structure) identical? Do we need a way to discern this? Maybe #50 can help identify author intent. For example, if the subschemas have the same `title`, it was intended that they're the same type. - -(`$ref`'d subschemas are obviously the same type.) \ No newline at end of file