Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Basic external ref support #31

Closed
wants to merge 5 commits into from

Conversation

lloydmeta
Copy link

Closes #3

tl;dr Add basic support for "external" $ref fields that point to files.

When expanding fields, if field's ref is pointing to a .json file, resolve it, and merge all the types it references into the current parent Expander.

The changes in this PR do this by:

  • Adding field to Expander that holds the directory of the file from which its Schema is derived; this is re-used for reading relative referenced files.
  • When expanding a field that references an external file, create a child Expander to expand the referenced Schema and merging its data (types and further referenced Schemas) into the current one. Do this recursively.
  • Maybe using a tad too many clone()s...

To be honest, I need this for another project I plan on working on, so mostly hacked around until the tests passed without paying too much attention to efficiency. Here to learn so suggestions are welcome :)

Things not handled thus far:

  • Circular dependencies (compiler stack overflows)
  • Namespacing of types (model name clashes just fail the compile)

When expanding fields, if field's ref is pointing to a
`.json` file, resolve it, and merge all the types it
references into the current parent Expander.

The changes here do this by:
* Adding field to Expander that holds the directory of the
  file from which its Schema is derived; this is re-used
  for reading relative referenced files.
* When expanding a field that references an external file,
  create a child Expander to expand the referenced Schema
  and merging its data (types and further referenced Schemas)
  into the current one. Do this recursively.
* Maybe using a tad too many clone()s...

Signed-off-by: lloydmeta <[email protected]>
Signed-off-by: lloydmeta <[email protected]>
Copy link
Owner

@Marwes Marwes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cloning is going to be very expensive for large structures and could cause the code to run in exponetial time.

A relatively easy solution would be to add the ability to hook into the generation and allow it to replace the type being generated for any given field. Then override the type of any fields holding a Schema and make it hold an Arc<Schema> which would give us cheap clones.

schemafy_lib/src/lib.rs Outdated Show resolved Hide resolved
}
}

fn to_absolute_path(&self, s: &PathBuf) -> PathBuf {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn to_absolute_path(&self, s: &PathBuf) -> PathBuf {
fn to_absolute_path(&self, s: &Path) -> PathBuf {

schemafy_lib/src/lib.rs Outdated Show resolved Hide resolved
schemafy_lib/src/lib.rs Outdated Show resolved Hide resolved
@@ -571,7 +689,23 @@ impl<'r> Expander<'r> {
None => self.expand_definitions(schema),
}

let types = self.types.iter().map(|t| &t.1);
// Dedupe because we may have visited the same file multiple times when de-referencing
// external files
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be avoided altogether? If we have already visited a file it should be cached and not visited again

Copy link
Author

@lloydmeta lloydmeta Jun 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye. I was going to go back and revisit this but forgot.

Had to do some refactoring/bug-fixing to get this fixed, but the short version is that while we don't visit the same file twice, when we kick off a child Expander for a new file, we shove all the types and Schemas from the parent Expander to it (to prevent unnecessary resolving.

when we merge the resolved types from the child Expander back into the parent Expander, we'd also get the same types that came from the Parent originally. I've updated types to be a HashMap to get around this and added a sort by type name before generating the AST to keep things deterministic (see
7976226)

@lloydmeta
Copy link
Author

A relatively easy solution would be to add the ability to hook into the generation and allow it to replace the type being generated for any given field. Then override the type of any fields holding a Schema and make it hold an Arc which would give us cheap clones.

Can you give a concrete example of how to do what you’re suggesting? You say easy but it doesn’t sounds trivial given the bootstrapping of the structs.

I could just wrap all the Schemas in Expander in Arc (or even Rc since we’re not threaded?) quite easily though I think and get cheap clones.

Fewer PathBufs, more Paths

Signed-off-by: lloydmeta <[email protected]>
@lloydmeta lloydmeta force-pushed the feature/external-ref branch 2 times, most recently from de4e72e to 18a55f5 Compare June 20, 2020 05:56
@lloydmeta
Copy link
Author

I gave the reference-counted Schema idea a shot: it turns out we also need to at least RefCell the Schema because merge_all_of needs a mutable Schema

fn merge_all_of(result: &mut Schema, r: &Schema) {
use std::collections::btree_map::Entry;
for (k, v) in &r.properties {
match result.properties.entry(k.clone()) {
Entry::Vacant(entry) => {
entry.insert(v.clone());
}
Entry::Occupied(mut entry) => merge_all_of(entry.get_mut(), v),
}
}
if let Some(ref ref_) = r.ref_ {
result.ref_ = Some(ref_.clone());
}
if let Some(ref description) = r.description {
result.description = Some(description.clone());
}
merge_option(&mut result.required, &r.required, |required, r_required| {
required.extend(r_required.iter().cloned());
});
result.type_.retain(|e| r.type_.contains(e));
}

called from

merge_all_of(result.to_mut(), &self.schema(def));

Not sure at this point if I'm getting further and further away from what you're envisioning, so it's the last commit and can easily be reverted if you just want to drive bit that yourself.

@lloydmeta
Copy link
Author

lloydmeta commented Jun 20, 2020

I opened another PR that contains a commit (14aef62) that might be closer to what you envisioned wrt adding a trigger to wrap Schema in a reference counting type, please take a look.

Use canonical paths to ensure we don't parse paths twice
even when the references are relative.

Signed-off-by: lloydmeta <[email protected]>
Signed-off-by: lloydmeta <[email protected]>
@lloydmeta
Copy link
Author

@Marwes I'm closing this because I think #32 is the way forward with cheaper clones due to the wrapping.

@lloydmeta lloydmeta closed this Jun 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support external refs
2 participants