Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for json_object function #664

Merged
merged 18 commits into from
Jan 20, 2025

Conversation

jorgehermo9
Copy link
Contributor

@jorgehermo9 jorgehermo9 commented Jan 13, 2025

Relates to #127. This PR is still in draft and I have a few left things to do (tests, improve implementation), just opening it so anyone can track this work meanwhile.

@@ -85,6 +85,8 @@ pub fn json_array(values: &[OwnedValue]) -> crate::Result<OwnedValue> {
let mut s = String::new();
s.push('[');

// TODO: use `convert_db_type_to_json` and map each value with that function,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems to do the same as convert_db_type_to_json and therefore it can be simplified a lot. I can deal with this refactor in a follow-up PR. I think it would lead to more simple array construction than pushing strings.

// it was obtained from another JSON function.
// TODO: the `json_array` function should use something like this.
// That implementation could be a lot simpler with this function
let json_val = convert_db_type_to_json(value)?;
Copy link
Contributor Author

@jorgehermo9 jorgehermo9 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with convert_db_type_to_json we convert the treat the values as a JSON object only if
it was obtained from another JSON function.

This matches sqlite's behaviour: https://www.sqlite.org/json1.html#jobj

An argument with SQL type TEXT it is normally converted into a quoted JSON string even if the input text is well-formed JSON. However, if the argument is the direct result from another JSON function or the -> operator (but not the ->> operator), then it is treated as JSON and all of its JSON type information and substructure is preserved. This allows calls to json_object() and json_array() to be nested. The json() function can also be used to force strings to be recognized as JSON.

the json_array function should use something like this. That implementation could be a lot simpler with this function https://github.com/tursodatabase/limbo/pull/664/files#r1917348517

// serde's `rc` feature so we can serialize Rc<String>
OwnedValue::Text(t) => t.value.to_string(),
// TODO: I matched sqlite message error here. Is this ok?
_ => crate::bail_constraint_error!("labels must be TEXT"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I matched sqlite's message error in this case. Is this ok?

let key = match key {
// TODO: We can construct the IndexMap from Rc<String>, but we must enable the
// serde's `rc` feature so we can serialize Rc<String>
OwnedValue::Text(t) => t.value.to_string(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had to use String instead of Rc<String> because of the IndexMap<String,Val> serialization. We can't serialize an IndexMap<Rc<String>,Val> if we don't have the rc feature of serde enabled.
Should be better to add that feature to avoid cloning the string here?
https://serde.rs/feature-flags.html#rc

/// the returned `Val` will be an object. If the internal text value is a regular text,
/// then a string will be returned. This is useful to track if the value came from a json
/// function and therefore we must interpret it as json instead of raw text when working with it.
fn convert_db_type_to_json(value: &OwnedValue) -> crate::Result<Val> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use a impl TryFrom<&OwnedValue> for Val? Seems more idiomatic

(the same goes to convert_json_to_db_type function a few lines above)

I can address this refactor in another PR if it makes sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good idea!

Copy link
Contributor Author

@jorgehermo9 jorgehermo9 Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect! will do it in a follow-up PR

_ => crate::bail_constraint_error!("json_object requires an even number of values"),
})
// TODO: collecting into a IndexMap does not allow for repeated keys
.collect::<Result<IndexMap<String, Val>, _>>()?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collecting into an IndexMap does not allow to have repeated keys. Sqlite documentation about this is that:

The json_object() function currently allows duplicate labels without complaint, though this might change in a future enhancement.

What is the behaviour we should have here? I think it does not make sense to have repeated keys in JSON.

}

#[test]
fn test_json_object_duplicated_keys() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SELECT json(json_object('key','value'));
} {{{"key":"value"}}}

# FIXME: this behaviour differs from sqlite. Although, sqlite docs states
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented in https://github.com/tursodatabase/limbo/pull/664/files#r1917368455. How should we proceed here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually expect this test to pass as-is, ie it should return {"key":"value2"}. Any application will eventually convert this object to a valid JSON and duplicate the keys anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I agree.

The problem is, sqlite outputs {"key:"value","key":"value2"} and this test won't pass in CI

@@ -506,3 +506,41 @@ do_execsql_test json_error_position_null {
do_execsql_test json_error_position_complex {
SELECT json_error_position('{a:null,{"h":[1,[1,2,3]],"j":"abc"}:true}');
} {{9}}

do_execsql_test json_object_simple {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inspired from sqlite's tests about this function https://github.com/sqlite/sqlite/blob/f1747f93e0f8df7984b595b91649c7789217fe59/test/json102.test#L21

But did not copy them entirely. Are those enough test cases or should I include more from sqlite's test suite?

$func.to_string()
);
};
// The only function right now that requires an even number is `json_object` and it allows
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove those comments if required but seemed a bit confusing that this behaviour differs from expect_arguments_max and expected_arguments_min.

Maybe writing a macro for this single case is a bit overengineered, was thinking about futures uses of this. I can remove this macro and extend it in L846 if required

@jorgehermo9 jorgehermo9 marked this pull request as ready for review January 15, 2025 21:43
@jorgehermo9
Copy link
Contributor Author

jorgehermo9 commented Jan 15, 2025

I'm ready for a review. Left a feed comments about some doubts I had, as I'm new in the project and I don't know much about the codebase & how you work.

Thank you very much!!

@@ -270,7 +270,7 @@ Feature support of [sqlite expr syntax](https://www.sqlite.org/lang_expr.html).
| json ->> path | Yes | |
| json_insert(json,path,value,...) | | |
| jsonb_insert(json,path,value,...) | | |
| json_object(label1,value1,...) | | |
| json_object(label1,value1,...) | Yes | When keys are duplicated, only the last one processed is returned. This differs from sqlite, where the keys in the output can be duplicated |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this note ok? Or should I work to fix that?

https://github.com/tursodatabase/limbo/pull/664/files#r1917368455

@penberg penberg changed the title feat: implement json_object Add support for json_object function Jan 16, 2025
core/json/mod.rs Outdated Show resolved Hide resolved
@penberg penberg merged commit bda1e4e into tursodatabase:main Jan 20, 2025
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants