Skip to content

Commit

Permalink
Merge pull request #2997 from szarnyasg/nits-20240605f
Browse files Browse the repository at this point in the history
Rework JSON page examples
  • Loading branch information
szarnyasg authored Jun 5, 2024
2 parents 3ba7207 + 21e75f9 commit 439edec
Show file tree
Hide file tree
Showing 6 changed files with 67 additions and 37 deletions.
3 changes: 3 additions & 0 deletions data/arrays.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
5 changes: 5 additions & 0 deletions data/records-in-array.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[
{"key1":"value1", "key2": "value1"},
{"key1":"value2", "key2": "value2"},
{"key1":"value3", "key2": "value3"}
]
3 changes: 3 additions & 0 deletions data/records.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{"key1":"value1", "key2": "value1"}
{"key1":"value2", "key2": "value2"}
{"key1":"value3", "key2": "value3"}
12 changes: 12 additions & 0 deletions data/unstructured.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"key1":"value1",
"key2":"value1"
}
{
"key1":"value2",
"key2":"value2"
}
{
"key1":"value3",
"key2":"value3"
}
75 changes: 40 additions & 35 deletions docs/data/json/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,10 +38,11 @@ CREATE TABLE todos (userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN
COPY todos FROM 'todos.json';
```

Alternatively, create a table without specifying the schema manually:
Alternatively, create a table without specifying the schema manually with a [`CREATE TABLE ... AS SELECT` clause](../../sql/statements/create_table#create-table--as-select-ctas):

```sql
CREATE TABLE todos AS SELECT * FROM 'todos.json';
CREATE TABLE todos AS
SELECT * FROM 'todos.json';
```

Write the result of a query to a JSON file:
Expand Down Expand Up @@ -99,6 +100,8 @@ FROM filename.json;
With `format = 'newline_delimited'` newline-delimited JSON can be parsed.
Each line is a JSON.

We use the example file [`records.json`](/data/records.json) with the following content:

```json
{"key1":"value1", "key2": "value1"}
{"key1":"value2", "key2": "value2"}
Expand All @@ -112,15 +115,16 @@ FROM read_json_auto('records.json', format = 'newline_delimited');

<div class="narrow_table"></div>

| key1 | key2 |
|----------|----------|
| `value1` | `value1` |
| `value2` | `value2` |
| `value3` | `value3` |
| key1 | key2 |
|--------|--------|
| value1 | value1 |
| value2 | value2 |
| value3 | value3 |

### Format: `array`

If the JSON file contains a JSON array of objects (pretty-printed or not), `array_of_objects` may be used.
To demonstrate its use, we use the example file [`records-in-array.json`](/data/records-in-array.json):

```json
[
Expand All @@ -132,20 +136,21 @@ If the JSON file contains a JSON array of objects (pretty-printed or not), `arra

```sql
SELECT *
FROM read_json_auto('array.json', format = 'array');
FROM read_json_auto('records-in-array.json', format = 'array');
```

<div class="narrow_table"></div>

| key1 | key2 |
|----------|----------|
| `value1` | `value1` |
| `value2` | `value2` |
| `value3` | `value3` |
| key1 | key2 |
|--------|--------|
| value1 | value1 |
| value2 | value2 |
| value3 | value3 |

### Format: `unstructured`

If the JSON file contains JSON that is not newline-delimited or an array, `unstructured` may be used.
To demonstrate its use, we use the example file [`unstructured.json`](/data/unstructured.json):

```json
{
Expand All @@ -169,18 +174,18 @@ FROM read_json_auto('unstructured.json', format = 'unstructured');

<div class="narrow_table"></div>

| key1 | key2 |
|----------|----------|
| `value1` | `value1` |
| `value2` | `value2` |
| `value3` | `value3` |
| key1 | key2 |
|--------|--------|
| value1 | value1 |
| value2 | value2 |
| value3 | value3 |

## Examples of Records Settings

The JSON extension can attempt to determine whether a JSON file contains records when setting `records = auto`.
When `records = true`, the JSON extension expects JSON objects, and will unpack the fields of JSON objects into individual columns.

Continuing with the same example file from before:
Continuing with the same example file, [`records.json`](/data/records.json):

```json
{"key1":"value1", "key2": "value1"}
Expand All @@ -195,11 +200,11 @@ FROM read_json_auto('records.json', records = true);

<div class="narrow_table"></div>

| key1 | key2 |
|----------|----------|
| `value1` | `value1` |
| `value2` | `value2` |
| `value3` | `value3` |
| key1 | key2 |
|--------|--------|
| value1 | value1 |
| value2 | value2 |
| value3 | value3 |

When `records = false`, the JSON extension will not unpack the top-level objects, and create `STRUCT`s instead:

Expand All @@ -210,13 +215,13 @@ FROM read_json_auto('records.json', records = false);

<div class="narrow_table"></div>

| json |
|------------------------------------|
| `{'key1': value1, 'key2': value1}` |
| `{'key1': value2, 'key2': value2}` |
| `{'key1': value3, 'key2': value3}` |
| json |
|----------------------------------|
| {'key1': value1, 'key2': value1} |
| {'key1': value2, 'key2': value2} |
| {'key1': value3, 'key2': value3} |

This is especially useful if we have non-object JSON, for example:
This is especially useful if we have non-object JSON, for example, [`arrays.json`](/data/arrays.json):

```json
[1, 2, 3]
Expand All @@ -231,11 +236,11 @@ FROM read_json_auto('arrays.json', records = false);

<div class="narrow_table"></div>

| json |
|-------------|
| `[1, 2, 3]` |
| `[4, 5, 6]` |
| `[7, 8, 9]` |
| json |
|-----------|
| [1, 2, 3] |
| [4, 5, 6] |
| [7, 8, 9] |

## Writing

Expand Down
6 changes: 4 additions & 2 deletions docs/extensions/json.md
Original file line number Diff line number Diff line change
Expand Up @@ -249,8 +249,10 @@ DuckDB can convert JSON arrays directly to its internal `LIST` type, and missing

```sql
SELECT *
FROM read_json(['my_file1.json', 'my_file2.json'],
columns = {duck: 'INTEGER', goose: 'INTEGER[]', swan: 'DOUBLE'});
FROM read_json(
['my_file1.json', 'my_file2.json'],
columns = {duck: 'INTEGER', goose: 'INTEGER[]', swan: 'DOUBLE'}
);
```

<div class="narrow_table"></div>
Expand Down

0 comments on commit 439edec

Please sign in to comment.