Update README.md
lmangani authored Dec 8, 2024
1 parent 284a9c2 commit 92d6986
Showing 1 changed file with 29 additions and 84 deletions.
113 changes: 29 additions & 84 deletions README.md
@@ -1,93 +1,38 @@
# DuckDB Rust extension template
This is an **experimental** template for Rust-based extensions built on the C Extension API of DuckDB. The goal is to
eventually turn this into a stable basis for pure-Rust DuckDB extensions that can be submitted to the Community extensions
repository.
<img src="https://github.com/user-attachments/assets/46a5c546-7e9b-42c7-87f4-bc8defe674e0" width=250 />

Features:
- No DuckDB build required
- No C++ or C code required
- CI/CD chain preconfigured
- (Coming soon) Works with community extensions
# DuckDB ClickHouse Native File reader
This experimental Rust extension allows reading ClickHouse Native format database files.

## Cloning
> Experimental: USE AT YOUR OWN RISK!
Clone the repo with submodules

```shell
git clone --recurse-submodules <repo>
```

## Dependencies
In principle, these extensions can be compiled with the Rust toolchain alone. However, this template relies on some additional
tooling to make life a little easier and to be able to share CI/CD infrastructure with extension templates for other languages:

- Python3
- Python3-venv
- [Make](https://www.gnu.org/software/make)
- Git

Installing these dependencies will vary per platform:
- For Linux, these come generally pre-installed or are available through the distro-specific package manager.
- For MacOS, [homebrew](https://formulae.brew.sh/).
- For Windows, [chocolatey](https://community.chocolatey.org/).

## Building
After installing the dependencies, building is a two-step process. Firstly run:
```shell
make configure
<!--
### 📦 Installation
```sql
INSTALL clickhouse_native FROM community;
LOAD clickhouse_native;
```
This will ensure a Python venv is set up with DuckDB and DuckDB's test runner installed. Additionally, depending on configuration,
DuckDB will be used to determine the correct platform for which you are compiling.
-->

Then, to build the extension run:
```shell
make debug
```
This delegates the build process to cargo, which will produce a shared library in `target/debug/<shared_lib_name>`. After this step,
a script is run to transform the shared library into a loadable extension by appending a binary footer. The resulting extension is written
to the `build/debug` directory.

To create optimized release binaries, simply run `make release` instead.

## Testing
This extension uses the DuckDB Python client for testing. This should be automatically installed in the `make configure` step.
The tests themselves are written in the SQLLogicTest format, just like most of DuckDB's tests. A sample test can be found in
`test/sql/<extension_name>.test`. To run the tests using the *debug* build:

```shell
make test_debug
```
### Input
Generate some files with `clickhouse-local` or `clickhouse-server`

or for the *release* build:
```shell
make test_release
```

### Version switching
Testing with different DuckDB versions is really simple:

First, run
```
make clean_all
```
to ensure the previous `make configure` step is deleted.

Then, run
```
DUCKDB_TEST_VERSION=v1.1.2 make configure
```
to select a different duckdb version to test with

Finally, build and test with
```
make debug
make test_debug
```sql
--- simple w/ one row, two columns
SELECT version(), number FROM numbers(1) INTO OUTFILE '/tmp/numbers.clickhouse' FORMAT Native;
--- simple w/ one column, five rows
SELECT number FROM numbers(5) INTO OUTFILE '/tmp/data.clickhouse' FORMAT Native;
--- complex w/ multiple types
SELECT * FROM system.functions LIMIT 10 INTO OUTFILE '/tmp/functions.clickhouse' FORMAT Native;
```
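
To give a rough idea of what such a file contains, here is a minimal Python sketch of the Native block layout as commonly documented for ClickHouse: a varint column count, a varint row count, then for each column its name, its type name, and the raw column data. This is not the extension's Rust code. The helper names (`write_varuint`, `read_varuint`, `read_string`, `read_block`, `sample_block`) are hypothetical, only `UInt64` and `String` columns are handled, and the sample bytes are built in memory to mimic the single-column `numbers(5)` query rather than read from `/tmp/data.clickhouse`.

```python
import io
import struct


def write_varuint(n: int) -> bytes:
    """Encode an unsigned integer as LEB128 (7 bits per byte, low bits first)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varuint(buf: io.BytesIO) -> int:
    """Decode one LEB128 unsigned integer from the stream."""
    result, shift = 0, 0
    while True:
        raw = buf.read(1)
        if not raw:
            raise EOFError("truncated varuint")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
        shift += 7


def read_string(buf: io.BytesIO) -> str:
    """Read a varint length followed by that many UTF-8 bytes."""
    length = read_varuint(buf)
    return buf.read(length).decode("utf-8")


def read_block(buf: io.BytesIO) -> dict:
    """Parse one block (assumed layout): column count, row count, then columns."""
    n_cols = read_varuint(buf)
    n_rows = read_varuint(buf)
    columns = {}
    for _ in range(n_cols):
        name = read_string(buf)
        type_name = read_string(buf)
        if type_name == "UInt64":
            # Fixed-width values: 8 little-endian bytes per row.
            values = list(struct.unpack(f"<{n_rows}Q", buf.read(8 * n_rows)))
        elif type_name == "String":
            values = [read_string(buf) for _ in range(n_rows)]
        else:
            raise NotImplementedError(f"type not handled in this sketch: {type_name}")
        columns[name] = (type_name, values)
    return columns


def sample_block() -> bytes:
    """Hand-built bytes for one UInt64 column named 'number' with rows 0..4."""
    name, type_name = b"number", b"UInt64"
    return (
        write_varuint(1) + write_varuint(5)
        + write_varuint(len(name)) + name
        + write_varuint(len(type_name)) + type_name
        + struct.pack("<5Q", *range(5))
    )


parsed = read_block(io.BytesIO(sample_block()))
print(parsed)
```

A real file may hold several such blocks back to back and many more column types, so a complete reader would loop until end of file and dispatch on every type it supports.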

### Known issues
This is a bit of a footgun, but the extensions produced by this template may (or may not) be broken on Windows with Python 3.11,
failing on extension load with the following error:
```shell
IO Error: Extension '<name>.duckdb_extension' could not be loaded: The specified module could not be found
### Usage
Read ClickHouse Native files with DuckDB. _Full file scans only; filtering and range reads are not implemented._
```sql
D SELECT * FROM clickhouse_native('/tmp/numbers.clickhouse');
┌──────────────┬─────────┐
│  version()   │ number  │
│   varchar    │ varchar │
├──────────────┼─────────┤
│ 24.12.1.1273 │ 0       │
└──────────────┴─────────┘
```
This was resolved by using Python 3.12.
