Commit

Merge pull request #4 from rdf-connect/timebased-bucketizer
feat: Move time-based fragmentation to dedicated bucketizer
chore: add formatting rules
0.3.0-alpha.4
ajuvercr authored Aug 22, 2024
2 parents bc34c14 + fce14cf commit 049360d
Showing 16 changed files with 3,150 additions and 2,551 deletions.
2 changes: 2 additions & 0 deletions .eslintignore
@@ -0,0 +1,2 @@
lib
node_modules
24 changes: 24 additions & 0 deletions .eslintrc.json
@@ -0,0 +1,24 @@
{
    "env": {
        "browser": true,
        "es2021": true
    },
    "extends": [
        "eslint:recommended",
        "plugin:@typescript-eslint/recommended",
        "prettier"
    ],
    "parser": "@typescript-eslint/parser",
    "parserOptions": {
        "ecmaVersion": "latest",
        "sourceType": "module"
    },
    "plugins": ["@typescript-eslint"],
    "rules": {
        "indent": ["error", 4],
        "linebreak-style": ["error", "unix"],
        "quotes": ["error", "double"],
        "semi": ["error", "always"],
        "@typescript-eslint/no-unused-vars": "warn"
    }
}
1 change: 1 addition & 0 deletions .husky/pre-commit
@@ -0,0 +1 @@
npx lint-staged
3 changes: 3 additions & 0 deletions .lintstagedrc.json
@@ -0,0 +1,3 @@
{
    "*.ts": ["eslint --fix", "prettier --write"]
}
2 changes: 2 additions & 0 deletions .prettierignore
@@ -0,0 +1,2 @@
lib
node_modules
6 changes: 6 additions & 0 deletions .prettierrc.json
@@ -0,0 +1,6 @@
{
    "trailingComma": "all",
    "tabWidth": 4,
    "semi": true,
    "singleQuote": false
}
10 changes: 2 additions & 8 deletions README.md
@@ -4,7 +4,7 @@

 Given an [SDS stream](https://w3id.org/sds/specification) and its correspondent stream of members, this processor will write everything into a supported data storage system. So far, it only supports MongoDB instances.

-SDS stream updates are stored into MongoDB collections for the LDES server to find this information when serving requests. If a `ldes:timestampPath` property is given as part of the dataset metadata, the storage writer will automatically start up a timestamp fragmentation, based on a B+ Tree strategy.
+SDS stream updates are stored into MongoDB collections for the LDES server to find this information when serving requests.

 An example of a SDS data stream with a predefined fragmentation strategy is shown next:

@@ -54,9 +54,7 @@ This processor can be used within data processing pipelines to write a SDS strea
 js:metadata "METADATA";
 js:data "DATA";
 js:index "INDEX";
-];
-js:pageSize 500;
-js:branchSize 4.
+].
 ```

 ### As a library
@@ -68,8 +66,6 @@ async function ingest(
 data: Stream<string | Quad[]>,
 metadata: Stream<string | RDF.Quad[]>,
 database: DBConfig,
-maxsize: number = 100,
-k: number = 4
 ) { /* snip */ }
 ```

@@ -78,8 +74,6 @@ arguments:
 - `data`: a stream reader that carries data (as `string` or `Quad[]`).
 - `metadata`: a stream reader that carries SDS metadata (as `string` or `Quad[]`).
 - `database`: connection parameters for a reachable MongoDB instance.
-- `maxsize`: max number of members per fragment.
-- `k`: max number of child nodes in the default time-based B+ Tree fragmentation.

 ## Authors and License

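For callers of the library, the README change above amounts to dropping the last two arguments of `ingest`. The following is a minimal sketch of a call site after this commit, not the project's actual implementation: `MemoryStream`, the `written` log, the stub body of `ingest`, the `url` field, and the local `Quad`, `Stream`, and `DBConfig` types are hypothetical stand-ins introduced only so the example runs on its own.

```typescript
// Hypothetical stand-ins: the real Quad, Stream and DBConfig types come from
// the project and its dependencies and may look different.
type Quad = { subject: string; predicate: string; object: string };

interface Stream<T> {
    data(handler: (item: T) => void): void;
}

interface DBConfig {
    url: string; // assumed field, not shown in the diff
    metadata: string; // collection names, mirroring js:metadata / js:data / js:index
    data: string;
    index: string;
}

// In-memory stream used only for this illustration.
class MemoryStream<T> implements Stream<T> {
    private handlers: Array<(item: T) => void> = [];

    data(handler: (item: T) => void): void {
        this.handlers.push(handler);
    }

    push(item: T): void {
        for (const handler of this.handlers) handler(item);
    }
}

// Records which collection each incoming item would be written to.
const written: string[] = [];

// Stub with the post-commit parameter list: `maxsize` and `k` are gone.
// The real function writes to MongoDB; this one only logs collection names.
async function ingest(
    data: Stream<string | Quad[]>,
    metadata: Stream<string | Quad[]>,
    database: DBConfig,
): Promise<void> {
    metadata.data(() => {
        written.push(database.metadata);
    });
    data.data(() => {
        written.push(database.data);
    });
}

const dataStream = new MemoryStream<string | Quad[]>();
const metadataStream = new MemoryStream<string | Quad[]>();
const config: DBConfig = {
    url: "mongodb://localhost:27017/ldes", // placeholder connection string
    metadata: "METADATA",
    data: "DATA",
    index: "INDEX",
};

// Before this commit a caller could write:
//   ingest(dataStream, metadataStream, config, 100, 4)
void ingest(dataStream, metadataStream, config);

metadataStream.push("<placeholder SDS metadata>");
dataStream.push("<placeholder member>");
```

Per the commit message, the time-based fragmentation that `maxsize` and `k` (or `js:pageSize` and `js:branchSize` in a pipeline definition) used to configure is now handled by a dedicated bucketizer rather than by the storage writer.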