doc: wip
batosai committed Oct 26, 2024
1 parent 994ded0 commit 4ce3174
Showing 6 changed files with 205 additions and 13 deletions.
35 changes: 35 additions & 0 deletions README.md
@@ -0,0 +1,35 @@
# ETL for Node.js

ETL (Extract, Transform, Load) is a process used to extract information from multiple sources, transform it according to your needs, and then load it into a target system, such as a database.

**Extract (Source):**
This step involves collecting data from various sources, such as databases, files (JSON, CSV, XLS, etc.), and APIs.

**Transform:**
The extracted data is often in different formats or does not meet the requirements of the target system. The transformation adjusts and formats the data to make it consistent and usable.
This may include data type conversion, data cleaning (removing duplicates, correcting errors), data enrichment, and applying business rules.

**Load (Destination):**
After transformation, the data is sent to the destination, such as a database, file, or API.
This step can be done incrementally (only adding new data) or by reloading the entire dataset.
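
As a rough illustration of the three steps, the sketch below wires them together in plain TypeScript without using this package. Every name in it (`extractUsers`, `transformUser`, `loadUsers`) and the in-memory `Map` standing in for a database are assumptions made for the example. It also contrasts an incremental load with a full reload.

```ts
// Conceptual sketch only: no library involved, all names are hypothetical.
type RawUser = { firstname: string; lastname: string; age: string }
type CleanUser = { name: string; age: number }

// Extract: yield raw rows one by one (an array stands in for a file, API, or database).
async function* extractUsers(): AsyncGenerator<RawUser> {
  const rows: RawUser[] = [
    { firstname: 'John', lastname: 'Doe', age: '30' },
    { firstname: 'Jane', lastname: 'Doe', age: '25' },
    { firstname: 'John', lastname: 'Doe', age: '30' }, // duplicate row
  ]
  for (const row of rows) {
    yield row
  }
}

// Transform: convert types and reshape the row for the target.
function transformUser(row: RawUser): CleanUser {
  return { name: `${row.firstname} ${row.lastname}`, age: Number(row.age) }
}

// Load: write into the target, either incrementally or after clearing it.
async function loadUsers(
  target: Map<string, CleanUser>,
  mode: 'incremental' | 'full'
): Promise<number> {
  if (mode === 'full') {
    target.clear() // full reload: start from an empty target
  }
  let written = 0
  for await (const raw of extractUsers()) {
    const user = transformUser(raw)
    if (target.has(user.name)) {
      continue // incremental: skip rows that are already present
    }
    target.set(user.name, user)
    written++
  }
  return written
}
```

Calling `loadUsers(target, 'incremental')` twice on the same `Map` writes two rows the first time and none the second, while `loadUsers(target, 'full')` clears the target and writes both rows again.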


## Installation

```sh
npm install @jrmc/etl
```

## Run

```ts
import etl from '@jrmc/etl'

await etl.run({
  source: UserSource,
  transform: UserTransform, // optional
  destination: UserDestination,
})
```

view Documentation
4 changes: 2 additions & 2 deletions docs/.vitepress/config.ts
@@ -22,8 +22,8 @@ export default defineConfig({
{
text: 'Samples',
items: [
-{ text: 'DB to CSV', link: '/samples/getting-started' },
-{ text: 'xlsx to db', link: '/samples/usage' },
+{ text: 'DB to CSV', link: '/samples/db-to-csv' },
+{ text: 'xlsx to db', link: '/samples/xlsx-to-db' },
]
},
{
6 changes: 5 additions & 1 deletion docs/changelog.md
@@ -1 +1,5 @@
# changelog

## 1.0.0

- first version
16 changes: 12 additions & 4 deletions docs/guide/detail.md
@@ -12,10 +12,18 @@ export interface Etl {
## Type

```ts
-type EtlAttributes = {
-  source: LazyImport | AsyncIterator
-  transform?: LazyImport | AsyncWithData
-  destination: LazyImport | AsyncWithData
+export type LazyImport = () => Promise<ImportConstructor>
+export type AsyncIterator = () => AsyncIterableIterator<any>
+export type AsyncWithData = (data: any) => Promise<any>
+
+export type SourceEtl = LazyImport | [LazyImport, options: Object] | AsyncIterator
+export type TransformEtl = LazyImport | AsyncWithData
+export type DestinationEtl = LazyImport | [LazyImport, options: Object] | AsyncWithData
+
+export type EtlAttributes = {
+  source: SourceEtl
+  transform?: TransformEtl
+  destination: DestinationEtl
}
```
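
The tuple form `[LazyImport, options]` suggests that the options object is passed to the class constructor once the lazy import resolves, which matches the constructors shown in the usage guide. The following is a minimal sketch of that resolution step, not the package's actual code. It assumes `ImportConstructor` is a module object with a default-exported class (its definition is not shown in this guide), and `instantiate` and `DemoSource` are hypothetical names.

```ts
// Assumed shape: the guide does not show how ImportConstructor is defined.
type ImportConstructor = { default: new (options?: any) => any }
type LazyImport = () => Promise<ImportConstructor>

// Hypothetical helper, not part of @jrmc/etl: accepts either a bare lazy import
// or a [lazy import, options] tuple and returns an instance of the default export.
async function instantiate(entry: LazyImport | [LazyImport, object]): Promise<any> {
  const lazy = Array.isArray(entry) ? entry[0] : entry
  const options = Array.isArray(entry) ? entry[1] : undefined
  const mod = await lazy()
  return new mod.default(options)
}

// Stand-in for a class that would normally live in its own file.
class DemoSource {
  options?: { data: unknown[] }

  constructor(options?: { data: unknown[] }) {
    this.options = options
  }
}

// Stand-in for `() => import('./demo_source.js')`.
const lazyDemo: LazyImport = async () => ({ default: DemoSource })
```

With these definitions, `await instantiate(lazyDemo)` builds the class without options, and `await instantiate([lazyDemo, { data: [1, 2] }])` forwards the object to the constructor.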
155 changes: 151 additions & 4 deletions docs/guide/usage.md
@@ -19,7 +19,7 @@ await etl.run({
```

```ts [user_array_source.js]
-import { Source } from '@jrmc/etl'
+import { Source } from '@jrmc/etl/types'

export default class UserArraySource extends Source {
async *each() {
@@ -36,7 +36,7 @@ export default class UserArraySource extends Source {
```

```ts [user_array_to_db_transform.js]
-import { Transform } from '@jrmc/etl'
+import { Transform } from '@jrmc/etl/types'

type User = {
firstname: string
@@ -55,7 +55,7 @@ export default class UserArrayToDbTransform extends Transform {
```

```ts [user_db_destination.js]
-import { Destination } from '@jrmc/etl'
+import { Destination } from '@jrmc/etl/types'

export default class UserDbDestination extends Destination {
async write(row: any) {
@@ -66,6 +66,78 @@ export default class UserDbDestination extends Destination {

:::

## by [LazyImport](/guide/detail#type) with options

::: code-group


```ts [main.js]
import etl from '@jrmc/etl'

const UserSource = () => import('./user_array_source.js')
const UserDestination = () => import('./user_db_destination.js')

await etl.run({
  source: [UserSource, {
    data: [
      { lastname: 'Doe', firstname: 'John', age: 30 },
      { lastname: 'Doe', firstname: 'Jane', age: 25 },
    ]
  }],
  destination: [UserDestination, {
    age: 22
  }],
})
```

```ts [user_array_source.js]
import { Source } from '@jrmc/etl/types'

type Options = Record<string, Array<Object>>

export default class TestWithOptionsSource implements Source {
  #data: Array<Object>

  constructor(options: Options) {
    this.#data = options.data || []
  }

  async *each() {
    for (const item of this.#data) {
      yield item
    }
  }
}
```

```ts [user_db_destination.js]
import { Destination } from '@jrmc/etl/types'

type Options = {
  age: number
}

type Person = {
  firstname: string
  lastname: string
  age: number
}

export default class TestWithOptionsDestination implements Destination {
  #age: number | null

  constructor(options: Options) {
    this.#age = options.age || null
  }

  async write(row: Person) {
    // `User` stands for the application's own model; it is not defined in this snippet.
    await User.create({
      lastname: row.lastname,
      firstname: row.firstname,
      age: this.#age ?? row.age, // the option, when set, overrides the row's age
    })
  }
}
```

:::

## by Functions

Use [AsyncIterator and AsyncWithData](/guide/detail#type) functions
@@ -94,4 +166,79 @@ await etl.run({
User.create(row)
},
})
```
```
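
The function style can be tried without the package. The sketch below is a hypothetical stand-in for `etl.run` (named `runSketch` here), written only to show how an async generator source and async functions for `transform` and `destination` fit together. `RowSource` and `RowStep` mirror the documented `AsyncIterator` and `AsyncWithData` shapes under local names, and collecting only defined return values is an assumption, not confirmed library behavior.

```ts
// Local names for the documented function shapes (assumed equivalents).
type RowSource = () => AsyncIterableIterator<any> // same shape as AsyncIterator
type RowStep = (data: any) => Promise<any> // same shape as AsyncWithData

// Hypothetical stand-in for `etl.run`, for illustration only.
async function runSketch(attrs: {
  source: RowSource
  transform?: RowStep
  destination: RowStep
}): Promise<any[]> {
  const results: any[] = []
  for await (const row of attrs.source()) {
    // The transform step is optional: pass the row through when it is absent.
    const shaped = attrs.transform ? await attrs.transform(row) : row
    const written = await attrs.destination(shaped)
    // Assumption: only values actually returned by the destination are collected.
    if (written !== undefined) {
      results.push(written)
    }
  }
  return results
}

const sketchResults = runSketch({
  source: async function* () {
    yield { lastname: 'Doe', firstname: 'John', age: 30 }
    yield { lastname: 'Doe', firstname: 'Jane', age: 25 }
  },
  transform: async (row) => ({
    name: `${row.firstname} ${row.lastname}`,
    age: row.age,
  }),
  destination: async (row) => row, // a real destination would write the row somewhere
})
```

Here `sketchResults` is a promise; awaiting it yields one transformed object per source row, in source order.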

## get results

The `run` method returns an array of results when the Destination's `write` method returns a value.

::: code-group

```ts [main.js]
import etl from '@jrmc/etl'

const UserSource = () => import('./user_array_source.js')
const UserTransform = () => import('./user_array_to_db_transform.js')
const UserDestination = () => import('./user_db_destination.js')

const results = await etl.run({
  source: UserSource,
  transform: UserTransform,
  destination: UserDestination,
})

/*
results:
[
  { name: 'John Doe', age: 30 },
  { name: 'Jane Doe', age: 25 },
]
*/
```

```ts [user_array_source.js]
import { Source } from '@jrmc/etl/types'

export default class UserArraySource extends Source {
  async *each() {
    const dataArray = [
      { lastname: 'Doe', firstname: 'John', age: 30 },
      { lastname: 'Doe', firstname: 'Jane', age: 25 },
    ]

    for (const item of dataArray) {
      yield item
    }
  }
}
```

```ts [user_array_to_db_transform.js]
import { Transform } from '@jrmc/etl/types'

type User = {
  firstname: string
  lastname: string
  age: number
}

export default class UserArrayToDbTransform extends Transform {
  async process(row: User) {
    return {
      name: `${row.firstname} ${row.lastname}`,
      age: row.age,
    }
  }
}
```

```ts [user_db_destination.js]
import { Destination } from '@jrmc/etl/types'

export default class UserDbDestination extends Destination {
  async write(row: any) {
    return row
  }
}
```

:::
2 changes: 0 additions & 2 deletions src/types.ts
@@ -11,11 +11,9 @@ export type TransformEtl = LazyImport | AsyncWithData
export type DestinationEtl = LazyImport | [LazyImport, options: Object] | AsyncWithData

export type EtlAttributes = {
-  preProcess?: () => Promise<any>
   source: SourceEtl
   transform?: TransformEtl
   destination: DestinationEtl
-  postProcess?: () => Promise<any>
}

export interface Etl {