Documented example of extending TransformComponent is incorrect #1353

Open
brettimus opened this issue Oct 20, 2024 · 4 comments
Labels
bug (Something isn't working), documentation (Improvements or additions to documentation)

Comments

@brettimus

Issue in apps/docs/docs/modules/ingestion_pipeline/transformations.md

There is an issue with an example of ingestion/transformation on the documentation site.

The custom transformations section (on both the website and the doc page in the repo) shows that you need to implement a transform method on a subclass of TransformComponent:

import { TransformComponent, TextNode } from "llamaindex";

export class RemoveSpecialCharacters extends TransformComponent {
  async transform(nodes: TextNode[]): Promise<TextNode[]> {
    for (const node of nodes) {
      node.text = node.text.replace(/[^\w\s]/gi, "");
    }

    return nodes;
  }
}

Then use it like this:

async function main() {
  const pipeline = new IngestionPipeline({
    transformations: [new RemoveSpecialCharacters()],
  });

  const nodes = await pipeline.run({
    documents: [
      new Document({ text: "I am 10 years old. John is 20 years old." }),
    ],
  });

  for (const node of nodes) {
    console.log(node.getContent(MetadataMode.NONE));
  }
}

However, the implementation of TransformComponent (in packages/core/src/schema/types.ts) expects a transform function to be passed to the constructor:

export class TransformComponent {
  constructor(transformFn: TransformComponentSignature) {
    Object.defineProperties(
      transformFn,
      Object.getOwnPropertyDescriptors(this.constructor.prototype),
    );
    const transform = function transform(
      ...args: Parameters<TransformComponentSignature>
    ) {
      return transformFn(...args);
    };
    Reflect.setPrototypeOf(transform, new.target.prototype);
    transform.id = randomUUID();
    return transform;
  }
}

This means we get a type error when using the code from the docs:

[screenshots of the type error]
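
Given the constructor quoted above, one pattern that should satisfy it is to pass the transform logic to `super()` rather than overriding a method. The following is a minimal sketch under stated assumptions, not the library's documented usage: `NodeLike`, `TransformFn`, and `TransformComponentStandIn` are hypothetical stand-ins so the example runs without installing llamaindex, and the stand-in keeps only the "constructor takes a function" trait (the real class returns a callable object, which is not reproduced here).

```typescript
// Hypothetical stand-in for a text node; the real TextNode has more fields.
interface NodeLike {
  text: string;
}

// Hypothetical stand-in for TransformComponentSignature.
type TransformFn = (nodes: NodeLike[]) => Promise<NodeLike[]>;

// Simplified stand-in for TransformComponent. Like the quoted implementation,
// it receives the transform function as a constructor argument. Unlike the
// real class, it stores the function in a field instead of becoming callable.
class TransformComponentStandIn {
  readonly run: TransformFn;

  constructor(transformFn: TransformFn) {
    this.run = transformFn;
  }
}

// The subclass hands its logic to super() instead of defining transform().
class RemoveSpecialCharacters extends TransformComponentStandIn {
  constructor() {
    super(async (nodes) => {
      for (const node of nodes) {
        // Strip everything except word characters and whitespace.
        node.text = node.text.replace(/[^\w\s]/gi, "");
      }
      return nodes;
    });
  }
}

// Runs the component on the sample text from the docs example.
async function demo(): Promise<string[]> {
  const component = new RemoveSpecialCharacters();
  const nodes = await component.run([
    { text: "I am 10 years old. John is 20 years old." },
  ]);
  return nodes.map((node) => node.text);
}
```

With the real class, the same idea would presumably be written as `super(async (nodes) => { ... })` inside a subclass of `TransformComponent`. That usage is an assumption based on the quoted constructor and has not been confirmed by the maintainers in this thread.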

himself65 added the bug and documentation labels on Oct 20, 2024
@himself65
Member

This is partly a documentation issue, but I'm also thinking about improving some of the logic.

@brettimus
Author

I can open a docfix if you’d like, but I wasn’t sure how to document the behavior given all the possibilities here (filtering by node type, etc.).

If you have some guidance, I could take a crack at it.

@brettimus
Author

Doesn’t have to be super concrete guidance, just something to get me started writing

@himself65
Member

You can start with this code:

export class TransformComponent {

and add something like this to it:

abstract transform();
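
One possible reading of this suggestion is a base class that declares `transform` as an abstract method, so that the subclass shape shown in the docs would type-check. The sketch below illustrates that reading only. `SketchNode`, `AbstractTransformSketch`, and `StripSpecialCharacters` are hypothetical names, and nothing here reflects how the library is or will be implemented.

```typescript
// Hypothetical stand-in for a text node, so the sketch runs without llamaindex.
interface SketchNode {
  text: string;
}

// Hypothetical redesign: subclasses are required to implement transform(),
// which is the shape the documented example assumes.
abstract class AbstractTransformSketch {
  abstract transform(nodes: SketchNode[]): Promise<SketchNode[]>;
}

// Same logic as the docs example, written against the abstract base class.
class StripSpecialCharacters extends AbstractTransformSketch {
  async transform(nodes: SketchNode[]): Promise<SketchNode[]> {
    for (const node of nodes) {
      // Strip everything except word characters and whitespace.
      node.text = node.text.replace(/[^\w\s]/gi, "");
    }
    return nodes;
  }
}

// Runs the transform on a small sample input.
async function demoAbstract(): Promise<string[]> {
  const component = new StripSpecialCharacters();
  const nodes = await component.transform([{ text: "Hello, world!" }]);
  return nodes.map((node) => node.text);
}
```

A design like this would leave open how the pipeline invokes the component and how existing function-based transformations keep working. Those questions are not addressed in the thread.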
