
WIP: Adding TRT options/task #435

Draft · pranavm-nvidia wants to merge 11 commits into main from trt-task
Conversation

pranavm-nvidia (Collaborator)
No description provided.

Comment on lines 38 to 50
// TODO (pranavm): Figure out a better way to reuse TRT translation options -
// maybe move to options providers?
struct TensorRTOptions
    : public mlirtrt::compiler::OptionsProvider<TensorRTOptions> {
  mlir::tensorrt::TensorRTTranslationOptions options;

  void addToOptions(mlir::OptionsContext &context) {
    options.addToOptions(context);
  }
};
pranavm-nvidia (Collaborator, Author)
I can move TensorRTTranslationOptions and make it an options provider, if that makes sense to do.
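
A minimal sketch of what that could look like, assuming the OptionsProvider CRTP base shown above; the flag name in the comment is hypothetical, not the actual translation option:

namespace mlir::tensorrt {
// TensorRTTranslationOptions registers itself directly as an options
// provider, so the TensorRTOptions wrapper struct above goes away.
struct TensorRTTranslationOptions
    : public mlirtrt::compiler::OptionsProvider<TensorRTTranslationOptions> {
  void addToOptions(mlir::OptionsContext &context) {
    // Register each translation flag on the shared context here,
    // e.g. (hypothetical flag name):
    // context.addOption("tensorrt-builder-opt-level", builderOptLevel);
  }
};
} // namespace mlir::tensorrt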

Comment on lines 30 to 42
TensorRTToExecutableOptions::TensorRTToExecutableOptions(
    TaskExtensionRegistry extensions) {
  // TODO (pranavm): Do we need to support extensions?
}
pranavm-nvidia (Collaborator, Author)
Not sure if we want to require extensions for all options types, or if we need to handle both cases in the options registry. If it's the former, I can just assert that the extensions are empty here (or maybe even add support). If it's the latter, we could have a setExtensions method so extensions become optional instead of being part of the constructor.
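
A minimal sketch of the second alternative, with setExtensions as an assumed method name rather than an existing API:

#include <utility>

class TensorRTToExecutableOptions {
public:
  // Default-constructible: extensions are no longer required up front.
  TensorRTToExecutableOptions() = default;

  // Callers that do need extensions install them explicitly.
  void setExtensions(TaskExtensionRegistry newExtensions) {
    extensions = std::move(newExtensions);
  }

private:
  TaskExtensionRegistry extensions;
};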

@yizhuoz004 force-pushed the trt-task branch 2 times, most recently from 0c2e89c to 250f6f4, on January 21, 2025 at 23:45
Fix TensorRTOptions registration

LINK_LIBS PUBLIC
  MLIRIR
  MLIRTensorRTRegistration
Collaborator
This catch-all library has been removed in a commit that hasn't been synced up yet. Just remove this and add the dependent libraries explicitly.


buildPostClusteringPipeline(pm, options);

mlir::executor::ConvertStdToExecutorPassOptions stdToExecOpts;
Collaborator
You will want to add a host-target flag which can take the values executor, llvm, or emitc; see the sketch below. This executor lowering pass is just for the executor option. The pipelines for the other branches will be synced up Friday.
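
A minimal sketch of such a flag using llvm::cl; the enum, flag name, and registration point are assumptions, and the real option would likely go through the project's options infrastructure rather than a global cl::opt:

#include "llvm/Support/CommandLine.h"

enum class HostTarget { Executor, LLVM, EmitC };

static llvm::cl::opt<HostTarget> hostTarget(
    "host-target", llvm::cl::desc("Lowering target for host code"),
    llvm::cl::values(
        clEnumValN(HostTarget::Executor, "executor", "Lower to Executor IR"),
        clEnumValN(HostTarget::LLVM, "llvm", "Lower to LLVM IR"),
        clEnumValN(HostTarget::EmitC, "emitc", "Lower to EmitC")),
    llvm::cl::init(HostTarget::Executor));

// Only the executor target runs the ConvertStdToExecutor lowering:
if (hostTarget == HostTarget::Executor) {
  mlir::executor::ConvertStdToExecutorPassOptions stdToExecOpts;
  // ... build the executor lowering pipeline as above ...
}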

SmallVector<Value> inputs =
    makeRegionIsolatedFromAbove(rewriter, inlineGroupOp.getRegion());

tensorrt::TensorRTModuleOp trtModule =
    getOrCreateTensorRTModuleOp(inlineGroupOp);
yizhuoz004 (Collaborator) · Jan 25, 2025

@christopherbate I had to move getOrCreateTensorRTModuleOp here to make this pass work; otherwise, in createOutlinedFunc, the SymbolTable(module) line hits a segfault. Could you check whether this is correct?

An example run, input:

func.func @trt_gather_default1(%arg0: tensor<10x20x30xf32>, %arg1: tensor<2x5xi32>,
                %arg2: tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32> {
  %0 = tensorrt.gather {
    axis = 1 : i64
  } ins(%arg0, %arg1 : tensor<10x20x30xf32>, tensor<2x5xi32>) -> tensor<10x2x5x30xf32>
  %1 = tensorrt.element_wise <kSUM>(%0, %arg2 : tensor<10x2x5x30xf32>, tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32>
  return %1 : tensor<10x2x5x30xf32>
}

Output:

module {
  func.func @trt_gather_default1(%arg0: tensor<10x20x30xf32>, %arg1: tensor<2x5xi32>, %arg2: tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32> {
    %0 = tensorrt.call_alloc @trt_engines::@tensorrt_cluster(%arg0, %arg1, %arg2 : tensor<10x20x30xf32>, tensor<2x5xi32>, tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32>
    return %0 : tensor<10x2x5x30xf32>
  }
  tensorrt.module @trt_engines {
    func.func @tensorrt_cluster(%arg0: tensor<10x20x30xf32>, %arg1: tensor<2x5xi32>, %arg2: tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32> {
      %0 = tensorrt.gather {axis = 1 : i64} ins(%arg0, %arg1 : tensor<10x20x30xf32>, tensor<2x5xi32>) -> tensor<10x2x5x30xf32>
      %1 = tensorrt.element_wise <kSUM>(%0, %arg2 : tensor<10x2x5x30xf32>, tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32>
      return %1 : tensor<10x2x5x30xf32>
    }
  }
}

Collaborator
Commented above on how to fix. You should move it out of outlineOp, back to its original position, to avoid unnecessarily performing multiple linear scans; a sketch of that shape follows.
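
A minimal sketch of the suggested fix, assuming getOrCreateTensorRTModuleOp can take the enclosing ModuleOp and that collectInlineGroupOps is a hypothetical helper that gathers the cluster ops:

void runOnOperation() {
  ModuleOp module = getOperation();
  IRRewriter rewriter(&getContext());

  // One symbol-table scan (or creation) up front, instead of once per
  // outlined cluster. Assumes an overload taking the enclosing module.
  tensorrt::TensorRTModuleOp trtModule = getOrCreateTensorRTModuleOp(module);

  // Each cluster reuses the cached handle; outlineOp no longer re-walks
  // the module looking for (or re-creating) the tensorrt.module.
  for (Operation *inlineGroupOp : collectInlineGroupOps(module))
    if (failed(outlineOp(rewriter, inlineGroupOp, trtModule)))
      return signalPassFailure();
}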
