
WIP: Adding TRT options/task #435

Draft · pranavm-nvidia wants to merge 11 commits into main from trt-task
Conversation

pranavm-nvidia (Collaborator)
No description provided.

Comment on lines 38 to 50
// TODO (pranavm): Figure out a better way to reuse TRT translation options -
// maybe move to options providers?
struct TensorRTOptions
    : public mlirtrt::compiler::OptionsProvider<TensorRTOptions> {
  mlir::tensorrt::TensorRTTranslationOptions options;

  void addToOptions(mlir::OptionsContext &context) {
    options.addToOptions(context);
  }
};
pranavm-nvidia (Collaborator, Author)
I can move TensorRTTranslationOptions and make it an options provider, if that makes sense to do.
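
A minimal sketch of what that could look like, assuming the OptionsProvider CRTP base shown above; the flag name in the comment is hypothetical, not the actual translation option:

namespace mlir::tensorrt {
// TensorRTTranslationOptions registers itself directly as an options
// provider, so the TensorRTOptions wrapper struct above goes away.
struct TensorRTTranslationOptions
    : public mlirtrt::compiler::OptionsProvider<TensorRTTranslationOptions> {
  void addToOptions(mlir::OptionsContext &context) {
    // Register each translation flag on the shared context here,
    // e.g. (hypothetical flag name):
    // context.addOption("tensorrt-builder-opt-level", builderOptLevel);
  }
};
} // namespace mlir::tensorrt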

Comment on lines 30 to 42
TensorRTToExecutableOptions::TensorRTToExecutableOptions(
    TaskExtensionRegistry extensions) {
  // TODO (pranavm): Do we need to support extensions?
}
pranavm-nvidia (Collaborator, Author)
Not sure if we want to require extensions for all options types, or if we need to handle both cases in the options registry. If it's the former, I can just assert that the extensions are empty here (or maybe even add support). If it's the latter, we could have a setExtensions method so extensions become optional instead of being part of the constructor.
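
A minimal sketch of the second alternative, with setExtensions as an assumed method name rather than an existing API:

#include <utility>

class TensorRTToExecutableOptions {
public:
  // Default-constructible: extensions are no longer required up front.
  TensorRTToExecutableOptions() = default;

  // Callers that do need extensions install them explicitly.
  void setExtensions(TaskExtensionRegistry newExtensions) {
    extensions = std::move(newExtensions);
  }

private:
  TaskExtensionRegistry extensions;
};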

@yizhuoz004 force-pushed the trt-task branch 2 times, most recently from 0c2e89c to 250f6f4, on January 21, 2025 at 23:45
Fix TensorRTOptions registration

LINK_LIBS PUBLIC
  MLIRIR
  MLIRTensorRTRegistration
Collaborator
This catch-all library has been removed in a commit that hasn't been synced up yet. Just remove this and add the dependent libraries explicitly.


buildPostClusteringPipeline(pm, options);

mlir::executor::ConvertStdToExecutorPassOptions stdToExecOpts;
Collaborator
You will want to add a host-target flag which can take the values executor, llvm, or emitc; see the sketch below. This executor lowering pass is just for the executor option. The pipelines for the other branches will be synced up Friday.
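
A minimal sketch of such a flag using llvm::cl; the enum, flag name, and registration point are assumptions, and the real option would likely go through the project's options infrastructure rather than a global cl::opt:

#include "llvm/Support/CommandLine.h"

enum class HostTarget { Executor, LLVM, EmitC };

static llvm::cl::opt<HostTarget> hostTarget(
    "host-target", llvm::cl::desc("Lowering target for host code"),
    llvm::cl::values(
        clEnumValN(HostTarget::Executor, "executor", "Lower to Executor IR"),
        clEnumValN(HostTarget::LLVM, "llvm", "Lower to LLVM IR"),
        clEnumValN(HostTarget::EmitC, "emitc", "Lower to EmitC")),
    llvm::cl::init(HostTarget::Executor));

// Only the executor target runs the ConvertStdToExecutor lowering:
if (hostTarget == HostTarget::Executor) {
  mlir::executor::ConvertStdToExecutorPassOptions stdToExecOpts;
  // ... build the executor lowering pipeline as above ...
}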

SmallVector<Value> inputs =
    makeRegionIsolatedFromAbove(rewriter, inlineGroupOp.getRegion());

tensorrt::TensorRTModuleOp trtModule =
    getOrCreateTensorRTModuleOp(inlineGroupOp);
yizhuoz004 (Collaborator) · Jan 25, 2025

@christopherbate I had to move getOrCreateTensorRTModuleOp here to make this pass work; otherwise, in createOutlinedFunc, the SymbolTable(module) line hits a segfault. Could you check whether this is correct?

An example run, input:

func.func @trt_gather_default1(%arg0: tensor<10x20x30xf32>, %arg1: tensor<2x5xi32>,
                %arg2: tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32> {
  %0 = tensorrt.gather {
    axis = 1 : i64
  } ins(%arg0, %arg1 : tensor<10x20x30xf32>, tensor<2x5xi32>) -> tensor<10x2x5x30xf32>
  %1 = tensorrt.element_wise <kSUM>(%0, %arg2 : tensor<10x2x5x30xf32>, tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32>
  return %1 : tensor<10x2x5x30xf32>
}

Output:

module {
  func.func @trt_gather_default1(%arg0: tensor<10x20x30xf32>, %arg1: tensor<2x5xi32>, %arg2: tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32> {
    %0 = tensorrt.call_alloc @trt_engines::@tensorrt_cluster(%arg0, %arg1, %arg2 : tensor<10x20x30xf32>, tensor<2x5xi32>, tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32>
    return %0 : tensor<10x2x5x30xf32>
  }
  tensorrt.module @trt_engines {
    func.func @tensorrt_cluster(%arg0: tensor<10x20x30xf32>, %arg1: tensor<2x5xi32>, %arg2: tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32> {
      %0 = tensorrt.gather {axis = 1 : i64} ins(%arg0, %arg1 : tensor<10x20x30xf32>, tensor<2x5xi32>) -> tensor<10x2x5x30xf32>
      %1 = tensorrt.element_wise <kSUM>(%0, %arg2 : tensor<10x2x5x30xf32>, tensor<10x2x5x30xf32>) -> tensor<10x2x5x30xf32>
      return %1 : tensor<10x2x5x30xf32>
    }
  }
}

Collaborator
Commented above on how to fix. You should move it out of outlineOp, back to its original position, to avoid unnecessarily performing multiple linear scans; a sketch of that shape follows.
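
A minimal sketch of the suggested fix, assuming getOrCreateTensorRTModuleOp can take the enclosing ModuleOp and that collectInlineGroupOps is a hypothetical helper that gathers the cluster ops:

void runOnOperation() {
  ModuleOp module = getOperation();
  IRRewriter rewriter(&getContext());

  // One symbol-table scan (or creation) up front, instead of once per
  // outlined cluster. Assumes an overload taking the enclosing module.
  tensorrt::TensorRTModuleOp trtModule = getOrCreateTensorRTModuleOp(module);

  // Each cluster reuses the cached handle; outlineOp no longer re-walks
  // the module looking for (or re-creating) the tensorrt.module.
  for (Operation *inlineGroupOp : collectInlineGroupOps(module))
    if (failed(outlineOp(rewriter, inlineGroupOp, trtModule)))
      return signalPassFailure();
}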
