
[Bug]: Set/Get String Tensor Data via C-API Does Not Work #26906

Open
3 tasks done
rahulchaphalkar opened this issue Oct 3, 2024 · 9 comments

@rahulchaphalkar

OpenVINO Version

2024.3.0 https://github.com/rahulchaphalkar/openvino/tree/add-extension

Operating System

Ubuntu 20.04 (LTS)

Device used for inference

CPU

Framework

None

Model used

Detokenizer.xml from TinyLlama-1.1B-Chat-v1.0

Issue description

The STRING element_type has been added to the C API, but in my testing with models that consume and produce string tensors, I see incorrect results. The test cases below compare a working C++ version against a failing C version. I have done some processing on the received string data, as you can see in the test case below, but I am not able to get a valid string output.
Reference - https://docs.openvino.ai/2024/openvino-workflow/running-inference/string-tensors.html

Step-by-step reproduction

Reproduction of getting string data from a model's output:
I was working with TinyLlama-1.1B-Chat-v1.0, which I obtained via the recommended steps in the optimum-cli/GenAI repos. I load an extension in both cases; support for loading extensions via the C API was added in my open PR, so you will need that branch for the C case below.
I provide the detokenizer model with tokens previously extracted from the TinyLlama model.

The C++ case prints this correct output:

./main /home/rahul/tools/TinyLlama-1.1B-Chat-v1.0
=2
- 2+2=4
- 3+3=6

The C case prints invalid UTF-8.

C++/Working Case

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

#include <openvino/openvino.hpp>

std::string detokenize(ov::InferRequest& detokenizer, std::vector<int64_t>& tokens) {
    constexpr size_t BATCH_SIZE = 1;
    detokenizer.set_input_tensor(ov::Tensor{ov::element::i64, {BATCH_SIZE, tokens.size()}, tokens.data()});
    detokenizer.infer();
    return detokenizer.get_output_tensor().data<std::string>()[0];
}

int main(int argc, char* argv[]) {

    std::vector<int64_t> accumulator = {29922, 29906, 13, 29899, 29871, 29906, 29974, 29906, 29922, 29946, 13, 29899, 29871, 29941, 29974, 29941, 29922, 29953, 13};
    ov::Core core;
    core.add_extension("/home/rahul/tools/tokenizers/libopenvino_tokenizers.so");

    ov::InferRequest detokenizer = core.compile_model(
        std::string{argv[1]} + "/openvino_detokenizer.xml", "CPU").create_infer_request();

    std::string text = detokenize(detokenizer, accumulator);
    std::cout << text << std::endl;
}

C/C-API Failing Case

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "openvino/c/openvino.h"

#define CHECK_STATUS(return_status)                                                          \
    if (return_status != OK) {                                                               \
        fprintf(stderr, "[ERROR] return status %d, line %d\n", return_status, __LINE__);     \
    }

char* detokenize(ov_infer_request_t* detokenizer, int64_t* tokens, int num_tokens) {
    const size_t BATCH_SIZE = 1;
    ov_status_e status;
    ov_tensor_t* input_tensor = NULL;
    ov_shape_t input_shape;
    int64_t input_shape_dims[2] = {BATCH_SIZE, num_tokens};

    status = ov_shape_create(2, input_shape_dims, &input_shape);
    if (status != OK) {
        fprintf(stderr, "Failed to create shape\n");
        return NULL;
    }

    status = ov_tensor_create_from_host_ptr(I64, input_shape, tokens, &input_tensor);
    if (status != OK) {
        fprintf(stderr, "Failed to create input tensor\n");
        return NULL;
    }

    status = ov_infer_request_set_input_tensor(detokenizer, input_tensor);
    if (status != OK) {
        fprintf(stderr, "Failed to set input tensor\n");
        return NULL;
    }

    status = ov_infer_request_infer(detokenizer);
    if (status != OK) {
        fprintf(stderr, "Failed to run inference\n");
        return NULL;
    }

    ov_tensor_t* output_tensor = NULL;
    status = ov_infer_request_get_output_tensor_by_index(detokenizer, 0, &output_tensor);
    if (status != OK) {
        fprintf(stderr, "Failed to get output tensor\n");
        return NULL;
    }

    void* output_data = NULL;
    status = ov_tensor_data(output_tensor, &output_data);
    if (status != OK) {
        fprintf(stderr, "Failed to get data from output tensor\n");
        return NULL;
    }

    size_t output_string_length = strlen((const char*)output_data);
    char* detokenized_string = (char*)malloc(output_string_length + 1);
    if (!detokenized_string) {
        fprintf(stderr, "Failed to allocate memory for detokenized string\n");
        return NULL;
    }
    strncpy(detokenized_string, (const char*)output_data, output_string_length);
    detokenized_string[output_string_length] = '\0';

    ov_tensor_free(input_tensor);
    ov_tensor_free(output_tensor);

    return detokenized_string;
}

int main(int argc, char** argv) {
    
    ov_core_t* core = NULL;
    ov_model_t* model = NULL;
    ov_compiled_model_t* compiled_model = NULL;
    ov_infer_request_t* detokenizer_request = NULL;
    char* text = NULL;
    int64_t accumulator[] = {29922, 29906, 13, 29899, 29871, 29906, 29974, 29906, 29922, 29946, 13, 29899, 29871, 29941, 29974, 29941, 29922, 29953, 13};

    const char* input_model = argv[1];
    const char* input_model_bin = argv[2];

    const char* tokenizers_path="/home/rahul/tools/tokenizers/libopenvino_tokenizers.so";

    CHECK_STATUS(ov_core_create(&core));
    CHECK_STATUS(ov_core_add_extension(core, tokenizers_path));
    CHECK_STATUS(ov_core_read_model(core, input_model, input_model_bin, &model));
    CHECK_STATUS(ov_core_compile_model(core, model, "CPU", 0, &compiled_model));
    CHECK_STATUS(ov_compiled_model_create_infer_request(compiled_model, &detokenizer_request));

    text = detokenize(detokenizer_request, accumulator, sizeof(accumulator) / sizeof(accumulator[0]));
    printf("text is %s\n", text);
    free(text);
    return 0;
}

Relevant log output

No response

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@mlukasze
Contributor

mlukasze commented Oct 4, 2024

@peterchen-intel could you take a look, please?

@rahulchaphalkar
Author

OK, I think the get-string-tensor case shown above works; I can access the string data in C as follows:

void* output_data = NULL;
status = ov_tensor_data(output_tensor, &output_data);
if (status != OK) {
    fprintf(stderr, "Failed to get data from output tensor\n");
    return NULL;
}

const char* string_data = *(const char**)output_data;

size_t output_string_length = strlen(string_data);
char* detokenized_string = (char*)malloc(output_string_length + 1);
if (!detokenized_string) {
    fprintf(stderr, "Failed to allocate memory for detokenized string\n");
    return NULL;
}
strncpy(detokenized_string, string_data, output_string_length);
detokenized_string[output_string_length] = '\0';

If there's a better way to access the string data, do let me know.

I'm still having problems setting a string tensor from the C API. I will try to update with another comment documenting test cases if possible.

@riverlijunjie
Contributor

@rahulchaphalkar I don't think the current C API supports string tensors, so we need new APIs in the C API to create, access, and free string tensors.

@rkazants
Contributor

rkazants commented Oct 8, 2024

Hi @rahulchaphalkar,

It is now possible to create an ov::Tensor of element::string type in C from scratch (without using a host_ptr to create the tensor on top of pre-allocated data). However, it is not possible to assign tensor elements for element::string. That is because we expect std::string objects behind the data pointer, and those cannot be created in C. Do not mix std::string with C strings.

We could consider extending the ov_tensor_create_from_host_ptr function so that you pass an array of C strings, and we create std::string objects under the hood in the bindings and assign them to the ov::Tensor element by element, so that the tensor owns these std::string objects. However, this contradicts the concept of ov_tensor_create_from_host_ptr a bit and leads to extra copies.
@riverlijunjie, what do you think?

Best regards,
Roman

@rkazants self-assigned this on Oct 8, 2024
@riverlijunjie
Contributor

> Hi @rahulchaphalkar,
>
> Now it is possible to create ov::Tensor of element::string type in C from scratch (without using host_ptr to create tensor on top of pre-allocated data). However, it is not possible to assign tensor elements for element::string. That is because we expect std::string objects by data pointer that is not possible to create in C. Do not mix std::string with C strings.
>
> We may consider to extend ov_tensor_create_from_host_ptr function where you pass array of C strings to and we create std::string objects under the hood in bindings and assign to ov::Tensor element by element so that the tensor owns these std::string objects. However, it contradicts concepts of ov_tensor_create_from_host_ptr a bit and leads extra copies. @riverlijunjie, how do you think?
>
> Best regards, Roman

From the C API perspective, a string tensor is very different from the numerical tensors, so why not add dedicated C APIs to support string tensors, such as ov_tensor_create_from_string_array?

@rkazants
Contributor

rkazants commented Oct 8, 2024

> From C API perspective, string tensor is very different with other numerical tensor, why not add dedicated C APIs to support string tensor, such as ov_tensor_create_from_string_array?

Yep, we can do this.

@rahulchaphalkar, what is the priority of this task of supporting string tensors in C? Who is the customer, and what is the user scenario? We can continue this discussion by work email.

Best regards,
Roman

@rkazants removed the "bug: Something isn't working" label on Oct 8, 2024
@rahulchaphalkar
Author

Thanks for the details, both of you.
@rkazants This is primarily being used to generate Rust bindings, which are then used in places like wasmtime-wasi-nn to showcase some LLM examples.
I started using the string tensor type because it is exposed in the /bindings/c/ dir, like here.
@riverlijunjie I can create (draft?) PRs for these string C APIs if you'd like. I can open a "discussion" so that design choices, such as where memory is allocated, are finalized; or this issue is fine for that as well.

@rkazants
Contributor

rkazants commented Oct 9, 2024

@rahulchaphalkar, this issue is fine for the discussion and for starting development of this feature. Please create a PR to the openvino repository.

Best regards,
Roman

@riverlijunjie
Contributor

> Thanks for the details, both of you. @rkazants This is primarily being used to generate rust bindings, and then in places like wasmtime-wasi-nn to showcase some LLM examples. I started using String Tensor type because it is exposed in /bindings/c/ dir like here. @riverlijunjie I can create (draft?) PRs for these string C-Apis if you'd like. I can create a "discussion" so that design choices like where we're allocating memory are finalized. Or this issue is fine as well for that.

@rahulchaphalkar It would be great if you could create a PR for it!
