
[Bug]: Set/Get String Tensor Data via C-API Does Not Work #26906

Open
3 tasks done
rahulchaphalkar opened this issue Oct 3, 2024 · 9 comments

@rahulchaphalkar

OpenVINO Version

2024.3.0 https://github.com/rahulchaphalkar/openvino/tree/add-extension

Operating System

Ubuntu 20.04 (LTS)

Device used for inference

CPU

Framework

None

Model used

Detokenizer.xml from TinyLlama-1.1B-Chat-v1.0

Issue description

The STRING element_type has been added to the C API, but in my testing with models that consume and produce string tensors, I see incorrect results. The test cases below compare a working C++ version against a failing C version. I have done some processing on the received string data, as you can see in the test case below, but I am not able to get a valid string output.
Reference - https://docs.openvino.ai/2024/openvino-workflow/running-inference/string-tensors.html

Step-by-step reproduction

Reproduction of getting string data from a model's output:
I was working with TinyLlama-1.1B-Chat-v1.0, which I obtained via the recommended steps in the optimum-cli/GenAI repos. I load an extension in both cases; support for loading extensions via the C API was added in my open PR, so you will need that branch for the C case below.
I provide the detokenizer model with tokens previously extracted from the TinyLlama model.

The C++ case prints this correct output:

./main /home/rahul/tools/TinyLlama-1.1B-Chat-v1.0
=2
- 2+2=4
- 3+3=6

The C case prints invalid UTF-8.

C++/Working Case

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

#include <openvino/openvino.hpp>

std::string detokenize(ov::InferRequest& detokenizer, std::vector<int64_t>& tokens) {
    constexpr size_t BATCH_SIZE = 1;
    detokenizer.set_input_tensor(ov::Tensor{ov::element::i64, {BATCH_SIZE, tokens.size()}, tokens.data()});
    detokenizer.infer();
    return detokenizer.get_output_tensor().data<std::string>()[0];
}

int main(int argc, char* argv[]) {

    std::vector<int64_t> accumulator = {29922, 29906, 13, 29899, 29871, 29906, 29974, 29906, 29922, 29946, 13, 29899, 29871, 29941, 29974, 29941, 29922, 29953, 13};
    ov::Core core;
    core.add_extension("/home/rahul/tools/tokenizers/libopenvino_tokenizers.so");

    ov::InferRequest detokenizer = core.compile_model(
        std::string{argv[1]} + "/openvino_detokenizer.xml", "CPU").create_infer_request();

    std::string text = detokenize(detokenizer, accumulator);
    std::cout << text << std::endl;
}

C/C-API Failing Case

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "openvino/c/openvino.h"

#define CHECK_STATUS(return_status)                                                          \
    if (return_status != OK) {                                                               \
        fprintf(stderr, "[ERROR] return status %d, line %d\n", return_status, __LINE__);     \
    }

char* detokenize(ov_infer_request_t* detokenizer, int64_t* tokens, int num_tokens) {
    const size_t BATCH_SIZE = 1;
    ov_status_e status;
    ov_tensor_t* input_tensor = NULL;
    ov_shape_t input_shape;
    int64_t input_shape_dims[2] = {BATCH_SIZE, num_tokens};

    status = ov_shape_create(2, input_shape_dims, &input_shape);
    if (status != OK) {
        fprintf(stderr, "Failed to create shape\n");
        return NULL;
    }

    status = ov_tensor_create_from_host_ptr(I64, input_shape, tokens, &input_tensor);
    if (status != OK) {
        fprintf(stderr, "Failed to create input tensor\n");
        return NULL;
    }

    status = ov_infer_request_set_input_tensor(detokenizer, input_tensor);
    if (status != OK) {
        fprintf(stderr, "Failed to set input tensor\n");
        return NULL;
    }

    status = ov_infer_request_infer(detokenizer);
    if (status != OK) {
        fprintf(stderr, "Failed to run inference\n");
        return NULL;
    }

    ov_tensor_t* output_tensor = NULL;
    status = ov_infer_request_get_output_tensor_by_index(detokenizer, 0, &output_tensor);
    if (status != OK) {
        fprintf(stderr, "Failed to get output tensor\n");
        return NULL;
    }

    void* output_data = NULL;
    status = ov_tensor_data(output_tensor, &output_data);
    if (status != OK) {
        fprintf(stderr, "Failed to get data from output tensor\n");
        return NULL;
    }

    size_t output_string_length = strlen((const char*)output_data);
    char* detokenized_string = (char*)malloc(output_string_length + 1);
    if (!detokenized_string) {
        fprintf(stderr, "Failed to allocate memory for detokenized string\n");
        return NULL;
    }
    strncpy(detokenized_string, (const char*)output_data, output_string_length);
    detokenized_string[output_string_length] = '\0';

    ov_tensor_free(input_tensor);
    ov_tensor_free(output_tensor);

    return detokenized_string;
}

int main(int argc, char** argv) {
    
    ov_core_t* core = NULL;
    ov_model_t* model = NULL;
    ov_compiled_model_t* compiled_model = NULL;
    ov_infer_request_t* detokenizer_request = NULL;
    char* text = NULL;
    int64_t accumulator[] = {29922, 29906, 13, 29899, 29871, 29906, 29974, 29906, 29922, 29946, 13, 29899, 29871, 29941, 29974, 29941, 29922, 29953, 13};

    const char* input_model = argv[1];
    const char* input_model_bin = argv[2];

    const char* tokenizers_path="/home/rahul/tools/tokenizers/libopenvino_tokenizers.so";

    CHECK_STATUS(ov_core_create(&core));
    CHECK_STATUS(ov_core_add_extension(core, tokenizers_path));
    CHECK_STATUS(ov_core_read_model(core, input_model, input_model_bin, &model));
    CHECK_STATUS(ov_core_compile_model(core, model, "CPU", 0, &compiled_model));
    CHECK_STATUS(ov_compiled_model_create_infer_request(compiled_model, &detokenizer_request));

    text = detokenize(detokenizer_request, accumulator, sizeof(accumulator) / sizeof(accumulator[0]));
    printf("text is %s\n", text);
    free(text);
    return 0;
}

Relevant log output

No response

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@mlukasze
Contributor

mlukasze commented Oct 4, 2024

@peterchen-intel could you take a look, please?

@rahulchaphalkar
Author

OK, I think the get-string-tensor case shown above works; I can access the string data in C as follows:

void* output_data = NULL;
status = ov_tensor_data(output_tensor, &output_data);
if (status != OK) {
    fprintf(stderr, "Failed to get data from output tensor\n");
    return NULL;
}

const char* string_data = *(const char**)output_data;

size_t output_string_length = strlen(string_data);
char* detokenized_string = (char*)malloc(output_string_length + 1);
if (!detokenized_string) {
    fprintf(stderr, "Failed to allocate memory for detokenized string\n");
    return NULL;
}
strncpy(detokenized_string, string_data, output_string_length);
detokenized_string[output_string_length] = '\0';

If there's a better way to access the string data, do let me know.

I'm still having problems setting a string tensor from the C API. I will try to update with another comment documenting test cases if possible.

@riverlijunjie
Contributor

@rahulchaphalkar I don't think the current C API supports string tensors, so we need new APIs in the C API to create, access, and free string tensors.

@rkazants
Contributor

rkazants commented Oct 8, 2024

Hi @rahulchaphalkar,

It is now possible to create an ov::Tensor of element::string type in C from scratch (without using a host_ptr to create the tensor on top of pre-allocated data). However, it is not possible to assign tensor elements for element::string. That is because we expect std::string objects behind the data pointer, and those cannot be created in C. Do not mix std::string with C strings.

We could consider extending the ov_tensor_create_from_host_ptr function so that you pass an array of C strings, and we create std::string objects under the hood in the bindings and assign them to the ov::Tensor element by element, so that the tensor owns these std::string objects. However, this contradicts the concept of ov_tensor_create_from_host_ptr a bit and leads to extra copies.
@riverlijunjie, what do you think?

Best regards,
Roman

@rkazants self-assigned this on Oct 8, 2024
@riverlijunjie
Contributor

> Hi @rahulchaphalkar,
>
> Now it is possible to create ov::Tensor of element::string type in C from scratch (without using host_ptr to create tensor on top of pre-allocated data). However, it is not possible to assign tensor elements for element::string. That is because we expect std::string objects by data pointer that is not possible to create in C. Do not mix std::string with C strings.
>
> We may consider to extend ov_tensor_create_from_host_ptr function where you pass array of C strings to and we create std::string objects under the hood in bindings and assign to ov::Tensor element by element so that the tensor owns these std::string objects. However, it contradicts concepts of ov_tensor_create_from_host_ptr a bit and leads extra copies. @riverlijunjie, how do you think?
>
> Best regards, Roman

From the C API perspective, a string tensor is very different from the numerical tensors, so why not add dedicated C APIs to support string tensors, such as ov_tensor_create_from_string_array?

@rkazants
Contributor

rkazants commented Oct 8, 2024

> From C API perspective, string tensor is very different with other numerical tensor, why not add dedicated C APIs to support string tensor, such as ov_tensor_create_from_string_array?

Yep, we can do this.

@rahulchaphalkar, what is the priority of this task of supporting string tensors in C? Who is the customer, and what is the user scenario? We can continue this discussion by work email.

Best regards,
Roman

@rkazants removed the "bug: Something isn't working" label on Oct 8, 2024
@rahulchaphalkar
Author

Thanks for the details, both of you.
@rkazants This is primarily being used to generate Rust bindings, which are then used in places like wasmtime-wasi-nn to showcase some LLM examples.
I started using the string tensor type because it is exposed in the /bindings/c/ dir, like here.
@riverlijunjie I can create (draft?) PRs for these string C APIs if you'd like. I can open a "discussion" so that design choices, such as where memory is allocated, are finalized; or this issue is fine for that as well.

@rkazants
Contributor

rkazants commented Oct 9, 2024

@rahulchaphalkar, this issue is fine for the discussion and for starting development of this feature. Please create a PR to the openvino repository.

Best regards,
Roman

@riverlijunjie
Contributor

> Thanks for the details, both of you. @rkazants This is primarily being used to generate rust bindings, and then in places like wasmtime-wasi-nn to showcase some LLM examples. I started using String Tensor type because it is exposed in /bindings/c/ dir like here. @riverlijunjie I can create (draft?) PRs for these string C-Apis if you'd like. I can create a "discussion" so that design choices like where we're allocating memory are finalized. Or this issue is fine as well for that.

@rahulchaphalkar It would be great if you could create a PR for it!
