diff --git a/docs/articles_en/openvino_workflow/running_inference_with_openvino.rst b/docs/articles_en/openvino_workflow/running_inference_with_openvino.rst index 17595ffdae3692..227097201b4434 100644 --- a/docs/articles_en/openvino_workflow/running_inference_with_openvino.rst +++ b/docs/articles_en/openvino_workflow/running_inference_with_openvino.rst @@ -14,6 +14,7 @@ Running Inference with OpenVINO™ openvino_docs_OV_UG_ShapeInference openvino_docs_OV_UG_DynamicShapes openvino_docs_OV_UG_model_state_intro + openvino_docs_OV_UG_string_tensors Optimize Inference .. meta:: diff --git a/docs/articles_en/openvino_workflow/running_inference_with_openvino/integrate_with_your_application.rst b/docs/articles_en/openvino_workflow/running_inference_with_openvino/integrate_with_your_application.rst index 0b167d932d767a..3d60fa22e0c512 100644 --- a/docs/articles_en/openvino_workflow/running_inference_with_openvino/integrate_with_your_application.rst +++ b/docs/articles_en/openvino_workflow/running_inference_with_openvino/integrate_with_your_application.rst @@ -15,12 +15,12 @@ Integrate OpenVINO™ with Your Application .. meta:: - :description: Learn how to implement a typical inference pipeline of OpenVINO™ + :description: Learn how to implement a typical inference pipeline of OpenVINO™ Runtime in an application. -Following these steps, you can implement a typical OpenVINO™ Runtime inference -pipeline in your application. Before proceeding, make sure you have +Following these steps, you can implement a typical OpenVINO™ Runtime inference +pipeline in your application. Before proceeding, make sure you have :doc:`installed OpenVINO Runtime ` and set environment variables (run ``/setupvars.sh`` for Linux or ``setupvars.bat`` for Windows, otherwise, the ``OpenVINO_DIR`` variable won't be configured properly to pass ``find_package`` calls). @@ -243,8 +243,8 @@ To learn how to change the device configuration, read the :doc:`Query device pro Step 3. 
Create an Inference Request ################################### -``ov::InferRequest`` class provides methods for model inference in OpenVINO™ Runtime. -Create an infer request using the following code (see +``ov::InferRequest`` class provides methods for model inference in OpenVINO™ Runtime. +Create an infer request using the following code (see :doc:`InferRequest detailed documentation ` for more details): .. tab-set:: @@ -299,6 +299,7 @@ You can use external memory to create ``ov::Tensor`` and use the ``ov::InferRequ :language: cpp :fragment: [part4] +See :doc:`additional materials ` to learn how to handle textual data as a model input. Step 5. Start Inference ####################### @@ -329,7 +330,7 @@ OpenVINO™ Runtime supports inference in either synchronous or asynchronous mod :fragment: [part5] -This section demonstrates a simple pipeline. To get more information about other ways to perform inference, read the dedicated +This section demonstrates a simple pipeline. To get more information about other ways to perform inference, read the dedicated :doc:`"Run inference" section `. Step 6. Process the Inference Results @@ -360,6 +361,7 @@ Go over the output tensors and process the inference results. :language: cpp :fragment: [part6] +See :doc:`additional materials ` to learn how to handle textual data as a model output. Step 7. Release the allocated objects (only for C) ################################################## @@ -440,5 +442,6 @@ Additional Resources * See the :doc:`OpenVINO Samples ` page or the `Open Model Zoo Demos `__ page for specific examples of how OpenVINO pipelines are implemented for applications like image classification, text prediction, and many others. 
 * :doc:`OpenVINO™ Runtime Preprocessing `
+* :doc:`String Tensors `
 * :doc:`Using Encrypted Models with OpenVINO `
diff --git a/docs/articles_en/openvino_workflow/running_inference_with_openvino/string_tensors.rst b/docs/articles_en/openvino_workflow/running_inference_with_openvino/string_tensors.rst
new file mode 100644
index 00000000000000..a5a1d3dd9987aa
--- /dev/null
+++ b/docs/articles_en/openvino_workflow/running_inference_with_openvino/string_tensors.rst
@@ -0,0 +1,208 @@
+.. {#openvino_docs_OV_UG_string_tensors}
+
+String Tensors
+==============
+
+
+.. meta::
+   :description: Learn how to pass text to and retrieve text from an OpenVINO model.
+
+OpenVINO tensors can hold not only numerical data, like floating-point or integer numbers,
+but also textual information, represented as one or multiple strings.
+Such a tensor is called a string tensor and can be passed as input or retrieved as output of a text-processing model, such as
+`tokenizers and detokenizers `__.
+
+While this section describes the basic API for handling string tensors, more practical examples that leverage both
+string tensors and the OpenVINO tokenizer can be found in
+`GenAI Samples `__.
+
+
+Representation
+##############
+
+String tensors are supported in the C++ and Python APIs, represented as instances of the `ov::Tensor`
+class with the `element_type` parameter equal to `ov::element::string`. Each element of a string tensor is a string
+of arbitrary length, including an empty string, and can be set independently of other elements in the same tensor.
+
+Depending on the API used (C++ or Python), the underlying data type that represents the string when accessing the tensor elements is
+different:
+
+- in C++, `std::string` is used
+- in Python, NumPy arrays populated with `numpy.str_`/`numpy.bytes_` are used, as a read-only copy of the underlying C++ content
+
+The string tensor implementation imposes no restrictions on string encoding, because the underlying `std::string` has none.
+It can represent all valid UTF-8 characters as well as any byte sequence outside of the UTF-8 encoding standard.
+Take extra care with arbitrary byte sequences when accessing tensor content as encoded UTF-8 symbols.
+
+Because a string is a more complex object than, for example, a `float` or an `int`,
+the underlying memory of a string tensor cannot be handled without properly constructing and destroying string objects.
+Also, in contrast to numerical data, C++ and Python do not share the same memory layout, so there is no immediate
+sharing of tensor content between the two APIs. Python provides only a NumPy-compatible copy of the data
+that is allocated and held in the C++ core as an array of `std::string` objects.
+
+A developer must consider these restrictions when writing code using string tensors and
+avoid treating the content as raw bytes or as a view of data in Python.
+
+Create a String Tensor
+######################
+
+The following is an example of how to create a small 1D tensor pre-populated with three elements:
+
+.. tab-set::
+
+   .. tab-item:: Python
+      :sync: py
+
+      .. code-block:: py
+         :force:
+
+         import openvino as ov
+
+         tensor = ov.Tensor(['text', 'more text', 'even more text'])
+
+   .. tab-item:: C++
+      :sync: cpp
+
+      .. code-block:: cpp
+
+         #include <string>
+         #include <vector>
+         #include <openvino/openvino.hpp>
+
+         std::vector<std::string> strings = {"text", "more text", "even more text"};
+         ov::Tensor tensor(ov::element::string, ov::Shape{strings.size()}, &strings[0]);
+
+The example demonstrates that, similarly to tensors with numerical information,
+a tensor object can be created on top of existing memory in C++ by providing a pointer to a pre-allocated array of elements.
+Here, an instance of `std::vector<std::string>` holds the memory and consists of three `std::string` objects.
+So, the `tensor` object in the C++ example shares the same memory as the `strings` vector.
+
+Note that `ov::Tensor`, when initialized with a pointer, requires pre-initialized memory with valid `std::string` objects
+created by calling one of the available `std::string` constructors, even for an empty string. Passing uninitialized
+memory to this `ov::Tensor` constructor is undefined behavior.
+
+In the Python version of the example above, a regular list of strings is used as an initializer.
+No memory sharing is available this time, in contrast to C++,
+and the strings from the initialization list are copied to separately allocated storage underneath the `tensor` object.
+
+Besides a plain Python list of strings, an initializer can be one of the supported `numpy` arrays initialized
+with Unicode or byte strings:
+
+.. tab-set::
+
+   .. tab-item:: Python
+      :sync: py
+
+      .. code-block:: python
+         :force:
+
+         import numpy as np
+
+         tensor = ov.Tensor(np.array(['text', 'more text', 'even more text']))
+         tensor = ov.Tensor(np.array([b'text', b'more text', b'even more text']))
+
+If `ov::Tensor` is created without providing initialization strings,
+a tensor of the specified shape with empty strings as elements is created:
+
+.. tab-set::
+
+   .. tab-item:: Python
+      :sync: py
+
+      .. code-block:: python
+         :force:
+
+         tensor = ov.Tensor(dtype=str, shape=[3])
+
+   .. tab-item:: C++
+      :sync: cpp
+
+      .. code-block:: cpp
+
+         ov::Tensor tensor(ov::element::string, ov::Shape{3});
+
+`ov::Tensor` allocates and initializes the required number of `std::string` objects under the hood.
+
+
+Accessing Elements
+##################
+
+The following code prints all elements in the 1D string tensor constructed above.
+In C++, the same `data` template method that is used for other data types applies,
+and to access string data it should be called with the `std::string` type.
+In Python, the dedicated `str_data` and `bytes_data` fields are used instead of the `data` field used for numerical data.
+
+.. tab-set::
+
+   .. tab-item:: Python
+      :sync: py
+
+      .. code-block:: python
+         :force:
+
+         data = tensor.str_data  # use tensor.bytes_data instead to access encoded strings as `bytes`
+         for i in range(tensor.get_size()):
+             print(data[i])
+
+   .. tab-item:: C++
+      :sync: cpp
+
+      .. code-block:: cpp
+
+         #include <iostream>
+
+         std::string* data = tensor.data<std::string>();
+         for(size_t i = 0; i < tensor.get_size(); ++i)
+             std::cout << data[i] << '\n';
+
+In Python, an object retrieved with `tensor.str_data` (or `tensor.bytes_data`) is a NumPy array
+with `numpy.str_` elements (or `numpy.bytes_`, correspondingly). It is a copy of the underlying data from
+the `tensor` object and cannot be used to modify the tensor content.
+To set new values, the entire tensor content should be set as a list or as a `numpy` array, as demonstrated
+below.
+
+In contrast to Python, `tensor.data<std::string>()` in C++ returns a pointer to the underlying data
+storage, which can be used to modify tensor elements:
+
+.. tab-set::
+
+   .. tab-item:: Python
+      :sync: py
+
+      .. code-block:: python
+
+         # Unicode strings:
+         tensor.str_data = ['one', 'two', 'three']
+         # Do NOT use tensor.str_data[i] to set a new value, it won't update the tensor content
+
+         # Encoded strings:
+         tensor.bytes_data = [b'one', b'two', b'three']
+         # Do NOT use tensor.bytes_data[i] to set a new value, it won't update the tensor content
+
+   .. tab-item:: C++
+      :sync: cpp
+
+      .. code-block:: cpp
+
+         std::string new_content[] = {"one", "two", "three"};
+         std::string* data = tensor.data<std::string>();
+         for(size_t i = 0; i < tensor.get_size(); ++i)
+             data[i] = new_content[i];
+
+When reading or setting string tensor elements in Python, it is recommended to use `str` objects (or `numpy.str_` in a NumPy array)
+when it is known that the underlying byte sequence forms a valid UTF-8 encoded string.
+Otherwise, if arbitrary byte sequences are allowed,
+not necessarily within the UTF-8 standard, use `bytes` strings (or `numpy.bytes_`, correspondingly) instead.
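The choice between `str` and `bytes` described above can be illustrated with plain Python, without any OpenVINO calls. The sketch below is only an analogy under stated assumptions: the helper name `as_text_or_bytes` and the sample byte strings are hypothetical and not part of the OpenVINO API. It shows that a byte sequence outside the UTF-8 standard cannot be strictly decoded to text, and that decoding it with replacement characters does not preserve the original bytes.

```python
# Standard-library sketch only; nothing here calls OpenVINO.
# `as_text_or_bytes` and the sample values are hypothetical, for illustration.

valid = "h\u00e9llo".encode("utf-8")  # well-formed UTF-8 bytes
invalid = b"caf\xe9"                  # lone 0xE9 byte: not valid UTF-8


def as_text_or_bytes(raw: bytes):
    """Return str if raw is valid UTF-8, otherwise return the bytes unchanged."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw


text = as_text_or_bytes(valid)    # decodes cleanly, so a str is returned
kept = as_text_or_bytes(invalid)  # strict decoding fails, so bytes are kept

# Decoding with replacement never fails, but it is lossy:
lossy = invalid.decode("utf-8", errors="replace")

print(ascii(text))   # 'h\xe9llo'
print(ascii(kept))   # b'caf\xe9'
print(ascii(lossy))  # 'caf\ufffd'
print(lossy.encode("utf-8") == invalid)  # False: the original byte is gone
```

If the original bytes must survive a round trip, keeping them as `bytes` is the safer choice; text decoded with replacement characters cannot be converted back to the original byte sequence.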
+
+Accessing tensor content through `str_data` implicitly applies UTF-8 decoding.
+If parts of the byte stream cannot be represented as valid Unicode symbols,
+the � replacement symbol (U+FFFD) is used to mark the invalid parts of the stream.
+
+Additional Resources
+####################
+
+* Learn about the :doc:`basic steps to integrate inference in your application `.
+
+* Use `OpenVINO tokenizers `__ to produce models that use string tensors to work with textual information as pre- and post-processing for large language models.
+
+* Check out `GenAI Samples `__ to see how string tensors are used in real-life applications.