[Enhancement]: Bulkinsert supports null in csv formats #35911

Closed
1 task done
OxalisCu opened this issue Sep 2, 2024 · 1 comment
Labels
kind/enhancement Issues or changes related to enhancement

Comments

@OxalisCu
Contributor

OxalisCu commented Sep 2, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

Milvus recently added support for null values, so the bulkinsert function needs to be able to parse null values in CSV files.

CSV is a text format with no dedicated representation for null, so the proposed approach is to use a configurable special string to represent it. When the raw text of a field matches this string, the field is parsed as null.
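
To make the matching rule concrete, here is a minimal sketch in Go for an int64 field. The helper name `parseNullableInt64` and its signature are hypothetical and are not taken from the Milvus code base; the sketch only illustrates the idea of comparing the raw cell against the configured string before attempting any type conversion.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseNullableInt64 is a hypothetical helper, not the Milvus implementation.
// It returns (nil, nil) when the raw cell equals nullkey, a pointer to the
// parsed value when the cell is a valid integer, and an error otherwise.
func parseNullableInt64(raw, nullkey string) (*int64, error) {
	// The null check comes first, so the special string never reaches the
	// integer parser.
	if raw == nullkey {
		return nil, nil
	}
	v, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return nil, fmt.Errorf("cannot parse %q as int64: %w", raw, err)
	}
	return &v, nil
}

func main() {
	for _, cell := range []string{"NULL", "42", "abc"} {
		v, err := parseNullableInt64(cell, "NULL")
		switch {
		case err != nil:
			fmt.Printf("%q -> error: %v\n", cell, err)
		case v == nil:
			fmt.Printf("%q -> null\n", cell)
		default:
			fmt.Printf("%q -> %d\n", cell, *v)
		}
	}
}
```

Checking for the null string before parsing means that a cell which is not a valid integer, such as an empty cell, can still be read as null when it equals the configured string.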

Data types that support null values

Currently, the data types that support null are:

bool, int8, int16, int32, int64, float, double, varchar, string, json, array

The data types that do not support null are:

binaryvector, floatvector, float16vector, bfloat16vector, sparsefloatvector

In addition, a field of array type can be null as a whole, but the individual elements of an array cannot be null, as shown in the following example:

// supported
arr := NULL
// not supported
arr := [1, 2, NULL, 4]
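
The array rule can be sketched as follows. This is an illustration only: it assumes that an array cell is written as a JSON array, and the function name `parseInt64Array` is hypothetical. A cell equal to the null string yields a null array, while a null element inside the array is rejected.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseInt64Array is a hypothetical helper, not the Milvus implementation.
// It assumes array cells are JSON-encoded. The bool result reports whether
// the whole field is null.
func parseInt64Array(raw, nullkey string) ([]int64, bool, error) {
	// A cell equal to nullkey means the whole array field is null.
	if raw == nullkey {
		return nil, true, nil
	}
	// Decode into pointers so that a JSON null element shows up as nil.
	var elems []*int64
	if err := json.Unmarshal([]byte(raw), &elems); err != nil {
		return nil, false, fmt.Errorf("invalid array %q: %w", raw, err)
	}
	vals := make([]int64, 0, len(elems))
	for i, e := range elems {
		if e == nil {
			return nil, false, fmt.Errorf("element %d is null: array elements cannot be null", i)
		}
		vals = append(vals, *e)
	}
	return vals, false, nil
}

func main() {
	for _, cell := range []string{"NULL", "[1, 2, 4]", "[1, 2, null, 4]", "[1, 2, NULL, 4]"} {
		vals, isNull, err := parseInt64Array(cell, "NULL")
		fmt.Printf("%q -> values=%v null=%v err=%v\n", cell, vals, isNull, err)
	}
}
```

In this sketch, a bare NULL inside the brackets fails as invalid JSON, and a JSON null element fails the explicit element check, so both forms of a null element are rejected.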

nullkey configuration

In CSV format, the string that identifies a null value is configurable. The option name is nullkey, and it currently accepts any string. If it is not configured, the default is nullkey = "".

curl --request POST "http://localhost:19530/v2/vectordb/jobs/import/create" \
--header "Content-Type: application/json" \
--data-raw '{
    "files": [
        [
            "filepath"
        ]
    ],
    "collectionName": "collection_name",
    "options": {"nullkey": "NULL"}
}'
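
The effect of the option on a file can be sketched with Go's standard `encoding/csv` package. The helper `readRows` and the sample data are hypothetical and only illustrate the described behavior, including the default of an empty string; they are not the Milvus reader.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readRows is a hypothetical helper. It parses CSV text, skips the header
// row, and maps every cell equal to nullkey to nil.
func readRows(data, nullkey string) ([][]*string, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	rows := make([][]*string, 0, len(records)-1)
	for _, rec := range records[1:] {
		row := make([]*string, len(rec))
		for i := range rec {
			if rec[i] != nullkey {
				row[i] = &rec[i] // non-null cell keeps its raw text
			}
		}
		rows = append(rows, row)
	}
	return rows, nil
}

// show renders a cell for printing.
func show(cell *string) string {
	if cell == nil {
		return "null"
	}
	return fmt.Sprintf("%q", *cell)
}

func main() {
	data := "id,name\n1,NULL\n2,\n" // made-up sample file
	for _, nullkey := range []string{"NULL", ""} {
		rows, err := readRows(data, nullkey)
		if err != nil {
			panic(err)
		}
		for _, row := range rows {
			fmt.Printf("nullkey=%q id=%s name=%s\n", nullkey, show(row[0]), show(row[1]))
		}
	}
}
```

With nullkey set to "NULL", the cell NULL is read as null and the empty cell stays an empty string. With the default empty nullkey the roles are reversed: the empty cell becomes null and NULL is kept as ordinary text.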

Why is this needed?

No response

Anything else?

No response

@OxalisCu OxalisCu added the kind/enhancement Issues or changes related to enhancement label Sep 2, 2024
@xiaofan-luan
Collaborator

/assign @smellthemoon

could you help on this?

sre-ci-robot pushed a commit that referenced this issue Sep 9, 2024
see details in this issue
#35911

---------

Signed-off-by: OxalisCu <[email protected]>
@OxalisCu OxalisCu closed this as completed Sep 9, 2024
chyezh pushed a commit to chyezh/milvus that referenced this issue Sep 11, 2024