Merge branch 'main' into DS-315/milvus-improve-error-handling
Filip Knefel committed Dec 30, 2024
2 parents 37424e2 + 5914bd9 commit 208a6e0
Showing 366 changed files with 123,188 additions and 4,427 deletions.
8 changes: 8 additions & 0 deletions .github/workflows/e2e.yml
@@ -65,6 +65,8 @@ jobs:
TOGETHERAI_API_KEY: ${{ secrets.TOGETHERAI_API_KEY }}
VOYAGEAI_API_KEY: ${{ secrets.VOYAGEAI_API_KEY }}
VERTEXAI_API_KEY: ${{ secrets.VERTEXAI_API_KEY }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_ENDPOINT: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
run: |
make install-base
make install-all-embedders
@@ -107,6 +109,7 @@ jobs:
KAFKA_API_KEY: ${{ secrets.KAFKA_API_KEY }}
KAFKA_SECRET: ${{ secrets.KAFKA_SECRET }}
KAFKA_BOOTSTRAP_SERVER: ${{ secrets.KAFKA_BOOTSTRAP_SERVER }}
DATABRICKS_PAT: ${{ secrets.DATABRICKS_PAT }}
run : |
source .venv/bin/activate
make install-test
@@ -153,10 +156,14 @@ jobs:
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_ENDPOINT }}
AZURE_SEARCH_ENDPOINT: ${{ secrets.AZURE_SEARCH_ENDPOINT }}
AZURE_SEARCH_API_KEY: ${{ secrets.AZURE_SEARCH_API_KEY }}
AZURE_REDIS_INGEST_TEST_PASSWORD: ${{ secrets.AZURE_REDIS_INGEST_TEST_PASSWORD }}
MONGODB_URI: ${{ secrets.MONGODB_URI }}
MONGODB_DATABASE: ${{ secrets.MONGODB_DATABASE_NAME }}
QDRANT_API_KEY: ${{ secrets.QDRANT_API_KEY }}
QDRANT_SERVER_URL: ${{ secrets.QDRANT_SERVER_URL }}
KAFKA_API_KEY: ${{ secrets.KAFKA_API_KEY }}
KAFKA_SECRET: ${{ secrets.KAFKA_SECRET }}
KAFKA_BOOTSTRAP_SERVER: ${{ secrets.KAFKA_BOOTSTRAP_SERVER }}
run : |
source .venv/bin/activate
make install-test
@@ -288,6 +295,7 @@ jobs:
S3_INGEST_TEST_SECRET_KEY: ${{ secrets.S3_INGEST_TEST_SECRET_KEY }}
AZURE_SEARCH_ENDPOINT: ${{ secrets.AZURE_SEARCH_ENDPOINT }}
AZURE_SEARCH_API_KEY: ${{ secrets.AZURE_SEARCH_API_KEY }}
AZURE_REDIS_INGEST_TEST_PASSWORD: ${{ secrets.AZURE_REDIS_INGEST_TEST_PASSWORD }}
BOX_APP_CONFIG: ${{ secrets.BOX_APP_CONFIG }}
DROPBOX_APP_KEY: ${{ secrets.DROPBOX_APP_KEY }}
DROPBOX_APP_SECRET: ${{ secrets.DROPBOX_APP_SECRET }}
1 change: 1 addition & 0 deletions .github/workflows/unit_tests.yml
@@ -121,4 +121,5 @@ jobs:
run: |
make install-base
make install-test
pip install unstructured
make unit-test
78 changes: 76 additions & 2 deletions CHANGELOG.md
@@ -1,4 +1,72 @@
## 0.3.7-dev6
## 0.3.12-dev4

### Enhancements

* **Migrate Vectara Destination Connector to v2**
* **Improved Milvus error handling**

## 0.3.12-dev2

### Enhancements

* **Added Redis destination connector**

## 0.3.12-dev1

* **Bypass asyncio exception grouping to return more meaningful errors from OneDrive indexer**

## 0.3.12-dev0

### Fixes

* **Fix Kafka destination connection problems**

### Enhancements

* **Kafka destination connector checks for existence of topic**
* **Create more reflective custom errors** Provide errors to indicate if the error was due to something user provided or due to a provider issue, applicable to all steps in the pipeline.
* **Bypass asyncio exception grouping to return more meaningful errors from OneDrive indexer**
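
The "more reflective custom errors" entry above separates failures caused by user input from failures on the provider side. A minimal sketch of that idea, assuming an HTTP-style status code is available; the class names `IngestError`, `UserError`, `ProviderError` and the helper `classify_status` are hypothetical and not taken from the library:

```python
class IngestError(Exception):
    """Hypothetical base class for pipeline errors."""


class UserError(IngestError):
    """Hypothetical: the failure stems from something the user supplied."""


class ProviderError(IngestError):
    """Hypothetical: the failure stems from the remote service."""


def classify_status(status_code: int, message: str) -> IngestError:
    """Map an HTTP-style status code to one of the two error kinds.

    Assumption: 4xx codes point to a user-side problem and 5xx codes to a
    provider-side problem; a real connector may need finer-grained rules.
    """
    if 400 <= status_code < 500:
        return UserError(message)
    if status_code >= 500:
        return ProviderError(message)
    return IngestError(message)
```

A caller could then `raise classify_status(response_code, text)` and let each pipeline step catch `UserError` and `ProviderError` separately.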

## 0.3.11

### Enhancements

* **Support Databricks personal access token**

### Fixes

* **Fix missing source identifiers in some downloaders**

## 0.3.10

### Enhancements

* **Support more concrete FileData content for batch support**

### Fixes

* **Add Neo4J to ingest destination connector registry**
* **Fix closing SSHClient in sftp connector**

## 0.3.9

### Enhancements

* **Support ndjson files in stagers**
* **Add Neo4j destination connector**
* **Support passing data in for uploaders**
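
For context on the ndjson entry above: newline-delimited JSON stores one JSON object per line. The sketch below uses only Python's standard `json` module to illustrate the format. It is not the stager code, and the helper names `dumps_ndjson` and `loads_ndjson` are hypothetical:

```python
import json
from typing import Any


def dumps_ndjson(records: list[dict[str, Any]]) -> str:
    """Serialize each record as one JSON document per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


def loads_ndjson(text: str) -> list[dict[str, Any]]:
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

Because every line is a complete document, a consumer can process such a file one line at a time instead of loading a single large JSON array.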

### Fixes

* **Make sure any SDK clients that support closing get called**

## 0.3.8

### Fixes

* **Prevent pinecone delete from hammering database when deleting**

## 0.3.7

### Fixes

@@ -8,6 +76,10 @@
* **Fixes issue with SingleStore Source Connector not being available**
* **Fixes issue with SQLite Source Connector using wrong Indexer** - Caused indexer config parameter error when trying to use SQLite Source
* **Fixes issue with Snowflake Destination Connector `nan` values** - `nan` values were not properly replaced with `None`
* **Fixes Snowflake source `'SnowflakeCursor' object has no attribute 'mogrify'` error**
* **Box source connector can now use raw JSON as access token instead of file path to JSON**
* **Fix fsspec upload paths to be OS independent**
* **Properly log elasticsearch upload errors**
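
The Snowflake `nan` fix listed above concerns replacing `nan` values with `None` before upload. A rough illustration of that kind of replacement on a plain dictionary, using only the standard library; the function name `nan_to_none` is hypothetical and this is not the connector's actual implementation:

```python
import math
from typing import Any


def nan_to_none(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with float NaN values replaced by None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }
```

The `isinstance` check comes first because `math.isnan` raises `TypeError` for non-numeric values such as strings.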

### Enhancements

@@ -17,7 +89,9 @@
* **Makes multiple SQL connectors (Snowflake, SingleStore, SQLite) more robust against SQL injection.**
* **Optimizes memory usage of Snowflake Destination Connector.**
* **Added Qdrant Cloud integration test**
* **Improved Milvus error handling**
* **Add DuckDB destination connector** Adds support for storing artifacts in a local DuckDB database.
* **Add MotherDuck destination connector** Adds support for storing artifacts in a MotherDuck database.
* **Update weaviate v2 example**

## 0.3.6

1 change: 1 addition & 0 deletions requirements/common/base.in
@@ -2,6 +2,7 @@

python-dateutil
pandas
ndjson
# Pydantic generic Secret only introduced in 2.7
pydantic>=2.7
dataclasses_json
44 changes: 22 additions & 22 deletions requirements/common/base.txt
@@ -1,51 +1,51 @@
# This file was autogenerated by uv via the following command:
# uv pip compile ./common/base.in --output-file ./common/base.txt --no-strip-extras --python-version 3.9
# uv pip compile base.in --output-file base.txt --no-strip-extras --python-version 3.9
annotated-types==0.7.0
# via pydantic
click==8.1.7
# via -r ./common/base.in
# via -r base.in
dataclasses-json==0.6.7
# via -r ./common/base.in
deprecated==1.2.14
# via -r base.in
deprecated==1.2.15
# via opentelemetry-api
marshmallow==3.22.0
marshmallow==3.23.1
# via dataclasses-json
mypy-extensions==1.0.0
# via typing-inspect
ndjson==0.3.1
# via -r base.in
numpy==1.26.4
# via
# -c ./common/constraints.txt
# -c constraints.txt
# pandas
opentelemetry-api==1.16.0
# via opentelemetry-sdk
opentelemetry-sdk==1.16.0
# via -r ./common/base.in
# via -r base.in
opentelemetry-semantic-conventions==0.37b0
# via opentelemetry-sdk
packaging==23.2
# via
# -c ./common/constraints.txt
# marshmallow
packaging==24.2
# via marshmallow
pandas==2.2.3
# via -r ./common/base.in
pydantic==2.9.2
# via -r ./common/base.in
pydantic-core==2.23.4
# via -r base.in
pydantic==2.10.3
# via -r base.in
pydantic-core==2.27.1
# via pydantic
python-dateutil==2.9.0.post0
# via
# -r ./common/base.in
# -r base.in
# pandas
pytz==2024.2
# via pandas
setuptools==75.1.0
setuptools==75.6.0
# via
# opentelemetry-api
# opentelemetry-sdk
six==1.16.0
six==1.17.0
# via python-dateutil
tqdm==4.66.5
# via -r ./common/base.in
tqdm==4.67.1
# via -r base.in
typing-extensions==4.12.2
# via
# opentelemetry-sdk
@@ -56,7 +56,7 @@ typing-inspect==0.9.0
# via dataclasses-json
tzdata==2024.2
# via pandas
wrapt==1.16.0
wrapt==1.17.0
# via
# -c ./common/constraints.txt
# -c constraints.txt
# deprecated
1 change: 0 additions & 1 deletion requirements/common/constraints.txt
@@ -25,5 +25,4 @@ wrapt>=1.14.0
# NOTE(robinson): chroma was pinned to importlib-metadata>=7.1.0 but 7.1.0 was installed
# instead of 7.2.0. Need to investigate
importlib-metadata==7.1.0
unstructured==0.15.10
numpy<2
3 changes: 3 additions & 0 deletions requirements/connectors/duckdb.in
@@ -0,0 +1,3 @@
-c ../common/constraints.txt

duckdb
4 changes: 4 additions & 0 deletions requirements/connectors/duckdb.txt
@@ -0,0 +1,4 @@
# This file was autogenerated by uv via the following command:
# uv pip compile ./connectors/duckdb.in --output-file ./connectors/duckdb.txt --no-strip-extras --python-version 3.9
duckdb==1.1.3
# via -r ./connectors/duckdb.in
3 changes: 3 additions & 0 deletions requirements/connectors/neo4j.in
@@ -0,0 +1,3 @@
neo4j
cymple
networkx
10 changes: 10 additions & 0 deletions requirements/connectors/neo4j.txt
@@ -0,0 +1,10 @@
# This file was autogenerated by uv via the following command:
# uv pip compile neo4j.in --output-file neo4j.txt --no-strip-extras --python-version 3.9
cymple==0.12.0
# via -r neo4j.in
neo4j==5.27.0
# via -r neo4j.in
networkx==3.2.1
# via -r neo4j.in
pytz==2024.2
# via neo4j
3 changes: 3 additions & 0 deletions requirements/connectors/redis.in
@@ -0,0 +1,3 @@
-c ../common/constraints.txt

redis
6 changes: 6 additions & 0 deletions requirements/connectors/redis.txt
@@ -0,0 +1,6 @@
# This file was autogenerated by uv via the following command:
# uv pip compile ./requirements/connectors/redis.in --output-file ./requirements/connectors/redis.txt --no-strip-extras --python-version 3.9
async-timeout==5.0.1
# via redis
redis==5.2.0
# via -r ./requirements/connectors/redis.in
2 changes: 2 additions & 0 deletions requirements/connectors/vectara.in
@@ -1,3 +1,5 @@
-c ../common/constraints.txt

requests
aiofiles
httpx
2 changes: 2 additions & 0 deletions requirements/connectors/vectara.txt
@@ -8,6 +8,8 @@ idna==3.10
# via requests
requests==2.32.3
# via -r ./connectors/vectara.in
aiofiles==24.1.0
# via -r ./connectors/vectara.in
urllib3==1.26.20
# via
# -c ./connectors/../common/constraints.txt
3 changes: 3 additions & 0 deletions requirements/test.in
@@ -1,14 +1,17 @@
-c ./common/constraints.txt

pytest
pytest-cov
pytest-mock
pytest-check
unstructured
pytest-asyncio
pytest_tagging
pytest-json-report
faker
docker
universal_pathlib
deepdiff

# Connector specific deps
cryptography