This folder contains a Dockerfile and a minimal pipeline setup that show how to use Haystack in an air-gapped environment.
- While the container can run in an air-gapped environment, Internet access is required to build the container itself.
- The more models the pipeline uses, and the larger they are, the bigger the resulting Docker image.
- This is a deliberately simple Dockerfile provided for reference; expect to change it to suit realistic use cases.
For the example here, you need docker compose installed on your system.
You’ll also need to clone the haystack-demos repository and run the following commands.
- Read the docker-compose.yml, Dockerfile, and retriever-reader.yml files carefully.
- Make any appropriate changes: choose the desired pipeline and components, add the commands you need to the Dockerfile, and remove the ones you don't.
cd haystack-demos/airgapped-rest_api
docker compose build
After building the Docker image, it should be possible to run the container without internet access.
docker compose up
To find the air-gapped container's IP address:
docker inspect <AIRGAPPED_DOCKER_ID> | grep IPAddress
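Depending on the Docker version and network setup, the grep above can print several IPAddress lines, some of them empty. As a minimal sketch, the helper below keeps only the first non-empty IPv4 value from that output. The name first_ip is hypothetical and not part of the demo.

```shell
# first_ip: hypothetical helper, not part of haystack-demos.
# Reads `docker inspect` JSON (or its grep output) on stdin and prints
# the first non-empty IPv4 value of an "IPAddress" field.
first_ip() {
  sed -n 's/.*"IPAddress": "\([0-9][0-9.]*\)".*/\1/p' | head -n 1
}

# Usage (substitute the real container ID):
#   docker inspect <AIRGAPPED_DOCKER_ID> | first_ip
```

If the container is attached to more than one network, only the first address is printed; inspect the full output when that matters.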
Replace <IPAddress> in the commands below with this address.
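Loading models can take a while after the container starts, so requests sent too early may fail. The sketch below polls a URL until it answers with HTTP 200. The helper name wait_for_api is hypothetical, and the /initialized path in the usage comment is an assumption about the Haystack REST API; substitute any URL you know responds to GET.

```shell
# wait_for_api: hypothetical helper, not part of haystack-demos.
# $1: URL to poll, $2: max attempts (default 30), $3: seconds between attempts (default 2)
wait_for_api() {
  url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --write-out prints only the HTTP status code; "000" means no connection.
    status="$(curl --silent --output /dev/null --write-out '%{http_code}' "$url" || true)"
    if [ "$status" = "200" ]; then
      echo "API ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "API not ready after $attempts attempts" >&2
  return 1
}

# Usage (the endpoint path is an assumption):
#   wait_for_api "http://<IPAddress>:8000/initialized"
```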
To index the test data through the REST API:
find ./airgapped-test-data -name '*.txt' -exec curl --request POST --url http://<IPAddress>:8000/file-upload --header 'accept: application/json' --header 'content-type: multipart/form-data' --form files="@{}" --form meta=null \;
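The one-liner above gives no per-file feedback. As a sketch of the same upload written as a loop, the function below reports each file and flags failed requests. The names API_HOST, API_PORT, upload_url, and upload_all are hypothetical, and the 127.0.0.1 default is only a placeholder for the address found earlier.

```shell
# Placeholder defaults (assumptions): set API_HOST to the container's IP address.
API_HOST="${API_HOST:-127.0.0.1}"
API_PORT="${API_PORT:-8000}"

# Build the upload endpoint URL from the variables above.
upload_url() {
  printf 'http://%s:%s/file-upload\n' "$API_HOST" "$API_PORT"
}

# upload_all DIR: POST every .txt file under DIR, one request per file.
upload_all() {
  find "$1" -name '*.txt' -print | while IFS= read -r file; do
    echo "Uploading $file"
    # --fail makes curl return non-zero on HTTP errors so they can be reported.
    curl --fail --silent --show-error --request POST --url "$(upload_url)" \
      --header 'accept: application/json' \
      --header 'content-type: multipart/form-data' \
      --form files="@$file" --form meta=null || echo "FAILED: $file" >&2
  done
}

# Usage:
#   API_HOST=<IPAddress> upload_all ./airgapped-test-data
```

File names containing newlines or characters that curl treats specially in --form values are not handled by this sketch.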
To verify that the data was written to the DocumentStore:
curl --request POST --url http://<IPAddress>:8000/documents/get_by_filters --header 'accept: application/json' --header 'content-type: application/json' --data '{"filters": {}}'
To run a sample query through the REST API:
curl --request POST --url http://<IPAddress>:8000/query --header 'accept: application/json' --header 'content-type: application/json' --data '{"query": "what is my name?"}'
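To ask different questions without editing the JSON by hand, the query can be wrapped in a small function. This is a sketch only: json_escape, query_payload, ask, and API_HOST are hypothetical names, the 127.0.0.1 default is a placeholder, and the escaping covers backslashes and double quotes but not control characters.

```shell
# Placeholder default (assumption): set API_HOST to the container's IP address.
API_HOST="${API_HOST:-127.0.0.1}"

# Escape backslashes first, then double quotes, so the text is safe inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body expected by the /query endpoint.
query_payload() {
  printf '{"query": "%s"}' "$(json_escape "$1")"
}

# ask QUESTION: send the question to the REST API and print the raw JSON response.
ask() {
  curl --silent --show-error --request POST --url "http://$API_HOST:8000/query" \
    --header 'accept: application/json' \
    --header 'content-type: application/json' \
    --data "$(query_payload "$1")"
}

# Usage:
#   API_HOST=<IPAddress> ask "what is my name?"
```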