Releases: jina-ai/serve
💫 Release v3.16.0
Release Note (3.16.0)
Release time: 2023-05-05 06:35:58
This release contains 4 new features, 5 bug fixes and 8 documentation improvements.
🆕 Features
(Beta) Replicate stateful Executors with consensus using the RAFT algorithm with the new DocArray version (docarray >= 0.30) (#5564)
When scaling Executors inside a Deployment, you can now ensure that internal state (if the Executor has any) is synced across every replica by having all replicas work in consensus. The internal state of every replica stays consistent, so they can all serve requests in an equivalent manner.
For this, you need to decorate the Executor methods that alter its inner state with the @write decorator. Then, when adding the Executor inside a Deployment, you need to add the stateful=True flag and optionally configure the ports of every peer in the replication cluster using the --peer-ports argument:
from jina import Deployment, Executor, requests
from jina.serve.executors.decorators import write
from docarray import DocList
from docarray.documents import TextDoc


class MyStateStatefulExecutor(Executor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._docs_dict = {}

    @write
    @requests(on=['/index'])
    def index(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        for doc in docs:
            self._docs_dict[doc.id] = doc

    @requests(on=['/search'])
    def search(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        for doc in docs:
            self.logger.debug(f'Searching against {len(self._docs_dict)} documents')
            doc.text = self._docs_dict[doc.id].text


d = Deployment(
    name='stateful_executor',
    uses=MyStateStatefulExecutor,
    replicas=3,
    stateful=True,
    peer_ports=[12345, 12346, 12347],
)

with d:
    d.block()
The consensus module in Jina ensures that the three replicas all hold the same state. Jina uses the RAFT algorithm (https://raft.github.io/) to provide this feature.
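The consistency guarantee can be pictured with a toy, pure-Python sketch (an illustration only, not Jina's actual RAFT code): if every replica applies the same ordered log of write operations, all replicas converge to identical state, and any of them can then answer /search requests equivalently.

```python
# Toy illustration of replicated state machines (NOT Jina's RAFT implementation):
# if every replica applies the same ordered log of write operations,
# all replicas converge to the same internal state.

class ToyReplica:
    def __init__(self):
        self.docs = {}

    def apply(self, op):
        # each op is a (doc_id, text) "write" entry from the shared log
        doc_id, text = op
        self.docs[doc_id] = text


# the consensus layer's job is to make every replica see this same log
log = [('a', 'hello'), ('b', 'world'), ('a', 'hello again')]

replicas = [ToyReplica() for _ in range(3)]
for replica in replicas:
    for op in log:
        replica.apply(op)

# all replicas hold identical state, so any of them can serve a search
assert all(r.docs == replicas[0].docs for r in replicas)
```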
Support HTTP and combined protocols for Deployment with the new DocArray version (docarray >= 0.30) (#5826)
You can now use the new DocArray version when serving a Deployment over HTTP, or over a composition of HTTP and gRPC.
This allows OpenAPI specs to match the exact Document schemas defined by the Executor:
from jina import Deployment, Executor, requests
from docarray import DocList, BaseDoc
from docarray.documents import ImageDoc
from docarray.typing import AnyTensor
import numpy as np


class InputDoc(BaseDoc):
    img: ImageDoc


class OutputDoc(BaseDoc):
    embedding: AnyTensor


class MyExec(Executor):
    @requests(on='/bar')
    def bar(self, docs: DocList[InputDoc], **kwargs) -> DocList[OutputDoc]:
        docs_return = DocList[OutputDoc](
            [OutputDoc(embedding=np.zeros((100, 1))) for _ in range(len(docs))]
        )
        return docs_return


d = Deployment(uses=MyExec, protocol='http')

with d:
    d.block()
Support sharding in Deployment with new DocArray version (docarray >= 0.30) (#5828)
Using shards inside a Deployment (where the Executor works with docarray>=0.30) now works the same as with previous versions of docarray, as described in the documentation.
Clickable link to /docs OpenAPI endpoint (#5854)
When starting a Flow or a Deployment with the HTTP protocol, the /docs and /redocs links now appear complete and can be clicked to open the browser directly from the terminal.
🐞 Bug Fixes
Improved platform-specific dependency on uvloop (#5841)
This fixes installation issues when downstream dependencies try to install Jina on Windows using poetry.
Use utf-8 encoding when opening files (#5821)
Jina now uses utf-8 encoding when opening files, fixing potential problems when using Jina on Windows.
Fix new Deployment CLI (#5819)
Fix the output YAML file when using jina new <project-name> --type deployment
. It previously dumped an invalid Deployment YAML. The syntax has now been fixed so that it can be deployed.
Use container stop so that containers running CUDA can be killed (#5816)
Jina now uses stop instead of kill from the Python Docker SDK. This allows containerized Executors that are run with conda run to terminate, since conda does not propagate its captured signals to its Python subprocesses.
Catch ContentTypeError for retry in post requests (#5825)
ContentTypeError responses in the Client are now caught, and retries applied if required.
📗 Documentation Improvements
- Update Slack count (#5842)
- Add storage section (#5831)
- Remove layer from orchestration layer title (#5829)
- Fix broken links (#5822)
- Add docs for jc recreate (#5815)
- Point to legacy DocumentArray docs (#5810)
- Fix some English in docstrings (#5718)
- Fix some links in scalability chapter (#5809)
🤟 Contributors
We would like to thank all contributors to this release:
- notandor (@NotAndOr)
- Deepankar Mahapatro (@deepankarm)
- Alex Cureton-Griffiths (@alexcg1)
- Tanguy Abel (@Tanguyabel)
- Nikolas Pitsillos (@npitsillos)
- Felix Mönckemeyer (@nemesisflx)
- Joan Fontanals (@JoanFM)
- Florian Hönicke (@florian-hoenicke)
💫 Patch v3.15.2
Release Note (3.15.2)
Release time: 2023-05-05 06:35:58
🙇 We'd like to thank all contributors for this new release! In particular,
Joan Fontanals, notandor, Deepankar Mahapatro, Nikolas Pitsillos, Tanguy Abel, Florian Hönicke, Alex Cureton-Griffiths, Felix Mönckemeyer, Jina Dev Bot, 🙇
🆕 New Features
- [6ba33fe0] - catch ContentTypeError for retry in POST requests (#5825) (Tanguy Abel)
- [9f317c58] - add support to HTTP and Composite Deployment with docarray v2 (#5826) (Joan Fontanals)
🐞 Bug fixes
- [bf536eb8] - set platform specific dependency to uvloop in a better way (#5841) (notandor)
- [9b784b9e] - fix HTTP support for Deployment and docarray v2 (#5830) (Joan Fontanals)
- [d57679a0] - use always utf-8 encoding when opening files (#5821) (Florian Hönicke)
- [779bff28] - fix new CLI for Deployment (#5819) (Joan Fontanals)
- [034ee847] - use container STOP so that containers running CUDA can be killed (#5816) (Joan Fontanals)
- [90b46ade] - docstrings-english (#5718) (Alex Cureton-Griffiths)
- [d6636eb9] - link issues in scalability chapter (#5809) (Felix Mönckemeyer)
📗 Documentation
- [8fda71bf] - update slack count (#5842) (Deepankar Mahapatro)
- [ecd86056] - jcloud: add storage section (#5831) (Nikolas Pitsillos)
- [5080dacc] - remove layer from orchestration layer title (#5829) (Joan Fontanals)
- [d45f4b3d] - fix broken links (#5822) (Joan Fontanals)
- [b3590006] - add docs for jc recreate (#5815) (Nikolas Pitsillos)
- [12948a86] - point to legacy documentarray docs (#5810) (Joan Fontanals)
🏁 Unit Test and CICD
- [c2d52cdd] - fix some tests in CI (#5812) (Joan Fontanals)
- [892a2825] - fix docarray v2 comp tests (#5811) (Joan Fontanals)
🍹 Other Improvements
- [3adc08eb] - bump jina version (#5849) (Joan Fontanals)
- [a5d3403d] - pin urllib3 (#5848) (Joan Fontanals)
- [de9fcc92] - fix links in README (#5846) (Joan Fontanals)
- [729d4d49] - fix link to survey (#5844) (Joan Fontanals)
- [75f0269e] - try to fix CI (#5814) (Joan Fontanals)
- [c596452c] - ignore dynamically generated protobuf docs (#5726) (Alex Cureton-Griffiths)
- [7a1c1e4a] - docs: update TOC (Jina Dev Bot)
- [b3493635] - version: the next version will be 3.15.1 (Jina Dev Bot)
💫 Release v3.15.0
Release Note (3.15.0)
Release time: 2023-04-14 09:55:25
This release contains 6 new features, 6 bug fixes and 5 documentation improvements.
🆕 Features
HTTP and composite protocols for Deployment (#5764)
When using a Deployment to serve a single Executor, you can now expose it via the HTTP protocol or a combination of the HTTP and gRPC protocols:
from jina import Deployment, Executor, requests


class MyExec(Executor):
    @requests(on='/bar')
    def bar(self, docs, **kwargs):
        pass


dep = Deployment(protocol=['http', 'grpc'], port=[12345, 12346], uses=MyExec)

with dep:
    dep.block()
With this, you can also access the OpenAPI schema at localhost:12345/docs:
Force network mode option (#5789)
When using a containerized Executor inside a Deployment or as part of a Flow, under some circumstances you may want to force the network mode to make sure the container is reachable by the Flow or Deployment for readiness checks. This ensures that the Docker Python SDK runs the container with the relevant options.
For this, we have added the force_network_mode argument.
You can set this argument to any of these options:
- AUTO: Automatically detect the Docker network.
- HOST: Use the host network.
- BRIDGE: Use a user-defined bridge network.
- NONE: Use None as the network.
from jina import Deployment

dep = Deployment(uses='jinaai+docker://TransformerTorchEncoder', force_network_mode='None')

with dep:
    dep.block()
Allow disabling thread lock (#5771)
When an Executor exposes a synchronous method (not a coroutine) via the @requests decorator, Jina runs each received request in a thread. This thread is locked with a threading.Lock object to protect the user from potential multithreading hazards, while leaving the Executor free to respond to health checks coming from the outside or from orchestration frameworks such as Kubernetes. This lock can now be bypassed by passing the allow_concurrent argument to the Executor.
from jina import Deployment, Executor, requests


class MyExec(Executor):
    @requests(on='/bar')
    def bar(self, docs, **kwargs):
        pass


dep = Deployment(allow_concurrent=True, uses=MyExec)

with dep:
    dep.block()
grpc_channel_options for custom gRPC options for the channel (#5765)
You can now pass grpc_channel_options to allow granular tuning of the gRPC connectivity from the Client or Gateway. You can check the available options in the gRPC Python documentation.
client = Client(grpc_channel_options={'grpc.max_send_message_length': -1})
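As a slightly fuller sketch, the dictionary below bundles a few commonly tuned channel arguments; the keys are standard gRPC channel options, and the values are arbitrary examples rather than recommendations:

```python
# Standard gRPC channel arguments; -1 means "unlimited" for the
# message-size options (see the gRPC Python docs for the full list).
grpc_channel_options = {
    'grpc.max_send_message_length': -1,
    'grpc.max_receive_message_length': -1,
    'grpc.keepalive_time_ms': 10_000,  # send a keepalive ping every 10 seconds
}

# passed to the Client exactly as in the one-liner above:
# client = Client(grpc_channel_options=grpc_channel_options)
```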
Create Deployments from the CLI (#5756)
You can now use the Jina CLI to create a first project for deploying a single Deployment, in the same way it was already possible for a Flow.
The jina new command accepts a new type argument that can be flow or deployment:
jina new hello-world --type flow
jina new hello-world --type deployment
Add replicas argument to Gateway for Kubernetes (#5711)
To scale the Gateway in Kubernetes or in JCloud, you can now add the replicas argument to the gateway:
from jina import Flow
f = Flow().config_gateway(replicas=3).add()
f.to_kubernetes_yaml('./k8s_yaml_path')
jtype: Flow
version: '1'
with: {}
gateway:
  replicas: 3
executors:
- name: executor0
🐞 Bug Fixes
Retry client gRPC stream and unary RPC methods (#5733)
The retry mechanism parameters were not properly respected by the Client in prior releases. This is now fixed, improving robustness against transient errors.
from jina import Client, DocumentArray

Client(host='...').post(
    on='/',
    inputs=DocumentArray.empty(),
    max_attempts=100,
)
Allow HTTP timeout (#5797)
When using the Client to send data to an HTTP service, the connection previously timed out after five minutes (the default setting for aiohttp). This timeout can now be configured for cases where a request takes longer, avoiding the Client disconnecting on long-running requests.
from jina import Client, DocumentArray

Client(protocol='http').post(
    on='/',
    inputs=DocumentArray.empty(),
    timeout=600,
)
Enable root logging at all times (#5736)
The JINA_LOG_LEVEL environment variable controls the log level of the JinaLogger. Previously, debug logging of other dependencies was not respected. Now it can be enabled:
import logging

logging.getLogger('urllib3').setLevel(logging.DEBUG)
Fix Gateway tensor serialization (#5752)
In prior releases, when an HTTP Gateway was run without torch installed and connected to an Executor returning torch.Tensor as part of the Documents, the Gateway couldn't serialize the Documents back to the Client, leading to a no module torch error. This is now fixed and works without installing torch in the Gateway container or system.
from jina import Flow, Executor, Document, DocumentArray, requests
import torch


class DummyTorchExecutor(Executor):
    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            d.embedding = torch.rand(1000)
            d.tensor = torch.rand(1000)


from jina import Flow, Executor, Document, DocumentArray, requests

flow = Flow().config_gateway(port=12346, protocol='http').add(port='12345', external=True)

with flow:
    docs = flow.post(on='/', inputs=Document())
    print(docs[0].embedding.shape)
    print(docs[0].tensor.shape)
Composite Gateway tracing (#5741)
Previously, tracing didn't work for Gateways that exposed multiple protocols:
from jina import Flow
f = Flow(port=[12345, 12346], protocol=['http', 'grpc'], tracing=True).add()
with f:
f.block()
Adapt to DocArray v2 (#5742)
Jina depends on DocArray's data structures. This version adds support for DocArray v2's upcoming major changes.
This involves new naming conventions:
- DocumentArray ➡️ DocList
- BaseDocument ➡️ BaseDoc
from jina import Deployment, Executor, requests
from docarray import DocList, BaseDoc
from docarray.documents import ImageDoc
from docarray.typing import AnyTensor
import numpy as np


class InputDoc(BaseDoc):
    img: ImageDoc


class OutputDoc(BaseDoc):
    embedding: AnyTensor


class MyExec(Executor):
    @requests(on='/bar')
    def bar(self, docs: DocList[InputDoc], **kwargs) -> DocList[OutputDoc]:
        docs_return = DocList[OutputDoc](
            [OutputDoc(embedding=np.zeros((100, 1))) for _ in range(len(docs))]
        )
        return docs_return


with Deployment(uses=MyExec) as dep:
    docs = dep.post(
        on='/bar',
        inputs=InputDoc(img=ImageDoc(tensor=np.zeros((3, 224, 224)))),
        return_type=DocList[OutputDoc],
    )
    assert docs[0].embedding.shape == (100, 1)
    assert docs.__class__.document_type == OutputDoc
📗 Documentation improvements
- JCloud Flow name customization (#5778)
- JCloud docs revamp for instance (#5759)
- Fix Colab link (#5760)
- Remove docsQA (#5743)
- Misc polishing
🤟 Contributors
We would like to thank all contributors to this release:
- Girish Chandrashekar (@girishc13)
- Asuzu Kosisochukwu (@asuzukosi)
- AlaeddineAbdessalem (@alaeddine-13)
- Zac Li (@zac-li)
- nikitashrivastava29 (@nikitashrivastava29)
- samsja (@samsja)
- Alex Cureton-Griffiths (@alexcg1)
- Joan Fontanals (@JoanFM)
- Deepankar Mahapatro (@deepankarm)
💫 Patch v3.14.1
Release Note (3.14.1)
Release time: 2023-02-24 10:21:36
This release contains 3 bug fixes and 3 documentation improvements.
🐞 Bug Fixes
Respect replication and host configuration for Executor Deployments (#5705)
Prior to this release, these settings weren't fully respected when an Executor was deployed with a replication setup or a specific host configuration.
It turns out that even though all replicas would be spun up, the Gateway would only know about one of them and would forward all incoming requests to that replica. Additionally, all replica hostnames would be set to 0.0.0.0, even if the user had chosen other hostname settings.
This release properly respects these settings. All replicas are now configured with the specified hostname and the Jina Gateway properly load balances traffic across all replicas.
from jina import Deployment, Executor, requests, DocumentArray


class MyExecutor(Executor):
    @requests
    def foo(self, **kwargs):
        print(self.runtime_args.name)


with Deployment(uses=MyExecutor, replicas=3, host='127.0.0.1') as dep:
    dep.post(on='/', inputs=DocumentArray.empty(5), request_size=1)
─────────────────────── 🎉 Deployment is ready to serve! ───────────────────────
╭─────────────────────── 🔗 Endpoint ───────────────────────╮
│ ⛓ Protocol GRPC │
│ 🏠 Local 127.0.0.1:64503 │
│ 🔒 Private 192.168.178.45:64503 │
│ 🌍 Public 2003:c0:df06:f00:b16e:f44:45a6:7633:64503 │
╰───────────────────────────────────────────────────────────╯
executor/rep-0
executor/rep-0
executor/rep-1
executor/rep-2
executor/rep-2
Hide known warnings in Google Colab (#5706)
When using Jina in Google Colab, you may have received warnings about coroutines not being awaited. These warnings were mainly about Jina detecting a Jupyter environment and didn't really represent wrong behaviour. Thus, we have suppressed these warnings in this release.
DocArray v2 support with protobuf v4 (#5702)
Prior to this release, Jina did not support DocArray v2 if protobuf v4 was installed. This release fixes that, and DocArray v2 now works with both versions of protobuf (v3 and v4). We have also increased test coverage so support is tested for both versions.
📗 Documentation Improvements
- Make Flow and Deployment difference clearer (#5709)
- Add warning about cleaning up Flows (#5708)
- Add JCloud Executor level labels (#5704)
🤟 Contributors
We would like to thank all contributors to this release:
- AlaeddineAbdessalem (@alaeddine-13)
- Alex Cureton-Griffiths (@alexcg1)
- Joan Fontanals (@JoanFM)
- Subba Reddy Veeramreddy (@subbuv26)
💫 Release v3.14.0
Release Note (3.14.0)
Release time: 2023-02-20 09:15:47
This release contains 11 new features, 6 refactors, 12 bug fixes and 10 documentation improvements.
🆕 Features
Reshaping Executors as standalone services with the Deployment layer (#5563, #5590, #5628, #5672 and #5673)
In this release we aim to unlock more use cases, mainly building highly performant and scalable services. With its built-in layers of abstraction, Jina lets users build scalable, containerized, cloud-native components which we call Executors. Executors have always been services, but they were mostly used in Flows to form pipelines.
Now you can deploy an Executor on its own, without needing a Flow. Whether it's for model inference, prediction, embedding, generation or search, an Executor can wrap your business logic, and you get a gRPC microservice with Jina's cloud-native features (shards, replicas, dynamic batching, etc.)
To do this we offer the Deployment layer to deploy an Executor. Just like a Flow groups and orchestrates many Executors, a Deployment orchestrates just one Executor.
A Deployment can be used with both the Python API and YAML. For instance, after you define an Executor, use the Deployment class to serve it:
from jina import Deployment
with Deployment(uses=MyExecutor, port=12345, replicas=2) as dep:
dep.block() # serve forever
─────────────────────── 🎉 Deployment is ready to serve! ───────────────────────
╭────────────── 🔗 Endpoint ────────────────╮
│ ⛓ Protocol GRPC │
│ 🏠 Local 0.0.0.0:12345 │
│ 🔒 Private 192.168.3.147:12345 │
│ 🌍 Public 87.191.159.105:12345 │
╰───────────────────────────────────────────╯
Or implement a Deployment in YAML and run it from the CLI:
jtype: Deployment
with:
  port: 12345
  replicas: 2
  uses: MyExecutor
  py_modules:
  - my_executor.py
jina deployment --uses deployment.yml
The Deployment class offers the same interface as a Flow, so it can be used as a client too:
from jina import Deployment, DocumentArray

with Deployment(uses=MyExecutor, port=12345, replicas=2) as dep:
    docs = dep.post(on='/foo', inputs=DocumentArray.empty(1))
    print(docs.texts)
Furthermore, you can use the Deployment to create Kubernetes and Docker Compose YAML configurations of a single Executor deployment. So, to export to Kubernetes with the Python API:
from jina import Deployment
dep = Deployment(uses='jinaai+docker://jina-ai/DummyHubExecutor', port_expose=8080, replicas=3)
dep.to_kubernetes_yaml('/tmp/config_out_folder', k8s_namespace='my-namespace')
And exporting to Kubernetes with the CLI is just as straightforward:
jina export kubernetes deployment.yml output_path
As is exporting to Docker Compose with the Python API:
from jina import Deployment
dep = Deployment(uses='jinaai+docker://jina-ai/DummyHubExecutor', port_expose=8080, replicas=3)
dep.to_docker_compose_yaml(
    output_path='/tmp/docker-compose.yml',
)
And of course, you can also export to Docker Compose with the CLI:
jina export docker-compose deployment.yml output_path
Read more about serving standalone Executors in our documentation.
(Beta) Support DocArray v2 (#5603)
As the DocArray refactoring is shaping up nicely, we've decided to integrate initial support. Although this support is still experimental, we believe DocArray v2 offers nice abstractions to clearly define the data of your services, especially with the single Executor deployment that we introduce in this release.
With this new experimental feature, you can define your input and output schemas with DocArray v2 and use type hints to define schemas of each endpoint:
from jina import Executor, requests
from docarray import BaseDocument, DocumentArray
from docarray.typing import AnyTensor, ImageUrl


class InputDoc(BaseDocument):
    img: ImageUrl


class OutputDoc(BaseDocument):
    embedding: AnyTensor


class MyExec(Executor):
    @requests(on='/bar')
    def bar(self, docs: DocumentArray[InputDoc], **kwargs) -> DocumentArray[OutputDoc]:
        # `embed` stands in for your own embedding function
        return_docs = DocumentArray[OutputDoc](
            [OutputDoc(embedding=embed(doc.img)) for doc in docs]
        )
        return return_docs
Read more about the integration in the DocArray v2 section of our docs.
Communicate with individual Executors in Custom Gateways (#5558)
Custom Gateways can now make separate calls to specific Executors without respecting the Flow's topology.
With this feature, we target a different set of use cases, where the task does not necessarily have to be defined by a DAG pipeline. Rather, you define processing order using explicit calls to Executors and implement any use case where there's a central service (Gateway) communicating with remote services (Executors).
For instance, you can implement a Gateway like so:
from jina.serve.runtimes.gateway.http.fastapi import FastAPIBaseGateway
from jina import Document, DocumentArray, Flow, Executor, requests
from fastapi import FastAPI


class MyGateway(FastAPIBaseGateway):
    @property
    def app(self):
        app = FastAPI()

        @app.get("/endpoint")
        async def get(text: str):
            doc1 = await self.executor['executor1'].post(
                on='/', inputs=DocumentArray([Document(text=text)])
            )
            doc2 = await self.executor['executor2'].post(
                on='/', inputs=DocumentArray([Document(text=text)])
            )
            return {'result': doc1.texts + doc2.texts}

        return app


# Add the Gateway and Executors to a Flow
flow = (
    Flow()
    .config_gateway(uses=MyGateway, protocol='http', port=12345)
    .add(uses=FirstExec, name='executor1')
    .add(uses=SecondExec, name='executor2')
)
Read more about calling individual Executors.
Add secrets to Jina on Kubernetes (#5557)
To support building secure apps, we've added support for secrets on Kubernetes in Jina. Mainly, you can create environment variables whose sources are Kubernetes Secrets.
Add the secret using the env_from_secret parameter either in the Python API or YAML:
from jina import Flow

f = Flow().add(
    uses='jinaai+docker://jina-ai/DummyHubExecutor',
    env_from_secret={
        'SECRET_USERNAME': {'name': 'mysecret', 'key': 'username'},
        'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'},
    },
)

f.to_kubernetes_yaml('./k8s_flow', k8s_namespace='custom-namespace')
Add GatewayStreamer.stream() to yield responses and Executor errors (#5650)
If you're implementing a custom Gateway, you can use the GatewayStreamer.stream() method to catch errors raised in Executors. Catching such errors wasn't possible with the GatewayStreamer.stream_docs() method.
async for docs, error in self.streamer.stream(
    docs=my_da,
    exec_endpoint='/',
):
    if error:
        # handle the error, e.g. raise it
        raise error
    else:
        # process the results
        print(docs)
Read more about the feature in the documentation.
Add argument suppress_root_logging to remove or preserve root logging handlers (#5635)
In this release, we've added the argument suppress_root_logging to (you guessed it) suppress root logger messages. By default, root logs are suppressed.
Kudos to our community member @Jake-00 for the contribution!
Add gRPC streaming endpoint to Worker and Head runtimes (#5614)
To empower Executors we've added a gRPC streaming endpoint to both the Worker and Head runtimes. This means that an Executor or Head gRPC server exposes the same interface as a Jina gRPC Gateway. Therefore, you can use Jina's gRPC Client with each of those entities.
Add prefetch argument to Client post method (#5607)
A prefetch argument has been added to the Client.post() method. Previously, this argument was only available on the Gateway, where it controlled how many requests the Gateway could send to Executors at a time.
However, it was not possible to control how many requests a Gateway (or Executor, in the case of a single Executor Deployment) could receive at a time.
Therefore, we've added the argument to Client.post() to give you better control over your requests.
Read more in the documentation.
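Conceptually, prefetch bounds how many requests are in flight at any moment: a new request is only issued once an earlier one completes. A plain-Python sketch of that windowing behaviour (an illustration only, not Jina's scheduler):

```python
# Conceptual sketch of the prefetch window (plain Python, NOT Jina's
# request scheduler): at most `prefetch` requests are pending at once.
from collections import deque


def send_with_prefetch(requests, prefetch, handle):
    in_flight = deque()
    max_in_flight = 0
    for req in requests:
        if len(in_flight) == prefetch:
            # window is full: wait for the oldest request to finish
            handle(in_flight.popleft())
        in_flight.append(req)
        max_in_flight = max(max_in_flight, len(in_flight))
    # drain the remaining pending requests
    while in_flight:
        handle(in_flight.popleft())
    return max_in_flight


done = []
peak = send_with_prefetch(range(10), prefetch=3, handle=done.append)
assert peak == 3        # never more than `prefetch` pending requests
assert len(done) == 10  # every request is eventually processed
```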
Run warmup on runtimes and Executor (#5579)
On startup, all Jina entities that hold gRPC connections and stubs to other entities (Head and Gateway) now start warming up before the services become ready. This ensures lower latencies on first requests submitted to the Flow.
Make gRPC Client thread safe (#5533)
Previously, as gRPC asyncio clients offer limited support for multi-threading, using the Jina gRPC Client in multiple threads would print errors.
Therefore, in this release, we make the gRPC Client thread-safe in the sense that a thread can re-use it multiple times without another thread using it simultaneously. This means you can use the gRPC Client with multi-threading, while being sure only asyncio tasks belonging to the same thread have access to the gRPC stub at the same time.
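The guarantee can be pictured as a lock serializing access to the shared gRPC stub. A rough pure-Python illustration of that pattern (not Jina's actual implementation):

```python
# Rough illustration of the thread-safety guarantee (NOT Jina's actual
# code): concurrent threads re-use one shared "stub", but a lock ensures
# only one thread touches it at a time.
import threading


class ToyStub:
    def __init__(self):
        self._lock = threading.Lock()
        self.calls = []

    def post(self, payload):
        with self._lock:  # serialize access to the shared resource
            self.calls.append(payload)
            return payload


stub = ToyStub()
threads = [threading.Thread(target=stub.post, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(stub.calls) == list(range(8))  # no call lost or duplicated
```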
Add user survey (#5667)
When running a Flow, a message now shows up in the terminal with a survey link. Feel free to fill in [ou...
💫 Patch v3.13.2
Release Note (3.13.2)
This release contains 1 bug fix.
🐞 Bug Fixes
Respect timeout_ready when generating startup probe (#5560)
As Kubernetes Startup Probes were added to all deployments in release v3.13.0, we added default values for all probe configurations. However, if those default configurations were not enough to wait for an Executor that takes time to load and become ready, the Executor deployment would become subject to the configured restart policy. Therefore, Executors that are slow to load would keep restarting forever.
In this patch, this behavior is fixed by making sure that Startup Probe configurations respect the timeout_ready argument of Executors.
Startup Probes are configured like so:
- periodSeconds: always set to 5 seconds.
- timeoutSeconds: always set to 10 seconds.
- failureThreshold: the number of attempts made by Kubernetes to check whether the pod is ready. It varies according to timeout_ready. The formula used is failureThreshold = timeout_ready / 5000 (as timeout_ready is in milliseconds and periodSeconds is 5 seconds), and in all cases it will be at least 3. If timeout_ready is -1 (which in Jina means waiting forever for the Executor to become ready), since waiting forever is not supported in Kubernetes, failureThreshold is set to 120 attempts.
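The rule above can be sketched as a small helper (a hypothetical function mirroring the stated formula, not Jina's source code):

```python
# Sketch of the failureThreshold rule described above (hypothetical helper).
def failure_threshold(timeout_ready_ms: int) -> int:
    if timeout_ready_ms == -1:
        # "wait forever" is not supported by Kubernetes; cap at 120 attempts
        return 120
    # periodSeconds is 5s, so divide the millisecond timeout by 5000,
    # with a floor of 3 attempts
    return max(3, timeout_ready_ms // 5000)


assert failure_threshold(60_000) == 12  # a 60s timeout -> 12 probe attempts
assert failure_threshold(1_000) == 3    # never fewer than 3 attempts
assert failure_threshold(-1) == 120     # "forever" maps to 120 attempts
```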
🤘 Contributors
We would like to thank all contributors to this release:
- AlaeddineAbdessalem (@alaeddine-13)
💫 Patch v3.13.1
Release Note (3.13.1)
Release time: 2022-12-21 12:58:15
This release contains 3 bug fixes and 1 documentation improvement.
🐞 Bug Fixes
Support Gateway with multiple protocols for Kubernetes export (#5532)
You can now export Flows with multiple protocols to Kubernetes. Previously this would cause an error.
flow = Flow().config_gateway(protocol=['http', 'grpc'])
flow.to_kubernetes_yaml('k8s_flow_folder')
Fix Python 3.11 support (#5529)
It was previously impossible to install Jina with Python 3.11 due to a grpcio dependency problem. grpcio added support for Python 3.11 only with version 1.49.0, causing potential problems when used by Jina and other projects.
In this release, grpcio>=1.49.0 is installed alongside Jina when using Python 3.11. However, be aware of potential problems related to grpc hanging.
Unary RPC from Client respects results_in_order (#5513)
In prior releases, calling the post method of a client with grpc and using stream=False did not respect the results_in_order parameter, and results were always returned in order:
# this wrongly returns results in order
c = Client(protocol='grpc')
c.post(on='/', inputs=DocumentArray.empty(10000), stream=False, results_in_order=False)
This also implied that using the Client with asyncio=True and stream=False in the post call would return results in the order they were returned by the Flow, rather than respecting the input order:
# this wrongly returns results in order
c = Client(protocol='grpc', asyncio=True)
async for resp in c.post(on='/', inputs=DocumentArray.empty(10000), stream=False, results_in_order=False):
    print(resp)
This release fixes the ordering bug.
📗 Documentation Improvements
- Document inheritance of arguments from Flow API to Executors and Gateway (#5535)
🤘 Contributors
We would like to thank all contributors to this release:
- AlaeddineAbdessalem (@alaeddine-13)
- Joan Fontanals (@JoanFM)
- Jackmin801 (@Jackmin801)
- Anne Yang (@AnneYang720)
💫 Release v3.13.0
Release Note (3.13.0)
Release time: 2022-12-15 15:33:43
This release contains 14 new features, 9 bug fixes and 7 documentation improvements.
This release introduces major features like Custom Gateways, Dynamic Batching for Executors, development support with auto-reloading, support for the new namespaced Executor scheme jinaai, improvements to our gRPC transport layer, and more.
🆕 Features
Custom Gateways (#5153, #5189, #5342, #5457, #5465, #5472 and #5477)
Jina Gateways are now customizable in the sense that you can implement them in much the same way as an Executor. With this feature, Jina gives power to the user to implement any server, protocol or interface at the Gateway level. There's no more need to build an extra service that uses the Flow.
For instance, you can define a Jina Gateway that communicates with the rest of Flow Executors like so:
from docarray import Document, DocumentArray
from jina.serve.runtimes.gateway.http.fastapi import FastAPIBaseGateway


class MyGateway(FastAPIBaseGateway):
    @property
    def app(self):
        from fastapi import FastAPI

        app = FastAPI(title='Custom FastAPI Gateway')

        @app.get(path='/service')
        async def my_service(input: str):
            # convert input request to Documents
            docs = DocumentArray([Document(text=input)])

            # send Documents to Executors using GatewayStreamer
            result = None
            async for response_docs in self.streamer.stream_docs(
                docs=docs,
                exec_endpoint='/',
            ):
                # convert response docs to server response and return it
                result = response_docs[0].text

            return {'result': result}

        return app
Then you can use it in your Flow in the following way:
flow = Flow().config_gateway(uses=MyGateway, port=12345, protocol='http')
A Custom Gateway can be used as a Python class, YAML configuration or Docker image.
Adding support for Custom Gateways required exposing the Gateway API and supporting multiple ports and protocols (mentioned in a prior release). You can customize the Gateway by subclassing the FastAPIBaseGateway class (for simple implementations) or the base Gateway class for more complex use cases.
Working on this feature also involved exposing and improving the GatewayStreamer API as a way to communicate with Executors within the Gateway.
Find more information in the Custom Gateway page.
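As a sketch of the YAML route mentioned above, a custom Gateway could be referenced from a Flow configuration roughly like this (the field placement, in particular py_modules, is our assumption; check the Custom Gateway page for the exact schema):

```yaml
jtype: Flow
gateway:
  uses: MyGateway          # the class defined above, assumed importable
  py_modules: gateway.py   # hypothetical module file containing MyGateway
  protocol: http
  port: 12345
```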
Dynamic batching (#5410)
This release adds Dynamic batching capabilities to Executors.
Dynamic batching allows requests to be accumulated and batched together before being sent to an Executor. The batch is created dynamically depending on the configuration for each endpoint.
This feature is especially relevant for inference tasks where model inference is more optimized when batched to efficiently use GPU resources.
You can configure Dynamic batching using either a decorator or the uses_dynamic_batching parameter. The following example shows how to enable Dynamic batching on an Executor that performs model inference:
from jina import Executor, requests, dynamic_batching, Flow, DocumentArray, Document
import numpy as np
import torch


class MyExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # initialize model
        self.model = torch.nn.Linear(in_features=128, out_features=128)

    @requests(on='/bar')
    @dynamic_batching(preferred_batch_size=10, timeout=200)
    def embed(self, docs: DocumentArray, **kwargs):
        docs.embeddings = self.model(torch.Tensor(docs.tensors))


flow = Flow().add(uses=MyExecutor)
With Dynamic Batching enabled, the Executor above will efficiently use GPU resources to perform inference by batching Documents together.
Read more about the feature in the Dynamic Batching documentation page.
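The accumulate-until-full-or-timeout idea behind Dynamic batching can be sketched in plain Python. This is a simplified, synchronous illustration; the function name and structure are ours, not Jina's internals, which batch requests asynchronously per endpoint:

```python
import time


def dynamic_batcher(items, preferred_batch_size=10, timeout=0.2):
    """Accumulate items into batches; flush when a batch is full or the
    oldest queued item has waited longer than `timeout` seconds."""
    batch, deadline = [], None
    for item in items:
        if not batch:
            deadline = time.monotonic() + timeout
        batch.append(item)
        if len(batch) >= preferred_batch_size or time.monotonic() >= deadline:
            yield batch
            batch, deadline = [], None
    if batch:  # flush the trailing partial batch
        yield batch


# 25 fast-arriving items form batches of 10, 10 and a trailing 5
batches = list(dynamic_batcher(range(25), preferred_batch_size=10))
```

The timeout bounds the extra latency a request can pay for the benefit of batched inference.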
Install requirements of local Executors (#5508)
Prior to this release, the install_requirements parameter of Executors only installed requirements for Hub Executors. Now, local Executors with a requirements.txt file will also have their requirements installed before starting Flows.
Support jinaai Executor scheme to enable namespaced Hub Executors (#5462, #5468 and #5515)
As Jina AI Cloud introduced namespaces for Executor resources, we added support for the new jinaai Executor scheme. This means that namespaced Executors can now be used with the jinaai scheme in the following way:
from jina import Flow
flow = Flow().add(uses='jinaai://jina-ai/DummyHubExecutor')
This scheme is also supported in Kubernetes and other APIs:
from jina import Flow
flow = Flow().add(uses='jinaai+docker://jina-ai/DummyHubExecutor')
flow.to_kubernetes_yaml('output_path', k8s_namespace='my-namespace')
The support of the new scheme means the minimum supported version of jina-hubble-sdk has been increased to 0.26.10.
Add auto-reloading to Flow and Executor on file changes (#5461, #5488 and #5514)
A new reload argument has been added to the Flow and Executor APIs, which automatically reloads running Flows and Executors when changes are made to Executor source code or to the YAML configurations of Flows and Executors.
Although this feature is only meant for development, it aims to help developers iterate fast and automatically update Flows with changes they make live during development.
Find out more about this feature in the Flow and Executor sections of the documentation.
Expand Executor serve parameters (#5494)
The Executor.serve method can receive more parameters, similar to what the Flow API expects. With new parameters to control the serving and deployment configuration of the Executor, this method makes the Executor convenient for single-service tasks.
This means you can not only build advanced microservices-based pipelines and applications, but also build individual services with all Jina features: shards/replicas, dynamic batching, auto-reload, etc.
Read more about the method in the Python API documentation.
Add gRPC trailing metadata when logging gRPC error (#5512)
When logging gRPC errors, the context's trailing metadata is now shown. This helps identify underlying network issues, since a single gRPC status code can mask several distinct network errors.
For instance, the new log message looks like the following:
DEBUG gateway@ 1 GRPC call to deployment executor0 failed
with error <AioRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
...
trailing_metadata=Metadata((('content-length', '0'),
('l5d-proxy-error', 'HTTP Balancer service in
fail-fast'), ('l5d-proxy-connection', 'close'),
('date', 'Tue, 13 Dec 2022 10:20:15 GMT'))), for
retry attempt 2/3. Trying next replica, if available.
The trailing_metadata returned by load balancers will help to identify the root cause more accurately.
Implement unary_unary stub for Gateway Runtime (#5507)
This release adds the gRPC unary_unary stub to the Gateway Runtime as a new way of communicating with Executors. We added it because the gRPC performance best practices page suggests that a unary implementation might be faster than streaming for Python. However, it is not enabled by default: the streaming RPC method is still used unless you set the stream option to False in the Client.post() method. The feature only takes effect when the gRPC protocol is used.
Read more about the feature in the documentation: https://docs.jina.ai/concepts/client/send-receive-data/#use-unary-or-streaming-grpc
Add Startup Probe and replace Readiness Probe with Liveness Probe (#5407)
Before this release, when exporting Jina Flows to Kubernetes YAML configurations, Kubernetes Readiness Probes used to be added for the Gateway pod and each Executor pod. In this release we have added a Startup Probe and replaced Readiness Probe with Liveness Probe.
Both probes use the jina ping command to check that pods are healthy.
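Both probes follow the standard Kubernetes exec-probe shape. A hedged sketch of what the generated pod spec roughly contains (the thresholds, periods and exact jina ping arguments here are illustrative assumptions, not the precise YAML Jina emits):

```yaml
startupProbe:
  exec:
    command: ["jina", "ping", "executor", "127.0.0.1:8081"]  # illustrative target
  failureThreshold: 30
  periodSeconds: 5
livenessProbe:
  exec:
    command: ["jina", "ping", "executor", "127.0.0.1:8081"]  # illustrative target
  initialDelaySeconds: 30
  periodSeconds: 10
```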
New Jina perf Docker images (#5498)
We added a slightly larger Docker image with the suffix perf, which includes a set of tools useful for performance tuning and debugging.
The new image is available in Jina AI's Docker hub.
New Jina Docker image for Python 3.10, and use Python 3.8 for default Jina image (#5490)
Besides adding Docker images aimed for performance optimization, we added an image with a newer Python version: ...
💫 Release v3.12.0
Release Note (3.12.0)
Release time: 2022-11-25 12:28:29
This release contains 8 new features, 16 bug fixes and 15 documentation improvements.
🆕 Features
Support multiple protocols at the same time in Flow Gateways (#5435 and #5378)
Prior to this release, a Flow only exposed one server in its Gateway with one of the following protocols: HTTP, gRPC or WebSockets.
Now, you can specify multiple protocols and for each one, a separate server is started. Each server is bound to its own port.
For instance, you can do:
from jina import Flow
flow = Flow(port=[12345, 12346, 12347], protocol=['http', 'grpc', 'websocket'])
with flow:
    flow.block()
or: jina flow --uses flow.yml, where flow.yml is:
jtype: Flow
with:
  protocol:
    - 'grpc'
    - 'http'
    - 'websocket'
  port:
    - 12345
    - 12344
    - 12343
The protocol and port parameters can still accept single values rather than a list, so there is no breaking change.
Alias parameters protocols and ports are also defined:
flow = Flow(ports=[12345, 12346, 12347], protocols=['http', 'grpc', 'websocket'])
In Kubernetes, this exposes separate services for each protocol.
Read the docs for more information.
Add Kubernetes information to resource attributes in instrumentation (#5372)
When deploying to Kubernetes, the Gateway and Executors expose the following Kubernetes information as resource attributes in instrumentation:
- k8s.namespace.name
- k8s.pod.name
- k8s.deployment.name / k8s.statefulset.name
Besides that, the following resource attributes are set if they are present in the environment variables of the container:
- k8s.cluster.name (set via the K8S_CLUSTER_NAME environment variable)
- k8s.node.name (set via the K8S_NODE_NAME environment variable)
Add option to return requests in order using the Client (#5404)
If you use replicated Executors, those which finish processing first return their results to the Gateway which then returns them to the client. This is useful if you want results as soon as each replicated Executor finishes processing your Documents.
However, this may be inconvenient if you want the Documents you send to the Flow to return in order. In this release, you can retain the order of sent Documents (when using replicated Executors) by passing the results_in_order parameter to the Client.
For instance, if your Flow looks like this:
from jina import Flow, DocumentArray, Document
f = Flow().add(replicas=2)
You can do the following to keep results in order:
input_da = DocumentArray([Document(text=f'{i}') for i in range(100)])

with f:
    result_da = f.post('/', inputs=input_da, request_size=10, results_in_order=True)
    assert result_da[:, 'text'] == input_da[:, 'text']
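Conceptually, restoring the input order means buffering responses that arrive early until the next expected request index shows up. A minimal stdlib sketch of that idea (illustrative only, not Jina's implementation):

```python
def reorder(responses):
    """Yield results in request-index order, buffering any responses
    that arrive ahead of their turn."""
    buffer, expected = {}, 0
    for index, result in responses:
        buffer[index] = result
        while expected in buffer:
            yield buffer.pop(expected)
            expected += 1


# replicas finished in the order 2, 0, 1 -- the caller still sees 0, 1, 2
ordered = list(reorder([(2, 'c'), (0, 'a'), (1, 'b')]))
```

The trade-off is extra buffering: a slow request holds back every later one, which is why unordered delivery remains the default for throughput.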
Add docs_map parameter to Executor endpoints (#5366)
Executor endpoint signatures are extended to the following:
from typing import Dict, List, Optional, Union

from jina import Executor, requests, DocumentArray


class MyExecutor(Executor):
    @requests
    async def foo(
        self,
        docs: DocumentArray,
        parameters: Dict,
        docs_matrix: Optional[List[DocumentArray]],
        docs_map: Optional[Dict[str, DocumentArray]],
    ) -> Union[DocumentArray, Dict, None]:
        pass
Basically, the docs_map parameter has been added: a dictionary that maps previous Executor names to DocumentArrays. This is useful when an Executor combines results from many previous Executors and needs to know where each resulting DocumentArray comes from.
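As a sketch of why the mapping is useful, here is a plain-Python stand-in for a merging endpoint that records which upstream Executor produced each result (the names are hypothetical and plain lists stand in for DocumentArrays):

```python
def merge_with_provenance(docs_map):
    """Combine results from several upstream Executors, tagging each
    result with the name of the Executor that produced it."""
    merged = []
    for exec_name, docs in docs_map.items():
        for text in docs:
            merged.append({'from': exec_name, 'text': text})
    return merged


merged = merge_with_provenance({'encoder_a': ['foo'], 'encoder_b': ['bar']})
```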
Add Gateway API (#5342)
Prior to this release, all Gateway configurations were specified in the Flow API. However, by principle, Flow parameters are commonly inherited by Executors and the Gateway. We already gave the Executor its own API to be customized (either using the add() method or the executors section in Flow YAML). In this release, we have done the same for the Gateway: it defines its own API in both the Python API and the YAML interface. In the Python API, you can configure the Gateway using the config_gateway() method:
flow = Flow().config_gateway(port=12345, protocol='http')
And in the YAML interface, you can configure the Gateway using the gateway section:
jtype: Flow
gateway:
  protocol: http
  port: 12344
executors:
  - name: exec
This is useful when you want to apply parameters just for the Gateway. If you want a parameter to be applied to all Executors, then continue to use the Flow API.
Keep in mind that you can still provide Gateway parameters using the Flow API, which means there are no breaking changes.
Support UUID in CUDA_VISIBLE_DEVICES round-robin assignment (#5360)
You can specify a comma-separated list of GPU UUIDs in the CUDA_VISIBLE_DEVICES environment variable to assign devices to Executor replicas in a round-robin fashion. For instance:
CUDA_VISIBLE_DEVICES=RRGPU-0aaaaaaa-74d2-7297-d557-12771b6a79d5,GPU-0bbbbbbb-74d2-7297-d557-12771b6a79d5,GPU-0ccccccc-74d2-7297-d557-12771b6a79d5,GPU-0ddddddd-74d2-7297-d557-12771b6a79d5
Check CUDA's documentation to see the accepted formats to assign CUDA devices by UUID.
| GPU device | Replica ID |
| --- | --- |
| GPU-0aaaaaaa-74d2-7297-d557-12771b6a79d5 | 0 |
| GPU-0bbbbbbb-74d2-7297-d557-12771b6a79d5 | 1 |
| GPU-0ccccccc-74d2-7297-d557-12771b6a79d5 | 2 |
| GPU-0ddddddd-74d2-7297-d557-12771b6a79d5 | 3 |
| GPU-0aaaaaaa-74d2-7297-d557-12771b6a79d5 | 4 |
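The wrap-around shown in the table boils down to replica_id % num_devices. A plain-Python sketch of that assignment (our simplification with shortened UUIDs, not Jina's actual parsing code):

```python
def assign_devices(cuda_visible_devices: str, num_replicas: int) -> dict:
    """Map replica IDs to GPU UUIDs round-robin, given the 'RR'-prefixed
    comma-separated device list from CUDA_VISIBLE_DEVICES."""
    devices = cuda_visible_devices.removeprefix('RR').split(',')
    return {replica: devices[replica % len(devices)] for replica in range(num_replicas)}


mapping = assign_devices(
    'RRGPU-0aaaaaaa,GPU-0bbbbbbb,GPU-0ccccccc,GPU-0ddddddd', num_replicas=5
)
# replica 4 wraps around to the first GPU again
```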
Thanks to our community member @mchaker for submitting this feature request!
Capture shard failures in the head runtime (#5338)
In case you use Executor shards, partially failed requests (those that fail on a subset of the shards) no longer raise an error.
Instead, successful results are returned; an error is raised only when all shards fail to process Documents. Basically, the HeadRuntime's behavior is updated to fail only when all shards fail.
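The policy can be summarized in a few lines of plain Python (a sketch of the behavior, not the HeadRuntime code itself):

```python
def gather_shard_results(shard_outputs):
    """shard_outputs: one (result, error) pair per shard.
    Return the successful results; raise only if every shard failed."""
    successes = [result for result, error in shard_outputs if error is None]
    if not successes:
        raise RuntimeError('all shards failed')
    return successes


# a single failing shard no longer fails the whole request
partial = gather_shard_results([(['doc1'], None), (None, 'timeout')])
```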
Thanks to our community user @soumil1 for submitting this feature request.
Add successful, pending and failed metrics to HeadRuntime (#5374)
More metrics have been added to the Head Pods:
- jina_number_of_pending_requests: number of pending requests
- jina_successful_requests: number of successful requests
- jina_failed_requests: number of failed requests
- jina_received_request_bytes: the size of received requests in bytes
- jina_sent_response_bytes: the size of sent responses in bytes
See more in the instrumentation docs.
Add deployment label in grpc stub metrics (#5344)
Executor metrics used to show up aggregated at the Gateway level and users couldn't see separate metrics per Executor. With this release, we have added labels for Executors so that metrics in the Gateway can be generated per Executor or aggregated over all Executors.
🐞 Bug Fixes
Check whether the deployment is in Executor endpoints mapping (#5440)
This release adds an extra check in the Gateway when sending requests to deployments: The Gateway sends requests to the deployment only if it is in the Executor endpoint mapping.
Unblock event loop to allow health service (#5433)
Prior to this release, sync function calls inside Executor endpoints blocked the event loop. This meant that health-checks submitted to Executors failed for long tasks (for instance, inference using a large model).
In this release, such tasks no longer block the event loop. While concurrent requests to the same Executor wait until the sync task finishes, other runtime tasks remain functional, mainly health-checks.
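The fix follows the standard asyncio pattern of moving blocking work off the event loop so health-checks keep being served. A minimal stdlib sketch of that pattern (illustrative, not Jina's actual code):

```python
import asyncio
import time


def blocking_inference():
    time.sleep(0.2)  # stands in for a long synchronous model call
    return 'prediction'


async def health_check():
    return 'OK'  # must stay responsive while inference runs


async def main():
    # offload the sync endpoint to a worker thread instead of blocking the loop
    inference = asyncio.create_task(asyncio.to_thread(blocking_inference))
    # the health-check answers immediately, long before inference finishes
    health = await asyncio.wait_for(health_check(), timeout=0.05)
    return health, await inference


result = asyncio.run(main())
```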
Dump environment variables to string for Kubernetes (#5430)
Environment variables are now cast to strings before dumping them to Kubernetes YAML.
Unpin jina-hubble-sdk version (#5412)
This release unpins the jina-hubble-sdk version; the latest jina-hubble-sdk is installed with the latest Jina.
Bind servers to host argument instead of __default_host__ (#5405)
This release makes the servers in each Jina pod (head, Gateway, worker) bind to the host address specified by the user, instead of always binding to the OS's __default_host__. Depending on your network interface, this lets you restrict or expose your Flow services in your network.
For instance, if you wish to expose all pods to the internet, except for the last Executor, you can do:
flow = Flow(host='0.0.0.0').add().add(host='127.0.0.1')
After this fix, Jina respects this syntax and binds the last Executor only to 127.0.0.1 (accessible only from inside the host machine).
Thanks to @wqh17101 for reporting this issue!
Fix backoff_multiplier format when using max_attempts in the Client (#5403)
This release fixes the format of the backoff_multiplier parameter when injected into the gRPC request. The issue appeared when using the max_attempts parameter in the Client.
Maintain the correct tracing operations chain (#5391)
Tracing spans for Executors used to show up out of order. This has been fixed by using the start_as_current_span method instead of start_span, maintaining the tracing chain in the correct order.
Use Async health servicer for tracing interceptors when tracing is enabled (#5392)
When tracing is enabled, health checks in Docker and Kubernetes deployments used to fail silently until the Flow timed out. This happened because tracing interceptors expected RPC...