Releases: kubeflow/arena
v0.9.10
Release 0.9.10
Changed
- Fix --data-dir not taking effect in custom-serving.
- Fix the prompt content when submitting a serve job.
- Default delete secret permissions in et-operator.
- Enable secret creation for deepspeedjob and etjob.
Please follow the Get started Guide to install.
v0.9.9
Release 0.9.9
Changed
- Update SDK and Java SDK unit tests.
- Fix panic when a pod fails to start.
- Support setting the image pull policy for jobs.
- Support the new training type deepspeed.
- Fix evaluator node selector.
- Fix serve update creating duplicate env and toleration entries.
Please follow the Get started Guide to install.
v0.9.8
Release 0.9.8
Changed
- Support setting ttlAfterFinished for Cron tfjob.
- Add DeepSpeed base image dockerfile.
- Move policy v1beta1 to v1.
- Fix evaluatejob job yaml in charts.
Please follow the Get started Guide to install.
v0.9.7
Release 0.9.7
Changed
- Support setting TTLSecondsAfterFinished in Builder.
Please follow the Get started Guide to install.
v0.9.6
Release 0.9.6
Changed
- Add ownerReference for configmap and tensorboard.
Please follow the Get started Guide to install.
v0.9.5
Release 0.9.5
Changed
- Add imagePullSecret and shareMemory for arena serve.
- Support TTLSecondsAfterFinished.
- Support TFJob StartingDeadlineSeconds.
- Support TFJob/PytorchJob ActiveDeadlineSeconds.
Please follow the Get started Guide to install.
v0.9.4
Release 0.9.4
Changed
- Fix serve update when limits is null.
- Fix arena serve update bug.
- Fix model serve args bug.
- Add toleration dedup for arena serve update.
Please follow the Get started Guide to install.
v0.9.3
Release 0.9.3
Changed
- Optimize arena submit etjob for spot.
- Fix arena serve update bug.
- Fix model serve args bug.
- Add toleration dedup for arena serve update.
Please follow the Get started Guide to install.
v0.9.2
Release 0.9.2
Fixed
- Fix serve triton bugs.
Changed
- Skip updating CRDs when upgrading arena-artifacts.
- Modify the support method of Toleration.
Added
- Update images and support clean all policy for tfjob.
- Support submit parameters useHostNetwork, useHostIPC, and useHostPID.
- Support for arena serve update.
- Support custom scheduler name.
Please follow the Get started Guide to install.
v0.9.1
Release 0.9.1
Fixed
- Fix the bug where pytorchjob failed to run with RDMA.
- Fix the bug where GPU core resources on nodes were displayed incorrectly.
- Fix the bug where evaluator and tensorboard were added to the pod group.
Changed
- Refactor installation.
- Change restful-serving to http-serving for deployment services.
- Optimize the operators so that Completed jobs are not added to the queue.
Added
- Adapt modeljob to support helm3.
- Cron workload supports custom labels.
- Java SDK submits training job with --label.
- Add resource limits for tfjob.
- Add subpathexpr for job.
Please follow the Get started Guide to install.