Releases: kubeflow/arena

v0.9.10

17 Nov 12:24
4b5c18c

Release 0.9.10

Changed

  • Fix --data-dir not taking effect in custom-serving.
  • Fix the prompt content shown when submitting a serve job.
  • Default delete secret permissions in et-operator.
  • Enable secret creation for deepspeedjob and etjob.

Please follow the Get started Guide to install.

v0.9.9

24 Oct 12:44
516d8cb

Release 0.9.9

Changed

  • Update SDK and Java SDK unit tests.
  • Fix panic when a pod fails to start.
  • Support setting the image pull policy for jobs.
  • Support new training type deepspeed.
  • Fix evaluator node selector.
  • Fix serve update creating duplicate env and toleration entries.

Please follow the Get started Guide to install.

v0.9.8

18 Oct 11:24
cd1f02e

Release 0.9.8

Changed

  • Support setting ttlAfterFinished for Cron tfjob.
  • Add DeepSpeed base image Dockerfile.
  • Move policy v1beta1 to v1.
  • Fix evaluatejob YAML in charts.

Please follow the Get started Guide to install.

v0.9.7

17 Oct 01:41
b58010a

Release 0.9.7

Changed

  • Support setting TTLSecondsAfterFinished in Builder.

Please follow the Get started Guide to install.

v0.9.6

13 Oct 08:23
b3c2c7f

Release 0.9.6

Changed

  • Add ownerReference for configmap and tensorboard.

Please follow the Get started Guide to install.

v0.9.5

18 Sep 08:07
09a5715

Release 0.9.5

Changed

  • Add imagePullSecret and shareMemory for arena serve.
  • Support TTLSecondsAfterFinished.
  • Support TFJob StartingDeadlineSeconds.
  • Support TFJob/PytorchJob ActiveDeadlineSeconds.

Please follow the Get started Guide to install.

v0.9.4

14 Sep 08:06
f7e889e

Release 0.9.4

Changed

  • Fix serve update when limits is null.
  • Fix arena serve update bug.
  • Fix model serve args bug.
  • Add toleration dedup for arena serve update.

Please follow the Get started Guide to install.

v0.9.3

04 Sep 02:16
e195c23

Release 0.9.3

Changed

  • Optimize arena submit etjob for spot.
  • Fix arena serve update bug.
  • Fix model serve args bug.
  • Add toleration dedup for arena serve update.

Please follow the Get started Guide to install.

v0.9.2

31 Aug 03:06
3c0c15e

Release 0.9.2

Fixed

  • Fix serve triton bugs.

Changed

  • Skip updating CRDs when upgrading arena-artifacts.
  • Modify the support method of Toleration.

Added

  • Update images and support clean all policy for tfjob.
  • Support the submit parameters useHostNetwork, useHostIPC, and useHostPID.
  • Support arena serve update.
  • Support custom scheduler name.

Please follow the Get started Guide to install.

v0.9.1

30 Aug 02:51
06ee271

Release 0.9.1

Fixed

  • Fix the bug that caused pytorchjob to fail to run with RDMA.
  • Fix the bug that displayed GPU core resources on nodes incorrectly.
  • Fix the bug that added evaluator and tensorboard to the pod group.

Changed

  • Refactor installation.
  • Rename restful-serving to http-serving in deployment services.
  • Optimize the operators to skip adding Completed jobs to the queue.

Added

  • Adapt modeljob to support helm3.
  • Cron workload supports custom labels.
  • Java SDK supports submitting training jobs with --label.
  • Add resource limits for tfjob.
  • Add subpathexpr for job.

Please follow the Get started Guide to install.