Changelog
All notable user-facing changes to dagster-ray will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.4.0
This release introduces Cluster Sharing, a new feature that is especially useful in dev environments. Cluster sharing allows reusing existing RayCluster resources created by previous Dagster steps. It is implemented for the KubeRayCluster Dagster resource. This feature enables faster iteration and reduces infrastructure costs (at the expense of job isolation). KubeRayCluster is therefore now recommended over KubeRayInteractiveJob for use in dev environments.
Learn more in the Cluster Sharing docs.
Added
- KubeRayCluster.cluster_sharing parameter that controls cluster sharing behavior.
- dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters sensor that cleans up expired clusters (both shared and non-shared). Learn more in docs.
- dagster-ray entry now appears in the Dagster libraries list in the web UI.
Changed
- [breaking] removed cleanup_kuberay_clusters_op and other associated definitions in favor of the more flexible dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters sensor.
0.3.1
Added
- failure_tolerance_timeout configuration parameter for KubeRayInteractiveJob and KubeRayCluster. It can be set to a positive value to give the cluster some time to transition out of the failed state (which can be transient in some scenarios) before raising an error.
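The idea behind failure_tolerance_timeout can be illustrated with a small polling loop. This is a minimal sketch and not dagster-ray's actual implementation: the function name wait_until_ready, the state strings, and the injectable clock and sleep arguments are all hypothetical and exist only to show the tolerance logic.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    get_state: Callable[[], str],
    timeout: float,
    failure_tolerance_timeout: float = 0.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_state() until it returns "ready" (hypothetical helper).

    A "failed" state raises only after it has persisted for at least
    failure_tolerance_timeout seconds; leaving "failed" resets the timer.
    """
    start = clock()
    failed_since: Optional[float] = None  # start of the current run of "failed" states

    while True:
        now = clock()
        state = get_state()

        if state == "ready":
            return

        if state == "failed":
            if failed_since is None:
                failed_since = now
            if now - failed_since >= failure_tolerance_timeout:
                raise RuntimeError("cluster stayed in the failed state for too long")
        else:
            failed_since = None  # the failure turned out to be transient

        if now - start >= timeout:
            raise TimeoutError("cluster did not become ready in time")

        sleep(poll_interval)
```

With the default of 0.0 the first observed failed state raises immediately; a positive value gives the cluster that many seconds to recover.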
Fixes
- ensure both .head.serviceIP and .head.serviceName are set on the RayCluster while waiting for cluster readiness.
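A readiness check of this kind might look roughly like the sketch below. It assumes the RayCluster object is available as a plain dictionary and that the two fields live under status.head; the helper name head_service_is_set is hypothetical and is not part of dagster-ray.

```python
from typing import Any, Mapping


def head_service_is_set(raycluster: Mapping[str, Any]) -> bool:
    """Return True only if both head service fields are present and non-empty.

    Assumes the fields are located at status.head.serviceIP and
    status.head.serviceName of the RayCluster object.
    """
    head = (raycluster.get("status") or {}).get("head") or {}
    return bool(head.get("serviceIP")) and bool(head.get("serviceName"))
```

Checking both fields avoids treating a cluster as ready when only one of them has been populated so far.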
0.3.0
This release includes massive docs improvements and drops support for Python 3.9.
Changes
- [breaking] dropped Python 3.9 support (EOL October 2025).
- [internal] most of the general, backend-agnostic code has been moved to dagster_ray.core (top-level imports still work).
0.2.1
Fixes
- Fixed broken wheel on PyPI.
0.2.0
Changed
- KubeRayInteractiveJob.deletion_strategy now defaults to DeleteCluster for both successful and failed executions. This is a reasonable default for the use case.
- KubeRayInteractiveJob.ttl_seconds_after_finished now defaults to 600 seconds.
- KubeRayCluster.lifecycle.cleanup now defaults to always.
- [breaking] the Kubernetes init parameters of RayJob and RayCluster clients and resources have been renamed to kube_config and kube_context.
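For orientation, the new KubeRayInteractiveJob defaults roughly correspond to the RayJob spec fragment sketched below. This is an illustration under assumptions: the helper rayjob_cleanup_spec is hypothetical, the nested onSuccess/onFailure/policy layout reflects one version of the KubeRay RayJob API and may differ in yours, and how dagster-ray actually maps its parameters onto the manifest is not shown here.

```python
from typing import Any, Dict


def rayjob_cleanup_spec(
    ttl_seconds_after_finished: int = 600,
    policy: str = "DeleteCluster",
) -> Dict[str, Any]:
    """Build a RayJob spec fragment with the cleanup defaults described above.

    The deletionStrategy layout is an assumption about the KubeRay API.
    """
    return {
        "ttlSecondsAfterFinished": ttl_seconds_after_finished,
        "deletionStrategy": {
            # The same policy is applied to successful and failed executions.
            "onSuccess": {"policy": policy},
            "onFailure": {"policy": policy},
        },
    }
```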
Added
- enable_legacy_debugger configuration parameter to subclasses of RayResource.
- on_exception option for lifecycle.cleanup policy. It's triggered during resource setup/cleanup (including KeyboardInterrupt), but not by user @op/@asset code.
- KubeRayInteractiveJob now respects lifecycle.cleanup. It defaults to on_exception. Users are advised to rely on built-in RayJob cleanup mechanisms, such as ttlSecondsAfterFinished and deletionStrategy.
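The difference between the always and on_exception cleanup policies can be sketched with a context manager. This is a simplified illustration, not dagster-ray's code: managed_cluster and its setup and cleanup callables are hypothetical, and the sketch only covers errors raised during setup, not errors raised during cleanup itself.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def managed_cluster(
    policy: str,
    setup: Callable[[], None],
    cleanup: Callable[[], None],
) -> Iterator[None]:
    """Run setup, yield to user code, and apply the cleanup policy (sketch)."""
    if policy not in ("always", "on_exception"):
        raise ValueError(f"unsupported cleanup policy: {policy!r}")

    try:
        setup()
    except BaseException:  # BaseException also covers KeyboardInterrupt
        # A failure inside resource setup triggers cleanup under both policies.
        cleanup()
        raise

    try:
        yield  # user @op/@asset code would run here
    finally:
        # Errors raised by user code do not count for "on_exception",
        # so only "always" cleans up at this point.
        if policy == "always":
            cleanup()
```

Under on_exception, a cluster that started successfully is left alone after the step finishes, which is why built-in RayJob mechanisms are the recommended way to clean it up.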
Fixes
- removed ignore_reinit_error from RayResource init options: it's potentially dangerous, for example in case the user has accidentally connected to another Ray cluster (including local ray) before initializing the resource.
0.1.0
Changed
- [breaking] RayResource: top-level skip_init and skip_setup configuration parameters have been removed. The lifecycle field is the new way of configuring steps performed during resource initialization. KubeRayCluster's skip_cleanup has been moved to lifecycle as well.
- [breaking] injected dagster.io/run_id Kubernetes label has been renamed to dagster/run-id. Keys starting with dagster.io/ have been converted to dagster/ to match how dagster-k8s does it.
- [breaking] dagster_ray.kuberay configurations have been unified with KubeRay APIs.
- dagster-ray now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as git-sha).
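Anything that selects resources by the old label keys (label selectors, dashboards, cleanup scripts) needs the same renaming. The helper below is a hypothetical migration sketch, not part of dagster-ray. It swaps the prefix as described above; replacing underscores with hyphens in the rest of the key is an assumption generalized from the single run_id example.

```python
OLD_PREFIX = "dagster.io/"
NEW_PREFIX = "dagster/"


def convert_label_key(key: str) -> str:
    """Map an old-style label key to its new form; other keys are returned unchanged."""
    if not key.startswith(OLD_PREFIX):
        return key
    name = key[len(OLD_PREFIX):]
    # Assumption: underscores become hyphens, as in run_id -> run-id.
    return NEW_PREFIX + name.replace("_", "-")


old_labels = {"dagster.io/run_id": "abc123", "app": "ray"}
new_labels = {convert_label_key(k): v for k, v in old_labels.items()}
```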
Added
- KubeRayInteractiveJob -- a resource that utilizes the new InteractiveMode for RayJob. It can be used to connect to Ray in Client mode -- like KubeRayCluster -- but gives access to RayJob features, such as automatic cleanup (ttlSecondsAfterFinished), retries (backoffLimit) and timeouts (activeDeadlineSeconds).
- RayResource setup lifecycle has been overhauled: resources now have an actions parameter with 3 configuration options: create, wait and connect. The user can disable them and run .create(), .wait() and .connect() manually if needed.