Release Highlights
- Ray Data: This release features a new Delta Lake and Unity Catalog integration, plus performance improvements to various read/write operators.
- Ray Core: Enhanced GPU object support with intra-process communication and improved Autoscaler v2 functionality.
- Ray Train: Improved hardware metrics integration with Grafana and enhanced collective operations support.
- Ray Serve LLM: This release features an early proof of concept for prefill-decode disaggregated deployment and LLM-aware request routing, such as prefix-cache-aware routing.
- Ray Data LLM: Improved throughput and CPU memory utilization for Ray Data workers.
Ray Libraries
Ray Data
🎉 New Features:
- Add reading from Delta Lake tables and Unity Catalog integration (#53701)
- Added pin_memory support to iter_torch_batches (#53792) (see the sketch after this list)
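For illustration, a minimal sketch of the new pin_memory option; the parameter name comes from the release note above, while the dataset and batch size are placeholders:

```python
import ray

# Placeholder toy dataset; any Ray Dataset works here.
ds = ray.data.range(1_000)

# pin_memory=True (new per #53792) yields batches in pinned (page-locked) host
# memory, which can speed up subsequent host-to-GPU copies.
for batch in ds.iter_torch_batches(batch_size=256, pin_memory=True):
    pass  # e.g. move tensors to a CUDA device and run a training step
```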
💫 Enhancements:
- Re-enabled sorting in Ray Data tests with performance improvements (#54475)
- Enhanced handling of mismatched columns and pandas.NA values (#53861, #53859)
- Improved read_text trailing newline semantics (#53860)
- Optimized backpressure handling with policy-based resource management (#54376)
- Enhanced write_parquet with support for both partition_by and row limits (#53930)
- Prevent filename collisions on write operations (#53890)
- Improved execution performance for One Hot encoding in preprocessors (#54022)
🔨 Fixes:
- Fixed map_groups issues (#54462)
- Prevented Op fusion for streaming repartition to avoid performance degradation (#54469)
- Fixed ActorPool autoscaler scaling up logic (#53983)
- Resolved empty dataset repartitioning issues (#54107)
- Fixed PyArrow overflow handling in data processing (#53971, #54390)
- Fixed IcebergDatasink to properly generate individual file uuids (#52956)
- Avoid OOMs with read_json(..., lines=True) (#54436) (see the sketch after this list)
- Handle HuggingFace parquet dataset resolve URLs (#54146)
- Fixed BlockMetadata derivation for Read operator (#53908)
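A minimal sketch of the call referenced in the OOM fix above; the path is a placeholder, and lines=True follows the wording of #54436:

```python
import ray

# Read newline-delimited JSON (JSONL). With lines=True the input is parsed
# line by line rather than loading whole files at once, which is the scenario
# the OOM fix (#54436) addresses.
ds = ray.data.read_json("s3://example-bucket/logs/*.jsonl", lines=True)
print(ds.schema())
```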
📖 Documentation:
- Updated AggregateFnV2 documentation to clarify finalize method (#53835)
- Improved preprocessor and vectorizer API documentation
Ray Train
🎉 New Features:
- Added broadcast_from_rank_zero and barrier collective operations (#54066) (see the sketch after this list)
- Enhanced hardware metrics integration with Grafana dashboards (#53218)
- Added support for dynamically loading callbacks via environment variables (#54233)
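A minimal sketch of the new collectives inside a training function; the module path follows the ray.train.collective docs added in #54340, while the exact call signatures are assumptions:

```python
import ray.train
import ray.train.collective
import ray.train.torch


def train_func():
    rank = ray.train.get_context().get_world_rank()

    # Rank 0 builds a small Python object; broadcast_from_rank_zero is assumed
    # to return the broadcast value on every rank.
    payload = {"vocab_size": 32_000} if rank == 0 else None
    payload = ray.train.collective.broadcast_from_rank_zero(payload)

    # Block until all workers reach this point before continuing.
    ray.train.collective.barrier()
    # ... run the training loop using `payload` ...


trainer = ray.train.torch.TorchTrainer(
    train_func,
    scaling_config=ray.train.ScalingConfig(num_workers=2),
)
trainer.fit()
```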
💫 Enhancements:
- Improved checkpoint population from before_init_train_context (#54453)
- Enhanced controller state logging and metrics (#52805)
- Added structured logging environment variable support (#52952)
- Improved handling of Noop scaling decisions for smoother scaling logic (#53180)
- Logging of controller state transitions to aid in debugging and analysis (#53344)
🔨 Fixes:
- Fixed GPU tensor reporting in ray.train.report (#53725)
- Enhanced move_tensors_to_device utility for complex tensor structures (#53109)
- Improved worker health check error handling with trace information (#53626)
- Fixed GPU transfer support for non-contiguous tensors (#52548)
- Force abort on SIGINT spam and do not abort finished runs (#54188)
📖 Documentation:
- Updated beginner PyTorch example (#54124)
- Added documentation for ray.train.collective APIs (#54340)
- Added a note about PyTorch DataLoader's multiprocessing and forkserver usage (#52924)
- Fixed various docstring format and indentation issues (#52855, #52878)
- Noted the optional checkpoint_dir_name argument in the ray.train.report API docs (#54391)
🏗 Architecture refactoring:
- Removed subclass relationship between RunConfig and RunConfigV1 (#54293)
- Enhanced error handling for finished training runs (#54188)
- Deduplicated ML doctest runners in CI for efficiency (#53157)
- Converted isort configuration to Ruff for consistency (#52869)
Ray Tune
💫 Enhancements:
- Updated test_train_v2_integration to use the correct RunConfig (#52882)
🔨 Fixes:
- Fixed RayTaskError serialization logic (#54396)
- Improved experiment restore timeout handling (#53387)
📖 Documentation:
- Replaced session.report with tune.report and corrected import paths (#52801) (see the sketch after this list)
- Removed outdated graphics cards reference in docs (#52922)
- Fixed various docstring format issues (#52879)
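For reference, a minimal sketch of the updated reporting call; the objective and search space are placeholders, and the exact tune.report signature may vary by Ray version:

```python
from ray import tune


def objective(config):
    score = config["x"] ** 2
    # Report metrics with tune.report(...) instead of the older session.report(...).
    tune.report({"score": score})


tuner = tune.Tuner(objective, param_space={"x": tune.grid_search([1, 2, 3])})
results = tuner.fit()
print(results.get_best_result(metric="score", mode="min").metrics)
```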
Ray Serve
🎉 New Features:
- Added RouterConfig field to DeploymentConfig for custom RequestRouter configuration (#53870)
- Added support for implementing custom request routing algorithms (#53251)
💫 Enhancements:
- Enhanced FastAPI ingress deployment validation for multiple deployments (#53647)
- Optimized get_live_deployments performance (#54454)
- Progress towards making ray.serve.llm compatible with vLLM serve frontend (#54481, #54443, #54440)
🔨 Fixes:
- Fixed deployment scheduler issues with component scheduling (#54479)
- Fixed runtime_env validation for py_modules (#53186)
- Added descriptive error message when deployment name is not found (#45181)
📖 Documentation:
- Added troubleshooting guide for DeepSeek/multi-node GPU deployment on KubeRay (#54229)
- Updated the guide on serving models with Triton Server in Ray Serve
- Added documentation for custom request routing algorithms (#53511)
🏗 Architecture refactoring:
- Remove indirection layers of node initialization (#54481)
- Incremental refactor of LLMEngine (#54443)
- Remove random v0 logic from serve endpoints (#54440)
- Remove usage of internal_api.memory_summary() (#54417)
- Remove usage of ray._private.state (#54140)
Ray Serve/Data LLM
🎉 New Features:
- Support separate deployment config for PDProxy in PrefixAwareReplicaSet (#53935)
- Support for prefix-aware request router (#52725)
💫 Enhancements:
- Log engine stats after each batch task completes (#54360)
- Decoupled max_tasks_in_flight from max_concurrent_batches (#54362) (see the sketch after this list)
- Make llm serve endpoints compatible with vLLM serve frontend, including streaming, tool_code, and health check support (#54440)
- Remove botocore dependency in Ray Serve LLM (#54156)
- Update vLLM version to 0.9.2 (#54407)
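For context, a minimal Ray Data LLM batch-inference sketch of the pipeline these changes apply to; the model name and parameter values are placeholders, and the exact config fields may differ by version:

```python
import ray
from ray.data.llm import build_llm_processor, vLLMEngineProcessorConfig

# Illustrative config only: batch_size controls rows per batch task and
# concurrency controls the number of vLLM engine replicas.
config = vLLMEngineProcessorConfig(
    model_source="Qwen/Qwen2.5-0.5B-Instruct",
    batch_size=32,
    concurrency=1,
)

processor = build_llm_processor(
    config,
    preprocess=lambda row: dict(
        messages=[{"role": "user", "content": row["prompt"]}],
        sampling_params=dict(max_tokens=64),
    ),
    postprocess=lambda row: dict(answer=row["generated_text"]),
)

ds = ray.data.from_items([{"prompt": "What is Ray?"}])
ds = processor(ds)
ds.show()
```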
🔨 Fixes:
- Fix health check in prefill disagg (#53937)
- Fixed docs to reflect that only int concurrency is supported (#54196)
- Fix vLLM batch test by changing to Pixtral (#53744)
- Fix pickle error with remote code models in vLLM Ray workloads (#53868)
- Adapted to the change in vllm.PoolingOutput (#54467)
📖 Documentation:
- Fixed the Ray Serve LoRA documentation (#53553)
- Added Ray Serve LLM documentation (#52832)
- Added a doc snippet describing behavioral differences between vLLM and Ray Serve LLM in some APIs, such as streaming, tool_code, and health checks (#54123)
- Added a troubleshooting guide for DeepSeek/multi-node GPU deployment on KubeRay (#54229)
🏗 Architecture refactoring:
- Make llm serve endpoints compatible with vLLM serve frontend, including streaming, tool_code, and health check support (#54490)
- Prefix-aware scheduler [2/N]: configured PrefixAwareReplicaSet to correctly handle the number of available GPUs for each worker and ensure efficient GPU utilization in vLLM (#53192)
- Organized spread-out utils.py code (#53722)
- Removed the ImageRetriever class and related tests from the LLM serving codebase (#54018)
- Return a batch of rows from the UDF instead of processing rows one by one (#54329)
RLlib
🎉 New Features:
- Implemented Offline Policy Evaluation (OPE) via Importance Sampling (#53702)
- Enhanced ConnectorV2 ObservationPreprocessor APIs with multi-agent support (#54209)
- Add GPU inference to offline evaluation (#52718)
💫 Enhancements:
- Enhanced MetricsLogger to handle tensors in state management (#53514)
- Improved env seeding in EnvRunners with deterministic training example rewrite (#54039)
- Cleanup of meta learning classes and examples (#52680)
🔨 Fixes:
- Fixed EnvRunner restoration when no local EnvRunner is available (#54091)
- Fixed shapes in explained_variance for recurrent policies (#54005)
- Resolved device check issues in Learner implementation (#53706)
- Enhanced numerical stability in MeanStdFilter (#53484)
- Fixed weight syncing in offline evaluation (#52757)
- Fixed bug in split_and_zero_pad utility function (#52818)
📖 Documentation:
- Reworked the examples for connector pipelines (#52604)
- Remove "new API stack" banner from all RLlib docs pages as it's now the default (#54282)
Ray Core
🎉 New Features:
- Enhanced GPU object support with intra-process communication (#53798)
- Integrated single-controller collective APIs with GPU objects (#53720)
- Added support for ray.get on driver process for GPU objects (#53902)
- Supported allreduce on a list of input nodes in compiled graphs (#51047)
- Add single-controller API for ray.util.collective and torch gloo backend (#53319)
💫 Enhancements:
- Improved Autoscaler v2 functionality by reusing cloud instance IDs (#54397)
- Enhanced the cluster task manager with better resource management, using std::move in its constructor (#54413)
- Upgraded OpenTelemetry SDK for better observability (#53745)
- Improved actor scheduling to prevent deadlocks in ordered actors (#54034)
- Enhanced get_max_resources_from_cluster_config functionality (#54455)
- Improve status messages and add comments about stale seq_no handling (#54470)
- uv run integration is now enabled by default (#53060)
🔨 Fixes:
- Fixed race conditions in object eviction and repinning for recovery (#53934)
- Resolved GCS crash issues on duplicate MarkJobFinished RPCs (#53951)
- Enhanced actor restart handling on node failures (#54088)
- Improved reference counting during worker graceful shutdown (#53002)
- Fix race condition when canceling task that hasn't started yet (#52703)
- Fixed an issue where a valid RestartActor RPC was ignored (#53330)
- Fixed "Check failed: it->second.num_retries_left == -1" error (#54116)
- Fix detached actor being unexpectedly killed (#53562)
📖 Documentation:
- Enhanced troubleshooting guides and API documentation
- Updated reStructuredText formatting on Resources page (#53882)
- Fixed code snippets so they work as written (#52748)
- Add doc for running KubeRay dashboard (#53830)
- Add antipattern for nested ray.get (#43184)
🏗 Architecture refactoring:
- Delete old skipped tests and unused code (#54427)
- Consolidate TaskManager interface (#54317)
- Move dependencies of NodeManager to main.cc for better testability (#53782)
- Use smart pointer in logging.cc (#54351)
- Delete event_label and unused environment variables (#54378, #54095)
- Remove actor task path in normal task submitter (#53996)
- Rename GcsFunctionManager and use fake in test (#53973)
Dashboard
🎉 New Features:
- Add dynolog for on-demand GPU profiling for Torch training (#53191)
💫 Enhancements:
- Added TPU usage metrics to reporter agent (#53678)
- Added configurability of the 'orgId' parameter for requesting Grafana dashboards (#53236)
🔨 Fixes:
- Fixed Grafana dashboard dropdowns for data and train dashboards (#52752)
- Resolved daylight savings time issues in dashboard (#52755)
- Fixed retrieving the IP address from the GPUProfilingManager on the dashboard agent (#53807)
Docs
🎉 New Features:
- New end-to-end examples
Breaking Changes
- Removed deprecated ray.workflow package (#53612)
- Removed deprecated storage parameter from ray.init (#53669)
- Removed deprecated ray start CLI options (#53675)
- Removed experimental "array" library (#54105)
- Removed dask from the byod 3.9 dependencies (#54521)
Dependencies & Build
- Added uv binary v0.7.19 for improved package management (#54437)
- Upgraded datasets in release tests (#54425)
- Enhanced wheel building process with single bazel call optimization (#54476)
- Fixed uv run parser for handling extra arguments (#54488)
- Upgraded h11, requests, starlette, jinja2, pyopenssl, and cryptography
- Generate multi-arch image indexes (#52816)
Thanks!
Thank you to everyone who contributed to this release!
@kouroshHakha, @davidwagnerkc, @MengjinYan, @minerharry, @simonsays1980, @Myasuka, @noemotiovon, @goutamvenkat-anyscale, @harshit-anyscale, @jugalshah291, @tianyi-ge, @sven1977, @crypdick, @JohnsonKuan, @lk-chen, @richardsliu, @alexeykudinkin, @EagleLo, @soffer-anyscale, @zcin, @AdrienVannson, @nilsmelchert, @raulchen, @jujipotle, @DrehanM, @vigneshka, @Ziy1-Tan, @Blaze-DSP, @ArthurBook, @GokuMohandas, @walkoss, @bveeramani, @edoakes, @omatthew98, @SeanQuant, @CheyuWu, @cszhu, @win5923, @kevin85421, @angelinalg, @iamjustinhsu, @eicherseiji, @kunling-anyscale, @vickytsang, @MortalHappiness, @aslonnie, @psr-ai, @sbhat98, @anyadontfly, @marwan116, @cristianjd, @2niuhe, @codope, @fscnick, @ryanaoleary, @srinathk10, @TimothySeah, @han-steve, @Future-Outlier, @Syulin7, @Qiaolin-Yu, @elliot-barn, @JoshKarpel, @dayshah, @can-anyscale, @ok-scale, @mattip, @SolitaryThinker, @owenowenisme, @nehiljain, @GeneDer, @rnkrtt, @israbbani, @DriverSong, @sinalallsite, @pcmoritz, @akyang-anyscale, @xinyuangui2, @nrghosh, @davidxia, @rueian, @stephanie-wang, @jjyao, @chris-ray-zhang, @czgdp1807, @justinvyu, @Daraan, @landscapepainter, @troychiu, @khluu, @hipudding, @ruisearch42, @robertnishihara, @ArturNiederfahrenhorst, @abrarsheikh, @alanwguo, @HollowMan6, @ran1995data, @matthewdeng