Ceph stress testing runs into "icechunk.IcechunkError: × session error: error writing object to object store service error" #1197

@petermkr


What happened?

When trying to run highly parallel code against a Ceph S3-compatible object store with Icechunk, the program exited with the following error message:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib64/python3.13/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ~~~~^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/username/icechunk-demo/demo/demo06_icechunk_parallel_partial_write.py", line 100, in worker
    session.commit(message)
    ~~~~~~~~~~~~~~^^^^^^^^^
  File "/home/username/.cache/pypoetry/virtualenvs/icechunk-demo-ZY7yPLKK-py3.13/lib64/python3.13/site-packages/icechunk/session.py", line 271, in commit
    return self._session.commit(
           ~~~~~~~~~~~~~~~~~~~~^
        message, metadata, rebase_with=rebase_with, rebase_tries=rebase_tries
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
icechunk.IcechunkError: × session error: error writing object to object store service error

  │ context:
  │ 0: icechunk::storage::s3::write_ref
  │      with ref_key="branch.main/ref.json" previous_version=VersionInfo { etag: Some(ETag(""7a9369dcfbc377183fef8e9d8f0bf40e"")), generation: None }
  │      at icechunk/src/storage/s3.rs:819
  │ 1: icechunk::refs::update_branch
  │      with name="main" new_snapshot=6C5N1DRCE5SP0CV9MJCG current_snapshot=Some(0GYPX3ZH15N48YG7V8K0)
  │      at icechunk/src/refs.rs:173
  │ 2: icechunk::session::_commit
  │      with Worker (init_time=18, lead_time_index=03): Wrote chunk data. rewrite_manifests=false
  │      at icechunk/src/session.rs:985
  │ 3: icechunk::session::commit
  │      with Worker (init_time=18, lead_time_index=03): Wrote chunk data.
  │      at icechunk/src/session.rs:948

"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/username/icechunk-demo/demo/demo06_icechunk_parallel_partial_write.py", line 199, in <module>
    main()
    ~~~~^^
  File "/home/username/icechunk-demo/demo/demo06_icechunk_parallel_partial_write.py", line 168, in main
    stats: list[tuple[int, int, int, int, float, float, float]] = pool.starmap(worker, tasks)
                                                                  ~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/multiprocessing/pool.py", line 375, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/multiprocessing/pool.py", line 774, in get
    raise self._value
icechunk.IcechunkError: × session error: error writing object to object store service error

  │ context:
  │ 0: icechunk::storage::s3::write_ref
  │      with ref_key="branch.main/ref.json" previous_version=VersionInfo { etag: Some(ETag(""7a9369dcfbc377183fef8e9d8f0bf40e"")), generation: None }
  │      at icechunk/src/storage/s3.rs:819
  │ 1: icechunk::refs::update_branch
  │      with name="main" new_snapshot=6C5N1DRCE5SP0CV9MJCG current_snapshot=Some(0GYPX3ZH15N48YG7V8K0)
  │      at icechunk/src/refs.rs:173
  │ 2: icechunk::session::_commit
  │      with Worker (init_time=18, lead_time_index=03): Wrote chunk data. rewrite_manifests=false
  │      at icechunk/src/session.rs:985
  │ 3: icechunk::session::commit
  │      with Worker (init_time=18, lead_time_index=03): Wrote chunk data.
  │      at icechunk/src/session.rs:948

What did you expect to happen?
The commit should succeed against a Ceph Object Gateway backend.

Minimal Complete Verifiable Example
Since I have a concrete idea of where things go wrong (see the analysis below), I have omitted the example. If you need it, just let me know.
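Even without the full demo, the access pattern that triggers the error can be sketched. Below is a minimal, self-contained simulation (all names hypothetical; the real S3 branch ref is replaced by an in-memory stand-in) of many workers doing read-modify-write commits on a single branch ref via conditional writes. The point is that under contention a failed conditional write is an expected outcome that must be retried, not a hard error:

```python
import threading

class FakeRefStore:
    """Hypothetical in-memory stand-in for the branch ref object
    (branch.main/ref.json): a conditional put that only succeeds
    when the caller's expected version still matches."""
    def __init__(self):
        self.value = None
        self.version = 0
        self._lock = threading.Lock()

    def conditional_put(self, expected_version, new_value):
        with self._lock:
            if self.version != expected_version:
                return False  # analogous to an HTTP 409/412 from the store
            self.value = new_value
            self.version += 1
            return True

def commit(store, snapshot):
    # Read-modify-write with retry on conflict, roughly how a commit
    # loop has to behave on top of conditional writes.
    while True:
        seen = store.version
        if store.conditional_put(seen, snapshot):
            return

store = FakeRefStore()
threads = [threading.Thread(target=commit, args=(store, f"snap-{i}"))
           for i in range(16)]
for t in threads: t.start()
for t in threads: t.join()
print(store.version)  # 16 -- every commit lands exactly once
```

If a conflict were instead raised as an error (as happens on Ceph, per the analysis below under "Anything else"), some of these workers would abort rather than retry.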

Anything else we need to know?
I have analyzed the problem and found the following:

In this code (part of write_ref in s3.rs),

        match res {
            Ok(_) => Ok(WriteRefResult::Written),
            Err(err) => {
                let code = err.as_service_error().and_then(|e| e.code()).unwrap_or("");
                if code.contains("PreconditionFailed")
                    || code.contains("ConditionalRequestConflict")
                {
                    Ok(WriteRefResult::WontOverwrite)
                } else {
                    Err(Box::new(err).into())
                }
            }
        }

there are exactly two error codes that cause the failed conditional write to be classified as a benign conflict (WriteRefResult::WontOverwrite) and retried; anything else is propagated as a hard error.
While Ceph correctly returns PreconditionFailed for HTTP 412, it uses the code ConcurrentModification for HTTP 409, as verified with Wireshark:

[Wireshark capture: Ceph RGW responds with HTTP 409 and error code ConcurrentModification]

This means that on Ceph the 409 response takes the "real error path" and the commit aborts instead of being retried.

If you agree with this analysis, would it be possible to add ConcurrentModification to the or-condition in the code above, or, more generally, to treat any HTTP 409 response as a conflict?
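For illustration, here is the proposed relaxed classification as a Python sketch of the Rust match above. The status/code pairs are taken from this report; treating any 409 or 412 as a conflict is the suggestion, not current icechunk behavior, and the function name is hypothetical:

```python
def is_conditional_write_conflict(http_status: int, error_code: str) -> bool:
    """Sketch: classify a failed conditional write as a retryable
    conflict (WontOverwrite) rather than a hard error."""
    # Generic check: both 409 (Conflict) and 412 (Precondition Failed)
    # signal that another writer won the race, regardless of the
    # vendor-specific error code string.
    if http_status in (409, 412):
        return True
    # Fallback on the error code for clients that surface only the code:
    return error_code in (
        "PreconditionFailed",          # AWS S3 and Ceph, HTTP 412
        "ConditionalRequestConflict",  # AWS S3, HTTP 409
        "ConcurrentModification",      # Ceph RGW, HTTP 409 (this report)
    )

print(is_conditional_write_conflict(409, "ConcurrentModification"))  # True
print(is_conditional_write_conflict(500, "InternalError"))           # False
```

Checking the HTTP status first would make the code robust against further vendor-specific code strings from other S3-compatible stores.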

Environment
platform: Fedora Linux 42 (Workstation Edition) (6.15.10-200.fc42.x86_64)
python: 3.13.7
icechunk: 1.1.4
zarr: 3.1.0
numcodecs: 0.16.2
xarray: 2025.8.0

Thanks for the great tool!
