Description
The recent dulwich release (0.23.1) has started breaking DVC.
https://github.com/iterative/dvc/actions/runs/15990598700/job/45103035168#step:6:177
DVC uses multiple Git backends (dulwich, pygit2, and GitPython) to cover feature gaps across implementations and sometimes for performance reasons.
I bisected the issue to commit 9596a2c, which started writing index extensions even when their extension data is empty.
Lines 795 to 798 in 9596a2c
# Write extensions
if extensions:
    for extension in extensions:
        write_index_extension(f, extension)
The index format written by dulwich appears to be technically valid per the index-format spec, and Git handles it without issue. However, pygit2/libgit2 seems to expect the Cache Tree extension to contain at least one entry.
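To make the on-disk difference concrete, here is a minimal sketch of what an empty Cache Tree extension looks like in bytes. It assumes the layout described in the index-format documentation: a 4-byte signature, a 32-bit big-endian size, then the data. The helper name `encode_index_extension` is hypothetical and is not part of dulwich.

```python
import struct


def encode_index_extension(signature: bytes, data: bytes) -> bytes:
    """Hypothetical helper: signature + 32-bit big-endian size + data."""
    if len(signature) != 4:
        raise ValueError("extension signature must be exactly 4 bytes")
    return signature + struct.pack(">I", len(data)) + data


# An extension with no payload is just the signature and a zero size.
empty_tree = encode_index_extension(b"TREE", b"")
print(empty_tree)
```

An extension encoded this way is eight bytes long and carries no cache tree entries, which appears to be the case libgit2 rejects.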
Relevant code for reference:
I understand that this isn’t strictly a bug in dulwich, since the format is spec-compliant. Still, would you be open to a patch that simply skips writing an index extension when its data is empty?
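As a rough illustration of the idea (not the actual patch), the sketch below skips any extension whose payload is empty. `FakeExtension` and `write_extensions_skipping_empty` are hypothetical stand-ins, not dulwich APIs, and the non-empty payload is arbitrary placeholder bytes. A real change would go through dulwich's own extension classes and `write_index_extension`.

```python
import io
import struct
from dataclasses import dataclass


@dataclass
class FakeExtension:
    """Stand-in for an index extension; not a dulwich class."""
    signature: bytes
    data: bytes


def write_extensions_skipping_empty(f, extensions):
    """Write each extension, omitting any whose payload is empty."""
    written = 0
    for ext in extensions or []:
        if not ext.data:
            # Nothing to serialize; leaving it out keeps stricter readers happy.
            continue
        f.write(ext.signature)
        f.write(struct.pack(">I", len(ext.data)))
        f.write(ext.data)
        written += 1
    return written


buf = io.BytesIO()
count = write_extensions_skipping_empty(
    buf,
    [FakeExtension(b"TREE", b""), FakeExtension(b"REUC", b"\x01\x02")],
)
```

Here the empty `TREE` extension is dropped and only the second extension reaches the buffer.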
Minimal reproducer
# /// script
# dependencies = [
# "dulwich==0.23.1",
# "pygit2==1.18.0",
# ]
# ///
import tempfile
from dulwich.index import TreeExtension
from dulwich.porcelain import init
from pygit2 import Repository
path = tempfile.mkdtemp()
print(path)
with init(path) as repo:
    index = repo.open_index()
    index._extensions.append(TreeExtension.from_bytes(b""))
    index.write()
pyg_repo = Repository(path)
pyg_repo.index.read()
uv run script.py
/var/folders/3g/1vds4g8d4p3909hrwr65j6300000gn/T/tmp3u5lf5iw
Traceback (most recent call last):
File "/Users/user/projects/dvcorg/dvc/script.py", line 22, in <module>
pyg_repo.index.read()
^^^^^^^^^^^^^^
File "/Users/user/.cache/uv/environments-v2/script-f78ce4ac73dbb248/lib/python3.13/site-packages/pygit2/repository.py", line 649, in index
check_error(err, io=True)
~~~~~~~~~~~^^^^^^^^^^^^^^
File "/Users/user/.cache/uv/environments-v2/script-f78ce4ac73dbb248/lib/python3.13/site-packages/pygit2/errors.py", line 66, in check_error
raise GitError(message)
_pygit2.GitError: corrupted TREE extension in index
Let me know if you’d like a PR for this.