Known Docling Issue (ARM64 containers) #300

@helicalchris

Hi, I just wanted to highlight an issue I have with docling: using an ARM64 container on a Mac leads to an Illegal Instruction error.

This is a known error in the libraries docling uses, but docling-serve offers ARM64 containers without any warning that they are known not to work.

Can I suggest adding such a warning (feel free to link to my docling issue)?

FYI, here is my dosubot chat about it:

docling-project/docling#2041 (comment)

This is a known issue affecting docling and docling-serve when running in ARM64 containers on Apple Silicon (including M4 Max) with Docker, Colima, or Podman. The root cause is almost always a native dependency (like PyTorch or ONNX Runtime) being installed as an x86_64 binary instead of ARM64, which leads to an "Illegal Instruction" crash as soon as the code tries to use unsupported CPU instructions in the container environment. This happens even if you explicitly set the device to CPU and is not resolved by increasing memory or CPU allocation to the container VM [source].

There is no official fix or documented workaround in the docling or docling-serve repositories as of now. The Dockerfiles and build workflows do support ARM64 builds, but there is no guarantee that all dependencies (especially PyTorch and ONNX Runtime) are always installed as ARM64 wheels in user environments, and there is no architecture-specific error handling in the code [source].
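To make the requested warning concrete, here is a minimal sketch of a startup check. It is not code from docling or docling-serve: the function name `arm64_container_warning` and the message text are hypothetical, and it only inspects what Python's standard `platform` module reports.

```python
import platform
import warnings


def arm64_container_warning(system=None, machine=None):
    """Return a warning message for Linux ARM64 environments, else None.

    Hypothetical helper for illustration only; not part of docling-serve.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Linux" and machine in ("aarch64", "arm64"):
        return (
            "Running on Linux ARM64: native dependencies such as PyTorch or "
            "ONNX Runtime are reported to crash with 'Illegal instruction' "
            "in ARM64 containers on Apple Silicon."
        )
    return None


# Emit the warning once at startup if the environment matches.
message = arm64_container_warning()
if message:
    warnings.warn(message, RuntimeWarning)
```

A check like this cannot tell a container on a Mac apart from a native ARM64 Linux server, so it would also fire on hosts where everything works. That is why it is a warning here and not a hard failure.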

What you can try:

- Make sure your container base image is ARM64 (linux/arm64) and not x86_64.
- Inside the running container, check the architecture of the installed PyTorch and ONNX Runtime binaries. For example, run `pip show torch`, then `file $(python -c "import torch; print(torch._C.__file__)")` to confirm that the compiled extension is an ARM64 binary. (`torch.__file__` itself points at a Python source file, so `file` cannot report an architecture for it.)
- If you find any x86_64 binaries, uninstall and reinstall them with `pip install --force-reinstall --no-cache-dir torch torchvision onnxruntime` to force ARM64 wheels.
- Avoid using `--platform linux/amd64` on ARM Macs, as this will cause other errors or fall back to emulation, which is not supported for these libraries [source].
- If you want to debug further, you can try running the container with gdb or strace to see which library triggers the illegal instruction, but in almost all cases it's a native dependency mismatch.
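The architecture check above can also be done without importing the libraries at all, which helps when the import itself is what crashes. The following is a rough sketch, assuming the packages are installed as ordinary directories containing `.so` files. The helper names `elf_machine` and `scan_package` are made up for this example. It reads the `e_machine` field of each ELF header and compares it with what `platform.machine()` reports.

```python
import importlib.util
import platform
import struct
import sys
from pathlib import Path

# e_machine values from the ELF specification.
ELF_MACHINES = {62: "x86_64", 183: "aarch64"}
# Normalise the names platform.machine() may return.
HOST_ALIASES = {"arm64": "aarch64", "AMD64": "x86_64"}


def elf_machine(path):
    """Return the architecture named in an ELF header, or None if not ELF."""
    with open(path, "rb") as fh:
        header = fh.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # Byte 5 of e_ident is the data encoding: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown({machine})")


def scan_package(name):
    """Yield (path, arch) for each shared object under an installed package."""
    spec = importlib.util.find_spec(name)  # locates the package without importing it
    if spec is None or not spec.submodule_search_locations:
        return
    for root in spec.submodule_search_locations:
        for so_file in Path(root).rglob("*.so*"):
            if so_file.is_file():
                arch = elf_machine(so_file)
                if arch is not None:
                    yield so_file, arch


if __name__ == "__main__":
    host = HOST_ALIASES.get(platform.machine(), platform.machine())
    packages = sys.argv[1:] or ["torch", "onnxruntime"]
    for package in packages:
        found = list(scan_package(package))
        if not found:
            print(f"{package}: not installed or no shared objects found")
            continue
        mismatched = [(p, a) for p, a in found if a != host]
        print(f"{package}: {len(found)} shared objects, {len(mismatched)} not {host}")
        for path, arch in mismatched:
            print(f"  {arch}: {path}")
```

Saved under any filename and run inside the container with no arguments, it checks `torch` and `onnxruntime`; other package names can be passed on the command line. A clean result only rules out an x86_64/ARM64 mix-up. It says nothing about ARM64 binaries that use instructions the container's virtual CPU does not expose.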

This is a broader ecosystem issue with ARM64 containers and Python ML libraries, not a bug in docling itself. If you need a reliable workaround, running docling in a native venv on macOS ARM64 (as you already found) is the only guaranteed path until the container ecosystem for ARM64 matures further.
