Docker Compose: OpenSearch fails to start on new Apple Silicon (M4) with SIGILL #196

@nasseralbess

The Bug

When running docker compose up on newer Apple Silicon hardware (tested on an M4 MacBook Air), the search-1 container (OpenSearch) fails to start. It enters a crash loop, repeatedly showing a fatal error from the Java Runtime Environment.

This blocks local development for anyone on the affected hardware, preventing contributions. I encountered this while working on a separate issue related to adding run_numbers to CMS RAW data records.

Observed Behaviour

The search-1 container log is flooded with the following fatal error, indicating an "Illegal Instruction" signal (SIGILL):

search-1                 | # A fatal error has been detected by the Java Runtime Environment:
search-1                 | #
search-1                 | #  SIGILL (0x4) at pc=0x0000f10bbfd40c5c, pid=15, tid=16
search-1                 | #
search-1                 | # JRE version:  (21.0.6+7) (build )
search-1                 | # Java VM: OpenJDK 64-Bit Server VM (21.0.6+7-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
search-1                 | # Problematic frame:
search-1                 | # j  java.lang.System.registerNatives()V+0 java.base@21.0.6
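
To tell this crash apart from other startup failures, the log can be scanned for the two markers shown above. The following is a minimal sketch; the function name `has_jvm_sigill` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: succeeds if the log text on stdin contains
# a JVM fatal-error report caused by SIGILL.
has_jvm_sigill() {
    # Read stdin once so both patterns are checked against the same text.
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'A fatal error has been detected by the Java Runtime Environment' &&
        printf '%s\n' "$log" | grep -q 'SIGILL (0x4)'
}
```

It could be used as `docker compose logs search 2>&1 | has_jvm_sigill && echo "SIGILL crash detected"`, assuming the service is named `search` as in docker-compose.yml.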

Steps to Reproduce

  1. Use a Mac with new Apple Silicon hardware (M4).
  2. Clone the cernopendata-portal repository.
  3. From the project root, run docker compose up.
  4. Observe that the db-1 and cache-1 containers run, but search-1 enters a restart loop, and the process eventually fails with "dependency failed to start: container opendatacernch-search-1 is unhealthy".

The Fix That Worked for Me

The root cause appears to be the JVM attempting to use ARM's Scalable Vector Extension (SVE) instructions, which the JVM bundled with the default OpenSearch image either does not support or handles incorrectly.

The issue can be resolved with a two-part fix:

  1. Disable SVE in the JVM: This is done by passing a specific flag to the JVM via an environment variable.
  2. Upgrade the OpenSearch Image: The default opensearch:2 image did not seem to respect the environment variable. It was necessary to switch to a more specific, newer version for the flag to take effect.

The following changes to the search service in docker-compose.yml make the portal run successfully on an M4 Mac:

Current docker-compose.yml (for the search service):

  search:
    restart: "always"
    image: docker.io/opensearchproject/opensearch:2
    environment:
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      # ... other vars

Working docker-compose.yml (for the search service):

  search:
    restart: "always"
    # 1. Image version was updated to one that respects _JAVA_OPTIONS.
    #    The user reported using a version like "2.11.9". We should
    #    pin this to a specific, tested version.
    image: docker.io/opensearchproject/opensearch:2.11.1 # Or a newer tested version
    environment:
      # 2. This environment variable is added to fix the JVM crash.
      - _JAVA_OPTIONS=-XX:UseSVE=0
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      # ... other vars
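
One way to check whether a given image actually honours the variable: HotSpot prints a "Picked up _JAVA_OPTIONS: ..." notice at startup when it reads that variable. The sketch below looks for that notice in the container log; the function name `picked_up_usesve` is hypothetical, and it assumes the bundled JVM is HotSpot-based (the crash report above says "OpenJDK 64-Bit Server VM").

```shell
# Hypothetical helper: succeeds if the log on stdin shows that the JVM
# read -XX:UseSVE=0 from the _JAVA_OPTIONS environment variable.
picked_up_usesve() {
    grep -q 'Picked up _JAVA_OPTIONS:.*-XX:UseSVE=0'
}
```

For example: `docker compose logs search 2>&1 | picked_up_usesve || echo "flag not picked up"`. If the notice is missing, the variable most likely never reached the JVM.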

Proposed Solution

To ensure the portal is usable out-of-the-box for all contributors on modern hardware, I suggest we update the docker-compose.yml in the main branch.

  1. Pin the search service's image to a specific, recent version of OpenSearch that is confirmed to work with the _JAVA_OPTIONS flag (e.g., 2.11.1 or newer).
  2. Add the _JAVA_OPTIONS=-XX:UseSVE=0 environment variable to the search service.
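
Both points could be guarded by a simple text check so the settings are not dropped later. This is only a sketch: the function name `check_search_config` is hypothetical, and it greps the file as plain text rather than parsing the YAML.

```shell
# Hypothetical guard: succeeds if the given compose file pins the
# OpenSearch image to a full x.y.z tag and sets -XX:UseSVE=0.
check_search_config() {
    file=$1
    # A floating tag such as ":2" does not match the x.y.z pattern.
    grep -Eq 'image:[[:space:]]*docker\.io/opensearchproject/opensearch:[0-9]+\.[0-9]+\.[0-9]+' "$file" &&
        grep -q '_JAVA_OPTIONS=-XX:UseSVE=0' "$file"
}
```

For example: `check_search_config docker-compose.yml || echo "search service is missing the M4 workaround"`.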

This will resolve a key development blocker and preserve this knowledge for future contributors.

cc @tiborsimko
