OpenSearch 2.3.0 on ARM exited with code 143

I’m wondering if anyone has seen this. I’m using 2.3.0 with this basic config:

config:
  opensearch.yml: |
    cluster.name: opensearch-cluster
    network.host: 0.0.0.0
    plugins.security.disabled: true

On x64 Kubernetes this works fine, but when I try to run it on ARM nodes I get the following error in the logs and the pod restarts.

Enabling OpenSearch Security Plugin
Killing opensearch process 10
OpenSearch exited with code 143
Performance analyzer exited with code 1

I’m using the exact same config on both, so I’m not sure what I missed.

I have exactly the same issue with OpenSearch 2.4.0 on x86_64. I really have no idea why OpenSearch is being killed; I’ve enabled Docker debug mode, checked syslog, etc., and still have no clue what’s happening.

version: '3'
services:
  odfe-node1:
    image: opensearchproject/opensearch:2.4.0
    container_name: odfe-node1
    environment:
      - cluster.name=odfe-cluster
      - node.name=odfe-node1
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - DISABLE_INSTALL_DEMO_CONFIG=true
      - "OPENSEARCH_JAVA_OPTS=-Xms8000m -Xmx8000m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - ...
    ports:
      - 9200:9200
      - 9600:9600
    expose:
      - "9200"
    networks:
      - ...
...
$ docker-compose up odfe-node1 
Starting odfe-node1 ... done
Attaching to odfe-node1
odfe-node1    | Disabling execution of install_demo_configuration.sh for OpenSearch Security Plugin
odfe-node1    | Enabling OpenSearch Security Plugin
odfe-node1    | Killing opensearch process 10
odfe-node1    | OpenSearch exited with code 143
odfe-node1    | Performance analyzer exited with code 0
odfe-node1 exited with code 0

OK, I found why I got this 143 code: it was in fact linked to the Performance Analyzer startup failure, because a configuration file (performance-analyzer.properties) was missing. If the Performance Analyzer can’t be started, OpenSearch is automatically killed by the startup script.
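
To illustrate the coupling, the container’s startup script behaves roughly like this (a simplified sketch based on the log output, not the actual opensearch-docker-entrypoint.sh):

# Simplified illustration only -- not the real entrypoint script.
# It just shows why a Performance Analyzer failure takes OpenSearch down with it.
opensearch &                          # OpenSearch runs in the background
opensearch_pid=$!

performance-analyzer-agent-cli        # the agent runs alongside it
# When the agent exits (missing performance-analyzer.properties, bad JVM flags, ...),
# the script tears down OpenSearch as well:
echo "Killing opensearch process $opensearch_pid"
kill -TERM "$opensearch_pid"          # SIGTERM -> exit code 128 + 15 = 143
wait "$opensearch_pid"

That is why the logs show “Killing opensearch process 10” and “OpenSearch exited with code 143” even though OpenSearch itself was fine.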

So how did you fix it? What steps did you take? It’s strange that it starts on x64 for me, though.

Thanks

Has anyone seen this who has a fix, or can at least give me some pointers on how to figure it out? The pod is in a crash loop on startup. I’ve been trying to figure out how to get more logs via the Helm values YAML, but no luck yet.

$ kubectl logs -f opensearch-cluster-master-0
Disabling execution of install_demo_configuration.sh for OpenSearch Security Plugin
Enabling OpenSearch Security Plugin
Killing opensearch process 10
OpenSearch exited with code 143
Performance analyzer exited with code 1
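
These are the standard kubectl commands that might surface more detail (nothing chart-specific; listing them in case someone can point me at the right one):

$ kubectl logs --previous opensearch-cluster-master-0      # output of the previous, crashed container
$ kubectl describe pod opensearch-cluster-master-0         # restart reason, exit codes, events
$ kubectl exec -it opensearch-cluster-master-0 -- ls /usr/share/opensearch/logs   # only while the container is up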

When I try to run the Docker image locally, the only file in the logs folder is performance-analyzer.log, with the following:

uintx InitialCodeCacheSize=4096 is outside the allowed range [ 65536 ... 18446744073709551615 ]
Improperly specified VM option 'InitialCodeCacheSize=4096'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
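
As far as I can tell, that error means the agent’s JVM never starts at all: the value 4096 has no size suffix, so the JVM reads it as 4096 bytes, which is below the 65536-byte (64 KiB) minimum quoted in the message. A value with a unit would be accepted, for example:

-XX:InitialCodeCacheSize=64k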

But I don’t know why this would make the main process fail.

OK, it was the performance analyzer killing the pod. How can I override the bad settings in PA_AGENT_JAVA_OPTS when running Docker, without building my own image?
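
If the startup script respects an externally supplied PA_AGENT_JAVA_OPTS instead of always overwriting it (an assumption on my part, I haven’t checked the 2.x entrypoint), passing it as a container environment variable might be enough. For a quick local test, something like:

# Assumed override, not verified against the official image; the flag value is an
# example. The point is an InitialCodeCacheSize with a unit, inside the allowed
# range; any other agent options the default string sets may need to be carried over.
$ docker run --rm -p 9200:9200 -p 9600:9600 \
    -e "discovery.type=single-node" \
    -e "PA_AGENT_JAVA_OPTS=-XX:InitialCodeCacheSize=64k" \
    opensearchproject/opensearch:2.3.0

For the Kubernetes deployment, the same variable could go into the Helm values as an extra environment entry (extraEnvs, if I remember the chart correctly).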

For the ARM image, the Performance Analyzer’s -XX:InitialCodeCacheSize= value is set completely wrong. BUG
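
To confirm where the bad value comes from in the ARM image, a generic search inside the image works (I don’t know the exact file that sets it, hence the broad grep):

$ docker run --rm --entrypoint bash opensearchproject/opensearch:2.3.0 \
    -c 'grep -rn "InitialCodeCacheSize" /usr/share/opensearch/ 2>/dev/null'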

I opened the following

but I am still looking for a solution to my current deployment.