Troubleshooting
The goal of this guide is to list the most common issues with an on-premise server that is not working properly. Note that all of these issues can be prevented by following our installation procedure to the letter.
Checklist¶
Here is a checklist that we recommend following before going any further. If you reach the end of the list without identifying your issue, please get in touch with us so that we can answer any questions.
Info
This guide is for on-premise servers. For the troubleshooting guide on agents, check the dedicated page.
Hardware¶
Are you using at least the minimum hardware configuration? (4 CPUs, 8 GB RAM, 10 GB disk space).
Regarding disk space in particular, make sure that the Docker folder (by default /var/lib/docker) is mounted on a point with enough disk space. More info in the dedicated section of this page.
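To check where Docker actually stores its data and how much space is left on that mount, the following commands can help (assuming the docker CLI is installed on the server):

```shell
# Print the folder where Docker stores images, containers and volumes
# (default: /var/lib/docker)
docker info --format '{{ .DockerRootDir }}'

# Check the free space on the filesystem backing that folder
df -h /var/lib/docker
```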
OS and Docker¶
Check the version of your:
- Operating system: uname -a,
- Docker: docker version. Docker-ce is mandatory. Make sure you are using the latest available version of both; older versions should work, but we only test our releases on the latest ones,
- Docker Compose: docker compose version. If the compose plugin is not installed, please follow the instructions here.
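A quick way to confirm the compose plugin is actually installed is to probe it and print a diagnostic (the message text below is only an illustration):

```shell
# Probe the compose v2 plugin; "docker compose" fails if the plugin is missing
if docker compose version >/dev/null 2>&1; then
  echo "compose plugin OK"
else
  echo "compose plugin missing - install the docker-compose-plugin package"
fi
```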
Warning
The Docker version provided by default on operating systems like RHEL is outdated and usually won't work; prefer podman and podman-compose in that case (there is a podman up/down command for the On Premise infrastructure and a podman section for agents).
Networking¶
You might not be able to access the OctoPerf server from your browser because of network restrictions. In that case:
- If there is any firewall installed on the server, it may interfere with IPTables rules. As Docker inserts its own IPTables rules, it's preferable to disable firewalls to avoid networking issues,
- Make sure OctoPerf is accessible locally on the server. Try curl http://localhost:port/ (replace "port" with the port on which it's running; example: curl http://localhost:8080),
- If there is any security policy in your company preventing access to the OctoPerf server and/or port over the network, please check with your IT admin to solve the issue.
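The local check can be scripted to print only the HTTP status code; 8080 below is a placeholder for whatever port your docker-compose.yml exposes:

```shell
# Print just the HTTP status code; 200 means OctoPerf answered locally,
# 000 means the connection itself failed (wrong port, service down, firewall)
curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:8080/
```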
Configuration files¶
If the server will not start, check whether you have edited any configuration file provided in the enterprise-edition.zip by OctoPerf. If so, please revert the changes you made and restart the server.
Make sure the configuration files are readable by the user launching OctoPerf. If in doubt, make them readable by everyone using this command from the root of the enterprise-edition folder:
chmod -R 777 ./
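If a blanket 777 is too permissive for your security policy, a more restrictive sketch that still makes everything readable is (assumption: OctoPerf only needs read access plus directory traversal):

```shell
# Directories need the execute bit to be traversed, files only need read
find . -type d -exec chmod 755 {} +
find . -type f -exec chmod 644 {} +
```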
Warning
docker-compose.yml and application.yml are YAML files, and as such proper indentation with 2 spaces per level must be used at all times. Otherwise the file, or some section of it, might be considered invalid and ignored.
Check container logs¶
If none of the above helped, you need to check the container logs. First, list the running containers:
guillaume@guillaume-VirtualBox:~/Downloads/enterprise-edition$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
95a494494d62 enterprise-edition_nginx "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:80->80/tcp, :::80->80/tcp nginx
2e88aee1f25b octoperf/enterprise-edition:13.0.1 "./entrypoint.sh java" 2 minutes ago Up About a minute enterprise-edition
1027f344b52a docker.elastic.co/elasticsearch/elasticsearch-oss:7.17.9 "/tini -- /usr/local…" 2 minutes ago Up 2 minutes 9200/tcp, 9300/tcp elasticsearch
23cf709c97a1 octoperf/enterprise-documentation:13.0.1 "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 8080/tcp enterprise-documentation
95ee79d9d612 octoperf/enterprise-ui:13.0.1 "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 8080/tcp enterprise-ui
First, we can see that their status is Up, which is a good sign. If any container is Down or Restarting, make sure to check its logs to find out what the issue is.
The next step is to check a particular container's logs using the CONTAINER ID from the output above:
guillaume@guillaume-VirtualBox:~/Downloads/enterprise-edition$ docker logs -f 1027f344b52a
Search for messages with severity Error, then check whether the sections below list this particular error message.
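To avoid scrolling through the whole output, the logs can be filtered for typical error markers (the pattern below is only a starting point):

```shell
# Merge stderr into stdout, then keep only lines that look like errors
docker logs 1027f344b52a 2>&1 | grep -i -E 'error|fatal|exception'
```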
Enterprise-edition (backend)¶
This container hosts the API used by both the User Interface and the Agents to connect and report metrics.
Can't connect to elasticsearch¶
This indicates that Elasticsearch didn't start as expected. Make sure that you have increased vm.max_map_count as described in the installation guide.
Check the section about Elasticsearch below for more details on what the issue could be.
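vm.max_map_count can be checked and raised as follows; 262144 is the minimum that Elasticsearch requires:

```shell
# Show the current value; Elasticsearch needs at least 262144
sysctl vm.max_map_count

# Raise it for the running system (root required)
sudo sysctl -w vm.max_map_count=262144

# Persist the setting across reboots
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf
```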
Out of memory¶
OctoPerf is well optimized, so this container can operate with a very low amount of memory, usually less than 2 GB; you shouldn't have to increase this value under normal circumstances. However, if you run many concurrent tests or have many load generators reporting metrics, you may need to increase its memory. Replace the enterprise-edition section of the docker-compose.yml with something like this:
enterprise-edition:
  image: docker.io/octoperf/enterprise-edition:12.12.3
  stdin_open: true
  tty: true
  container_name: enterprise-edition
  links:
    - elasticsearch
  environment:
    JAVA_OPTS: "-Xmx4g"
    AUTO_DETECTED_HOSTNAME: ${AUTO_DETECTED_HOSTNAME}
  depends_on:
    - elasticsearch
  restart: unless-stopped
  volumes:
    - octoperf-data:/home/octoperf/data
    - ./config:/home/octoperf/config
    - ./license:/home/octoperf/license
Warning
docker-compose.yml is a YAML file, and as such proper indentation with 2 spaces per level must be used at all times. Otherwise the file, or some section of it, might be considered invalid and ignored.
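Before restarting, the edited file can be validated: docker compose config parses the YAML and reports indentation errors with their line numbers.

```shell
# Run from the enterprise-edition folder containing docker-compose.yml
docker compose config --quiet && echo "docker-compose.yml is valid"
```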
Elasticsearch¶
This container hosts the database where everything is stored. To improve its performance, Elastic.co provides a guide on how to optimize Elasticsearch to use all available resources.
Errors in the elasticsearch container can have various explanations. The first step is to identify the error message in the elasticsearch container logs.
UnknownHostException: geoip.elastic.co¶
This error message simply indicates that a non-essential component of Elasticsearch couldn't access the internet to update itself. You can safely disregard it.
Not enough disk space¶
In this case you will be able to browse OctoPerf and view all information, but any action that requires creating data will fail, for example:
- Launching a test,
- Updating a virtual user,
- Creating a new runtime profile,
- Etc..
By default, Elasticsearch is configured to use the elasticsearch-data local volume, which is usually located in /var/lib/docker. The issue manifests itself with logs from Elasticsearch like:
[2015-10-27 09:40:08,801][INFO ][cluster.routing.allocation.decider] [Milan] low disk watermark [15%] exceeded on [DZqnmWIZRpapZY_TPkkMBw][Milan] free: 58.6gb[12.6%], replicas will not be assigned to this node
Elasticsearch is informing you that disk space is becoming scarce; it considers available disk space before allocating data to a node. First, check the available disk space using df -h, which shows something like:
Filesystem Size Used Avail Use% Mounted on
udev 12G 0 12G 0% /dev
tmpfs 2,4G 3,8M 2,4G 1% /run
/dev/nvme1n1p2 234G 46G 176G 21% /
tmpfs 12G 100M 12G 1% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 12G 0 12G 0% /sys/fs/cgroup
/dev/nvme1n1p1 487M 8,3M 478M 2% /boot/efi
In this case, /var/lib/docker is mounted on the / filesystem, which has 176G available. We're fine. But if available disk space goes below 5%, indices are switched into read-only mode to prevent further writes.
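Elasticsearch can also report the disk usage it sees itself, per node, through its cat API (assuming it is reachable on localhost:9200; otherwise use the container IP found as described further down):

```shell
# Disk usage and shard allocation per node, as Elasticsearch sees it
curl -s 'http://localhost:9200/_cat/allocation?v'

# Cluster health; a red or yellow status often accompanies disk problems
curl -s 'http://localhost:9200/_cluster/health?pretty'
```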
Fixing Disk Space Issue
- Stop OctoPerf On Premise Infra,
- Allocate and mount a new disk with enough disk space. Let's say you mounted the disk on the folder /opt/elasticsearch,
- Find where the elasticsearch-data volume is stored on the disk. List the Docker volumes:
ubuntu@laptop:~$ docker volume ls
DRIVER VOLUME NAME
local enterprise-edition_elasticsearch-data
local enterprise-edition_octoperf-data
The volume we are interested in is enterprise-edition_elasticsearch-data. Let's find where the data is stored by inspecting it:
ubuntu@laptop:~$ docker volume inspect enterprise-edition_elasticsearch-data
[
{
"CreatedAt": "xxxxx",
"Driver": "local",
"Labels": {
"com.docker.compose.project": "enterprise-edition",
"com.docker.compose.version": "1.23.1",
"com.docker.compose.volume": "elasticsearch-data"
},
"Mountpoint": "/var/lib/docker/volumes/enterprise-edition_elasticsearch-data/_data",
"Name": "enterprise-edition_elasticsearch-data",
"Options": null,
"Scope": "local"
}
]
Now we know the data is stored in /var/lib/docker/volumes/enterprise-edition_elasticsearch-data/_data. We have to copy it to the new disk:
sudo cp -rp /var/lib/docker/volumes/enterprise-edition_elasticsearch-data/_data/ /opt/elasticsearch/
- Now, edit docker-compose.yml and find:
volumes:
- elasticsearch-data:/usr/share/elasticsearch/data
Let's configure the volume to map the data to /opt/elasticsearch:
volumes:
- /opt/elasticsearch:/usr/share/elasticsearch/data
- Restart OctoPerf On Premise Infra (which will restart Elasticsearch),
- Re-enable write operations, run:
curl -XPUT ELASTICSEARCH_IP:9200/_settings -H "Content-Type: application/json" --data '{"index": {"blocks": {"read_only_allow_delete": null}}}'
This enables write operations on all indices within Elasticsearch. Make sure you have enough disk space available, otherwise indices will switch back to read-only again.
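To verify that the block is actually gone, the index settings can be re-read; if grep prints nothing, no index is read-only anymore (ELASTICSEARCH_IP as above):

```shell
# Any remaining line here means some index still refuses writes
curl -s 'http://ELASTICSEARCH_IP:9200/_settings?pretty' | grep read_only_allow_delete
```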
To find the Elasticsearch IP, first run docker ps to list the containers:
ubuntu@laptop:~$ docker ps | grep elasticsearch
a09d8f01fa71 docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.4 "/usr/local/bin/dock…" 9 days ago Up 3 hours 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp elastic_kibana_elasticsearch_1_7d43584139e7
And inspect the relevant container:
ubuntu@laptop:~$ docker inspect a09 | grep "IPAddress"
"SecondaryIPAddresses": null,
"IPAddress": "",
"IPAddress": "172.18.0.3",
In this case, the Elasticsearch local IP address is 172.18.0.3.
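As an alternative to grepping the full inspect output, a Go template can print the IP directly:

```shell
# Print only the container's IP address(es), one per attached network
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' a09d8f01fa71
```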
Permission denied¶
In this case, the elasticsearch container exits directly after launch because it is unable to read its files on disk.
Since Elasticsearch failed to start, the log should contain errors. If you see messages like failed to obtain node locks or permission denied, this means the rights on Elasticsearch's data have been altered. If you don't know where to find it, you can find more info on the elasticsearch-data folder here.
The fix is to restore the initial rights through a command like the one below; make sure to update elasticsearch-data to point to the proper folder:
chown -R 1000:1000 /elasticsearch-data
After this, remove all containers and restart the entire stack again.
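The two steps (locate the volume, then restore ownership) can be combined; 1000:1000 is the uid/gid of the elasticsearch user inside the container:

```shell
# Resolve the volume's host path, then hand it back to uid/gid 1000
DATA_DIR=$(docker volume inspect -f '{{ .Mountpoint }}' enterprise-edition_elasticsearch-data)
sudo chown -R 1000:1000 "$DATA_DIR"
```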