Troubleshooting
The goal of this guide is to list the most common issues with an on-premise server that is not working properly. Note that all of these issues can be prevented by following our installation procedure to the letter.
Checklist¶
Here is a checklist that we recommend following before going any further. If you reach the end of the list without identifying your issue, please get in touch with us so that we can answer any questions.
Info
This guide is for on-premise servers. For the troubleshooting guide on agents, check the dedicated page.
Hardware¶
Are you using at least the minimum hardware configuration? (4 CPUs, 8 GB RAM, 10 GB disk space).
Regarding disk space in particular, make sure that the Docker folder (by default /var/lib/docker) is mounted on a point with enough disk space. More info in the dedicated section of this page.
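To check where Docker actually stores its data and how much space is left on that mount, the following commands can help (assuming the docker CLI is installed on the server):

```shell
# Print the folder where Docker stores images, containers and volumes
# (default: /var/lib/docker)
docker info --format '{{ .DockerRootDir }}'

# Check the free space on the filesystem backing that folder
df -h /var/lib/docker
```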
OS and Docker¶
Check the version of your:
- Operating system: uname -a,
- Docker: docker version. Docker-ce is mandatory. Make sure you are using the latest available version of both; older versions should work, but we only test our releases on the latest ones,
- Docker Compose: docker compose version. If the compose plugin is not installed, please follow the instructions here.
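A quick way to confirm the compose plugin is actually installed is to probe it and print a diagnostic (the message text below is only an illustration):

```shell
# Probe the compose v2 plugin; "docker compose" fails if the plugin is missing
if docker compose version >/dev/null 2>&1; then
  echo "compose plugin OK"
else
  echo "compose plugin missing - install the docker-compose-plugin package"
fi
```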
Warning
The Docker version provided by default on operating systems like RHEL is outdated and usually won't work; prefer podman and podman-compose in that case (there is a podman up/down command for the On Premise infrastructure and a podman section for agents).
Networking¶
You might not be able to access the OctoPerf server from your browser because of network restrictions. In that case:
- If there is any firewall installed on the server, it may interfere with IPTables rules. As Docker inserts its own IPTables rules, it's preferable to disable firewalls to avoid networking issues,
- Make sure OctoPerf is accessible locally on the server. Try curl http://localhost:port/ (replace "port" with the port on which it's running; example: curl http://localhost:8080),
- If there is any security policy in your company preventing access to the OctoPerf server and/or port over the network, please check with your IT admin to solve the issue.
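The local check can be scripted to print only the HTTP status code; 8080 below is a placeholder for whatever port your docker-compose.yml exposes:

```shell
# Print just the HTTP status code; 200 means OctoPerf answered locally,
# 000 means the connection itself failed (wrong port, service down, firewall)
curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:8080/
```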
Configuration files¶
If the server will not start, check whether you have edited any configuration file provided in the enterprise-edition.zip by OctoPerf. If so, please revert the changes you made and restart the server.
Make sure the configuration files are readable by the user launching OctoPerf. If in doubt, make them readable by everyone using this command from the root of the enterprise-edition folder:
chmod -R 777 ./
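If a blanket 777 is too permissive for your security policy, a more restrictive sketch that still makes everything readable is (assumption: OctoPerf only needs read access plus directory traversal):

```shell
# Directories need the execute bit to be traversed, files only need read
find . -type d -exec chmod 755 {} +
find . -type f -exec chmod 644 {} +
```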
Warning
docker-compose.yml and application.yml are YAML files, and as such proper indentation with 2 spaces per level must be used at all times. Otherwise the file, or some section of it, might be considered invalid and ignored.
Check container logs¶
If none of the above helped, you need to check the container logs. First, list the running containers:
guillaume@guillaume-VirtualBox:~/Downloads/enterprise-edition$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
95a494494d62 enterprise-edition_nginx "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:80->80/tcp, :::80->80/tcp nginx
2e88aee1f25b octoperf/enterprise-edition:13.0.1 "./entrypoint.sh java" 2 minutes ago Up About a minute enterprise-edition
1027f344b52a docker.elastic.co/elasticsearch/elasticsearch-oss:7.17.9 "/tini -- /usr/local…" 2 minutes ago Up 2 minutes 9200/tcp, 9300/tcp elasticsearch
23cf709c97a1 octoperf/enterprise-documentation:13.0.1 "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 8080/tcp enterprise-documentation
95ee79d9d612 octoperf/enterprise-ui:13.0.1 "/docker-entrypoint.…" 2 minutes ago Up 2 minutes 8080/tcp enterprise-ui
First, we can see that their status is Up, which is a good sign. If any container is Down or Restarting, make sure to check its logs to find out what the issue is.
The next step is to check a particular container's logs using the CONTAINER ID from the output above:
guillaume@guillaume-VirtualBox:~/Downloads/enterprise-edition$ docker logs -f 1027f344b52a
Search for messages with severity Error, then check whether the sections below list this particular error message.
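To avoid scrolling through the whole output, the logs can be filtered for typical error markers (the pattern below is only a starting point):

```shell
# Merge stderr into stdout, then keep only lines that look like errors
docker logs 1027f344b52a 2>&1 | grep -i -E 'error|fatal|exception'
```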
Enterprise-edition (backend)¶
This container hosts the API used by both the User Interface and the Agents to connect and report metrics.
Can't connect to elasticsearch¶
This indicates that Elasticsearch didn't start as expected. Make sure that you have increased vm.max_map_count as described in the installation guide.
Check the section about Elasticsearch below for more details on what the issue could be.
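vm.max_map_count can be checked and raised as follows; 262144 is the minimum that Elasticsearch requires:

```shell
# Show the current value; Elasticsearch needs at least 262144
sysctl vm.max_map_count

# Raise it for the running system (root required)
sudo sysctl -w vm.max_map_count=262144

# Persist the setting across reboots
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf
```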
Out of memory¶
OctoPerf is well optimized, so this container can operate with a very low amount of memory, usually less than 2 GB; you shouldn't have to increase this value under normal circumstances. However, if you run many concurrent tests or have many load generators reporting metrics, you may need to increase its memory. Replace the enterprise-edition section of the docker-compose.yml with something like this:
enterprise-edition:
  image: docker.io/octoperf/enterprise-edition:12.12.3
  stdin_open: true
  tty: true
  container_name: enterprise-edition
  links:
    - elasticsearch
  environment:
    JAVA_OPTS: "-Xmx4g"
    AUTO_DETECTED_HOSTNAME: ${AUTO_DETECTED_HOSTNAME}
  depends_on:
    - elasticsearch
  restart: unless-stopped
  volumes:
    - octoperf-data:/home/octoperf/data
    - ./config:/home/octoperf/config
    - ./license:/home/octoperf/license
Warning
docker-compose.yml is a YAML file, and as such proper indentation with 2 spaces per level must be used at all times. Otherwise the file, or some section of it, might be considered invalid and ignored.
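Before restarting, the edited file can be validated: docker compose config parses the YAML and reports indentation errors with their line numbers.

```shell
# Run from the enterprise-edition folder containing docker-compose.yml
docker compose config --quiet && echo "docker-compose.yml is valid"
```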
Elasticsearch¶
This container hosts the database where everything is stored. To improve its performance, Elastic.co provides a guide on how to optimize Elasticsearch to use all available resources.
Errors in the elasticsearch container can have various explanations. The first step is to identify the error message in the elasticsearch container logs.
UnknownHostException: geoip.elastic.co¶
This error message simply indicates that a non-essential component of Elasticsearch couldn't access the internet to update itself. You can safely disregard it.
Not enough disk space¶
In this case you will be able to browse OctoPerf and view all information, but any action that requires creating data will fail, for example:
- Launching a test,
- Updating a virtual user,
- Creating a new runtime profile,
- Etc..
By default, Elasticsearch is configured to use the elasticsearch-data local volume, which is usually located in /var/lib/docker. The issue manifests itself with logs from Elasticsearch like:
[2015-10-27 09:40:08,801][INFO ][cluster.routing.allocation.decider] [Milan] low disk watermark [15%] exceeded on [DZqnmWIZRpapZY_TPkkMBw][Milan] free: 58.6gb[12.6%], replicas will not be assigned to this node
Elasticsearch is informing you that disk space is becoming scarce; it considers available disk space before allocating data to a node. First, check the available disk space using df -h, which shows something like:
Filesystem Size Used Avail Use% Mounted on
udev 12G 0 12G 0% /dev
tmpfs 2,4G 3,8M 2,4G 1% /run
/dev/nvme1n1p2 234G 46G 176G 21% /
tmpfs 12G 100M 12G 1% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 12G 0 12G 0% /sys/fs/cgroup
/dev/nvme1n1p1 487M 8,3M 478M 2% /boot/efi
In this case, /var/lib/docker is mounted on the / filesystem, which has 176G available. We're fine. But if available disk space goes below 5%, indices are switched into read-only mode to prevent further writes.
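Elasticsearch can also report the disk usage it sees itself, per node, through its cat API (assuming it is reachable on localhost:9200; otherwise use the container IP found as described further down):

```shell
# Disk usage and shard allocation per node, as Elasticsearch sees it
curl -s 'http://localhost:9200/_cat/allocation?v'

# Cluster health; a red or yellow status often accompanies disk problems
curl -s 'http://localhost:9200/_cluster/health?pretty'
```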
Fixing Disk Space Issue
- Stop OctoPerf On Premise Infra,
- Allocate and mount a new disk with enough disk space. Let's say you mounted the disk on the folder /opt/elasticsearch,
- Find where the elasticsearch-data volume is stored on the disk. List the Docker volumes:
ubuntu@laptop:~$ docker volume ls
DRIVER VOLUME NAME
local enterprise-edition_elasticsearch-data
local enterprise-edition_octoperf-data
The volume we are interested in is enterprise-edition_elasticsearch-data. Let's find where the data is stored by inspecting it:
ubuntu@laptop:~$ docker volume inspect enterprise-edition_elasticsearch-data
[
{
"CreatedAt": "xxxxx",
"Driver": "local",
"Labels": {
"com.docker.compose.project": "enterprise-edition",
"com.docker.compose.version": "1.23.1",
"com.docker.compose.volume": "elasticsearch-data"
},
"Mountpoint": "/var/lib/docker/volumes/enterprise-edition_elasticsearch-data/_data",
"Name": "enterprise-edition_elasticsearch-data",
"Options": null,
"Scope": "local"
}
]
Now we know the data is stored in /var/lib/docker/volumes/enterprise-edition_elasticsearch-data/_data. We have to copy it to the new disk:
sudo cp -rp /var/lib/docker/volumes/enterprise-edition_elasticsearch-data/_data/ /opt/elasticsearch/
- Now, edit docker-compose.yml and find:
volumes:
- elasticsearch-data:/usr/share/elasticsearch/data
Let's configure the volume to map the data to /opt/elasticsearch:
volumes:
- /opt/elasticsearch:/usr/share/elasticsearch/data
- Restart OctoPerf On Premise Infra (which will restart Elasticsearch),
- Re-enable write operations, run:
curl -XPUT ELASTICSEARCH_IP:9200/_settings -H "Content-Type: application/json" --data '{"index": {"blocks": {"read_only_allow_delete": null}}}'
This enables write operations on all indices within Elasticsearch. Make sure you have enough disk space available, otherwise indices will switch back to read-only again.
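To verify that the block is actually gone, the index settings can be re-read; if grep prints nothing, no index is read-only anymore (ELASTICSEARCH_IP as above):

```shell
# Any remaining line here means some index still refuses writes
curl -s 'http://ELASTICSEARCH_IP:9200/_settings?pretty' | grep read_only_allow_delete
```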
To find the Elasticsearch IP, first run docker ps to list the containers:
ubuntu@laptop:~$ docker ps | grep elasticsearch
a09d8f01fa71 docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.4 "/usr/local/bin/dock…" 9 days ago Up 3 hours 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp elastic_kibana_elasticsearch_1_7d43584139e7
And inspect the relevant container:
ubuntu@laptop:~$ docker inspect a09 | grep "IPAddress"
"SecondaryIPAddresses": null,
"IPAddress": "",
"IPAddress": "172.18.0.3",
In this case, the Elasticsearch local IP address is 172.18.0.3.
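As an alternative to grepping the full inspect output, a Go template can print the IP directly:

```shell
# Print only the container's IP address(es), one per attached network
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' a09d8f01fa71
```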
Permission denied¶
In this case, the elasticsearch container exits directly after launch because it is unable to read its files on disk.
Since Elasticsearch failed to start, the log should contain errors. If you see messages like failed to obtain node locks or permission denied, this means the rights on Elasticsearch's data have been altered. If you don't know where to find it, you can find more info on the elasticsearch-data folder here.
The fix is to restore the initial rights through a command like the one below; make sure to update elasticsearch-data to point to the proper folder:
chown -R 1000:1000 /elasticsearch-data
After this, remove all containers and restart the entire stack again.
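The two steps (locate the volume, then restore ownership) can be combined; 1000:1000 is the uid/gid of the elasticsearch user inside the container:

```shell
# Resolve the volume's host path, then hand it back to uid/gid 1000
DATA_DIR=$(docker volume inspect -f '{{ .Mountpoint }}' enterprise-edition_elasticsearch-data)
sudo chown -R 1000:1000 "$DATA_DIR"
```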