Traccar service not stopped properly

fts5 years ago

When I stop the traccar service with "systemctl stop traccar.service" or when the system is shut down or rebooted I very often end up with a non working traccar.
The logs then show that it is stuck at

2020-10-06 20:21:26  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 20:21:27  INFO: Waiting for changelog lock....

which repeats endlessly.

Reading about this message in the forum I learned this happens when the database is left in an inconsistent state.
But isn't the traccar service there to make sure that it stops traccar consistently?
This happened with traccar version 4.8 and also after my upgrade to version 4.10

What can I do to avoid this issue?

Best regards

Anton Tananaev5 years ago

The easiest option is to restore the database from backup and try again.

fts5 years ago

Thanks for the quick response.
Restoring the database from backup is what I did.

But having to restore the database from backup every second time I restart the server is not a permanent solution.

What do I have to do to avoid this issue?

Anton Tananaev5 years ago

If it happens every second time, then you clearly doing something wrong. This issue only happen if you kill/restart the process before initialization is finished, which during normal run (not upgrade) takes only a few seconds.

During upgrade database migration can take several minutes and that's where usually this issue happens. People don't wait till full migration is finished or they think that the process is stuck, so they kill the service in the middle leaving database in a locked state.

fts5 years ago

I don't know what I would be doing wrong.
The database migration is long done and I did not kill the process manually.
I can easily reproduce the issue by running systemctl stop traccar.service and then systemctl start traccar.service
The log file output is:

2020-10-06 23:36:35  INFO: HikariPool-1 - Starting...
2020-10-06 23:36:38  INFO: HikariPool-1 - Start completed.
2020-10-06 23:36:55  INFO: HikariPool-1 - Starting...
2020-10-06 23:37:12  INFO: HikariPool-1 - Starting...
2020-10-06 23:37:16  INFO: HikariPool-1 - Start completed.
2020-10-06 23:37:20  INFO: Clearing database change log checksums
2020-10-06 23:37:21  INFO: SELECT COUNT(*) FROM PUBLIC.DATABASECHANGELOGLOCK
2020-10-06 23:37:21  INFO: SELECT COUNT(*) FROM PUBLIC.DATABASECHANGELOGLOCK
2020-10-06 23:37:21  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:37:21  INFO: Successfully acquired change log lock
2020-10-06 23:37:23  INFO: SELECT MD5SUM FROM PUBLIC.DATABASECHANGELOG WHERE MD5SUM IS NOT NULL
2020-10-06 23:37:23  INFO: UPDATE PUBLIC.DATABASECHANGELOG SET MD5SUM = NULL
2020-10-06 23:37:23  INFO: Successfully released change log lock
2020-10-06 23:37:23  INFO: SELECT COUNT(*) FROM PUBLIC.DATABASECHANGELOGLOCK
2020-10-06 23:37:23  INFO: SELECT COUNT(*) FROM PUBLIC.DATABASECHANGELOGLOCK
2020-10-06 23:37:23  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:37:23  INFO: Successfully acquired change log lock
2020-10-06 23:37:23  INFO: Can not use class liquibase.parser.core.json.JsonChangeLogParser as a Liquibase service because org.yaml.snakeyaml.constructor.BaseConstructor is not in the classpath
2020-10-06 23:37:23  INFO: Can not use class liquibase.parser.core.yaml.YamlChangeLogParser as a Liquibase service because org.yaml.snakeyaml.constructor.BaseConstructor is not in the classpath
2020-10-06 23:37:44  INFO: HikariPool-1 - Starting...
2020-10-06 23:37:48  INFO: HikariPool-1 - Start completed.
2020-10-06 23:38:04  INFO: HikariPool-1 - Starting...
2020-10-06 23:38:08  INFO: HikariPool-1 - Start completed.
2020-10-06 23:38:26  INFO: HikariPool-1 - Starting...
2020-10-06 23:38:29  INFO: HikariPool-1 - Start completed.
2020-10-06 23:38:33  INFO: Clearing database change log checksums
2020-10-06 23:38:34  INFO: SELECT COUNT(*) FROM PUBLIC.DATABASECHANGELOGLOCK
2020-10-06 23:38:34  INFO: SELECT COUNT(*) FROM PUBLIC.DATABASECHANGELOGLOCK
2020-10-06 23:38:34  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:38:34  INFO: Waiting for changelog lock....
2020-10-06 23:38:44  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:38:44  INFO: Waiting for changelog lock....
2020-10-06 23:38:54  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:38:54  INFO: Waiting for changelog lock....
2020-10-06 23:39:04  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:39:04  INFO: Waiting for changelog lock....
2020-10-06 23:39:14  INFO: SELECT LOCKED FROM PUBLIC.DATABASECHANGELOGLOCK WHERE ID=1
2020-10-06 23:39:14  INFO: Waiting for changelog lock....

Anton Tananaev5 years ago

What are the server specs? It looks like it's really slow.

fts5 years ago

It is a tiny installation handling only 5 trackers which are all offline at the moment.
The server is a virtual machine running on Vmware ESXi 6.7 - the VM has 1 vCore and 1 GB RAM, the OS is Debian 10 with all the latest updates.
The Vmware ESXi host CPU is a AMD GX-415GA which is running at 8 % load.

Anton Tananaev5 years ago

What about database? Is it local? Are you using SSD drives?

fts5 years ago

Yes, I am using SSD.
I am still using the local H2 Database but want to migrate it to a local postgres database.
But before doing this I wanted to adress this issue.

Anton Tananaev5 years ago

I just looked at your log again and I don't really understand this part:

2020-10-06 23:37:44  INFO: HikariPool-1 - Starting...
2020-10-06 23:37:48  INFO: HikariPool-1 - Start completed.
2020-10-06 23:38:04  INFO: HikariPool-1 - Starting...
2020-10-06 23:38:08  INFO: HikariPool-1 - Start completed.
2020-10-06 23:38:26  INFO: HikariPool-1 - Starting...
2020-10-06 23:38:29  INFO: HikariPool-1 - Start completed.

Did you restart the server there? Why are there 3 starts?

fts5 years ago

Many thanks for looking so closely at my problem.

In fact the server was started again at timestamp "2020-10-06 23:36:35".
I called systemctl start traccar.service only once - everything that is logged after this timestamp was done by traccar itself.

Looking at the logfiles again I recognized that when the server is stopped properly it logs

2020-10-07 09:47:20  INFO: Shutting down server...

This log entry is missing when the issue occurs.

Can I activate some debug output to help finding the issues cause?

fts5 years ago

Looking into the syslog I found that traccar was started 3 times because it crashed and was restarted by the service.
The syslog output of the crash is:

Oct  6 23:37:08 F-04 systemd[1]: traccar.service: Scheduled restart job, restart counter is at 2.
Oct  6 23:37:08 F-04 systemd[1]: Stopped traccar.
Oct  6 23:37:08 F-04 systemd[1]: Started traccar.
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: # A fatal error has been detected by the Java Runtime Environment:
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: #  SIGSEGV (0xb) at pc=0x00007fcdf1f6df56, pid=829, tid=835
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: # JRE version: OpenJDK Runtime Environment (11.0.5+10) (build 11.0.5+10-LTS)
Oct  6 23:37:30 F-04 traccar[829]: # Java VM: OpenJDK 64-Bit Server VM (11.0.5+10-LTS, mixed mode, tiered, compressed oops, serial gc, linux-amd64)
Oct  6 23:37:30 F-04 traccar[829]: # Problematic frame:
Oct  6 23:37:30 F-04 traccar[829]: # V  [libjvm.so+0x482f56]  OopFlow::compute_reach(PhaseRegAlloc*, int, Dict*)+0x136
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: # An error report file with more information is saved as:
Oct  6 23:37:30 F-04 traccar[829]: # /tmp/hs_err_pid829.log
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: # Compiler replay data is saved as:
Oct  6 23:37:30 F-04 traccar[829]: # /tmp/replay_pid829.log
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 traccar[829]: # If you would like to submit a bug report, please visit:
Oct  6 23:37:30 F-04 traccar[829]: #   http://bugreport.java.com/bugreport/crash.jsp
Oct  6 23:37:30 F-04 traccar[829]: #
Oct  6 23:37:30 F-04 systemd[1]: traccar.service: Main process exited, code=killed, status=6/ABRT
Oct  6 23:37:30 F-04 systemd[1]: traccar.service: Failed with result 'signal'.
Oct  6 23:37:41 F-04 systemd[1]: traccar.service: Service RestartSec=10s expired, scheduling restart.
Oct  6 23:37:41 F-04 systemd[1]: traccar.service: Scheduled restart job, restart counter is at 3.
Oct  6 23:37:41 F-04 systemd[1]: Stopped traccar.
Oct  6 23:37:41 F-04 systemd[1]: Started traccar.
Oct  6 23:37:50 F-04 traccar[846]: #
Oct  6 23:37:50 F-04 traccar[846]: # A fatal error has been detected by the Java Runtime Environment:
Oct  6 23:37:50 F-04 traccar[846]: #
Oct  6 23:37:50 F-04 traccar[846]: #  SIGSEGV (0xb) at pc=0x00007f6c943e0f56, pid=846, tid=852
Oct  6 23:37:50 F-04 traccar[846]: #
Oct  6 23:37:50 F-04 traccar[846]: # JRE version: OpenJDK Runtime Environment (11.0.5+10) (build 11.0.5+10-LTS)
Oct  6 23:37:50 F-04 traccar[846]: # Java VM: OpenJDK 64-Bit Server VM (11.0.5+10-LTS, mixed mode, tiered, compressed oops, serial gc, linux-amd64)
Oct  6 23:37:50 F-04 traccar[846]: # Problematic frame:
Oct  6 23:37:50 F-04 traccar[846]: # V  [libjvm.so+0x482f56]  OopFlow::compute_reach(PhaseRegAlloc*, int, Dict*)+0x136
Oct  6 23:37:50 F-04 traccar[846]: #
Oct  6 23:37:50 F-04 traccar[846]: # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Oct  6 23:37:50 F-04 traccar[846]: #
Oct  6 23:37:50 F-04 traccar[846]: # An error report file with more information is saved as:
Oct  6 23:37:50 F-04 traccar[846]: # /tmp/hs_err_pid846.log
Oct  6 23:37:50 F-04 traccar[846]: #
Oct  6 23:37:50 F-04 traccar[846]: # Compiler replay data is saved as:

unfortunately the error report files in /tmp/ do not exist.

fts5 years ago

I was able to generate report files of the Java fatal error.
You can download them here:
https://ufile.io/f/00qft

Anton Tananaev5 years ago

Looks like it's a bug in JVM. Try using different version of Java.

fts5 years ago

OK, traccar is currently running with the JVM that is bundled with the traccar installation (OpenJDK Runtime Environment 11.0.5+10)
do you have a recommendation which version I should try instead?