Extreme CPU load

memesaregood2 years ago

Our test server comes under massive load once Traccar has been running for a while, and Traccar then suffers from thread starvation.

We have set the heap limit to 6 GB and added more cores to the VM (6 now), which runs Ubuntu 20.04.5, but the issue persists. Our Traccar numbers are:

  • ~10 users;
  • 1 geofence;
  • ~300 devices with a message interval varying from 10 seconds to 6 minutes;
  • 24 computed attributes, 10 of which are linked to all devices.
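
For a rough sense of scale, the numbers above can be turned into an evaluation rate. This is only a back-of-the-envelope sketch: it assumes that every incoming message triggers one evaluation of each of the 10 attributes linked to all devices, and it ignores the other 14 attributes because the post does not say how they are linked. The class and method names are made up for illustration.

```java
final class LoadEstimate {

    // Attribute evaluations per second if every device reports at the given interval.
    // Assumption: each message evaluates every linked attribute exactly once.
    static double evaluationsPerSecond(int devices, double intervalSeconds, int attributesPerDevice) {
        return devices / intervalSeconds * attributesPerDevice;
    }

    public static void main(String[] args) {
        int devices = 300;          // "~300 devices" from the post
        int linkedAttributes = 10;  // attributes linked to all devices

        // Upper bound: every device reports every 10 seconds.
        double upper = evaluationsPerSecond(devices, 10, linkedAttributes);
        // Lower bound: every device reports every 6 minutes.
        double lower = evaluationsPerSecond(devices, 360, linkedAttributes);

        System.out.println("upper bound, evaluations/s: " + upper);
        System.out.println("lower bound, evaluations/s: " + lower);
    }
}
```

Under those assumptions the rate sits somewhere between roughly 8 and 300 expression evaluations per second, so the cost of a single evaluation matters a lot.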

The first ~20 minutes look fine, but at some point Traccar starts consuming almost all available resources. However, it does not throw an OutOfMemoryError.

How do we find the cause? What info would help us, or you?

Anton Tananaev2 years ago

What version are you using?

memesaregood2 years ago

It's a week-old 5.6 preview.

Anton Tananaev2 years ago

Have you tried the official release? Does it have the same problem?

memesaregood2 years ago

We have not, so we don't know. We will probably roll back to the official 5.6 build tomorrow and see if the issue persists, as today is a day off.

memesaregood2 years ago

The official release does not have such issues.

Anton Tananaev2 years ago

Can you provide a memory dump of when the CPU is at 100%?

memesaregood2 years ago

Would a system memdump be sufficient?

Anton Tananaev2 years ago

No, we need a jstack dump.
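
jstack ships with the JDK and prints the stack of every thread in a running JVM when given its process ID (`jstack <pid>`). For readers unfamiliar with what such a dump contains, the sketch below produces a similar snapshot from inside a JVM using the standard `java.lang.management` API, sorted by accumulated CPU time. It is not part of Traccar, the class name `BusyThreads` is hypothetical, and it only sees the JVM it runs in, so for a live Traccar server the external `jstack` call remains the practical tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BusyThreads {

    // Returns a text report of up to `limit` live threads, highest accumulated CPU time first.
    static String report(int limit) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = threads.isThreadCpuTimeSupported() && threads.isThreadCpuTimeEnabled();

        // Snapshot all threads with full stack traces (no lock details).
        List<ThreadInfo> infos = new ArrayList<>();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                infos.add(info);
            }
        }

        // Read CPU times once, so the sort below works on stable values.
        Map<Long, Long> cpuNanos = new HashMap<>();
        for (ThreadInfo info : infos) {
            long id = info.getThreadId();
            cpuNanos.put(id, cpuAvailable ? threads.getThreadCpuTime(id) : -1L);
        }
        infos.sort(Comparator.comparingLong((ThreadInfo i) -> cpuNanos.get(i.getThreadId())).reversed());

        StringBuilder out = new StringBuilder();
        int count = Math.max(0, Math.min(limit, infos.size()));
        for (ThreadInfo info : infos.subList(0, count)) {
            long cpuMillis = cpuNanos.get(info.getThreadId()) / 1_000_000;
            out.append(String.format("\"%s\" state=%s cpu=%d ms%n",
                    info.getThreadName(), info.getThreadState(), cpuMillis));
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append(System.lineSeparator());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report(5));
    }
}
```

In a dump taken at 100% CPU, the interesting part is which frames repeat across the busiest threads; many threads sitting in the same call chain point at the hot code path.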

Anton Tananaev2 years ago

In your dump I see a massive computed attributes calculation stack. That could be the problem. Do you have any complicated expressions?

memesaregood2 years ago

Yeah, we do. We had to improvise and implement a Wialon-like fuel calculation table, as no such functionality is present in Traccar. The main calculation attribute is almost 8,000 characters long; we had to double the attribute length limit in the DB.

I figured it would have an impact, but damn, not that big. Traccar "overflows" in just under 5 minutes, and I assumed it had something to do with the new changes.

I will request more computing resources from my supervisor and see how it goes.

Anton Tananaev2 years ago

Just for reference, here's the link to the change:

https://github.com/traccar/traccar/pull/5036

Do you think the change from createExpression to createScript caused it? We might need to revert the change or at least make it configurable.

memesaregood2 years ago

I can't say I'm sure about it, but switching to createScript shouldn't do much. It just allows for more features to be used. I haven't detected any performance difference when we didn't have big computed attributes, just small ones, e.g. ignition. If you're worried about it, we could look into it and compare, but again, I'm not sure if it's the way to go.

It could be that our expressions are just a little too heavy.

Anton Tananaev2 years ago

It should be fairly easy to revert and compare if the issue is easily reproducible.

memesaregood2 years ago

Alright, I can set up a local server tomorrow, compiled without my commits. Should be fairly easy to modify my expressions to not use local variables and all that fancy stuff, with the same DB.
Does that work well for you?