It looks like this is not valid TCP data. Check the device configuration, change it from UDP to TCP, and try again.
That is correct, I am using a UDP protocol with instant acknowledgement.
UDP was never a problem before (versions 5.6 and older). Moreover, as you can see, the HEX is received and an event is recorded, so a "lost" UDP packet cannot explain the issue at hand.
Maybe try the TCP protocol.
If it was a protocol issue, why is only the event registered and not the positions data? It doesn't make any sense...
There are pretty significant differences between the way TCP and UDP are handled, so there can easily be a problem there.
Is there a single place in any of the handlers where I can add a log reporting that will capture all positions imports (and respectively any errors)?
If there was, don't you think we would have had logging there already?
I am not in your head, I don't know what you would or would not do.
All I know is that there is an issue recording the positions data. It might be UDP protocol related but it's still an issue and switching to TCP would not resolve it.
I am willing to help resolve it by logging the positions handler output, but I need some advice on where to start.
I would do it myself, but it would take days. With your knowledge it would take only a few minutes.
I'm giving a suggestion on what to try, but you're not interested, so I don't think we're on the same page here.
Yes we are not. Your suggestion is a workaround, not a fix for the issue and I am interested in fixing the problem.
Also, I am starting to think you know more about the issue than you are willing to share. Is there a known issue with UDP that cannot be fixed for the moment?
My suggestion is not a workaround. It's a way to narrow down the root cause of the problem.
I'd like to do this too but the issue occurs on a random set of devices at random times. So I never know which devices will freeze. I cannot switch all devices to TCP only to test a theory.
Setting a log write at the right place(s) in the source code should help to pinpoint the issue a lot faster and definitely a lot more efficiently.
Since you know the code well, I was hoping you could advise on how to debug this quickly.
Anton, another user just confirmed that he still experiences device freeze on version 6.4 over TCP. See here.
I've also done some log analysis after running the server for about 24 hours after restart yesterday.
360 devices have sent data to the server.
100% of the records are acknowledged, i.e. the server sent the response 00050000018101 back.
20 devices have missing records.
15 devices have just one missing record. I still need to confirm whether it's the last record that is missing. If it is, it might mean that the device has only just frozen.
5 devices are confirmed frozen, with missing record counts between 50 and 300. No new records are being written to the positions table.
I will start looking at the code in a few moments and would appreciate your feedback on where to start.
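For anyone who wants to repeat this kind of tally, here is a minimal Python sketch. It assumes you have already extracted two per-device counts yourself: how many records the server acknowledged (from the log) and how many rows ended up in the positions table. The function name `classify_devices`, the device IDs, the sample counts, and the 50-record cut-off for "frozen" (taken from the lower bound I quoted above) are illustrative assumptions, not anything that exists in Traccar.

```python
def classify_devices(acknowledged, stored, frozen_threshold=50):
    """Group devices by how many acknowledged records never reached storage.

    acknowledged: dict of device id -> records acknowledged by the server
    stored:       dict of device id -> rows found in the positions table
    frozen_threshold: assumed cut-off for calling a device "frozen"
    """
    groups = {"ok": [], "single_missing": [], "several_missing": [], "frozen": []}
    for device_id, ack_count in sorted(acknowledged.items()):
        # A device with no stored rows at all counts as missing everything.
        missing = ack_count - stored.get(device_id, 0)
        if missing <= 0:
            groups["ok"].append(device_id)
        elif missing == 1:
            groups["single_missing"].append(device_id)
        elif missing < frozen_threshold:
            groups["several_missing"].append(device_id)
        else:
            groups["frozen"].append(device_id)
    return groups


# Hypothetical sample counts, for illustration only.
acknowledged = {"dev-a": 120, "dev-b": 80, "dev-c": 300}
stored = {"dev-a": 120, "dev-b": 79, "dev-c": 40}

groups = classify_devices(acknowledged, stored)
for name, devices in groups.items():
    print(f"{name}: {len(devices)} device(s) {devices}")
```

Devices in the `single_missing` group are the ones worth checking by hand, to see whether the missing record is the most recent one.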
Just add more logging in some pipeline handlers and see what happens to decoded positions.
Hello,
I have been running Traccar server version 6.4 and I am experiencing a device freeze issue yet again: the incoming HEX is recorded in the log and an event might be generated, but no data is saved in the positions table.
The issue appears to occur on random devices.
No errors in the logs.
After server restart, positions start to be updated correctly. However, the data during the "freeze" period is missing.
Here's an example from the log file - top 3 records are for a frozen device, right after that we've got a working device:
What can I do to troubleshoot?