! urgent ! - infinite billing

Hello, I have an urgent issue: the system keeps billing a customer even though he is not sending any calls.
The CDR count is increasing even though I can’t find any CDRs, and his balance is decreasing as if he were calling.
I’ve checked for an IP conflict with other customers and found nothing wrong. I’ve disabled the customer auth and the billing continues. I’ve set the balance to 0 and it keeps decreasing below 0.
On all servers I’ve executed systemctl restart yeti-web yeti-cdr-billing@cdr_billing yeti-delayed-job yeti-scheduler, and I’ve also restarted redis, but nothing changes.
I can’t restart sems because I have live traffic.
Where can I look to find the issue?

Partially solved. I found that one sems node had a fault (I don’t know where) and started generating CDRs for the same calls in a loop. All of these generated CDRs were billed to the customer, and there was no way to stop it until I restarted sems.
The load on that node was relatively low, about 600 calls.
Let me know how I can help find and fix the bug. Thank you.

I think you just had some CDRs queued in sems, which writes data asynchronously. You dropped all of these CDRs by restarting sems.

That’s not possible: the customer had not been sending traffic for more than 2 hours, and his balance kept decreasing until I restarted sems.
I have three sems nodes behind a kamailio load balancer, and only node 2 had this infinite billing problem. If it were queue processing, the other 2 nodes should have continued to process their queues for the other calls of the same customer.
I’ve checked everything. I had the same doubt, but it also doesn’t explain the fact that his balance went below 0 even though his “min_balance” was set to 0.

Each node has its own in-memory CDR queue. Anyway, there are not enough details to say anything about your problem.

also the fact that his balance went below 0 even though his “min_balance” was set to 0.

This may happen when you have a long CDR queue.

I really hope it was a queue issue. Is there a way to contact you for a fast reply if I run into a situation like this again, so that you can check it yourself?

I found the problem:
On node 2 there is a long queue being processed, but I can’t understand why. It is only on node 2; nodes 1 and 3 are working correctly.

This is node 2:

'cdr_threads': [   {   'db_exceptions': 0,
                       'queue_len': 29221,
                       'retry_queue_len': 0,
                       'tried_cdrs': 184713,
                       'writed_cdrs': 184712},
                   {   'db_exceptions': 0,
                       'queue_len': 29072,
                       'retry_queue_len': 0,
                       'tried_cdrs': 184812,
                       'writed_cdrs': 184811}],

This is node 3 at the same moment:

'cdr_threads': [   {   'db_exceptions': 0,
                       'queue_len': 0,
                       'retry_queue_len': 0,
                       'tried_cdrs': 205782,
                       'writed_cdrs': 205782},
                   {   'db_exceptions': 0,
                       'queue_len': 0,
                       'retry_queue_len': 0,
                       'tried_cdrs': 206258,
                       'writed_cdrs': 206258}],

The sems version is the same on all 3 nodes: 1.12.40core99
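
To compare the nodes at a glance, the two dumps above can be summed per node. This is only a rough sketch in Python: the numbers are copied from the stats pasted above, the `summarize` helper is hypothetical, and reading `tried_cdrs - writed_cdrs` as "CDRs currently being written" is an assumption, not documented sems behaviour.

```python
# Stats copied from the dumps above (node 2 and node 3, same moment).
node_stats = {
    "node 2": [
        {"db_exceptions": 0, "queue_len": 29221, "retry_queue_len": 0,
         "tried_cdrs": 184713, "writed_cdrs": 184712},
        {"db_exceptions": 0, "queue_len": 29072, "retry_queue_len": 0,
         "tried_cdrs": 184812, "writed_cdrs": 184811},
    ],
    "node 3": [
        {"db_exceptions": 0, "queue_len": 0, "retry_queue_len": 0,
         "tried_cdrs": 205782, "writed_cdrs": 205782},
        {"db_exceptions": 0, "queue_len": 0, "retry_queue_len": 0,
         "tried_cdrs": 206258, "writed_cdrs": 206258},
    ],
}


def summarize(threads):
    """Add up the per-thread counters of one node (hypothetical helper)."""
    return {
        "queued": sum(t["queue_len"] for t in threads),
        "retrying": sum(t["retry_queue_len"] for t in threads),
        # Assumption: tried minus written = writes still in progress.
        "in_progress": sum(t["tried_cdrs"] - t["writed_cdrs"] for t in threads),
        "db_exceptions": sum(t["db_exceptions"] for t in threads),
    }


summaries = {node: summarize(threads) for node, threads in node_stats.items()}
for node, summary in summaries.items():
    print(node, summary)
```

With these numbers, node 2 has 58293 CDRs waiting across its two threads while node 3 has none, and neither node reports database exceptions or retries.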

You have to check the write performance of your CDR database.

I’m using a normal HDD... so does it depend on that? I see very low I/O on the HDD, about 4MB/s, but maybe that’s because it is working with tons of small files, which slows down the write speed. Is that correct?

This is not how postgresql works. Maybe you also have different latency to the CDR db depending on the node.
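
One way to check the latency idea is to time the same trivial round trip from each node and compare the results. The sketch below is only an illustration in Python: `measure_latency` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that just sleeps. On a real node it would be replaced by a function that runs something like `SELECT 1` over a connection to the CDR db.

```python
import statistics
import time


def measure_latency(run_query, samples=20):
    """Time a zero-argument callable and return min/median/max in milliseconds."""
    timings_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        run_query()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


def fake_query():
    # Stand-in for a real round trip to the CDR database (assumption:
    # replace with a call that executes a trivial query on that node).
    time.sleep(0.002)


if __name__ == "__main__":
    print(measure_latency(fake_query))
```

If one node shows a clearly higher median than the others, that would point towards the network path or connection setup rather than the disk.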

With a load of 4000 calls divided across 3 nodes, nodes 1 and 3 now have the same issue too. Node 1 hosts both sems and the CDR db. I can try stopping sems on the first node to see if that speeds up the CDR writing a bit.
So does this mean it doesn’t depend on the HDD?

It depends on a lot of factors, including HDD speed, and there are not enough details to say what the root cause is in your case. But the “small files” theory doesn’t look correct.

PS: a 7200rpm HDD is able to handle up to 120 tps.
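
To put the 120 tps figure next to the queue lengths reported for node 2, here is a back-of-the-envelope estimate in Python. It assumes that each CDR costs one transaction and that the whole 120 tps is available for CDR writes; neither is confirmed in this thread. The `drain_seconds` helper and the arrival rates of 100 and 120 CDRs per second are made up for illustration.

```python
def drain_seconds(queued_cdrs, writes_per_second, arrivals_per_second=0.0):
    """Seconds needed to empty the queue, or infinity if it cannot shrink."""
    net_rate = writes_per_second - arrivals_per_second
    if net_rate <= 0:
        return float("inf")
    return queued_cdrs / net_rate


# Queue lengths of the two node 2 threads, taken from the stats above.
queued = 29221 + 29072

# No new CDRs arriving: roughly 8 minutes to drain at 120 tps.
print(drain_seconds(queued, 120))

# Hypothetical 100 new CDRs per second: only 20 tps are left to drain.
print(drain_seconds(queued, 120, arrivals_per_second=100))

# Arrivals equal to the write rate: the queue never shrinks.
print(drain_seconds(queued, 120, arrivals_per_second=120))
```

Under these assumptions the queue keeps being billed for minutes after traffic stops, and it only grows once CDRs arrive faster than the database can commit them.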