[sr-dev] [SR-Users] Dialog module and confirmed-unacked dialogs

Daniel-Constantin Mierla miconda at gmail.com
Mon Apr 28 10:20:52 CEST 2014


Hello,

Is any BYE passing through the proxy?

There could be two approaches:

- on the 200 OK, set the dialog timeout to 40s (or some param value), 
then when the ACK is handled, set it to the real dialog timeout value
- the cleanup timer can reset the timeout of non-ACKed calls that have 
been established for more than 40 sec -- this timer runs every 90 sec

The second is less accurate, but might be more lightweight from a 
handling point of view (a dedicated process does the cleanup, which for 
now covers long-standing unconfirmed dialogs only).
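
To make the difference concrete, here is a rough sketch of both options 
in Python. It is only a model, not dialog module code: the Dialog class, 
the function names and the 7200s FULL_TIMEOUT are made up for the 
example, and I assume the real timeout is counted from the ACK and that 
the sweep fires at fixed multiples of 90 sec. Only the 40 sec and 90 sec 
values come from the description above.

```python
from dataclasses import dataclass

SHORT_TIMEOUT = 40    # seconds allowed between 200 OK and ACK (from the text)
SWEEP_INTERVAL = 90   # period of the cleanup timer (from the text)
FULL_TIMEOUT = 7200   # hypothetical "real" dialog timeout, for illustration


@dataclass
class Dialog:
    answered_at: float   # time the 200 OK was seen
    expires_at: float    # absolute time at which the dialog times out
    acked: bool = False


# Approach 1: short timeout at 200 OK, real timeout once the ACK arrives.
def on_200ok(now):
    return Dialog(answered_at=now, expires_at=now + SHORT_TIMEOUT)


def on_ack(dlg, now):
    dlg.acked = True
    # Assumption: the real timeout is counted from the ACK.
    dlg.expires_at = now + FULL_TIMEOUT


# Approach 2: dialogs start with the real timeout; a periodic sweep
# expires those still un-ACKed more than SHORT_TIMEOUT after the 200 OK.
def on_200ok_full(now):
    return Dialog(answered_at=now, expires_at=now + FULL_TIMEOUT)


def sweep(dialogs, now):
    """Expire stale un-ACKed dialogs; return how many were reset."""
    reset = 0
    for dlg in dialogs:
        if not dlg.acked and now - dlg.answered_at > SHORT_TIMEOUT:
            dlg.expires_at = now
            reset += 1
    return reset


def first_sweep_expiry(answered_at):
    """Time of the first sweep tick that would expire an un-ACKed dialog.

    Assumes sweeps fire at 90, 180, 270, ... and answered_at >= 0.
    """
    tick = SWEEP_INTERVAL
    while tick - answered_at <= SHORT_TIMEOUT:
        tick += SWEEP_INTERVAL
    return tick
```

In this model the first approach expires a non-ACKed dialog exactly 
40 sec after the 200 OK, while the second expires it somewhere between 
just over 40 sec and 130 sec after the 200 OK, depending on where the 
answer falls relative to the sweep ticks. That spread is the accuracy 
cost of the second option.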

Cheers,
Daniel

On 25/04/14 16:51, Alex Balashov wrote:
> Hi,
>
> I am wondering if perhaps we ought to do something with regard to 
> specific handling of confirmed-nonacked dialogs (CONFIRMED_NA state in 
> dlg_hash.h) in the dialog module. These are dialogs where a 2xx reply 
> is sent to the opening INVITE transaction, but no end-to-end ACK is 
> seen[1] by Kamailio and thus the dialog is not recorded as 
> transitioning to CONFIRMED state.
>
> This can happen for a variety of reasons, but the most common scenario 
> I run into is the CANCEL-200 OK race, where the caller cancels the 
> call just as the callee answers it, near-simultaneously. The 200 OK 
> hasn't gotten back to the caller yet, so when it receives it, it has 
> no effect, because from the caller's point of view, the dialog has 
> already been CANCEL'd. Meanwhile, the CANCEL has no effect on the 
> callee end either, since, from its point of view, the dialog has 
> already transitioned into confirmed state.
>
> The problem I am running into a lot is that these dialogs stay tracked 
> up until the dialog timeout period, which can be several hours away. 
> In high-volume environments, they can clog up concurrent channel 
> counts. The receiving UAS has, of course, disposed of these dialogs 
> long ago, after 64*T1, but they remain "stuck" in Kamailio.
>
> I know that RFC 3261 Section 13.3.1.4 ("The INVITE is Accepted") says:
>
>     If the server retransmits the 2xx response for 64*T1 seconds without
>     receiving an ACK, the dialog is confirmed, but the session SHOULD
>     be terminated.  This is accomplished with a BYE, as described
>     in Section 15.
>
> Now, I know that "SHOULD" != "MUST", and I would imagine that this is 
> probably the main reason why the dialog module does not time out such 
> dialogs according to the same timers as the callee UA might.
>
> Nevertheless, they present a problem. Right now, I deal with it by 
> using a script that combs 'kamctl fifo dlg_list' for dialogs in 
> 'state:: 3' for more than X seconds and manually ends them.
>
> But, when there's a problem that one runs into in nearly all 
> nontrivial deployments, it seems to me it's time 
> to consider an additional 'dialog' modparam or something of that ilk 
> that can provide an expedited timeout for nonacked dialogs.
>
> I would be happy to write such a patch. The reason I am bringing it up 
> to the community is because I am uncertain as to whether this might 
> have any unforeseen consequences, or whether it's been discussed in 
> various dialog_ng discussions in the past that I have not carefully 
> monitored.
>
> Thanks!
>
> -- Alex
>
> (With apologies for cross-posting.)
>
> [1] Or correctly associated based on tight matching.
>

-- 
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda



