[SR-Users] t_reply in failure route with dialog module

Daniel-Constantin Mierla miconda at gmail.com
Fri Dec 27 23:18:51 CET 2013


Hello,

I kind of forgot about this thread ...

The backtrace is clean and it doesn't show like a block/infinite loop 
when you took it. Are you sure you took it when kamailio used 100%CPU?

How many times you run 'n' after backtrace? The loop in cbp could have 
tens of iterations.

Maybe you can try again and let's see if you get same results on 
backtrace/gdb.

Cheers,
Daniel


On 25/11/13 21:43, Daniel-Constantin Mierla wrote:
> Hello,
>
> the backtrace is ok for the moment. I will look over it and come back 
> with results.
>
> Cheers,
> Daniel
>
> On 11/25/13 12:10 PM, Efelin Novak wrote:
>> Hi Daniel,
>>
>> sorry it took me more than I expected. Is this sufficient? Meanwhile 
>> I found out that this happens when fr_inv_timer triggers and dialog 
>> module is turned on.
>>
>> Backtrace:
>>
>> bt
>> #0  futex_release (lock=0xb4b90798) at ../../mem/../futexlock.h:137
>> #1  next_state_dlg (dlg=dlg at entry=0xb4bc5a38, event=event at entry=4, 
>> old_state=old_state at entry=0xbfef08a0, 
>> new_state=new_state at entry=0xbfef089c, unref=unref at entry=0xbfef08a4) 
>> at dlg_hash.c:950
>> #2  0xb6b05479 in dlg_onreply (t=0xb4bc5ce8, type=1048576, 
>> param=0xbfef097c) at dlg_handlers.c:469
>> #3  0xb6cc2336 in run_trans_callbacks_internal (cb_lst=0xb4bc5d28, 
>> type=type at entry=1048576, trans=0xb4bc5ce8, 
>> params=params at entry=0xbfef097c) at t_hooks.c:290
>> #4  0xb6cc25da in run_trans_callbacks_with_buf 
>> (type=type at entry=1048576, rbuf=rbuf at entry=0xb4bc5d54, 
>> req=req at entry=0xb4bc6a08, repl=repl at entry=0xffffffff, 
>> flags=flags at entry=505) at t_hooks.c:336
>> #5  0xb6cee953 in _reply_light (trans=trans at entry=0xb4bc5ce8,
>>     buf=0xb731f588 "SIP/2.0 505 Error\r\nVia: SIP/2.0/UDP 
>> 192.168.21.4;branch=z9hG4bKa6f5.8ca73657.0\r\nVia: SIP/2.0/UDP 
>> 10.0.0.95;rport=5060;branch=z9hG4bKBc143aFe81jDj\r\nFrom: \"USER\" 
>> <sip:USER at 10.0.0.95 <mailto:sip%3AUSER at 10.0.0.95>>;t"..., 
>> len=len at entry=434, code=code at entry=505, lock=0, bm=0xbfef0e48, 
>> to_tag_len=<error reading variable: Unhandled dwarf expression opcode 
>> 0xfa>,
>>     to_tag=<error reading variable: Unhandled dwarf expression opcode 
>> 0xfa>) at t_reply.c:628
>> #6  0xb6ceedeb in _reply (trans=0xb4bc5ce8, trans at entry=0x1f9, 
>> p_msg=p_msg at entry=0xb731f568, code=505, text=0xb731f568 "Error", 
>> lock=0) at t_reply.c:782
>> #7  0xb6ceefd7 in t_reply_unsafe (t=0x1f9, t at entry=0xb4bc5ce8, 
>> p_msg=0xb731f568, p_msg at entry=0xb6d196c0, code=0, 
>> text=text at entry=0xb731f568 "Error") at t_reply.c:1531
>> #8  0xb6cd6904 in w_t_reply (msg=0xb6d196c0, p1=0xb731c988 
>> "8]0\267\001", p2=0xb731c9b8 "8P0\267 ") at tm.c:1283
>> #9  0x080651cf in do_action (h=h at entry=0xbfef1258, 
>> a=a at entry=0xb7305d98, msg=msg at entry=0xb6d196c0) at action.c:1080
>> #10 0x08064137 in run_actions (h=h at entry=0xbfef1258, 
>> a=a at entry=0xb73053c8, msg=msg at entry=0xb6d196c0) at action.c:1573
>> #11 0x0806cedd in run_top_route (a=0xb73053c8, 
>> msg=msg at entry=0xb6d196c0, c=c at entry=0x0) at action.c:1658
>> #12 0xb6ceba98 in run_failure_handlers (t=t at entry=0xb4bc5ce8, 
>> rpl=0xffffffff, code=408, extra_flags=96) at t_reply.c:1028
>> #13 0xb6cece22 in t_should_relay_response 
>> (Trans=Trans at entry=0xb4bc5ce8, new_code=new_code at entry=408, 
>> branch=branch at entry=0, should_store=should_store at entry=0xbfef1464,
>>     should_relay=should_relay at entry=0xbfef1460, 
>> cancel_data=cancel_data at entry=0xbfef1510, 
>> reply=reply at entry=0xffffffff) at t_reply.c:1304
>> #14 0xb6cef076 in relay_reply (t=t at entry=0xb4bc5ce8, 
>> p_msg=p_msg at entry=0xffffffff, branch=branch at entry=0, 
>> msg_status=msg_status at entry=408, 
>> cancel_data=cancel_data at entry=0xbfef1510,
>>     do_put_on_wait=do_put_on_wait at entry=0) at t_reply.c:1707
>> #15 0xb6cc28b4 in fake_reply (t=t at entry=0xb4bc5ce8, branch=0, 
>> code=code at entry=408) at timer.c:354
>> #16 0xb6cc34b0 in final_response_handler (t=0xb4bc5ce8, 
>> r_buf=0xb4bc5dc8) at timer.c:526
>> #17 retr_buf_handler (ticks=272914717, tl=0xb4bc5ddc, p=0x3e8) at 
>> timer.c:584
>> #18 0x08183325 in timer_list_expire (slow_mark=12, slow_l=0xb49f2438, 
>> h=0xb49f2338, t=272914717) at timer.c:894
>> #19 timer_handler () at timer.c:959
>> #20 timer_main () at timer.c:998
>> #21 0x080b8c65 in main_loop () at main.c:1709
>> #22 0x08063bfe in main (argc=1, argv=0xbfef1844) at main.c:2566
>>
>> (I have changed  USER and IP address to hide my network :))
>>
>> After this when I hit 'n' multiple times I got following output in 
>> infinite loop:
>>
>> 22 0x08063bfe in main (argc=1, argv=0xbfef1844) at main.c:2566
>> (gdb) n
>> next_state_dlg (dlg=dlg at entry=0xb4bc5a38, event=event at entry=4, 
>> old_state=old_state at entry=0xbfef08a0, 
>> new_state=new_state at entry=0xbfef089c, unref=unref at entry=0xbfef08a4) 
>> at dlg_hash.c:952
>> 952        LM_DBG("dialog %p changed from state %d to "
>> (gdb)
>> 955    }
>> (gdb)
>> dlg_onreply (t=0xb4bc5ce8, type=1048576, param=0xbfef097c) at 
>> dlg_handlers.c:470
>> 470        dlg_run_event_route(dlg, (rpl==FAKED_REPLY)?NULL:rpl, 
>> old_state, new_state);
>> (gdb)
>> 472        if (new_state==DLG_STATE_EARLY) {
>> (gdb)
>> 479        if (new_state==DLG_STATE_CONFIRMED_NA &&
>> (gdb)
>> 540        if ( new_state==DLG_STATE_DELETED
>> (gdb)
>> 542                        || old_state==DLG_STATE_EARLY) ) {
>> (gdb)
>> 541                    && (old_state==DLG_STATE_UNCONFIRMED
>> (gdb)
>> 559        if (unref) dlg_unref(dlg, unref);
>> (gdb)
>> 563        dlg_release(dlg);
>> (gdb)
>> 565    }
>> (gdb)
>> run_trans_callbacks_internal (cb_lst=0xb4bc5d28, 
>> type=type at entry=1048576, trans=0xb4bc5ce8, 
>> params=params at entry=0xbfef097c) at t_hooks.c:292
>> 292            cbp=cbp->next;
>> (gdb)
>> 283        while(cbp){
>> (gdb)
>> 286            if ( (cbp->types)&type ) {
>> (gdb)
>> 292            cbp=cbp->next;
>> (gdb)
>> 283        while(cbp){
>> (gdb)
>> 286            if ( (cbp->types)&type ) {
>> (gdb)
>> 287                DBG("DBG: trans=%p, callback type %d, id %d 
>> entered\n",
>> (gdb)
>> 290                cbp->callback( trans, type, params );
>> (gdb)
>> 289                params->param = &(cbp->param);
>> (gdb)
>> 290                cbp->callback( trans, type, params );
>> (gdb)
>> 292            cbp=cbp->next;
>> (gdb)
>> 283        while(cbp){
>> (gdb)
>> 286            if ( (cbp->types)&type ) {
>> (gdb)
>> 292            cbp=cbp->next;
>> (gdb)
>> 283        while(cbp){
>> (gdb)
>> 286            if ( (cbp->types)&type ) {
>> (gdb)
>> 287                DBG("DBG: trans=%p, callback type %d, id %d 
>> entered\n",
>> (gdb)
>> 290                cbp->callback( trans, type, params );
>> (gdb)
>> 289                params->param = &(cbp->param);
>> (gdb)
>> 290                cbp->callback( trans, type, params );
>> (gdb)
>> 292            cbp=cbp->next;
>> (gdb)
>> 283        while(cbp){
>>
>> Regards
>>
>> Efelin
>>
>>
>> 2013/11/20 Daniel-Constantin Mierla <miconda at gmail.com 
>> <mailto:miconda at gmail.com>>
>>
>>     Hello,
>>
>>     I will investigate more -- meanwhile had some traveling. It would
>>     speed up if you can send the backtrace of one process that blocks
>>     when you applied the patch.
>>
>>     You need to connect with gdb to it:
>>
>>     gdb /path/to/kamailio _PID_
>>
>>     replace _PID_ with the pid of blocked kamailio process.
>>
>>     Cheers,
>>     Daniel
>>
>>
>>     On 11/15/13 5:50 PM, Efelin Novak wrote:
>>>     Hi Daniel,
>>>
>>>     thanks for a reply. I applied the patch and now the Kamailio
>>>     just prints
>>>
>>>     WARNING: tm [t_lookup.c:1564]: t_unref(): WARNING: script writer
>>>     didn't release transaction
>>>
>>>     and than freezes without any log. It does not resend the
>>>     incoming "winning" failure reply neither response to any other
>>>     messages, not even to a new calls. It just freezes. All kamailio
>>>     processes are running and eating the whole 4-core processor.
>>>
>>>     Restart of Kamailio solves this problem.
>>>
>>>     Any ideas how to continue with debug?
>>>
>>>     Thanks
>>>
>>>     Efelin
>>>
>>>
>>>     2013/11/15 Daniel-Constantin Mierla <miconda at gmail.com
>>>     <mailto:miconda at gmail.com>>
>>>
>>>         Hello,
>>>
>>>         can you try attached patch?
>>>
>>>         Let me know if all goes fine and I will commit it to the
>>>         repository.
>>>
>>>         Cheers,
>>>         Daniel
>>>
>>>
>>>         On 11/15/13 10:25 AM, Efelin Novak wrote:
>>>>         Hi,
>>>>
>>>>         when I use t_reply("505", "Error"); in my failure route,
>>>>         the response is not forwarded and following is written into
>>>>         a log:
>>>>
>>>>         kamailio[26216]: WARNING: tm [t_lookup.c:1559]: t_unref():
>>>>         WARNING: script writer didn't release transaction
>>>>
>>>>         plus next line is written exactly 416000 times into a log
>>>>         afterwards:
>>>>
>>>>         kamailio[32685]: CRITICAL: dialog [dlg_hash.c:794]:
>>>>         log_next_state_dlg(): bogus event 4 in state 5 for dlg
>>>>         0xb4af6588 [2575:7017] with clid '121d44f0-6555f4c8' and
>>>>         tags 'd12546d053aadc68o2' ''
>>>>
>>>>         My point is to change the incoming code from users and
>>>>         append a Q.850 reason code.
>>>>         Is there any other way how to do this or a way how to fix this?
>>>>         I'm using Kamilio 4.0.4 on Debian 7.1
>>>>
>>>>
>>>>         The code is as follows:
>>>>         failure_route[MANAGE_FAILURE]
>>>>         {
>>>>                 if (t_is_canceled()) {
>>>>                         exit;
>>>>                 }
>>>>                 if($T_reply_code == 408 && isflagset(10))
>>>>                 {
>>>>                         xlog("Ringing timeout");
>>>>         append_to_reply("Reason: Q.850;cause=28\r\n");
>>>>                         t_reply("505", "Error");
>>>>                 }
>>>>         }
>>>>
>>>>
>>>>         _______________________________________________
>>>>         SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
>>>>         sr-users at lists.sip-router.org  <mailto:sr-users at lists.sip-router.org>
>>>>         http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>>         -- 
>>>         Daniel-Constantin Mierla -http://www.asipto.com
>>>         http://twitter.com/#!/miconda  <http://twitter.com/#%21/miconda>  -http://www.linkedin.com/in/miconda
>>>         Kamailio Advanced Trainings - Berlin, Nov 25-28
>>>            - more details about Kamailio trainings athttp://www.asipto.com  -
>>>
>>>
>>>         _______________________________________________
>>>         SIP Express Router (SER) and Kamailio (OpenSER) - sr-users
>>>         mailing list
>>>         sr-users at lists.sip-router.org
>>>         <mailto:sr-users at lists.sip-router.org>
>>>         http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>>
>>
>>     -- 
>>     Daniel-Constantin Mierla -http://www.asipto.com
>>     http://twitter.com/#!/miconda  <http://twitter.com/#%21/miconda>  -http://www.linkedin.com/in/miconda
>>     Kamailio Advanced Trainings - Berlin, Nov 25-28
>>        - more details about Kamailio trainings athttp://www.asipto.com  -
>>
>>
>
> -- 
> Daniel-Constantin Mierla -http://www.asipto.com
> http://twitter.com/#!/miconda  -http://www.linkedin.com/in/miconda

-- 
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-users/attachments/20131227/b70724fc/attachment-0001.html>


More information about the sr-users mailing list