<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Hello Jason,<br>
    <br>
    I pushed a patch that tries to fix this case; it is only on the git
    master branch. Can you test it? If all goes fine, we can consider
    backporting it.<br>
    <br>
    Cheers,<br>
    Daniel<br>
    <br>
    <div class="moz-cite-prefix">On 09/04/14 23:26, Jason Penton wrote:<br>
    </div>
    <blockquote
cite="mid:CALoGXNXkRu5OqC_eriqWD72=WaYXNMaLxL-jU836=Os5cY9NhA@mail.gmail.com"
      type="cite">
      <div dir="ltr">Hey Daniel,
        <div><br>
        </div>
        <div>nothing extraordinary...</div>
        <div><br>
        </div>
        <div>
          <div># -- TM params --</div>
          <div>modparam("tm", "fr_timer", 20000);</div>
          <div>modparam("tm", "fr_inv_timer", 10000)</div>
        </div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>Cheers</div>
        <div>Jason</div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Wed, Apr 9, 2014 at 10:32 PM, Jason
          Penton <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:jason.penton@gmail.com" target="_blank">jason.penton@gmail.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">Hey Daniel,
              <div><br>
              </div>
              <div>Yes, I did a test with a very basic config file and I
                am not able to reproduce it. However, with my *complex* cfg
                file I can reproduce it every time. Tomorrow I will compare
                what is different and report back... hopefully with a fix
                ;)</div>
              <div><br>
              </div>
              <div>Here is the backtrace of the timer process deadlocking itself:</div>
              <div><br>
              </div>
              <div>
                <div>#0  syscall () at
                  ../sysdeps/unix/sysv/linux/x86_64/syscall.S:39</div>
                <div>#1  0x00007f5009f22004 in futex_get
                  (lock=0x7f4fc55030d8) at ../../mem/../futexlock.h:123</div>
                <div>#2  0x00007f5009f223e1 in _lock (s=0x7f4fc55030d8,
                  file=0x7f5009f90fd1 "t_cancel.c",
                  function=0x7f5009f91980 "cancel_branch", line=250) at
                  lock.h:99</div>
                <div>#3  0x00007f5009f23271 in cancel_branch
                  (t=0x7f4fc5501b40, branch=0, reason=0x7fff646d03a8,
                  flags=3) at t_cancel.c:250</div>
                <div>#4  0x00007f5009f22c02 in cancel_uacs
                  (t=0x7f4fc5501b40, cancel_data=0x7fff646d03a0,
                  flags=1) at t_cancel.c:123</div>
                <div>#5  0x00007f5009f718c4 in _reply_light
                  (trans=0x7f4fc5501b40, </div>
                <div>    buf=0x7f500a24dc68 "SIP/2.0 500 Server error on
                  LIR select next S-CSCF\r\nVia: SIP/2.0/UDP
                  10.0.1.167:6060;branch=z9hG4bKb7.2ae09f29ffbd0034cd6d58483053603b.1\r\nVia:
                  SIP/2.0/UDP
                  10.0.1.166:4060;branch=z9hG4bKb7.3faa03ddea80"...,
                  len=778, code=500, to_tag=0x7f500a1c7ae0
                  "c82b15d7f12ef185f95fe4945457d449-8bab",
                  to_tag_len=37, lock=0, bm=0x7fff646d0b60) at
                  t_reply.c:660</div>
                <div>#6  0x00007f5009f7244c in _reply
                  (trans=0x7f4fc5501b40, p_msg=0x7f500a1c6bc0, code=500,
                  text=0x7f500a249a48 "Server error on LIR select next
                  S-CSCF", lock=0) at t_reply.c:795</div>
                <div>#7  0x00007f5009f76436 in t_reply_unsafe
                  (t=0x7f4fc5501b40, p_msg=0x7f500a1c6bc0, code=500,
                  text=0x7f500a249a48 "Server error on LIR select next
                  S-CSCF") at t_reply.c:1643</div>
                <div>#8  0x00007f5009f57621 in w_t_reply
                  (msg=0x7f500a1c6bc0, p1=0x7f500a2497d8
                  "\340\332$\nP\177", p2=0x7f500a249870 "h\321$\nP\177")
                  at tm.c:1324</div>
                <div>#9  0x000000000041a700 in do_action
                  (h=0x7fff646d1d30, a=0x7f500a24cee8,
                  msg=0x7f500a1c6bc0) at action.c:1119</div>
                <div>#10 0x0000000000423831 in run_actions
                  (h=0x7fff646d1d30, a=0x7f500a24cee8,
                  msg=0x7f500a1c6bc0) at action.c:1607</div>
                <div>#11 0x000000000041a5a4 in do_action
                  (h=0x7fff646d1d30, a=0x7f500a24d478,
                  msg=0x7f500a1c6bc0) at action.c:1102</div>
                <div>#12 0x0000000000423831 in run_actions
                  (h=0x7fff646d1d30, a=0x7f500a249148,
                  msg=0x7f500a1c6bc0) at action.c:1607</div>
                <div>#13 0x000000000041a54e in do_action
                  (h=0x7fff646d1d30, a=0x7f500a24c500,
                  msg=0x7f500a1c6bc0) at action.c:1098</div>
                <div>#14 0x0000000000423831 in run_actions
                  (h=0x7fff646d1d30, a=0x7f500a247a28,
                  msg=0x7f500a1c6bc0) at action.c:1607</div>
                <div>#15 0x0000000000423fdf in run_top_route
                  (a=0x7f500a247a28, msg=0x7f500a1c6bc0, c=0x0) at
                  action.c:1693</div>
                <div>#16 0x00007f5009f73815 in run_failure_handlers
                  (t=0x7f4fc5501b40, rpl=0xffffffffffffffff, code=408,
                  extra_flags=96) at t_reply.c:1061</div>
                <div>#17 0x00007f5009f7527a in t_should_relay_response
                  (Trans=0x7f4fc5501b40, new_code=408, branch=1,
                  should_store=0x7fff646d201c,
                  should_relay=0x7fff646d2018,
                  cancel_data=0x7fff646d2070, </div>
                <div>    reply=0xffffffffffffffff) at t_reply.c:1416</div>
                <div>#18 0x00007f5009f76ede in relay_reply
                  (t=0x7f4fc5501b40, p_msg=0xffffffffffffffff, branch=1,
                  msg_status=408, cancel_data=0x7fff646d2070,
                  do_put_on_wait=0) at t_reply.c:1819</div>
                <div>#19 0x00007f5009f44c88 in fake_reply
                  (t=0x7f4fc5501b40, branch=1, code=408) at timer.c:354</div>
                <div>#20 0x00007f5009f450e7 in final_response_handler
                  (r_buf=0x7f4fc5501e60, t=0x7f4fc5501b40) at
                  timer.c:526</div>
                <div>
                  #21 0x00007f5009f4518d in retr_buf_handler
                  (ticks=260027386, tl=0x7f4fc5501e80, p=0x3e8) at
                  timer.c:584</div>
                <div>#22 0x0000000000544119 in timer_list_expire
                  (t=260027386, h=0x7f4fc527cbe0, slow_l=0x7f4fc527cdf0,
                  slow_mark=0) at timer.c:894</div>
                <div>#23 0x0000000000544418 in timer_handler () at
                  timer.c:959</div>
                <div>#24 0x00000000005446b2 in timer_main () at
                  timer.c:998</div>
                <div>#25 0x0000000000471ddf in main_loop () at
                  main.c:1689</div>
              </div>
              <div><br>
              </div>
            </div>
            <div class="HOEnZb">
              <div class="h5">
                <div class="gmail_extra"><br>
                  <br>
                  <div class="gmail_quote">On Wed, Apr 9, 2014 at 9:34
                    PM, Daniel-Constantin Mierla <span dir="ltr">&lt;<a
                        moz-do-not-send="true"
                        href="mailto:miconda@gmail.com" target="_blank">miconda@gmail.com</a>&gt;</span>
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div bgcolor="#FFFFFF" text="#000000"> Hello,<br>
                        <br>
                        That should not be a very rare case, and I would
                        have expected it to be caught by now. Anyhow, this
                        looks easy to reproduce; have you tried it?<br>
                        <br>
                        You can run two Kamailio instances, one relaying the
                        INVITE to the second, which replies with 100;
                        then wait for the timeout on the first instance.
                        You can add some debug messages in the code to
                        see if the lock is taken twice.<br>
                        <br>
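                        A minimal sketch of such a debug check, assuming a
                        hypothetical wrapper (dbg_lock_t, dbg_lock_get and
                        dbg_lock_release are illustrative names, not the
                        Kamailio lock API, and a plain pthread mutex stands
                        in for the real lock): it records which process
                        holds the lock and from where, and reports a second
                        request from the same process instead of blocking.<br>
                        <br>
```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical debug wrapper around a lock; not Kamailio's lock API. */
typedef struct dbg_lock {
    pthread_mutex_t mtx;
    pid_t owner;        /* pid of the current holder, 0 when unlocked */
    const char *where;  /* call site that currently holds the lock */
} dbg_lock_t;

static void dbg_lock_init(dbg_lock_t *l)
{
    pthread_mutex_init(&l->mtx, NULL);
    l->owner = 0;
    l->where = NULL;
}

/* Returns 0 once the lock is held, or -1 if the calling process already
 * holds it (the situation that would otherwise block forever). */
static int dbg_lock_get(dbg_lock_t *l, const char *where)
{
    if (l->owner == getpid()) {
        fprintf(stderr, "double lock: held by %s, requested again by %s\n",
                l->where, where);
        return -1;
    }
    pthread_mutex_lock(&l->mtx);
    l->owner = getpid();
    l->where = where;
    return 0;
}

static void dbg_lock_release(dbg_lock_t *l)
{
    /* Only the recorded holder is expected to release the lock. */
    assert(l->owner == getpid());
    l->owner = 0;
    l->where = NULL;
    pthread_mutex_unlock(&l->mtx);
}
```
                        The owner check is done without holding the lock,
                        which is acceptable for a debugging aid because a
                        process can only ever find its own pid there if it
                        stored it itself.<br>
                        <br>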
                        Cheers,<br>
                        Daniel
                        <div>
                          <div><br>
                            <br>
                            <div>On 09/04/14 17:51, Jason Penton wrote:<br>
                            </div>
                          </div>
                        </div>
                        <blockquote type="cite">
                          <div>
                            <div>
                              <div dir="ltr">Hi All,
                                <div><br>
                                </div>
                                <div>I have been experiencing a deadlock
                                  when a timeout occurs on an INVITE
                                  relayed with t_relay(). Going through
                                  the code, I have noticed a possible
                                  deadlock (when re-entrant locking is
                                  not enabled). Here is my thinking:</div>
                                <div><br>
                                </div>
                                <div>t_should_relay_response() is called
                                  with REPLY_LOCK held when the timer
                                  process fires on the fr_inv_timer (no
                                  response to the relayed INVITE other
                                  than a 100 provisional) and a 408
                                  is generated. However, from within
                                  that function there are calls
                                  to run_failure_handlers(), which in
                                  turn *could* try to lock the reply
                                  again (e.g. somebody having a t_reply()
                                  call in the cfg file, in a failure
                                  route block). This would result in a
                                  second lock attempt on the same
                                  transaction's REPLY_LOCK....<br>
                                </div>
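                                <div>The scenario above can be sketched
                                  in isolation. This is only an
                                  illustration under stated assumptions:
                                  reply_lock, failure_handler() and
                                  timeout_path() are hypothetical
                                  stand-ins, and POSIX mutexes replace
                                  the real tm locks. An error-checking
                                  mutex reports the self-deadlock as
                                  EDEADLK instead of hanging, while a
                                  recursive (re-entrant) mutex lets the
                                  nested lock succeed.</div>
```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the transaction's reply lock. */
static pthread_mutex_t reply_lock;

/* Create the lock with the requested POSIX mutex type. */
static int init_reply_lock(int type)
{
    pthread_mutexattr_t attr;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    rc = pthread_mutex_init(&reply_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Plays the role of a failure route that sends a reply and therefore
 * tries to take the reply lock itself. Returns the lock result. */
static int failure_handler(void)
{
    int rc = pthread_mutex_lock(&reply_lock);
    if (rc == 0)
        pthread_mutex_unlock(&reply_lock);
    return rc;
}

/* Plays the role of the timer path: it holds the reply lock while the
 * failure handler runs. Returns what the nested lock attempt returned. */
static int timeout_path(int type)
{
    int rc;

    if (init_reply_lock(type) != 0)
        return -1;
    rc = pthread_mutex_lock(&reply_lock);
    assert(rc == 0);
    rc = failure_handler();
    pthread_mutex_unlock(&reply_lock);
    pthread_mutex_destroy(&reply_lock);
    return rc;
}
```
                                <div>Calling
                                  timeout_path(PTHREAD_MUTEX_ERRORCHECK)
                                  returns EDEADLK, whereas
                                  timeout_path(PTHREAD_MUTEX_RECURSIVE)
                                  returns 0. With a plain non-recursive
                                  lock the nested attempt would simply
                                  block forever.</div>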
                                <div><br>
                                </div>
                                <div>Has anybody else experienced
                                  something like this?</div>
                                <div><br>
                                </div>
                                <div>this is on master btw.</div>
                                <div><br>
                                </div>
                                <div>Cheers</div>
                                <div>Jason</div>
                              </div>
                            </div>
                          </div>
                          <pre>_______________________________________________
sr-dev mailing list
<a moz-do-not-send="true" href="mailto:sr-dev@lists.sip-router.org" target="_blank">sr-dev@lists.sip-router.org</a>
<a moz-do-not-send="true" href="http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev" target="_blank">http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev</a></pre>
                          </blockquote>
                        <span><font color="#888888"> <br>
                            <pre cols="72">-- 
Daniel-Constantin Mierla - <a moz-do-not-send="true" href="http://www.asipto.com" target="_blank">http://www.asipto.com</a>
<a moz-do-not-send="true" href="http://twitter.com/#%21/miconda" target="_blank">http://twitter.com/#!/miconda</a> - <a moz-do-not-send="true" href="http://www.linkedin.com/in/miconda" target="_blank">http://www.linkedin.com/in/miconda</a></pre>
                          </font></span></div>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Daniel-Constantin Mierla - <a class="moz-txt-link-freetext" href="http://www.asipto.com">http://www.asipto.com</a>
<a class="moz-txt-link-freetext" href="http://twitter.com/#!/miconda">http://twitter.com/#!/miconda</a> - <a class="moz-txt-link-freetext" href="http://www.linkedin.com/in/miconda">http://www.linkedin.com/in/miconda</a></pre>
  </body>
</html>