[SR-Users] possible bug with dialog module

Kelvin Chua kelchy at gmail.com
Thu Apr 22 12:11:06 CEST 2010


these Cisco ATAs are very old. we maintain around 300 of them, and we often
see corrupted SIP messages. some of these are fixed by a simple reboot,
others by upgrading the firmware.

most of the time, kamailio manages to stay afloat in the midst of all this
corruption, but in some instances, like the one i'm reporting, it didn't.

i am currently getting the latest git and will test.
the last crash before yesterday was on march 30, and the one before that on
january 5.
even if i upgrade now, we won't know whether it really works unless we
monitor this for at least 3 months :-)

Kelvin Chua


On Thu, Apr 22, 2010 at 4:47 PM, Daniel-Constantin Mierla <miconda at gmail.com> wrote:

> Hi Timo,
>
> thanks for troubleshooting. I committed the patch that moves setting of
> bind_addr before any error case in populate_leg_info(). I backported to
> kamailio_3.0 branch as well.
>
> Kelvin, can you get the latest git version for branch kamailio_3.0 and
> test?
>
> Thanks,
> Daniel
>
>
> On 4/22/10 1:21 AM, Timo Reimann wrote:
>
>> Hello,
>>
>>
>> Kelvin Chua wrote:
>>
>>
>>> (gdb) bt
>>> #0  0x00002ab61b62779a in update_dialog_dbinfo (cell=0x2ab61c9100f8) at dlg_db_handler.c:501
>>>
>>>
>> This corresponds to
>>
>>   SET_STR_VALUE(values+8, cell->bind_addr[DLG_CALLEE_LEG]->sock_str);
>>
>> so presumably sip-router crashes when it tries to access the sock_str of
>> the callee's bound address...
>>
>>
>>
>>
>>> #1  0x00002ab61b628ea8 in dlg_onreply (t=0x7d5228, type=<value optimized out>, param=<value optimized out>) at dlg_handlers.c:361
>>> #2  0x00002ab617965505 in run_trans_callbacks_internal (cb_lst=0x2ab61c938830, type=128, trans=0x2ab61c9387c0,
>>>
>>> (gdb) print cell
>>> $1 = (struct dlg_cell *) 0x2ab61c9100f8
>>>
>>>
>>>
>>> (gdb) print *cell
>>> 0}}, bind_addr = {0x88c580, 0x0},
>>>   cbs = {first = 0x0, types = 0}, profile_links = 0x0}
>>>
>>>
>> ... as supported by the fact that bind_addr's second element
>> (DLG_CALLEE_LEG) is 0.
>>
>> Why does the segfault happen?
>>
>> Let's trace the code path: The initial error message
>>
>> "bad sip message or missing Contact hdr"
>>
>> occurred in dlg_handlers.c, line 218, which makes the enclosing function
>> "populate_leg_info" return prematurely (by means of "goto error0").
>> Specifically, this means that the assignment at the end of the function,
>> on line 272,
>>
>>   dlg->bind_addr[leg] = msg->rcv.bind_address;
>>
>> is never executed, leaving the callee's bound address for this dialog
>> unassigned. (This is the only place where the bound address is
>> assigned.) Instead, execution drops back to the "dlg_onreply" function
>> and proceeds to line 361, which calls the database update function:
>>
>>   update_dialog_dbinfo(dlg);
>>
>> which directly leads to the segfaulting code location.
>>
>>
>> AFAICS, "update_dialog_dbinfo" dereferences a possibly null pointer only
>> for the dialog field in question, so one way to prevent the segfault is
>> to move the bound address assignment ahead of any code in the function
>> that can fail. This should make sure that a valid bound address is
>> stored in every case.
>>
>>
>> Cheers,
>>
>> --Timo
>>
>>
>
> --
> Daniel-Constantin Mierla * http://www.asipto.com/ *
> http://twitter.com/miconda *
> http://www.linkedin.com/in/danielconstantinmierla
>
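
To make the ordering issue Timo describes concrete, here is a minimal,
self-contained sketch in C. It is not the Kamailio source: the struct
layouts, the "contact" field, the two populate_leg_info_* variants and the
helper callee_socket_known_after_bad_reply() are simplified stand-ins
invented for illustration. Only the names populate_leg_info, bind_addr,
DLG_CALLEE_LEG, sock_str and the "goto error0" exit are taken from the
thread.

```c
#include <stddef.h>

/* Simplified stand-ins, NOT the real Kamailio definitions. */
enum { DLG_CALLER_LEG = 0, DLG_CALLEE_LEG = 1 };

struct socket_info {
    const char *sock_str;
};

struct sip_msg {
    const char *contact;              /* NULL models a missing Contact hdr */
    struct socket_info *bind_address; /* socket the message arrived on */
};

struct dlg_cell {
    const char *contact[2];
    struct socket_info *bind_addr[2];
};

/* Ordering described in the analysis: the socket is stored last, so any
 * earlier "goto error0" leaves bind_addr[leg] unset. */
static int populate_leg_info_late(struct dlg_cell *dlg,
                                  const struct sip_msg *msg, int leg)
{
    if (msg->contact == NULL)
        goto error0;
    dlg->contact[leg] = msg->contact;
    dlg->bind_addr[leg] = msg->bind_address;
    return 0;
error0:
    return -1;
}

/* Ordering after the fix: the socket is stored before anything can fail. */
static int populate_leg_info_early(struct dlg_cell *dlg,
                                   const struct sip_msg *msg, int leg)
{
    dlg->bind_addr[leg] = msg->bind_address;
    if (msg->contact == NULL)
        goto error0;
    dlg->contact[leg] = msg->contact;
    return 0;
error0:
    return -1;
}

/* Hypothetical helper: feeds a reply without a Contact header through one
 * of the two variants and reports whether the callee socket is known
 * afterwards (1) or still NULL (0). */
int callee_socket_known_after_bad_reply(int patched)
{
    struct socket_info sock = { "udp:192.0.2.10:5060" };
    struct sip_msg reply = { NULL, &sock };
    struct dlg_cell dlg = { { NULL, NULL }, { NULL, NULL } };

    if (patched)
        (void)populate_leg_info_early(&dlg, &reply, DLG_CALLEE_LEG);
    else
        (void)populate_leg_info_late(&dlg, &reply, DLG_CALLEE_LEG);

    return dlg.bind_addr[DLG_CALLEE_LEG] != NULL;
}
```

Calling callee_socket_known_after_bad_reply(0) models the state seen in the
gdb output (the callee slot of bind_addr is still 0, so a later
bind_addr[DLG_CALLEE_LEG]->sock_str would dereference NULL), while passing 1
models the reordered assignment. Both variants still return -1 for the
malformed reply; only the stored socket differs.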