[Serusers] SER Reports "out of memory"

Java Rockx javarockx at gmail.com
Mon May 30 21:26:15 CEST 2005


Jan,

Great! I can only imagine that you're very busy; however, do you have any
sort of time frame for this to be committed to CVS?

Regards,
Paul

On 5/30/05, Jan Janak <jan at iptel.org> wrote:
> 
> On 30-05-2005 14:11, Greger V. Teigre wrote:
> > See inline.
> > Jiri Kuthan wrote:
> > >At 09:24 AM 5/30/2005, Greger V. Teigre wrote:
> > >
> > >[...]
> > >>>* when ser starts up, usrloc is "lazy-loaded"
> > >>>* if a usrloc record is looked up in the cache and is __NOT__ found,
> > >>>then MySQL will be queried. If found in MySQL, the usrloc record
> > >>>will be put into the cache for future lookups (see the sketch below)
> > >>>
> > >>>By doing these two things we should not have a problem with
> > >>>excessively large subscriber bases.
> > >>>
> > >>>Thoughts?
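> > >>>
> > >>>Very roughly, something like this (a C sketch with made-up helper
> > >>>names, not the actual usrloc API):
> > >>>
> > >>>/* cache_get/cache_put stand for the in-memory hash, db_load_record()
> > >>> * for a SELECT on the location table -- all placeholder names. */
> > >>>urecord_t *lookup_usrloc(const char *aor)
> > >>>{
> > >>>    urecord_t *rec;
> > >>>
> > >>>    rec = cache_get(aor);        /* 1. try the in-memory cache    */
> > >>>    if (rec != NULL)
> > >>>        return rec;              /*    hit -> done, no DB access  */
> > >>>
> > >>>    rec = db_load_record(aor);   /* 2. miss -> query MySQL        */
> > >>>    if (rec == NULL)
> > >>>        return NULL;             /*    not in the DB either       */
> > >>>
> > >>>    cache_put(aor, rec);         /* 3. cache it for later lookups */
> > >>>    return rec;
> > >>>}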
> > >>
> > >>Makes sense. This is how Berkeley DB and many other DBs work. In
> > >>fact, the best approach would be to build an abstracting cache layer
> > >>around all the query functions that read data from the DB. This way
> > >>you would get the optimum performance/scalability.
> > >
> > >I have to admit I am not sufficiently familiar with BDB. If I
> > >understand it correctly, they do configurable in-memory caching and they
> > >also support some kind of master-slave replication. I am not sure
> > >though how this scales...(20 SERs with 20 BDBs, one of them master
> > >and replicating UsrLoc changes to 19 slaves who are all able to
> > >identify inconsistent cache?)
> > >
> > >I mean the structural problem here is dealing with read/write-intensive
> > >usrloc operations while still wanting to replicate for reliability.
> > >There is a variety of algorithms to deal with it and I don't know in
> > >detail what the respective DB systems actually do.
> >
> > I'm not proposing to use BDB, it was just an example. Databases are very
> > good at replication; even two-way replication can be done quite
> > efficiently through locking etc. I just took Paul's setup with a cluster
> > back-end as a given and wrote my comments based on that...
> >
> > Thinking a bit wider and building on your comments, Jiri:
> > The challenge, I think, is to handle the following things in any likely
> > deployment scenario:
> > 1. Usrloc writes to cache vs. DB
> > 2. Replication of usrloc, multiple DBs vs. cluster, across LAN or WAN
> > 3. Memory caching management (inconsistencies etc)
> >
> > For the sake of the readers, here is how I understand SER's operations
> > today:
> > 1. Usrloc is always written to the cache; the DB write is controlled
> > through the write-through parameter
> > 2. Replication is handled by t_replicate
> > 3. Management of the cache is not needed, as the cache is always updated.
> > However, an update made directly to the DB (and thus a stale cache) will
> > not be detected
> 
> I am working on that already. The entries in the usrloc cache will
> have an additional expires value and if that value expires then the
> usrloc code will refresh the entry from the database. Also, there will
> be no full cache anymore -- usrloc will cache only a portion of the whole
> location database and old entries will be evicted using an LRU scheme.
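> 
> Schematically, a cached entry would carry something like this (a rough
> sketch only -- the real structures will look different):
> 
> /* Per-entry cache expiration plus LRU links; field and type names
>  * are invented for illustration. */
> struct ucache_entry {
>     char   *aor;                    /* address-of-record (hash key)    */
>     void   *contacts;               /* cached contact set              */
>     time_t  cache_expires;          /* when this cached copy goes stale*/
>     int     is_negative;            /* 1 = "not in the DB" marker      */
>     struct ucache_entry *lru_prev;  /* doubly linked LRU list          */
>     struct ucache_entry *lru_next;
> };
> 
> /* Usable as-is, or do we have to go back to the database? */
> static int cache_entry_fresh(const struct ucache_entry *e, time_t now)
> {
>     return e != NULL && e->cache_expires > now;
> }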
> 
> The cache will be empty upon startup. When SER calls lookup then
> usrloc will search the cache -- if there is no entry or if it is
> expired then it will load it from the database and store it in the cache
> for a limited period of time. If there is no entry in the database then
> it will create a negative cache entry (to limit the number of
> unsuccessful database queries).
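> 
> Building on the entry sketched above, a negative entry is just the same
> slot with the is_negative flag set, e.g. (sketch; the TTL values and the
> cache_put_*/db_load_record helpers are invented):
> 
> #define POS_CACHE_TTL 300  /* seconds a loaded record stays cached    */
> #define NEG_CACHE_TTL 30   /* seconds a "no such user" result is kept */
> 
> urecord_t *lookup_or_load(const char *aor, time_t now)
> {
>     struct ucache_entry *e = cache_get_entry(aor);
> 
>     if (cache_entry_fresh(e, now))
>         return e->is_negative ? NULL : e->contacts;
> 
>     urecord_t *rec = db_load_record(aor);              /* placeholder DB call */
>     if (rec == NULL) {
>         cache_put_negative(aor, now + NEG_CACHE_TTL);  /* remember the miss   */
>         return NULL;
>     }
>     cache_put_record(aor, rec, now + POS_CACHE_TTL);
>     return rec;
> }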
> 
> Database updates will not assume anything about the state of the
> database, so it should not matter if the entry still exists / does not
> exist / has been modified...
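> 
> In SQL terms the write then behaves like an unconditional "upsert",
> roughly along these lines (schematic only; the column list and db_exec()
> are not the real db API, and escaping is omitted):
> 
> #include <stdio.h>   /* snprintf */
> 
> static int store_binding(const char *user, const char *contact, long expires)
> {
>     char query[512];
> 
>     /* MySQL REPLACE acts as delete+insert, so the statement is valid
>      * whether the row still exists, is gone, or has been changed. */
>     snprintf(query, sizeof(query),
>              "REPLACE INTO location (username, contact, expires) "
>              "VALUES ('%s', '%s', %ld)", user, contact, expires);
> 
>     return db_exec(query);   /* placeholder for the db module call */
> }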
> 
> There is one drawback though -- nathelper as it is implemented right
> now will not work anymore -- we would need to rewrite it to use the
> contents of the database.
> 
> > Here is how I understand Paul's proposal (with my annotated suggestions
> > from my last email :-):
> > 1. Usrloc is always written to the DB; the cache is updated only if the
> > record is already in the cache
> > 2. Replication is handled by the underlying database, across DBs or in a
> > cluster
> > 3. If a usrloc is not found in the cache, the DB is checked. If the cache
> > is full, some mechanism for throwing out a usrloc is devised (see the
> > sketch below)
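> >
> > For point 3, the "throwing out" could simply mean dropping records from
> > the cold end of an LRU list once a configured limit is reached, along
> > these lines (sketch with invented names; only the cache is touched, the
> > DB copy stays):
> >
> > static void cache_enforce_limit(struct ucache *c, unsigned int max)
> > {
> >     while (c->count > max && c->lru_tail != NULL) {
> >         struct ucache_entry *victim = c->lru_tail;  /* least recently used */
> >
> >         c->lru_tail = victim->lru_prev;             /* unlink from the tail */
> >         if (c->lru_tail)
> >             c->lru_tail->lru_next = NULL;
> >         else
> >             c->lru_head = NULL;                     /* cache is now empty   */
> >
> >         hash_remove(c, victim->aor);                /* placeholder removal  */
> >         free_entry(victim);                         /* placeholder free     */
> >         c->count--;
> >     }
> > }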
> >
> > I must admit I often fall for the argument: "let each system do what it
> > is best at."
> > Following that, replication should only be done at the application level
> > if the underlying database is not capable of doing it (if we agree that a
> > DB is good at replication). The only thing I see a DB not being capable
> > of is handling the NAT issues. So, if a given usrloc has to be
> > represented by a different location (ex. the registration server), then
> > the DB cannot do the replication. However, if the NAT issue is handled
> > through some other means, ex. a Call-Id aware LVS with one public IP,
> > then the usrloc should be the same across DBs and the DB should handle
> > the replication.
> 
> Another approach would be to let the user agent handle NATs. Sipura
> phones, for example, can register with two proxy servers.
> 
> > You don't need many subscribers before you'll want redundancy, and as
> > active-passive redundancy is a waste of resources, I believe an upgrade
> > of the replication mechanism is imminent. ;-)
> > I think I have said this before, but this is my enterprise-level "dream"
> > scenario:
> > scenario:
> > 1. Two geographically distributed server centers
> > 2. DNS SRV for load distribution (and possibly segmenting clients
> > through their configurations if they don't support DNS SRV)
> > 3. Each data center has a Call-Id sensitive LVS in front, with one or
> > more servers at the back (a fair-sized LVS box can handle 8,000 UDP
> > packets per second); see the hashing sketch below
> > 4. Each data center either has a DB cluster or two-way SER-based
> > replication
> > 5. The data centers replicate between each other using either DB-based
> > replication or two-way SER-based replication
> > 6. The SER-based replication is an enhanced version of t_replicate()
> > where replication is to a set of servers and replication is ACKed and
> > guaranteed (via a queue). I would suggest using the XMLRPC interface Jan
> > has introduced
> > 7. I think Paul's cache suggestions are good regardless of decisions on
> > replication
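> >
> > To illustrate point 3: "Call-Id sensitive" only requires hashing the
> > Call-ID so that every message of a dialog is dispatched to the same
> > back-end SER, e.g. (sketch; any stable string hash would do):
> >
> > #include <stddef.h>
> >
> > /* djb2 string hash, used here only as an example. */
> > static unsigned long djb2_hash(const char *s)
> > {
> >     unsigned long h = 5381;
> >     while (*s)
> >         h = h * 33 + (unsigned char)*s++;
> >     return h;
> > }
> >
> > /* Map a Call-ID onto one of n_servers back-end boxes. */
> > static size_t pick_backend(const char *call_id, size_t n_servers)
> > {
> >     return (size_t)(djb2_hash(call_id) % n_servers);
> > }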
> >
> > An entry-level scenario where the same box runs LVS, SER, and the DB
> > (you can quickly add new boxes) has a very low cost.
> >
> > >> However, there is one more thing: You need to decide on an
> > >>algorithm for selecting a usrloc record to replace when the cache is
> > >>full. Do you store extra info in memory for each usrloc to make the
> > >>right decision (ex. based on the number of lookups)?
> > >
> > >You may also purchase more memory :)
> >
> > Do you suggest that no mechanism should be devised for when the cache
> > limit is hit? ;-) Then maybe I can suggest an email alert to the operator
> > when a certain amount of the cache is full... :-D I trust my people to
> > act fast and appropriately, but not that fast and appropriately!
> >
> > g-)
> >
> > _______________________________________________
> > Serusers mailing list
> > serusers at lists.iptel.org
> > http://lists.iptel.org/mailman/listinfo/serusers
>

