CDR/CEL Processing – Climbing the Beanstalk

By Nir Simionovich

One of the most annoying tasks in Asterisk (or in VoIP in general) is processing CDRs and events. Why is it so annoying? Depending on your infrastructure, problems can arise from any of the following:

  • Row locking within the database
  • Handling of multiple input points
  • Handling a constantly changing data set
  • Split brain processing issues with clusters
  • Replication of data records between multiple data processing points
  • Data synchronizing
  • Data uniqueness and data consistency
  • etc.

The primary issue is to make sure that when you process an event or a CDR, it is processed once and only once.

Asterisk provides multiple backends for processing CDR and CEL records. These include log files, MySQL, PostgreSQL, ODBC, and others. However, all of these are more or less prone to the same problems listed above.

When we started developing our cloud platform, cloudonix.io, we were confronted with the following issues for processing CDR/CEL records:

  • Our platform had an unknown number of Asterisk servers, autoscaling at unknown times.
  • Our platform was split into multiple cloud zones and regions.
  • Our platform was spitting out multiple event types that required different types of processing.

So we took the usual route: logging into an RDBMS of some sort. The result was annoying. Processing was slow, record locking was always an issue and, worst of all, providing multiple processing points to handle the events became increasingly difficult. The solution: beanstalkd (http://kr.github.io/beanstalkd).

For lack of a better description, beanstalkd is very simply a job queue. It accepts a free-form string into a queue (called a ‘tube’ in the beanstalkd world). Jobs can be pushed into the queue (put), extracted from the queue (reserve), deleted from the queue (delete), and more. Our idea for operation was really simple: CDR records can be processed relatively slowly, while CEL records need to be processed relatively quickly. We want multiple clients to be able to reserve work from the queue, process each job exactly once, and then either delete the job or return it to the queue upon failure.
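To make the put/reserve/delete vocabulary concrete, here is a minimal Python sketch of how those commands look on the wire in beanstalkd's plain-text protocol. The helper names (encode_put, parse_reserved and so on), the default values and the sample payload are illustrative assumptions and not part of any library; a real deployment would normally rely on an existing client library instead.

```python
from typing import Tuple


def encode_put(body: bytes, priority: int = 99, delay: int = 0, ttr: int = 60) -> bytes:
    # Wire format: put <pri> <delay> <ttr> <bytes> CRLF <data> CRLF
    return b"put %d %d %d %d\r\n%s\r\n" % (priority, delay, ttr, len(body), body)


def encode_reserve() -> bytes:
    # Ask the server for the next ready job on the watched tubes.
    return b"reserve\r\n"


def encode_delete(job_id: int) -> bytes:
    # Remove a job for good once it has been processed.
    return b"delete %d\r\n" % job_id


def encode_release(job_id: int, priority: int = 99, delay: int = 0) -> bytes:
    # Hand a reserved job back to the ready queue.
    return b"release %d %d %d\r\n" % (job_id, priority, delay)


def parse_reserved(reply: bytes) -> Tuple[int, bytes]:
    # Reply format: RESERVED <id> <bytes> CRLF <data> CRLF
    header, _, rest = reply.partition(b"\r\n")
    words = header.split()
    if len(words) != 3 or words[0] != b"RESERVED":
        raise ValueError("unexpected reply: %r" % header)
    job_id, size = int(words[1]), int(words[2])
    return job_id, rest[:size]


# Hypothetical payload, only for illustration.
wire = encode_put(b'{"src":"100"}', priority=10)
job_id, body = parse_reserved(b'RESERVED 7 13\r\n{"src":"100"}\r\n')
```

Because the protocol is this simple, it is easy to inspect traffic by hand while debugging a consumer.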

Thus, cdr_beanstalkd and cel_beanstalkd were born.

The beanstalk backends are fairly similar in nature, and provide a means of inserting CDR/CEL events directly from Asterisk to a remote beanstalkd server. So, how do we work with beanstalkd? Let’s have a look.

Step 1: Install the beanstalkd server

Some of the major distributions provide beanstalkd as a precompiled package, so try that first; otherwise, download the source. Pay attention to the installation and configuration, as you will need those details later on. Normally, the installation will make the beanstalkd server available on port 11300.

Note: beanstalkd doesn’t include any type of authorization or security, so make sure you block this port from the outside world. Remember, security is on you!

Step 2: Configure cdr_beanstalkd.conf

The following sample shows the cdr_beanstalkd.conf file:
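As a rough sketch of what that file can look like: the option names host, port, tube and priority, and the values shown, are assumptions based on the parameters discussed in this post, so check them against the sample file shipped with your Asterisk version.

```ini
;[general]
; Hostname or IP of the beanstalkd server (assumed option name)
;host = 127.0.0.1
; Port of the beanstalkd server (assumed option name)
;port = 11300
; Tube that CDR jobs are inserted into
;tube = asterisk-cdr
; Job priority; a lower number means a higher priority (value is illustrative)
;priority = 99
```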

You will need to uncomment the lines to activate the backend. The important configuration parameters here are tube and priority. The backend inserts events into a single assigned tube. That means that all CDR records will be inserted into the asterisk-cdr tube. When a job is inserted, you can assign a priority. The lower the number, the higher the priority. The priority mechanism is especially useful when you have multiple Asterisk servers with varying functionality, which require different levels of priority for processing. For example, an Asterisk server that deals with wholesale routing requires a higher priority than one doing voicemail.
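The effect of the priority setting can be illustrated with a toy model in Python. This is not beanstalkd itself, only a sketch of the ordering described above: the job with the lowest priority number is handed out first, and jobs with equal priority come out in insertion order. The class name ReadyQueue and the priority values 10 and 99 are hypothetical.

```python
import heapq
from itertools import count


class ReadyQueue:
    """Toy stand-in for a tube's ready queue (not a real beanstalkd client)."""

    def __init__(self):
        self._heap = []
        self._ids = count(1)  # increasing job ids keep equal priorities in FIFO order

    def put(self, body, priority=99):
        job_id = next(self._ids)
        heapq.heappush(self._heap, (priority, job_id, body))
        return job_id

    def reserve(self):
        # Pops the smallest (priority, job_id) pair, i.e. the most urgent job.
        _priority, job_id, body = heapq.heappop(self._heap)
        return job_id, body


queue = ReadyQueue()
queue.put("voicemail CDR", priority=99)      # low-priority server
queue.put("wholesale CDR", priority=10)      # high-priority server
queue.put("another voicemail CDR", priority=99)

order = [queue.reserve()[1] for _ in range(3)]
print(order)
```

Even though the wholesale record was inserted second, it is the first one a worker receives.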

Step 3: Test your configuration

Using the console, make sure your new CDR backend is working:
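A complementary check can be made from the beanstalkd side by asking the server for statistics on the tube, using the protocol's stats-tube command. The following Python sketch assumes a server on 127.0.0.1:11300 and the asterisk-cdr tube; the function names parse_stats and tube_stats are hypothetical helpers, not part of any library.

```python
import socket


def parse_stats(yaml_body: str) -> dict:
    # The stats reply is a flat YAML mapping, one "key: value" pair per line.
    stats = {}
    for line in yaml_body.splitlines():
        if ":" not in line:
            continue  # skips the leading '---' document marker
        key, _, value = line.partition(":")
        value = value.strip()
        stats[key.strip()] = int(value) if value.isdigit() else value
    return stats


def tube_stats(tube="asterisk-cdr", host="127.0.0.1", port=11300) -> dict:
    # Sends "stats-tube <tube>" and expects "OK <bytes>" followed by the YAML body.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(("stats-tube %s\r\n" % tube).encode("ascii"))
        reader = sock.makefile("rb")
        header = reader.readline().split()
        if len(header) != 2 or header[0] != b"OK":
            raise RuntimeError("unexpected reply: %r" % header)
        body = reader.read(int(header[1]))
        return parse_stats(body.decode("utf-8"))


# Parsing a shortened, hand-written reply body (no server needed):
sample = parse_stats("---\nname: asterisk-cdr\ncurrent-jobs-ready: 3\n")
```

After placing a test call, calling tube_stats() against a live server should show the current-jobs-ready counter increasing if CDR jobs are reaching the tube and nothing is consuming them yet.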

A similar configuration is available for cel_beanstalkd.conf and the same concepts apply.

Once your configuration is alive and tested, you can use the beanstalkd client libraries to start processing your jobs. One of my favorite libraries for PHP is called Pheanstalk (https://github.com/pda/pheanstalk). It’s a little old, but provides all the functionality that you could want. The overall beanstalkd client life cycle can be defined as the following:

This means that in order to process each job uniquely, you need to reserve a job and then either delete it when processing has finished or release the reservation if processing failed. Releasing returns the job to the queue, so that it can be reserved and processed again. The nice thing is that you don’t need to worry about queue integrity or priorities; beanstalkd takes care of that for you. In addition, you can configure beanstalkd to persist your queue to disk, so that its state survives a restart. Thus, you can continue your processing and work, even after a failure.
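The reserve, delete and release cycle can be sketched as a small worker loop. The Python example below uses a hypothetical in-memory stand-in (InMemoryTube) instead of a real beanstalkd connection so that it runs on its own; with a real client library the three calls would map onto that library's reserve, delete and release operations. The handler and job bodies are made up for illustration.

```python
from collections import deque


class InMemoryTube:
    """Hypothetical stand-in for a beanstalkd client; no server involved."""

    def __init__(self, bodies):
        self.ready = deque(enumerate(bodies, start=1))
        self.reserved = {}

    def reserve(self):
        if not self.ready:
            return None
        job_id, body = self.ready.popleft()
        self.reserved[job_id] = body  # invisible to other workers while reserved
        return job_id, body

    def delete(self, job_id):
        del self.reserved[job_id]

    def release(self, job_id):
        # Put the job back so that it can be reserved again.
        self.ready.append((job_id, self.reserved.pop(job_id)))


def work_once(client, handler):
    """Process one job; return False when the queue is empty."""
    job = client.reserve()
    if job is None:
        return False
    job_id, body = job
    try:
        handler(body)
    except Exception:
        client.release(job_id)  # processing failed: return the job to the queue
    else:
        client.delete(job_id)   # processing succeeded: the job is done for good
    return True


processed, failed_once = [], set()


def flaky_handler(body):
    # Simulates a temporary failure on the first attempt at "cdr-1".
    if body == "cdr-1" and body not in failed_once:
        failed_once.add(body)
        raise RuntimeError("database unavailable")
    processed.append(body)


tube = InMemoryTube(["cdr-1", "cdr-2"])
while work_once(tube, flaky_handler):
    pass
print(processed)
```

The failed record is not lost: it is released, picked up again after the other job, and deleted only once it has been handled successfully.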

There Are 4 Comments

  • I noticed that the code is available on GitHub, but not in the Asterisk source package.

    Is it still in development? When it will be distributed with the code?

    Regards,
    Marcelo

  • marek cervenka says:

    why you picked beanstalkd ? there is no development from 2014
    can you compare it to apache kafka ?(pros/cons)

    • We originally created the first version of this CDR handler back in 2010, for usage with Asterisk 1.6. When created, it was used internally for one of our side projects, which was later on abandoned and was unmaintained for a while. Back then, the contribution process wasn’t as straightforward as today, thus, it never got properly submitted up-stream.

      Recently, we’ve updated the code base to support latest Asterisk, and thus, at the same time we’ve decided to fully release it to the master branch. I admit, it is a bit dated as a technology tool – but, having said that, it is still a viable tool to use. Even PHPAGI hadn’t been updated in years, yet, it is still the predominant tool with AGI developers.

      As I see it, stability is the most important factor – as a telephone needs to be, first and foremost, stable. Not that Kafka isn’t stable, the technical footprint that Kafka requires is far bigger than Beanstalk’s, and thus, introduce potential instability factors to the system. Personally, I don’t believe that every new/shiny/cool tech should be used, just off the bat. I prefer my technology a bit more seasoned.

      Internally, as a company, we look at technology and evaluate it all the time. For example, we’ve recently reviewed the various ARI libraries and deemed all of them as – improperly maintained, lacking proper documentation, lacking proper deployment guidelines and more. Thus, we decided to create a new one. Yes, I am the creator of PHP-ARI, but we’ve decided not to use it and write something new (which will be released soon enough), simply because it doesn’t fit our new use-case.

      The comparison with Kafka is pointless – these are two fundamentally different tools.
