Thursday, July 26, 2012

Asterisk 11: What did we do to the poor channel object

I often feel like you can group languages into two camps: those that perform memory management for you via some form of garbage collection, and those that don't.  Today, most colleges start students off with a "modern" language that performs garbage collection, like Java.  Regardless of the pros/cons of doing this - and that debate is endless - there is a reason why many newer languages have memory management as a core language feature.


Writing code that effectively manages memory is hard.  (And it also takes longer).


But it isn't just that the implementation is harder.  You probably think of memory leaks when you hear "managing memory is hard".  But it's more than that.


Designing complicated software that effectively manages memory such that the program is maintainable is very hard.


C++ has made a lot of great strides in alleviating some of this, the newest standard in particular.  Shared pointers, an explicit multithreading memory model, and a number of other things being added to the core language help a lot in this regard.  But Asterisk is written in C, so we don't get that stuff.  We have to do it ourselves.


So, consider, for example, a multi-threaded program where two threads have to manipulate an instance of struct foo.


foo.h:

struct foo {
    int bar;
};

struct foo *create_foo(void);
void destroy_foo(struct foo *obj);

Say we have a thread running in foo.c that has to interact with an instance of foo, and a thread running in bar.c that also interacts with the same instance of foo.  The classic design question is: who owns foo?  Does foo.c?  Or does bar.c?

The easiest answer is to say "the thread running in foo.c will own the instance of foo.  bar can interact with the instance, but it has to get it from foo.c."

So how do we then synchronize an instance of foo?


Well, one potential solution would be to put a mutex directly in the foo instance:

struct foo {
   pthread_mutex_t *mutex;
   int bar;
};


But then you have to remember to lock the foo every time you want to use it.  Plus, nothing prevents bar from just freeing the instance of foo whenever it wants, leaving foo.c with nothing but a pointer to some memory it doesn't own!  What's worse, foo.c doesn't know that it doesn't own it - and since the mutex lived in foo, there's no way left to synchronize!  As a single plus, both foo.c and bar.c get full access to the instance, so there's nothing additional you have to write and maintain in foo.h.


If we want to get around the destruction problem, we need some form of reference counting.  When bar is done with its foo instance, it can decrement the ref count - if the ref count hits 0, the object gets nuked.  Same thing with foo - it decrements the ref count when it's done with its instance.  This solves the ownership problem, but it does introduce a number of complexities:

  1. Ref counting leaks - bar bumps the ref count on its instance, but never decrements it.  The instance never goes away.  (More on this in the next blog post!)
  2. Circular references: struct foo has a reference to a ref counted object, and that ref counted object has a reference back to foo.  Neither object will ever go away.
  3. Synchronization can still be tricky - you can't decrement the ref count until you unlock the object, and once you unlock the object someone else can decrement the ref count as well.  Your ref counting library has to be pretty robust to survive those kinds of scenarios.
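To make the idea concrete, here is a minimal sketch of what reference counting foo might look like.  This is illustrative only - the function names are made up, and it uses C11 atomics for brevity, whereas Asterisk's real implementation is the lock-based astobj2 (ao2) API:

```c
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    atomic_int refcount;
    int bar;
};

/* Allocate a foo; the caller owns the one initial reference. */
struct foo *foo_alloc(void)
{
    struct foo *obj = calloc(1, sizeof(*obj));
    if (obj) {
        atomic_init(&obj->refcount, 1);
    }
    return obj;
}

/* Take an additional reference. */
struct foo *foo_ref(struct foo *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
    return obj;
}

/* Drop a reference; whoever drops the last one destroys the object. */
void foo_unref(struct foo *obj)
{
    if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
        free(obj);
    }
}
```

Both foo.c and bar.c simply call foo_unref() when they're done; whoever drops the last reference triggers the destruction, and the "who owns this" question disappears.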
So, ref counting the object and providing a lock in the object can help some of the synchronization problems.  What else can we do?

Well, we could be restrictive and only give a handle to the instance of foo back to bar.  Our foo header file would end up looking something like:

struct foo_handle;

struct foo_handle *get_foo(void);
void destroy_foo_handle(struct foo_handle *obj);
void do_operation(struct foo_handle *obj);

None of the details of foo are exposed in the header - instead, struct foo just lives inside foo.c.  This "opaquification" of foo (in this case, by providing an opaque object foo_handle to reference the actual foo object) is a pretty good solution, but it does mean that all of the operations on foo have to be done through its handle, which could mean a lot of operations defined in the header.

Still, the handle approach is probably better, and safer - it's much clearer who owns the object, and you can define an explicit contract between foo.c and bar.c as to what operations are safe to perform on an instance of foo through its handle foo_handle.

I think that's actually understated.  Think of it this way: on a major project with tens of thousands of lines of code, can you really expect yourself to remember all of the semantics of an object years after you've written it?  Can you really expect someone else to learn it and remember it?  How do you prevent a new coder from introducing subtle race conditions or deadlocks without a contract that they have to explicitly violate (probably by trying to sneak a change to the contract by you in a code review)?

Ever wonder why there have been so many deadlocks in Asterisk?

Consider the channel object - that little object in Asterisk (how many fields are in that struct now?  50?  100?) that is almost always acted on by two threads: the pbx_thread servicing the channel in the dialplan, and the various thread(s) that provide the protocol for the channel technology.  Both threads interact with the channel; both change the state of the channel; both can cause the destruction of the channel.

This is a really long winded way of getting around to what we did to the channel object in Asterisk: we opaquified it.  It's now no longer accessible outside of channel_internal.c - not even channel.c can see it.  Neither can the channel technology implementations.  All of the operations on the channel are hidden behind specific calls that define the contract for that channel.

If you need to know a property of the channel: use one of the ast_channel_* calls.  If you need to change the state of the channel: use one of the ast_channel_* calls.  You get the idea.
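Mechanically, the pattern looks something like the following single-file sketch (the sketch_* function names are invented for illustration; the real accessors are the ast_channel_* family): an opaque declaration plus accessors in the header, with the actual layout confined to one .c file.

```c
#include <stdlib.h>

/* What the public header exposes: an opaque type and its accessors. */
struct ast_channel;   /* layout is hidden from every other file */

struct ast_channel *sketch_channel_alloc(void);
int  sketch_channel_state(const struct ast_channel *chan);
void sketch_channel_state_set(struct ast_channel *chan, int state);

/* What only channel_internal.c sees: the real layout. */
struct ast_channel {
    int state;
    /* ...dozens of other fields in the real thing... */
};

struct ast_channel *sketch_channel_alloc(void)
{
    return calloc(1, sizeof(struct ast_channel));
}

int sketch_channel_state(const struct ast_channel *chan)
{
    return chan->state;
}

void sketch_channel_state_set(struct ast_channel *chan, int state)
{
    /* A real setter is also where a tighter contract can live:
     * validate state transitions, emit events, etc. */
    chan->state = state;
}
```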

Currently, that contract is huge: while we opaquified the channel object, we did not change the semantics of what you can do with that channel.  All of the "dangerous" stuff is still there - you can still change the state willy-nilly; masquerades are still in; you can still cause deadlocks by holding the channel lock while queuing an indication on the channel.  Good times.  But the contract is in place, and it'll probably get tighter in the future.

This is really all a slow migration towards an easier to manage, easier to use channel object.  Before reference counted objects, you had race conditions between who destroyed the channel, and the issue of "who owns this channel" was a real problem.  Now, we're narrowing the scope even further, from "who owns this channel" to "what is allowed on this channel".

With any luck, in a few more major versions, people won't look at the channel object and ask themselves, "do I lock the private first or the channel?  Do I have to bump the ref count on the channel before I unlock it and queue this frame?  Do I even unlock it before queuing this frame?"

One can only hope!

Thursday, July 19, 2012

Asterisk 11: Call ID Logging

So... I wrote a blog post!  On another blog!

Asterisk 11 Development: Call IDs for Asterisk Logs

Pretty exciting stuff for administrators of Asterisk.  If you've ever watched an Asterisk CLI on a busy system, you know that at even low verbosity, the log statements for calls end up so intermingled that making sense of it all takes a lot of skill and patience.

Well, it'll still take that - just a lot less.  All channels now get a unique Call ID that lasts for the lifetime of a call - not just the channel.  When channels are bridged, their Call IDs are merged - if a channel is bridged with another channel and they both have Call IDs, one of them gets retired.  As channels move between calls, the Call IDs are tracked.  In the end, you can either get all the calls that a channel was involved in, or get all of the channels involved in a single call.

This lets you grep through Asterisk log files and get only the information you need - all the way down to DEBUG statements.

I'm hoping to get a blog post put together for the Digium Blog on chan_motif next.  And no, there is no protocol named "motif"... hence why we need the blog post :-)

Thursday, July 12, 2012

A Pluggable Framework for the Asterisk Test Suite - Part 3

Recap:

So, in the last two posts (Part 1 and Part 2), we've covered the motivations for a pluggable framework for the Asterisk Test Suite, and we've taken a test (cdr_userfield) and made it completely configuration driven.  This same approach works well for most of the other CDR tests as well - such as cdr_accountcode, which was also mentioned in the first post.  Both of these tests (and others) can be driven by the configuration we've defined for SimpleTestCase and the configuration we've defined for our CDRModule.

You'll note that I said "most of the other CDR tests".

A Monkey in the Works

Enter the monkey wrench that is the ForkCDR tests.  Let's take a look at one, cdr_fork_end_time.  First, we have some expectations that are added in the constructor of the test:

class ForkEndTimingTest(CDRTestCase):
    def __init__(self):
        CDRTestCase.__init__(self)

        self.add_expectation('cdrtest_local', AsteriskCSVCDRLine(
            destination = "1",
            lastapp = "ForkCDR",
            dcontext = "default",
            dchannel = "SIP/test-.*",
            channel = "Local/1@default-.*",
            disposition = "ANSWERED",
            amaflags = "DOCUMENTATION"))

        self.add_expectation('cdrtest_local', AsteriskCSVCDRLine(
            destination = "1",
            lastapp = "Hangup",
            dcontext = "default",
            dchannel = "SIP/test-.*",
            channel = "Local/1@default-.*",
            disposition = "NO ANSWER",
            amaflags = "DOCUMENTATION"))


Well, that's not too bad - we can easily express that in the configuration for a CDRModule instance, by specifying two expected lines for file cdrtest_local.  Unfortunately, there's more:



    def match_cdrs(self):
        CDRTestCase.match_cdrs(self)
        if not self.passed:
            return

        cdr1 = AsteriskCSVCDR(fn = "%s/%s/cdr-csv/%s.csv" % (self.ast[0].base, self.ast[0].directories['astlogdir'], "cdrtest_local"))

        #check for missing fields
        for cdritem in cdr1:
            if cdritem.duration is None or cdritem.start is None or cdritem.end is None:
                logger.error("EPIC FAILURE: CDR record %s is missing one or more key fields. This should never be able to happen." % cdritem)
                self.passed = False
                return

        # The dialplan is set up so that these two CDRs should each last at least 4 seconds. Giving it wiggle room,
        # we'll just say we want it to be greater than 1 second.
        if ((int(cdr1[0].duration) <= 1) or (int(cdr1[1].duration) <= 1)):
            logger.error("FAILURE: One or both CDRs only lasted a second or less (expected more)")
            self.passed = False
            return

        end = time.strptime(cdr1[0].end, "%Y-%m-%d %H:%M:%S")
        beg = time.strptime(cdr1[1].start, "%Y-%m-%d %H:%M:%S")

        #check that the end of the first CDR occurred within a 1 second split of the beginning of the second CDR
        if (abs(time.mktime(end) - time.mktime(beg)) > 1):
            logger.error("Time discrepancy between end1 and start2 must be one second or less.\n")
            logger.error("Actual times: end cdr1 = %s   begin cdr2 = %s" % (cdr1[0].end, cdr1[1].start))

            self.passed = False
            return



Rut roh Shaggy.  We now have custom logic in method match_cdrs that is pretty specific to this test.  Here is what this method does that our CDRModule doesn't do:
  • It ensures that certain fields are present in each line
  • Verifies that each duration is greater than 1 second
  • Checks that the end of the first CDR entry is within one second of the beginning of the second CDR entry
The first two could be configuration driven in CDRModule; the third, on the other hand, is definitely specific to the usage of ForkCDR.  It doesn't make sense for those behaviors to be exposed to other CDR tests.

So what do we do?
I look cheeky and fun, but I'm going to screw up your well planned architecture

Overriding CDRModule

Ideally, the ForkCDR tests would provide their own module, specific only to those tests.  We wouldn't want to put it in the general purpose library because, well, the logic isn't general purpose - it's specific to only a very few tests.  Let's assume that we'll provide a concrete implementation of CDRModule, and extend it to provide the ForkCDR specific logic.  You'll remember that CDRModule already provides a method, match_cdrs, that does the matching between the actual CDR CSV lines and our expected results.  We can put the logic of cdr_fork_end_time's match_cdrs into its own override of that method, call the base class's implementation, and be done.  It would look something like this:





class ForkCdrModuleEndTime(CDRModule):
    ''' A class that adds some additional CDR checking of the end times on top
    of CDRModule

    In addition to checking the normal expectations, this class also checks
    whether or not the end times of the CDRs are within some period of time
    of each other.

    Note that this class assumes the CDRs are in cdrtest_local.
    '''

    def __init__(self, module_config, test_object):
        super(ForkCdrModuleEndTime, self).__init__(module_config, test_object)

    def match_cdrs(self):
        super(ForkCdrModuleEndTime, self).match_cdrs()

        if (not self.test_object.passed):
            return

        cdr1 = AsteriskCSVCDR(fn = "%s/%s/cdr-csv/%s.csv" %
                (self.test_object.ast[0].base,
                 self.test_object.ast[0].directories['astlogdir'],
                 "cdrtest_local"))

        #check for missing fields
        for cdritem in cdr1:
            if (cdritem.duration is None or
                cdritem.start is None or
                cdritem.end is None):
                logger.error("EPIC FAILURE: CDR record %s is missing one or " \
                             "more key fields. This should never be able to " \
                             "happen." % cdritem)
                self.test_object.passed = False
                return

        # The dialplan is set up so that these two CDRs should each last at
        # least 4 seconds. Giving it wiggle room, we'll just say we want it to
        # be greater than 1 second.
        if ((int(cdr1[0].duration) <= 1) or (int(cdr1[1].duration) <= 1)):
            logger.error("FAILURE: One or both CDRs only lasted a second or " \
                         "less (expected more)")
            self.test_object.passed = False
            return

        end = time.strptime(cdr1[0].end, "%Y-%m-%d %H:%M:%S")
        beg = time.strptime(cdr1[1].start, "%Y-%m-%d %H:%M:%S")

        #check that the end of the first CDR occurred within a 1 second split of
        # the beginning of the second CDR
        if (abs(time.mktime(end) - time.mktime(beg)) > 1):
            logger.error("Time discrepancy between end1 and start2 must be " \
                         "one second or less.\n")
            logger.error("Actual times: end cdr1 = %s   begin cdr2 = %s" %
                         (cdr1[0].end, cdr1[1].start))
            self.test_object.passed = False
            return


But, you ask, what good is this?  How do we get our TestRunner module to load this class, which - currently - is in the same directory as the test-config for this test (tests/cdr/cdr_fork_end_time)?

Dynamic Test Module Importing and Class Instantiation

Big words for a subheading.  What do we mean?

Well, let's assume that ForkCdrModuleEndTime sits in module ForkCdrModule, which lives in folder tests/cdr/cdr_fork_end_time.  Our normal Python libraries, including TestRunner and CdrModule, live in lib/python/asterisk, and the Python search path is usually modified to include lib/python.

If we directly attempted to __import__ ForkCdrModule.ForkCdrModuleEndTime, by specifying its module type/class type in the test-config.yaml, we'd throw an Exception.  That module doesn't live in the lib/python/asterisk package, nor does it exist in any package in the Python search path.  How do we get around this?

  1. Move ForkCdrModule into the lib.python.asterisk package.  Unfortunately, this means that test specific logic ends up in our general purpose libraries, which is what we want to avoid.
  2. Make tests/cdr/cdr_fork_end_time a package by adding an __init__.py file to the directory.  We could then modify the python search path to include that directory before TestRunner starts an import.  This feels less than ideal for two reasons:
    1. We end up having to make a lot of tests packages, which isn't really what we want.  Packages with a single module/class are less than ideal, and this would require a lot of empty __init__.py files.
    2. We'd end up having to modify TestRunner to either statically "know" of the various test modules and their locations, or we'll have to pass that information into TestRunner via some configuration so that TestRunner can dynamically modify the Python search path.  The latter option isn't too bad, but it does feel a little less than optimal.
  3. Create a module importer using the Python imp package and import the module ourselves.  This is more work, but should be the most flexible, and avoids having to turn the tests into Python packages.
Guess which option we went with?
Python module importing using the imp package can be broken down into two major stages:
  1. Implement an object that has a find_module method.  The purpose of this is for your object to determine if it should be responsible for handling the import of the specified module.  If it cannot handle the import, it raises an ImportError during construction.  Otherwise, its find_module method should provide an object to do the actual loading.
  2. Implement an object to load the Python module into memory and return it for execution.  It's up to this object to do all the heavy lifting, whether that's reading the module from some backing storage, converting the contents into interpretable Python code, setting module properties, etc.  In our case, since we still expect the tests to write their modules in Python, this ends up being fairly straightforward.

TestModuleFinder

Code first, then discussion:




class TestModuleFinder(object):
    ''' Determines if a module is a test module that can be loaded '''

    supported_paths = []

    def __init__(self, path_entry):
        if not path_entry in TestModuleFinder.supported_paths:
            raise ImportError()
        LOGGER.debug('TestModuleFinder supports path %s' % path_entry)
        return

    def find_module(self, fullname, suggested_path = None):
        ''' Attempts to find the specified module

        Parameters:
        fullname The full name of the module to load
        suggested_path Optional path to find the module at
        '''
        search_paths = TestModuleFinder.supported_paths
        if suggested_path:
            search_paths.append(suggested_path)
        for path in search_paths:
            if os.path.exists('%s/%s.py' % (path, fullname)):
                return TestModuleLoader(path)
        LOGGER.warn("Unable to find module '%s'" % fullname)
        return None

sys.path_hooks.append(TestModuleFinder)


So, what do we have here?
  • An object that during construction checks to see if a class property, supported_paths, contains the path to some module that is passed to the constructor.  If supported_paths contains the path we don't throw an ImportError - otherwise, we do.
  • A method, find_module, that looks for the module along the supported_paths (plus an optional path provided to the method).  If we find a file with the extension '.py' for that module in one of those paths, we return a new TestModuleLoader instance, passing it the path where the module was found.  Otherwise, we return None.
Not too hard.  You'll notice that we add the type of the object to the system path_hooks.  This lets Python know that when an import occurs, it should create an instance of this type to aid in the importing.

Onward!

TestModuleLoader

Again: code!


class TestModuleLoader(object):
    ''' Loads modules defined in the tests '''

    def __init__(self, path_entry):
        ''' Constructor

        Parameters:
        path_entry The path the module is located at
        '''
        self._path_entry = path_entry

    def _get_filename(self, fullname):
        return '%s/%s.py' % (self._path_entry, fullname)

    def load_module(self, fullname):
        ''' Load the module into memory

        Parameters:
        fullname The full name of the module to load
        '''
        if fullname in sys.modules:
            mod = sys.modules[fullname]
        else:
            mod = sys.modules.setdefault(fullname,
                imp.load_source(fullname, self._get_filename(fullname)))

        return mod


So, since our file should still be valid Python code, the actual TestModuleLoader class ends up being very simple as well.  We have a method, load_module, that will be called when the module should be loaded into memory.  When this happens, we first check to see if the name of the module has already been added to the system modules - if so, we simply return the corresponding module.

If not, we use the imp package to load the module into memory from the source file specified (since we found the Python source in the TestModuleFinder).  We associate that source to the name of the module and add it to the system modules - and again, return the corresponding module.
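For what it's worth, the same load-a-module-by-filename trick can be sketched with importlib (imp's successor in later Python versions).  The ForkCdrModule name and the ANSWER attribute below are invented purely for the demo:

```python
import importlib.util
import os
import tempfile

# Fake up a test directory containing a module, the way
# tests/cdr/cdr_fork_end_time contains ForkCdrModule.py.
tmpdir = tempfile.mkdtemp()
module_path = os.path.join(tmpdir, 'ForkCdrModule.py')
with open(module_path, 'w') as f:
    f.write('ANSWER = 42\n')

# Load the source file as a module, keyed by its full name -
# the moral equivalent of imp.load_source().
spec = importlib.util.spec_from_file_location('ForkCdrModule', module_path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

print(mod.ANSWER)  # -> 42
```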

And... that's it!

Putting the Pieces Together


We do need to inform TestRunner that it should load our test's module (ForkCdrModule) and instantiate the object (ForkCdrModuleEndTime) from some location other than the normal Asterisk Test Suite Python library location.  To do that, we add a new YAML key, load-from-path, and specify the path to the module:

test-modules:
    test-object:
        config-section: test-object-config
        typename: 'SimpleTestCase.SimpleTestCase'
    modules:
        -
            load-from-path: 'tests/cdr/cdr_fork_end_time'
            config-section: 'cdr-config'
            typename: 'ForkCdrModule.ForkCdrModuleEndTime'


Now, in TestRunner, we add a small bit of logic to the usual pluggable module loading to look for this key and, if present, add the path to the TestModuleFinder class's search paths:


        if ('load-from-path' in module_spec):
            TestModuleFinder.supported_paths.append(
                module_spec['load-from-path'])



And... that's it!

Conclusions

At the end of this little series, we've taken a lot of common code throughout the Asterisk Test Suite and eliminated it by allowing it to be configured.  Since we can't predict what every test will need, we've allowed a mechanism by which a test can write its own modules and have them loaded into the framework at run time - without having to turn the test into a Python package.

Going forward, I expect us to use these concepts a lot.  We have a lot of plans to expand on this framework for CEL records, AMI events, SIPp tests, core Asterisk bridging tests, and a whole host of other things.  It should be interesting to see what we'll need to expand on.  Two things we will need at some point in time:

  1. Lots more entry points in the Test Objects for pluggable modules to insert themselves.
  2. A way for pluggable modules to refer to other pluggable modules.  Currently, pluggable modules are only aware of the Test Object.  We'll need a way for a module to determine if another module exists and - if it doesn't, create it; if it does exist, it may either want the existing object or it may want a new one.  This starts to sound suspiciously like an IoC container or dependency injection...


Thursday, July 5, 2012

A Pluggable Framework for the Asterisk Test Suite - Part 2

A Recap

So, in the previous post, I introduced the motivation for having a Pluggable Framework in the Asterisk Test Suite:

  • Reduce code duplication
  • Reuse common logic across tests more efficiently than simply requiring test writers to add layers of inheritance
  • Decrease time and complexity involved in writing tests
To recap, we have:
  • A TestRunner module that has an entry point that:
    • Reads a YAML file into memory
    • Imports the specified package.modules and instantiates the specified objects, passing them their configuration from the YAML file
    • Differentiates between the main 'Test Object' and 'Pluggable Modules' that are added to the Test Object
    • Starts the twisted reactor, tells the test object to run, and reports the results
  • Some Test Object, specified in the configuration (in our case, SimpleTestCase)
Not all of this was shown in the blog post, as that's a lot of information.  If you're curious what the actual source code looks like, you can view it in the public Subversion repository here:
(Note: there's no guarantee that these links will always work - the Test Suite changes over time!)

So, what's next?

Well, the original tests that were used as a motivating example - cdr_accountfield and cdr_userfield - are supposed to verify CDRs.  While the SimpleTestCase is able to originate a call into Asterisk (along with starting/stopping Asterisk and doing some other necessary plumbing), it doesn't do anything with CDRs.  So what we need is a way to verify CDRs.

We don't want to simply tack this onto SimpleTestCase - not all tests need to verify CDRs, and we may have other Test Objects that do want to verify CDRs.  Enter: the Pluggable Module.

The best thing I could find for "pluggable" via Google Images.  Yay clipart!


Introducing: CDRModule

So, we want to verify CDR records in a test.  (FYI: CDR == Call Detail Record.  What parties were involved in a call, how long they talked, etc.  CDRs are widely used, but there are some problems with them - there are a lot of complex call scenarios that can't be expressed in CDRs, which is why we have CEL - but that's a topic for another day.)  What we had previously in the various tests could already do this - a set of Python libraries read in a CSV file (which happens to be the easiest CDR backend to test) and - using regular expressions - match records to expected results.  Let's take a look again at cdr_userfield:


class CDRUserFieldTest(CDRTestCase):
    def __init__(self):

        CDRTestCase.__init__(self)

        self.add_expectation('cdrtest_local',AsteriskCSVCDRLine(source="",
            destination="1", dcontext="default", callerid="",
            channel="Local/1@default-.*", dchannel="", lastapp="Hangup", lastarg="",
            disposition="ANSWERED", amaflags="DOCUMENTATION", userfield="bazinga"
        ))


Under the hood in the CDRTestCase class, we have a dictionary of expected results that the method add_expectation adds to.  This maps a CSV CDR file (cdrtest_local, in this case) to a list of CDR CSV lines - managed by the class AsteriskCSVCDRLine.  The AsteriskCSVCDRLine class contains fields that match the columns in the CDR record.

When a CDR record is verified, each line is read and checked against the matching entry in the list of AsteriskCSVCDRLine objects for that file.  For each column in a record, if a field has a value, a regular expression is used to determine if the two fields match.  If any field doesn't match - or if the file has fewer or more lines than expected - the CDRTestCase class will fail the test.
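The per-field matching boils down to something like the following sketch - line_matches is a hypothetical stand-in for what AsteriskCSVCDRLine does internally, not the real code:

```python
import re

def line_matches(expected, actual):
    ''' Match an actual CDR line (a dict of column -> value) against an
    expected line whose values are regular expressions.  A None value
    means "no expectation for this column" - it is skipped. '''
    for field, pattern in expected.items():
        if pattern is None:
            continue
        if not re.match(pattern + '$', str(actual.get(field, ''))):
            return False
    return True

expected = {'channel': 'Local/1@default-.*', 'disposition': 'ANSWERED',
            'userfield': None}
actual = {'channel': 'Local/1@default-00000001;2',
          'disposition': 'ANSWERED', 'userfield': 'bazinga'}
print(line_matches(expected, actual))  # True
```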

To replicate this functionality, the CDRModule can use the same principle: have a dictionary of file names that map to a list of AsteriskCSVCDRLine objects.  Instead of hard coding in the values, we'll instead read them from a YAML file that is presented to us in our constructor.

As a first step, we can define our YAML layout.  We'll need the following:
  • The ability to support multiple files
  • The ability to support multiple line definitions per file
  • The ability to support any number of column names per line definition
  • Each column name has a value that can be compiled as a regular expression
What we end up with is the following:


cdr-config:
    -
        file: 'cdrtest_local'
        lines:
            -
                source: ''
                destination: '1'
                dcontext: 'default'
                callerid: ''
                channel: 'Local/1@default-.*'
                dchannel: ''
                lastapp: 'Hangup'
                lastarg: ''
                disposition: 'ANSWERED'
                amaflags: 'DOCUMENTATION'
                userfield: 'bazinga'



Now we're ready for the CDRModule class itself.



class CDRModule(object):
    ''' A module that checks a test for expected CDR results '''

    def __init__(self, module_config, test_object):
        ''' Constructor

        Parameters:
        module_config The yaml loaded configuration for the CDR Module
        test_object A concrete implementation of TestClass
        '''
        self.test_object = test_object

        # Build our expected CDR records
        self.cdr_records = {}
        for record in module_config:
            file_name = record['file']
            if file_name not in self.cdr_records:
                self.cdr_records[file_name] = []
            for csv_line in record['lines']:
                # Set the record to the default fields, then update with what
                # was passed in to us
                dict_record = dict((k, None) for k in AsteriskCSVCDRLine.fields)
                dict_record.update(csv_line)

                self.cdr_records[file_name].append(AsteriskCSVCDRLine(
                    accountcode=dict_record['accountcode'], source=dict_record['source'],
                    destination=dict_record['destination'], dcontext=dict_record['dcontext'],
                    callerid=dict_record['callerid'], channel=dict_record['channel'],
                    dchannel=dict_record['dchannel'], lastapp=dict_record['lastapp'],
                    lastarg=dict_record['lastarg'], start=dict_record['start'],
                    answer=dict_record['answer'], end=dict_record['end'],
                    duration=dict_record['duration'], billsec=dict_record['billsec'],
                    disposition=dict_record['disposition'], amaflags=dict_record['amaflags'],
                    uniqueid=dict_record['uniqueid'], userfield=dict_record['userfield']))


This obviously looks very similar to our cdr_userfield test, except that now all of the expected results are being built from a YAML file.  Note that the AsteriskCSVCDRLine class defines the allowed CDR fields and we use that to pre-populate each expected CDR line.  That way a test writer does not have to provide default values for each possible CDR field for each expected CDR line.
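That pre-populate-then-overlay trick is small but handy; in isolation (with an abbreviated field list for the demo) it looks like:

```python
fields = ['accountcode', 'source', 'destination', 'disposition']

# Start with every known column defaulted to None...
record = dict((k, None) for k in fields)
# ...then overlay only the columns the YAML actually specified.
record.update({'destination': '1', 'disposition': 'ANSWERED'})

print(record['destination'])   # '1'
print(record['accountcode'])   # None - no expectation, so it's skipped
```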

Now the question is: how do we get our CDRModule to be called when the test ends, so that the CDR records can be verified?

By default, the Test Objects (concrete implementations of the TestCase class) do not expose places to hook modules onto them.  What we need is for some method in CDRModule to be called when the concrete implementation of TestCase is finished running the test.  To do this, we'll:

  1. Expose a method that an observer can use to register themselves for notifications when Asterisk has fully stopped
  2. Modify TestCase to call the observers when Asterisk and the twisted reactor have fully stopped, that is, when the test is in a state that it can no longer modify its data
The registration method:


    def register_stop_observer(self, callback):
        ''' Register an observer that will be called when Asterisk is stopped

        Parameters:
        callback The deferred callback function to be called when Asterisk is stopped

        Note:
        This appends a callback to the deferred chain of callbacks executed when
        all instances of Asterisk are stopped.
        '''
        self._stop_callbacks.append(callback)



In TestCase, the method stop_reactor is always used to stop the test.  Here, we can modify the method so that the callbacks registered via register_stop_observer are added to the deferred chain, each being called after the test has finished running.


    def stop_reactor(self):
        """
        Stop the reactor and the test.
        """
        def __stop_reactor(result):
            """ Called when the Asterisk instances are stopped """
            logger.info("Stopping Reactor")
            if reactor.running:
                try:
                    reactor.stop()
                except twisted.internet.error.ReactorNotRunning:
                    # Something stopped it between our checks - at least we're stopped
                    pass
        if not self._stopping:
            self._stopping = True
            df = self.__stop_asterisk()
            df.addCallback(__stop_reactor)
            for callback in self._stop_callbacks:
                df.addCallback(callback)
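
Stripped of Twisted, the shape of this stop-observer pattern can be sketched in plain Python - the class and method names below are illustrative, not the Test Suite's actual API.  Callbacks registered up front run in registration order once shutdown completes, each receiving the previous callback's result, just like a deferred chain:

```python
class StoppableHarness:
    """Illustrative sketch of the stop-observer pattern (not the real TestCase)."""

    def __init__(self):
        self._stop_callbacks = []
        self._stopping = False

    def register_stop_observer(self, callback):
        # Observers run after shutdown, in registration order
        self._stop_callbacks.append(callback)

    def stop(self, result=None):
        # Guard against being stopped twice, like the _stopping flag above
        if self._stopping:
            return result
        self._stopping = True
        # The real TestCase chains these onto a Twisted Deferred; here we
        # just thread each result into the next callback directly.
        for callback in self._stop_callbacks:
            result = callback(result)
        return result

harness = StoppableHarness()
harness.register_stop_observer(lambda result: (result or []) + ['cdr-check'])
harness.register_stop_observer(lambda result: result + ['cel-check'])
print(harness.stop())  # prints ['cdr-check', 'cel-check']
```

The important property is the same one the deferred chain gives us: by the time the first observer fires, shutdown has already happened, so the test's data can no longer change underneath it.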


Now we just need to modify the constructor of CDRModule to register itself with the concrete implementation of TestCase and specify what method we want called when the test stops:


        # Hook ourselves onto the test object

        test_object.register_stop_observer(self._check_cdr_records)



Finally, we implement our CDR checking.  Much of this was pulled out of the existing CDRTestCase classes - essentially, when we're told to check our CDRs, we take the expected results and match them against the actual CDR file created for the test.  The AsteriskCSVCDR class does the heavy lifting of actually matching records.  Lastly, we modify the pass/fail result of the Test Object based on the results of the AsteriskCSVCDR instances.


    def _check_cdr_records(self, callback_param):
        ''' A deferred callback method called by the TestCase derived
        object when all Asterisk instances have stopped

        Parameters:
        callback_param The result passed down the deferred chain; returned
        unmodified so that subsequent callbacks receive it
        '''
        LOGGER.debug("Checking CDR records...")
        self.match_cdrs()
        return callback_param


    def match_cdrs(self):
        ''' Called when all instances of Asterisk have exited.  Derived
        classes can override this to provide their own behavior for CDR
        matching.
        '''
        expectations_met = True
        for key in self.cdr_records:
            cdr_expect = AsteriskCSVCDR(records=self.cdr_records[key])
            cdr_file = AsteriskCSVCDR(fn="%s/%s/cdr-csv/%s.csv" %
                (self.test_object.ast[0].base,
                 self.test_object.ast[0].directories['astlogdir'],
                 key))
            if cdr_expect.match(cdr_file):
                LOGGER.debug("%s.csv - CDR results met expectations" % key)
            else:
                LOGGER.error("%s.csv - CDR results did not meet expectations.  Test Failed." % key)
                expectations_met = False

        self.test_object.passed = expectations_met
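
The matching itself stays in AsteriskCSVCDR, but the core idea - each expected field value acts as a regular expression that the actual CDR field must match - can be sketched like so (the function and sample data here are hypothetical, not the Test Suite's implementation):

```python
import re

def cdr_line_matches(expected, actual):
    """Return True if every expected field's regex matches the whole
    corresponding field of the actual CDR line."""
    for field, pattern in expected.items():
        # Anchor the pattern so it must match the entire field value
        if re.match("%s$" % pattern, actual.get(field, '')) is None:
            return False
    return True

expected = {'destination': '1',
            'channel': r'Local/1@default-.*',
            'disposition': 'ANSWERED'}
actual = {'destination': '1',
          'channel': 'Local/1@default-00000000;2',
          'disposition': 'ANSWERED',
          'duration': '3'}

print(cdr_line_matches(expected, actual))  # prints True
```

Treating expectations as regexes is what lets a test assert on fields like the channel name, whose uniquifier suffix varies from run to run.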



And that's it!  We now have a mechanism to fully configure a CDR test such that the run-test script is completely unnecessary.  Let's look again at cdr_userfield:


cdr_userfield as run-test script:


#!/usr/bin/env python
'''
Copyright (C) 2010, Digium, Inc.
Terry Wilson

This program is free software, distributed under the terms of
the GNU General Public License Version 2.
'''

import sys
sys.path.append("lib/python")
from asterisk.CDRTestCase import CDRTestCase
from asterisk.asterisk import Asterisk
from asterisk.cdr import AsteriskCSVCDR, AsteriskCSVCDRLine
from twisted.internet import reactor
import re

class CDRUserFieldTest(CDRTestCase):
    def __init__(self):

        CDRTestCase.__init__(self)

        self.add_expectation('cdrtest_local',AsteriskCSVCDRLine(source="",
            destination="1", dcontext="default", callerid="",
            channel="Local/1@default-.*", dchannel="", lastapp="Hangup", lastarg="",
            disposition="ANSWERED", amaflags="DOCUMENTATION", userfield="bazinga"
        ))

def main():
    test = CDRUserFieldTest()
    test.start_asterisk()
    reactor.run()
    test.stop_asterisk()
    return test.results()

if __name__ == '__main__':
    sys.exit(main())



cdr_userfield as YAML:


testinfo:
    summary: 'Test that Set(CDR(userfield)=...) works'
    description: |
        'Test that setting the userfield field in the CDR works'

test-modules:
    test-object:
        config-section: test-object-config
        typename: 'SimpleTestCase.SimpleTestCase'
    modules:
        -
            config-section: 'cdr-config'
            typename: 'cdr.CDRModule'

test-object-config:
    spawn-after-hangup: True
    test-iterations:
        -
            channel: 'Local/1@default'
            application: 'Echo'

cdr-config:
    -
        file: 'cdrtest_local'
        lines:
            -
                source: ''
                destination: '1'
                dcontext: 'default'
                callerid: ''
                channel: 'Local/1@default-.*'
                dchannel: ''
                lastapp: 'Hangup'
                lastarg: ''
                disposition: 'ANSWERED'
                amaflags: 'DOCUMENTATION'
                userfield: 'bazinga'

properties:
    minversion: '1.8.0.0'
    dependencies:
        - python : 'twisted'
        - python : 'starpy'
        - asterisk : 'cdr_csv'
    tags:
        - CDR
        - chan_local





Phew.  That is a lot of code to look at in a blog post, so I don't blame anyone who skipped through some of it.  The important takeaway from this exercise is that the YAML definition describes the entire test.  In the run-test script, a number of details were hidden from the test writer - for example, the fact that Local channels were used to originate the call into Asterisk.  The writer had to know that's how the base class worked.  While abstracting details is a feature of inheritance, when crucial information gets abstracted away, you probably have poor class design - and getting inheritable class design "right" is incredibly hard.  So why make a test writer do it?

Next time: the fly in the ointment that are the Fork CDR tests.

Thursday, June 28, 2012

A Pluggable Framework for the Asterisk Test Suite - Part 1

Asterisk Test Suite - a (very) brief Overview

On the Asterisk development team, we attempt to write software using an Agile methodology (specifically, the "Scrum" flavor).  As part of this development process, we perform continuous integration using Atlassian Bamboo, which, in addition to doing a basic build check, also executes sets of tests against the current build.  There are myriad reasons to do continuous integration - almost too many to list - but the most obvious is that we get a consistent sanity check on the state of the code at all times.

The tests that the Bamboo server runs come in two forms: unit tests, as part of the Asterisk Unit Test framework, and functional tests, as part of the Asterisk Test Suite.  The Asterisk Test Suite is a freely available external tool that orchestrates these tests against one or more instances of Asterisk.  These tests cover a large range of functionality - everything from SIP compliance, to channel technology interoperability, to ensuring that Asterisk applications and functions work as intended.

Each test is essentially a stand-alone entity - the Asterisk Test Suite does not mandate what language a test is written in.  Organically, the Test Suite has moved towards Python, but it still includes tests written in a host of other languages (most notably Lua, but there are also some Bash tests in there that no one enjoys maintaining).  As the tests have grown, we've added to the libraries that the Asterisk Test Suite provides, to minimize reproduction of functionality and the amount of time it takes to write a test.

While the Test Suite is very flexible - it will run any executable file named 'run-test' - that flexibility has cost us a lot of repeated boilerplate code.  In the current state of the Test Suite, the Python tests typically inherit from a common base class, TestCase, that does a lot of the heavy lifting of starting/stopping Asterisk, creating an AMI connection, and exposing the basic AMI events that allow the test logic to be inserted.  Even so, the tests can get a bit repetitive.

Two Current Tests: cdr_accountcode and cdr_userfield

The following two tests - cdr_accountcode and cdr_userfield (which verify that the AccountCode and UserField values, respectively, are recorded properly in CDR records) - illustrate the general problem.  The tests inherit from CDRTestCase, which in turn inherits from TestCase.  Besides the complexity of having multiple levels of inheritance, there are still repeated elements in the meat of these two tests.

cdr_accountcode:

#!/usr/bin/env python
'''
Copyright (C) 2012, Digium, Inc.
Walter Doekes

This program is free software, distributed under the terms of
the GNU General Public License Version 2.
'''

import sys
sys.path.append("lib/python")
from asterisk.CDRTestCase import CDRTestCase
from asterisk.asterisk import Asterisk
from asterisk.cdr import AsteriskCSVCDR, AsteriskCSVCDRLine
from twisted.internet import reactor
import re

class CDRAccountCodeTest(CDRTestCase):
    def __init__(self):
        CDRTestCase.__init__(self)

        # 3@default -> (answer)
        # accountcode: third
        self.add_expectation("Master", AsteriskCSVCDRLine(source="",
            accountcode="third",
            destination="3", dcontext="default", callerid="",
            channel="Local/3@default-.*", dchannel="",
            lastapp="Hangup", lastarg="",
            disposition="ANSWERED", amaflags="DOCUMENTATION",
        ))
        # NOTE: I removed some of the additional expectations
        # for brevity

def main():
    test = CDRAccountCodeTest()
    test.start_asterisk()
    reactor.run()
    test.stop_asterisk()
    return test.results()

if __name__ == '__main__':
    sys.exit(main())

# vim: set ts=8 sw=4 sts=4 et ai:


cdr_userfield:

#!/usr/bin/env python
'''
Copyright (C) 2010, Digium, Inc.
Terry Wilson

This program is free software, distributed under the terms of
the GNU General Public License Version 2.
'''

import sys
sys.path.append("lib/python")
from asterisk.CDRTestCase import CDRTestCase
from asterisk.asterisk import Asterisk
from asterisk.cdr import AsteriskCSVCDR, AsteriskCSVCDRLine
from twisted.internet import reactor
import re

class CDRUserFieldTest(CDRTestCase):
    def __init__(self):
        CDRTestCase.__init__(self)

        self.add_expectation('cdrtest_local',AsteriskCSVCDRLine(source="", 
            destination="1", dcontext="default", callerid="",
            channel="Local/1@default-.*", dchannel="", lastapp="Hangup", lastarg="",
            disposition="ANSWERED", amaflags="DOCUMENTATION", userfield="bazinga"
        ))


def main():
    test = CDRUserFieldTest()
    test.start_asterisk()
    reactor.run()
    test.stop_asterisk()
    return test.results()

if __name__ == '__main__':
    sys.exit(main())


Some obvious things in this comparison:
  • The act of setting up the test case class, starting Asterisk, running the twisted reactor, stopping Asterisk, and returning a result is all functionality that a single entry point could do, if it knew what object to instantiate
  • The object being instantiated is identical, save for what CDR information it expects to match on
  • The CDR information that a match must be made on can be configuration driven
Since common classes and modules can provide the functionality these tests use, and the differences between the tests can be driven by configuration, why not ... do that?


A Pluggable Framework

Since tests already define their configuration in a YAML file, the idea is to use that configuration data to drive all of the test's behavior.  This includes what modules are instantiated to support test execution, what data is needed to configure them, and what the expected results of the test should be.

Since each test is spawned as a separate process, a new module - TestRunner (what a creative name...) - is responsible for providing an entry point and building the test objects specified in the configuration.  TestRunner parses the test's YAML file, looking for modules to locate and instantiate.  In the next post I'll go more into how that happens - there's one aspect of this that I think is pretty cool and gets around some of the requirements Python has for defining packages.

Loadable modules currently fall into one of two categories: Test Objects, and Pluggable Modules.  A Test Object is responsible for starting and stopping Asterisk and orchestrating the test activities.  Pluggable Modules add bits of functionality into the test - things like verifying CDRs/CELs, adding AMI event listening and verification, etc.  These categories may have to expand in the future - and the role of a Test Object defined more narrowly - but it works well for our purposes now.

As an example, say we have a test configuration that defines a Test Object SimpleTestCase in a module also named SimpleTestCase (and yes, per PEP 8, the module name should be lower case - Pythonic compliance has not traditionally been our strongest suit, but we're working on that):

test-modules:
    test-object:
        config-section: test-object-config
        typename: 'SimpleTestCase.SimpleTestCase'

test-object-config:
    spawn-after-hangup: True
    test-iterations:
        -
            channel: 'Local/1@default'
            application: 'Echo'



The TestRunner parses the test-modules configuration section and determines that it needs to create a Test Object.  The YAML file itself defines the keyword (test-object-config) that contains the configuration information for this object.  TestRunner takes the in-memory representation of the YAML and passes it to the constructor of the Test Object.

    # NOTE: Some code removed here for brevity
    test_object_spec = test_config['test-modules']['test-object']

    module_obj = load_and_parse_module(test_object_spec['typename'])
    if module_obj is None:
        return None

    test_object_config = None
    if ('config-section' in test_object_spec and
        test_object_spec['config-section'] in test_config):
        test_object_config = test_config[test_object_spec['config-section']]
    else:
        test_object_config = test_config

    # The test object must support injection of its location as a parameter
    # to the constructor, and its test-configuration object (or the full test
    # config object, if none is specified)
    test_obj = module_obj(test_path, test_object_config)


Note that a test object gets two things: its relative location to the current process directory (so it can find its test specific files), and the in-memory YAML object configuring it.  The method load_and_parse_module does the work of separating the typename into package/module/classname parts and importing the module.


def load_and_parse_module(type_name):
    ''' Take a qualified module/object name, load the module, and return
    the type specifying the object

    Parameters:
    type_name A fully qualified module/object to load into memory

    Returns:
    An object type to be instantiated
    None on error
    '''

    LOGGER.debug("Importing %s" % type_name)

    # Split the object typename into its constituent parts - the module name
    # and the actual type of the object in that module
    parts = type_name.split('.')
    module_name = ".".join(parts[:-1])

    if not module_name:
        LOGGER.error("No module specified in type name '%s'" % type_name)
        return None

    module = __import__(module_name)
    for comp in parts[1:]:
        module = getattr(module, comp)
    return module
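
For comparison, modern Python can express the same load-and-resolve logic with importlib - a sketch with a helper name of my own choosing, demonstrated against a standard-library class rather than a Test Suite module:

```python
import importlib

def load_class(type_name):
    """Resolve a dotted 'package.module.ClassName' string to the class object."""
    module_name, _, class_name = type_name.rpartition('.')
    if not module_name:
        raise ValueError("No module specified: %s" % type_name)
    # import_module handles the nested-package traversal that the
    # __import__/getattr loop above does by hand
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# Any importable dotted path works, e.g. a stdlib class:
cls = load_class('collections.OrderedDict')
print(cls.__name__)  # prints OrderedDict
```

Either way, the result is the same: a type object ready to be instantiated with the test path and its configuration.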



And voila: we have an in-memory object that acts as our test!

TestRunner handles the mechanics of starting the Twisted reactor and telling the Test Object to run - the entry point boilerplate that was reproduced across all tests.  So, now that we can create a Test Object and tell it to do stuff, we need to figure out how to verify the actual purpose of those tests: CDR records.

In the next post I'll go into more detail about the Pluggable Modules - specifically the CDR Module - and how they interact with the Test Object.  We can then put the pieces together to see how the CDR tests above are configured to obviate the need for the run-test scripts.

Thursday, June 21, 2012

So... yeah. Changes.

It's been a while since I've written anything here.  The original plan - back when I was writing the posts on WPF - was to document aspects of a project I was working on that used Prism, Unity, and WPF.  It wasn't a very complex project, but it was a lot of fun - at least the Prism portion.

As it turns out, that project died a quiet death, and I got bored documenting something that I no longer had a reason to be tinkering with.  After that... I stopped writing here.

And then a bunch of stuff happened.  Without going into a ton of detail, I discovered that I was the only person generating revenue for the company that I worked for.  When your business is Engineering Contract Services, that's not a good sign.  So I started looking.

And I ended up at a fantastic company - Digium - working on the open source project Asterisk.

You know how you sometimes think of the way things should be in software development?  The people you'd like to work with, the things you'd like to be working on, the environment you'd like to be in?  I found myself there.  It's pretty fantastic.

I did go from working primarily in Microsoft venues - often in the .NET ecosystem (although the last contract I was on was C++ on a RedHawk Linux OS) - to working on Linux systems in C.  But it's been an absolute blast.

(And to scratch that OO itch, there's always the Asterisk Test Suite for some Python goodness.)

So, a few small goals of mine:

  1. Write something now and again.  I'm shooting for one post a week.  That may be a bit ambitious, but we'll see.
  2. Write about Asterisk development.  There's some great stuff coming down the pipeline in Asterisk 11.
  3. Learn Italian.

Monday, December 27, 2010

Creating Dynamic Menus in WPF and Prism - Part 4

So, last time I noted that my approach was less than, uhm, optimal.  In fact, it was pretty bad.  At least it didn't involve convoluted code behind!  Anyway.

So, I'd like to have a nice, dynamic menu or toolbar (yup, either or both) that Prism modules can add items to.  It'd be nice if they could place those items where they need them, such that the menus are still ordered in a useful manner for the user.  It'd also be nice if a module could add an item that was always displayed, versus an item that is only displayed when a view in the module is active. 

Whew.

Spec the code.  Design the code.  Implement the code.  (Okay, so I didn't do that completely, and I paid for it.  But let's pretend that I practice what I preach...)

  1. The menu / toolbar service should not be tied to its view.  That is, items added should be able to be displayed in a menu, toolbar, "ribbon", or some hybrid thereof.
  2. Modules should be able to differentiate between the various types however - I should know if I'm adding an item to a menu or a toolbar.
  3. Modules should be able to specify whether an item has a parent (which can be any previously added item), and place the item relative to some other child of that parent.
  4. Modules should be able to add a separator.
  5. Items that are added can be either permanent (always shown), or tied to the activity of the module.
Design decisions time!

You'll note that I already defined that the dynamic menu / toolbar thingy should be a service - based on the shellacking of my previous attempt, it's become pretty obvious that this should be a service interface in the Prism project's Infrastructure DLL.  I didn't list that in the requirements, but it probably should be.

The first requirement should be pretty easy to meet - the service can expose an ObservableCollection of items, and the various UI elements can use a HierarchicalDataTemplate to format the collection.  This does mean that I'll probably need multiple implementations of the service (requirement #2) - but that shouldn't be a problem using Unity's Resolve(string name) overload.

Requirement three means that each item will have to have children items, and the service should:
  • expose a way to add an item that optionally includes a parent and a predecessor
  • expose an IEnumerable<> of items, so that modules can query for parents / predecessors
  • (maybe?) expose a way to query for items, since the hierarchical nature of the data means queries will have to be recursive (and writing recursive LINQ operators everywhere would kinda suck)
Requirement four: items can be just a separator.  Boolean field for the win!

Requirement five is tricky.  There's a couple of ways Prism handles what is viewed / active:
  1. IActiveAware.  Views can implement this interface, which gives you a boolean field indicating whether or not the view is active, and an event that can be subscribed to when that field changes.  This is tied to a Region's concept of what is active.  If we allow modules to provide an IActiveAware object, a menu item could be tied to that, and we could show / hide a menu item based on that object.
  2. The service could implement INavigationAware.  This would give us notification when a navigation request is made to some object.  I'm not sure what happens if we register an object that is INavigationAware but is not designed to handle a navigation URI request - in essence, we want our service to snoop on navigation, not actually perform it.
  3. A PresentationEvent, that modules can fire when they want their items shown or hidden.
Option #1 provides a lot of granularity, as each View can have menu items added / removed.  On the other hand, it may provide too much granularity, and we may find ourselves adding / removing items when we don't want to.  We'd have to be careful as to what object implementing IActiveAware is sent to the service.

Option #2 feels a little fishy.  While it seems ideal, I don't like our service being a navigation object when it isn't; that, and at least two of the methods (OnNavigatedTo and OnNavigatedFrom) would never be called.  Implementing half of an interface always feels a little bad, so I'm inclined not to pick that route.

Option #3 gives the most granularity, but also requires more out of the users of the service.  I want this to be as "fire and forget" as possible - I put in my menu items, and I move on.

So, I'm probably going to go with Option #1.

Lots of design decisions made, with a lot of tradeoffs.  Next time: more design decisions, and some initial implementation.