January 21, 2005

A little bit of Algorithm::Dependency and decomposition

Algorithm::Dependency is a nifty little distribution to help you manage dependencies. (“Little” is not derogatory; just the opposite.) It provides just enough functionality so that you can glue it to your application with fairly little fuss. Here’s how I’m planning to use it to support an OpenInteract2 feature improvement.

In OIN-115 we have the improvement 'Break setup into individual tasks'.[1] The goal of this imrovement is to allow other technologies easily put themselves into the OI2 lifecycle. With the improvement processes we don't know about beforehand can get initialized at server startup, and by a separate process, rather than managing this state themselves. So if you want to establish a connection to a message queueing system when the server starts up you can just create a setup task to do it and the rest of your code can just assume it's there.

Previously all the setup routines were in one big class, OpenInteract2::Setup. This was tightly coupled to OpenInteract2::Context such that the context was responsible for knowing what methods to call, the order to call them, and what methods to skip in certain circumstances. (Sometimes you want to create a Context without it being fully-formed, such as when you're creating a new website, install SQL structures, or other bootstrapping actions.) Obviously this is no good -- not only are we limiting what's called, but we can't discover new behaviors at runtime to integrate these unknown technologies. (A small change in 1.99_05 allows you to use a subclass of OpenInteract2::Setup and define pre/post methods, but that's a bandaid fix.)

After some grunt work I ported each of the setup procedures from methods in a class to each in their own class. As a result we had a bunch of subclasses classes under OpenInteract2::Setup. Each class had to implement get_name() (to return its unique name) and execute() (to do its work). It could implement get_dependencies() (to declare what it should run after) and some lifecycle stuff (setup/teardown, basically). Additionally, each subclass is responsible for registering itself with the parent so the parent knows how to find them. (There's a simple example of this below.)

So how can we allow OI2 to find all these setup classes at runtime but still maintain an order to executing them? Well, we already had code for the first part in OpenInteract2::Manage, which walks over @INC, finds all subclasses and require()s them. Once brought in each subclass registers itself with the parent factory with a name for us humans to reference the action by. We don't need to know 'OpenInteract2::Manage::Website::Create', just 'create_website'. (This is made simple by Class::Factory). So that got abstracted out to a utility method for common usage and it works fine.

Now onto the ordering part (and the actual point of this post). Previously the caller of the setup controlled not only the order but also what procedures should not be run if others weren't run. (See the setup() method under OpenInteract2::Context for that nasty bit.) But if we're bringing in these procedures dynamically we can no longer proscribe the order. So we need a dependency tree built from the data -- name + dependencies -- made available by each class.

I've written dependency trees before, but that's more infrastructure than I wanted and more code that's not aimed at the core purpose that can break. After hunting around CPAN a bit I found Algorithm::Dependency. It has a few dependencies itself (har har) but was a snap to install on my Powerbook. (Even for Win32 it just required a few PPDs to be built.)

So here's the procedure for what I wanted:

  1. find all setup tasks on the system
  2. ask each task for its name and dependency names
  3. pass the task names + associated dependency names to an object
  4. ask the object the order in which I should execute the tasks </ol>

    In A::D-speak the last part translates to:

    my $dep = Algorithm::Dependency->new(
        source => Some::Storage::Class->new,
    );
    unless ( $dep ) {
        oi_error "Failed to create setup dependency object: cannot start server"
    }
    my $all_ordered_setup_tasks = $dep->schedule_all();
    

    Pretty straightforward, even with the 'Some::Storage::Class' in there. That 'source' winds up highlighting A::D's pragmatic assumptions from your application. It deals with IDs (or names), not objects. So as long as your dependencies can be referenced by a string (and EVERYTHING in Perl can), then you're good to go.

    But the above only takes care of item 4 from our list, with a some of item 3 behind the scenes as well. So how do we tell A::D about our dependencies? You can use a static file (and A::D includes a class for reading it in), but we need something dynamic -- remember, we want to discover these classes at runtime. Fortunately the storage class was very straightforward; here it is with some logging removed:

    package OpenInteract2::Setup::DependencySource;
    # ...a few imports...
       
    sub _load_item_list {
        my ( $self ) = @_;
        my @ad_items = ();
        my @all_actions = OpenInteract2::Setup->list_actions;
        foreach my $setup_name ( @all_actions ) {
            my $setup_item = OpenInteract2::Setup->new( $setup_name );
            next unless ( $setup_item );
            my @dep_names = $setup_item->get_dependencies;
            my $dep_item = Algorithm::Dependency::Item->new( $setup_name, @dep_names );
            unless ( $dep_item ) {
                oi_error "Failed to create dependency item from '$setup_name'";
            }
            push @ad_items, $dep_item;
        }
        return \@ad_items;
    }
    

    So with that little class and a reference to the base setup action our A::D code becomes:

    my $dep = Algorithm::Dependency->new(
        source   => OpenInteract2::Setup::DependencySource->new(),
        selected => [ $DEFAULT_DEPENDENCY ],
     );
    my $all_items = $dep->schedule_all;
    

    (The $DEFAULT_DEPENDENCY is simply reading the server configuration; everything else depends on it and we execute that separately to deal with some bootstrapping issues.)

    Easy enough. However, if you run this you won't get what you expect. A::D's documentation tells you that it doesn't return ordered dependencies. (I'm not sure what use unordered dependencies are, but ok.) It's a simple fix:

    my $dep = Algorithm::Dependency::Ordered->new(
    

    Now for the slightly hard part -- what happens when you want to remove something from the list? This functionality is not in A::D but it was fairly simple to add. The pseudocode is just:

    foreach item you want to remove
        - ask the dependency object for that item's children
        - remove the item
        - remove the item's children
    

    I coded this method up (called 'without()'), added docs and tests and sent it to the author Adam Kennedy. We'll see if it gets added. (IME patches with docs and tests are treated much better than code-only.) In the meantime, now it's easy to do something like this:

       # don't run 'read repository' or any of its dependencies
       OpenInteract2::Setup->run_all_actions( $context, 'read repository' );
    

    Since 'read packages' depends on 'read repository', it doesn't get run. Neither does 'read action table' or 'read spops config' since they depend on 'read packages', and on down the line. Even better: we don't have to explicitly account for and ignore other setup tasks added at runtime -- for instance, if Class::DBI needed to do some startup work you could create a class for it like this:

    package OpenInteract2::Setup::ClassDBI;
     
    use strict;
    use base qw( OpenInteract2::Setup );
     
    sub get_name         { return 'initialize Class::DBI' }
    sub get_dependencies { return ( 'read packages' ) }
    sub execute {
        my ( $self, $ctx ) = @_;
        # ... do my actual work here ...
    }
     
    OpenInteract2::Setup->register_factory_type( get_name() => __PACKAGE__ );
    

    Since it depends on 'read packages', it too gets skipped when we execute the above run_all_actions() method.

    [1] That issue is a great example of something driven by implementation already half-done rather than a use case beforehand. I think it points out a problem with knowing a system really well: need and implementation become interchangeable in your head. Oh well.

Next: You know you're a geek when...
Previous: Norton Antivirus message caused by moving shortcuts