August 02, 2003

Importance of variable naming

Tim Bray had a recent entry about Perl’s illegibility. Here’s the code:

@fields = (0, @fields);
while ($r = $q->fetchrow_arrayref)
{
  my $topic = $r->[0];
  while ($topic != -1)
  {
    my $i;
    foreach $i (1 .. $#{$r})
    {
      $totals{$fields[$i]}->[$topic] += $r->[$i];
    }
    $topic = $parents[$topic];
  }
}
shift @fields;

The first thing that struck me was the variable naming. What happens if you take a minute or two and rename some of the variables? Here's a quick result with some rationale. (I also added spaces inside delimiters because I think it's one of the easiest actions you can take to make your code more readable.)

# Why unshift a value on and then shift it off later? Just
# create a new variable.
my @shift_fields = ( 0, @fields );
  
# Name the magic number previously used
my $root_topic = -1;
  
# The standard notation for a DBI statement handle is $sth, 
# stick with the standard
while ( my $row = $sth->fetchrow_arrayref ) {
  # prefixing this with a 'current_' lets people know
  # it changes state
  my $current_topic = $row->[0];
  
  # Replace that funky array member notation with something 
  # meaningful, plus get the calculation out of the while loop
  my $num_fields = scalar @{ $row };
  
  # replace that magic '-1' with something meaningful
  while ( $current_topic != $root_topic ) {
  
    # replace that funky array member notation, and instead of $i
    # use a name that better represents the purpose
    foreach my $field_num ( 1 .. $num_fields ) {
  
      # less efficient to create a temp variable, but
      # far more readable
      my $field_name = $shift_fields[ $field_num ];
      $totals{ $field_name }->[ $current_topic ] += $row->[ $field_num ];
    }
    $current_topic = $parents[ $current_topic ];
  }
}

That looks better, and even if you don't know much about Perl (well, you need to know what a hash is and how it's accessed) you can dope out what's going on.

The thing is, the time you spend on a script -- including the time you'll look at it later -- informs how much time you should spend on naming. Since this script runs three times a day I would have tried to use clearer names throughout. This is true of any language, but because people aren't used to looking at sigils Perl seems more like line noise to them. Understandable. But then it's awfully arrogant to judge the readability of a language if you don't know it. True that would be less fun, but then we'd all have more time to work on meaningful work details like variable naming...

It's interesting that Tim noted, "Now I'll get email from dozens of Perl gods explaining that (a) this is perfectly clear and (b) I could have done it better in three lines of code." Perl has a bad rep for this sort of macho bullshit and it's a shame. I think the actual practice of said machoness is fading but impressions last a long time -- e.g., "Java is slow."

Next: Language inventor or serial killer?
Previous: Graphing internal Maven dependencies