Function - call thyself


A function that calls itself. What sorts of pictures does that conjure in your mind? Errors, warnings or infinite loops? Crashes? Bugs? Believe it or not, using a function that calls itself can be used for more than hanging up IE. It can cut down on the amount of code you have to write, often by hundreds of lines.

I see that grimace. You don’t believe me do you? That’s all right, I didn’t believe myself at first either. The sort of thing you probably associate with this sort of thing would be something along the lines of:

function writeNumber(number){
    print( number );
    print( " and " );
    writeNumber(number + 1);
}
writeNumber(1);

If your first language is a programming languauge rather than, say, English, you would realise immediately that the preceding code will, in theory, execute for eternity - kind of like counting to infinite. The output would be 1 and 2 and 3 and 4 … and it would keep increasing forever: 75374524 and 75374525 and 75374526 and 75374527 and …. It’s not as bad as:

for( i = 0; i >= 0; i++ ){
    print( i );
}

but in both cases the script would either time out or have to be terminated. Another common mistake while we’re on the topic of inifinite loops is the forever while.

i = 1;
while ( i < 10 ){
    echo i;
}

Again, the output is not pretty. (As I said, it may be hard for beginners to spot the mistake in this one. The problem here is that i is always less than ten as we’re not changing it, so the loop continues forever as i is always 1. Adding a simple i++; line before closing the loop will solve the problem.)

There are a lot of ugly connotations related with topics such as this, especially if you’ve encountered (or written) some of the above code. So if you panicked when I said a function that called itself could be useful, I completely understand your apprehension.

But have no fear; I am not completely crazy (yet) and taking over the world through blatant misinformation is still a long way down on my todo list. So, cast aside your doubt for the time being, and allow me to demonstrate to you how this sort of recursive programming that I’m talking about can make a serious impact in your code.

The most beautiful way of using this technique (yes, inspiring code is beautiful) is to run through files and folders, arrays or collections of any sort that contain children collections (such as XML markup).

Here’s an example in PHP:

function loop($array, $parent=''){

    foreach( $array as $option => $value ){
        if( $parent != '' ){
            $option = $parent . '[' . $option . ']';
        }
        if( is_array($value) ){
            loop($value, $option);
        } else {
            ?>
        <div><?php echo $option; ?> = <?php echo $value; ?></div>
            <?php
        }
    }

}

loop ( $names );

Ingenius, isn’t it?

If the above code is not clear to you, let’s take a look at some example usage. I’ll be working in PHP, but the language doesn’t really matter; the principle is the important thing.

Let’s say I have an array, with usernames in it, like so:

var $names = array ();

$names[0] = "Bobster";
$names[1] = "Billy";
$names[2] = "Matty";
$names[3] = "Joey";
$names[4] = "Jacko";

To output these values just requires an easy peasy:

foreach ( $names as $name ){
    echo $name;
}

So why in the world did you just write that loop() function, I hear you exclaim. In this case, it’s summarised in one word: extensibility.

We’ve decided that the usernames aren’t enough. We need to store the the first and last names in the array as well.

var $names = array ();

$names[0] = array (
    'username' => "Bobster",
    'firstname' => 'Bob',
    'lastname' => 'Johnson');
$names[1] = array (
    'username' => "Billy",
    'firstname' => 'William',
    'lastname' => 'Gates');
$names[2] = array (
    'username' => "Matty",
    'firstname' => 'Matthew',
    'lastname' => 'Jefferson');
$names[3] = array (
    'username' => "Joey",
    'firstname' => 'Joseph',
    'lastname' => 'Carlton');
$names[4] = array (
    'username' => "Jacko",
    'firstname' => 'Jack',
    'lastname' => 'Richardson');

Our array is now multi-dimensional. So, to get Bob’s username, we’d call $names[0]['username'];. Now, outputting all the values isn’t so easy:

foreach ( $names as $userarray ){
    foreach ( $userarray as $field => $key ){
        echo $field . ' = ' . $key . '';
    }
}

Do you see where I’m heading? What happens when instead of having firstname and lastname keys, we make is another array containing the first and last names like $names[0]['name']['first']? The array is now three-dimensional and we start to run into some problems in outputting our values:

foreach ( $names as $userarray ){
    foreach ( $userarray as $field => $key ){
        if( is_array($field) ){
            foreach ($field as $sub_key => $value ){
                echo $key . ' ' . $sub_key . ' = ' . $value;
            }
        } else {
            echo $field . ' = ' . $key . '';
        }
    }
}

Boy! Did that double in size or what! Becoming clearer now? If we decide to make another field that has another array in it, so we get four dimensions, the code will again need to be changed. But, what if we have an array where we don’t know how many dimensions it has? This is exactly the situation developers are faced with when trying to traverse files and folders. We don’t know how many levels of nested folders exists inside a particular loction. Trying to write a single function that traverses a filesystem is like trying to write a single function to crawl the internet. And guess what? I already showed you how its done.

Rather than nesting an arbitary number of foreach statements, we finally get back to our loop function.

function loop($array, $parent=''){

    foreach( $array as $option => $value ){
        if( $parent != '' ){
            $option = $parent . '[' . $option . ']';
        }
        if( is_array($value) ){
            loop($value, $option);
        } else {
            ?>
    <div><?php echo $option; ?> = <?php echo $value; ?></div>
            <?php
        }
    }

}

loop ( $names );

In the loop function, we go through the array, and if the array contains another array, we run loop on it. The beauty of this function is that we pass the key as well as the child array, so we don’t lose the level that we’re on. An entry located at $names[0]['admins'][2]['superadmins']['users'] = 'Bob' would be correctly displayed by the loop function as names[0][admins][2][superadmins][users] = Bob The quotes aren’t printed by this example, but you easily add them in.

Similarly, archiving files could use a function that looks at the contents of the current folder, archives any files, then runs itself on any subfolders.

Find this code inspiring? Or do you know of any other techniques that save time and effort?


Back to Top ↑

3 Comments so far

Leave a comment
  1. 1

    When writing recursive functions you must be aware of that stack can be limited. You can always replace a recursive call with an array and a foor loop if needed.

  2. 2

    Good point, Mudi. Note that qsort(3) uses a recursive algorithm, but it’s unlikely to blow the stack because it’s only invoked log2 N times.

    Also, Ankur, any algorithm that can be expressed recursively can also be expressed iteratively, and the iterative version is almost always more efficient. While recursive algorithms are easier for the developer to maintain, converting it to an iterative algo will probably improve your app’s performance.

  3. 3

    Thanks Mudi and Blake.

    It’s true that iteration is generally more efficient in terms of processing speed, but to traverse a multi-dimensional array or directory would really take a lot of code to accomplish iteratively. If I’m mistaken perhaps you could show me how it’s done?

    In any case, recursive functions at least make sure nothing is missed - every element has its turn.

RSS feed for comments on this post. TrackBack URI

Leave a comment

Comments may be edited for formatting.