You are at your desk at WeSellEverythingAndAnythingMegacorp, the fantastic e-commerce you work for. You’re trying to write some code while thinking intensively about your lunch. Suddenly, a voice break your incredible concentration. It’s Davina, your colleague developer, hired only a few moons ago. She has a problem and she thinks you can help her.
She needs to display a directory tree, and she believes that she needs to use recursion. But she never quite understood the concept.
Recursion. The word alone brings back some painful and confused memory. You didn’t do much recursion the last ten years of glorious copy past from Stack Overflow. Where to begin? What example to give?
You begin to search the deep Internet, and you stumble upon this article. Welcome, dear reader! It’s now my task to help you.
If you’re used to implement iterations all over the place, using for
, while
, foreach
, for in
, or similar constructs, the concept of recursion can be difficult to grasp. It’s another way to think about a problem and its solution, and, only for that, it’s great. Widening our skills for problem solving is always useful, arguably more than learning the last trendy framework.
While iteration can solve everything you want, a whole range of problems are way, way easier to solve using recursion. To grasp the good mindset for understanding and applying recursive solutions, we’ll see in this article:
- What’s recursion.
- How to recursively calculate a sum.
- How to recursively change words in a sentence.
- How to display a tree of directory with an unknown depth using recursion.
This article include some exercises. You and Davina will be granted Great Invoker of the Recursive Demon Unleashed if you try to solve each of them! Here are the rules: spend from 20 minutes to an hour for each exercise. No more, no less. In general, put some good will in them, but avoid anger and deep sadness.
When you feel you’re blocked, you can as well do a break and come back to it later. This is very often the best thing to do.
Now that you know the rules, let’s go hunting the Ghost of Recursion all together!
What Is Recursion?
Before diving into examples, we shall first understand what recursion is. Let’s define it, thanks to the computer science wiki:
A method where the solution to a problem depends on solutions to smaller instances of the same problem.
Decomposing a problem is a very useful technique to solve it without headaches. I spoke already about it when I revisited the Single Responsibility Principle.
In short, decomposing a problem means:
- Breaking down a problem into smaller and simpler sub-problems.
- Solving the sub-problems to solve the main problem.
If these sub-problems are all smaller instances of the same problem, it means that a solution to a sub-problem is valid for every other sub-problems. You need to apply the same solution to every single one of them, again and again.
When you reach the smallest sub-problem, your main problem will be solved, too. How cool is that?
How To Do Recursion?
To apply a recursive solution to a problem, you need to go through two steps:
- Finding the base case.
- Finding the recursive steps.
The Base Case
Recursion can be seen as a reduction from the bigger problem to the simplest, smallest instance of the same problem. The smallest of all sub-problems is called the base case. This is what we should find first.
In the real world, your recursive process will often take the shape of a function. Like any function, it will take some input and give back some output. The base case is the simplest process the function can do for a given input.
The Recursive Step
When you’ve found the base case, you’ve solved the smallest sub-problem, but you still have to solve every other ones.
The recursive step is the reduction we spoke about earlier: applying the same solution to every sub-problem. You’ll reduce the main problem into a chain of smaller sub-problems, until your problem is so simple it reaches the base case.
Recursion In Action
Davina looks at you, desperate, horribly lost. You begin to question as well where I’m heading to while reading this article. You’re right: enough theory! It’s now practice time.
I would advise you to try to solve the problems given in this article. You’ll then understand my theoretical babbling from earlier, and, as a result, how to apply recursion to a wide range of problems. You’ll remember it, too, because you tried it with your own brain.
These exercises are in PHP. If you know already a language with a C-like syntax, it shouldn’t be too hard for you to follow. You can as well try to translate the examples in another language. They are pretty short, so it shouldn’t take weeks.
If you don’t understand anything about PHP, you can still ask me to do them in another language. I would be happy to do it depending on the need.
Sum of Range
Let’s take a simple problem for the beginning: calculating the sum for a range of positive integers, starting from 0.
For example:
- Sum range to 5:
0 + 1 + 2 + 3 + 4 + 5 = 15
- Sum range to 10:
0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55
- Sum range to n:
0 + 1 + 2 + 3 + 4 + 5 + ... + n = ???
Your challenge, if you accept it, is to write a function which has these properties:
- The variable
n
is the argument of the function. - The output have to be the sum of the range from 0 to
n
.
Iterative Solution
Before using recursion, let’s try to solve the problem using the good old iterative for
loop. Davina is already solving it, so fire your best editor and let’s go!
<?php
function iterativeSumRange($n)
{
$total = 0;
for ($i = 0; $i <= $n; $i++) {
$total += $i;
}
return $total;
}
echo iterativeSumRange(5);
// => 15
This can be seen as an accumulation: we first go from 0, and then we add the good numbers to $total
.
Recursive Solution
To solve the problem using recursion, we need to go through the two steps we saw above.
The Base Case
The base case is the smallest possible sub-problem. What would be the value of n
which would make our function trivial?
If n
is 0, the sum of range from 0 to 0 is… 0. This is the smallest sub-problem of our main problem. With this base case, we can already write the beginning of our function:
<?php
function recursiveSumRange($n) {
if ($n == 0) {
return $n;
}
}
The Recursive Steps
As we said, recursion can be seen as a reduction. We’ll take n
and we’ll reduce it till we reach our base case, when n
equals 0.
Our main problem is now: to sum a range of positive integers from n
to 0. If n
is equal to 5, it will be: 5 + 4 + 3 + 2 + 1 + 0 = 15
.
How to decompose this problem to smallest instance of the same problem? A smaller sub-problem would be adding one number at a time instead of adding 5 different numbers.
We begin with n == 5
. The first sub-problem would be adding a number. The solution is adding 4, which is (n - 1)
. Now, n == 4
, and we can continue to apply this solution: adding (n - 1)
, which is 3
. We apply this solution again and again, till we reach our base case, n == 0
. That’s all. We’re done.
Now, you should be able to try to complete our recursiveSumRange
function we began earlier.
<?php
function sumRange($n)
{
if ($n == 0) {
return $n;
} else {
return $n + sumRange($n - 1);
}
}
echo sumRange(5);
; => 15
Calling sumRange
again and again allows us to decrease n
by 1 at each step, as we wanted.
Most of the time, people explain recursion by calling the same function repeatedly. Even if it’s partially true, we shouldn’t think about it that way.
What happens here is much more than repeating the call of a function. It’s more useful to think of it as a chain of deferred operations.
Let’s zoom on each return statement for each step of the recursive process:
return 5 + sumRange(4)
return 5 + 4 + sumRange(3)
return 5 + 4 + 3 + sumRange(2)
return 5 + 4 + 3 + 2 + sumRange(1)
return 5 + 4 + 3 + 2 + 1 + sumRange(0)
return 5 + 4 + 3 + 2 + 1 + 0
15
From the first step to the 5th, you can see an expansion. Every sum operation we need are chained. However, nothing is added at that point: that’s why we speak of deferred operations. They will be executed later.
At the 5th call of the sumRange
function, we gives 0
as argument. It’s our base case! That’s why only 0 is returned from sumRange
at the 6th step. It’s where the reduction happens. The function sumRange
doesn’t call itself anymore so every summation operation can be executed.
These deferred operations are not visible in your code or in your output: they are in memory. The program needs to held them somehow, to be able to execute them at the end.
If we would had not specified the base case, the recursive process would never end. For example:
<?php
function sumRange($n)
{
return $n + sumRange($n - 1);
}
echo sumRange(5);
What happens in that case?
return 5 + sumRange(4)
return 5 + 4 + sumRange(3)
return 5 + 4 + 3 + sumRange(2)
return 5 + 4 + 3 + 2 + sumRange(1)
return 5 + 4 + 3 + 2 + 1 + sumRange(0)
return 5 + 4 + 3 + 2 + 1 + 0 + sumRange(-1)
return 5 + 4 + 3 + 2 + 1 + 0 + (-1) + sumRange(-2)
…
return 5 + 4 + 3 + ... + sumRange(-488117)
return 5 + 4 + 3 + ... + sumRange(-488118)
At that point, everything crash and I get a wonderful error message telling me that I blew up the memory allocation limit. As I was saying above, every deferred operation were held in memory, until there weren’t enough memory to contain them all!
Changing Words In a Sentence
Davina seems now full of hope to understand what recursion is. She never really thought about it that way. That’s great! Let’s continue our recursive exploration.
I’m sure you always wanted to change words in a sentence recursively. Personally, it’s my biggest dream.
The goal is to design a function which takes a string as argument, and recursively change some words. The words “i” and “me” will become “you”, the word “you” will become “i”.
We’ll only deal with lowercase sentences to make things easier.
For example, if you give the sentence i do not know if you hear me
as argument of your recursive function, the output must be you do not know if i hear you
.
To help you out, here are some helper functions you can use:
<?php
// Return first word of a string
function firstWord($sentence) {
return explode(" ", $sentence)[0];
}
// Return every word of a string except the first
function butFirstWord($sentence) {
return implode(" ", array_slice(explode(" ", $sentence), 1));
}
// Replace a word by another
function changeWord($word) {
switch ($word) {
case "i":
case "me":
return "you";
break;
case "you":
return "i";
break;
default:
return $word;
break;
}
}
You remind to Davina to find the best case first, and then the recursive steps.
If you have difficulties to find the recursive solution, you can try to solve the problem with an iterative loop first. I have faith in you. You can make it!
Here’s the recursive solution:
<?php
function replaceWords($sentence)
{
if ($sentence == "") {
return $sentence;
} else {
return changeWord(firstWord($sentence)) . " " . replaceWords(butFirstWord($sentence));
}
}
This example is similar to the previous one. This time, we don’t have deferred summations, but deferred concatenations, while changing the words as we go.
First, we had to find the base case. What’s the value of $sentence
which makes everything trivially simple? An empty sentence, which return an empty sentence. We write:
<?php
function replaceWords($sentence)
{
if ($sentence == "") {
return $sentence;
}
}
Second, we need the recursive steps. Since the sentence can be of any length, it’s easier to consider one word at a time. Then, we can change it if we need to, before looking at the next word. If there is no word to change anymore, we reach our base case (the empty sentence).
That’s all! Who said that recursion was difficult?
Displaying a Directory Tree
At that point, Davina ask you this legitimate question: “What’s the point to know how to create recursive functions if we can use good old iterative loops instead? Can we use loops for my problem?”.
You look at Davina’s problem. You answer is clear: no, iteration would not be a good solution for that.
Davina is developing a backoffice for the content department of the e-commerce you’re working for. Copywriters want to be able to choose images of product from any sub-directory of a file system.
She needs to display a tree of directories which can have any depth, that is, an unknown number of nested sub-directories.
Thinking about the problem with Davina, you come up with a simple solution: having each directory and sub-directories as keys of a hashmap. You could then iterate through it and display each directory.
To make the problem more concrete and easier to reason about, you decide to create an arbitrary directory tree as follow:
dirs
├── 1
│ ├── 1-1
│ └── 1-2
│ ├── 1-2-1
│ └── 1-2-2
├── 2
│ ├── 2-1
│ │ ├── 2-1-1
│ │ │ ├── 2-1-1-1
│ │ │ └── 2-1-1-2
│ │ │ └── 2-1-1-2-1
│ │ └── 2-1-2
│ └── 2-2
└── 3
└── 3-1
From there, you would like to populate a PHP array with the directory names as keys, nesting them to represent the depth of each sub-directory. Something like this:
<?php
Array
(
[dirs] => Array (
[1] => Array (
[1-1] => Array ()
[1-2] => Array (
[1-2-1] => Array ()
[1-2-2] => Array ()
)
)
[2] => Array (
[2-1] => Array (
[2-1-1] => Array (
[2-1-1-1] => Array ()
[2-1-1-2] => Array (
[2-1-1-2-1] => Array ()
)
)
[2-1-2] => Array ()
)
[2-2] => Array ()
)
[3] => Array (
[3-1] => Array ()
)
)
)
First things first, we need to create these directories. Easy peasy! Open a terminal, go in your filesystem where the root directory should be, and run this one-liner:
mkdir -p dirs/{1/{1-1,1-2/{1-2-1,1-2-2}},2/{2-1/{2-1-1/{2-1-1-1,2-1-1-2/2-1-1-2-1},2-1-2},2-2},3/3-1}
To guarantee your success, you’ll need these functions, too. Copy them where you’ll implement your solution:
<?php
// Return all the direct children (sub-directories) of the $path
function subDirectories($path) {
$files = array_diff(scandir($path), array('..', '.'));
$directories = [];
foreach ($files as $f) {
if (is_dir(absolutePath($path, $f))) {
$directories[] = $f;
}
}
return $directories;
}
// Concatenate an absolute path with a directory
function absolutePath($path, $dir) {
return $path . DIRECTORY_SEPARATOR . $dir;
}
We can first try to solve the problem using iterative loops. Remember, even if we defined an arbitrary example, the depth could go on an on.
Don’t spend too much time on your iteration. It might be possible using a stack, but I didn’t really try. You’ll see the likely failed attempt developers will usually do below:
<?php
// Iterative approach
function displayDirectories($path, $subDirs) {
$totalDir = [];
$absPath = dirPath($path, $subDirs);
foreach (subDirectories($path) as $sub) {
$totalDir[$sub] = [];
$subPath = dirPath($path, $sub);
foreach(subDirectories($subPath) as $subSub) {
$totalDir[$sub][$subSub] = [];
$subPath = dirPath($path, $subSub);
// foreach...
}
}
return $totalDir;
}
print_r(displayDirectories(".", "dirs"));
// => Return only two levels of sub-directories
This solution would works if we knew the depth of the directory tree. Since we don’t, we don’t know either how deep we need to loop.
The Base case
Let’s write the base case. What would be the easiest case we could solve?
<?php
function recursiveDisplayDirectories($path, $subDirs) {
if (empty($subDirs)) {
return $subDirs;
}
}
print_r(recursiveDisplayDirectories(".", "dirs"));
The easiest case to solve would be a directory without any sub-directory.
The Recursive Steps
To understand better the problem, the structure of a filesystem can be displayed as a tree data structure. Recursion is very well suited to parse this kind of data structure.
Here it is:
The dotted lines are there to remind us that there could be even more sub-directories going down.
Now that we have a clear visual representation of our problem, let’s decompose it even more:
- Every directory has potentially an unlimited amount of direct children.
- For example, the directory
dirs
has three direct sub-directories,1
,2
,3
, but it could have more.
- For example, the directory
- The depth of sub-directories, from one node to a leaf node (a node without any child) is unknown.
- For example, the path from the node
dirs
to the node2-1-1-2-1
could be even deeper. There could be another subdirectory2-1-1-2-1-1
, for example.
- For example, the path from the node
The first point can be easily solved. I gave you the function subDirectories($path)
which will give you every direct children of a directory. We can use an iterative loop to parse them.
The second point is more tricky to solve. This is where we need recursion. We could imagine going deeper and deeper in the tree for each sub-directory using recursion, until we reach a leaf node, which is as well our base case.
We could create an array for each level we go through, and when we reach a leaf-node, we nest every array we created into each other.
I think we have now all the information we need to implement the recursive steps.
Don’t worry if it feels too difficult for you or Davina. It can be tricky to think in term of recursion when you’re used to write iterations. Focus, give it a good try, and even if you don’t succeed, it will bring you closer to the breakthrough you’ll have one day. I promise.
1<?php
2// recursive approach
3function recursiveDisplayDirectories($path, $subDirs) {
4 $total = [];
5 if (empty($subDirs)) {
6 return $subDirs;
7 }
8
9 foreach($subDirs as $d) {
10 $newPath = dirPath($path, $d);
11 $total[$d] = recursiveDisplayDirectories($newPath, subDirectories($newPath));
12 }
13
14 return $total;
15}
16
17print_r(recursiveDisplayDirectories(".", ["dirs"]));
What’s happening here?
Line 4
: we create a new array. It will be the array for this level of directories, including every sub-directory.Line 5 to 7
: we have our base case.Line 9
: we begin to iterate through every sub-directory of the current directory.Line 11
: the recursive process begins. For each sub-directory, we go a level deeper by updating the$path
, and give as argument the sub-directories of the new path.Line 5
: When we reached our base case for every branch of the tree, we return an empty array. Every array created during the expansion will be returned too, nesting one into another.Line 14
: Finally, we return our nested array after going through every branch of the tree.
Here’s an optional exercise: what about implementing recursiveDisplayDirectories
using only recursion, without the foreach
?
Is Recursion Still a Mystery?
Davina is delighted to have solved her problem, and you’re happy to have, again, helped a colleague who needed it.
There are more to know about recursion, especially regarding performances (memory used and time to execute). The last example didn’t include many levels of sub-directories, but if you have more of them, you can end up consuming way too much memory.
Some languages, like Haskell or some LISP dialects, specifically optimize some form of recursion to make it faster while using less memory. It’s called tail call optimization. The performances can be very close to an iterative loop. If you want me to speak about that and in what cases it applies, shout out in the comment.
If you want more exercises, you can contact me. I’m on social media (you’ll find them on the about page), or you can let a comment.
What did we learn together?
- Recursion is a way to divide a problem into smaller sub-problems. The solution should solve every sub-problem, one by one.
- A recursive solution to a problem must have two steps: the base case (the smallest problem to solve) and the recursive steps (applying the same solution over and over till we reach the base case).
- Iteration can be used instead of recursion. Yet, recursive solutions can be simpler for specific types of problem, like tree-shaped data structure or data structures with unknown depth.