Keeping it Small and Simple

2007.12.13

Creating a tree utility in Python, part 2

Filed under: Python Tutorial — Lorenzo E. Danielsson @ 15:01

In the previous tutorial, I showed you how to recursively traverse a directory tree and list all the files and directories it contained. The output the program produced looks slightly spartan, so let’s work on that.

Adding lines

The tree utility draws lines. Each entry is preceded by a '|--' (a pipe character and two dashes). We will add that to our own implementation. While we’re at it, we will increase the indentation a little so that our items line up.


#! /usr/bin/env python

# Show the contents of a directory tree.

import sys
import os

def print_tree(path, indent=''):
    """ Recursively print the contents of a directory. """
    for file in os.listdir(path):
        fullpath = path + "/" + file
        # Every entry is prefixed with a pipe and two dashes.
        print(indent + '|-- ' + file)

        if os.path.isdir(fullpath):
            # Sub-directory contents are indented by four more spaces.
            print_tree(fullpath, indent+'    ')

# Process command-line arguments.
dir = os.getcwd()
if len(sys.argv) == 2:
    dir = sys.argv[1]
elif len(sys.argv) > 2:
    print("Usage: %s [path]" % sys.argv[0])
    sys.exit(0)

# Make sure we really have a path.
if not os.path.isdir(dir):
    print("E: that is not a valid path")
    sys.exit(0)

print_tree(dir)
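To see what this version produces without hunting for a suitable directory, here is a small sketch that applies the same logic to a throwaway tree. It is not part of the tutorial's program: the helper `tree_lines`, the sample file names, the use of `os.path.join`, and the sorting of entries (added only to make the result predictable) are all assumptions made for the illustration.

```python
import os
import shutil
import tempfile

def tree_lines(path, indent=''):
    """Return, as a list, the lines that the version above would print."""
    lines = []
    for name in sorted(os.listdir(path)):  # sorted only for a predictable order
        fullpath = os.path.join(path, name)
        lines.append(indent + '|-- ' + name)
        if os.path.isdir(fullpath):
            lines.extend(tree_lines(fullpath, indent + '    '))
    return lines

# Build a tiny sample tree (hypothetical names) in a temporary directory.
root = tempfile.mkdtemp()
try:
    os.mkdir(os.path.join(root, 'docs'))
    open(os.path.join(root, 'docs', 'readme.txt'), 'w').close()
    open(os.path.join(root, 'zebra.txt'), 'w').close()
    sample_output = tree_lines(root)
finally:
    shutil.rmtree(root)

for line in sample_output:
    print(line)
# Prints:
# |-- docs
#     |-- readme.txt
# |-- zebra.txt
```

Notice that nothing connects the pipe in front of docs to the pipe in front of zebra.txt: the line in between starts with spaces only.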

A problem related to the tree lines

Gradually, our tree program is taking shape. But there is still a lot left to do. First of all, our tree output doesn’t quite look right. Our vertical lines “break” any time we get to a sub-directory with content. This seems easy to fix: when we recurse into a sub-directory, we put a pipe character at the start of the extra indentation.


#! /usr/bin/env python

# Show the contents of a directory tree.

import sys
import os

def print_tree(path, indent=''):
    """ Recursively print the contents of a directory. """
    for file in os.listdir(path):
        fullpath = path + "/" + file

        print(indent + '|-- ' + file)

        if os.path.isdir(fullpath):
            # Start the extra indentation with a pipe so the vertical line continues.
            print_tree(fullpath, indent+'|    ')

# Process command-line arguments.
dir = os.getcwd()
if len(sys.argv) == 2:
    dir = sys.argv[1]
elif len(sys.argv) > 2:
    print("Usage: %s [path]" % sys.argv[0])
    sys.exit(0)

# Make sure we really have a path.
if not os.path.isdir(dir):
    print("E: that is not a valid path")
    sys.exit(0)

print_tree(dir)

Well, the good news is that we no longer have “broken” lines in the tree output. But you may have noticed another problem instead: every vertical line now extends to the very bottom of the output (it’s a bit hard to explain, so just try running the program on a few different directory trees and you will see what I mean). What should happen is that each vertical line extends only as far as the last item in its own directory.
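If you don’t have a handy directory to try this on, the effect can be reproduced with a small sketch that runs the same logic over a nested dictionary instead of the file system. The dictionary layout (a name maps to `None` for a file, or to another dictionary for a directory), the helper name `tree_lines`, and the sample names are assumptions made for this illustration only.

```python
def tree_lines(tree, indent=''):
    """Apply the always-add-a-pipe logic to a nested dict and return the lines."""
    lines = []
    for name in sorted(tree):
        lines.append(indent + '|-- ' + name)
        children = tree[name]
        if children is not None:  # a dict stands for a directory
            lines.extend(tree_lines(children, indent + '|    '))
    return lines

# Hypothetical tree: docs is the last entry at the top level.
sample = {'a.txt': None, 'docs': {'notes': {'todo.txt': None}}}

overlong = tree_lines(sample)
for line in overlong:
    print(line)
# Prints:
# |-- a.txt
# |-- docs
# |    |-- notes
# |    |    |-- todo.txt
```

Here docs is the last entry at the top level and notes is the last entry inside docs, yet both vertical lines keep running down to the final row.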

The last item in the list

To solve this, we need to know whether the item we are currently processing is the last one in the list. There are several ways we can do this. I’ve chosen one that I find readable and fairly easy to understand.

I now store the output of os.listdir() in a variable called files. I then use a slightly different loop, as you can see below. The variable i contains the index of an item in the files list. To know whether or not I’m on the last item in the list, I can compare i with len(files) - 1. To access the item itself, I now have to use files[i].
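The index comparison described above can be tried on its own before it goes into the tree program. In this sketch the list contents are made up and stand in for whatever os.listdir() would return; the names `is_last` and `flags` are introduced only for the illustration.

```python
# A stand-in for the result of os.listdir(); the names are hypothetical.
files = ['bin', 'docs', 'setup.py']

flags = []
for i in range(0, len(files)):
    # The last valid index is always one less than the length of the list.
    is_last = (i == len(files) - 1)
    flags.append((files[i], is_last))
    print(files[i] + ' last=' + str(is_last))
# Prints:
# bin last=False
# docs last=False
# setup.py last=True
```

One of the other ways would be `for i, name in enumerate(files):`, which hands you the index and the item together, so `files[i]` is no longer needed inside the loop.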

Having the ability to know whether the current file we are processing is the last one in the directory listing allows us to fix the problem with the vertical lines extending to the bottom of the tree output. Look at the following program.


#! /usr/bin/env python

# Show the contents of a directory tree.

import sys
import os

def print_tree(path, indent=''):
    """ Recursively print the contents of a directory. """
    files = os.listdir(path)
    for i in range(0, len(files)):
        fullpath = path + "/" + files[i]

        print(indent + '|-- ' + files[i])

        if os.path.isdir(fullpath):
            if i == len(files) - 1:
                # Last entry: nothing follows, so don't continue the vertical line.
                print_tree(fullpath, indent+'    ')
            else:
                print_tree(fullpath, indent+'|    ')

# Process command-line arguments.
dir = os.getcwd()
if len(sys.argv) == 2:
    dir = sys.argv[1]
elif len(sys.argv) > 2:
    print("Usage: %s [path]" % sys.argv[0])
    sys.exit(0)

# Make sure we really have a path.
if not os.path.isdir(dir):
    print("E: that is not a valid path")
    sys.exit(0)

print_tree(dir)

It’s beginning to look better and better. But there is still something not quite right. Comparing our output to the output of the tree utility, we notice that in tree, the last item in each list is preceded by a '`--' instead of the usual '|--'. Here again we take advantage of the fact that we rewrote our loop to know whether or not we are processing the last file.

Let’s look at our final version of the program for today:


#! /usr/bin/env python

# Show the contents of a directory tree.

import sys
import os

def print_tree(path, indent=''):
    """ Recursively print the contents of a directory. """
    files = os.listdir(path)
    for i in range(0, len(files)):
        fullpath = path + "/" + files[i]

        # The last entry in a directory gets a backtick instead of a pipe.
        if i == len(files) - 1:
            print(indent + '`-- ' + files[i])
        else:
            print(indent + '|-- ' + files[i])

        if os.path.isdir(fullpath):
            if i == len(files) - 1:
                print_tree(fullpath, indent+'    ')
            else:
                print_tree(fullpath, indent+'|    ')

# Process command-line arguments.
dir = os.getcwd()
if len(sys.argv) == 2:
    dir = sys.argv[1]
elif len(sys.argv) > 2:
    print("Usage: %s [path]" % sys.argv[0])
    sys.exit(0)

# Make sure we really have a path.
if not os.path.isdir(dir):
    print("E: that is not a valid path")
    sys.exit(0)

print_tree(dir)
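The program above tests `i == len(files) - 1` twice for each entry. One possible variation, offered as a sketch and not as the tutorial's own code, makes that decision once and keeps both results. It also collects the lines in a list so the result can be checked. The names `tree_lines`, `connector` and `child_indent`, the sample files, the sorting of entries, and the use of `os.path.join` are all assumptions of this example.

```python
import os
import shutil
import tempfile

def tree_lines(path, indent=''):
    """Collect the tree output as a list of strings instead of printing it."""
    lines = []
    files = sorted(os.listdir(path))  # sorting is an addition, for stable output
    for i in range(0, len(files)):
        fullpath = os.path.join(path, files[i])
        # Decide once whether this is the last entry, then reuse the answer.
        if i == len(files) - 1:
            connector, child_indent = '`-- ', indent + '    '
        else:
            connector, child_indent = '|-- ', indent + '|    '
        lines.append(indent + connector + files[i])
        if os.path.isdir(fullpath):
            lines.extend(tree_lines(fullpath, child_indent))
    return lines

# Build a small throwaway tree (hypothetical names) and render it.
root = tempfile.mkdtemp()
try:
    os.makedirs(os.path.join(root, 'docs', 'notes'))
    os.mkdir(os.path.join(root, 'src'))
    for parts in (('docs', 'guide.txt'),
                  ('docs', 'notes', 'todo.txt'),
                  ('src', 'main.py')):
        open(os.path.join(root, *parts), 'w').close()
    result = tree_lines(root)
finally:
    shutil.rmtree(root)

for line in result:
    print(line)
# Prints:
# |-- docs
# |    |-- guide.txt
# |    `-- notes
# |        `-- todo.txt
# `-- src
#     `-- main.py
```

The vertical line next to docs now stops at src, and nothing is drawn below the last entry of any directory.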

There we go. Now the output looks quite decent. We will end here for now. Next time, we’ll add a few command-line options to our tree utility.

It is important that you understand how these small programs work. If you are having problems, just take out a pen and paper. Keep track of the values of the different variables and work your way through the loop, each time drawing the lines on the paper that the computer would draw on the screen. That way you will easily see exactly how the program works.
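If you want to check your pen-and-paper notes, a small tracing sketch can print the same values you would write down. It walks a nested dictionary instead of a real directory (a name maps to `None` for a file, or to another dictionary for a directory); that layout, the function name `trace`, and the sample names are assumptions made for this illustration.

```python
def trace(tree, indent='', depth=0, rows=None):
    """Record (depth, i, name, is_last, indent) for every entry visited."""
    if rows is None:
        rows = []
    names = sorted(tree)
    for i in range(0, len(names)):
        is_last = (i == len(names) - 1)
        rows.append((depth, i, names[i], is_last, indent))
        children = tree[names[i]]
        if children is not None:  # a dict stands for a directory
            if is_last:
                trace(children, indent + '    ', depth + 1, rows)
            else:
                trace(children, indent + '|    ', depth + 1, rows)
    return rows

# Hypothetical tree: one directory with one file, followed by one file.
sample = {'docs': {'guide.txt': None}, 'setup.py': None}

rows = trace(sample)
for depth, i, name, is_last, indent in rows:
    print('depth=%d i=%d last=%s indent=%r name=%s'
          % (depth, i, is_last, indent, name))
```

Each printed row corresponds to one pass through the loop, so you can compare it line by line with what you drew on paper.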

I believe that it helps you to work through the program if you type it yourself, line by line. That is part of the reason that I supply the full code to each example instead of just the lines that change. Try to avoid yanking and putting the code. If you really want to learn, do yourself the favor of just spending a few extra minutes typing the program.

Also, the principles behind this program work in other programming languages too, so if you know how to program in another language, say Java, you should easily be able to adapt this program for Java.
