P2 : Python
Finding Duplicate Filenames
Objectives
- Use a scripting language
- Use an associative array
Description
In this assignment, you are going to write a small program in Python that
uses a type called a dictionary which is basically an associative array.
Specification
The purpose of this program is to produce a table listing all the unique
filenames that occur in a set of directories which are entered from the
keyboard.
For each filename, you need to store a list of all the directories that
contain that file.
- The program should read a set of directory names from standard input.
- Use a dictionary to store the name of each file (as the key) and a list of
directories in which that file appears.
- The program should write a table containing a line for each file
name to standard output. Each name should contain the file name, the
number of directories containing that file and a list of the directories where
it occurs (in that order).
- The output should be sorted by file name.
Can you also sort the paths that are listed for each individual file? How hard
would it be to sort by the number of directories that contain the file?
Describe the approach you would take in your README file.
- The program will be worth worth more points if you recursively search for
files in any subdirectories that you find.
Testing
Download or copy from the directory
on onyx the following files:
the testDirs directory
has some directories with sometimes duplicated files which you can run your
program on. You can make a symbolic link to testDirs by typing
ln -s ~tcole/teaching/cs354/handouts/p2/testDirs
Be sure the first line of your program has the line
#!/usr/bin/python
in it and then make your program executable by typing
chmod +x filelist.py
Now you can run the test script by typing
./testProg
FYI: testProg is a shell script written in bash. You should be able to
tell what it does by reading the file in an editor or listing it to the screen
with the cat command.
Submission
Required files
Be sure the filenames match exactly. I use a test script to run the
programs; if it doesn't find the files it expects, you will automatically lose
5% of the points for the assignment.
- filelist.py
- Any test programs you wrote to figure out how python statements worked can
be included in a directory called practice. Be sure to describe these
programs in your README file.
- README : This file should be plain text. Include a brief description of
the program and a list of submitted files. This file should also include
a brief evaluation of the python language in terms of readability,
writability and ease of learning.
Put all the required files for the assignment in a single directory, change
to that directory and submit using the following command:
submit tcole cs354 2