Friday, January 16, 2015

Recursive File and Directory Manipulation in Python (Part 3)



Reposted from Python Central.

In Part 2 of this series we expanded our file-searching script to be able to search for multiple file extensions under a tree, and to write the results (all paths to files of matching extensions found) to a log file. Now that we've come to the final part of the series, we'll add more functionality (in the form of functions) to our script to be able to move, copy, and even delete the results of the search.
Before looking at the move/copy/delete functions, we'll first take the subroutine of logging the results to file and encapsulate that in a function also. The following is what that part of our script looked like before:
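As a rough illustration of that inline logging step, here is a minimal sketch in Python 3. It assumes that found is a dictionary mapping each extension to a list of matching paths and that logname holds the path of the log file; the sample paths, the log location, and the message format are made up for the example.

```python
import os
import tempfile

# Hypothetical search results: extension -> list of matching paths.
found = {".txt": ["/data/a.txt", "/data/sub/b.txt"], ".log": []}

# Hypothetical log location, placed in the temp directory for this example.
logname = os.path.join(tempfile.gettempdir(), "search_results.log")

# Inline logging, not yet wrapped in a function.
with open(logname, "w") as logfile:
    for ext, paths in found.items():
        logfile.write("%d file(s) with extension %s:\n" % (len(paths), ext))
        for path in paths:
            logfile.write(path + "\n")
```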

In order to put it into a function definition, we simply put the definition statement above, adding the appropriate arguments, and indent the rest accordingly (note how found changes to results and logname to logpath):
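A sketch of what such a function might look like follows. The name logres is hypothetical (any name would do), and the log format is an assumption; only the results and logpath argument names come from the description above.

```python
import os
import tempfile


def logres(results, logpath):
    """Write every path in results to the file at logpath."""
    with open(logpath, "w") as logfile:
        for ext, paths in results.items():
            logfile.write("%d file(s) with extension %s:\n" % (len(paths), ext))
            for path in paths:
                logfile.write(path + "\n")


# Usage with made-up results and a temporary log file.
results = {".txt": ["/data/a.txt"], ".csv": ["/data/b.csv", "/data/c.csv"]}
logpath = os.path.join(tempfile.mkdtemp(), "results.log")
logres(results, logpath)
```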
For error-reporting purposes, we'll also define a small function to write an error log, and we'll see why shortly. The function takes three arguments and writes a list of strings to an error log:
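One possible shape for this function is sketched below. The choice and order of the three arguments (the name of the operation, the list of failed paths, and the log path) is an assumption, as is the wording of the header line.

```python
import os
import tempfile


def logerr(action, errors, errlog):
    """Write the paths in errors to errlog, under a one-line header."""
    with open(errlog, "w") as logfile:
        logfile.write("Could not %s %d file(s):\n" % (action, len(errors)))
        for path in errors:
            logfile.write(path + "\n")


# Usage with a made-up failure and a temporary log file.
errlog = os.path.join(tempfile.mkdtemp(), "errors.log")
logerr("move", ["/data/locked.txt"], errlog)
```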
With both of our logging functions defined, we'll now write our functions to perform batch operations on our results list from our file search. We will first look at how to perform a batch move on the files found from their original locations to a target directory. The function we will use to actually move the files is the move function from the shutil module (imagine that :P), so we'll want to add this statement to the beginning of our script:
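The import itself is a single line. The short sketch below also moves one throwaway file into a temporary directory to show the call in action; the file and directory names are made up, and the return value of move is only available on Python 3.3 and later.

```python
import os
import tempfile
from shutil import move

# Create a throwaway source file and an empty destination directory.
src_dir = tempfile.mkdtemp()
dest_dir = tempfile.mkdtemp()
src = os.path.join(src_dir, "example.txt")
with open(src, "w") as f:
    f.write("hello")

# Because dest_dir is an existing directory, the file is moved into it.
new_path = move(src, dest_dir)
```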
For our function definition, instead of acting directly on the found variable in the script, we'll have our function take in a dictionary of results and act on it. It also needs the path to the directory to move the files to and, optionally, a path for an error log. Finally, it needs a variable to store the errors (a list of paths to the files that could not be moved):
Before writing the rest of the function definition, a few important notes on the move function. First, it moves the source to a destination of the same type: if the source path is a directory, the destination will also be a directory, and likewise for files. Second, if the destination exists and is a file, then the source must be a file or the function will fail. In other words, if the destination is a directory, the source (either file or directory) will be moved into the destination directory, but if the destination is a file, the source may only be a file (we can't move a directory into a file). With that said, all we need to do is make sure the dest argument to batchmove is an existing directory, so we'll use a try statement after testing for it:
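A minimal sketch of such a check is shown here as a standalone helper. The helper name ensure_dest is hypothetical, and creating a missing directory with os.makedirs is an assumption about what the try statement does. Instead of waiting for input and exiting, this version returns False so the caller can decide how to alert the user.

```python
import os
import tempfile


def ensure_dest(dest):
    """Return True if dest is, or can be made into, an existing directory."""
    if os.path.isdir(dest):
        return True
    try:
        # Fails with OSError if dest is an existing file or cannot be created.
        os.makedirs(dest)
    except OSError as exc:
        print("Cannot use %s as a destination: %s" % (dest, exc))
        return False
    return True


# Usage: one path that can be created and one that is already a file.
base = tempfile.mkdtemp()
new_dir = os.path.join(base, "moved_files")
blocking_file = os.path.join(base, "not_a_dir.txt")
with open(blocking_file, "w") as f:
    f.write("x")

created = ensure_dest(new_dir)
rejected = ensure_dest(blocking_file)
```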
This way, if dest cannot be used as a directory, the script will alert the user and wait before exiting. With our destination directory checked, we can add the core of the function: looping through results and moving each file. Here is the loop:
The keys in results are simply the file extensions searched for, so only the values are needed, and for each path in the current paths list in the values, we try to move it to our destination. If the move fails, the path is added to the error list. When the loop completes, a message is printed to the standard output.
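The loop just described can be sketched as follows. It is wrapped in a hypothetical helper, move_all, so that it can run on its own; the sample files live in temporary directories, and the completion message is an assumption.

```python
import os
import shutil
import tempfile
from shutil import move


def move_all(results, dest):
    """Try to move every path in results into dest; return the failures."""
    errors = []
    # The keys (extensions) are not needed, only the lists of paths.
    for paths in results.values():
        for path in paths:
            try:
                move(path, dest)
            except (OSError, shutil.Error):
                errors.append(path)
    print("Finished moving files to %s" % dest)
    return errors


# Usage: one real file and one path that does not exist.
src_dir = tempfile.mkdtemp()
dest_dir = tempfile.mkdtemp()
good = os.path.join(src_dir, "a.txt")
with open(good, "w") as f:
    f.write("x")
missing = os.path.join(src_dir, "missing.txt")

errors = move_all({".txt": [good, missing]}, dest_dir)
```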
After the loop completes, we'll want to log any errors encountered with our logerr function like so:
Finally, we'll have the script print a final message and exit:
Putting it all together, here is what our batchmove function looks like:
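As a hedged sketch of how the pieces could fit together in Python 3: the argument order of logerr, the messages, and the use of os.makedirs are assumptions. Unlike the version described above, this one does not wait for input, and it returns the error list rather than exiting at the end so that it can be reused and tested.

```python
import os
import shutil
import tempfile
from shutil import move


def logerr(action, errors, errlog):
    """Write the paths in errors to errlog, under a one-line header."""
    with open(errlog, "w") as logfile:
        logfile.write("Could not %s %d file(s):\n" % (action, len(errors)))
        for path in errors:
            logfile.write(path + "\n")


def batchmove(results, dest, errlog=None):
    """Move every path in results into dest; return the paths that failed."""
    errors = []
    # dest must be an existing directory; try to create it if it is missing.
    if not os.path.isdir(dest):
        try:
            os.makedirs(dest)
        except OSError as exc:
            raise SystemExit("Cannot use %s as a destination: %s" % (dest, exc))
    for paths in results.values():
        for path in paths:
            try:
                move(path, dest)
            except (OSError, shutil.Error):
                errors.append(path)
    print("Move complete.")
    if errlog and errors:
        logerr("move", errors, errlog)
    print("%d file(s) could not be moved." % len(errors))
    return errors


# Usage with temporary files; missing.txt is deliberately absent.
src_dir = tempfile.mkdtemp()
dest_dir = os.path.join(tempfile.mkdtemp(), "target")
errlog = os.path.join(src_dir, "move_errors.log")
good = os.path.join(src_dir, "a.txt")
with open(good, "w") as f:
    f.write("x")
missing = os.path.join(src_dir, "missing.txt")

failed = batchmove({".txt": [good, missing]}, dest_dir, errlog)
```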
Now that we have the batchmove function, defining the batchcopy function only requires changing the function call in the innermost loop to copy2, which is also in the shutil module (and changing the messages accordingly), so the full definition would look like this:
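A sketch of batchcopy along the same lines follows, under the same assumptions as before: the logerr argument order and the messages are illustrative, and the function returns the error list instead of exiting.

```python
import os
import shutil
import tempfile
from shutil import copy2


def logerr(action, errors, errlog):
    """Write the paths in errors to errlog, under a one-line header."""
    with open(errlog, "w") as logfile:
        logfile.write("Could not %s %d file(s):\n" % (action, len(errors)))
        for path in errors:
            logfile.write(path + "\n")


def batchcopy(results, dest, errlog=None):
    """Copy every path in results into dest; return the paths that failed."""
    errors = []
    if not os.path.isdir(dest):
        try:
            os.makedirs(dest)
        except OSError as exc:
            raise SystemExit("Cannot use %s as a destination: %s" % (dest, exc))
    for paths in results.values():
        for path in paths:
            try:
                # copy2 also tries to preserve metadata such as timestamps.
                copy2(path, dest)
            except (OSError, shutil.Error):
                errors.append(path)
    print("Copy complete.")
    if errlog and errors:
        logerr("copy", errors, errlog)
    return errors


# Usage with temporary files; missing.txt is deliberately absent.
src_dir = tempfile.mkdtemp()
dest_dir = tempfile.mkdtemp()
good = os.path.join(src_dir, "a.txt")
with open(good, "w") as f:
    f.write("x")
missing = os.path.join(src_dir, "missing.txt")

failed = batchcopy({".txt": [good, missing]}, dest_dir)
```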
The two functions we've defined should be useful enough that we don't need a deletion function, but if we wanted one, we would only have to remove the dest check from batchmove and change the function call in the inner loop to os.remove like so:
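A deletion variant might look like the sketch below. The name batchdel is hypothetical, and as in the earlier sketches the logerr argument order and the messages are assumptions. The example only deletes a throwaway file created in a temporary directory.

```python
import os
import tempfile


def logerr(action, errors, errlog):
    """Write the paths in errors to errlog, under a one-line header."""
    with open(errlog, "w") as logfile:
        logfile.write("Could not %s %d file(s):\n" % (action, len(errors)))
        for path in errors:
            logfile.write(path + "\n")


def batchdel(results, errlog=None):
    """Permanently delete every path in results; return the failures."""
    errors = []
    for paths in results.values():
        for path in paths:
            try:
                os.remove(path)  # permanent: nothing goes to a recycle bin
            except OSError:
                errors.append(path)
    print("Deletion complete.")
    if errlog and errors:
        logerr("delete", errors, errlog)
    return errors


# Usage with a throwaway file; missing.txt is deliberately absent.
work_dir = tempfile.mkdtemp()
doomed = os.path.join(work_dir, "old.txt")
with open(doomed, "w") as f:
    f.write("x")
missing = os.path.join(work_dir, "missing.txt")

failed = batchdel({".txt": [doomed, missing]})
```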
This is just to show how we'd implement the deletion subroutine, but actually it's not recommended, because whatever files Python deletes are deleted permanently (not sent to the Recycle Bin!). It would be safer simply to move the files to a folder with batchmove and delete them from there, but of course the choice is up to you :).
Now that we've defined these functions, all we'd need to do to utilize them is call them after the search loop with found as the results argument, and whatever paths to log files we want accordingly, so finding and moving our files around will be a breeze even if we don't know where they are!
