Dec 25, 2010

Managing backups or log files often involves deleting older versions of your archives, so that at any given time you are storing only a particular number of copies. Many scripts rely on the timestamps of the files, deleting those that are older than some set time (a day, a week, a month), but this is a dangerous approach. What if the automated process that creates those files silently fails?

For example, let’s say a database dump occurs nightly: a SQL file is generated, compressed, and stored in a dumps directory. You then write a script that deletes all the files in the dumps directory that are more than a week old. Your plan is to keep only the last seven SQL files. Everything runs fine at first, but at some point, due to changed permissions, the database dump stops running properly. Your clean-up script keeps executing, however. All files more than a week old are deleted, even though no new files are being created. After a week, the directory is empty. You don’t have a single backup of your database.
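The failure mode described above can be reproduced in a few lines. This is only a sketch: the function name delete_older_than_week and the dump file names are made up for illustration, and a temporary directory with backdated files stands in for a dumps directory whose nightly job stopped running long ago.

```shell
# Hypothetical age-based cleanup, shown only to illustrate the failure mode.
delete_older_than_week () {
  # -mtime +7 matches files last modified more than a week ago.
  find "$1" -type f -mtime +7 -exec rm -- {} +
}

# Simulate a dump job that silently stopped: every file left in the
# directory is old, and nothing new is arriving.
dumps=$(mktemp -d)
for day in 01 02 03; do
  touch -t "202001${day}0000" "$dumps/dump-$day.sql.gz"
done

delete_older_than_week "$dumps"

# Count what survived the cleanup: nothing did.
remaining=$(ls "$dumps" | wc -l)
echo "files remaining: $remaining"
rmdir "$dumps"   # succeeds only because the directory is now empty
```

The cleanup did exactly what it was told to do. It never checks how many backups remain, so nothing stops it from deleting the last one.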

The problem here is a bad assumption. You shouldn’t assume that deleting everything older than a week will result in seven files sticking around. If your goal is to keep seven files, then you should write a script that ensures seven files are kept. Luckily this is easy in bash.

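# Usage: keep_newest directory number_files_to_keep
keep_newest () {
  # Require exactly two arguments: the directory and the count to keep.
  if [ $# -ne 2 ]; then
    return 1
  fi

  local FILES_COUNT=$(ls -1 "$1" 2>/dev/null | wc -l)
  local FILES_TOO_MANY=$((FILES_COUNT - $2))

  if [ "$FILES_TOO_MANY" -gt 0 ]; then
    local CUR_DIR=$(pwd)
    cd "$1" || return 1
    # ls -t lists newest first, so the tail holds the oldest files.
    ls -t | tail -n "$FILES_TOO_MANY" | xargs rm --
    cd "$CUR_DIR"
  fi
}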

The bash function above takes two parameters: a directory and the number of files to keep. First, it counts the files in the directory by listing them one per line and piping that listing to wc -l, which returns the number of lines (here, the number of files). Next it subtracts the target number from the actual count; if the result is greater than zero, there are too many files. It then records the current directory so it can finish where it started, switches to the target directory, and deletes the extra files. Because ls -t sorts by modification time with the newest first, the last lines of its output are the oldest files, so piping it through tail -n selects exactly the files to delete. Note that xargs splits its input on whitespace, so as written this works only for file names without spaces.
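As a possible variation on the same idea, the sketch below avoids changing directories and reads names line by line, so file names containing spaces survive (names containing newlines still would not). The name keep_newest_safe and the example path are hypothetical and not part of the original function.

```shell
# Hypothetical variant of keep_newest: same idea, but no cd and no xargs.
# Usage: keep_newest_safe directory number_files_to_keep
keep_newest_safe () {
  [ $# -eq 2 ] || return 1
  local dir=$1 keep=$2 name
  # ls -t prints newest first; tail -n +N starts output at line N,
  # so everything after the first $keep entries is surplus.
  ls -t "$dir" | tail -n +"$((keep + 1))" | while IFS= read -r name; do
    rm -- "$dir/$name"
  done
}

# Example call (hypothetical path):
#   keep_newest_safe /var/backups/dumps 7
```

Because the function counts files rather than checking their age, running it repeatedly while no new dumps arrive leaves the newest seven files in place.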

© 2010 Geektastical