Backups are important. Not keeping every backup is equally important, especially when you back up big databases and don't have the space to keep 500TB of backup data.
Therefore I wrote a script that deletes old backups. I wanted to keep at least one backup per quarter for several years (until I or someone else cleans them up by hand). I also wanted to keep the last 4 weekly full backups, plus another 4 backups, one for each of the last 4 months.
The script can handle multiple databases. By that I mean it can handle multiple directories, where each directory holds the backups of one database. It is written for DB2 databases, but it can easily be adapted for other databases. I wrote a lot of comments in the code, so it should be self-explanatory.
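When adapting the script to another database, the part most likely to change is how the backup time stamp is read from the file name. Here is a minimal sketch of that step, assuming the DB2-style name used as the example in the script's comments (time stamp in the sixth dot-separated field). The function name `backup_timestamp` is made up for illustration and does not appear in the script.

```bash
#!/bin/bash
# Hypothetical helper: print the time stamp of a DB2-style backup file name.
# Assumes the time stamp is the 6th dot-separated field of the name.
backup_timestamp() {
    local -a fields
    IFS='.' read -r -a fields <<< "$1"
    printf '%s\n' "${fields[5]}"
}

name="dbtest.0.db2inst.NODE0000.CATN0000.20120426021540.001"
stamp=$(backup_timestamp "$name")
year_month=${stamp:0:6}   # year plus month, used to recognise "one backup per month"
month=${stamp:4:2}        # month only, used for the quarter check

echo "$stamp $year_month $month"   # prints: 20120426021540 201204 04
```

For a different naming scheme, only the field index (or the splitting character) would need to change, as long as the time stamp still starts with the year and month.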
This is what I came up with:
#!/bin/bash
shopt -s nullglob # make sure the for loop gives back either matching files or nothing at all.

# example file name: dbtest.0.db2inst.NODE0000.CATN0000.20120426021540.001

# add a new value for each directory it should check. Every directory is checked separately.
locations[0]="/path/to/a/backupdir/"
locations[1]="/path/to/another/backupdir/"

# function to check if a value is present in an array.
containsElement () {
  local e
  for e in "${@:2}"; do [[ "$e" = "$1" ]] && return 0; done
  return 1
}

for location in "${locations[@]}"
do
  # create an array that only contains the filenames that end on '001'
  # (db2 can split backups into multiple files, we want to check only one)
  file=()
  for uniquefile in "$location"*.001
  do
    file+=("$(basename "$uniquefile")")
  done

  # order the array, newest files first
  unset sorted
  readarray -t sorted < <(printf '%s\0' "${file[@]}" | sort -rz | xargs -0n1)

  z=0            # keep count of files that are already selected to stay
  yearMonths=()  # keep track of which months (with year included, form: 201205) are already selected to stay
  explicitMonths=( 01 04 07 10 ) # the months of which one backup file should always stay (every quarter)
  stay=()        # keep an array of files that need to stay

  for a in "${sorted[@]}"; do
    date=$(echo "$a" | awk -F '.' '{print $6}')
    yearMonth=${date:0:6}
    month=${date:4:2}

    # keep the weekly backups for 4 weeks
    if [ "$z" -lt 4 ]
    then
      stay[$z]="$a" # add the filename to the 'stay' array.
      z=$((z + 1))
      # remember the month of the last saved backup. we don't need extra backups for that month anymore.
      yearMonths[0]="$yearMonth"
    # keep one backup per month for 4 more months (4 weeks + 4 months).
    # the year can be included, since the unnecessary backups will already be
    # removed by next year, when the script checks again.
    elif [ "$z" -lt 8 ]
    then
      if ! containsElement "$yearMonth" "${yearMonths[@]}"
      then
        stay[$z]="$a"
        z=$((z + 1))
        yearMonths+=("$yearMonth")
      fi
    fi

    # keep one backup per quarter
    if containsElement "$month" "${explicitMonths[@]}" && ! containsElement "$yearMonth" "${yearMonths[@]}"
    then
      stay[$z]="$a"
      z=$((z + 1))
      yearMonths+=("$yearMonth")
    fi
  done

  # loop again over all files, only delete the files that are not present in the 'stay' array.
  for a in "${sorted[@]}"; do
    # check if the element has to stay or not, and check that the element isn't empty
    # (with an empty name the * below would match every file in the directory).
    if ! containsElement "$a" "${stay[@]}" && [[ -n "$a" ]]
    then
      # also delete the files which belong together. (ending on .001, .002, .003, ...)
      rm -- "$location${a%???}"*
    fi
  done
done
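Before letting a script like this delete anything, it can help to see which backups the rules would keep. The following is a rough dry-run sketch, not part of the script above. The function names `select_keepers` and `contains` are made up for illustration, the sample time stamps are invented, and nothing is read from or removed on disk. It applies the same three rules (4 newest, then one per new month until 8 are kept, then one per quarter month) to a list of names given newest first.

```bash
#!/bin/bash
# Hypothetical dry run: print the names the retention rules would keep.

contains() {   # usage: contains NEEDLE ITEM...
    local e
    for e in "${@:2}"; do [[ "$e" == "$1" ]] && return 0; done
    return 1
}

select_keepers() {   # arguments: backup names, newest first
    local kept=0 seen=() quarter_months=(01 04 07 10)
    local name stamp year_month month
    for name in "$@"; do
        # the time stamp is assumed to be the 6th dot-separated field
        IFS=. read -r _ _ _ _ _ stamp _ <<< "$name"
        year_month=${stamp:0:6}
        month=${stamp:4:2}
        if (( kept < 4 )); then
            # the 4 newest backups always stay
            seen[0]=$year_month
        elif (( kept < 8 )) && ! contains "$year_month" "${seen[@]}"; then
            # then one backup for each month not covered yet
            seen+=("$year_month")
        elif contains "$month" "${quarter_months[@]}" &&
             ! contains "$year_month" "${seen[@]}"; then
            # after that, one backup per quarter month
            seen+=("$year_month")
        else
            continue   # not selected: the real script would delete this one
        fi
        kept=$((kept + 1))
        echo "$name"
    done
}

# Invented sample dates (newest first), turned into DB2-style names.
stamps=(20120624 20120617 20120610 20120603 20120527 20120520 20120429
        20120325 20120226 20120129 20120122 20111225 20111127 20111030
        20111023 20110925)
names=()
for s in "${stamps[@]}"; do
    names+=("dbtest.0.db2inst.NODE0000.CATN0000.${s}021540.001")
done

select_keepers "${names[@]}"
```

With these sixteen sample names, ten are printed: the four from June 2012, one each for May, April, March and February 2012, and the quarter backups from January 2012 and October 2011. The second backups from May 2012, January 2012 and October 2011, and the ones from December, November and September 2011, are the ones that would be deleted.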