Channel: Bert blogt » fedora

database backup retention script

Backups are important. Not keeping every backup is equally important, especially when you back up big databases and don’t have the space to keep 500TB of backup data. ;-)

Therefore I wrote a script that deletes old backups. I wanted to keep at least one backup per quarter for several years (until I, or someone else, clean them up by hand). I also wanted to keep the last 4 weekly full backups, plus another 4 backups, one for each of the last 4 months.

The script can handle multiple databases. By that I mean: it can handle multiple directories, where each directory holds the backups of one database. It is written for DB2 databases, but it can easily be adapted for other databases, of course. I put a lot of comments in the code, so it should be self-explanatory.
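As an aside on adapting it: every retention decision depends on the timestamp embedded in the file name, so supporting another naming scheme would mostly mean changing how that field is extracted. Here is a minimal sketch, assuming the DB2-style layout of the example string used in the script, where the timestamp is the 6th dot-separated field. The helper name `backup_timestamp` is made up for this illustration and is not part of the script.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script): print the timestamp
# field of a backup file name, assuming it is the 6th dot-separated field.
backup_timestamp() {
  local -a fields
  # split the name on dots into an array; fields[5] is the 6th field
  IFS='.' read -r -a fields <<< "$1"
  printf '%s\n' "${fields[5]}"
}

name="dbtest.0.db2inst.NODE0000.CATN0000.20120426021540.001"
ts=$(backup_timestamp "$name")
echo "timestamp: $ts"        # timestamp: 20120426021540
echo "yearMonth: ${ts:0:6}"  # yearMonth: 201204
echo "month: ${ts:4:2}"      # month: 04
```

For a different naming scheme, only the field index (or the splitting character) would need to change; the substring offsets stay the same as long as the timestamp starts with YYYYMM.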

This is what I came up with:

#!/bin/bash
shopt -s nullglob #a glob that matches nothing expands to nothing, instead of to the literal pattern.
#example file name: dbtest.0.db2inst.NODE0000.CATN0000.20120426021540.001
#add a new value for each directory it should check. Every directory will be checked separately.
locations[0]="/path/to/a/backupdir/"
locations[1]="/path/to/another/backupdir/"

#function to check if a value is present in an array.
containsElement () {
  local e
  for e in "${@:2}"; do [[ "$e" = "$1" ]] && return 0; done
  return 1
}

for location in "${locations[@]}"
do
  unset i
  i=0
  unset file
  #create an array that only contains the filenames that end in '001'
  #(db2 can split backups into multiple files, we want to check only one)
  for uniquefile in "$location"*.001
   do
    file[$i]="$(basename "$uniquefile")" #quoted, so paths with spaces survive
    let i++
  done

  #order the array. newest files first
  unset sorted
  readarray -t sorted < <(printf '%s\0' "${file[@]}" | sort -rz | xargs -0n1)

  unset z
  z=0 #keep count of files that are already selected to stay

  unset yearMonths
  yearMonths=() #keep track of which months (with year included, form: 201205) are already selected to stay
  explicitMonths=( 01 04 07 10 ) #select the months of which a backup file should always stay once. (every quarter)
  unset stay
  stay=() #keep an array of files that need to stay.
  for a in "${sorted[@]}";
   do

    date=$(echo "$a" | awk -F '.' '{print $6}') #the 6th dot-separated field is the timestamp
    yearMonth=${date:0:6}
    month=${date:4:2}

    #keep the weekly backups for 4 weeks
    if [ "$z" -lt "4" ]
     then
      stay[$z]="$a" #add the filename to the 'stay' array.
      let z++
      #remember the month of the last saved backup. we don't need extra backups for that month anymore.
      yearMonths[0]="$yearMonth"

    #keep one backup for each of 4 more months (4 weeks + 4 months). the year is included in the month key, since the unnecessary backups will already be removed by next year, when the script checks again.
    elif [ "$z" -lt "8" ]
     then
      if ! containsElement "$yearMonth" "${yearMonths[@]}"
       then
        stay[$z]="$a"
        let z++
        yearMonths+=("$yearMonth")
      fi
    fi

    #keep one backup per quarter: the newest backup of each explicit month stays, however old it is
    if containsElement "$month" "${explicitMonths[@]}" && ! containsElement "$yearMonth" "${yearMonths[@]}"
     then
      stay[$z]="$a"
      let z++
      yearMonths+=("$yearMonth")
    fi
  done

  #loop again over all files, only delete the files that are not present in the 'stay' array.
  for a in "${sorted[@]}";
   do
    #check if element has to stay or not, and check that the element isn't empty. (with an empty name the pattern below would be "$location*", and the * would do bad things :)
    if ! containsElement "$a" "${stay[@]}" && [[ -n "$a" ]]
     then
        #also delete the files which belong together. (ending in .001, .002, .003, ...)
        rm -- "$location${a%???}"*
    fi
  done
done
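
Before pointing the script at real backups, it can help to see which ones the rules would keep. The following is a minimal sketch that mirrors the selection logic above on a plain list of timestamps, without touching any files. The function names `contains` and `select_keep` and the sample timestamps are made up for this illustration; it assumes the timestamps are passed newest first, just like the sorted array in the script.

```bash
#!/bin/bash
# Hypothetical dry-run helper mirroring the selection rules of the script.
# It only prints; nothing is deleted.

# return 0 if $1 equals one of the remaining arguments
contains() {
  local e
  for e in "${@:2}"; do [[ "$e" == "$1" ]] && return 0; done
  return 1
}

# arguments: timestamps (YYYYMMDDhhmmss), newest first. prints the ones to keep.
select_keep() {
  local kept=0 ts yearMonth month
  local -a seenMonths=()
  local -a quarterMonths=(01 04 07 10)
  for ts in "$@"; do
    yearMonth=${ts:0:6}
    month=${ts:4:2}
    if (( kept < 4 )); then
      # the 4 newest backups always stay
      echo "$ts"; kept=$((kept + 1))
      seenMonths[0]=$yearMonth
    elif (( kept < 8 )) && ! contains "$yearMonth" "${seenMonths[@]}"; then
      # then one backup for each of up to 4 further months
      echo "$ts"; kept=$((kept + 1))
      seenMonths+=("$yearMonth")
    elif contains "$month" "${quarterMonths[@]}" && ! contains "$yearMonth" "${seenMonths[@]}"; then
      # after that, only the newest backup of each quarter month
      echo "$ts"; kept=$((kept + 1))
      seenMonths+=("$yearMonth")
    fi
  done
}

select_keep 20120527020000 20120520020000 20120513020000 20120506020000 \
            20120429020000 20120422020000 20120325020000 20120226020000 \
            20120129020000 20120122020000 20111225020000 20111030020000 \
            20111023020000 20110925020000 20110731020000
```

With this sample list, the four May backups stay as the weekly set, one backup each from April, March, February and January fills the monthly slots, and from the older ones only the newest October and July backups survive as quarterly copies. Everything else (for example 20120422020000 and 20111225020000) would be deleted.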
