November 01, 2007

Posted by John

Tagged mysql, rsync, server, sftp, ssh, and subversion

KYTCR Part 1: Backups

Again, I would like to repeat that this is not necessarily de facto or best practice; I’m just putting out what is working for me. If you think I’m stupid, feel free to say so, but post an example of a better way or your comment will be deleted. Onward.

So your app is up and running. What is next? The most important thing now is to back everything up. This includes your database, your subversion repository and any files that are not versioned (user uploads, etc.).

First things first, prepare to move offsite

Before we even get to backups, note this: backups are no good if they are only on your server and your server crashes. Push backups offsite with sftp, ssh, scp or rsync. Below is an example yaml config I use called sftp.yml. You’ll see how it comes into use later in this article.

username: stewey
password: iscool
host: familyguy.com
dir: path/to/backup/folder
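
Both backup scripts below read this file with YAML::load and assume every setting is present. A minimal sketch of a check that fails early when one is missing, assuming the four keys shown above; load_sftp_config and REQUIRED_SFTP_KEYS are hypothetical names, not part of the original scripts:

```ruby
require 'yaml'

# Keys the backup scripts read from sftp.yml.
REQUIRED_SFTP_KEYS = %w[username password host dir].freeze

# Parse the YAML text and fail early if a setting is missing or blank.
# (Hypothetical helper, not part of the original scripts.)
def load_sftp_config(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  raise ArgumentError, 'sftp.yml must contain a mapping of settings' unless config.is_a?(Hash)

  missing = REQUIRED_SFTP_KEYS.select { |key| config[key].to_s.strip.empty? }
  raise ArgumentError, "sftp.yml is missing: #{missing.join(', ')}" unless missing.empty?

  config
end

# In a backup script this could replace the YAML::load call:
#   ftp_config = load_sftp_config(File.read('/home/username/bin/sftp.yml'))
```

Since the file holds a plain-text password, it is probably worth making it readable only by the user that runs the backups (chmod 600).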

Database Backups

Database backups for the size of apps we are talking about are pretty simple. I only know mysql so that is all I will show but you could take these principles to any RDBMS. What I did is create a folder in my user home directory named bin. Inside it I keep all the ruby files that do the backup work. Below is ~/bin/database_dump.rb.

#!/usr/bin/ruby
require 'yaml'
require 'logger'
require 'rubygems'
require 'net/ssh'
require 'net/sftp'

APP        = '/path/to/app/current/directory'
LOG_FILE   = '/home/username/log/database.log'
TIMESTAMP  = '%Y%m%d%H%M%S'

log        = Logger.new(LOG_FILE, 5, 10*1024)
dump       = "conductor_#{Time.now.strftime(TIMESTAMP)}.sql.gz"
# get the off server sftp configuration settings
ftp_config = YAML::load(open('/home/username/bin/sftp.yml'))
# get the production database configuration
config     = YAML::load(open(APP + '/config/database.yml'))['production']
cmd        = "mysqldump -u #{config['username']} -p#{config['password']} -h #{config['host']} --add-drop-table --add-locks --extended-insert --lock-tables #{config['database']} | gzip -cf9 > #{dump}"

log.info 'Getting ready to create a backup'
`#{cmd}`
log.info 'Backup created, starting the transfer offsite'
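# NOTE: this is the net-ssh/net-sftp 1.x API. net-ssh 2.x expects the password
# in an options hash (:password => ...) and raises NoMethodError on this call.
Net::SSH.start(ftp_config['host'], ftp_config['username'], ftp_config['password']) do |ssh|
  ssh.sftp.connect do |sftp|
    sftp.open_handle("#{ftp_config['dir']}/#{dump}", 'w') do |handle|
      # read the dump in binary mode and close the file when done
      sftp.write(handle, File.open(dump, 'rb') { |f| f.read })
    end
  end
end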
log.info 'Finished transferring backup offsite'
log.info 'Removing local file'
cmd       = "rm -f #{dump}"
log.debug "Executing: #{cmd}"
`#{cmd}`
log.info 'Local file removed'

So that is the quick and dirty. Basically, I use ruby to run a mysqldump command (which uses the information from our database.yml file) and sftp the backup off the server to another location. I store the sftp settings in a yaml file that I can reuse for each of the backup scripts. What would I do differently? I would move this to a rake task and/or use one of the tools that is already out there. You don’t have to set up crazy tools and all kinds of things. Think inside the box. You have ruby. Ruby is fun. Just use it to get the job done. That is what the script above does. It might seem simple, but it’s been working great for several months. I run it nightly using a cron entry like this:

0 3 * * * /usr/bin/ruby /home/username/bin/database_dump.rb

crontab -e will open up your cron config file, and you can simply paste in the above snippet and change the paths to ruby and your database dump script. The settings above run the dump at 3 AM every night. To test your db dump script, you can run it just like any ruby script (ruby database_dump.rb). Once you have it working like that, throw it in cron as I just mentioned.
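
One thing to watch when this runs unattended from cron: the script runs mysqldump through backticks and never looks at the exit status, so a failed dump would still be gzipped and shipped offsite. A minimal sketch of one way to catch that, using Open3.pipeline from the standard library (Ruby 1.9 or later) instead of a shell pipe; mysqldump_args and run_pipeline! are hypothetical helper names, and the mysqldump flags are the ones from the script above:

```ruby
require 'open3'

# Argument list for the same mysqldump call as database_dump.rb. Passing an
# array means no shell is involved, so the values need no quoting.
def mysqldump_args(config)
  [
    'mysqldump',
    '-u', config['username'],
    "-p#{config['password']}",
    '-h', config['host'],
    '--add-drop-table', '--add-locks', '--extended-insert', '--lock-tables',
    config['database']
  ]
end

# Run the commands as a pipeline, write the final output to out_path, and
# raise if any stage (not just the last one) exits non-zero.
def run_pipeline!(commands, out_path)
  statuses = Open3.pipeline(*commands, out: out_path)
  failed = commands.zip(statuses).reject { |_cmd, status| status.success? }
  return if failed.empty?

  names = failed.map { |cmd, status| "#{cmd.first} (exit #{status.exitstatus})" }
  raise "Backup pipeline failed: #{names.join(', ')}"
end

# Usage in the script, with config and dump as defined there:
#   run_pipeline!([mysqldump_args(config), ['gzip', '-c9']], dump)
```

An uncaught exception makes the script exit non-zero and print to stderr, which cron will normally mail to the account owner.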

Subversion Backups

Ok. So now you have database backups. Sweet. Next up is subversion. But isn’t subversion backup enough? I mean sometimes I commit and go home and svn up and…Stop. Please stop. Subversion is for version control. It is not a code backup solution. Use subversion for version control and do regular svn dumps for backup. I even scoot them offsite with the mysql backups. Below is a simple subversion dump script I threw together:

#!/usr/bin/ruby
require 'yaml'
require 'logger'
require 'rubygems'
require 'net/ssh'
require 'net/sftp'

REPO       = '/path/to/your/repository'
LOG_FILE   = '/home/username/log/svn_dump.log'
TIMESTAMP  = '%Y%m%d%H%M%S'

log        = Logger.new(LOG_FILE, 5, 10*1024)
dump       = "yourappname_svn_#{Time.now.strftime(TIMESTAMP)}.dump.gz"
ftp_config = YAML::load(open('/home/username/bin/sftp.yml'))

log.info 'Starting subversion dump'
cmd        = "svnadmin dump #{REPO} | gzip -c9 > #{dump}"
log.debug "Executing: #{cmd}"
`#{cmd}`

log.info "Dumped #{REPO} to #{dump}"
log.info 'Backup created, starting the transfer offsite'
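# NOTE: this is the net-ssh/net-sftp 1.x API. net-ssh 2.x expects the password
# in an options hash (:password => ...) and raises NoMethodError on this call.
Net::SSH.start(ftp_config['host'], ftp_config['username'], ftp_config['password']) do |ssh|
  ssh.sftp.connect do |sftp|
    sftp.open_handle("#{ftp_config['dir']}/#{dump}", 'w') do |handle|
      # read the dump in binary mode and close the file when done
      sftp.write(handle, File.open(dump, 'rb') { |f| f.read })
    end
  end
end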

log.info 'Finished transferring backup offsite'
log.info 'Removing local file'
cmd = "rm -f #{dump}"
log.debug "Executing: #{cmd}"
`#{cmd}`
log.info 'Local file removed'

Again. Quick and dirty. I just use ruby to run a shell command (svnadmin dump) which is piped into gzip for compression to make the file smaller and the transfer a bit speedier. That is really all there is to it. Once you have an svn dump, it’s really easy to restore. Let’s say the dump was mydump.gz. The following would unzip it and load it into the repository at /var/repos/myapp (svnadmin load needs an existing repository, so run svnadmin create first if you are starting from scratch):

gunzip mydump.gz
svnadmin load /var/repos/myapp < mydump

Pretty simple, eh?
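
A note on the upload step shared by both scripts: it is written against the 1.x releases of the net-ssh and net-sftp gems. From net-ssh 2.0 on, the third argument to Net::SSH.start is an options hash, so passing the password as a plain string raises a NoMethodError. A minimal sketch against the newer API, assuming the net-sftp 2.x gem is installed; sftp_options, remote_path_for and upload_offsite are hypothetical helper names:

```ruby
# Connection options for Net::SFTP.start / Net::SSH.start in net-ssh 2.x and
# later, where the password is an option rather than a positional argument.
def sftp_options(ftp_config)
  { password: ftp_config['password'] }
end

# Where the dump should land on the offsite host.
def remote_path_for(ftp_config, dump)
  File.join(ftp_config['dir'], File.basename(dump))
end

# Upload a local dump file into the configured remote directory.
def upload_offsite(ftp_config, dump)
  require 'net/sftp' # loaded lazily so the helpers above work without the gem

  remote_path = remote_path_for(ftp_config, dump)
  Net::SFTP.start(ftp_config['host'], ftp_config['username'], sftp_options(ftp_config)) do |sftp|
    sftp.upload!(dump, remote_path) # blocks until the transfer finishes
  end
  remote_path
end

# Usage in either script, replacing the Net::SSH.start block:
#   upload_offsite(ftp_config, dump)
```

As far as I can tell, upload! sends the file in chunks instead of reading the whole dump into memory first, which should also be easier on the server for large dumps.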

Other Backups

Other things you need to back up are any user-uploaded files, anything in your capistrano shared directory and any custom-configured files on your server. I was using ruby to tar and gzip most of these files and then sftp them offsite, but I noticed a huge memory spike during this process, so large in fact that it choked out my mongrels. Instead, I would recommend using rsync for anything over a few megs. It was made to move files around, and you can even get cheap, geo-redundant offsite backup from rsync.net for less than $3/GB per month. Well worth the cash, IMO.
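
To stay with the ruby-plus-cron approach used for the other backups, rsync can be driven from a small script as well. A minimal sketch, assuming rsync is installed on both ends and that SSH key authentication is set up for the offsite account (rsync over ssh will not read the password from sftp.yml); rsync_command and rsync_offsite! are hypothetical helper names:

```ruby
# Build an rsync command that copies the contents of a local directory to the
# offsite host over ssh. The flags are standard rsync options:
# -a archive mode, -z compress in transit, -e remote shell to use.
def rsync_command(source_dir, ftp_config)
  # A trailing slash tells rsync to copy the directory's contents.
  source = source_dir.end_with?('/') ? source_dir : "#{source_dir}/"
  target = "#{ftp_config['username']}@#{ftp_config['host']}:#{ftp_config['dir']}/"
  ['rsync', '-az', '-e', 'ssh', source, target]
end

# system with an argument list bypasses the shell; it returns false on a
# non-zero exit and nil if rsync could not be started.
def rsync_offsite!(source_dir, ftp_config)
  cmd = rsync_command(source_dir, ftp_config)
  system(*cmd) or raise "rsync failed: #{cmd.join(' ')}"
end

# Usage, with ftp_config loaded from sftp.yml as in the scripts above:
#   rsync_offsite!('/path/to/app/shared', ftp_config)
```

Because rsync only transfers files that changed since the last run, nightly runs stay small and nothing has to be read into ruby's memory.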

Did I miss anything? Have a better idea? Didn’t understand something? Leave comments below.

13 Comments

  1. Off-site backups! Yes, it is amazing, but I have seen grown human beings make back-ups by making duplicate copies of the data on the same physical disk.

    Another one to consider is backing up configuration information. At any given time your application will have specific versions of Ruby, Rails, individual Gems, MySQL and more. If you are using configuration files and build tools to drive the build of your development and production sandboxes, then you can version development or production instances of a particular configuration. This allows you to easily restore all parts of your application to a previous known point in time. Perhaps overkill for many projects, but it can be great in a multi-developer environment, or if you need an extra layer of accountability for your systems.

  2. Thanks for the scripts. And also for the reminder to back up the shared directory and/or user-uploaded files — shamefully this only occurred to me recently! Using rsync is a good tip and I’ll start using that today.

  3. What about having everything in a virtual machine? Just copy the whole image nightly to an offsite server and you don’t have to write a script for every single type of data you want to secure. Plus it’s extremely quick to put it back when something goes wrong.

  4. Rsync is your friend.

  5. @Kevin – Yep, the backing up key information in the Other Related Links list covers some of what you are talking about.

    @pangel – That works if everything sits on one VPS. Our application is sitting on two, which will make it easier to beef up later when we need to.

    @Sam – Amen.

  6. Ned Baldessin

    Nov 01, 2007

    Quick note: I wrestled for a while with the Backup gem because I wanted to do incremental backups (rotated weekly, etc). Unfortunately, that gem requires you to have the local /path/to/your/backup/ be exactly the same as the distant /path/to/the/backup/folder. It makes things difficult if you are making a backup on a cheap shared host.

    Cheers.

  7. Jacob Atzen

    Nov 01, 2007

    Doing full backups over and over might not be the most space-efficient way of keeping backups. I really like rdiff-backup for keeping a backlog of old backups. It’s based on diffs, so keeping a backup for every day of the last year is actually quite manageable compared to having 365 copies of your database and repository.

    My backup strategy for repositories is to simply do a file-by-file backup of the repository to the place I keep my rdiff-backup. This only works if you use the fsfs backend though.

  8. @Jacob – I typically only keep like a week’s worth of backup.

  9. What about using a cheap account from Dreamhost (U$S 9) just for backup purposes? You can run rsync from there, and have tons of bandwidth and space.

  10. @PabloC – Yep, I do that personally but prefer something a little higher grade for work stuff.

  11. Here is another idea that I was looking at. Check “Duplicity”.

    http://www.brainonfire.net/2007/08/11/remote-encrypted-backup-duplicity-amazon-s3/

  12. PabloC, how does Duplicity (and any other rsync backup) deal with mysql tables? I assume one needs to run mysqldump before rsyncing to the backup site?

  13. Great scripts! I’m having an issue with the script.

    …ruby/lib/ruby/gems/1.8/gems/net-ssh-2.0.0/lib/net/ssh.rb:151:in `start': undefined method `keys' for password (NoMethodError)
    from database_dump.rb:26

About

Authored by John Nunemaker (Noo-neh-maker), a programmer who has fallen deeply in love with Ruby. Learn More.
