New-formula-starburst

Rails rake tasks to sync your remote database to your local development environment

13

So, you’re managing or developing a Ruby on Rails application that you deploy regularly using Capistrano (as any sane Rails developer should), and you want a really easy way to sync up the database on your remote production server with the development environment you have running on your local machine. This is really useful for debugging your app, especially if you have a lot of user-generated content. It’s also just really helpful for design and development if your local environment looks exactly the same as your live environment.

Well, I know it’s not that hard to log in to your remote database, export the whole database, copy it back to your machine, and load it into your development environment. But after a few times doing this, you realize there’s got to be a better way.

Since Capistrano gives us so much flexibility to do pretty much anything on our remote machine, there is an easy way. Here are a few tasks that you can freely copy and paste into your application that will automate all this:

  1. Connect to your production database server
  2. Dump your production MySQL database (sorry, this only works for MySQL right now!)
  3. Download the MySQL dump file to your local machine
  4. Replace the contents of your local development environment with the data and structure from your live production environment

Usage

rake db:production_data_refresh

Yep! That’s all there is to it. It uses the database connection information that you have already set up in your config/database.yml file, so no passwords are ever transmitted. It also uses SSH and SFTP to do all the remote communication and data transfer, so everything is nice and secure.

It is highly recommended that you set up public key authentication so you don’t have to log in with your username and password every time Capistrano wants to connect to your remote host.

Get your copy and paste finger ready … the code is right below the fold

Add these tasks to your lib/tasks/my_app.rake file:

namespace :db do
  desc "Dump the current database to a MySQL file" 
  task :database_dump do
    load 'config/environment.rb'
    abcs = ActiveRecord::Base.configurations
    case abcs[RAILS_ENV]["adapter"]
    when 'mysql'
      ActiveRecord::Base.establish_connection(abcs[RAILS_ENV])
      File.open("db/#{RAILS_ENV}_data.sql", "w+") do |f|
        if abcs[RAILS_ENV]["password"].blank?
          f << `mysqldump -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} #{abcs[RAILS_ENV]["database"]}`
        else
          f << `mysqldump -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} -p#{abcs[RAILS_ENV]["password"]} #{abcs[RAILS_ENV]["database"]}`
        end
      end
    else
      raise "Task not supported by '#{abcs[RAILS_ENV]['adapter']}'" 
    end
  end

  desc "Refreshes your local development environment to the current production database" 
  task :production_data_refresh do
    `rake remote:exec ACTION=remote_db_runner --trace`
    `rake db:production_data_load --trace`
  end 

  desc "Loads the production data downloaded into db/production_data.sql into your local development database" 
  task :production_data_load do
    load 'config/environment.rb'
    abcs = ActiveRecord::Base.configurations
    case abcs[RAILS_ENV]["adapter"]
    when 'mysql'
      ActiveRecord::Base.establish_connection(abcs[RAILS_ENV])
      if abcs[RAILS_ENV]["password"].blank?
        `mysql -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} #{abcs[RAILS_ENV]["database"]} < db/production_data.sql`
      else
        `mysql -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} -p#{abcs[RAILS_ENV]["password"]} #{abcs[RAILS_ENV]["database"]} < db/production_data.sql`
      end
    else
      raise "Task not supported by '#{abcs[RAILS_ENV]['adapter']}'" 
    end
  end

end

Now, add these tasks to your config/deploy.rb file:

desc 'Dumps the production database to db/production_data.sql on the remote server'
task :remote_db_dump, :roles => :db, :only => { :primary => true } do
  run "cd #{deploy_to}/#{current_dir} && " +
    "#{rake} RAILS_ENV=#{rails_env} db:database_dump --trace" 
end

desc 'Downloads db/production_data.sql from the remote production environment to your local machine'
task :remote_db_download, :roles => :db, :only => { :primary => true } do  
  execute_on_servers(options) do |servers|
    self.sessions[servers.first].sftp.connect do |tsftp|
      tsftp.get_file "#{deploy_to}/#{current_dir}/db/production_data.sql", "db/production_data.sql" 
    end
  end
end

desc 'Cleans up data dump file'
task :remote_db_cleanup, :roles => :db, :only => { :primary => true } do  
  delete "#{deploy_to}/#{current_dir}/db/production_data.sql" 
end 

desc 'Dumps, downloads and then cleans up the production data dump'
task :remote_db_runner do
  remote_db_dump
  remote_db_download
  remote_db_cleanup
end

Enjoy! Please let me know if anything doesn’t work the way you think it should.

Comments

  • Avatar ashif sayani said 25 days later:

    The task doesn’t work, would I replace the reference to wamily with my database name on production?

  • Avatar Nate Clark said 25 days later:

    @ashif – sorry about that, I forgot to change the namespace from wamily to db. I updated the code above and it should work now. Let me know if you still have trouble.

  • Avatar Calvin Yu said about 1 month later:

    I created a rake task that did a remote mysqldump, but yours works better because I now don’t have to constantly grant access when my IP changes.

    I also updated the task to rehash all user passwords so that I can login as anyone.

    Thanks for all the hard work.

  • Avatar Brad Gessler said 2 months later:

    Thanks for the script! I’ve been trying to deal with how to move DB’s around for Radiant CMS development and this foots the bill.

    I did run into one problem that I fixed however… you need to change all of the

    #{abcs[RAILS_ENV][“host”]}

    to

    #{“localhost” unless abcs[RAILS_ENV][“host”]}

    These changes the assumption that if the host is blank in the database.yml files, the localhost database should be dumped.

    I know… its not DRY… sue me.

  • Avatar b said 2 months later:

    Hey thats neat! Thanks for sharing.

    Just a note that the current version of cap doesn’t do the rake remote:exec thing anymore. Replace with cap #task: task :production_data_refresh do `cap remote_db_runner` `rake db:production_data_load—trace` end

    also I had to make the localhost edits as mentioned above.

    Oh and dont forget to deploy the changes you make before running the task! Ha!

    Again thanks alot.

  • Avatar aaronv said 2 months later:

    Once doing all the changes in previous comments, I had a problem with the final `rake db:production_data_load—trace`

    It just needs a space before the—trace.

    Thanks for the script everyone, it’s great! Who knows why this isn’t already built into capistrano..

  • Avatar udi said 3 months later:

    Hi!

    your’e missing a colon on the Capistrano link.

    cheers! Udi

  • Avatar Ivan Storck said 5 months later:

    has anyone done this the other way, from production down to local?

  • Avatar Ben Schwarz said 7 months later:

    In response to Ivan’s comment, I thought it might be useful to get something that I use fairly regularly .

    This comment has been flagged for moderator approval. It won't appear on this blog until the author approves it.
  • Avatar Denis said 9 months later:

    Good article!! Thank’s

    This comment has been flagged for moderator approval. It won't appear on this blog until the author approves it.
  • Avatar Denis said 9 months later:

    Good article!! Thank you!

    This comment has been flagged for moderator approval. It won't appear on this blog until the author approves it.
  • Avatar Axel said 9 months later:

    Hello. I love your site!!

    This comment has been flagged for moderator approval. It won't appear on this blog until the author approves it.
  • Avatar Jonathan Boutelle said about 1 year later:

    Nice! It’s still a hit to your production database though, and running over the network it could take a while.

    Run this task on a database slave that lives in the dev center (rather than on the production cluster) and you’ve got a nice setup!

Trackbacks

Use the following link to trackback from your own site:
/articles/trackback/19480