Feb
So, you’re managing or developing a Ruby on Rails application that you deploy regularly using Capistrano (as any sane Rails developer should), and you want a really easy way to sync up the database on your remote production server with the development environment you have running on your local machine. This is really useful for debugging your app, especially if you have a lot of user-generated content. It’s also just really helpful for design and development if your local environment looks exactly the same as your live environment.
Well, I know it’s not that hard to log in to your remote database, export the whole database, copy it back to your machine, and load it into your development environment. But after a few times doing this, you realize there’s got to be a better way.
Since Capistrano gives us so much flexibility to do pretty much anything on our remote machine, there is an easy way. Here are a few tasks that you can freely copy and paste into your application that will automate all this:
- Connect to your production database server
- Dump your production MySQL database (sorry, this only works for MySQL right now!)
- Download the MySQL dump file to your local machine
- Replace the contents of your local development environment with the data and structure from your live production environment
Usage
rake db:production_data_refresh
Yep! That’s all there is to it. It uses the database connection information that you have already set up in your config/database.yml file, so no passwords are ever transmitted. It also uses SSH and SFTP to do all the remote communication and data transfer, so everything is nice and secure.
It is highly recommended that you set up public key authentication so you don’t have to log in with your username and password every time Capistrano wants to connect to your remote host.
Get your copy and paste finger ready … the code is right below the fold
Add these tasks to your lib/tasks/my_app.rake file:
namespace :db do
desc "Dump the current database to a MySQL file"
task :database_dump do
load 'config/environment.rb'
abcs = ActiveRecord::Base.configurations
case abcs[RAILS_ENV]["adapter"]
when 'mysql'
ActiveRecord::Base.establish_connection(abcs[RAILS_ENV])
File.open("db/#{RAILS_ENV}_data.sql", "w+") do |f|
if abcs[RAILS_ENV]["password"].blank?
f << `mysqldump -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} #{abcs[RAILS_ENV]["database"]}`
else
f << `mysqldump -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} -p#{abcs[RAILS_ENV]["password"]} #{abcs[RAILS_ENV]["database"]}`
end
end
else
raise "Task not supported by '#{abcs[RAILS_ENV]['adapter']}'"
end
end
desc "Refreshes your local development environment to the current production database"
task :production_data_refresh do
`rake remote:exec ACTION=remote_db_runner --trace`
`rake db:production_data_load --trace`
end
desc "Loads the production data downloaded into db/production_data.sql into your local development database"
task :production_data_load do
load 'config/environment.rb'
abcs = ActiveRecord::Base.configurations
case abcs[RAILS_ENV]["adapter"]
when 'mysql'
ActiveRecord::Base.establish_connection(abcs[RAILS_ENV])
if abcs[RAILS_ENV]["password"].blank?
`mysql -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} #{abcs[RAILS_ENV]["database"]} < db/production_data.sql`
else
`mysql -h #{abcs[RAILS_ENV]["host"]} -u #{abcs[RAILS_ENV]["username"]} -p#{abcs[RAILS_ENV]["password"]} #{abcs[RAILS_ENV]["database"]} < db/production_data.sql`
end
else
raise "Task not supported by '#{abcs[RAILS_ENV]['adapter']}'"
end
end
end
Now, add these tasks to your config/deploy.rb file:
desc 'Dumps the production database to db/production_data.sql on the remote server'
task :remote_db_dump, :roles => :db, :only => { :primary => true } do
run "cd #{deploy_to}/#{current_dir} && " +
"#{rake} RAILS_ENV=#{rails_env} db:database_dump --trace"
end
desc 'Downloads db/production_data.sql from the remote production environment to your local machine'
task :remote_db_download, :roles => :db, :only => { :primary => true } do
execute_on_servers(options) do |servers|
self.sessions[servers.first].sftp.connect do |tsftp|
tsftp.get_file "#{deploy_to}/#{current_dir}/db/production_data.sql", "db/production_data.sql"
end
end
end
desc 'Cleans up data dump file'
task :remote_db_cleanup, :roles => :db, :only => { :primary => true } do
delete "#{deploy_to}/#{current_dir}/db/production_data.sql"
end
desc 'Dumps, downloads and then cleans up the production data dump'
task :remote_db_runner do
remote_db_dump
remote_db_download
remote_db_cleanup
end
Enjoy! Please let me know if anything doesn’t work the way you think it should.

The task doesn’t work, would I replace the reference to wamily with my database name on production?
@ashif – sorry about that, I forgot to change the namespace from wamily to db. I updated the code above and it should work now. Let me know if you still have trouble.
I created a rake task that did a remote mysqldump, but yours works better because I now don’t have to constantly grant access when my IP changes.
I also updated the task to rehash all user passwords so that I can login as anyone.
Thanks for all the hard work.
Thanks for the script! I’ve been trying to deal with how to move DB’s around for Radiant CMS development and this foots the bill.
I did run into one problem that I fixed however… you need to change all of the
#{abcs[RAILS_ENV][“host”]}
to
#{“localhost” unless abcs[RAILS_ENV][“host”]}
These changes the assumption that if the host is blank in the database.yml files, the localhost database should be dumped.
I know… its not DRY… sue me.
Hey thats neat! Thanks for sharing.
Just a note that the current version of cap doesn’t do the rake remote:exec thing anymore. Replace with cap #task: task :production_data_refresh do `cap remote_db_runner` `rake db:production_data_load—trace` end
also I had to make the localhost edits as mentioned above.
Oh and dont forget to deploy the changes you make before running the task! Ha!
Again thanks alot.
Once doing all the changes in previous comments, I had a problem with the final `rake db:production_data_load—trace`
It just needs a space before the—trace.
Thanks for the script everyone, it’s great! Who knows why this isn’t already built into capistrano..
Hi!
your’e missing a colon on the Capistrano link.
cheers! Udi
has anyone done this the other way, from production down to local?
In response to Ivan’s comment, I thought it might be useful to get something that I use fairly regularly .
Good article!! Thank’s
Good article!! Thank you!
Hello. I love your site!!
Nice! It’s still a hit to your production database though, and running over the network it could take a while.
Run this task on a database slave that lives in the dev center (rather than on the production cluster) and you’ve got a nice setup!