Using and testing multiple databases in rails part 2
I previously published a plugin on this blog that allows you to easily use, test, and migrate multiple databases in rails.
Since then, I've found numerous problems and areas that could be improved about this process. Here they are:
- All databases are automatically cloned through rake tasks that are dependent on standard "rake test" commands.
- Transactions are properly set up across ALL open connections during testing (if transactional fixtures are enabled)
- This plugin is now in use in a large-scale production environment!
- Compatibility with my preload fixtures plugin to make fixture loading super fast and transactional (coming soon to this blog)
Examples of how to use it:
To point a model at a different database:
class OtherDbBase < ActiveRecord::Base use_db :prefix => "otherdb" # will look for otherdb_development, otherdb_test, etc. in database.yml end
To have a migration run on a different database
class AddOtherDbStuff < ActiveRecord::Migration def self.database_model OtherDbBase end end
Sample config/use_db.yml file:
db1: prefix: db1_ db2: prefix: db2_
To run tests across multiple databases (requires file config/use_db.yml):
rake test
To have fixtures load into a different database:
# Do nothing! Just name the fixture the same name as the model is represents (in our case it would be other_db_bases.yml), and the fixture loader will use that class's database connection to insert fixtures.
# I realize this might limit some people, especially those with habtm fixutres on other DBs (since there's no model). I'd love your input on where we should specify which DB connection a fixture should use to load itself.
Want to download this plugin? Download from ELC's SVN repository OR:
./script/plugin install https://wush.net/svn/public/use_db
Why associated models don't save
Acts_as_taggable conflicting with has_one
In our dashboard app, we attempted to use the following in a controller:
def create @employee = Employee.new(params[:employee]) @employee.address = Address.new(params[:address]) @employee.save end class Employee < ActiveRecord::Base has_one :address, :as => :addressable acts_as_taggable end
Result: tags in params[:employee][:tag_list] weren't being saved correctly. I then changed the order in the model:
class Employee < ActiveRecord::Base acts_as_taggable has_one :address, :as => :addressable end
The tag_list gets saved! What gives?
Failed validation weirdness
The culprit in this case was validation. There are two after_save filters involved: acts_as_taggable and has_one. Acts_as_taggable always returns true, continuing the the filter chain, regardless of the success of tagging/tag updating (I this that's a bug actually). Not the case for has_one, however, where the rails system checks that the associated model saves before continuing the filter chain. In our test, we were not passing in enough params[:address] to make the Address.new save correctly.
class Address < ActiveRecord::Base belongs_to :addressable, :polymorphic => true validates_presence_of :street, :city, :state, :postal, :country
This situation was our own fault, but nonetheless not trivial to debug. Rails did not help us understand what went wrong, just that no SQL to save the tags was ever getting run.
class Employee < ActiveRecord::Base has_one :address, :as => :addressable validates_associated :address acts_as_taggable end
Moral of the story: always use validates_associated for has_one and has_many associations because they will not stop the main model from being saved if they fail validation!
Using and Testing Rails with Multiple Databases
Using multiple databases
I recently wrote a rails plugin called "use_db", which allowed you to use a different database for some of your ActiveRecord models. I started by reading this article (which was borrowed from the rails wiki), and decided to make a plugin out of it. You can use it in the following way:class SomeBase < ActiveRecord::Base use_db :prefix => "secdb_" self.abstract_class = true end class OtherDbModel < SomeBase endNow any calls to data in OtherDbModel will go to a database called "secdb_development" (or secdb_test, secdb_production, etc). The database.yml file could have the following additions to support this:
secdb_development: adapter: mysql database: secdb_development username: root secdb_test: adapter: mysql database: secdb_test username: root
Testing multiple databases
One issue with my plugin, as stated on the original article, is that testing becomes very difficult. First, fixtures are automatically inserted into the primary database. Second, other databases will not automatically have their schemas migrated from dev to test.
Solving the fixture problem
I first examined active_record/fixtures.rb and noticed the following problem:
def delete_existing_fixtures @connection.delete "DELETE FROM #{@table_name}", 'Fixture Delete' end def insert_fixtures values.each do |fixture| @connection.execute "INSERT INTO #{@table_name} (#{fixture.key_list}) VALUES (#{fixture.value_list})", 'Fixture Insert' end end
These two methods are called automatically by the test helper when loading fixtures for a test. @connection was originally set to ActiveRecord::Base.connection, so the existing solution was not going to work. To solve this, I overrode those two methods in my plugin and replaced them with the following code:
alias_method :rails_delete_existing_fixtures, :delete_existing_fixtures def delete_existing_fixtures m = get_model return rails_delete_existing_fixtures unless m && m.respond_to?(:uses_db?) && m.uses_db? connection = m.connection connection.delete "DELETE FROM #{m.table_name}", 'Fixture Delete' end alias_method :rails_insert_fixtures, :insert_fixtures def insert_fixtures m = get_model return rails_insert_fixtures unless m && m.respond_to?(:uses_db?) && m.uses_db? connection = m.connection values.each do |fixture| connection.execute "INSERT INTO #{m.table_name} (#{fixture.key_list}) VALUES (#{fixture.value_list})", 'Fixture Insert' end end
This first code attempts to get the model associated with a fixture. If found, it asks that model if it uses a different database. FInally, it uses the connection of the model to execute the fixture INSERT and DELETE SQL commands. If any of this process fails, it falls back on the existing rails fixture methods.
Solving the schema migration problem
Rails typically does schema migrations using a rake task which runs before "rake test". It typically divides the work into 3 segments, dump_db_structure, clone_db_structure, and purge_db. The sequence is as follows:
- dump_db_structure dumps the development schema without data to an adapter-specific SQL file
- purge_db deletes all rows from the test database
- clone_db_structure imports the SQL dump into the test database
I simply duplicated the existing rake code, and modified it to use a different database connection. At the end of the day, I could execute a single command to migrate a second database. I chose to execute the command in my test helped in the following manner:
unless defined?(MIGRATED_SEC_DB_FOR_TEST) UseDbTest.prepare_test_db(:prefix => "secdb_") MIGRATED_SEC_DB_FOR_TEST = true end
The syntax is very similar to the "use_db" helper.
Source code
Download the first release 0.0.1 of use_db rails plugin here.
UPDATE: Look for the next version of this plugin in this blog
RubyGems 0.91 and the "refresh" error
All of us over at ELCTech recently upgraded to RubyGems 0.91, only to find the following error during "gem" operations:
$ sudo gem install acts_as_ferret
ERROR: While executing gem ... (NoMethodError)
undefined method `refresh' for #<Hash:0x127c854>
Apparently the format of the gem cache files have changed, the new version cannot read some older versions of the cache. To fix this problem, delete your source_cache files in the following locations:
sudo rm /usr/local/lib/ruby/gems/1.8/source_cache
rm ~/.gem/source_cache
Looks like someone forgot to think about us old folks with ancient versions of RubyGems. In the future, I'll plan on running "gem update --system" more often!
Installing RMagick properly in OSX
I've seen the craziest problems with people installing the RMagick gem before. In this article, I will show you the procedure I use to fix any and all rmagick problems. First, I start by cleaning out any of the following packages:
sudo port uninstall imagemagick
sudo port uninstall graphicsmagick
sudo port uninstall ghostscript
sudo port uninstall freetype
sudo gem uninstall rmagick
Then, I reinstall them:
sudo port install freetype
sudo port install ghostscript
sudo port install imagemagick
sudo port install graphicsmagick
sudo gem install rmagick
This software cocktail has solved the following problems:
- RMagick gem won't compile native extensions
- RMagick can't find fonts - This is a really common problem, and we have seen a case where this approach will not solve this problem.
- RMagick renders text incorrectly and/or unreadable in the validates_captcha plugin
ImageScience is an alternative to RMagick which requires fewer native libraries, but not zero unfortunately. It may or may not be easier for you to use, depending on what you need RMagick for (it won't do text rendering, for example).
Beating The Browser's Iframe Security
I recently ran into the problem of cross domain communication between iframes while working on RightCart. We wanted to be able to pop up a LightBox outside the cart iframe, but the event had to be triggered by something happening inside the iframe. To make matters worse, the parent frame could be any domain and the child frame was always RightCart. After some research, I found that DoJo and the Windows Live Team claimed to have done this using nested iFrames.
No one else who is using this technique is willing to describe EXACTLY how it is to be done, so here we go:
The situation:
- Parent frame is http://www.somedomain.com/somepath/somefile
- Child iframe is http://rightcart.com/cart
- Parent executes some javascript initially which we control
- Child is completely controlled by us, and wants to call functions in parent
How I did it:
The child knows the URL of the parent through the HTTP referer (sic) header. If it creates a "subchild" iframe within the child pointing to the same top level URL as the parent, then 1-way communication is possible. How? The child can set the location of the subchild (but not read the location) and the parent can read the location of subchild, as long as the domain is the same as the parent.
So how do you transmit messages on a URL? Use the hash!
The hash is the portion of the URL after the # sign. You can change that portion of the URL and not mess with the browser. Most importantly, the browser will not go make a new request if it already has the page, regardless of what the hash value is.
For example, I could send a message to the parent by setting the location of the subchild using document.getElementById('...').src to http://www.somedomain.com#message_is_here. The parent can get the hash of the subchild by executing some javascript like window.frames[0].frames[0].location.hash. Got it? In order to send more complex messages, you'll want to Base64 encode the contents of the hash.
As for synchronization, I gave the messages sequence numbers, so that the parent knew if they had already seen the message. I also set a limit to how often messages could be sent (every 500 ms in my case). So the child had to queue messages if they came in too quickly:
function sendJavascriptToParent(js) { rightcart_comm_queue[rightcart_comm_queue.length] = js; }
This is important, because there's only 1-way communication involved and the parent cannot tell the child that it is ready for the next message. I set both frames up at a 500 ms interval to update and check the message respectively. This is not 100% reliable, but if you're worried, try simply checking for messages twice as often as your minimum update delay.
Here's my code to check for messages in the parent:
function check_for_comm() { var rcFrame = null; var innerFrame = null; rcFrame = window.frames["rightcart"]; if (rcFrame) { innerFrame = rcFrame.frames[0]; if (innerFrame) { var loc = innerFrame.location; debug_print("InnerFrame location = "+loc.href); var frag = loc.hash.substr(1); var seqnum = frag.split("!")[0]; var data = frag.split("!")[1]; debug_print("SeqNum: "+seqnum+", Data="+data); if (data.length > 0 && seqnum != comm_last_seq) { comm_last_seq = seqnum; var decoded_data = decodeBase64(data); debug_print("Running: "+decoded_data); eval(decoded_data); } } } }
And my code to send a message from the child:
function doSend() { if (rightcart_comm_queue.length > 0) { var seq = new Date().getTime(); var ifr = document.getElementById('rightcart_comm_frame'); var all_js = ''; for(var i=0; i<rightcart_comm_queue.length; i++) { var js = rightcart_comm_queue[i]; all_js += js+'\\n'; } var data = encodeBase64(all_js); var url = rightcart_comm_baseurl+'#'+seq+'!'+data; /*alert('Sending Data: '+all_js+' using url '+url);*/ ifr.src = url; rightcart_comm_queue = []; } }
I had trouble using the EXACT same URL as the parent for the subchild. To solve this, I dissected the URL and set it to robots.txt in the same domain. This will cause a browser request for the first message.