Faster applications by using SQL over ORM

February 14, 2008 at 23:00

Filed under: Computing — Pistos @ 23:00

If you only ever interact with your database through your ORM’s interface and syntax, you may be slowing your application down.

Now and then, I see developers do validations, constraints and hooks using their ORM, and think to myself: “Why don’t these people use their database to do that?”

I’ve been comfortable with SQL for both queries and data definition for a long time, so I always write my all schemas and most queries in SQL, by hand. But I began to wonder: Is there really any difference between doing these things natively in the database as opposed to in Ruby? So I wrote up a modest benchmark script. Following are my findings.

I am far from being an expert in benchmarking or statistics. I would be appreciative of anyone that could point out errors in my method. I also acknowledge that the processes covered are not at all exhaustive of the many possible validations and hooks one might use.

The Test Code

In all test output, method 1 uses Ruby, whereas method 2 uses the database.

Test 1: Record creation hook

In this test I compare a before_create Ruby hook to a functional column default.

# Ruby
before_create do
  set( :time_created => Time.now )
end
-- SQL
time_created TIMESTAMP NOT NULL DEFAULT NOW()

Results

Set 1 mean: 7.756 s
Set 1 std dev: 2.373
Set 2 mean: 2.861 s
Set 2 std dev: 0.052
p.value: 1.45088891038497e-11
W: 400.0
The difference (-63.1%) IS statistically significant.

By using a database function to fill in default values on creation, I chopped my record creation time to nearly 1/3 of its original cost.

Test 2: Insertion validation

In this test, I compare validation of the length of a text field; ORM validation versus database validation.

# Ruby
validates_length_of :name, :minimum => 8
-- SQL
name VARCHAR( 255 ) NOT NULL CHECK(
  char_length( name ) > 7
),

Results

Set 1 mean: 3.772 s
Set 1 std dev: 0.993
Set 2 mean: 4.157 s
Set 2 std dev: 0.607
p.value: 0.13441884182804
W: 144.0
The difference (+10.2%) IS NOT statistically
significant.

Although the difference is substantively significant, we are informed by the statistical test that it is not statistically significant. Therefore, we cannot conclude that there is any difference between these two approaches to validation.

Test 3: Deletion hook - cascade deletion

In this test, I compare cascading a delete using your ORM versus the database.

# Ruby
before_destroy do
  Employee1.where( :company_id => pk ).destroy
end
-- SQL
company_id INTEGER NOT NULL
  REFERENCES companies2( id ) ON DELETE CASCADE,

Results

Set 1 mean: 3.581 s
Set 1 std dev: 0.836
Set 2 mean: 0.087 s
Set 2 std dev: 0.028
p.value: 1.75847735938658e-08
W: 383.0
The difference (-97.6%) IS statistically significant.

Here, the database method crushes in-Ruby cascading, having a microscopic 2.4% of the time cost of the Ruby method.

Test 4: Deletion hook - update other table

In this test, I compare the in-Ruby and in-database means of updating a table after deleting a record from another one.

# Ruby
class Post1 < Sequel::Model( :posts1 )
  after_destroy do
    author.update_post_count
  end
end
 
class Author1 < Sequel::Model( :authors1 )
  def update_post_count
    post_count = posts.size
    save
  end
end
-- SQL
CREATE RULE update_post_count AS
ON DELETE TO posts2
DO ALSO (
    UPDATE
        authors2
    SET
        post_count = (
            SELECT COUNT(*)
            FROM posts2
            WHERE author_id = OLD.author_id
        ) - 1
    WHERE
        id = OLD.author_id
    ;
);

Results

Set 1 mean: 1.801 s
Set 1 std dev: 0.230
Set 2 mean: 1.103 s
Set 2 std dev: 0.234
p.value: 2.90177782076994e-11
W: 399.0
The difference (-38.7%) IS statistically significant.

Again, the database version wins out. This time, the improvement is over 1/3 the time cost of the Ruby version.

Conclusion

The in-database method beat the in-Ruby method significantly in 3 of the 4 tests I ran. In the remaining test, the results were inconclusive (being statistically insignificant), so we are unable to reject the null hypothesis that there is no difference between the two methods in that test.

If you are after a more natural syntax, and want to stay in your programming language as much as possible, using your ORM for hooks, validations and constraints is the way to go.

But if speed is a priority, sharpen your SQL Fu so you can open up your options and harness the full potential of your RDBMS. As these test results demonstrate, most hooks and checks are better off done in your database, and not with your ORM.


Footnotes

Substantive significance is the importance, meaningfulness, or magnitude of an effect or difference. For example, a speed improvement of 100% is substantively significant, whereas a speed improvement of 0.1% is not.
Share This

2 Comments »

  1. Comment by Pistos — March 6, 2008 @ 13:25

    See also this post by Ara Howard about how ActiveRecord validations (and those of most ORMs too, probably) are fatally flawed.

  2. Comment by Pistos — June 5, 2008 @ 07:49

    I’ve more formally created/released better-benchmark here.

RSS feed for comments on this post. TrackBack URI

Leave a comment

You may use Markdown syntax in your comment.

Powered by WP Hashcash

Powered by WordPress.
Close
E-mail It
Socialized through Gregarious 42