Posts Tagged ‘PostgreSQL’
My experience with Django and Rails
I’ve had the opportunity to work on both Django and Rails frameworks recently as part of one project. The core application for the church administrative tools that I am working on is written in Rails, while the church guest follow-up application that I am responsible for is written in Django. Why two separate stacks? GuestView began its life independently from Gospel Software, and I chose Django there because of my familiarity with Python. Three developers are sharing responsibility for the core of Gospel Software, however, and we chose Rails as the most reasonable lingua franca.
What follows are my personal opinions and observations. These are mostly aesthetic or other value judgments, and I offer them simply for your consideration.
Language
I’ve used the Python programming language for a number of years and like it a lot. I particularly enjoy its functional aspects, although lately I’ve become more of a fan of using list comprehensions and generator expressions wherever possible compared to map() and filter() with lambdas. Compared to Python, Ruby has much more powerful functional capabilities, although some things don’t feel natural to me (Ruby’s design choice to not require parentheses to denote function invocation means that you must use .call to call a lambda, which feels clunky). There are also some cases in Ruby where choosing one of several alternative forms of an expression can have a significant impact on your performance. Lambdas seem particularly costly in Ruby as of version 1.8.
Overall I think the languages are fairly on par. Right now I prefer Python for aesthetic rather than technical reasons. As I grow in familiarity with Ruby, and as it matures and its performance improves I think I may eventually grow to prefer it.
Object-Relational Mapping (ORM)
The Django ORM is very powerful and you can express complicated queries very efficiently using it. Django queries are not executed until they are actually used, so you can construct your queries piecemeal, which helps in writing readable code. Django also allows you some flexibility with adding custom SQL to your queries, but for anything complicated I’ve found that I have to break down and write my own SQL.
Rails 2.1 introduced the ActiveRecord named_scope functionality. Prior to this Rails was significantly lacking compared to Django’s expressive power for query construction, but named_scope pretty much evens the playing field. And for complicated queries, which you will surely face in any real-world project as you seek to tweak performance, ActiveRecord gives you a degree of control over your SQL that really puts Django to shame.
Both Django and Rails seem to have adequate support for PostgreSQL, my database of choice.
URLs
Django lets you express your URLs using regular expressions; Rails accomplishes this using routes. I personally prefer Django’s method, but both work well enough.
Templates
While Rails’ Embedded Ruby allows you to include arbitrary code in your templates, Django’s template engine is much more spartan. It provides ways of getting at variables passed to the template, including objects, dictionaries, lists, and even methods. And it has some simle control structures, but not covering the full expressive power of Python. I yearned for a more powerful template language in Django at first. But I found over time that the discipline of a simple template language was helpful to me, forcing me to move any complicated behaviors to the controller (or “view” as Django calls it) which was in most cases the right thing to do anyway.
There are still some areas where I think the Django template language is lacking. However, there is an open-source alternative to the Django template engine that is similar but sufficiently more powerful to meet my needs: Jinja2.
For me it is a toss-up between Embedded Ruby and Jinja2.
Performance
I suspect it’s common knowledge that Rails has a little ways to go in performance. For our own purposes, I didn’t find too much difference in time measurements between Django and Rails. However, Rails clearly has a much larger memory footprint than Django.
I was surprised to learn that even with a FastCGI or WSGI model, Django still opens and closes a database connection for each request. While there may be technical reasons that the Django architecture requires this, it was still a surprise to me. Django performance still seems on par with Rails in spite of this. Interestingly, having Django use pgpool to connect to PostgreSQL didn’t improve my performance at all, perhaps because my application and database are currently located on the same host.
Console
Both Django and Rails allow you to run a REPL session for your application. The Rails script/console command beats out Django hands-down, because Rails’ internal magic automatically imports pretty much everything you need. In Django you still need to import any models or framework modules before you can use them.
Debugging
The Rails built-in log is enabled out of the box and is very handy. Django provides logging functionality but you have to do a little extra work to enable it. Rails wins out on logging. Django is better at in-browser rendering of exception tracebacks. Overall the handiness its logging means a slight win for Rails here for me.
Admin Application
Django’s admin application is truly its crown jewel. If you need a private admin interface to your web application, Django will give you a very attractive and powerful interface almost entirely for free. I’m not aware of any equivalent for Rails that even comes close to this.
Deployment
I’ve deployed Django using FastCGI and Rails using Mongrel. Right now I am using Nginx to proxy to Mongrel, and to connect directly to the Django FastCGI instance. Neither Django nor Rails seems to have a unique advantage or disadvantage in deployment.
Summary
I’ve spent more cumulative time with Django than with Rails, so I feel subjectively more at home with Django. If I were going to write a small toy project, I’d choose Django mainly for ease and efficiency. In fact, I took this route for the meal and potluck scheduler application that I recently wrote for Google App Engine. GAE has many similarities with Django, and even allows you to run much of the Django stack on it.
However, for larger projects my current framework of choice is Rails. With the named_scope functionality in Rails 2.1, ActiveRecord is finally on par with Django’s ORM. And for any complicated queries ActiveRecord is superior to Django’s ORM. While Django’s admin application is handy, I don’t make much use of it. And while Rails falls slightly behind in performance and storage characteristics, I believe that Ruby and Rails will both continue to improve in this regard.
PostgreSQL foreign keys and indexes
If you’re a frequent user of MySQL, you may be familiar with the fact that all MySQL table constraints automatically create indexes for you. This is true of the InnoDB foreign key constraints, for which “an index is created on the referencing table automatically if it does not exist.”
If you’re switching or considering a switch to PostgreSQL, you should be aware that not all PostgreSQL table constraints will automatically create indexes for for you. In PostgreSQL, a UNIQUE or PRIMARY KEY constraint on one or more fields will implicitly create an index for you. However, in PostgreSQL a FOREIGN KEY constraint will not automatically create an index for you.
For each of your foreign key constraints, you should evaluate whether you want to create an index. You may want to do this for optimizing your own queries, but be aware that it can also help to speed up DELETE queries on the referenced table and UPDATE queries on the referenced field. This is because any foreign key reference must be located to enforce whatever ON DELETE and ON UPDATE behavior is in effect for the constraint.
A performance comparison of AF_UNIX with loopback on Linux
On various Linux hosts I use either MySQL or PostgreSQL for my back-end database. Typically the database is running on the local host, and this raises the question of how to connect to the database. For both MySQL and PostgreSQL it is possible to connect using a local AF_UNIX connection, or using an AF_INET connection over the loopback socket (address 127.0.0.1).
I had thought that it might actually be more efficient to connect over loopback; for my day job I work on the TCP/IP stack for the z/OS operating system, and I know firsthand that our TCP/IP implementation is heavily optimized, including specific optimizations for local traffic.
I decided to test this hypothesis by comparing both the throughput and latency of AF_UNIX and AF_INET connections. I was running on a Celeron Dual-core system running Debian 4.0 with kernel version 2.6.18-4-686. The results disproved my hypothesis, at least for Linux:
- AF_UNIX:
- 100GB transferred in 80.575200 s
- 100 million 8-byte messages:
- Average latency 14.463910 μs
- Standard deviation 0.003376 μs
- AF_INET, loopback address
- 100GB transferred in 226.717520 s
- 100 million 8-byte messages:
- Average latency 1133.444424 μs
- Standard deviation 0.067419 μs
So, for both throughput and latency, on the Linux platform AF_UNIX is a superior choice to AF_INET. For local MySQL and PostgreSQL connections you should use a local socket rather than the loopback socket.