Part of the service we’re building is a socket server which uses Flash’s XMLSocket API to push updates to clients. Initially we developed this using the excellent Twisted library in Python, but as it grew, having to duplicate some of our data model code in another language started to hurt, and it made sense for us to port it to Ruby.
Luckily by that point, the EventMachine library had sprung up, offering something very similar to Twisted for Ruby, and we’ve been using that since.
While it’s well known that Ruby’s threading is non-native and not particularly speedy, event-based libraries don’t actually require much use of threading - you are encouraged to structure your code as small methods which are called asynchronously and return quickly, yielding back to the event loop. For those with client-side experience, this is quite comparable with JavaScript runtimes, where there is no threading but a core event loop, the ability to register event handlers and call setTimeout, and asynchronous APIs for longer-running IO (AJAX anyone?).
For this to work well, it is essential that your event handlers do their business as quickly as possible, and yield back to the event loop - as everything else in the event queue is sat there waiting for you to finish. This is all very well, until you need to deal with IO - other things (pesky database servers and clients) have a nasty habit of taking a while to get back to you, and if the API you’re calling to communicate with them blocks you, then it’s blocking everything else in the event queue too.
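To make the problem concrete, here is a minimal sketch of a single-threaded event loop in plain Ruby. It is not how EventMachine is implemented; ToyLoop and demo_blocked_loop are names I have made up for illustration, and sleep stands in for a blocking database call.

```ruby
# A toy single-threaded event loop: handlers run one at a time, in order.
# (Hypothetical illustration, not EventMachine's actual implementation.)
class ToyLoop
  def initialize
    @queue = []
  end

  # Register a handler to run on a later turn of the loop.
  def schedule(&handler)
    @queue << handler
  end

  # Run queued handlers until none are left.
  def run
    @queue.shift.call until @queue.empty?
  end
end

# Returns how long a quick handler had to wait behind a blocking one.
def demo_blocked_loop(block_for = 0.2)
  event_loop = ToyLoop.new
  started = Time.now
  waited = nil
  event_loop.schedule { sleep(block_for) }             # stand-in for a blocking DB call
  event_loop.schedule { waited = Time.now - started }  # quick handler stuck behind it
  event_loop.run
  waited
end
```

Calling demo_blocked_loop(0.2) should report that the quick handler waited roughly the full 0.2 seconds, even though it had nothing to do with the slow call.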
One way to get around this (despite the concurrency paradigm being based around an event loop rather than pre-emptive threading) is to have some spare threads lying around to take care of blocking API calls, and fire off an event to the core event loop thread when they’re done. This is a way of turning a blocking API call into a non-blocking, asynchronous one. While Ruby’s threads aren’t native or very performant, this shouldn’t matter too much in this case, as the threads aren’t really being used to do very much - just to sit around waiting for IO.
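EventMachine provides EventMachine.defer for this kind of thing. The sketch below shows the same idea using only Thread and Queue from the standard library, so it runs without any gems. DeferringLoop, defer and demo_defer are hypothetical names of my own, sleep again stands in for blocking IO, and there is no error handling: a job that raises would kill its worker and leave the loop waiting.

```ruby
require "thread" # Queue; already loaded on modern Rubies

# Hypothetical sketch: a loop thread plus a small pool of worker threads.
class DeferringLoop
  def initialize(pool_size = 2)
    @jobs = Queue.new         # blocking work waiting for a worker thread
    @completions = Queue.new  # finished results waiting for the loop thread
    @pending = 0              # only ever touched from the loop thread
    @workers = Array.new(pool_size) do
      Thread.new do
        while (job = @jobs.pop)
          work, callback = job
          # Do the blocking call here, then hand the result back to the loop.
          @completions << [callback, work.call]
        end
      end
    end
  end

  # Queue blocking work; the callback later runs on the loop thread.
  def defer(work, &callback)
    @pending += 1
    @jobs << [work, callback]
  end

  # The "event loop": run callbacks as results arrive, until nothing is pending.
  def run
    while @pending > 0
      callback, result = @completions.pop
      @pending -= 1
      callback.call(result)
    end
  end
end

# Two 0.3 second "queries" should overlap rather than run back to back.
def demo_defer
  event_loop = DeferringLoop.new(2)
  results = []
  started = Time.now
  event_loop.defer(-> { sleep 0.3; :first })  { |r| results << r }
  event_loop.defer(-> { sleep 0.3; :second }) { |r| results << r }
  event_loop.run
  [results.sort, Time.now - started]
end
```

The callbacks only ever run on the loop thread, so handler code stays free of locking; the workers do nothing but wait on the blocking call.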
While this doesn’t require an asynchronous API at the Ruby level, it does at least require that the API calls only block the current Ruby thread, and don’t require an interpreter-wide lock in order to go about their business.
Unfortunately, it seems that many (most?) C-based Ruby libraries, including MySQL/Ruby (rather crucial to many), don’t bother to give up Ruby’s Global Interpreter Lock while blocked on IO. So you can have as many threads as you like, but only one MySQL query will ever happen at a time. If you don’t believe me, try firing off a Thread.new { connection.execute("SELECT SLEEP(10)") } and then see if you have any joy querying MySQL in the next 10 seconds. Even with a connection pool, you’re shit outta luck.
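If you want to measure this rather than eyeball it, something like the following rough sketch would do. time_concurrent is a made-up helper, and sleep is used as a stand-in that does let other threads run; to test a real driver you would swap in a slow query, using a separate connection per thread.

```ruby
# Run the given blocking call in `count` threads at once and time the lot.
# (Hypothetical helper for illustration.)
def time_concurrent(count, &blocking_call)
  started = Time.now
  Array.new(count) { Thread.new(&blocking_call) }.each(&:join)
  Time.now - started
end

# Kernel#sleep only blocks its own thread, so two 0.3 second sleeps
# finish in roughly 0.3 seconds in total rather than 0.6.
elapsed = time_concurrent(2) { sleep 0.3 }
puts "two concurrent sleeps took #{elapsed.round(2)}s"
```

A library that blocks the whole interpreter would instead show a total close to the sum of the individual calls.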
This kind of thing rather removes the whole point and usefulness of event-loop based libraries like EventMachine when used with MySQL, and makes ActiveRecord’s specially-thread-safe “allow_concurrency” option considerably less useful when used with the MySQL adapter - if all the MySQL query grunt work ends up serialized anyway, why bother using threads?
So, there’s a real need for non-blocking APIs, and for Ruby library writers to get serious about this if they want their libraries to be usable in these scenarios. I suspect that, because Rails dodges the whole issue by using separate processes for all its concurrency, this hasn’t been a priority. But really, Ruby needs to break away from Rails, and get serious about scalability, threading, performance and all that. Zed Shaw, who wrote Mongrel and knows a lot more than me about this stuff, has been battling for a while with the Ruby community’s attitude towards some of these issues.
What I’m hoping is that, now, with the advent of alternatives like Merb (and, before too long I hope, our own Brix framework) which make better use of threading, people will start to work on these issues.
Really though, this is just a long-winded whinge, and I could have just said “Someone extend MySQL/Ruby to do non-blocking IO, like the Ruby Postgres library already can!”. I would offer to do this myself, but my C-fu is sufficiently rusty that it’s probably not worth it yet. So, a call to arms instead.