One of my favorite tricks is to generate the JSON for your API responses in Postgres instead of in Ruby.
You can get 100x speedups, but the downside is that you wind up with big nasty SQL queries that duplicate Ruby logic and are hard to maintain. There was a nice gem that would automatically produce JSON-generating SQL [0], but it is abandoned now. It only supports Rails 4 and ActiveModelSerializers 0.8, which are both quite old. I just published a similar gem [1] that works for Rails 5 and AMS 0.10. Unlike the old gem, mine outputs JSON:API (common in Ember projects). I hope to add the traditional JSON format too. AMS 0.10 makes this easier, since it introduced the concept of "adapters". My gem is super new and doesn't yet cover a lot of serializer functionality, but I'm hopeful about supporting more & more. Feedback is welcome!
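To give a flavor of the technique (not the gem's actual output): the idea is to have Postgres assemble the whole JSON payload server-side with json_agg/json_build_object, so Ruby ships a single string back instead of instantiating models. A sketch that just builds such a query string, with made-up table and column names:

```ruby
# Hypothetical helper: build SQL that asks Postgres to render each row as a
# JSON object and aggregate the rows into one JSON array. COALESCE makes an
# empty result come back as '[]' rather than NULL.
def posts_as_json_sql(limit: 100)
  <<~SQL
    SELECT COALESCE(json_agg(json_build_object(
             'id',    p.id,
             'title', p.title,
             'body',  p.body
           )), '[]'::json) AS payload
    FROM (SELECT id, title, body FROM posts ORDER BY id LIMIT #{Integer(limit)}) AS p
  SQL
end
```

In a Rails app you might run this with something like `ActiveRecord::Base.connection.select_value(posts_as_json_sql)` and return the string directly as the response body, skipping serialization entirely.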
On the writing side, I've found activerecord-import [2] to be very useful. It batches up INSERTs and UPDATEs and for Postgres it even knows how to use INSERT ON CONFLICT DO UPDATE.
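For a sense of what that batching means, here is a hand-rolled sketch of the kind of multi-row upsert statement activerecord-import can emit for Postgres: one INSERT ... ON CONFLICT DO UPDATE instead of N separate statements. Table and column names are made up, and the naive quoting is for illustration only; real code should let the gem or the driver do the escaping.

```ruby
# Build a single multi-row upsert statement. Assumes the first column is the
# conflict target and all other columns get overwritten from EXCLUDED.
def bulk_upsert_sql(table, columns, rows)
  values = rows.map do |row|
    # Naive single-quote escaping, for illustration only.
    "(" + row.map { |v| "'" + v.to_s.gsub("'", "''") + "'" }.join(", ") + ")"
  end.join(", ")
  <<~SQL
    INSERT INTO #{table} (#{columns.join(', ')})
    VALUES #{values}
    ON CONFLICT (#{columns.first}) DO UPDATE
    SET #{columns.drop(1).map { |c| "#{c} = EXCLUDED.#{c}" }.join(', ')}
  SQL
end
```

With activerecord-import itself this is roughly `Model.import(columns, rows, on_duplicate_key_update: ...)`; the point is that the round-trips collapse to one.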
I also have a handful of Postgres extensions that use arrays to do fast math/stats calculations. [3, 4, 5] If you are thinking about moving something into C, it's natural to consider a gem with native extensions, but perhaps a Postgres C extension will be even better.
One way to mitigate spreading logic between SQL and Ruby is by using SQL views with something like Scenic [0] which can back an ActiveRecord model just like an ordinary table. This is especially nice for complex reporting because it centralizes your query logic in a single view as a sort of virtual table that can be queried from both ActiveRecord and normal Postgres queries.
Fwiw I never really got this approach to caching b/c the keys are always based on calling the db, and calling the db is usually what you're specifically trying to avoid with caching, not just the transform of db content to json/html, which should be cheap?
The active record queries like User.all are lazily loaded, so it’s totally possible to pass them around like keys without them being executed.
That said, I wouldn’t use a query as a cache key since there may be more than one thing I want to cache about that query (GP’s example is their as_json representation, but what if I wanted something different to be computed? Like maybe html snippets? I would expect the cache key to mention everything about the thing it’s caching.)
> what if I wanted something different to be computed? Like maybe html snippets? I would expect the cache key to mention everything about the thing it’s caching.
This is exactly where you'd use something like ActiveSupport::Cache.expand_cache_key(["blog posts", blog.posts]). You can add bits of info to your cache keys to distinguish them from one another.
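To make that concrete, here's a toy stand-in for the key expansion, in pure Ruby; this is not the actual ActiveSupport implementation, just the shape of it:

```ruby
# Toy version of ActiveSupport::Cache.expand_cache_key: flatten the pieces
# and join them with "/". The real Rails version also calls #cache_key on
# objects that respond to it (which is where the db hit comes from); here
# we just use to_s.
def expand_key(parts)
  Array(parts).flatten.map(&:to_s).join("/")
end

expand_key(["blog posts", "v2", 42])  # => "blog posts/v2/42"
expand_key(["html", ["post", 42]])    # => "html/post/42"
```

Prefixes like "blog posts" vs "html" are exactly the "bits of info" that keep two cached representations of the same query from colliding.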
It's probably also worth mentioning that the `cache` view helper does parts of it for you already, and takes into account the digest of the view code itself among other things. Rails provides a really neat abstraction here that does make caching easier.
It has some rough edges, and some gotchas, but overall it's pretty smart.
Having used and written about this subject extensively: you are right, but in practice these approaches mean something along the lines of making a single primary-key lookup (roughly 0.1 to maybe 3 ms in most cases) to save many tens or hundreds of queries (in addition to the cost of rendering/templating) to build the nested content, so it is still extremely practical.
To add to the excellent answer from SiliconAlley, it's definitely something to balance. What you cache needs to take a non-trivial amount of time to generate to be worth caching. Typically those DB queries for the cache key itself take single-digit milliseconds, while the stuff you want to cache takes hundreds of milliseconds.
Don't forget that even fetching the cache data itself takes time, even with fast cache storage like redis, especially over the network (where some cache layers live when they are shared).
Ruby on Rails is an amazing piece of software. I love that solving problems in Ruby is so intuitive and expressive: you get to focus on building the solution rather than constantly fighting the language.
I’m forced to learn Rails at a new job, and I really dislike it. There’s so much magic, and things are built in an obscure or extremely inefficient way to work around (or “take advantage” of) the framework’s limitations and features.
There is so much code that is not part of the codebase driving the behavior of the system — and that third party code is very tightly coupled to the actual behavior that will be observed by running the code. Figuring how anything actually works is made massively harder than required and developers are encouraged towards designs that will require ridiculously inefficient database interactions.
Rails is terrible I think for experienced developers because there are no mechanisms in the codebase — instead there are layers and layers of “conventions” which often only exist to try to avoid bugs that would’ve been much better to catch or prevent with a combination of a type system, high level tests, and less mutation.
This is a decent criticism of Rails, but there are things you can do to make it smoother: try to avoid doing unusual things.
A big point of using a framework is to leverage an ecosystem of other people's libraries (and stackoverflow questions!), and they will typically only work well if your code is relatively "normal".
If your code looks like "beginner" code in the framework, you're doing it right.
As a point of comparison, I switched to Django for a few years after a few years of Rails. Django was a lot less magical and was more debuggable. But Rails' magic actually encourages a better project structure. Django doesn't care about your structure.
I also found the thoroughness and completeness of the ecosystem surrounding Rails to be stronger. That is the real selling point to me. For most things you want to do in Rails, you will find a library that integrates with the project and DB structure you already have.
Disclaimer: I haven't used Rails since around 3, but my assumption is the DX hasn't changed too much. Plus if Rails N was marketed at Rails "N-1" devs, then new devs coming in late have a lot of learning to do.
> avoid doing unusual things
If by "unusual" you mean "not the Rails way", then this makes sense. But the problem is, the Rails way doesn't always line up with the way an experienced programmer might want to do something.
In the specific example of fetching data from a DB - an experienced programmer might intuitively avoid multiple round-trips to the DB, and want to find the appropriate hook point(s) in a request handler to optimise this. Making 20 requests for 20 models is "unusual".
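The cost of those "20 requests for 20 models" is easy to see with a toy counter. Everything below is made up for illustration; in Rails the batched path is what `includes`/`preload` arranges for you:

```ruby
# A fake repository that counts round-trips. Fetching 20 records one at a
# time (N+1 style) costs 20 "queries"; fetching them in one batch costs 1.
class FakeDB
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  def find(id)
    @queries += 1
    @rows[id]
  end

  def find_all(ids)
    @queries += 1
    ids.map { |id| @rows[id] }
  end
end

rows = (1..20).to_h { |i| [i, "record #{i}"] }

one_by_one = FakeDB.new(rows)
(1..20).each { |i| one_by_one.find(i) }  # 20 round-trips

batched = FakeDB.new(rows)
batched.find_all((1..20).to_a)           # 1 round-trip
```

Each round-trip adds network latency on top of query time, which is why the N+1 pattern hurts even when every individual query is fast.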
Rails doesn't make things like this easy. It's quite opinionated on how you should work. The core assumption in play here is that databases will either be fast, or can be made faster, essentially externalising the problem. Or if you want to address this - find a Gem to do it, or dive into the layered guts of it to figure it out.
Related - if I recall correctly, foreign keys were not deemed an important enough feature for inclusion in Active Record, and the Rails way is to embed the logic for joins into your models rather than the database. I'm sure many of us here could name several scenarios where this is just plain wrong.
Activerecord is indeed a major trade-off with rails. For heterogeneous projects, it can be frustrating that rails tends to favour validations, constraints and foreign key relationships in the ruby model files.
Eg. You have a dot.net service that needs to write some data to the database, and doing it via REST endpoints on your rails app is really slow compared to accessing the database directly. But now you don't have any constraints on your relationships, and it's easy to shoot yourself in the foot. Extra sad, since you do have some constraints (belongs_to etc) - they're just not communicated to your database.
I have some legacy rails projects I maintain, and my job would've been easier if previous developers had followed the rails way more closely. Whenever I manage to refactor my way back, the code becomes shorter, clearer, more efficient, and less error-prone.
Things like only half of a relationship being defined, and joins being hand-rolled in one direction is a "favorite"...
A long time ago I read "The Complete Guide to Rails Performance" by Nate Berkopec. It's not free, but it taught me a lot about the ways to achieve very good performance with Ruby and Rails.
Does anybody have any good resources on how to achieve better performance with ruby and something like async, concurrent ruby, threadpools and forking?
Depends on what you're looking for and how much time you want to put into it. What Rails does, nothing else even holds a candle to. But it's hard-won mastery.
Rails, when learned properly, enables complete and total dominion over the entire sphere of web development. Nothing else compares, nothing else even comes close.
Anything you'd ever want to do with the web, save the kind of scaling that led Twitter to replatform 5 years in, can be done 10x faster and 10x more reliably with Rails. The main bottleneck is understanding. There's a zen to Rails that you have to appreciate before you can unlock its potential.
It greatly saddens me that Ruby and Rails has been falling out of favor.
What are modern practices for scaling rails apps when the database is the bottleneck? I have a GIS-heavy app built with rails that leans on PostGIS a lot for the heavy computation. Simply getting bigger and bigger database instances doesn't seem to scale very well, and moving the GIS computation into ruby is not an option either (RGeo is good for some things, but can't replace PostGIS for most of my needs)
Postgres 10 introduced native table partitioning and it became much easier to use in postgres 11. This will let you divide large tables into smaller ones that can be queried more efficiently
Rails 6 introduced multiple databases natively in activerecord. You can more easily have read only databases or segment off high write workloads.
Foreign tables are a way to allow bringing in data from multiple databases into one so it “looks” like everything is in a single db. This got better/faster, particularly with joins in postgres 12.
Also consider things like materialized views or even plain views which can help bring together just the right data you need ahead of time.
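For reference, the partitioning and materialized-view ideas above look roughly like this in SQL. Table and column names are hypothetical; the partitioning syntax is Postgres 10+:

```sql
-- Declarative range partitioning: queries that filter on taken_at
-- only touch the relevant partitions.
CREATE TABLE measurements (
  sensor_id int NOT NULL,
  taken_at  timestamptz NOT NULL,
  reading   double precision
) PARTITION BY RANGE (taken_at);

CREATE TABLE measurements_2020_q1 PARTITION OF measurements
  FOR VALUES FROM ('2020-01-01') TO ('2020-04-01');

-- A materialized view precomputing an expensive rollup ahead of time.
CREATE MATERIALIZED VIEW daily_readings AS
  SELECT sensor_id,
         date_trunc('day', taken_at) AS day,
         avg(reading) AS avg_reading
  FROM measurements
  GROUP BY 1, 2;

-- Refresh on your own schedule, e.g. from a background job.
REFRESH MATERIALIZED VIEW daily_readings;
```

A materialized view like this can back an ActiveRecord model the same way a table does, as long as you handle the refresh cadence yourself.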
Thanks! Partitioned and foreign tables look quite interesting -- I'm at the point where I need to scale the database compute load (with write access) across many nodes, but coordinating those nodes has seemed challenging with rails.
And yeah I can probably lean on views a bit more to save some search time for point/line-in-polygon operations.
Yes -- it's a good talk, but it focuses mostly on tricks you can do (spatial indexing, geometry subdividing) on the PostGIS side to make lookups/calculations faster, but discussion of actually scaling to multiple databases is a bit light. There is mention of Citus (which Microsoft has purchased and rebranded as Hyperscale), but it has a minimum cost that is more than I can afford right now.
I didn't know if there were other architectures to consider for rails -- like having a few read replicas for web traffic, and then a single large writable master for sidekiq/background jobs, for instance.
If you're at the limits of "scaling up" in PostGIS you could look into "scaling out" by migrating to a NoSQL solution like Cassandra or HBase + GeoMesa[1]. However, this would be a pretty significant change and introduces a lot of new operational burden, so I wouldn't do it unless you think you can't meet your goals with PostGIS.
I think a good first step would be bringing in a PostGIS consultant who can give you some expert advice on tuning for your current workload.
I appreciate it! PostGIS I think is still quite good; it's just computationally intense for some operations (which I suspect GeoMesa might run into as well -- I think a lot of GIS libraries use the GDAL and PROJ libraries under the hood)
I vaguely remember an article about a fatal flaw in Ruby on Rails that was said to be not fixable? Is that ancient history and has been overcome, or was it not even a thing and I read bogus information?
I have developed in Django and avoided RoR due to this possibility, but am always interested in learning more, provided the tools have a decent shelf life...
You might be thinking of the old mass assignment problem[0], a vulnerability that allowed updating model attributes from unchecked request parameters. That type of thing can be a concern in any web application, regardless of the framework. Modern Rails does a decent job of discouraging it out of the box.
Is ruby that slow? I always understood it was comparable to python?
Plus, if it is performant enough for Shopify and GitHub, it seems like it would be performant enough for 90%+ of use cases. The speed and flexibility of development are much greater than with most stacks once you add Rails.
Your mileage may vary but in my experience writing a decent amount of complex Ruby and Python code, Ruby is often about 20-50% faster than similar Python. I don’t really know why that is.
That said, the difference isn’t enough to matter. Both languages are in the same performance magnitude and thus suited for the same kind of tasks.
On my 2013 MacBook Pro, using Ruby 2.6.5 and Python 3.7.4, I wrote two parsers for a 19 MB Apache log file, logs1.txt. Code for each language:
Ruby
IO.foreach('logs1.txt') {|x| puts x if /\b\w{15}\b/.match? x }
Python
import re

with open('logs1.txt', 'r') as fh:
    for line in fh:
        if re.search(r'\b\w{15}\b', line): print(line, end='')
Ruby's best was 3.3 seconds whilst Python's was 3.9 seconds. If you factor out Ruby taking about 100 ms longer than Python to start up, Ruby is roughly 18% faster than Python for this kind of string parsing. Results probably differ for Python's numeric performance, but that depends on whether you're counting C-based third-party libs such as numpy.
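For anyone reproducing this kind of micro-benchmark, a sketch of a fairer harness in plain Ruby: synthetic data so there's no file I/O noise, and best-of-N timing to smooth out one-off GC pauses. Numbers will obviously vary by machine.

```ruby
require "benchmark"

# Synthetic "log lines" standing in for the Apache log.
lines = Array.new(10_000) { |i| "192.0.2.#{i % 255} - - [req #{i}] GET /path#{i}" }
pattern = /\b\w{15}\b/

# Warm up once so lazy initialization doesn't count against the first run.
lines.each { |l| pattern.match?(l) }

# Take the best of three timed runs.
best = 3.times.map {
  Benchmark.realtime { lines.each { |l| pattern.match?(l) } }
}.min
```

`Regexp#match?` (Ruby 2.4+) avoids allocating MatchData, which is part of why the Ruby version of this scan is quick.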
In fact, Ruby is not performant enough for Shopify, Github, and [<large_company>, ...].
It's very misleading for large companies to say "Ruby on Rails is performant enough", as they rely on other languages for their performance critical workflows.
I remember sometime ago on HN a GitLab developer repeating the same sentence. Actually, GitLab does use Golang for some microservices.
It's wonderful that a combination of Ruby and <performant_language> is easy to develop/maintain and fast at the same time, however, the distinction with a pure Ruby system is not a trivial semantic matter, as it perpetuates the wrong and misleading idea that pure Ruby is fast enough to develop a complex system.
I asked a couple of the Shopify engineers at a conference in late 2018: almost all of Shopify's web and background job processing is Ruby. They're quite open about it:
Hm, a lot of the article focuses on optimizing your interaction with a database (rdbms/sql), which has nothing to do with the performance of ruby (some of it arguably has to do with _Rails_).
Even more of the article has to do with Rails, and not ruby in general.
But the larger point: are you suggesting that in a language that is "performant enough", developers need no performance advice, and can write however they want and get adequate performance for any need?
Maybe, although I'm dubious. Until we have AI writing code, the existence of programming advice does not mean a language/platform isn't "performant enough"; and if such an advice-free language exists, it is not ruby. So I disagree with that suggestion: the existence of optimization advice does not mean that a language "is not performant enough". If Shopify is still happily using ruby/rails, I would say that demonstrates they are performant enough for them.
Nope, I'm not. But Shopify is large enough to have a first-class engineering team that would (presumably) already be writing efficient code, which is nothing special for a company that size. So if even this first-class engineering team is still writing about _fast code in Ruby_, doesn't that imply Ruby isn't fast?
Shopify's engineering team is obviously writing code congruent with the advice they give here, that's why they give the advice! I'm not following you at all.
I don’t know, I think most of the high level points made in the article (caching, know your SQL/ORM, memory management and algorithm complexity) apply to almost any web application and framework once you hit some level of scale.
This is an article about the Ruby on Rails framework not the ruby language. Most of the discussion is about SQL and query optimisation.
Also describing common performance pitfalls does not imply that something is or isn’t “performant enough”. Only that you can make mistakes that will affect performance.
That, I’m afraid, is true for every language or framework.
Speaking as someone who uses Rails regularly, I agree - it is considerably slower than many other language choices. It probably isn't a good call for a performance critical bottlenecked service that can't scale horizontally.
For everything else, it's great, and it's almost always a good default to choose speed of development and engineer productivity over runtime performance.
> For everything else, it's great, and it's almost always a good default to choose speed of development and engineer productivity over runtime performance.
Doesn't Elixir on Phoenix[1] provide the elegance of Ruby on Rails without its drawbacks?
> Doesn't Elixir on Phoenix[1] provide the elegance of Ruby on Rails without its drawbacks?
I'm not sure I would say elegance is the main feature of Rails, but I think productivity is. IMO nothing out there beats Rails in terms of productivity on a new project. Phoenix is clearly inspired by Rails, but if you'll permit me to pull a semi-random number out of my ass I don't think Phoenix will ever be able to do better than about 80% of Rails's productivity. Some things are just not as simple (for example, just due to the nature of things Ecto is not as simple to write with as ActiveRecord is).
That said, I've totally moved away from Ruby on Rails and over to Elixir/Phoenix for my stuff. I absolutely love Elixir. So far in my experience I'm finding it more performant, easier to maintain, but a little less productive to write in. But still plenty productive enough.
> I don't think Phoenix will ever be able to do better than about 80% of Rails's productivity.
I haven't used Elixir/Phoenix myself, but my main issue with Ruby on Rails was how much of long-term productivity was sacrificed for short-term productivity gains.
In other words, what made it extremely easy to create the first version of the application with Rails, also made it very hard to maintain and scale it later on.
Therefore, I would consider it a win, if 20% decrease in short-term productivity provided an equal increase in long-term productivity, which is much harder to measure.
I think you’ll have to spell out specifically what those things are that affect long-term productivity if you’re going to use them as an argument against Rails.
I’m not saying you’re wrong, just that a hand-wavy “Rails is bad in the long term” is not an argument.
> Phoenix will ever be able to do better than about 80% of Rails's productivity.
It took me 4 hours to learn LiveView, 3 to get it up and running with Phoenix PubSub (which I didn't know before either), and another 2 to get Presence in there. So I now have a live dashboard for the backend of our service (and it will probably be similar to the frontend). The rest of the week was developing a system so that I could run acceptance tests concurrently (you have to shard your Registries and your Channels and use a secret erlang feature to track who's what). Everything "just worked".
I think this 10-hour enterprise would have taken me at least a month to do with a standard RESTful interface and an off-the-shelf backend PubSub like RabbitMQ, and probably two months to do with a non-featureful backend websockets system (I find websockets confusing).
It's hard for me to believe that any other system is as productive.
I'm pretty much all in on Phoenix, but I suppose a good reason to stick with RoR would be that 1) it's easier to find (experienced) devs, and 2) there are (still?) many more packages available which can make quick development easier.
Our research has indicated that Elixir has basically stopped growing in the general market.... really only getting any growth from converted erlang devs.
Have you seen anything different? Most enthusiastic vendors I’ve talked to have moved away from elixir due to low adoption and demand.
Tiobe, google trends, etc tell the same story.
I’d love to find evidence to the contrary but the low numbers of elixir devs seems to be both a supply AND demand problem... ie lack of traction and interest.
I'm not on much of a 'market' and use Elixir/Phoenix for clients when I expect that either I will be primary on the codebase, or that I'll be training someone to take over.
In the latter case, I feel confident enough about Elixir that I'd train someone to use it rather than opting for a more 'mainstream' stack.
I suppose if getting new developers on-board asap is a concern, Elixir might not be the best choice. But personally if I ever were to build a product that would eventually need more programmers, I'd pick Elixir even just for the 'filter' effect. It's easy enough to pick up for a good developer, but would filter out many of the shittier devs I'd encounter if I went for PHP/Python/Ruby. and the benefits are worth it to me.
All that aside, while I am not confident that Elixir will gain much traction, I think there's a small chance LiveView and Telemetry, among other projects, might divert Elixir from the path that, say, Clojure, seems to have taken when it comes to traction.
Hex.pm (Elixir + Erlang) currently has 8900 libraries. Compared with Ruby's 155000 that's one big drawback of choosing Elixir. Add to this the mature Rails infrastructure and community resources there's definitely a trade-off.
While I agree that library count is an okay barometer of community health, I'd also look at the age of packages and latest commits to gauge a trajectory. Furthermore, if I found the packages I needed to accomplish most tasks (auth, task scheduling, message queuing, database, logging, serialization), and those packages were of good quality, does sheer number even matter at that point? I'd say when looking at Elixir's community health vs something like Nim or Crystal, it's in great shape. (And I'm not willing to call either of those languages "defunct".)
Performance matters rarely in my experience. There is a world of low load services out there, with servers sitting idle most of the time waiting for requests. And that's not the case of Shopify which is doing well with Ruby anyway.
Performance matters in web applications, but that is almost never determined by the speed of execution of the web framework programming language. The performance bottlenecks are almost always in the database and/or the architecture.
It's also very easy to scale horizontally the execution of the web code. That's why, for most web software, programmer productivity is more important than raw language performance
Horizontal scalability is not always easy, but it is always expensive. Startups especially tend to discover the latter the hard way when they receive surprise bills from AWS.
Horizontally scaling a Rails application where the bottleneck is the Ruby code is always very easy - just add more instances working against the same DB. When it's difficult, it's usually because the bottleneck is in the DB, not the Ruby code.
You're right about price though; if the cost of compute is relevant in your business model - which, again, usually isn't the case - then doing that computation with Ruby on AWS isn't the best choice.
> No way around the fact that it is very slow though.
You can run Sinatra on TruffleRuby today and it's very competitive with e.g. Go web frameworks. It's not quite production ready but has commercial support from Shopify (Chris Seaton) and Oracle Labs. It will get there soon.
There's also no reason I know of that you couldn't build a simple optimizing JIT for CRuby like LuaJIT. I'm trying a proof of concept at the moment.
[0] https://github.com/DavyJonesLocker/postgres_ext-serializers
[1] https://github.com/pjungwir/active_model_serializers_pg
[2] https://github.com/zdennis/activerecord-import
[3] https://github.com/pjungwir/aggs_for_arrays
[4] https://github.com/pjungwir/aggs_for_vecs
[5] https://github.com/pjungwir/floatvec