logstash + elasticsearch are pretty amazing. however, if you are generating a high rate of log entries you may want to consider using mozilla hekad instead (http://hekad.readthedocs.org/en/latest/). on our servers logstash was running around 20% CPU during quiet periods while hekad was running around 1-2% CPU. during busy periods i think logstash was going up to 100% CPU while hekad was sitting around 20-30% CPU.
hekad is written in go which compiles down to native code while logstash is written in jruby which is not the most performant runtime.
I don't know what bottlenecks you hit when you were observing high resource usage in logstash, but, in general, if there's a performance problem, it is a bug, and we can fix it.
The next release of logstash (1.2.0 is in beta) has a 3.5x improvement in event throughput. For numbers: on my workstation at home (6 vcpu on virtualbox, host OS windows, 8gb ram, host cpu is FX-8150) - with logstash 1.1.13, I can process roughly 31,000 events/sec parsing apache logs. With logstash 1.2.0.beta1, I can process 102,000 events/sec.
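For anyone who wants to put the two quoted rates side by side, here is a quick sketch of the arithmetic. It uses only the 31,000 and 102,000 events/sec figures from this comment; the function names are made up for illustration, and the result only describes this one apache-parsing workload on this one machine.

```python
def speedup(old_rate, new_rate):
    """Ratio of two throughput figures given in events per second."""
    return new_rate / old_rate


def micros_per_event(rate):
    """Average time spent per event, in microseconds, at a given rate."""
    return 1_000_000 / rate


if __name__ == "__main__":
    old_rate, new_rate = 31_000, 102_000  # events/sec quoted above
    print(f"speedup: {speedup(old_rate, new_rate):.2f}x")
    print(f"1.1.13:      {micros_per_event(old_rate):.1f} us/event")
    print(f"1.2.0.beta1: {micros_per_event(new_rate):.1f} us/event")
```

On these two numbers the ratio comes out a little under the 3.5x headline figure, at roughly 3.3x, which fits the point that throughput depends heavily on the workload being measured.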
Processing speed will vary greatly by what you are doing with your events and it doesn't make sense to generalize performance characteristics globally, especially with a metric that, alone, doesn't really tell me much (cpu utilization).
If it's slow, it's a bug. We can fix it. :)
Further, you can use hekad with logstash and with elasticsearch (one or both together, it doesn't matter).
In terms of problems solved, logstash helps solve transport and real-time processing problems. In cases where the logstash agent is too resource intensive, the logstash community offers many alternatives on this site: http://cookbook.logstash.net/recipes/log-shippers/
The community (myself included) is very interested in helping logstash be a success for its users, so if you do see performance problems, things that behave weirdly, or anything strange, it's probably a bug, and we can fix it.
that's good to hear. it could also have been a plugin we were using that slowed things down. our log files are in csv so i wrote a plugin that uses ruby csv to parse lines and split them into key-value pairs based on a String->List[String] hash we have. so it might be that the go csv parser has much better performance than the ruby csv parser.
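To make the csv-to-key-value step concrete, here is a rough sketch of the idea in Python rather than the ruby plugin itself, which isn't shown in this thread. The mapping contents, the field names, and the assumption that the first csv column selects which list of column names applies are all hypothetical.

```python
import csv

# Hypothetical String -> List[String] mapping: record type -> column names.
# The real mapping is not given in the thread.
FIELDS_BY_TYPE = {
    "request": ["timestamp", "path", "status", "duration_ms"],
    "login": ["timestamp", "user", "result"],
}


def parse_line(line, fields_by_type=FIELDS_BY_TYPE):
    """Turn one csv log line into a dict of key-value pairs.

    Assumes the first column names the record type. Returns None for
    blank lines and for record types missing from the mapping.
    """
    # csv.reader accepts any iterable of strings, so wrap the single line.
    row = next(csv.reader([line]), None)
    if not row:
        return None
    names = fields_by_type.get(row[0])
    if names is None:
        return None
    # zip stops at the shorter sequence, so extra columns are dropped.
    event = dict(zip(names, row[1:]))
    event["type"] = row[0]
    return event


if __name__ == "__main__":
    print(parse_line('request,2013-08-20T10:00:00,"/a,b",200,12'))
```

If a plugin like this is the suspected hot spot, timing a function of this shape on a sample log file on its own is a cheap way to tell csv parsing cost apart from the rest of the pipeline.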
Another possible log shipper is nxlog. It compiles to native code and does not have any noticeable impact in terms of CPU or memory usage on my various low-end servers.
I used to use nxlog to collect Windows AD logs to a Linux server. It would often deadlock on the Windows side (2-3 times a week) and stop shipping the logs. It was very useful to have the logs, but I'm very glad we were able to replace it.
On my servers I use the open source version of nxlog to collect various logs and forward them to a central nxlog server, which in turn feeds logstash. Behind logstash I have configured elasticsearch as storage and I use kibana as a GUI to search and browse.
We've had no issues with rsyslog, which already comes packaged in ubuntu. Runs with no issues on micro instances on the AWS cloud, even at heavy workloads.
on one of our machines that generates logs we do around 2400 entries/s when the application is under heavy use. we have 9 machines that generate logs but they all generate different amounts. we mostly are using heka for generating stats from log files because we are too lazy to instrument the code and we have excellently detailed and formatted logs :) but we do have some logstash machines still pushing stuff to elasticsearch for low traffic applications we run.
we found that when using logstash even just for pushing stats to statsd it was not performing well enough. i've experimented with hekad pushing to elasticsearch on our staging cluster and it performed well enough, but we had weird problems showing up in nagios when we were using logstash+elasticsearch in production (checks were timing out even though we were seeing no degradation of performance on the servers). because of this it is quite difficult to get any kind of central log pushing into production. :(
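For context on what "pushing stats to statsd" involves on the wire, here is a minimal sketch of sending a counter using the plain-text statsd format (`name:value|c`, optionally followed by `|@rate`) over UDP. The metric name, host, and helper names are assumptions for illustration; 8125 is the usual statsd default port, but your setup may differ.

```python
import socket


def format_counter(name, value=1, sample_rate=1.0):
    """Build a statsd counter datagram, e.g. b"app.requests:1|c"."""
    payload = f"{name}:{value}|c"
    if sample_rate < 1.0:
        # A sample rate tells statsd to scale the count back up.
        payload += f"|@{sample_rate}"
    return payload.encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Fire-and-forget one counter over UDP (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name, value), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name; one datagram per matching log line.
    send_counter("app.requests")
```

At a few thousand log lines per second, sending one datagram per line adds up, so aggregating counts in the shipper for a second or so and sending one increment per metric is a common way to cut that overhead.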