Thu Nov 24 2016
At Made.com we're unabashed fans of the ELK stack, and I spend a decent amount of my time thinking about how we parse, ship, and store logs.
Currently, we use an ELK stack setup that looks like this:
Rsyslog receives logs from our Docker containers via the syslog Docker logging driver, and from the rest of the system via the journal. We tag the logs at this point (as application, http, or system logs) and normalise them all to the same JSON format.
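As a rough sketch, the receiving side of that rsyslog config looks something like this. The port, the tagging rule, and the field names are all illustrative, not our production values:

```
# Inputs: the journal for system logs, TCP syslog for the
# Docker syslog logging driver (the port is an assumption).
module(load="imjournal")
module(load="imtcp")
input(type="imtcp" port="10514")

# Tag logs by source; the programname test is illustrative.
if $programname startswith "docker" then
    set $!logtype = "application";
else
    set $!logtype = "system";

# Normalise everything to one JSON shape.
template(name="json_lines" type="list") {
    constant(value="{\"timestamp\":\"")
    property(name="timereported" dateFormat="rfc3339")
    constant(value="\",\"host\":\"")
    property(name="hostname" format="json")
    constant(value="\",\"tag\":\"")
    property(name="syslogtag" format="json")
    constant(value="\",\"message\":\"")
    property(name="msg" format="json")
    constant(value="\"}")
}
```

The `format="json"` option on each property handles escaping, so quotes and newlines in the message can't break the JSON document.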
We ship those JSON logs to an Elasticache Redis instance and consume them in Logstash. Finally, Logstash routes the logs to the correct Elasticsearch index based on their tags.
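On the consuming side, a minimal Logstash pipeline along these lines does the routing. The hosts, the Redis list key, and the index names are placeholders, and `logtype` stands in for whatever field carries the tag:

```
input {
  redis {
    host      => "our-elasticache-endpoint"   # placeholder
    data_type => "list"
    key       => "rsyslog"                    # assumed list key
    codec     => "json"
  }
}

output {
  # One index per tag; the index names are illustrative.
  if [logtype] == "application" {
    elasticsearch { hosts => ["es:9200"] index => "app-%{+YYYY.MM.dd}" }
  } else if [logtype] == "http" {
    elasticsearch { hosts => ["es:9200"] index => "http-%{+YYYY.MM.dd}" }
  } else {
    elasticsearch { hosts => ["es:9200"] index => "system-%{+YYYY.MM.dd}" }
  }
}
```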
This is more or less a best-practice setup for ELK, but Logstash is, honestly, my least favourite part of the stack. Recently I've had some conversations on the Rsyslog mailing list about whether we can replace Logstash entirely and just use Rsyslog.
Why might we want to do that?
To test whether that's feasible, I want to build a couple of prototype architectures. Firstly, I want to try our basic setup, but with Logstash replaced directly by Rsyslog.
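The piece that lets rsyslog stand in for Logstash here is the omelasticsearch output module. A sketch of what that might look like, assuming a template named json_lines that renders each event as a JSON document (an assumption, as are the server and index names):

```
module(load="omelasticsearch")

# Build a dated index name, Logstash-style.
template(name="es_index" type="string"
         string="logs-%$year%.%$month%.%$day%")

action(type="omelasticsearch"
       server="es-host"                  # placeholder
       serverport="9200"
       template="json_lines"             # assumed JSON template
       searchIndex="es_index"
       dynSearchIndex="on"               # treat searchIndex as a template
       bulkmode="on"                     # use the ES bulk API
       action.resumeretrycount="-1")     # retry indefinitely on failure
```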
Secondly, I'd like to try skipping Redis altogether and shipping directly to a central rsyslog server using RELP, the Reliable Event Logging Protocol. I suspect this will give better throughput and lower running costs, without any loss of reliability.
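The RELP leg of that would be omrelp on the shipping nodes and imrelp on the central server. A sketch, with placeholder host and port:

```
# On each shipping node: forward over RELP instead of
# pushing to Redis. RELP acknowledges each message, so
# nothing is lost if the central server goes away briefly.
module(load="omrelp")
action(type="omrelp" target="central-rsyslog" port="2514")

# On the central server: accept RELP connections.
module(load="imrelp")
input(type="imrelp" port="2514")
```

Pairing the omrelp action with a disk-assisted queue on the clients would cover longer central-server outages too.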
For all of this to really work well, I'm going to have to roll up my sleeves and write some code.
I'm going to open source all the prototypes, including Ansible playbooks and Dockerfiles, so that you can play along at home or in the cloud. If you've got any suggestions for other reference architectures or use cases you'd like to see from the REK stack, get in touch on the rsyslog mailing list [http://lists.adiscon.net/mailman/listinfo/rsyslog].