Active vs Passive logging

https://rclayton.silvrback.com/container-services-logging-with-docker

Active logging

Active Logging is the practice of having services participate directly in the logging infrastructure. This may include writing logs to the filesystem, sending them over the network to an aggregator (e.g. Logstash, Fluentd), or pushing entries to a SaaS service.

There are three reasons to avoid this practice:

1. Active loggers can cause a service to fail

  • What happens when your aggregation endpoint goes down and the microservice can't send log entries to it?
  • Does the microservice buffer logs until it runs out of memory?
  • What if the logging client causes an uncaught runtime exception?

2. Active loggers tend to function differently in every environment

  • Most companies rarely achieve feature parity between development, testing, and production, particularly when it comes to logging.

3. Active loggers are sensitive to infrastructure changes

  • What happens when you want to change your log infrastructure?
  • To accommodate such a change, you will need to modify all of your microservices and redeploy the system.

Passive Logging is the recommended practice

  • Passive Logging writes log data to standard interfaces, typically stdout and stderr.
  • This strategy makes the deployment infrastructure responsible for routing that data to the filesystem, external aggregators, and data stores.
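
As a minimal sketch of the idea: a service simply prints to stdout, and the Docker daemon's default json-file driver captures the stream (the container and image names below are hypothetical).

# The service writes plain lines to stdout/stderr; the default
# json-file driver captures them automatically
docker run -d --name my-service my-image

# Retrieve the captured stream at any time
docker logs --follow my-service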

Advantages

  • Simple, portable, and environment agnostic.
  • This approach lets you change the underlying logging platform without having to recompile or redeploy services.

Implementing Passive Logging

  • A common deployment pattern is to run a local instance of Logstash (or Fluentd) on every ECS host in the infrastructure; it receives log entries from the local Docker daemon and forwards them to the rest of the log infrastructure, as sketched below.
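
As a sketch of that pattern, the Docker daemon can forward container output to the local Logstash instance via a logging driver such as gelf; the driver choice, address, and port below are assumptions rather than the article's prescription.

# Route a container's stdout/stderr to the local Logstash instance
docker run -d \
  --log-driver=gelf \
  --log-opt gelf-address=udp://127.0.0.1:12201 \
  my-image

On the Logstash side, a matching input might look like:

# Logstash configuration sketch
input {
  gelf {
    port => 12201
  }
}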
Caution: If you override the logging driver to use something other than json-file or journald, you will notice that docker logs no longer works.
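
For instance (the image name is hypothetical, and the exact error text varies by Docker version):

# With a non-default driver, the daemon keeps no local copy to read
docker run -d --name app --log-driver=syslog my-image
docker logs app
# Error response from daemon: configured logging driver does not support reading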

An alternative to overriding the default logging driver is to use a special daemon called Logspout:

Logspout is a log router for Docker containers that runs inside Docker. It attaches to all containers on a host, then routes their logs wherever you want. - https://github.com/gliderlabs/logspout

A great example of this is when you want to preserve logs for local debugging purposes (e.g. docker logs) but also want logs shipped to a remote destination.
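
A minimal Logspout deployment follows the pattern in the project's README; the syslog destination below is a placeholder.

# Logspout needs the Docker socket to attach to container log streams
docker run -d --name logspout \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  gliderlabs/logspout \
  syslog+tls://logs.example.com:55555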

Write log entries from local aggregators to a remote buffer.

Instead of shipping logs directly to remote aggregators or data stores, ship them to a buffer (Redis is commonly used for this purpose, but Kafka is also a great option). This practice once saved my organization a week's worth of log data when our Elasticsearch cluster ran out of disk space. A remote buffer can also absorb spikes in log traffic that would otherwise strain the destination data stores.

# Logstash configuration example
output {
  redis {
    host      => "elasticcache.buffer"
    data_type => "list"
    key       => "logstash"
  }
}

Employ remote aggregators to route log entries from the buffer to destination indices and storage.

# Logstash configuration example
input {
  redis {
    host      => "elasticcache.buffer"
    type      => "log"
    data_type => "list"
    key       => "logstash"
  }
}
output {
  # Amazon-hosted Elasticsearch
  amazon_es {
    hosts  => ["logs-asdfjkasdfkhaiweruihwyehaskdf.us-west-2.es.amazonaws.com"]
    region => "us-west-2"
  }
  s3 {
    region    => "us-west-2"
    bucket    => "com.example.logs.production"
    size_file => 2048 # bytes
    time_file => 5    # minutes
  }
}