This document captures the design of event compression.
Kubernetes components can get into a state where they generate tons of events.
The events can be categorized in one of two ways:

- *same*: the event is identical to previously generated events, varying only in timestamp
- *similar*: the event is identical to previously generated events, varying only in timestamp and message
For example, when pulling a non-existent image, the kubelet will repeatedly generate
`image_not_existing` and `container_is_waiting` events until upstream components
correct the image. When this happens, the spam from the repeated events makes
the entire event mechanism useless. It also appears to cause memory pressure in
etcd (see #3853).
The goal is to introduce event counting to increment *same* events, and event aggregation to collapse *similar* events.
Each binary that generates events (for example, the kubelet) should keep track of
previously generated events so that it can collapse recurring events into a
single event instead of creating a new instance for each new event. In addition,
if many similar events are created, events should be aggregated into a single
event to reduce spam.
Event compression should be best effort (not guaranteed): in the worst case,
`n` identical (minus timestamp) events may still result in `n` event entries.
Instead of a single `Timestamp`, each event object contains the following fields:

- `FirstTimestamp unversioned.Time`: the date/time the event was first observed
- `LastTimestamp unversioned.Time`: the date/time of the most recent occurrence of the event; equal to `FirstTimestamp` on first observation
- `Count int`: the number of occurrences of this event between `FirstTimestamp` and `LastTimestamp`; 1 on first observation
Each binary that generates events:

- Maintains a record of previously generated events, kept in a least-recently-used cache in `pkg/client/record/events_cache.go` behind an `EventCorrelator` that manages two subcomponents: `EventAggregator` and `EventLogger`.
- The `EventCorrelator` observes all incoming events and lets each subcomponent visit and modify the event in turn.
- The `EventAggregator` runs an aggregation function over each event. This function buckets each event based on an `aggregateKey` and identifies the event uniquely with a `localKey` within that bucket.
- By default, the aggregation function groups similar events that differ only in `event.Message`. The `localKey` is `event.Message`, and the `aggregateKey` is produced by joining:
  - `event.Source.Component`
  - `event.Source.Host`
  - `event.InvolvedObject.Kind`
  - `event.InvolvedObject.Namespace`
  - `event.InvolvedObject.Name`
  - `event.InvolvedObject.UID`
  - `event.InvolvedObject.APIVersion`
  - `event.Reason`
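The bucketing can be sketched as follows. The flattened struct and helper names here are illustrative, not the actual `events_cache.go` code:

```go
package main

import (
	"fmt"
	"strings"
)

// Event holds only the fields that participate in aggregation.
type Event struct {
	SourceComponent, SourceHost            string
	Kind, Namespace, Name, UID, APIVersion string
	Reason, Message                        string
}

// aggregateKey buckets events that differ only in Message: every
// identifying field except the message participates in the key.
func aggregateKey(e Event) string {
	return strings.Join([]string{
		e.SourceComponent, e.SourceHost,
		e.Kind, e.Namespace, e.Name, e.UID, e.APIVersion,
		e.Reason,
	}, "/")
}

// localKey identifies the event uniquely within its bucket.
func localKey(e Event) string { return e.Message }

func main() {
	a := Event{SourceComponent: "scheduler", Kind: "Pod", Namespace: "default",
		Name: "skydns-ls6k1", Reason: "failedScheduling", Message: "no nodes available"}
	b := a
	b.Message = "fit failure on node kubernetes-node-1"
	// Same bucket (aggregate key), distinct local keys:
	fmt.Println(aggregateKey(a) == aggregateKey(b), localKey(a) == localKey(b)) // → true false
}
```

Two events for the same pod with the same reason land in the same bucket even though their messages differ, which is exactly what lets the aggregator detect "similar" spam.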
- If the `EventAggregator` observes a similar event produced 10 times in a 10-minute window, it drops the event that was provided as input and creates a new event that differs only in the message. The message denotes that this event is used to group similar events that matched on reason. This aggregated event is then used in the rest of the event processing sequence.
- The `EventLogger` observes the event coming out of the `EventAggregator` and tracks the number of times it has previously observed that event by incrementing a key in a cache associated with the matching event. The cache key is generated from the event, excluding timestamps and count, by joining:
  - `event.Source.Component`
  - `event.Source.Host`
  - `event.InvolvedObject.Kind`
  - `event.InvolvedObject.Namespace`
  - `event.InvolvedObject.Name`
  - `event.InvolvedObject.UID`
  - `event.InvolvedObject.APIVersion`
  - `event.Reason`
  - `event.Message`
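A toy version of this bookkeeping, with hypothetical helper names (the real logic lives in `pkg/client/record/events_cache.go` and uses an LRU cache rather than a plain map):

```go
package main

import (
	"fmt"
	"strings"
)

// eventKey joins every identifying field, including the message, so only
// truly identical events (modulo timestamps and count) share a key.
func eventKey(fields ...string) string { return strings.Join(fields, "/") }

// seen maps an event key to the number of times that event was observed.
var seen = map[string]int{}

// observe increments and returns the count for the event with the given key.
func observe(key string) int {
	seen[key]++
	return seen[key]
}

func main() {
	k := eventKey("scheduler", "", "Pod", "default", "skydns-ls6k1",
		"uid-1234", "v1", "failedScheduling", "no nodes available to schedule pods")
	for i := 0; i < 4; i++ {
		observe(k)
	}
	fmt.Println(seen[k]) // → 4
}
```

Four observations of the same scheduling failure collapse into one cache entry with a count of 4, which matches the `COUNT` column in the sample output below.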
- The cache is capped at 4096 entries for both the `EventAggregator` and the `EventLogger`. That means if a component (e.g. the kubelet) runs for a long period of time and generates tons of unique events, the previously-generated-events cache will not grow unchecked in memory. Instead, after 4096 unique events are generated, the oldest events are evicted from the cache.
- Generated events are checked against this cache (see `pkg/client/unversioned/record/event.go`).
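A minimal stand-in for the eviction behavior, using a tiny cap of 2 instead of 4096; this sketch is not the library the real caches use, only an illustration of least-recently-used eviction:

```go
package main

import (
	"container/list"
	"fmt"
)

// lruCache is a toy bounded cache: once maxEntries keys are present,
// inserting a new key evicts the least recently used one.
type lruCache struct {
	maxEntries int
	order      *list.List               // front = most recently used
	entries    map[string]*list.Element // key -> its node in order
}

func newLRUCache(max int) *lruCache {
	return &lruCache{maxEntries: max, order: list.New(), entries: map[string]*list.Element{}}
}

// Touch marks key as most recently used, evicting the oldest key if full.
func (c *lruCache) Touch(key string) {
	if el, ok := c.entries[key]; ok {
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.maxEntries { // evict the least recently used entry
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.entries, oldest.Value.(string))
	}
	c.entries[key] = c.order.PushFront(key)
}

func (c *lruCache) Contains(key string) bool { _, ok := c.entries[key]; return ok }

func main() {
	c := newLRUCache(2)
	c.Touch("event-a")
	c.Touch("event-b")
	c.Touch("event-c") // evicts event-a, the least recently used key
	fmt.Println(c.Contains("event-a"), c.Contains("event-c")) // → false true
}
```

The important property for event compression is the bound itself: a long-running kubelet can emit an unbounded stream of unique events, yet memory use stays fixed because only the 4096 most recently seen keys are retained.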
Sample `kubectl` output:

```console
FIRSTSEEN LASTSEEN COUNT NAME KIND SUBOBJECT REASON SOURCE MESSAGE
Thu, 12 Feb 2015 01:13:02 +0000 Thu, 12 Feb 2015 01:13:02 +0000 1 kubernetes-node-4.c.saad-dev-vms.internal Minion starting {kubelet kubernetes-node-4.c.saad-dev-vms.internal} Starting kubelet.
Thu, 12 Feb 2015 01:13:09 +0000 Thu, 12 Feb 2015 01:13:09 +0000 1 kubernetes-node-1.c.saad-dev-vms.internal Minion starting {kubelet kubernetes-node-1.c.saad-dev-vms.internal} Starting kubelet.
Thu, 12 Feb 2015 01:13:09 +0000 Thu, 12 Feb 2015 01:13:09 +0000 1 kubernetes-node-3.c.saad-dev-vms.internal Minion starting {kubelet kubernetes-node-3.c.saad-dev-vms.internal} Starting kubelet.
Thu, 12 Feb 2015 01:13:09 +0000 Thu, 12 Feb 2015 01:13:09 +0000 1 kubernetes-node-2.c.saad-dev-vms.internal Minion starting {kubelet kubernetes-node-2.c.saad-dev-vms.internal} Starting kubelet.
Thu, 12 Feb 2015 01:13:05 +0000 Thu, 12 Feb 2015 01:13:12 +0000 4 monitoring-influx-grafana-controller-0133o Pod failedScheduling {scheduler } Error scheduling: no nodes available to schedule pods
Thu, 12 Feb 2015 01:13:05 +0000 Thu, 12 Feb 2015 01:13:12 +0000 4 elasticsearch-logging-controller-fplln Pod failedScheduling {scheduler } Error scheduling: no nodes available to schedule pods
Thu, 12 Feb 2015 01:13:05 +0000 Thu, 12 Feb 2015 01:13:12 +0000 4 kibana-logging-controller-gziey Pod failedScheduling {scheduler } Error scheduling: no nodes available to schedule pods
Thu, 12 Feb 2015 01:13:05 +0000 Thu, 12 Feb 2015 01:13:12 +0000 4 skydns-ls6k1 Pod failedScheduling {scheduler } Error scheduling: no nodes available to schedule pods
Thu, 12 Feb 2015 01:13:05 +0000 Thu, 12 Feb 2015 01:13:12 +0000 4 monitoring-heapster-controller-oh43e Pod failedScheduling {scheduler } Error scheduling: no nodes available to schedule pods
Thu, 12 Feb 2015 01:13:20 +0000 Thu, 12 Feb 2015 01:13:20 +0000 1 kibana-logging-controller-gziey BoundPod implicitly required container POD pulled {kubelet kubernetes-node-4.c.saad-dev-vms.internal} Successfully pulled image "kubernetes/pause:latest"
Thu, 12 Feb 2015 01:13:20 +0000 Thu, 12 Feb 2015 01:13:20 +0000 1 kibana-logging-controller-gziey Pod scheduled {scheduler } Successfully assigned kibana-logging-controller-gziey to kubernetes-node-4.c.saad-dev-vms.internal
```
This demonstrates that what would have been 20 separate entries (indicating scheduling failures) was compressed down to 5 entries.