Organizations wish to avoid running "unapproved" images.
The exact nature of "approval" is beyond the scope of Kubernetes, but may include reasons like:
only allow images signed by certain public keys.
etc...
Goals of the design include:
Non-goals include:
Detailed discussion in Ensuring only images are from approved sources are run.
A new admission controller will be added. That will be the only change.
An ImagePolicyWebhook admission controller will be written. The admission controller examines all pod objects which are created or updated. It can either admit the pod, or reject it. If it is rejected, the request sees a 403 FORBIDDEN.
The admission controller code will go in plugin/pkg/admission/imagepolicy.
There will be a cache of decisions in the admission controller.
If the apiserver cannot reach the webhook backend, it will log a warning and either admit or deny the pod. A flag will control whether it admits or denies on failure. The rationale for deny is that an attacker could DoS the backend, or wait for it to be down, and then sneak a bad pod into the system. The rationale for allow is that, if the cluster admin also does after-the-fact auditing of what images were run (which we think will be common), the audit will catch any bad images run during periods of backend failure. With default-allow, the availability of Kubernetes does not depend on the availability of the backend.
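The failure behavior described above can be sketched as follows. This is a minimal illustration, not the admission controller's actual code: `reviewFunc`, `admit`, `unreachable`, and the `defaultAllow` parameter are hypothetical names standing in for the WebHook call and the admit-or-deny-on-failure flag.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// reviewFunc asks the backend whether a set of images may run.
// It is a hypothetical stand-in for the real WebHook call.
type reviewFunc func(images []string) (allowed bool, err error)

// admit returns the backend's decision when the backend is reachable. When the
// call fails, it logs a warning and falls back to defaultAllow, which plays the
// role of the flag that controls admit-or-deny on failure.
func admit(review reviewFunc, images []string, defaultAllow bool) bool {
	allowed, err := review(images)
	if err != nil {
		log.Printf("warning: image policy backend unreachable: %v (defaultAllow=%t)", err, defaultAllow)
		return defaultAllow
	}
	return allowed
}

// unreachable simulates a backend that is down.
func unreachable(images []string) (bool, error) {
	return false, errors.New("connection refused")
}

func main() {
	images := []string{"myrepo/myimage:v1"}
	fmt.Println("fail open admits:", admit(unreachable, images, true))
	fmt.Println("fail closed admits:", admit(unreachable, images, false))
}
```

With `defaultAllow` set to true the pod is admitted while the backend is down, which keeps Kubernetes available at the cost of relying on later auditing; with false the pod is rejected.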
The admission controller code in that directory does not contain logic to make an admit/reject decision. Instead, it extracts relevant fields from the Pod creation/update request and sends those fields to a Backend (which we have been loosely calling "WebHooks" in Kubernetes). The request the admission controller sends to the backend is called a WebHook request to distinguish it from the request being admission-controlled. The server that accepts the WebHook request from Kubernetes is called the "Backend" to distinguish it from the WebHook request itself, and from the API server.
The whole system will work similarly to the Authentication WebHook or the Authorization WebHook.
The WebHook request can optionally authenticate itself to its backend using a token from a kubeconfig file.
The WebHook request and response are JSON, and correspond to the following Go structures:
// Filename: pkg/apis/imagepolicy.k8s.io/register.go
package imagepolicy

// ImageReview checks if the set of images in a pod are allowed.
type ImageReview struct {
	unversioned.TypeMeta

	// Spec holds information about the pod being evaluated.
	Spec ImageReviewSpec

	// Status is filled in by the backend and indicates whether the pod should be allowed.
	Status ImageReviewStatus
}

// ImageReviewSpec is a description of the pod creation request.
type ImageReviewSpec struct {
	// Containers is a list of a subset of the information in each container of the Pod being created.
	Containers []ImageReviewContainerSpec

	// Annotations is a list of key-value pairs extracted from the Pod's annotations.
	// It only includes keys which match the pattern `*.image-policy.k8s.io/*`.
	// It is up to each webhook backend to determine how to interpret these annotations, if at all.
	Annotations map[string]string

	// Namespace is the namespace the pod is being created in.
	Namespace string
}

// ImageReviewContainerSpec is a description of a container within the pod creation request.
type ImageReviewContainerSpec struct {
	Image string
	// In future, we may add command line overrides, exec health check command lines, and so on.
}

// ImageReviewStatus is the result of the image review request.
type ImageReviewStatus struct {
	// Allowed indicates that all images were allowed to be run.
	Allowed bool

	// Reason should be empty unless Allowed is false in which case it
	// may contain a short description of what is wrong. Kubernetes
	// may truncate excessively long errors when displaying to the user.
	Reason string
}
All annotations on a Pod that match *.image-policy.k8s.io/* are sent to the webhook. Sending annotations allows users who are aware of the image policy backend to send extra information to it, and allows different backend implementations to accept different information.
Examples of information you might put here are
In any case, the annotations are provided by the user and are not validated by Kubernetes in any way. In the future, if an annotation is determined to be widely useful, we may promote it to a named field of ImageReviewSpec.
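The annotation selection could look roughly like the sketch below. It assumes the pattern is interpreted as a shell-style glob in which `*` does not cross a `/`, which is what Go's `path.Match` implements; the proposal does not spell out the matching rules. `filterAnnotations` and the sample annotation keys are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// annotationPattern is the pattern quoted in the text. Treating it as a
// shell-style glob, where * does not cross a "/", is an assumption.
const annotationPattern = "*.image-policy.k8s.io/*"

// filterAnnotations returns only the annotations whose keys match
// annotationPattern; everything else stays out of the WebHook request.
func filterAnnotations(all map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range all {
		// path.Match only returns an error for a malformed pattern;
		// the pattern here is a well-formed constant.
		if ok, err := path.Match(annotationPattern, k); err == nil && ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	podAnnotations := map[string]string{
		"mycompany.image-policy.k8s.io/ticket": "hypothetical-value",
		"kubernetes.io/created-by":             "someone",
	}
	fmt.Println(filterAnnotations(podAnnotations))
}
```

Only the first annotation in the example is forwarded. The values are passed through untouched, consistent with Kubernetes not validating them.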
In the case of a Pod update, Kubernetes may send the backend either all images in the updated pod, or only the ones that changed, at its discretion.
In the case of a Deployment object, no image check is done when the Deployment object is created or updated. Likewise, no check happens when the Deployment controller creates a ReplicaSet. The check only happens when the ReplicaSet controller creates a Pod. Checking Pods is necessary since users can create pods directly, and since third parties can write their own controllers, which Kubernetes might not be aware of and which might not even contain pod templates.
The ReplicaSet, or other controller, is responsible for recognizing when a 403 has happened (whether because the user lacks permission due to a bad image, or for some other permission reason), throttling itself, and surfacing the error in a way that CLIs and UIs can show to the user.
Issue 22298 needs to be resolved to propagate Pod creation errors up through a stack of controllers.
The Backend might change the policy over time. For example, yesterday redis:v1 was allowed, but today redis:v1 is not allowed due to a CVE that just came out (fictional scenario). In this scenario:
It is up to the Backend implementor, and the cluster administrator who decides to use that backend, to decide whether the Backend should be allowed to change its mind. There is a tradeoff between responsiveness to changes in policy, versus keeping existing services running. The two models that make sense are:
If two clusters share an image policy backend, then they will have the same policies.
The clusters can pass different tokens to the backend, and the backend can use this to distinguish between different clusters.
Image tags are like: myrepo/myimage:v1.
Image IDs are like: myrepo/myimage@sha256:beb6bd6a68f114c1dc2ea4b28db81bdf91de202a9014972bec5e4d9171d90ed.
You can see image IDs with docker images --no-trunc.
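A backend that handles both forms first has to tell them apart. The sketch below distinguishes the two reference shapes shown above purely by string inspection; `isPinnedByID` is a hypothetical helper, and it does not validate the digest or contact any registry.

```go
package main

import (
	"fmt"
	"strings"
)

const digestMarker = "@sha256:"

// isPinnedByID reports whether an image reference has the image ID shape
// (repo@sha256:<hex>) rather than the tag shape (repo:tag).
// It only inspects the string; it does not check that the digest is valid.
func isPinnedByID(image string) bool {
	i := strings.LastIndex(image, digestMarker)
	// Require a non-empty repository before the marker and a non-empty
	// digest after it.
	return i > 0 && i+len(digestMarker) < len(image)
}

func main() {
	refs := []string{
		"myrepo/myimage:v1",
		"myrepo/myimage@sha256:beb6bd6a68f114c1dc2ea4b28db81bdf91de202a9014972bec5e4d9171d90ed",
	}
	for _, ref := range refs {
		fmt.Printf("%s pinned=%t\n", ref, isPinnedByID(ref))
	}
}
```

References that are not pinned are the ones whose meaning can change between approval and pull.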
The Backend needs to be able to resolve tags to IDs (by talking to the images repo). If the Backend resolves tags to IDs, there is some risk that the tag-to-ID mapping will be modified after approval by the Backend, but before Kubelet pulls the image. We will not address this race condition at this time.
We will wait and see how much demand there is for closing this hole. If the community demands a solution, we may suggest one of these:
If none of these works well, we could do the following:
The resolved image ID would be recorded in pod.spec.container[i].imageID (or an annotation), and kubelet will enforce that both the imageID and image match the image pulled.
Since this adds complexity and interacts with imagePullPolicy, we avoid adding the above feature initially.
There will be a cache of decisions in the admission controller. The TTL will be user-controllable, but will default to 1 hour for allows and 30s for denies. A low TTL for denies allows a user to correct a setting on the backend and see the fix rapidly. It is assumed that denies are infrequent. Caching permits an RC to scale up services even during short unavailability of the webhook backend. The ImageReviewSpec is used as the key to the cache.
In the case of a cache miss and timeout talking to the backend, the default is to allow Pod creation. Keeping services running is more important than a hypothetical threat from an un-verified image.
There are several cases where an image not currently allowed might still run. Users wanting a complete audit solution are advised to also do after-the-fact auditing of what images ran. This can catch:
This proposal does not include post-pod-creation audit.
We could have done admission control on Deployments, Jobs, ReplicationControllers, and anything else that creates a Pod, directly or indirectly. This approach is good because it provides immediate feedback to the user that the image is not allowed. However, we do not expect disallowed images to be used often. And controllers need to be able to surface problems creating pods for a variety of other reasons anyways.
Other good things about this alternative are:
Impersonate-User for controllers.
Other problems are:
We could have sent the username of the pod creator to the backend. The username could be used to allow different users to run different categories of images. This would require propagating the username from e.g. Deployment creation, through to Pod creation via, e.g. the Impersonate-User: header. This feature is not ready. When it is, we will re-evaluate adding user as a field of ImageReviewSpec.
Docker supports plugins which can check any container creation before it happens. For example the twistlock/authz Docker plugin can audit the full request sent to the Docker daemon and approve or deny it. This could include checking if the image is allowed.
We reject this option because:
User in policy).
We decided to store policy about what SecurityContexts a pod can have in the API, via PodSecurityPolicy. This is because Pods are a Kubernetes object, and the Policy is very closely tied to the definition of Pods, and grows in step as the Pods API grows.
For Image policy, the connection is not as strong. To the Kubernetes API, an Image is just a string, and the API does not know any of the image metadata, which lives outside the API.
Image policy may depend on the Dockerfile, the source code, the source repo, the source review tools, vulnerability databases, and so on. Kubernetes does not have these as built-in concepts or have plans to add them anytime soon.
We considered a whitelist/blacklist of registries and/or repositories. Basically, a prefix match on image strings. The problem of approving images would be then pushed to a problem of controlling who has access to push to a trusted registry/repository. That approach is simple for kubernetes. Problems with it are:
We still want to use rkt trust, docker content trust, etc for any registries used. We just need additional image policy checks beyond what trust can provide.
Instead of just sending a subset of PodSpec to an Image Provenance backend, we could have sent every object that is created or updated (or deleted?) to one or more Generic Admission Control Backends.
This might be a good idea, but needs quite a bit more thought. Some questions with that approach are:
It will not be a generic webhook. A generic webhook would need a lot more discussion:
Additionally, sending all the fields of only the Pod kind also has problems: