A security context is a set of constraints that are applied to a container in order to achieve the following goals (from security design):
The problem of securing containers in Kubernetes has come up before and the potential problems with container security are well known. Although it is not possible to completely isolate Docker containers from their hosts, new features like user namespaces make it possible to greatly reduce the attack surface.
To improve isolation of containers from the host and from other containers running on the host, containers should be granted only the access they need to perform their work. To this end, it should be possible to take advantage of Docker features such as the ability to add or remove capabilities and to assign MCS labels to the container process.
Support for user namespaces has recently been merged into Docker's libcontainer project and should soon surface in Docker itself. It will make it possible to assign a range of unprivileged uids and gids from the host to each container, improving the isolation between host and container and between containers.
To support external integration with shared storage, processes running in a Kubernetes cluster should be uniquely identifiable by their Unix UID, so that a chain of ownership can be established. Processes in pods will need consistent UID/GID/SELinux category labels in order to access shared disks.
In order of increasing complexity, the following are example use cases that would be addressed with security contexts:
Kubernetes is used to run a single cloud application. In order to protect nodes from containers:
Just like case #1, except that I have more than one application running on the Kubernetes cluster.
Kubernetes is used as the base for a PaaS with multiple projects, each project represented by a namespace.
A security context consists of a set of constraints that determine how a container is secured before it is created and run. A security context resides on the container and represents the runtime parameters that will be used to create and run the container via container APIs. A security context provider is passed to the Kubelet so that it has a chance to mutate Docker API calls in order to apply the security context.
It is recommended that this design be implemented in two phases:
The Kubelet will have an interface that points to a SecurityContextProvider. The SecurityContextProvider is invoked before creating and running a given container:
type SecurityContextProvider interface {
	// ModifyContainerConfig is called before the Docker createContainer call.
	// The security context provider can make changes to the Config with which
	// the container is created.
	// An error is returned if it's not possible to secure the container as
	// requested with a security context.
	ModifyContainerConfig(pod *api.Pod, container *api.Container, config *docker.Config) error

	// ModifyHostConfig is called before the Docker runContainer call.
	// The security context provider can make changes to the HostConfig, affecting
	// security options, whether the container is privileged, volume binds, etc.
	// An error is returned if it's not possible to secure the container as requested
	// with a security context.
	ModifyHostConfig(pod *api.Pod, container *api.Container, hostConfig *docker.HostConfig) error
}
If the value of the SecurityContextProvider field on the Kubelet is nil, the kubelet will create and run the container as it does today.
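As a rough illustration of how the Kubelet might drive such a provider, the sketch below uses trimmed stand-in types (`Pod`, `Container`, `Config`, `HostConfig`) in place of the real `api` and `docker` types. It assumes the two methods return an `error`, as their comments describe. The names `simpleProvider` and `prepareContainer` are hypothetical and not part of this design.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Hypothetical stand-ins for api.Pod, api.Container, docker.Config and
// docker.HostConfig; the real types carry many more fields.
type Pod struct{ Name string }

type SecurityContext struct {
	Privileged *bool
	RunAsUser  *int64
}

type Container struct {
	Name            string
	SecurityContext *SecurityContext
}

type Config struct{ User string }

type HostConfig struct{ Privileged bool }

// SecurityContextProvider follows the interface in the text, assuming both
// methods return an error.
type SecurityContextProvider interface {
	ModifyContainerConfig(pod *Pod, container *Container, config *Config) error
	ModifyHostConfig(pod *Pod, container *Container, hostConfig *HostConfig) error
}

// simpleProvider is a hypothetical provider that applies RunAsUser and Privileged.
type simpleProvider struct{}

func (simpleProvider) ModifyContainerConfig(pod *Pod, c *Container, config *Config) error {
	if c.SecurityContext == nil || c.SecurityContext.RunAsUser == nil {
		return nil // nothing requested, leave the config untouched
	}
	uid := *c.SecurityContext.RunAsUser
	if uid < 0 {
		return errors.New("RunAsUser must not be negative")
	}
	config.User = strconv.FormatInt(uid, 10)
	return nil
}

func (simpleProvider) ModifyHostConfig(pod *Pod, c *Container, hostConfig *HostConfig) error {
	if c.SecurityContext != nil && c.SecurityContext.Privileged != nil {
		hostConfig.Privileged = *c.SecurityContext.Privileged
	}
	return nil
}

// prepareContainer sketches the Kubelet side: with a nil provider the
// configs are built exactly as they would be without security contexts.
func prepareContainer(p SecurityContextProvider, pod *Pod, c *Container) (*Config, *HostConfig, error) {
	config, hostConfig := &Config{}, &HostConfig{}
	if p == nil {
		return config, hostConfig, nil
	}
	if err := p.ModifyContainerConfig(pod, c, config); err != nil {
		return nil, nil, err
	}
	if err := p.ModifyHostConfig(pod, c, hostConfig); err != nil {
		return nil, nil, err
	}
	return config, hostConfig, nil
}

func main() {
	uid, priv := int64(1000), true
	c := &Container{Name: "app", SecurityContext: &SecurityContext{Privileged: &priv, RunAsUser: &uid}}
	config, hostConfig, err := prepareContainer(simpleProvider{}, &Pod{Name: "web"}, c)
	fmt.Println(config.User, hostConfig.Privileged, err)
}
```

Keeping the nil check in one place means the rest of the Kubelet does not need to know whether a provider is configured.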
A security context resides on the container and represents the runtime parameters that will be used to create and run the container via container APIs. Following is an example of an initial implementation:
type Container struct {
	// ... other fields omitted ...

	// Optional: SecurityContext defines the security options the container should be run with
	SecurityContext *SecurityContext
}
// SecurityContext holds security configuration that will be applied to a container. SecurityContext
// contains duplication of some existing fields from the Container resource. These duplicate fields
// will be populated based on the Container configuration if they are not set. Defining them on
// both the Container AND the SecurityContext will result in an error.
type SecurityContext struct {
	// Capabilities are the capabilities to add/drop when running the container
	Capabilities *Capabilities

	// Run the container in privileged mode
	Privileged *bool

	// SELinuxOptions are the labels to be applied to the container
	// and volumes
	SELinuxOptions *SELinuxOptions

	// RunAsUser is the UID to run the entrypoint of the container process.
	RunAsUser *int64
}
// SELinuxOptions are the labels to be applied to the container.
type SELinuxOptions struct {
	// SELinux user label
	User string

	// SELinux role label
	Role string

	// SELinux type label
	Type string

	// SELinux level label.
	Level string
}
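The rule described in the SecurityContext comment (duplicated fields are populated from the Container when unset, and setting them in both places is an error) could be sketched roughly as follows. The container-level `Privileged` field and the `applyDefaults` helper are assumptions made for illustration; this document does not name which fields are duplicated, and a `true` value is treated here as "defined on the Container".

```go
package main

import (
	"errors"
	"fmt"
)

// SecurityContext is trimmed to the single field used in this sketch.
type SecurityContext struct {
	Privileged *bool
}

// Container is a trimmed stand-in. The container-level Privileged field is
// an assumption used to illustrate a duplicated setting.
type Container struct {
	Name            string
	Privileged      bool
	SecurityContext *SecurityContext
}

// applyDefaults (hypothetical) copies the container-level setting into the
// security context when the context leaves it unset, and rejects containers
// that define the setting in both places.
func applyDefaults(c *Container) error {
	if c.SecurityContext == nil {
		c.SecurityContext = &SecurityContext{}
	}
	sc := c.SecurityContext
	if sc.Privileged != nil {
		if c.Privileged {
			return errors.New("privileged is set on both the container and its security context")
		}
		return nil // only the security context defines it
	}
	p := c.Privileged
	sc.Privileged = &p // populate from the container configuration
	return nil
}

func main() {
	c := &Container{Name: "app", Privileged: true}
	err := applyDefaults(c)
	fmt.Println(*c.SecurityContext.Privileged, err)
}
```

Pointer fields make "not set" distinguishable from an explicit `false`, which is what allows this kind of defaulting.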
It is up to an admission plugin to determine whether a security context is acceptable. At the time of writing, the admission control plugin for security contexts only allows a context that defines capabilities or the privileged setting. Contexts that attempt to define a UID or SELinux options are denied by default. In the future, the admission plugin will base this decision on configurable policies that reside within the service account.
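The default admission behaviour described above might look roughly like the following check. This is a minimal sketch and not the actual plugin: the `admit` function is hypothetical, and the `Add`/`Drop` fields on `Capabilities` are assumed, since this document does not define that type.

```go
package main

import (
	"errors"
	"fmt"
)

// Capabilities is assumed to hold capability names to add and drop.
type Capabilities struct {
	Add  []string
	Drop []string
}

type SELinuxOptions struct {
	User, Role, Type, Level string
}

type SecurityContext struct {
	Capabilities   *Capabilities
	Privileged     *bool
	SELinuxOptions *SELinuxOptions
	RunAsUser      *int64
}

// admit (hypothetical) applies the default policy from the text: contexts
// that set only capabilities or privileged pass, while contexts that set a
// UID or SELinux options are denied.
func admit(sc *SecurityContext) error {
	if sc == nil {
		return nil // no security context requested
	}
	if sc.RunAsUser != nil {
		return errors.New("security context may not set RunAsUser")
	}
	if sc.SELinuxOptions != nil {
		return errors.New("security context may not set SELinuxOptions")
	}
	return nil
}

func main() {
	uid := int64(0)
	fmt.Println(admit(&SecurityContext{Capabilities: &Capabilities{Drop: []string{"NET_RAW"}}}))
	fmt.Println(admit(&SecurityContext{RunAsUser: &uid}))
}
```

A policy-driven version would replace the two hard-coded denials with lookups against whatever policy is associated with the service account.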