PodSecurityPolicy allows cluster administrators to control the creation and validation of a security context for a pod and its containers.
Administration of a multi-tenant cluster requires the ability to provide varying sets of permissions among the tenants, the infrastructure components, and end users of the system who may themselves be administrators within their own isolated namespace.
Actors in a cluster may include infrastructure that is managed by administrators, infrastructure that is exposed to end users (builds, deployments), the isolated end user namespaces in the cluster, and the individual users inside those namespaces. Infrastructure components that operate on behalf of a user (builds, deployments) should be allowed to run at an elevated level of permissions without granting the user themselves an elevated set of permissions.
Use case 1: As an administrator, I can create a namespace for a person that cannot create privileged containers and enforce that the UID of the containers is set to a certain value.
Use case 2: As a cluster operator, I can allow an infrastructure component to create a pod with elevated privileges in a namespace where regular users cannot create pods with these privileges or execute commands in that pod.
Use case 3: As a cluster administrator, I can allow a given namespace (or service account) to create privileged pods or to run root pods.
Use case 4: As a cluster administrator, I can allow a project administrator to control the security contexts of pods and service accounts within a project.
PodSecurityPolicy
user.Info must be available to admission controllers. (Completed in https://github.com/GoogleCloudPlatform/kubernetes/pull/8203)
PodSecurityPolicy objects exist in the root scope, outside of a namespace. The PodSecurityPolicy will reference users and groups that are allowed to operate under the constraints. In order to support this, ServiceAccounts must be mapped to a user name or group list by the authentication/authorization layers. This allows the security context to treat users, groups, and service accounts uniformly.
Below is a list of PodSecurityPolicies which will likely serve most use cases:
1. A policy for the system:authenticated group, which will likely be the most restrictive set of constraints.
2. A policy for the group identified by system:service-accounts, which can be imposed by the service account authenticator / token generator.
3. A policy for the system:cluster-admins group: a set of constraints with elevated privileges that can be used by an administrative user or group.
// PodSecurityPolicy governs the ability to make requests that affect the SecurityContext
// that will be applied to a pod and container.
type PodSecurityPolicy struct {
    unversioned.TypeMeta `json:",inline"`
    api.ObjectMeta `json:"metadata,omitempty"`
    // Spec defines the policy enforced.
    Spec PodSecurityPolicySpec `json:"spec,omitempty"`
}
// PodSecurityPolicySpec defines the policy enforced.
type PodSecurityPolicySpec struct {
    // Privileged determines if a pod can request to be run as privileged.
    Privileged bool `json:"privileged,omitempty"`
    // Capabilities is a list of capabilities that can be added.
    Capabilities []api.Capability `json:"capabilities,omitempty"`
    // Volumes allows and disallows the use of different types of volume plugins.
    Volumes VolumeSecurityPolicy `json:"volumes,omitempty"`
    // HostNetwork determines if the policy allows the use of HostNetwork in the pod spec.
    HostNetwork bool `json:"hostNetwork,omitempty"`
    // HostPorts determines which host port ranges are allowed to be exposed.
    HostPorts []HostPortRange `json:"hostPorts,omitempty"`
    // HostPID determines if the policy allows the use of HostPID in the pod spec.
    HostPID bool `json:"hostPID,omitempty"`
    // HostIPC determines if the policy allows the use of HostIPC in the pod spec.
    HostIPC bool `json:"hostIPC,omitempty"`
    // SELinuxContext is the strategy that will dictate the allowable labels that may be set.
    SELinuxContext SELinuxContextStrategyOptions `json:"seLinuxContext,omitempty"`
    // RunAsUser is the strategy that will dictate the allowable RunAsUser values that may be set.
    RunAsUser RunAsUserStrategyOptions `json:"runAsUser,omitempty"`
    // The users who have permissions to use this policy
    Users []string `json:"users,omitempty"`
    // The groups that have permission to use this policy
    Groups []string `json:"groups,omitempty"`
}
// HostPortRange defines a range of host ports that will be enabled by a policy
// for pods to use. It requires both the start and end to be defined.
type HostPortRange struct {
    // Start is the beginning of the port range which will be allowed.
    Start int `json:"start"`
    // End is the end of the port range which will be allowed.
    End int `json:"end"`
}
// VolumeSecurityPolicy allows and disallows the use of different types of volume plugins.
type VolumeSecurityPolicy struct {
    // HostPath allows or disallows the use of the HostPath volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/volumes.md#hostpath
    HostPath bool `json:"hostPath,omitempty"`
    // EmptyDir allows or disallows the use of the EmptyDir volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/volumes.md#emptydir
    EmptyDir bool `json:"emptyDir,omitempty"`
    // GCEPersistentDisk allows or disallows the use of the GCEPersistentDisk volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/volumes.md#gcepersistentdisk
    GCEPersistentDisk bool `json:"gcePersistentDisk,omitempty"`
    // AWSElasticBlockStore allows or disallows the use of the AWSElasticBlockStore volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/volumes.md#awselasticblockstore
    AWSElasticBlockStore bool `json:"awsElasticBlockStore,omitempty"`
    // GitRepo allows or disallows the use of the GitRepo volume plugin.
    GitRepo bool `json:"gitRepo,omitempty"`
    // Secret allows or disallows the use of the Secret volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/volumes.md#secrets
    Secret bool `json:"secret,omitempty"`
    // NFS allows or disallows the use of the NFS volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/volumes.md#nfs
    NFS bool `json:"nfs,omitempty"`
    // ISCSI allows or disallows the use of the ISCSI volume plugin.
    // More info: http://releases.k8s.io/release-1.4/examples/volumes/iscsi/README.md
    ISCSI bool `json:"iscsi,omitempty"`
    // Glusterfs allows or disallows the use of the Glusterfs volume plugin.
    // More info: http://releases.k8s.io/release-1.4/examples/volumes/glusterfs/README.md
    Glusterfs bool `json:"glusterfs,omitempty"`
    // PersistentVolumeClaim allows or disallows the use of the PersistentVolumeClaim volume plugin.
    // More info: http://releases.k8s.io/release-1.4/docs/user-guide/persistent-volumes.md#persistentvolumeclaims
    PersistentVolumeClaim bool `json:"persistentVolumeClaim,omitempty"`
    // RBD allows or disallows the use of the RBD volume plugin.
    // More info: http://releases.k8s.io/release-1.4/examples/volumes/rbd/README.md
    RBD bool `json:"rbd,omitempty"`
    // Cinder allows or disallows the use of the Cinder volume plugin.
    // More info: http://releases.k8s.io/release-1.4/examples/mysql-cinder-pd/README.md
    Cinder bool `json:"cinder,omitempty"`
    // CephFS allows or disallows the use of the CephFS volume plugin.
    CephFS bool `json:"cephfs,omitempty"`
    // DownwardAPI allows or disallows the use of the DownwardAPI volume plugin.
    DownwardAPI bool `json:"downwardAPI,omitempty"`
    // FC allows or disallows the use of the FC volume plugin.
    FC bool `json:"fc,omitempty"`
}
// SELinuxContextStrategyOptions defines the strategy type and any options used to create the strategy.
type SELinuxContextStrategyOptions struct {
    // Type is the strategy that will dictate the allowable labels that may be set.
    Type SELinuxContextStrategy `json:"type"`
    // seLinuxOptions required to run as; required for MustRunAs
    // More info: http://releases.k8s.io/release-1.4/docs/design/security_context.md#security-context
    SELinuxOptions *api.SELinuxOptions `json:"seLinuxOptions,omitempty"`
}
// SELinuxContextStrategy denotes strategy types for generating SELinux options for a
// SecurityContext.
type SELinuxContextStrategy string
const (
    // container must have SELinux labels of X applied.
    SELinuxStrategyMustRunAs SELinuxContextStrategy = "MustRunAs"
    // container may make requests for any SELinux context labels.
    SELinuxStrategyRunAsAny SELinuxContextStrategy = "RunAsAny"
)
// RunAsUserStrategyOptions defines the strategy type and any options used to create the strategy.
type RunAsUserStrategyOptions struct {
    // Type is the strategy that will dictate the allowable RunAsUser values that may be set.
    Type RunAsUserStrategy `json:"type"`
    // UID is the user id that containers must run as. Required for the MustRunAs strategy if not using
    // a strategy that supports pre-allocated uids.
    UID *int64 `json:"uid,omitempty"`
    // UIDRangeMin defines the min value for a strategy that allocates by a range based strategy.
    UIDRangeMin *int64 `json:"uidRangeMin,omitempty"`
    // UIDRangeMax defines the max value for a strategy that allocates by a range based strategy.
    UIDRangeMax *int64 `json:"uidRangeMax,omitempty"`
}
// RunAsUserStrategy denotes strategy types for generating RunAsUser values for a
// SecurityContext.
type RunAsUserStrategy string
const (
    // container must run as a particular uid.
    RunAsUserStrategyMustRunAs RunAsUserStrategy = "MustRunAs"
    // container must run as a uid within a particular range.
    RunAsUserStrategyMustRunAsRange RunAsUserStrategy = "MustRunAsRange"
    // container must run as a non-root uid
    RunAsUserStrategyMustRunAsNonRoot RunAsUserStrategy = "MustRunAsNonRoot"
    // container may make requests for any uid.
    RunAsUserStrategyRunAsAny RunAsUserStrategy = "RunAsAny"
)
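As an illustration of how one of these fields might be checked, the sketch below tests a requested host port against a list of HostPortRange values. It uses a simplified local copy of the type rather than the real API package, the helper name hostPortAllowed is hypothetical, and it assumes that both ends of a range are inclusive, which the type definition above does not state.

```go
package main

import "fmt"

// HostPortRange is a simplified local copy of the type defined above.
type HostPortRange struct {
    Start int
    End   int
}

// hostPortAllowed is a hypothetical helper. It reports whether port falls
// inside at least one allowed range. Assumption: Start and End are inclusive.
// An empty list allows no host ports.
func hostPortAllowed(ranges []HostPortRange, port int) bool {
    for _, r := range ranges {
        if port >= r.Start && port <= r.End {
            return true
        }
    }
    return false
}

func main() {
    allowed := []HostPortRange{{Start: 8000, End: 8080}}
    fmt.Println(hostPortAllowed(allowed, 8080))
    fmt.Println(hostPortAllowed(allowed, 22))
}
```

Treating an empty list as "nothing allowed" keeps the unset value of the field on the restrictive side.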
As reusable objects in the root scope, PodSecurityPolicy follows the lifecycle of the cluster itself. Maintenance of constraints such as adding, assigning, or changing them is the responsibility of the cluster administrator.
Creating a new user within a namespace should not require the cluster administrator to define the user's PodSecurityPolicy. They should receive the default set of policies that the administrator has defined for the groups they are assigned.
In order to establish policy for service accounts and users, there must be a way
to identify the default set of constraints that is to be used. This is best accomplished by using
groups. As mentioned above, groups may be used by the authentication/authorization layer to ensure
that every user maps to at least one group (with a default example of system:authenticated), and it is up to the cluster administrator to ensure that a PodSecurityPolicy object exists that references the group.
If an administrator would like to provide a user with a changed set of security context permissions, they may do the following:
1. Create a new PodSecurityPolicy object and add a reference to the user or a group that the user belongs to.
2. Add the user (or group) to an existing PodSecurityPolicy object with the proper elevated privileges.
Admission control using an authorizer provides the ability to control the creation of resources based on capabilities granted to a user. For PodSecurityPolicy, this means that an admission controller may inspect the user info made available in the context to retrieve an appropriate set of policies for validation.
The appropriate set of PodSecurityPolicies is defined as all of the policies available that have reference to the user or groups that the user belongs to.
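The selection rule just described can be sketched as a filter over all policies. The types below are simplified stand-ins (only a name plus the Users and Groups fields are modeled), and the names Policy, UserInfo, contains, and policiesFor are hypothetical.

```go
package main

import "fmt"

// Policy is a simplified stand-in for PodSecurityPolicy.
type Policy struct {
    Name   string
    Users  []string
    Groups []string
}

// UserInfo is a simplified stand-in for the user information that is
// available to admission controllers.
type UserInfo struct {
    Name   string
    Groups []string
}

// contains reports whether s is present in list.
func contains(list []string, s string) bool {
    for _, v := range list {
        if v == s {
            return true
        }
    }
    return false
}

// policiesFor returns every policy that references the user directly or
// through one of the groups the user belongs to.
func policiesFor(user UserInfo, all []Policy) []Policy {
    var out []Policy
    for _, p := range all {
        if contains(p.Users, user.Name) {
            out = append(out, p)
            continue
        }
        for _, g := range user.Groups {
            if contains(p.Groups, g) {
                out = append(out, p)
                break
            }
        }
    }
    return out
}

func main() {
    policies := []Policy{
        {Name: "restricted", Groups: []string{"system:authenticated"}},
        {Name: "admin", Groups: []string{"system:cluster-admins"}},
        {Name: "builder", Users: []string{"build-controller"}},
    }
    user := UserInfo{Name: "alice", Groups: []string{"system:authenticated"}}
    for _, p := range policiesFor(user, policies) {
        fmt.Println(p.Name)
    }
}
```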
Admission will use the PodSecurityPolicy to ensure that any requests for a specific security context setting are valid and to generate settings using the following approach:
1. Determine all the PodSecurityPolicy objects that are allowed to be used.
2. Sort the PodSecurityPolicy objects in most restrictive to least restrictive order.
3. For each PodSecurityPolicy, generate a SecurityContext for each container. The generation phase will not override any user-requested settings in the SecurityContext, and will rely on the validation phase to ensure that the user requests are valid.
4. Validate the generated SecurityContext to ensure it falls within the boundaries of the PodSecurityPolicy.
5. If all containers validate under a single PodSecurityPolicy, then the pod will be admitted.
6. If the containers do not all validate under the PodSecurityPolicy, then try the next PodSecurityPolicy.
7. If no PodSecurityPolicy validates for the pod, then the pod will not be admitted.
The creation of a SecurityContext from a PodSecurityPolicy is based upon the configured settings of the PodSecurityPolicy.
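The admission flow described above (order the policies, then try each until every container validates) can be sketched roughly as follows. This is a minimal illustration, not the actual admission controller: only the Privileged setting is modeled, the generation phase is omitted because a boolean needs no generated value, and the restrictiveness ordering is a hypothetical stand-in for whatever ordering an implementation chooses.

```go
package main

import (
    "fmt"
    "sort"
)

// Policy is a simplified stand-in for PodSecurityPolicy.
type Policy struct {
    Name       string
    Privileged bool // whether privileged containers are allowed
}

// Container carries the setting requested by the user.
type Container struct {
    Name       string
    Privileged bool
}

// restrictiveness is a hypothetical ordering key: lower is more restrictive.
func restrictiveness(p Policy) int {
    if p.Privileged {
        return 1
    }
    return 0
}

// validate reports whether the container's request falls within the policy.
func validate(p Policy, c Container) bool {
    return p.Privileged || !c.Privileged
}

// admit tries the policies from most to least restrictive and returns the
// name of the first policy under which every container validates.
func admit(policies []Policy, containers []Container) (string, bool) {
    sorted := append([]Policy(nil), policies...)
    sort.SliceStable(sorted, func(i, j int) bool {
        return restrictiveness(sorted[i]) < restrictiveness(sorted[j])
    })
    for _, p := range sorted {
        ok := true
        for _, c := range containers {
            if !validate(p, c) {
                ok = false
                break
            }
        }
        if ok {
            return p.Name, true
        }
    }
    return "", false
}

func main() {
    policies := []Policy{{Name: "privileged", Privileged: true}, {Name: "restricted"}}
    fmt.Println(admit(policies, []Container{{Name: "app"}}))
    fmt.Println(admit(policies, []Container{{Name: "agent", Privileged: true}}))
}
```

Trying the most restrictive policy first means a pod that requests nothing special is admitted under the tightest policy available to the user.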
There are three scenarios under which a PodSecurityPolicy field may fall:
Governed by a boolean: fields of this type will be defaulted to the most restrictive value. For instance, AllowPrivileged will always be set to false if unspecified.
Governed by an allowable set: fields of this type will be checked against the set to ensure their value is allowed. For example, AllowCapabilities will ensure that only capabilities that are allowed to be requested are considered valid. HostNetworkSources will ensure that only pods created from source X are allowed to request access to the host network.
Governed by a strategy: Items that have a strategy to generate a value will provide a mechanism to generate the value as well as a mechanism to ensure that a specified value falls into the set of allowable values. See the Types section for the description of the interfaces that strategies must implement.
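For the "allowable set" scenario, a check might look like the sketch below. The Capability type is a local stand-in for api.Capability, and capabilitiesAllowed is a hypothetical helper name. The boolean scenario needs no code of its own in Go, since an unspecified bool field already has the zero value false.

```go
package main

import "fmt"

// Capability is a local stand-in for api.Capability.
type Capability string

// capabilitiesAllowed reports whether every requested capability is a member
// of the allowed set. A request for no capabilities is always valid.
func capabilitiesAllowed(allowed, requested []Capability) bool {
    set := make(map[Capability]bool, len(allowed))
    for _, c := range allowed {
        set[c] = true
    }
    for _, c := range requested {
        if !set[c] {
            return false
        }
    }
    return true
}

func main() {
    allowed := []Capability{"NET_BIND_SERVICE", "CHOWN"}
    fmt.Println(capabilitiesAllowed(allowed, []Capability{"CHOWN"}))
    fmt.Println(capabilitiesAllowed(allowed, []Capability{"SYS_ADMIN"}))
}
```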
Strategies have the ability to become dynamic. In order to support a dynamic strategy it should be
possible to make a strategy that has the ability to either be pre-populated with dynamic data by
another component (such as an admission controller) or has the ability to retrieve the information
itself based on the data in the pod. An example of this would be a pre-allocated UID for the namespace.
A dynamic RunAsUser strategy could inspect the namespace of the pod in order to find the required pre-allocated UID and generate or validate requests based on that information.
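One way such a dynamic lookup could work is sketched below, under the assumption that the pre-allocated UID is stored as a namespace annotation. The annotation key example.com/preallocated-uid, the Namespace type, and both function names are hypothetical; the proposal does not specify where the UID is stored.

```go
package main

import (
    "fmt"
    "strconv"
)

// uidAnnotation is a hypothetical annotation key used only for illustration.
const uidAnnotation = "example.com/preallocated-uid"

// Namespace is a simplified stand-in for the namespace object.
type Namespace struct {
    Name        string
    Annotations map[string]string
}

// preallocatedUID reads and parses the UID assumed to be stored on the namespace.
func preallocatedUID(ns Namespace) (int64, error) {
    raw, ok := ns.Annotations[uidAnnotation]
    if !ok {
        return 0, fmt.Errorf("namespace %q has no pre-allocated uid", ns.Name)
    }
    uid, err := strconv.ParseInt(raw, 10, 64)
    if err != nil {
        return 0, fmt.Errorf("namespace %q: invalid uid %q: %v", ns.Name, raw, err)
    }
    return uid, nil
}

// validateUID accepts an unset request (nil) or a request that matches the
// namespace's pre-allocated UID, and rejects anything else.
func validateUID(ns Namespace, requested *int64) error {
    want, err := preallocatedUID(ns)
    if err != nil {
        return err
    }
    if requested != nil && *requested != want {
        return fmt.Errorf("uid %d is not the pre-allocated uid %d", *requested, want)
    }
    return nil
}

func main() {
    ns := Namespace{Name: "team-a", Annotations: map[string]string{uidAnnotation: "1000"}}
    root := int64(0)
    fmt.Println(validateUID(ns, nil))
    fmt.Println(validateUID(ns, &root))
}
```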
// SELinuxStrategy defines the interface for all SELinux constraint strategies.
type SELinuxStrategy interface {
    // Generate creates the SELinuxOptions based on constraint rules.
    Generate(pod *api.Pod, container *api.Container) (*api.SELinuxOptions, error)
    // Validate ensures that the specified values fall within the range of the strategy.
    Validate(pod *api.Pod, container *api.Container) fielderrors.ValidationErrorList
}
// RunAsUserStrategy defines the interface for all uid constraint strategies.
type RunAsUserStrategy interface {
    // Generate creates the uid based on policy rules.
    Generate(pod *api.Pod, container *api.Container) (*int64, error)
    // Validate ensures that the specified values fall within the range of the strategy.
    Validate(pod *api.Pod, container *api.Container) fielderrors.ValidationErrorList
}
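To make the generate-then-validate contract concrete, here is a minimal sketch of a range-based strategy. The interface is deliberately simplified (it takes a local Container type and returns a plain error slice instead of using api.Pod and fielderrors.ValidationErrorList), and the names mustRunAsRange and apply are hypothetical. Generating the lowest UID in the range is an assumption made for the example.

```go
package main

import "fmt"

// Container is a simplified stand-in for api.Container.
type Container struct {
    Name      string
    RunAsUser *int64
}

// RunAsUserStrategy is a simplified version of the interface above.
type RunAsUserStrategy interface {
    Generate(c *Container) (*int64, error)
    Validate(c *Container) []error
}

// mustRunAsRange is a hypothetical strategy that allows uids in [min, max].
type mustRunAsRange struct {
    min, max int64
}

// Generate returns the lowest uid in the range (an assumption of this sketch).
func (s mustRunAsRange) Generate(c *Container) (*int64, error) {
    uid := s.min
    return &uid, nil
}

// Validate checks that a uid is set and lies within the range.
func (s mustRunAsRange) Validate(c *Container) []error {
    if c.RunAsUser == nil {
        return []error{fmt.Errorf("container %q: runAsUser must be set", c.Name)}
    }
    if *c.RunAsUser < s.min || *c.RunAsUser > s.max {
        return []error{fmt.Errorf("container %q: uid %d is outside [%d, %d]",
            c.Name, *c.RunAsUser, s.min, s.max)}
    }
    return nil
}

// apply generates a uid only when the user did not request one, then
// validates the result, so user-requested settings are never overridden.
func apply(s RunAsUserStrategy, c *Container) []error {
    if c.RunAsUser == nil {
        uid, err := s.Generate(c)
        if err != nil {
            return []error{err}
        }
        c.RunAsUser = uid
    }
    return s.Validate(c)
}

func main() {
    s := mustRunAsRange{min: 1000, max: 2000}
    c := &Container{Name: "app"}
    fmt.Println(apply(s, c), *c.RunAsUser)
}
```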
An administrator may wish to create a resource in a namespace that runs with escalated privileges. By allowing security context constraints to operate on both the requesting user and the pod's service account, administrators are able to create pods in namespaces with elevated privileges based on the administrator's security context constraints.
This also allows the system to guard commands being executed in the non-conforming container. For
instance, an exec command can first check the security context of the pod against the security context constraints of the user or the user's ability to reference a service account.
If it does not validate then it can block users from executing the command. Since the validation
will be user aware, administrators would still be able to run the commands that are restricted to normal users.
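A user-aware exec check along these lines might be sketched as follows. It models only the Privileged setting and direct user references; the Policy type and the canExec function are hypothetical and do not correspond to an existing API.

```go
package main

import "fmt"

// Policy is a simplified stand-in for PodSecurityPolicy.
type Policy struct {
    Name       string
    Users      []string
    Privileged bool
}

// canExec reports whether user may execute commands in a pod. The request is
// allowed when at least one policy that references the user would itself
// permit the pod's privileged setting.
func canExec(user string, podPrivileged bool, policies []Policy) bool {
    for _, p := range policies {
        for _, u := range p.Users {
            if u == user && (p.Privileged || !podPrivileged) {
                return true
            }
        }
    }
    return false
}

func main() {
    policies := []Policy{
        {Name: "restricted", Users: []string{"alice", "admin"}},
        {Name: "privileged", Users: []string{"admin"}, Privileged: true},
    }
    fmt.Println(canExec("alice", true, policies))
    fmt.Println(canExec("admin", true, policies))
}
```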
In certain cases, the Kubelet may need to provide information about the image in order to validate the security context. An example of this is a cluster that is configured to run with a UID strategy of MustRunAsNonRoot.
In this case the admission controller can set the existing MustRunAsNonRoot flag on the SecurityContext based on the UID strategy of the PodSecurityPolicy. It should still validate any requests on the pod for a specific UID and fail early if possible. However, if RunAsUser is not set on the pod, it should still admit the pod and allow the Kubelet to ensure that the image does not run as root with the existing non-root checks.
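The split between early validation at admission time and the later Kubelet check can be sketched like this. The SecurityContext type is a simplified stand-in whose flag is named after the text above, and admitNonRoot is a hypothetical function name.

```go
package main

import (
    "errors"
    "fmt"
)

// SecurityContext is a simplified stand-in with only the fields needed here.
type SecurityContext struct {
    RunAsUser        *int64
    MustRunAsNonRoot bool
}

// admitNonRoot fails early when uid 0 is explicitly requested. Otherwise it
// sets the non-root flag and admits the pod, leaving it to the Kubelet to
// verify that the image itself does not run as root.
func admitNonRoot(sc *SecurityContext) error {
    if sc.RunAsUser != nil && *sc.RunAsUser == 0 {
        return errors.New("container requests uid 0 but the policy requires a non-root user")
    }
    sc.MustRunAsNonRoot = true
    return nil
}

func main() {
    root := int64(0)
    fmt.Println(admitNonRoot(&SecurityContext{RunAsUser: &root}) != nil)
    unset := &SecurityContext{}
    fmt.Println(admitNonRoot(unset), unset.MustRunAsNonRoot)
}
```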