Making the Most Out of Kubernetes Audit Logs
Robert Boll @roboll_
Laurent Bernaille @lbernail
Making the Most Out of
Kubernetes Audit Logs
@lbernail @roboll_
Datadog
Monitoring service
Over 350 integrations
Over 1,200 employees
Over 8,000 customers
Runs on millions of hosts
Trillions of data points per day
10,000s of hosts in our infra
10s of Kubernetes clusters
Clusters from 50 to 3000 nodes
Multi-cloud
Very fast growth
@lbernail @roboll_
Understanding what happens can be hard
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
Background:
The Kubernetes API
@lbernail @roboll_
apiserver
etcd
Calls to the apiservers
@lbernail @roboll_
apiserver
etcd
controllers scheduler
Control plane
@lbernail @roboll_
apiserver
etcd
kubelet controllers scheduler
Kubelet
@lbernail @roboll_
apiserver
etcd
kubelet controllers scheduler
kube-proxy
DaemonSet: kube-proxy
@lbernail @roboll_
apiserver
etcd
kubelet controllers scheduler
kube-proxy
other node
apps
Other DaemonSets (cni, etc.)
@lbernail @roboll_
apiserver
etcd
kubelet controllers scheduler
kube-proxy
coredns
other node
apps
Cluster services: DNS
@lbernail @roboll_
apiserver
etcd
kubelet controllers scheduler
kube-proxy cluster-autoscaler
ingress controllers
coredns
other node
apps
Other cluster services
@lbernail @roboll_
apiserver
etcd
kubelet controllers scheduler
kube-proxy cluster-autoscaler
ingress controllers
other apps
coredns
other node
apps
Probably several other applications
@lbernail @roboll_
And users, of course
apiserver
etcd
kubelet controllers scheduler
kubectl
kube-proxy cluster-autoscaler
ingress controllers
other apps
coredns
other node
apps
@lbernail @roboll_
And, surprise, the apiserver itself
apiserver
etcd
kubelet controllers scheduler
kubectl
kube-proxy cluster-autoscaler
ingress controllers
other apps
coredns
other node
apps
@lbernail @roboll_
What happens when you kubectl?
$ kubectl get pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
@lbernail @roboll_
Let’s look at details
$ kubectl get pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
apiserver api version namespace resource type resource name
@lbernail @roboll_
A few more GET examples
$ kubectl get pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
$ kubectl get pods
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?limit=500
List (paginated)
@lbernail @roboll_
A few more GET examples
$ kubectl get pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
$ kubectl get pods
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?limit=500
$ kubectl get pods --watch=true -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?limit=500
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?resourceVersion=282725545&watch=true
List & Watch
@lbernail @roboll_
Describe resource
kubectl describe pod echodeploy-77cf5c6f6-5wmw9 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/events?
fieldSelector=involvedObject.name=echodeploy-77cf5c6f6-5wmw9,
involvedObject.namespace=datadog,
involvedObject.uid=770b3a5e-0631-11ea-bc60-12d7306f3c0c
[...]
ResponseBody
{
  "kind": "EventList",
  "items": [
    {
      "involvedObject": { "kind": "Pod", "namespace": "datadog", "name": "echodeploy-77cf5c6f6-5wmw9" },
      "reason": "Scheduled",
      "message": "Successfully assigned echodeploy-77cf5c6f6-5wmw9 to ip-10-128-205-156.ec2.internal",
      "source": {
        "component": "default-scheduler"
      }
    }
  ]
}
@lbernail @roboll_
Describe resource
kubectl describe pod echodeploy-77cf5c6f6-5wmw9 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/events?
fieldSelector=involvedObject.name=echodeploy-77cf5c6f6-5wmw9,
involvedObject.namespace=datadog,
involvedObject.uid=770b3a5e-0631-11ea-bc60-12d7306f3c0c
[...]
ResponseBody
{
  "kind": "EventList",
  "items": [
    {
      "involvedObject": { "kind": "Pod", "namespace": "datadog", "name": "echodeploy-77cf5c6f6-5wmw9" },
      "reason": "Scheduled",
      "message": "Successfully assigned echodeploy-77cf5c6f6-5wmw9 to ip-10-128-205-156.ec2.internal",
      "source": {
        "component": "default-scheduler"
      }
    }
  ]
}
Get resource
@lbernail @roboll_
Describe resource
kubectl describe pod echodeploy-77cf5c6f6-5wmw9 -v=8
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
[...]
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/events?
fieldSelector=involvedObject.name=echodeploy-77cf5c6f6-5wmw9,
involvedObject.namespace=datadog,
involvedObject.uid=770b3a5e-0631-11ea-bc60-12d7306f3c0c
[...]
ResponseBody
{
"kind": "EventList",
"items": [
{
"involvedObject": { "kind": "Pod", "namespace": "datadog", "name": "echodeploy-77cf5c6f6-5wmw9"},
"reason": "Scheduled",
"message": "Successfully assigned echodeploy-77cf5c6f6-5wmw9 to ip-10-128-205-156.ec2.internal",
"source": {
"component": "default-scheduler"
},
}
]
}
Get resource
Get events associated
with resource
@lbernail @roboll_
A few other examples
$ kubectl delete pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
DELETE https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?fieldSelector=metadata.name%3Dechodeploy-77cf5c6f6-brj76
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?fieldSelector=metadata.name%3Dechodeploy-77cf5c6f6-brj76&
resourceVersion=282733788&watch=true
Delete + List & Watch
@lbernail @roboll_
A few other examples
$ kubectl delete pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
DELETE https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?fieldSelector=metadata.name%3Dechodeploy-77cf5c6f6-brj76
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?fieldSelector=metadata.name%3Dechodeploy-77cf5c6f6-brj76&
resourceVersion=282733788&watch=true
$ kubectl create deployment test --image=busybox -v=8
Request Body:
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"creationTimestamp":null,"labels":{"app":"test"},"name":"test"},"spec":{"replicas":1,"s
elector":{"matchLabels":{"app":"test"}},"strategy":{},"template":{"metadata":{"creationTimestamp":null,"labels":{"app":"test"}},"spec":{"container
s":[{"image":"busybox","name":"busybox","resources":{}}]}}},"status":{}}
POST https://kubernetes.fury.us1.staging.dog/apis/apps/v1/namespaces/datadog/deployments
Minimal deployment spec
POST call
@lbernail @roboll_
A few other examples
$ kubectl delete pod echodeploy-77cf5c6f6-brj76 -v=8
[...]
DELETE https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-brj76
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?fieldSelector=metadata.name%3Dechodeploy-77cf5c6f6-brj76
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?fieldSelector=metadata.name%3Dechodeploy-77cf5c6f6-brj76&
resourceVersion=282733788&watch=true
$ kubectl create deployment test --image=busybox -v=8
Request Body:
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"creationTimestamp":null,"labels":{"app":"test"},"name":"test"},"spec":{"replicas":1,"s
elector":{"matchLabels":{"app":"test"}},"strategy":{},"template":{"metadata":{"creationTimestamp":null,"labels":{"app":"test"}},"spec":{"container
s":[{"image":"busybox","name":"busybox","resources":{}}]}}},"status":{}}
POST https://kubernetes.fury.us1.staging.dog/apis/apps/v1/namespaces/datadog/deployments
$ kubectl scale deploy test --replicas=2 -v=8
GET https://kubernetes.fury.us1.staging.dog/apis/extensions/v1beta1/namespaces/datadog/deployments/test
Request Body: {"spec":{"replicas":2}}
PATCH https://kubernetes.fury.us1.staging.dog/apis/extensions/v1beta1/namespaces/datadog/deployments/test/scale
GET current
PATCH body + call
@lbernail @roboll_
Takeaways
● A lot of components are making calls
○ Control plane: controllers, scheduler
○ Node daemons: kubelet, kube-proxy
○ Other controllers: autoscaler, ingress
● “Simple” user ops translate to many API calls
How can we understand what is going on?
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
Audit Logs
@lbernail @roboll_
What are Audit Logs?
● Rich structured JSON logs output by the apiserver
● Configurable verbosity for each resource
● Logging can happen at different processing stages
[Diagram: client ⇄ apiserver, steps 1-3]
1: Apiserver receives request, Stage: RequestReceived
2: Apiserver processes request
3: Apiserver answers, Stage: ResponseComplete/ResponseStarted
@lbernail @roboll_
Content of Audit Logs
● What happened?
● Who initiated it?
● Why was it authorized?
● When did it happen?
● From where?
● Depending on verbosity, Request/Response
@lbernail @roboll_
GET example from earlier
Structured JSON log
A lot of information
$ kubectl get pod echodeploy-77cf5c6f6-5wmw9 -v=8
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
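The slide shows the raw JSON event, which this transcript omits. As a rough sketch (field names follow the audit.k8s.io/v1 Event schema; values are reconstructed from the surrounding slides, auditID and date elided):

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "[...]",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9",
  "verb": "get",
  "user": {
    "username": "laurent.bernaille@datadoghq.com",
    "groups": ["datadoghq.com", "system:authenticated"]
  },
  "sourceIPs": ["10.X.Y.74"],
  "objectRef": { "resource": "pods", "namespace": "datadog", "name": "echodeploy-77cf5c6f6-5wmw9", "apiVersion": "v1" },
  "responseStatus": { "code": 200 },
  "requestReceivedTimestamp": "[...]T20:33:26.757Z",
  "stageTimestamp": "[...]T20:33:26.771Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"datadoghq:cluster-admin-binding\" of ClusterRole \"datadoghq:cluster-user\" to Group \"datadoghq.com\""
  }
}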
@lbernail @roboll_
What happened?
$ kubectl get pod echodeploy-77cf5c6f6-5wmw9 -v=8
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
@lbernail @roboll_
Who initiated it?
User was laurent.bernaille@datadoghq.com and mapped to groups
- datadoghq.com
- system:authenticated
$ kubectl get pod echodeploy-77cf5c6f6-5wmw9 -v=8
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
@lbernail @roboll_
Why was it authorized?
It was authorized because group datadoghq.com is bound to role
datadoghq:cluster-user by ClusterRoleBinding datadoghq:cluster-admin-binding
(and this role has the required permissions)
$ kubectl get pod echodeploy-77cf5c6f6-5wmw9 -v=8
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
@lbernail @roboll_
When, and from where?
Request received at 20:33:26.757
Response completed at 20:33:26.771
Duration: 14ms
From IP: 10.X.Y.74
$ kubectl get pod echodeploy-77cf5c6f6-5wmw9 -v=8
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods/echodeploy-77cf5c6f6-5wmw9
@lbernail @roboll_
Another GET call
$ kubectl get pods -v=8
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?limit=500
GET is mapped to different verbs (get/list)
@lbernail @roboll_
Watches
$ kubectl get pods -v=8 -w
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?limit=500
GET https://kubernetes.fury.us1.staging.dog/api/v1/namespaces/datadog/pods?resourceVersion=288656279&watch=true
Call 1 : list
stage: ResponseComplete
Call 2 : watch
get + watch parameters
136ms later
stage: ResponseStarted
@lbernail @roboll_
Create call
$ kubectl create deployment test --image=busybox -v=8
POST https://kubernetes.fury.us1.staging.dog/apis/apps/v1/namespaces/datadog/deployments
@lbernail @roboll_
Create call
$ kubectl create deployment test --image=busybox -v=8
POST https://kubernetes.fury.us1.staging.dog/apis/apps/v1/namespaces/datadog/deployments
@lbernail @roboll_
Takeaways
Audit logs contain information on all API calls
● What happened?
● Who initiated it?
● Why was it authorized?
● When did it happen?
● From where?
● Depending on verbosity, Request/Response
OK, how do I get them?
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
Configuring Audit Logs
@lbernail @roboll_
Apiserver configuration
kube-apiserver
[...]
--audit-log-path=/var/log/kubernetes/apiserver/audit.log
--audit-policy-file=/etc/kubernetes/audit-policies/policy.yaml
[...]
Where to store them
What to collect
Minimum configuration
Advanced
● Rotation parameters (max size, backup options)
● Alternative backend (webhook)
● Batching mode
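As a sketch, these advanced options map to kube-apiserver flags along these lines (flag names from upstream kube-apiserver; double-check them against your version):

kube-apiserver
[...]
--audit-log-maxsize=100        # rotate the file once it reaches 100 MB
--audit-log-maxbackup=10       # keep at most 10 rotated files
--audit-log-maxage=7           # delete rotated files older than 7 days
--audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml   # alternative backend
--audit-webhook-mode=batch     # buffer events and send them asynchronously
[...]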
@lbernail @roboll_
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    omitStages:
      - "RequestReceived"
    resources:
      - group: "" # core API group
        resources: ["pods"]
    verbs: ["create", "patch", "update", "delete"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    omitStages:
      - "RequestReceived"
    resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
Set of rules
Audit policy: what to log?
@lbernail @roboll_
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    omitStages:
      - "RequestReceived"
    resources:
      - group: "" # core API group
        resources: ["pods"]
    verbs: ["create", "patch", "update", "delete"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    omitStages:
      - "RequestReceived"
    resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
Rules match API calls
● api group / version
● resource
● verbs
> Similar to RBAC
Audit policy: what to log?
@lbernail @roboll_
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    omitStages:
      - "RequestReceived"
    resources:
      - group: "" # core API group
        resources: ["pods"]
    verbs: ["create", "patch", "update", "delete"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    omitStages:
      - "RequestReceived"
    resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
For matching API calls
● Which verbosity?
● When? (stage)
Audit policy: when to log?
@lbernail @roboll_
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    omitStages:
      - "RequestReceived"
    resources:
      - group: "" # core API group
        resources: ["pods"]
    verbs: ["create", "patch", "update", "delete"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    omitStages:
      - "RequestReceived"
    resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
Gotchas
Rules are evaluated in order
First matching rule sets level
Request/RequestResponse
> contain payload data
Careful with security implications
ex: tokenreviews calls
group: “” means core API only
Don’t forget to add
● 3rd party apiservices
● your apiservices
@lbernail @roboll_
Recommendations
● Ignore the RequestReceived stage
● Use at least Metadata level for almost everything
○ Possibly ignore healthz, metrics
● Use Request/RequestResponse level for critical resources/verbs
○ Very valuable for retroactive debugging
○ Careful with large/sensitive request/response bodies
● Very complete example in GKE
https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh
● Documentation
https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#audit-policy
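Putting these recommendations together, a minimal starting policy could look like this (a sketch, not a drop-in config: extend the RequestResponse rule to the resources that matter to you, and remember that rule order matters):

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  # Drop health checks and metrics scrapes entirely
  - level: None
    nonResourceURLs: ["/healthz*", "/metrics"]
  # Full payloads for critical writes
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["pods"]
  # Metadata for everything else
  - level: Metadata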
@lbernail @roboll_
Takeaways
● Getting audit logs is “simple”: 2 flags
● Getting policies right is harder
● You will get a lot of logs
● Requires a solution to analyze them
Let’s look at an overview of a real large cluster
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
10000 foot view for a
large cluster
@lbernail @roboll_
Total number of API calls
~900 calls/second on this 2,500-node cluster
Number of audit logs per hour
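The views below come from a log platform, but a first approximation works on the raw file: audit logs are one JSON event per line, so a jq one-liner (a sketch, assuming the log file is available locally) already gives the top users:

jq -r '.user.username' /var/log/kubernetes/apiserver/audit.log | sort | uniq -c | sort -rn | head -25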
@lbernail @roboll_
Top API users?
apiserver: 40 rps
kube-proxy: 20 rps
local-volume-provisioner: 20 rps
cronjob-controller: 15 rps
spinnaker: 5-10 rps
@lbernail @roboll_
Top list, missing “small” users
total: ~900rps
Top 25: 150 rps
900 calls/second on this 2,500-node cluster
What is responsible for ~80% of the API calls?
Total number of API calls
@lbernail @roboll_
Grouping by user is not helping
Calls from users in group “system:nodes”: 750 rps (~80% of API calls)
In this 2,500-node cluster, that is only 0.3 rps per node!
Calls by user group
group=system:nodes
@lbernail @roboll_
Why is “system:nodes” so high?
nodes: 500 rps configmaps: 75 rps secrets: 60 rps
Calls from group “system:nodes” by resource targeted
@lbernail @roboll_
Verbs on “node” for a kubelet
10s
Each node updates its status every 10s
@lbernail @roboll_
Verbs on “configmaps”
Only GETs
Regular, but no clear pattern
@lbernail @roboll_
Group by resource name
Each configmap is refreshed every ~5min => GET call
Similar for secrets
~5min
@lbernail @roboll_
List latency by resource
Latency for list by resource
@lbernail @roboll_
List latency by resource
Latency for list by resource
apiserver restarts
@lbernail @roboll_
Comparing cluster performance
Latency for get pod by cluster (ms)
large clusters (1500+ nodes)
smaller clusters (<700 nodes)
@lbernail @roboll_
Takeaways
● Biggest users are the ones running on each node
○ kubelet, daemonsets (kube-proxy)
○ A lot of effort upstream to reduce their load
○ Be extra careful with daemonsets doing API calls
● The structure of audit logs lets you filter and slice & dice
● Audit logs are verbose (~1,000 logs/s in our example)
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
Understanding
Kubernetes Internals
@lbernail @roboll_
Creating a simple deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
@lbernail @roboll_
Sequence
User creates
deploy
deployment controller
creates RS
RS controller
creates pods
Scheduler
binds pods
Kubelets
update pod status
@lbernail @roboll_
Sequence
Scheduler
binds pods
Scheduler call: Create Binding for the nginx pod, to node ip-10-x-y-123
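Concretely, the scheduler POSTs a Binding object to the pod’s binding subresource, roughly like this (a sketch; the pod name is illustrative):

POST /api/v1/namespaces/datadog/pods/nginx-65f88748fd-abcde/binding
{
  "apiVersion": "v1",
  "kind": "Binding",
  "metadata": { "name": "nginx-65f88748fd-abcde" },
  "target": { "apiVersion": "v1", "kind": "Node", "name": "ip-10-x-y-123" }
}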
@lbernail @roboll_
Actually a lot more
[Sequence diagram: user, apiserver, Scheduler, Deploy ctrl, RS controller, node 1, node 2]
Creation of Deployment, RS, pods, binding + nodes accept
Nodes create pods and generate events
Pods are running
Deployments/RS update status
Additions:
Apiserver verifies quotas
Components also get/list
Creation of events + update of status fields
Complete node workflow: ~3s
@lbernail @roboll_
Node API calls
Initial calls
Get pod
Update pod/status: ContainerCreating
Get service account token
@lbernail @roboll_
Node API calls
Create events to show progression
MountVolume.SetUp succeeded for volume "default-token-dqmzw"
pulling image "nginx"
Successfully pulled image "nginx"
Created container
Started container
@lbernail @roboll_
Node API calls
Finalize pod creation
Get pod
Update container status to “Running”
@lbernail @roboll_
Takeaways
● A lot of interactions between kube components
● Audit logs give a great understanding of this!
● Events are spiky and generate a lot of logs
● Events have a 1h default TTL, but stay in audit logs
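That 1h TTL is an apiserver setting (upstream default shown; a sketch, not our config):

kube-apiserver --event-ttl=1h0m0s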
Let’s identify some problems using audit logs
@lbernail @roboll_
Outline
1. Background: The Kubernetes API
2. Audit Logs
3. Configuring Audit Logs
4. 10000 foot view for a large cluster
5. Understanding Kubernetes Internals
6. Troubleshooting examples
Troubleshooting
examples
@lbernail @roboll_
Troubleshooting
● Understand what happened
○ “Why was a resource deleted?”
● Debug performance regressions / improve performance
○ “Which application is responsible for so many calls?”
● Also, identify issues by looking at HTTP status codes
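On raw files, this kind of investigation again starts with a one-liner (a sketch assuming jq): keep events with a 4xx response and count them by code and user:

jq -r 'select(.responseStatus.code >= 400 and .responseStatus.code < 500)
       | "\(.responseStatus.code) \(.user.username)"' audit.log | sort | uniq -c | sort -rn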
@lbernail @roboll_
Status codes
Calls by status code
200 201 401
@lbernail @roboll_
4xx only
4xx by status code
401 403 404 422 409
@lbernail @roboll_
4xx only
4xx by status code
401?? (Unauthorized)
@lbernail @roboll_
Analyzing 401s
401s by source IP
Only 4 nodes; it turns out they had expired certificates
@lbernail @roboll_
4xx only
4xx by status code
403?? (Forbidden)
@lbernail @roboll_
Analyzing 403s
403s by user
Most errors from Traefik serviceAccount
Bad RBAC configuration?
@lbernail @roboll_
Analyzing 403s for this user
403s for Traefik serviceAccount by resource
We use Traefik without Kubernetes secrets but it still tries to list them
@lbernail @roboll_
4xx only
4xx by status code
422?? (Invalid)
@lbernail @roboll_
What about 422?
422 by user
kube-system:generic-garbage-collector => ??
@lbernail @roboll_
What is failing?
generic-garbage-collector by verb / resource
patch on pod
@lbernail @roboll_
What is happening?
● Pods are in “Evicted” status and must be kept
● Controlling RS has been deleted and pods should be orphaned
● Garbage collector fails to orphan them (remove ownerRef)
● ReplicaSet has been Terminating for 2 months...
● Root cause: mutating webhook
○ We modify the pod spec at creation
○ Mutating webhook is registered on pods for CREATE/UPDATE
○ We modify immutable fields
○ Garbage collector patch triggers this...
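For context, a webhook registered like this also intercepts the garbage collector’s PATCH, because patches show up as UPDATE operations at admission time (a hedged sketch of a MutatingWebhookConfiguration; all names are illustrative):

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: pod-mutator          # illustrative
webhooks:
  - name: pod-mutator.example.com
    clientConfig:
      service:
        name: pod-mutator
        namespace: kube-system
        path: /mutate
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]   # UPDATE also matches PATCH requests
        resources: ["pods"]
    admissionReviewVersions: ["v1"]
    sideEffects: None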
@lbernail @roboll_
Takeaways
● Looking at calls triggering HTTP errors helps find issues
○ Misconfigured RBAC (403)
○ Applications making calls they shouldn’t (403)
○ Expired certificates (401)
○ Other weird things
Conclusion
@lbernail @roboll_
Conclusion
● Audit logs can be incredibly valuable
○ Low-level understanding of Kubernetes
○ Detection of misconfigurations
○ Troubleshooting of issues
○ Identification of performance issues
● Taking advantage of them requires some effort
○ Policies are not easy to get right
○ They are verbose and require a tool to analyze them
Thank you
We’re hiring!
Visit our Kubecon booth
or https://www.datadoghq.com/careers/
Or contact us directly:
@lbernail laurent@datadoghq.com
@roboll_ roboll@datadoghq.com