Pod DNS Error
Introduction¶
- Pod-dns-error injects chaos to disrupt DNS resolution in Kubernetes pods.
- It causes loss of access to services by blocking DNS resolution of hostnames/domains.
Scenario: DNS error for the target pod
Uses¶
View the uses of the experiment
coming soon
Prerequisites¶
Verify the prerequisites
- Ensure that Kubernetes Version > 1.16
- Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
- Ensure that the pod-dns-error experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
Default Validations¶
View the default validations
The application pods should be in running state before and after chaos injection.
Minimal RBAC configuration example (optional)¶
NOTE
If you are using this experiment as part of a Litmus workflow constructed, scheduled, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.
View the Minimal RBAC permissions
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-dns-error-sa
  namespace: default
  labels:
    name: pod-dns-error-sa
    app.kubernetes.io/part-of: litmus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-dns-error-sa
  namespace: default
  labels:
    name: pod-dns-error-sa
    app.kubernetes.io/part-of: litmus
rules:
  # Create and monitor the experiment & helper pods
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create","delete","get","list","patch","update","deletecollection"]
  # Performs CRUD operations on the events inside chaosengine and chaosresult
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create","get","list","patch","update"]
  # Fetch configmap details and mount them to the experiment pod (if specified)
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get","list"]
  # Track and get the runner, experiment, and helper pod logs
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get","list","watch"]
  # for creating and managing the execution of commands inside the target container
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["get","list","create"]
  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})
  - apiGroups: ["apps"]
    resources: ["deployments","statefulsets","replicasets","daemonsets"]
    verbs: ["list","get"]
  # deriving the parent/owner details of the pod (if parent is deploymentConfig)
  - apiGroups: ["apps.openshift.io"]
    resources: ["deploymentconfigs"]
    verbs: ["list","get"]
  # deriving the parent/owner details of the pod (if parent is deploymentConfig)
  - apiGroups: [""]
    resources: ["replicationcontrollers"]
    verbs: ["get","list"]
  # deriving the parent/owner details of the pod (if parent is argo-rollouts)
  - apiGroups: ["argoproj.io"]
    resources: ["rollouts"]
    verbs: ["list","get"]
  # for configuring and monitoring the experiment job by the chaos-runner pod
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create","list","get","delete","deletecollection"]
  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow
  - apiGroups: ["litmuschaos.io"]
    resources: ["chaosengines","chaosexperiments","chaosresults"]
    verbs: ["create","list","get","patch","update","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-dns-error-sa
  namespace: default
  labels:
    name: pod-dns-error-sa
    app.kubernetes.io/part-of: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-dns-error-sa
subjects:
- kind: ServiceAccount
  name: pod-dns-error-sa
  namespace: default
Experiment tunables¶
Check the experiment tunables
Optional Fields
Variables | Description | Notes |
---|---|---|
TARGET_CONTAINER | Name of the container subjected to the DNS error | None |
TOTAL_CHAOS_DURATION | Duration of the chaos injection (in seconds) | Defaults to 60s |
TARGET_HOSTNAMES | List of the target hostnames or keywords, e.g. '["litmuschaos"]' | If not provided, all hostnames/domains will be targeted |
MATCH_SCHEME | Determines whether the DNS query has to match exactly with one of the targets or can have any of the targets as a substring. Can be either exact or substring | If not provided, it will be set to exact |
PODS_AFFECTED_PERC | Percentage of total pods to target | Defaults to 0 (corresponds to 1 replica); provide a numeric value only |
CONTAINER_RUNTIME | Container runtime interface for the cluster | Defaults to containerd, supported values: docker |
SOCKET_PATH | Path of the container runtime socket file | Defaults to /run/containerd/containerd.sock |
LIB | The chaos lib used to inject the chaos | Default value: litmus, supported values: litmus |
LIB_IMAGE | Image used to run the netem command | Defaults to litmuschaos/go-runner:latest |
RAMP_TIME | Period to wait before and after injection of chaos (in seconds) | |
SEQUENCE | Defines the sequence of chaos execution for multiple target pods | Default value: parallel. Supported: serial, parallel |
Experiment Examples¶
Common and Pod specific tunables¶
Refer to the common attributes and pod-specific tunables to tune the tunables common to all experiments and those specific to pod-level experiments.
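As a sketch of how the pod-specific tunables listed in the table above plug into a ChaosEngine, the manifest below sets PODS_AFFECTED_PERC and SEQUENCE. The engine name and application labels are carried over from the other examples on this page, and the value 50 is an illustrative assumption, not a recommended setting.

```yaml
# illustrative sketch: target a percentage of the matching pods, one at a time
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        # percentage of total pods to target (numeric value only)
        # 50 is an assumed value for illustration
        - name: PODS_AFFECTED_PERC
          value: '50'
        # supported values: serial, parallel (default: parallel)
        - name: SEQUENCE
          value: 'serial'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```

Leaving PODS_AFFECTED_PERC unset keeps the default of 0, which per the table corresponds to a single replica.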
Target Host Names¶
It defines the comma-separated list of target hostnames subjected to chaos. It can be tuned with the TARGET_HOSTNAMES ENV. If TARGET_HOSTNAMES is not provided, all hostnames/domains will be targeted.
Use the following example to tune this:
# contains the target host names for the dns error
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        ## comma separated list of host names
        ## if not provided, all hostnames/domains will be targeted
        - name: TARGET_HOSTNAMES
          value: '["litmuschaos"]'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
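Because the value is a bracketed, comma-separated list, more than one entry should be expressible in the same form. The sketch below assumes that multi-entry lists are accepted and also sets TARGET_CONTAINER; the second hostname (example.com) and the container name (nginx) are hypothetical and chosen only for illustration.

```yaml
# illustrative sketch: several target host names, restricted to one container
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        # hypothetical entries; replace with the hostnames your application resolves
        - name: TARGET_HOSTNAMES
          value: '["litmuschaos","example.com"]'
        # hypothetical container name inside the target pod
        - name: TARGET_CONTAINER
          value: 'nginx'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```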
Match Scheme¶
It determines whether the DNS query has to match exactly with one of the targets or can have any of the targets as a substring. It can be tuned with the MATCH_SCHEME ENV. It supports exact or substring values.
Use the following example to tune this:
# contains match scheme for the dns error
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        ## it supports 'exact' and 'substring' values
        - name: MATCH_SCHEME
          value: 'exact'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
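MATCH_SCHEME only has a visible effect when targets are supplied, so in practice it is paired with TARGET_HOSTNAMES. The sketch below reuses the litmuschaos keyword from the earlier example and switches to substring, so that, going by the description above, any DNS query containing that keyword would be affected rather than only a query that equals it exactly.

```yaml
# illustrative sketch: substring matching against a target keyword
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        # keyword to look for in DNS queries
        - name: TARGET_HOSTNAMES
          value: '["litmuschaos"]'
        # 'substring': a query matches if it contains any target
        # 'exact' (default): a query matches only if it equals a target
        - name: MATCH_SCHEME
          value: 'substring'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```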
Container Runtime Socket Path¶
It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.
- CONTAINER_RUNTIME: It supports the docker runtime only.
- SOCKET_PATH: It contains the path of the socket file, which defaults to /run/containerd/containerd.sock.
Use the following example to tune this:
## provide the container runtime and socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        # runtime for the container
        # supports docker
        - name: CONTAINER_RUNTIME
          value: 'containerd'
        # path of the socket file
        - name: SOCKET_PATH
          value: '/run/containerd/containerd.sock'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
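For clusters whose nodes run Docker, the same two ENVs can point at the Docker socket instead. The sketch below assumes the daemon listens on /var/run/docker.sock, which is Docker's conventional default but is not stated on this page; confirm the path on your nodes before using it.

```yaml
# illustrative sketch: docker runtime with an assumed socket path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-dns-error-sa
  experiments:
  - name: pod-dns-error
    spec:
      components:
        env:
        # runtime for the container
        - name: CONTAINER_RUNTIME
          value: 'docker'
        # assumed location of the docker socket; verify on your nodes
        - name: SOCKET_PATH
          value: '/var/run/docker.sock'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```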