Taint, Toleration, and Node affinity related issues in cpServer
If any of the following cpServer control pool pods is in the Pending state, perform the steps that follow:
flexsnap-agent, flexsnap-api-gateway, flexsnap-certauth, flexsnap-coordinator, flexsnap-idm, flexsnap-nginx, flexsnap-notification, flexsnap-scheduler, flexsnap-listener, flexsnap-postgresql, flexsnap-rabbitmq, flexsnap-fluentd-, flexsnap-fluentd
Obtain the pending pod's tolerations and node affinity using the following command (the -o yaml output includes the spec.tolerations and spec.affinity fields):
kubectl get pods <pod name> -n <namespace> -o yaml
Check whether the node affinity and tolerations of the pod match:
The fields listed in or in the cloudscale-values.yaml file.
The taint and label of the node pool, mentioned in or in the cloudscale-values.yaml file.
If all of the above fields are correct and match, and the control pool pod is still in the Pending state, the nodes in the node pool may all be running at maximum capacity and unable to accommodate new pods. In that case, scale the node pool so that it has enough capacity.
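The comparison described above can be sketched with two small shell helpers. This is a minimal sketch, not part of the product: the function names pod_scheduling_fields and node_taints_and_labels are hypothetical, and it assumes that kubectl is configured for the cluster and that you know the namespace the pod runs in.

```shell
# Hypothetical helpers; names and the namespace argument are assumptions.

# Print only the tolerations and node affinity of one pod.
pod_scheduling_fields() {
  local pod="$1" namespace="$2"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='tolerations: {.spec.tolerations}{"\n"}nodeAffinity: {.spec.affinity.nodeAffinity}{"\n"}'
}

# Print every node's name, taints, and labels for side-by-side comparison.
node_taints_and_labels() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\ttaints: "}{.spec.taints}{"\tlabels: "}{.metadata.labels}{"\n"}{end}'
}

# Usage:
#   pod_scheduling_fields <pod name> <namespace>
#   node_taints_and_labels
```

A pod can be scheduled only on a node whose taints are all tolerated by the pod and whose labels satisfy the pod's required node affinity, so any key, value, or effect that differs between the two outputs is a likely cause of the Pending state.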
If any of the following cpServer data pool pods is in the Pending state, perform the steps that follow:
flexsnap-workflow, flexsnap-datamover
Obtain the pending pod's tolerations and node affinity using the following command (the -o yaml output includes the spec.tolerations and spec.affinity fields):
kubectl get pods <pod name> -n <namespace> -o yaml
Check whether the node affinity and tolerations of the pod match:
The fields listed in in the environment.yaml file.
The taint and label of the node pool, mentioned in in the cloudscale-values.yaml file.
If all of the above fields are correct and match, and the data pool pod is still in the Pending state, the nodes in the node pool may all be running at maximum capacity and unable to accommodate new pods. In that case, scale the node pool so that it has enough capacity.
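To tell a capacity problem apart from a taint or affinity mismatch, the scheduler's own events can be inspected before scaling. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the namespace argument is an assumption, and the scaling helper applies to AKS only and assumes that the node pool is scaled manually rather than by the cluster autoscaler.

```shell
# Hypothetical helpers; names and arguments are assumptions.

# Show the FailedScheduling events recorded for a pending pod.
pending_reason() {
  local pod="$1" namespace="$2"
  kubectl get events -n "$namespace" \
    --field-selector "involvedObject.name=${pod},reason=FailedScheduling"
}

# Scale an AKS node pool to the given node count.
scale_aks_nodepool() {
  local resource_group="$1" cluster="$2" nodepool="$3" count="$4"
  az aks nodepool scale \
    --resource-group "$resource_group" \
    --cluster-name "$cluster" \
    --name "$nodepool" \
    --node-count "$count"
}

# Usage:
#   pending_reason <pod name> <namespace>
#   scale_aks_nodepool <resource_group> <cluster_name> <nodepool_name> <node_count>
```

An event message that reports insufficient CPU or memory points to capacity, whereas a message about untolerated taints or unmatched node affinity points back to the fields checked in the previous step.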
If the nodes are configured with incorrect taint or label values, edit them using the following commands, shown here for AKS as an example:
az aks nodepool update \
  --resource-group <resource_group> \
  --cluster-name <cluster_name> \
  --name <nodepool_name> \
  --node-taints <key>=<value>:<effect> \
  --no-wait
az aks nodepool update \
  --resource-group <resource_group> \
  --cluster-name <cluster_name> \
  --name <nodepool_name> \
  --labels <key>=<value>
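After the update, the result can be checked from the Kubernetes side. The helper below is a minimal sketch with a hypothetical name (nodes_with_label). It assumes kubectl access to the cluster and takes the label in the same <key>=<value> form used above. Because --no-wait returns before the operation finishes, the new values may take some time to appear.

```shell
# Hypothetical helper; the name is an assumption.

# List the nodes that carry a given label, together with their taints.
nodes_with_label() {
  local label="$1"   # label selector in the form <key>=<value>
  kubectl get nodes -l "$label" \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
}

# Usage:
#   nodes_with_label <key>=<value>
```

If the command lists no nodes, the label has not been applied to the node pool yet. If nodes are listed but the TAINTS column differs from the pod's tolerations, the taint still needs to be corrected.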