Evicting pods using the descheduler
Installation steps
# Add the repo
helm repo add descheduler https://kubernetes-sigs.github.io/descheduler/
# List the configured repos
helm repo list
# Dump the chart's default values to a file
helm show values descheduler/descheduler > ./values.yaml
### Edit values.yaml as needed (for example with vi) ###
# Install with the customized values
helm install descheduler -f ./values.yaml --namespace kube-system descheduler/descheduler
# A pod in Completed status means the CronJob run has finished
kubectl get pod -n kube-system
# Inspect the CronJob
kubectl get cronjobs -n kube-system
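While the release is installed, it can be convenient to start a run by hand instead of waiting for the CronJob schedule. The following is a minimal sketch, assuming the CronJob is named "descheduler" (matching the Helm release name used here); the function name and the Job name "descheduler-manual" are hypothetical.

```shell
# Sketch: start a one-off Job from the descheduler CronJob.
# Assumption: the CronJob is called "descheduler"; check the real name
# with `kubectl get cronjobs -n kube-system` first.
trigger_descheduler_run() {
  ns="${1:-kube-system}"
  cronjob="${2:-descheduler}"
  job="${3:-descheduler-manual}"   # hypothetical Job name
  kubectl create job "$job" --from="cronjob/$cronjob" -n "$ns"
}
```

Calling `trigger_descheduler_run` with no arguments uses the defaults above; the resulting pod then shows up in `kubectl get pod -n kube-system`.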
# To remove the release, or to clean up after testing
helm uninstall descheduler -n kube-system
Source: GitHub (with examples)
Contents of values.yaml
Verifying the "RemovePodsHavingTooManyRestarts" plugin
Create an unstable pod that restarts often enough to reach the restart threshold
# Reference: https://www.hwchiu.com/docs/2023/descheduler
apiVersion: apps/v1
kind: Deployment
metadata:
  name: www-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: www
  template:
    metadata:
      labels:
        app: www
    spec:
      containers:
        - name: www-server
          image: hwchiu/python-example
          # 'no' is not a valid command, so the container exits and keeps restarting
          command: ['sh', '-c', 'date && no']
          ports:
            - containerPort: 5000
              protocol: "TCP"
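# To remove the Deployment, or to clean up after testing
kubectl delete deployment www-deployment -n default
deschedulerPolicy section
Set the restart threshold to 3 with "podRestartThreshold: 3"
Enable the "RemovePodsHavingTooManyRestarts" plugin under "deschedule"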
Observe the pod status
After the fourth restart, the old pod is evicted and a new pod is created
kubectl get pod -n <namespace> -w
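To see which pods have already reached the threshold before the next descheduler run, the restart counts can be read directly. This is a rough sketch: the function name is hypothetical, and it only looks at the first container of each pod, whereas the plugin adds up restarts across containers.

```shell
# Sketch: list pods whose first container has restarted at least N times.
# Assumption: single-container pods (as in www-deployment); multi-container
# pods would need the restart counts summed.
pods_over_threshold() {
  ns="$1"          # namespace to inspect
  threshold="$2"   # e.g. 3, matching podRestartThreshold
  kubectl get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[0].restartCount}{"\n"}{end}' |
    awk -v t="$threshold" '$2 >= t { print $1 }'
}
```

For example, `pods_over_threshold default 3` prints the names of pods in the default namespace that are candidates for eviction under the setting used here.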
Check the logs
The logs show which rule triggered the balancing or descheduling
kubectl logs -f <descheduler-pod-name> -n kube-system
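The descheduler log can be noisy, so a small filter helps when looking for the rule behind an eviction. This is only a sketch: it assumes eviction messages contain the word "evict" in some form, which may vary between descheduler versions, and the function name is hypothetical.

```shell
# Sketch: keep only log lines that mention an eviction, optionally
# narrowed to one plugin name.
# Assumption: eviction messages contain "evict" (any case).
eviction_lines() {
  plugin="${1:-}"   # optional plugin name, e.g. RemovePodsHavingTooManyRestarts
  grep -i 'evict' | grep -- "$plugin"
}
```

Usage: `kubectl logs <descheduler-pod-name> -n kube-system | eviction_lines RemovePodsHavingTooManyRestarts`.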
Try adding other plugins and verify them
# Show node labels
kubectl get nodes --show-labels
# List the pods running on a given node
kubectl get pods -n <namespace> --field-selector spec.nodeName=<node-name> -o wide
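When checking whether a balance plugin has moved anything, a per-node pod count taken before and after a run is easier to compare than the full listing. A minimal sketch follows; the function name is hypothetical.

```shell
# Sketch: count pods per node in one namespace.
# Pods that are not scheduled yet (empty nodeName) are skipped.
pods_per_node() {
  ns="$1"
  kubectl get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' |
    grep . | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Each output line is a node name followed by its pod count, so two runs can be compared with `diff`.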
deschedulerPolicy:
  nodeSelector: "<node-label-1>,<node-label-2>"
  ignorePvcPods: true
  profiles:
    - name: default
      pluginConfig:
        - name: DefaultEvictor
          args:
            ignorePvcPods: true
            evictLocalStoragePods: true
            evictSystemCriticalPods: false
            nodeFit: true
        # - name: RemoveDuplicates
        - name: RemovePodsHavingTooManyRestarts
          args:
            podRestartThreshold: 3
            includingInitContainers: true
        - name: RemovePodsViolatingNodeAffinity
          args:
            namespaces:
              include:
                - <namespace-1>
                - <namespace-2>
            nodeAffinityType:
              - requiredDuringSchedulingIgnoredDuringExecution
              # - preferredDuringSchedulingIgnoredDuringExecution
        - name: LowNodeUtilization
          args:
            thresholds:
              cpu: 20
              memory: 20
              pods: 20
            targetThresholds:
              cpu: 50
              memory: 50
              pods: 50
            evictableNamespaces:
              exclude:
                - "kube-system"
                - "kube-flannel"
      plugins:
        balance:
          enabled:
            # - RemoveDuplicates
            # - RemovePodsViolatingNodeAffinity
            # - RemovePodsViolatingTopologySpreadConstraint
            - LowNodeUtilization
        deschedule:
          enabled:
            - RemovePodsHavingTooManyRestarts
            # - RemovePodsViolatingNodeTaints
            - RemovePodsViolatingNodeAffinity
            # - RemovePodsViolatingInterPodAntiAffinity
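The LowNodeUtilization values above (20 and 50) are easier to reason about with a worked example. The sketch below mirrors the documented rule: a node counts as underutilized only when cpu, memory and pods are all below thresholds, and as overutilized when any one of them is above targetThresholds. The inputs are percentages, which the descheduler computes from pod resource requests by default rather than from live usage. The function name is hypothetical, and the handling of values exactly at 20 or 50 is not modelled precisely.

```shell
# Sketch: classify a node the way LowNodeUtilization does, using the
# thresholds (20) and targetThresholds (50) from the policy above.
classify_node() {
  cpu="$1"; mem="$2"; pods="$3"   # utilization percentages (integers)
  low=20                          # thresholds
  high=50                         # targetThresholds
  if [ "$cpu" -lt "$low" ] && [ "$mem" -lt "$low" ] && [ "$pods" -lt "$low" ]; then
    echo "underutilized"            # below thresholds for ALL resources
  elif [ "$cpu" -gt "$high" ] || [ "$mem" -gt "$high" ] || [ "$pods" -gt "$high" ]; then
    echo "overutilized"             # above targetThresholds for ANY resource
  else
    echo "appropriately utilized"   # neither evicted from nor targeted
  fi
}
```

For instance, `classify_node 10 15 5` reports an underutilized node, while `classify_node 60 10 10` reports an overutilized one. Pods are only evicted from overutilized nodes, and only when underutilized nodes exist to receive them.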
Afterword:
To make pods run on specific nodes, you still have to set nodeSelector or affinity in deployment.yaml; the descheduler only evicts pods and does not decide where they are placed.
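As an illustration of doing the placement on the Deployment side, a nodeSelector can be added with `kubectl patch` instead of editing the YAML by hand. This is a sketch only: the function name is hypothetical, and the label key and value are placeholders the caller must supply.

```shell
# Sketch: pin a Deployment's pods to nodes carrying a given label.
# Assumption: the target nodes already have the label key=value.
pin_deployment_to_label() {
  deploy="$1"; key="$2"; value="$3"; ns="${4:-default}"
  kubectl patch deployment "$deploy" -n "$ns" --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"$key\":\"$value\"}}}}}"
}
```

Usage, with a hypothetical label: `pin_deployment_to_label www-deployment disktype ssd`. The patch changes the pod template, so the Deployment rolls out new pods onto matching nodes.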
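Troubleshooting:
- Some rules conflict with each other and cannot all be enabled at once.
Example: node affinity and node taints, i.e. "RemovePodsViolatingNodeAffinity" and "RemovePodsViolatingNodeTaints".
Example: enable only the "RemovePodsHavingTooManyRestarts" and "RemovePodsViolatingNodeAffinity" deschedule plugins.
- Parameter "***" is not supported
The policy configuration does not accept the "ignorePvcPods" and "namespaces" parameters where they were set.
(Removing the parameter resolves the error.)
(The official documentation lists which args each plugin supports.)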
E0604 03:06:05.116645 1 run.go:74] "command failed" err="failed decoding descheduler's policy config \"/policy-dir/policy.yaml\": strict decoding error: unknown field \"ignorePvcPods\", unknown field \"namespaces.excludeNamespaces\", unknown field \"namespaces.includeNamespaces\""
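When the policy lists many plugins, pulling the rejected field names out of a strict decoding error like the one above makes it quicker to see what to remove. The sketch below relies only on the `unknown field \"...\"` pattern visible in that log line; the function name is hypothetical.

```shell
# Sketch: print each field name reported as "unknown field" in the
# descheduler log (read from stdin), one per line, without duplicates.
unknown_fields() {
  grep -o 'unknown field \\"[^\\]*\\"' |
    sed 's/unknown field \\"//; s/\\"$//' |
    sort -u
}
```

Usage: `kubectl logs <descheduler-pod-name> -n kube-system | unknown_fields`. For the error shown here it lists ignorePvcPods, namespaces.excludeNamespaces and namespaces.includeNamespaces.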
References: