The deadly 5-second DNS latency in a Kubernetes cluster
Original article:
https://typonotes.com/posts/2023/08/05/k8s-dns-5s-resolv/
Problem cause
Related articles
- kubernetes集群中夺命的5秒DNS延迟 ("The deadly 5-second DNS latency in a Kubernetes cluster")
- 破案:Kubernetes/Docker 上无法解释的连接超时 ("Solving the case: unexplainable connection timeouts on Kubernetes/Docker")
Martynas Pumputis, an engineer at Weaveworks, analyzed this problem in detail:
https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
+ `conntrack`: http://people.netfilter.org/pablo/docs/login.pdf
Root cause
The DNS client (glibc or musl libc) sends the A and AAAA queries concurrently. To talk to the DNS server it first calls connect() to set up a file descriptor, and the query packets are then sent through that fd. Since UDP is a connectionless protocol, connect() does not actually transmit anything, so no conntrack entry is created at that point. By default the concurrent A and AAAA queries go out over the same fd, so both packets carry the same source port (they come from the same socket). When they are sent at essentially the same moment, neither has been inserted into the conntrack table yet, so netfilter creates a separate conntrack entry for each of them. Inside the cluster, requests to kube-dns / CoreDNS target the CLUSTER-IP, and the packets are ultimately DNAT-ed to the Pod IP of one endpoint. If the two packets happen to be DNAT-ed to the same Pod IP, their 5-tuples become identical, and when the later entry is finally inserted it conflicts and that packet is dropped. With only one DNS Pod replica this happens very easily (every packet is DNAT-ed to the same Pod IP). The symptom is a DNS request timeout: the client's default behaviour is to wait 5 s and retry, and if the retry succeeds, what we observe is a DNS lookup with a 5-second delay.
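To make the collision concrete, here is a small illustration (not a manifest; the Pod IPs and source port are made up, the 10.66.0.2 CLUSTER-IP matches the Service in the manifest at the end of this post) of the two UDP queries before and after DNAT:

```yaml
# Illustration only: both queries leave the same socket, so they share the
# source IP:port. Addresses below are made-up examples.
before_dnat:
  a_query:    { src: "10.0.1.15:50000", dst: "10.66.0.2:53" }   # A    record -> CLUSTER-IP
  aaaa_query: { src: "10.0.1.15:50000", dst: "10.66.0.2:53" }   # AAAA record -> CLUSTER-IP
after_dnat:
  a_query:    { src: "10.0.1.15:50000", dst: "10.0.2.8:53" }    # DNAT to the only CoreDNS Pod
  aaaa_query: { src: "10.0.1.15:50000", dst: "10.0.2.8:53" }    # identical 5-tuple: the later
                                                                # conntrack insert fails, packet dropped
```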
Key points
1. Parallel A and AAAA queries
2. Concurrent conntrack insertion conflict
Solutions
Avoid concurrent resolution with Pod dnsConfig options
https://github.com/Azure/AKS/issues/667
https://studygolang.com/articles/25303
```yaml
dnsConfig:
  options:
    - name: single-request-reopen
```
https://github.com/Azure/AKS/issues/667#issuecomment-425821085
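For orientation, a minimal sketch of where this fragment sits in a Pod spec; the Pod name and image are placeholders. The option is rendered into the container's /etc/resolv.conf, and `single-request-reopen` (like the related `single-request`) changes how glibc shares a socket between the A and AAAA lookups, sidestepping the parallel-packet race described above. Note it is a glibc option: musl-based images such as Alpine ignore it.

```yaml
# Hypothetical Pod: only the dnsConfig block matters here.
apiVersion: v1
kind: Pod
metadata:
  name: dns-options-demo      # placeholder name
spec:
  containers:
    - name: app
      image: nginx             # placeholder image (glibc-based)
  dnsConfig:
    options:
      - name: single-request-reopen
```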
Disable AAAA (IPv6) resolution in CoreDNS
1. [k8s与dns--coredns的一些实战经验 - kubernetes solutions - SegmentFault 思否](https://segmentfault.com/a/1190000020403096)
2. [coredns plugin template](https://coredns.io/plugins/template/)
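A sketch of the corresponding Corefile fragment, shown as it would sit in the coredns ConfigMap (the same block appears, commented out, in the full manifest at the end of this post). The template plugin answers every AAAA query locally, so clients stop waiting for a second reply. The rcode below mirrors that manifest (NXDOMAIN); the upstream plugin documentation uses NOERROR (an empty answer) for the same purpose.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        # Intercept IPv6 (AAAA) queries and answer them immediately
        # instead of forwarding them upstream.
        template ANY AAAA {
            rcode NXDOMAIN
        }
        # ... the rest of the server block stays unchanged ...
    }
```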
Use NodeLocal DNSCache for node-local caching
1. [设置 NodeLocal DNSCache | Kubernetes Engine 文档 | Google Cloud](https://cloud.google.com/kubernetes-engine/docs/how-to/nodelocal-dns-cache?hl=zh-cn)
2. https://cloud.google.com/kubernetes-engine/docs/concepts/service-discovery?hl=zh-cn
3. [在 Kubernetes 集群中使用 NodeLocal DNSCache-阳明的博客|Kubernetes|Istio|Prometheus|Python|Golang|云原生](https://www.qikqiak.com/post/use-nodelocal-dns-cache/)
+ That is not quite the end of it. If kube-proxy runs in ipvs mode, we also have to change the kubelet's `--cluster-dns` flag to point at 169.254.20.10 (see the sketch below); the DaemonSet creates a dummy interface on every node bound to that IP, Pods send their DNS queries to this node-local address, and only on a cache miss is the query proxied to the upstream cluster DNS. In iptables mode Pods keep querying the original cluster DNS IP; since the node also listens on that IP, the traffic is intercepted locally and then forwarded to the upstream cluster DNS, so `--cluster-dns` does not need to change.
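For the ipvs case, a minimal sketch of the kubelet change, assuming the kubelet is driven by a KubeletConfiguration file (clusters that still pass `--cluster-dns` on the command line would change that flag instead); 169.254.20.10 is the link-local address the node-local-dns DaemonSet binds on every node:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Point Pods at the node-local cache instead of the cluster DNS Service IP.
clusterDNS:
  - 169.254.20.10
clusterDomain: cluster.local
```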
For reference, the complete coredns.yaml:
```yaml
# __MACHINE_GENERATED_WARNING__
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: Reconcile
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: EnsureExists
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    pek3.qingcloud.com:53 {
        forward . 100.64.9.5 100.64.9.9
        # log
        # errors
        # cache 300   ## https://coredns.io/plugins/cache/
        # cancel 1s   ## https://coredns.io/plugins/cancel/
        #### working
        # loop
        # reload 5s
    }
    qingstor.com:53 {
        forward . 100.64.9.5 100.64.9.9
        # log
        # errors
        # loop
        # reload 5s
    }
    .:53 {
        # https://coredns.io/plugins/template/
        ## Answer IPv6 (AAAA) lookups immediately with "record not found".
        ## NXDOMAIN means the lookup succeeded but there is no record.
        ## Since alpine 3.13, IPv6 is preferred by default, which can cause resolution errors.
        # template ANY AAAA {
        #     rcode NXDOMAIN
        # }
        #### debug
        # whoami
        # errors
        # log . {"local":"{local}","client":"{remote}:{port}","id":"{>id}","type":"{type}","class":"{class}","name":"{name}","proto":"{proto}","size":{size},"do":"{>do}","bufsize":{>bufsize},"rflags":"{>rflags}","rsize":{rsize},"duration":"{duration}","rcode":"{rcode}"}
        # log
        #### working
        # loop
        reload 5s
        #### health check and performance
        health
        ready
        prometheus :9153
        kubernetes cluster.local. in-addr.arpa ip6.arpa {
            # https://coredns.io/plugins/kubernetes/
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
            ttl 1
        }
        # hosts in line
        # specify global hosts for containers
        # hosts {
        #     1.1.1.1 a.example.com
        #     2.2.2.2 b.example.com
        #     fallthrough   # must be kept at the bottom
        # }
        #### resolver
        # On ubuntu 18.04, forwarding to /etc/resolv.conf breaks resolution
        ## because its nameserver is the systemd-resolved stub 127.0.0.53
        forward . 114.114.114.114
        # forward . /etc/resolv.conf
        # loadbalance   ## https://coredns.io/plugins/loadbalance/
        #### cache
        cache 60    ## https://coredns.io/plugins/cache/
        cancel 1s   ## https://coredns.io/plugins/cancel/
    }
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        beta.kubernetes.io/os: linux
      containers:
      - name: coredns
        image: coredns/coredns:1.6.2
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 300Mi
          requests:
            cpu: 300m
            memory: 300Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.66.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
```
Other articles
- 记一次持续三个月的 K8s DNS 排障过程 ("A K8s DNS troubleshooting effort that lasted three months"): the transcript of a talk given by 刘梦馨, an expert engineer at 灵雀云 (Alauda), at the 蓝鲸 X DeepFlow Observability Meetup.