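Manually fixing issue #78421 in Kubernetes v1.13-v1.15
Background
In 2019 we started splitting our services into microservices and gradually adopted Kubernetes. We were running version 1.14.6 at the time, with kube-proxy in ipvs mode for better performance. We found that after a Service was deleted, the kube-proxy log reported that the ipvs rules had been removed, yet running ipvsadm -Ln on the node showed that the rules were still there. A quick search showed that others in the community had hit the same problem. The related issue is https://github.com/kubernetes/kubernetes/issues/78421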
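To make the check described above repeatable, the ipvsadm -Ln output can be filtered for a specific virtual service. The following is a minimal sketch: has_virtual_service is a hypothetical helper name, and the address in the usage comment is only a placeholder, not a value from our cluster.

```shell
# Reports whether a virtual service (VIP:port) appears in `ipvsadm -Ln`
# output supplied on stdin. Hypothetical helper, not part of Kubernetes.
has_virtual_service() {
  local vip_port="$1"
  # Virtual-service lines look like "TCP  10.96.0.10:53 rr".
  # Real-server lines start with "->" and never match the protocol test.
  awk -v target="$vip_port" '
    ($1 == "TCP" || $1 == "UDP" || $1 == "SCTP") && $2 == target { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# On a node (needs root and the ipvsadm package), after deleting a Service:
#   ipvsadm -Ln | has_virtual_service "10.96.0.10:53" && echo "rule still present"
```

The function only matches the virtual-service lines, so a backend pod address that happens to equal the argument does not count as a leftover rule.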
The issue shows that the problem exists in Kubernetes v1.13 through v1.15, and it links the fix PR for each version:
- v1.13
https://github.com/kubernetes/kubernetes/pull/81481/commits/5d6f542e1d6fd9b471f1b3e428ad4b09e6799460
- v1.14
https://github.com/kubernetes/kubernetes/pull/81482/commits/c92ecefd49b9a48b9868f2173e8c84b88a7816ed
- v1.15
https://github.com/kubernetes/kubernetes/pull/81483/commits/28ae6a4f7bf5994765ebc4a23bd01a1729f67b24
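However, these PRs were only merged into 1.13.11, 1.14.7, 1.15.4 and later releases, and we were running 1.14.6. That left two ways to resolve the issue:
- Upgrade Kubernetes to 1.14.7 or later
- Build Kubernetes 1.14.6 ourselves with the fix applied, and replace the affected components
We chose the second option. This post records the build process and also serves as a tutorial for compiling Kubernetes yourself.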
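Whether a given cluster version still needs the manual fix follows directly from the release numbers listed above. A small sketch of that comparison is shown here. The function names are hypothetical, and the cutoffs are the ones quoted in this post (1.13.11, 1.14.7, 1.15.4).

```shell
# First patch release that contains the fix, per minor line (from this post).
first_fixed_patch() {
  case "$1" in
    1.13) echo 11 ;;
    1.14) echo 7 ;;
    1.15) echo 4 ;;
    *) return 1 ;;
  esac
}

# Exit 0: the version predates the fix. Exit 1: the fix is already included.
# Exit 2: the version is outside the v1.13-v1.15 lines covered here.
needs_manual_patch() {
  local version="${1#v}"        # strip a leading "v", e.g. 1.14.6
  local minor="${version%.*}"   # e.g. 1.14
  local patch="${version##*.}"  # e.g. 6
  local fixed
  fixed="$(first_fixed_patch "$minor")" || return 2
  [ "$patch" -lt "$fixed" ]
}

# Example: needs_manual_patch v1.14.6 && echo "patch required"
```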
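Walkthrough
Requirements
- Operating system: CentOS 7.x
- At least 2 CPU cores and 8 GB of RAM
Getting the source
Fetch the source code, using v1.14.6 as the example
yum install git wget -y
Check the kube-cross tag. The Golang version that Kubernetes builds with corresponds to this tag with the trailing -1 removed
# cat /root/kubernetes/build/build-image/cross/VERSION
Note
Kubernetes v1.14.6 uses Golang 1.12.9. Different Kubernetes releases use different Golang versions, so the Golang version installed for the build must match the one specified in this file exactly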
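The mapping from the kube-cross tag to the Golang version is plain string trimming, which can be sketched as follows. go_version_from_cross_tag is a hypothetical helper name. The tag format (a leading v and a trailing -N build suffix) is taken from the v1.12.9-1 tag used in this post.

```shell
# Turns a kube-cross tag such as "v1.12.9-1" into the Go version "1.12.9".
go_version_from_cross_tag() {
  local tag="$1"
  tag="${tag#v}"     # drop the leading "v"
  tag="${tag%-*}"    # drop the trailing "-N" build suffix
  printf '%s\n' "$tag"
}

# In the checkout used here:
#   go_version_from_cross_tag "$(cat /root/kubernetes/build/build-image/cross/VERSION)"
```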
Installing Golang
Download the matching Golang release from https://golang.org/dl/ and extract it
tar zxf go1.12.9.linux-amd64.tar.gz -C /usr/local
Configure the Golang environment variables
cat >> /etc/profile << EOF
Verify that Golang was installed correctly
# go version
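Because the toolchain version has to match exactly, it can help to compare the go version output against the expected value instead of reading it by eye. This is a minimal sketch with hypothetical function names. It assumes the usual output format "go version go1.12.9 linux/amd64".

```shell
# Extracts the bare version number from `go version` output on stdin.
installed_go_version() {
  awk '$1 == "go" && $2 == "version" { sub(/^go/, "", $3); print $3; exit }'
}

# Succeeds when the version read from stdin equals the expected one.
go_matches() {
  [ "$(installed_go_version)" = "$1" ]
}

# Example:
#   go version | go_matches 1.12.9 || echo "unexpected Go toolchain" >&2
```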
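Installing Docker
Remove any existing Docker installation (if there is one)
yum remove docker \
Configure the docker-ce repository
# Install the required packages: yum-utils provides yum-config-manager, and device-mapper-persistent-data and lvm2 are required by the devicemapper storage driver
Note
If sites outside China are unreachable, use the Aliyun Docker repository instead
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
With the Docker repository configured, run the following command to install Docker v18.09.9
yum install docker-ce-18.09.9 docker-ce-cli-18.09.9 containerd.io -y
Start Docker and enable it at boot
systemctl start docker
Configure the Aliyun registry mirror (optional)
mkdir -p /etc/docker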
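Preparing for the build
Install patch
yum install patch -y
Run the command below to remove the dirty marker. Otherwise, because we modify the source after fetching it, the version string of the compiled binaries carries a -dirty suffix, for example v1.14.6-dirty
Note
Alternatively, skip the command below and run git add followed by git commit after applying the patch
cd /root/kubernetes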
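The following sketch illustrates where the -dirty suffix comes from. It assumes that the build scripts treat the tree as dirty whenever git status --porcelain reports anything, which is how hack/lib/version.sh appears to work in this release line. The function names are hypothetical and are not the ones Kubernetes uses.

```shell
# Input: the output of `git status --porcelain` (empty means no local changes).
tree_state() {
  if [ -z "$1" ]; then
    echo clean
  else
    echo dirty
  fi
}

# Appends "-dirty" to the base version when the tree has uncommitted changes.
version_string() {
  local base="$1" porcelain="$2"
  if [ "$(tree_state "$porcelain")" = "dirty" ]; then
    echo "${base}-dirty"
  else
    echo "$base"
  fi
}

# In the checkout used here:
#   version_string v1.14.6 "$(git -C /root/kubernetes status --porcelain)"
```

Under that assumption, committing the patched file also avoids the suffix, because nothing uncommitted is left for git status to report.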
Pull the kube-cross image. The image tag must match the one in /root/kubernetes/build/build-image/cross/VERSION
docker pull k8s.gcr.io/kube-cross:v1.12.9-1
Note
If the image cannot be pulled because of network problems, pull it from the Aliyun mirror and re-tag it
docker pull mirrorgooglecontainers/kube-cross:v1.12.9-1
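Applying the merged fix as a patch
The URL of the merged commit is:
https://github.com/kubernetes/kubernetes/pull/81482/commits/c92ecefd49b9a48b9868f2173e8c84b88a7816ed
Note
Remember to append .patch to the URL when running the following command
cd /root/kubernetes
The following output indicates that the patch was applied successfully
patching file pkg/proxy/ipvs/proxier.go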
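The .patch suffix mentioned above is what makes GitHub serve the commit as a plain patch file. A small sketch of building that URL is shown here. patch_url is a hypothetical helper, and the download pipeline in the comment is one possible way to apply the result, not necessarily the exact command we ran.

```shell
# Appends ".patch" to a GitHub commit URL unless it is already present.
patch_url() {
  local url="$1"
  case "$url" in
    *.patch) printf '%s\n' "$url" ;;
    *)       printf '%s.patch\n' "$url" ;;
  esac
}

# Possible usage (assumption): preview first, then apply for real.
#   cd /root/kubernetes
#   wget -qO- "$(patch_url "$COMMIT_URL")" | patch -p1 --dry-run
#   wget -qO- "$(patch_url "$COMMIT_URL")" | patch -p1
```

Running patch with --dry-run first shows whether the hunks apply to the checked-out tag without modifying any files.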
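Building
Building the binaries locally
With everything in place, run the command below to start the build.
Note
- KUBE_BUILD_PLATFORMS specifies the target platform
- To build a single component, for example kubectl, add WHAT=cmd/kubectl after make
- GOFLAGS=-v enables verbose logging
- GOGCFLAGS="-N -l" disables compiler optimizations and inlining, which makes the binaries easier to debug with source-level tools (it does not reduce the size of the executables)
cd /root/kubernetes
Output similar to the following indicates that the build has finished. The compiled binaries are placed in the _output/dockerized/bin/linux/amd64/ directory
...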
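As an illustration of how the variables described in this section fit together, the sketch below only assembles and prints a command line and does not start a build. It assumes the containerized build is started through build/run.sh, which is consistent with the _output/dockerized output path, and passes the variables as make arguments. build_command is a hypothetical helper, and the printed command is not guaranteed to be identical to the one we ran.

```shell
# Prints a containerized build command for the given platform and,
# optionally, a single component (for example cmd/kubectl).
build_command() {
  local platform="$1" what="${2:-}"
  local cmd="build/run.sh make all KUBE_BUILD_PLATFORMS=${platform}"
  if [ -n "$what" ]; then
    cmd="${cmd} WHAT=${what}"   # restrict the build to one component
  fi
  printf '%s GOFLAGS=-v\n' "$cmd"   # -v turns on verbose logging
}

# Example: print the command for a kubectl-only linux/amd64 build.
#   build_command linux/amd64 cmd/kubectl
```

Printing the command first makes it easy to review the variables before starting a long build from /root/kubernetes.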
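Building the Docker images
With everything in place, run the command below to start the build.
Note
- KUBE_BUILD_PLATFORMS specifies the target platform
- KUBE_BUILD_CONFORMANCE=n and KUBE_BUILD_HYPERKUBE=n control whether the conformance-amd64 and hyperkube-amd64 images are built. The default is y (build them). Setting them to n here skips both images
- To build a single component, for example kubectl, add WHAT=cmd/kubectl after make
- GOFLAGS=-v enables verbose logging
- GOGCFLAGS="-N -l" disables compiler optimizations and inlining, which makes the binaries easier to debug with source-level tools (it does not reduce the size of the executables)
cd /root/kubernetes
Output similar to the following indicates that the build has finished. The compiled binaries and the Docker image tar archives are placed in the _output/release-stage/server/linux-amd64/kubernetes/server/bin/ directory
+++ [0825 00:41:14] Syncing out of container
Note
If the build fails with the error below, the most likely cause is that the base images needed for the image build cannot be pulled from k8s.gcr.io over your network.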
+++ [0825 00:16:34] Syncing out of container
+++ [0825 00:16:37] Building images: linux-amd64
+++ [0825 00:16:38] Starting docker build for image: cloud-controller-manager-amd64
+++ [0825 00:16:38] Starting docker build for image: kube-apiserver-amd64
+++ [0825 00:16:38] Starting docker build for image: kube-controller-manager-amd64
+++ [0825 00:16:38] Starting docker build for image: kube-scheduler-amd64
+++ [0825 00:16:38] Starting docker build for image: kube-proxy-amd64
Sending build context to Docker daemon 39.2MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [0825 00:17:09] Call tree:
!!! [0825 00:17:09] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [0825 00:17:09] 2: build/release-images.sh:42 kube::release::build_server_images(...)
Sending build context to Docker daemon 36.63MB
Step 1/2 : FROM k8s.gcr.io/debian-iptables-amd64:v11.0.2
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [0825 00:17:34] Call tree:
!!! [0825 00:17:34] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [0825 00:17:34] 2: build/release-images.sh:42 kube::release::build_server_images(...)
Sending build context to Docker daemon 115MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: dial tcp: lookup k8s.gcr.io on 10.211.55.1:53: read udp 10.211.55.6:33611->10.211.55.1:53: i/o timeout
!!! [0825 00:17:49] Call tree:
!!! [0825 00:17:49] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [0825 00:17:49] 2: build/release-images.sh:42 kube::release::build_server_images(...)
Sending build context to Docker daemon 99.87MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [0825 00:17:54] Call tree:
!!! [0825 00:17:54] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [0825 00:17:54] 2: build/release-images.sh:42 kube::release::build_server_images(...)
Sending build context to Docker daemon 167MB
Step 1/2 : FROM k8s.gcr.io/debian-base-amd64:v1.0.0
Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
!!! [0825 00:18:09] Call tree:
!!! [0825 00:18:09] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [0825 00:18:09] 2: build/release-images.sh:42 kube::release::build_server_images(...)
!!! [0825 00:18:09] previous Docker build failed
!!! [0825 00:18:09] Call tree:
!!! [0825 00:18:09] 1: /root/kubernetes/build/lib/release.sh:231 kube::release::create_docker_images_for_server(...)
!!! [0825 00:18:09] 2: build/release-images.sh:42 kube::release::build_server_images(...)
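make: *** [release-images] Error 1
Solution
Manually pull the corresponding images from the Aliyun mirror and re-tag them. Which image tags to pull depends on the actual error output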
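Since the exact images depend on the error output, the list can be derived from the build log rather than collected by hand. The sketch below assumes the log contains "Step N/M : FROM k8s.gcr.io/..." lines like the ones shown above. Both function names are hypothetical. The second function only prints docker commands and does not run them.

```shell
# Reads a build log on stdin and prints each distinct k8s.gcr.io base image.
failed_base_images() {
  sed -n 's|^Step [0-9]*/[0-9]* : FROM \(k8s\.gcr\.io/[^ ]*\).*$|\1|p' | sort -u
}

# Reads image names on stdin and prints the pull and tag commands that
# fetch each one from the given mirror namespace and restore its name.
retag_commands() {
  local mirror="$1" image name
  while IFS= read -r image; do
    name="${image#k8s.gcr.io/}"
    printf 'docker pull %s/%s\n' "$mirror" "$name"
    printf 'docker tag %s/%s %s\n' "$mirror" "$name" "$image"
  done
}

# Example (build.log is a saved copy of the failing build output):
#   failed_base_images < build.log | retag_commands mirrorgooglecontainers
```

For the log shown above this yields debian-base-amd64:v1.0.0 and debian-iptables-amd64:v11.0.2, the two images handled by the commands that follow.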
docker pull mirrorgooglecontainers/debian-base-amd64:v1.0.0
docker pull mirrorgooglecontainers/debian-iptables-amd64:v11.0.2
docker tag mirrorgooglecontainers/debian-base-amd64:v1.0.0 k8s.gcr.io/debian-base-amd64:v1.0.0
docker tag mirrorgooglecontainers/debian-iptables-amd64:v11.0.2 k8s.gcr.io/debian-iptables-amd64:v11.0.2
Edit /root/kubernetes/build/lib/release.sh and remove "${docker_build_opts[@]}" so that the image build no longer tries to pull the base images.
# Before
"${DOCKER[@]}" build "${docker_build_opts[@]}" -q -t "${docker_image_tag}" "${docker_build_path}" >/dev/null
# After
"${DOCKER[@]}" build -q -t "${docker_image_tag}" "${docker_build_path}" >/dev/null
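Once the whole build has finished, how you roll it out depends on how the cluster was deployed. If the cluster was deployed from binaries, replace the executables on the Master nodes and the Node nodes one by one, then restart the services for the change to take effect. If the cluster was deployed with kubeadm, load the newly built Docker images on every node and update the image field in kube-apiserver.yaml, kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests/ accordingly. These changes take effect as soon as the files are saved. Finally, run kubectl edit daemonset kube-proxy -n kube-system to change the kube-proxy image, which also takes effect immediately.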
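For the kubeadm case, loading the images on a node can be scripted. The sketch below assumes the image archives in the output directory end in .tar, and it leaves the docker load call as a commented usage example. image_tarballs is a hypothetical helper name.

```shell
# Prints the path of every .tar archive found directly in the given directory.
image_tarballs() {
  local dir="$1" f
  for f in "$dir"/*.tar; do
    if [ -f "$f" ]; then   # skips the unexpanded pattern when nothing matches
      printf '%s\n' "$f"
    fi
  done
}

# On each node, after copying the archives over:
#   image_tarballs /path/to/server/bin | while IFS= read -r tarball; do
#     docker load -i "$tarball"
#   done
```

After loading, docker images shows the exact image names and tags to put into the manifests and the kube-proxy DaemonSet.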