Recommended Server Hardware Configuration
Master Nodes
Physical or virtual machines both work. At least 1 node is required; a highly available cluster needs at least 3 (the etcd cluster must have an odd number of members).
Recommended specs: 2 cores / 2 GB for a lab environment, 2 cores / 4 GB for testing, 8 cores / 16 GB for production
Turn off all swap partitions, or do not create any swap partition
Node (Worker) Nodes
Physical or virtual machines both work; 1 or more nodes
Recommended specs: 2 cores / 2 GB for a lab environment, 4 cores / 8 GB for testing, 16 cores / 64 GB for production
Turn off all swap partitions, or do not create any swap partition
Deployment Architecture Diagram
Lab Environment Details

| Hostname | Spec | OS | IP Address | Roles | Components |
| --- | --- | --- | --- | --- | --- |
| easyk8s1 | 2 cores / 2 GB | CentOS 7.5 | 10.211.55.7 | Master, Node | kube-apiserver, kube-controller-manager, kube-scheduler, etcd, keepalived, haproxy, docker, kubelet, kube-proxy |
| easyk8s2 | 2 cores / 2 GB | CentOS 7.5 | 10.211.55.8 | Master, Node | kube-apiserver, kube-controller-manager, kube-scheduler, etcd, keepalived, haproxy, docker, kubelet, kube-proxy |
| easyk8s3 | 2 cores / 2 GB | CentOS 7.5 | 10.211.55.9 | Master, Node | kube-apiserver, kube-controller-manager, kube-scheduler, etcd, keepalived, haproxy, docker, kubelet, kube-proxy |

VIP: 10.211.55.10
System Initialization

Disable the firewall

```bash
systemctl stop firewalld
systemctl disable firewalld
```

Disable SELinux

```bash
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
```

Disable swap

```bash
swapoff -a
# /etc/rc.d/rc.local must be executable for this to run at boot
echo 'swapoff -a' >> /etc/rc.d/rc.local
```
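A quick sanity check, not part of the original steps, to confirm swap is fully off after the commands above:

```bash
swapon --show              # should print nothing
free -m | grep -i swap     # the Swap line should show 0 total
```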
Set the hostname

```bash
hostnamectl set-hostname ${HOSTNAME}
```

Add hosts entries for all nodes

```bash
cat >> /etc/hosts << EOF
10.211.55.7 easyk8s1
10.211.55.8 easyk8s2
10.211.55.9 easyk8s3
EOF
```

Install base packages

```bash
yum install vim net-tools lrzsz unzip dos2unix telnet sysstat iotop pciutils lsof tcpdump psmisc bc wget socat -y
```

Enable the required kernel networking parameters

```bash
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.neigh.default.gc_thresh1 = 80000
net.ipv4.neigh.default.gc_thresh2 = 90000
net.ipv4.neigh.default.gc_thresh3 = 100000
EOF
modprobe br_netfilter
sysctl --system
```

Configure passwordless SSH from master1 to all nodes (including itself). On master1, generate a key pair (just press Enter at every prompt).
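The key-generation command itself is not shown in the original; a standard invocation on master1 would be:

```bash
# Generates /root/.ssh/id_rsa and id_rsa.pub; accept the defaults and leave the passphrase empty
ssh-keygen -t rsa -b 2048
```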
Then copy the public key to every node (including master1 itself):

```bash
ssh-copy-id -i ~/.ssh/id_rsa.pub easyk8s1
ssh-copy-id -i ~/.ssh/id_rsa.pub easyk8s2
ssh-copy-id -i ~/.ssh/id_rsa.pub easyk8s3
```

From master1, verify that every node (including master1 itself) can be logged into over SSH without a password:

```bash
ssh easyk8s1
ssh easyk8s2
ssh easyk8s3
```

Time Synchronization Between Nodes
Server Side
Note: if the environment has Internet access, you do not need to run your own NTP server; just follow the client-side steps and point every node at a public NTP server such as ntp.aliyun.com.
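For that Internet-connected case, a minimal client-only configuration might look like the sketch below (ntp.aliyun.com is the example server mentioned above; any reachable NTP server works):

```bash
yum install chrony -y
sed -i "s%^server%#server%g" /etc/chrony.conf
echo "server ntp.aliyun.com iburst" >> /etc/chrony.conf
systemctl restart chronyd && systemctl enable chronyd
chronyc sources    # verify that the source is reachable and selected
```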
```bash
timedatectl set-timezone Asia/Shanghai
yum install chrony ntpdate -y
cp -a /etc/chrony.conf /etc/chrony.conf.bak
cat > /etc/chrony.conf << EOF
stratumweight 0
driftfile /var/lib/chrony/drift
rtcsync
makestep 10 3
allow 10.211.55.0/24    # set this to the subnet your clients actually live in
smoothtime 400 0.01
bindcmdaddress 127.0.0.1
bindcmdaddress ::1
local stratum 8
manual
keyfile /etc/chrony.keys
#initstepslew 10 client1 client3 client6
noclientlog
logchange 0.5
logdir /var/log/chrony
EOF
systemctl restart chronyd.service
systemctl enable chronyd.service
systemctl status chronyd.service
```

Client Side

```bash
timedatectl set-timezone Asia/Shanghai
yum install chrony ntpdate -y
cp -a /etc/chrony.conf /etc/chrony.conf.bak
sed -i "s%^server%#server%g" /etc/chrony.conf
echo "server 10.211.55.7 iburst" >> /etc/chrony.conf
ntpdate 10.211.55.7
systemctl restart chronyd.service
systemctl enable chronyd.service
systemctl status chronyd.service
chronyc sources
chronyc tracking
```
Install the CFSSL Tools

CFSSL is an open-source PKI/TLS toolkit from CloudFlare. It consists of a command-line tool and an HTTP API service for signing, verifying and bundling TLS certificates, and is written in Go.
On one of the nodes (master1 is recommended), run the following commands to install it directly:

```bash
curl -s -L -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
curl -s -L -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
curl -s -L -o /usr/local/bin/cfssl-certinfo https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x /usr/local/bin/cfssl*
cfssl version
```

Note: if the environment has no Internet access, download the latest cfssl_linux-amd64, cfssljson_linux-amd64 and cfssl-certinfo_linux-amd64 binaries, upload them to the /root directory of one of the nodes (master1 is recommended), and run the following commands to install cfssl:

```bash
mv /root/cfssl_linux-amd64 /usr/local/bin/cfssl
mv /root/cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv /root/cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
chmod +x /usr/local/bin/cfssl*
cfssl version
```
Deploy the etcd Database Cluster

etcd is a Raft-based distributed key-value store developed by CoreOS, commonly used for service discovery, shared configuration and coordination (leader election, distributed locks, and so on). Kubernetes stores all of its cluster data in etcd.
Generate self-signed certificates for etcd with cfssl. On the node where the cfssl tools are installed, run the following commands to create a CA for etcd and issue a self-signed certificate:

```bash
mkdir /root/etcd-cert && cd /root/etcd-cert

cat > ca-csr.json << EOF
{
    "CN": "etcd CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "ShenZhen",
            "ST": "ShenZhen"
        }
    ]
}
EOF

cat > ca-config.json << EOF
{
    "signing": {
        "default": {
            "expiry": "876000h"
        },
        "profiles": {
            "etcd": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
EOF

cat > etcd-csr.json << EOF
{
    "CN": "etcd",
    "hosts": [
        "10.211.55.7",
        "10.211.55.8",
        "10.211.55.9"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "ShenZhen",
            "ST": "ShenZhen"
        }
    ]
}
EOF

cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd
```
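If you want to double-check what was just issued, for example that all three node IPs appear in the SAN list and that the expiry is what you expect, cfssl-certinfo can print the certificate contents:

```bash
# Inspect the generated etcd server certificate (SANs, validity period, issuer)
cfssl-certinfo -cert /root/etcd-cert/etcd.pem
```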
Deploy etcd 3.5. Go to https://github.com/etcd-io/etcd/releases and download the etcd 3.5 binary tarball (this article uses 3.5.0), upload it to the /root directory of one of the etcd nodes, then run the following commands to unpack it and create the etcd directories and configuration file:

```bash
cd /root/
tar zxf etcd-v3.5.0-linux-amd64.tar.gz -C /opt/
mv /opt/etcd-v3.5.0-linux-amd64/ /opt/etcd
mkdir /opt/etcd/bin
mkdir /opt/etcd/cfg
mkdir /opt/etcd/ssl
cp -a /opt/etcd/etcd* /opt/etcd/bin/
cp -a /root/etcd-cert/{ca,etcd,etcd-key}.pem /opt/etcd/ssl/

cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
# Name of this etcd node; must be unique within the cluster
ETCD_NAME="etcd-1"
# etcd data directory
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
# URL this member listens on for peer traffic
ETCD_LISTEN_PEER_URLS="https://10.211.55.7:2380"
# URLs this member listens on for client traffic
ETCD_LISTEN_CLIENT_URLS="https://10.211.55.7:2379,http://127.0.0.1:2379"

#[Clustering]
# Peer URL of this member, advertised to the rest of the cluster
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.211.55.7:2380"
# Client URLs of this member, advertised to the rest of the cluster
ETCD_ADVERTISE_CLIENT_URLS="https://10.211.55.7:2379"
# All members of the cluster
ETCD_INITIAL_CLUSTER="etcd-1=https://10.211.55.7:2380,etcd-2=https://10.211.55.8:2380,etcd-3=https://10.211.55.9:2380"
# Token used when bootstrapping the cluster; keep it unique per cluster
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# "new" for static/DNS bootstrap of all members; "existing" makes etcd try to join an existing cluster
ETCD_INITIAL_CLUSTER_STATE="new"
# flannel uses the etcd v2 API while Kubernetes uses v3; etcd 3.5 disables v2 by default, so enable it for flannel compatibility
ETCD_ENABLE_V2="true"
EOF

cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \\
  --cert-file=/opt/etcd/ssl/etcd.pem \\
  --key-file=/opt/etcd/ssl/etcd-key.pem \\
  --peer-cert-file=/opt/etcd/ssl/etcd.pem \\
  --peer-key-file=/opt/etcd/ssl/etcd-key.pem \\
  --trusted-ca-file=/opt/etcd/ssl/ca.pem \\
  --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
```
Copy the etcd directory and the service file to the remaining etcd cluster nodes, then edit the etcd configuration file on each of them and adjust the ETCD_NAME, ETCD_LISTEN_PEER_URLS, ETCD_LISTEN_CLIENT_URLS, ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_ADVERTISE_CLIENT_URLS parameters:

```bash
scp -r /opt/etcd/ easyk8s2:/opt/
scp -r /usr/lib/systemd/system/etcd.service easyk8s2:/usr/lib/systemd/system/

vim /opt/etcd/cfg/etcd.conf
ETCD_NAME="etcd-2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.211.55.8:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.211.55.8:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.211.55.8:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.211.55.8:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://10.211.55.7:2380,etcd-2=https://10.211.55.8:2380,etcd-3=https://10.211.55.9:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_ENABLE_V2="true"
```

On every etcd cluster node, enable etcd at boot and start it.
Note: on the first node the start command will appear to hang and not return to the prompt; it is waiting for the other members to become ready, so simply go ahead and start the remaining nodes.

```bash
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
```
On any etcd node, run the following command to check the cluster status; if every endpoint is reported as healthy, the etcd cluster has been deployed successfully:

```bash
/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://10.211.55.7:2379,https://10.211.55.8:2379,https://10.211.55.9:2379" endpoint health
```
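Beyond the health check, etcdctl can also report which member is the leader and the state of each endpoint; two optional checks using the same TLS flags:

```bash
FLAGS="--cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem"
ENDPOINTS="https://10.211.55.7:2379,https://10.211.55.8:2379,https://10.211.55.9:2379"

# List cluster members with their peer and client URLs
/opt/etcd/bin/etcdctl $FLAGS --endpoints="$ENDPOINTS" member list -w table

# Per-endpoint status, including which member is currently the leader
/opt/etcd/bin/etcdctl $FLAGS --endpoints="$ENDPOINTS" endpoint status -w table
```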
Uninstalling etcd: if the installation failed and you need to redo it, run the following on every etcd node:

```bash
systemctl stop etcd
systemctl disable etcd
rm -rf /opt/etcd/
rm -rf /usr/lib/systemd/system/etcd.service
rm -rf /var/lib/etcd/
```

Configure HAProxy and Keepalived (skip this step for a single-master deployment)
HAProxy
Note: the configuration file is identical on every HAProxy node.

```bash
yum install -y haproxy
cp -a /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak

cat > /etc/haproxy/haproxy.cfg << EOF
global
    chroot /var/lib/haproxy
    daemon
    group haproxy
    user haproxy
    log 127.0.0.1 local0 info
    pidfile /var/lib/haproxy.pid
    maxconn 20000
    spread-checks 3
    nbproc 8

defaults
    log global
    mode tcp
    retries 3
    option redispatch

listen stats
    bind 0.0.0.0:9000
    mode http
    stats enable
    stats uri /
    stats refresh 15s
    stats realm Haproxy\ Stats
    stats auth k8s:k8s
    timeout server 15s
    timeout client 15s
    timeout connect 15s
    bind-process 1

listen k8s-apiserver
    # Bind address and port. Prefer a port other than 6443 (8443 here): if HAProxy runs on the same
    # host as kube-apiserver, 6443 would conflict; if it runs on a separate machine, 6443 is fine.
    bind 0.0.0.0:8443
    mode tcp
    balance roundrobin
    timeout server 15s
    timeout client 15s
    timeout connect 15s
    # Forward to the kube-apiserver on each master; kube-apiserver listens on 6443 by default
    server easyk8s1-kube-apiserver 10.211.55.7:6443 check port 6443 inter 5000 fall 5
    server easyk8s2-kube-apiserver 10.211.55.8:6443 check port 6443 inter 5000 fall 5
    server easyk8s3-kube-apiserver 10.211.55.9:6443 check port 6443 inter 5000 fall 5
EOF

vim /etc/sysconfig/rsyslog
SYSLOGD_OPTIONS="-c 2 -r -m 0"

cat >> /etc/rsyslog.conf << EOF
\$ModLoad imudp
\$UDPServerRun 514
local0.* /var/log/haproxy/haproxy.log
EOF

mkdir -p /var/log/haproxy && chmod a+w /var/log/haproxy
systemctl restart rsyslog
netstat -nuple | grep 514

haproxy -c -f /etc/haproxy/haproxy.cfg
systemctl restart haproxy
systemctl enable haproxy
systemctl status haproxy
```

Keepalived

```bash
yum install -y keepalived
cp -a /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak

cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived

global_defs {
    router_id easyk8s1    # identifier; use the host's hostname
}

vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
}

vrrp_instance VI_1 {
    state MASTER          # role: MASTER on the first master node, BACKUP on all the others
    interface eth0        # interface the VIP is bound to
    virtual_router_id 51  # MASTER and BACKUP must share the same virtual router ID
    priority 150          # the node with the highest priority becomes MASTER
    advert_int 1          # heartbeat interval
    authentication {
        auth_type PASS    # authentication type
        auth_pass k8s     # password
    }
    virtual_ipaddress {
        10.211.55.10      # virtual IP
    }
    track_script {
        check_haproxy
    }
}
EOF

cat > /etc/keepalived/check_haproxy.sh << EOF
#!/bin/bash
count=\$(ps -ef| grep haproxy | egrep -cv "grep|\$\$")
if [ "\$count" -eq 0 ];then
    exit 1
else
    exit 0
fi
EOF

chmod +x /etc/keepalived/check_haproxy.sh
systemctl restart keepalived.service
systemctl enable keepalived.service
systemctl status keepalived.service
```
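Once keepalived is up, you may want to confirm that the VIP actually landed on the MASTER node and that HAProxy is listening on the front-end ports configured above; a quick check, assuming the eth0 interface used in the keepalived configuration:

```bash
# The VIP 10.211.55.10 should appear on the MASTER node's eth0
ip addr show eth0 | grep 10.211.55.10

# HAProxy should be listening on the 8443 front end and the 9000 stats page
ss -tlnp | grep -E ':8443|:9000'
```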
Deploy the Master Components

Generate self-signed certificates for the components with cfssl. On the node where the cfssl tools are installed, run the following commands to create the Kubernetes CA and issue self-signed certificates for each component:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 mkdir /root/k8s-cert && cd /root/k8s-certcat > ca-csr.json << EOF { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "ShenZhen", "ST": "ShenZhen", "O": "k8s", "OU": "System" } ] } EOF cat > ca-config.json << EOF { "signing": { "default": { "expiry": "876000h" }, "profiles": { "kubernetes": { "expiry": "876000h", "usages": [ "signing", "key encipherment", "server auth", "client auth" ] } } } } EOF cat > apiserver-csr.json << EOF { "CN": "kubernetes", "hosts": [ "10.0.0.1", "127.0.0.1", "kubernetes", "kubernetes.default", "kubernetes.default.svc", "kubernetes.default.svc.cluster", "kubernetes.default.svc.cluster.local", "10.211.55.7", "10.211.55.8", "10.211.55.9", "10.211.55.10" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "ShenZhen", "ST": "ShenZhen", "O": "k8s", "OU": "System" } ] } EOF cat > kube-controller-manager-csr.json << EOF { "CN": "system:kube-controller-manager", "hosts": [""], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "ShenZhen", "ST": "ShenZhen", "O": "system:masters", "OU": "System" } ] } EOF cat > kube-scheduler-csr.json << EOF { "CN": "system:kube-scheduler", "hosts": [""], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "ShenZhen", "ST": "ShenZhen", "O": "system:masters", "OU": "System" } ] } EOF cat > admin-csr.json << EOF { "CN": "admin", "hosts": [""], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "ShenZhen", "ST": "ShenZhen", "O": "system:masters", "OU": "System" } ] } EOF cat > kube-proxy-csr.json << EOF { "CN": "system:kube-proxy", "hosts": [""], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "ShenZhen", "ST": "ShenZhen", "O": "k8s", "OU": "System" } ] } EOF cfssl gencert -initca ca-csr.json | cfssljson -bare ca - cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
Deploy kube-apiserver, kube-controller-manager and kube-scheduler. Binary packages can be downloaded from: https://github.com/kubernetes/kubernetes/releases
The CHANGELOG of each release links to the binary downloads for that version. Download the Server Binaries for your platform (they contain both the master and node components) and upload the tarball to the /root directory of one master node (this article uses 1.20.9).

```bash
mkdir /opt/kubernetes
mkdir /opt/kubernetes/bin
mkdir /opt/kubernetes/cfg
mkdir /opt/kubernetes/ssl
mkdir /opt/kubernetes/logs
cd /root/
tar zxf kubernetes-server-linux-amd64.tar.gz
cp -a /root/kubernetes/server/bin/kube-apiserver /opt/kubernetes/bin/
cp -a /root/kubernetes/server/bin/kube-controller-manager /opt/kubernetes/bin/
cp -a /root/kubernetes/server/bin/kube-scheduler /opt/kubernetes/bin/
cp -a /root/kubernetes/server/bin/kubectl /usr/local/bin/
cp -a /root/k8s-cert/{ca,ca-key,apiserver,apiserver-key}.pem /opt/kubernetes/ssl/

cat > /opt/kubernetes/cfg/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--etcd-servers=https://10.211.55.7:2379,https://10.211.55.8:2379,https://10.211.55.9:2379 \\
--bind-address=10.211.55.7 \\
--secure-port=6443 \\
--advertise-address=10.211.55.7 \\
--allow-privileged=true \\
--service-cluster-ip-range=10.0.0.0/24 \\
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \\
--authorization-mode=RBAC,Node \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=/opt/kubernetes/cfg/token.csv \\
--service-node-port-range=30000-32767 \\
--kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem \\
--kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem \\
--tls-cert-file=/opt/kubernetes/ssl/apiserver.pem \\
--tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--client-ca-file=/opt/kubernetes/ssl/ca.pem \\
--service-account-issuer=api \\
--service-account-signing-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--etcd-cafile=/opt/etcd/ssl/ca.pem \\
--etcd-certfile=/opt/etcd/ssl/etcd.pem \\
--etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \\
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \\
--proxy-client-cert-file=/opt/kubernetes/ssl/apiserver.pem \\
--proxy-client-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--requestheader-allowed-names=kubernetes \\
--requestheader-extra-headers-prefix=X-Remote-Extra- \\
--requestheader-group-headers=X-Remote-Group \\
--requestheader-username-headers=X-Remote-User \\
--enable-aggregator-routing=true \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log \\
--default-not-ready-toleration-seconds=20 \\
--default-unreachable-toleration-seconds=20 \\
--feature-gates=RemoveSelfLink=false"
EOF
```
| Parameter | Description |
| --- | --- |
| --logtostderr | whether to write logs to standard error |
| --v=2 | log level 0 to 8; the higher the number, the more verbose the logs |
| --log-dir | log directory |
| --etcd-servers | URLs of the etcd servers |
| --bind-address | address the apiserver listens on |
| --secure-port | port the apiserver listens on, 6443 by default |
| --advertise-address | advertised address that other nodes use to reach the apiserver |
| --allow-privileged | allow privileged containers |
| --service-cluster-ip-range | virtual IP range for cluster Services in CIDR notation, e.g. 169.169.0.0/16; it must not overlap with the machines' IP addresses |
| --enable-admission-plugins | admission control configuration for the cluster; each module takes effect in turn as a plugin |
| --authorization-mode | authorization mode |
| --enable-bootstrap-token-auth | enable bootstrap token authentication |
| --service-node-port-range | port range available to NodePort Services, 30000-32767 by default |
| --kubelet-client-certificate, --kubelet-client-key | certificate and key used to connect to the kubelet |
| --tls-cert-file, --tls-private-key-file, --client-ca-file, --service-account-key-file | certificates and keys used for the apiserver's HTTPS endpoint |
| --service-account-issuer, --service-account-signing-key-file | parameters that must be set as of 1.20.x |
| --etcd-cafile, --etcd-certfile, --etcd-keyfile | certificates used to connect to etcd |
| --requestheader-client-ca-file, --proxy-client-cert-file, --proxy-client-key-file, --requestheader-allowed-names, --requestheader-extra-headers-prefix, --requestheader-group-headers, --requestheader-username-headers, --enable-aggregator-routing | aggregation layer configuration |
| --audit-log-maxage, --audit-log-maxbackup, --audit-log-maxsize, --audit-log-path | audit log rotation and path |
```bash
cat > /opt/kubernetes/cfg/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--leader-elect=true \\
--kubeconfig=/opt/kubernetes/cfg/kube-controller-manager.kubeconfig \\
--bind-address=127.0.0.1 \\
--allocate-node-cidrs=true \\
--cluster-cidr=10.244.0.0/16 \\
--service-cluster-ip-range=10.0.0.0/24 \\
--cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem \\
--cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--root-ca-file=/opt/kubernetes/ssl/ca.pem \\
--service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--cluster-signing-duration=876000h0m0s \\
--node-monitor-grace-period=20s \\
--node-monitor-period=2s \\
--node-startup-grace-period=20s \\
--pod-eviction-timeout=20s \\
--node-eviction-rate=1"
EOF
```
| Parameter | Description |
| --- | --- |
| --leader-elect | enable leader election |
| --kubeconfig | kubeconfig file used to connect to the apiserver |
| --bind-address | address the controller-manager listens on; it does not need to be reachable externally |
| --allocate-node-cidrs | allow the CNI plugin to be installed and node CIDRs to be allocated automatically |
| --cluster-cidr | pod IP range of the cluster; must match the CNI plugin's IP range |
| --service-cluster-ip-range | Service cluster IP range; must match --service-cluster-ip-range in kube-apiserver.conf |
| --cluster-signing-cert-file, --cluster-signing-key-file | CA certificate and key used to sign cluster certificates |
| --root-ca-file, --service-account-private-key-file | certificate and key used to sign service accounts |
| --cluster-signing-duration | validity period of issued certificates |
```bash
KUBE_CONFIG="/opt/kubernetes/cfg/kube-controller-manager.kubeconfig"
KUBE_APISERVER="https://10.211.55.10:8443"

cd /root/k8s-cert/
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-controller-manager \
  --client-certificate=./kube-controller-manager.pem \
  --client-key=./kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-controller-manager \
  --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
```

```bash
cat > /opt/kubernetes/cfg/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--leader-elect=true \\
--kubeconfig=/opt/kubernetes/cfg/kube-scheduler.kubeconfig \\
--bind-address=127.0.0.1"
EOF
```
| Parameter | Description |
| --- | --- |
| --kubeconfig | kubeconfig file used to connect to the apiserver |
| --leader-elect | enable leader election |
```bash
KUBE_CONFIG="/opt/kubernetes/cfg/kube-scheduler.kubeconfig"
KUBE_APISERVER="https://10.211.55.10:8443"

cd /root/k8s-cert/
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-scheduler \
  --client-certificate=./kube-scheduler.pem \
  --client-key=./kube-scheduler-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-scheduler \
  --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
```

```bash
cat > /usr/lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-apiserver.conf
ExecStart=/opt/kubernetes/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

cat > /usr/lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-controller-manager.conf
ExecStart=/opt/kubernetes/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

cat > /usr/lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-scheduler.conf
ExecStart=/opt/kubernetes/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
```
Generate a random 32-character string and use it to create the token.csv file:

```bash
token=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
echo "$token,kubelet-bootstrap,10001,\"system:node-bootstrapper\"" > /opt/kubernetes/cfg/token.csv
```
Note: the token configured here for the apiserver (the 32-character random string) must be identical to the token placed later in the Node nodes' bootstrap.kubeconfig file.
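To avoid copy-and-paste mistakes later, the token can simply be read back out of token.csv when you build bootstrap.kubeconfig; the TOKEN variable name below matches the one used in the kubeconfig generation steps of the Node section:

```bash
# Extract the token (first CSV field) for reuse in bootstrap.kubeconfig
TOKEN=$(cut -d, -f1 /opt/kubernetes/cfg/token.csv)
echo $TOKEN
```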
Copy the kubernetes working directory, the ssl directory under the etcd working directory (certificates and keys), the master components' service files, and the kubectl binary to the corresponding directories on the remaining master nodes:

```bash
scp -r /opt/kubernetes/ easyk8s2:/opt/
ssh easyk8s2 mkdir -p /opt/etcd
scp -r /opt/etcd/ssl/ easyk8s2:/opt/etcd/
scp -r /usr/lib/systemd/system/{kube-apiserver,kube-controller-manager,kube-scheduler}.service easyk8s2:/usr/lib/systemd/system/
scp -r /usr/local/bin/kubectl easyk8s2:/usr/local/bin/
```

On each of the remaining master nodes, edit /opt/kubernetes/cfg/kube-apiserver.conf and change the --bind-address and --advertise-address parameters to that node's own IP:

```bash
...
--bind-address=10.211.55.8 \
--advertise-address=10.211.55.8 \
...
```
On every master node, enable kube-apiserver, kube-controller-manager and kube-scheduler at boot and start them:

```bash
systemctl daemon-reload
systemctl enable kube-apiserver
systemctl enable kube-controller-manager
systemctl enable kube-scheduler
systemctl start kube-apiserver
systemctl start kube-controller-manager
systemctl start kube-scheduler
systemctl status kube-apiserver
systemctl status kube-controller-manager
systemctl status kube-scheduler
```
On master1, run the following commands to generate the kubeconfig file that kubectl uses to access the cluster:

```bash
mkdir -p /root/.kube
KUBE_CONFIG="/root/.kube/config"
KUBE_APISERVER="https://10.211.55.10:8443"

cd /root/k8s-cert/
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials cluster-admin \
  --client-certificate=./admin.pem \
  --client-key=./admin-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
  --cluster=kubernetes \
  --user=cluster-admin \
  --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
```
Copy the /root/.kube directory to the remaining master nodes so that all of them can access the cluster with kubectl:

```bash
scp -r /root/.kube/ easyk8s2:/root/
```

At this point you can run kubectl get cs to check whether all of the control-plane components report Healthy:

```
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true","reason":""}
etcd-1               Healthy   {"health":"true","reason":""}
etcd-2               Healthy   {"health":"true","reason":""}
```
On any master node, run the following command to authorize the kubelet-bootstrap user to request certificates:

```bash
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
```

Deploy the Node Components

Install Docker. Binary packages can be downloaded from: https://download.docker.com/linux/static/stable/
Download the Docker binary tarball for your platform and version from the corresponding directory, upload it to the /root directory of every Node node (this article uses 18.09.9 for x86_64), then run the following commands on each Node node to install Docker:

```bash
cd /root/
tar zxf docker-18.09.9.tgz
chmod 755 docker/*
cp -a docker/* /usr/bin/

cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start docker
systemctl enable docker
systemctl status docker
```

Deploy kubelet and kube-proxy. Binary packages can be downloaded from: https://github.com/kubernetes/kubernetes/releases
The CHANGELOG of each release links to the binary downloads. Download the Server Binaries for your platform (they contain both the master and node components), upload the tarball to the /root directory of every Node node (this article uses 1.20.9), then perform the following steps on each Node node to install kubelet and kube-proxy.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 mkdir -p /opt/kubernetesmkdir -p /opt/kubernetes/binmkdir -p /opt/kubernetes/cfgmkdir -p /opt/kubernetes/sslmkdir -p /opt/kubernetes/logscd /root/tar zxf kubernetes-server-linux-amd64.tar.gz cp -a /root/kubernetes/server/bin/kubelet /opt/kubernetes/bin/cp -a /root/kubernetes/server/bin/kube-proxy /opt/kubernetes/bin/cat > /opt/kubernetes/cfg/kubelet.conf << EOF KUBELET_OPTS="--logtostderr=false \\ --v=2 \\ --log-dir=/opt/kubernetes/logs \\ --hostname-override=easyk8s1 \\ --network-plugin=cni \\ --kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \\ --bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \\ --config=/opt/kubernetes/cfg/kubelet-config.yml \\ --cert-dir=/opt/kubernetes/ssl" EOF cat > /opt/kubernetes/cfg/kubelet-config.yml << EOF kind: KubeletConfiguration apiVersion: kubelet.config.k8s.io/v1beta1 address: 0.0.0.0 port: 10250 readOnlyPort: 10255 cgroupDriver: cgroupfs clusterDNS: - 10.0.0.2 clusterDomain: cluster.local failSwapOn: false authentication: anonymous: enabled: false webhook: cacheTTL: 2m0s enabled: true x509: clientCAFile: /opt/kubernetes/ssl/ca.pem authorization: mode: Webhook webhook: cacheAuthorizedTTL: 5m0s cacheUnauthorizedTTL: 30s evictionHard: imagefs.available: 15% memory.available: 100Mi nodefs.available: 10% nodefs.inodesFree: 5% maxOpenFiles: 1000000 maxPods: 110 EOF cat > /opt/kubernetes/cfg/kube-proxy.conf << EOF KUBE_PROXY_OPTS="--logtostderr=false \\ --v=2 \\ --log-dir=/opt/kubernetes/logs \\ --config=/opt/kubernetes/cfg/kube-proxy-config.yml" EOF cat > /opt/kubernetes/cfg/kube-proxy-config.yml << EOF kind: KubeProxyConfiguration apiVersion: kubeproxy.config.k8s.io/v1alpha1 bindAddress: 0.0.0.0 metricsBindAddress: 0.0.0.0:10249 clientConnection: kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig hostnameOverride: easyk8s1 clusterCIDR: 10.0.0.0/24 mode: ipvs ipvs: scheduler: "rr" iptables: masqueradeAll: true EOF yum install ipvsadm -y cat > /usr/lib/systemd/system/kubelet.service << EOF [Unit] Description=Kubernetes Kubelet After=docker.service Wants=docker.service [Service] EnvironmentFile=/opt/kubernetes/cfg/kubelet.conf ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTS Restart=on-failure LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF cat > /usr/lib/systemd/system/kube-proxy.service << EOF [Unit] Description=Kubernetes Proxy After=network.target [Service] EnvironmentFile=/opt/kubernetes/cfg/kube-proxy.conf ExecStart=/opt/kubernetes/bin/kube-proxy \$KUBE_PROXY_OPTS Restart=on-failure LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF
Run the following to generate the bootstrap.kubeconfig and kube-proxy.kubeconfig files and copy them to every Node node:

```bash
KUBE_CONFIG="/root/k8s-cert/bootstrap.kubeconfig"
KUBE_APISERVER="https://10.211.55.10:8443"
TOKEN="16654491086d9095bd387c665efe01dd"

kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials "kubelet-bootstrap" \
  --token=${TOKEN} \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
  --cluster=kubernetes \
  --user="kubelet-bootstrap" \
  --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

scp -r /root/k8s-cert/bootstrap.kubeconfig easyk8s1:/opt/kubernetes/cfg/bootstrap.kubeconfig

KUBE_CONFIG="/root/k8s-cert/kube-proxy.kubeconfig"
KUBE_APISERVER="https://10.211.55.10:8443"

cd /root/k8s-cert/
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-credentials kube-proxy \
  --client-certificate=./kube-proxy.pem \
  --client-key=./kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=${KUBE_CONFIG}
kubectl config use-context default --kubeconfig=${KUBE_CONFIG}

scp -r /root/k8s-cert/kube-proxy.kubeconfig easyk8s1:/opt/kubernetes/cfg/kube-proxy.kubeconfig
```

From the host where the cfssl tools were installed, copy the CA certificate together with the kube-proxy certificate and key to the /opt/kubernetes/ssl directory of every Node node:

```bash
cd /root/k8s-cert/
scp -r ca.pem kube-proxy.pem kube-proxy-key.pem easyk8s1:/opt/kubernetes/ssl/
```

Enable kubelet and kube-proxy at boot and start them:

```bash
systemctl daemon-reload
systemctl enable kubelet
systemctl enable kube-proxy
systemctl start kubelet
systemctl start kube-proxy
systemctl status kubelet
systemctl status kube-proxy
```

Approve certificate requests for the Nodes. Once kubelet and kube-proxy are running, kubectl get csr on any master node will show new certificate signing requests from the nodes (their CONDITION is Pending). Run the following command to approve them:

```bash
kubectl certificate approve node-csr--p9rVRfwl6f5UvR8iRQvpiHuN53qfwMNSRVSfSjTURk
```

After the certificates are approved, kubectl get node on any master node will still show the Node nodes as NotReady; this is expected, because no network plugin has been installed yet.
Additional notes
1. If the hostname-override setting in the kubelet or kube-proxy configuration files was left unchanged, the master will not be able to register the Node correctly after approval. Besides fixing --hostname-override in kubelet.conf and hostnameOverride in kube-proxy-config.yml, you must also delete the kubelet.kubeconfig file (it is generated automatically on the client once the master has authenticated it) before requesting approval again, otherwise you will see an error similar to:

```
kubelet_node_status.go:94] Unable to register node "k8s-node2" with API server: nodes "k8s-node2" is forbidden: node "k8s-node1" is not allowed to modify node "k8s-node2"
```

2. The TLS bootstrapping workflow (kubelet).
3. How to remove a Node and rejoin it to the cluster. On a master node:

```bash
kubectl drain 10.211.55.9 --delete-local-data
kubectl delete node 10.211.55.9
```

On the Node being removed:

```bash
rm -rf /opt/kubernetes/cfg/kubelet.kubeconfig
rm -rf /opt/kubernetes/ssl/kubelet*
systemctl restart kubelet
systemctl restart kube-proxy
```

Then approve the new certificate request on a master node:

```bash
kubectl get csr
kubectl certificate approve xxxx
```
Deploy the CNI Network

Kubernetes supports multiple network types; see the corresponding documentation for details.
This article describes how to install the Calico network plugin.
On any master node, run the following command to download the Calico manifest, or fetch the manifests/calico.yaml file for the desired version from https://github.com/projectcalico/calico:

```bash
cd /root/
curl https://docs.projectcalico.org/manifests/calico.yaml -O
```

Edit calico.yaml and adjust the CALICO_IPV4POOL_CIDR setting to match your environment.
Note: the subnet given by CALICO_IPV4POOL_CIDR must be the same as the subnet given by the --cluster-cidr parameter in /opt/kubernetes/cfg/kube-controller-manager.conf.

```yaml
...
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
...
```

After making the change, install it:

```bash
kubectl apply -f /root/calico.yaml
```

Watch the Calico pods; once they are all Running, the CNI deployment is complete:

```bash
watch kubectl get pods -n kube-system -o wide
```

Authorize the apiserver to access the kubelet. For security the kubelet rejects anonymous access, so this authorization is required. A common symptom of missing authorization is that kubectl logs cannot fetch pod logs and fails with an error like:
Error from server (Forbidden): Forbidden (user=kubernetes, verb=get, resource=nodes, subresource=proxy) ( pods/log kube-flannel-ds-amd64-cdmcd)
Run the following on any master node to grant the permission:

```bash
cat > /root/apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
      - pods/log
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

kubectl apply -f /root/apiserver-to-kubelet-rbac.yaml
```

Environment verification. On any master node, run the following commands to create an nginx pod, expose it via a NodePort Service, and check whether it can be reached from outside the cluster:

```bash
kubectl create deployment web --image=nginx
kubectl expose deployment web --port=80 --type=NodePort
kubectl get service

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
web          NodePort    10.0.0.34    <none>        80:32163/TCP   61s
```

Open http://<Node_IP>:32163 in a browser; if the nginx welcome page is returned, the environment is working correctly.
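The same check can be done from a shell; the NodePort (32163 here) is assigned randomly, so substitute whatever kubectl get service printed in your environment:

```bash
# Any node IP works, because kube-proxy opens the NodePort on every node
curl -I http://10.211.55.7:32163    # expect an HTTP 200 response from nginx
```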
Note: remember to clean up the test resources after the verification succeeds.

```bash
kubectl delete service web
kubectl delete deployment web
```

Deploy CoreDNS

CoreDNS is deployed mainly to provide DNS resolution for Kubernetes Services, so that applications can reach each other by Service name.
The DNS service watches the Kubernetes API and creates a DNS record for every Service.
ClusterIP A record format: <service-name>.<namespace-name>.svc.<domain_suffix>, for example: my-svc.my-namespace.svc.cluster.local
Clusters deployed with kubeadm install CoreDNS automatically; with a binary deployment you have to install it yourself.
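As a concrete illustration of that record format, once CoreDNS is running (see the steps below) a Service named web in the default namespace would resolve like this from inside a pod; the Service name here is only an example:

```bash
# Run from inside any pod that uses the cluster DNS
nslookup web.default.svc.cluster.local
# Short names also work thanks to the search domains in the pod's /etc/resolv.conf
nslookup web
```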
Download the coredns.yaml.base file from its GitHub location to the /root/ directory of any master node, rename it to coredns.yaml, and then adjust the following parameters:
Replace __MACHINE_GENERATED_WARNING__ with: This is a file generated from the base underscore template file: coredns.yaml.base
Replace __PILLAR__DNS__DOMAIN__ (or __DNS__DOMAIN__) with cluster.local. If you use a non-default domain such as koenli.net, it must match the clusterDomain parameter in /opt/kubernetes/cfg/kubelet-config.yml on the Node nodes, and you must also add it to the hosts field of the apiserver certificate and re-issue that certificate.
Replace __PILLAR__DNS__MEMORY__LIMIT__ (or __DNS__MEMORY__LIMIT__) with 170Mi; this memory limit can be adjusted to the resources actually available in your environment.
Replace __PILLAR__DNS__SERVER__ (or __DNS__SERVER__) with 10.0.0.2; this IP must match the clusterDNS field configured in /opt/kubernetes/cfg/kubelet-config.yml on the Node nodes.
Note: the image registry referenced in the upstream yaml is hosted outside mainland China and may be unreachable; it is recommended to switch to the Alibaba Cloud mirror below. These substitutions can also be scripted, as sketched after the table.

| Official registry | Alibaba Cloud mirror |
| --- | --- |
| docker.io/coredns/coredns | registry.cn-hangzhou.aliyuncs.com/google_containers/coredns |
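If you prefer not to edit the file by hand, the substitutions can be applied with sed. This is only a sketch and assumes your copy of the base template uses the __PILLAR__... placeholder spelling; check the file first, since some versions use __DNS__DOMAIN__ and friends instead:

```bash
cd /root/
cp coredns.yaml.base coredns.yaml
sed -i 's/__PILLAR__DNS__DOMAIN__/cluster.local/g' coredns.yaml
sed -i 's/__PILLAR__DNS__MEMORY__LIMIT__/170Mi/g' coredns.yaml
sed -i 's/__PILLAR__DNS__SERVER__/10.0.0.2/g' coredns.yaml
# Switch the image to the Alibaba Cloud mirror
sed -i 's#docker.io/coredns/coredns#registry.cn-hangzhou.aliyuncs.com/google_containers/coredns#g' coredns.yaml
```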
The final file after my substitutions is shown below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 apiVersion: v1 kind: ServiceAccount metadata: name: coredns namespace: kube-system labels: kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: kubernetes.io/bootstrapping: rbac-defaults addonmanager.kubernetes.io/mode: Reconcile name: system:coredns rules: - apiGroups: - "" resources: - endpoints - services - pods - namespaces verbs: - list - watch - apiGroups: - "" resources: - nodes verbs: - get --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" labels: kubernetes.io/bootstrapping: rbac-defaults addonmanager.kubernetes.io/mode: EnsureExists name: system:coredns roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:coredns subjects: - kind: ServiceAccount name: coredns namespace: kube-system --- apiVersion: v1 kind: ConfigMap metadata: name: coredns namespace: kube-system labels: addonmanager.kubernetes.io/mode: EnsureExists data: Corefile: | .:53 { errors health { lameduck 5s } ready kubernetes cluster.local in-addr.arpa ip6.arpa { pods insecure fallthrough in-addr.arpa ip6.arpa ttl 30 } prometheus :9153 forward . 
/etc/resolv.conf { max_concurrent 1000 } cache 30 loop reload loadbalance } --- apiVersion: apps/v1 kind: Deployment metadata: name: coredns namespace: kube-system labels: k8s-app: kube-dns kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile kubernetes.io/name: "CoreDNS" spec: strategy: type: RollingUpdate rollingUpdate: maxUnavailable: 1 selector: matchLabels: k8s-app: kube-dns template: metadata: labels: k8s-app: kube-dns spec: securityContext: seccompProfile: type: RuntimeDefault priorityClassName: system-cluster-critical serviceAccountName: coredns affinity: podAntiAffinity: preferredDuringSchedulingIgnoredDuringExecution: - weight: 100 podAffinityTerm: labelSelector: matchExpressions: - key: k8s-app operator: In values: ["kube-dns" ] topologyKey: kubernetes.io/hostname tolerations: - key: "CriticalAddonsOnly" operator: "Exists" nodeSelector: kubernetes.io/os: linux containers: - name: coredns image: docker.io/coredns/coredns:1.8.0 imagePullPolicy: IfNotPresent resources: limits: memory: 170Mi requests: cpu: 100m memory: 70Mi args: [ "-conf" , "/etc/coredns/Corefile" ] volumeMounts: - name: config-volume mountPath: /etc/coredns readOnly: true ports: - containerPort: 53 name: dns protocol: UDP - containerPort: 53 name: dns-tcp protocol: TCP - containerPort: 9153 name: metrics protocol: TCP livenessProbe: httpGet: path: /health port: 8080 scheme: HTTP initialDelaySeconds: 60 timeoutSeconds: 5 successThreshold: 1 failureThreshold: 5 readinessProbe: httpGet: path: /ready port: 8181 scheme: HTTP securityContext: allowPrivilegeEscalation: false capabilities: add: - NET_BIND_SERVICE drop: - all readOnlyRootFilesystem: true dnsPolicy: Default volumes: - name: config-volume configMap: name: coredns items: - key: Corefile path: Corefile --- apiVersion: v1 kind: Service metadata: name: kube-dns namespace: kube-system annotations: prometheus.io/port: "9153" prometheus.io/scrape: "true" labels: k8s-app: kube-dns kubernetes.io/cluster-service: "true" addonmanager.kubernetes.io/mode: Reconcile kubernetes.io/name: "CoreDNS" spec: selector: k8s-app: kube-dns clusterIP: 10.0 .0 .2 ports: - name: dns port: 53 protocol: UDP - name: dns-tcp port: 53 protocol: TCP - name: metrics port: 9153 protocol: TCP
Run the following command to install it:

```bash
kubectl apply -f /root/coredns.yaml
```

Watch the CoreDNS pods; once they are all Running, the deployment is complete:

```bash
watch kubectl get pod -n kube-system
```

On any master node, run the following commands to create a busybox pod and resolve a Service name from inside it; if the name resolves to an IP address, the DNS service is working.
```bash
cat > /root/bs.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox:1.28.4
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always
EOF

kubectl apply -f /root/bs.yaml
watch kubectl get pods

kubectl exec -ti busybox sh
/ # nslookup kubernetes
Server:    10.0.0.2
Address 1: 10.0.0.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.0.0.1 kubernetes.default.svc.cluster.local
/ # exit

kubectl delete -f /root/bs.yaml
```
Deploy the Kubernetes Dashboard

On any master node, download the Kubernetes Dashboard yaml file into the /root directory.
Note: this article installs v2.2.0 as an example; get the download URL for the yaml file of the version you actually need from the project's GitHub page.

```bash
cd /root/
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml
```

Edit recommended.yaml, find the kubernetes-dashboard Service, set its type to NodePort and its nodePort to 30001 (or another port of your choice):

```yaml
...
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard
...
```

When the changes are done, deploy the Kubernetes Dashboard:

```bash
kubectl apply -f /root/recommended.yaml
```

Watch the Dashboard pods and wait until they are all Running before continuing:

```bash
watch kubectl get pods -n kubernetes-dashboard
```

Create a service account and bind it to the built-in cluster-admin cluster role:

```bash
cat > /root/dashboard-adminuser.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

kubectl apply -f /root/dashboard-adminuser.yaml
```

Retrieve the login token:

```bash
kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
```

Use the token printed above to log in to the Kubernetes Dashboard at https://<NODE_IP>:30001.
Note: the protocol must be HTTPS.
Note: if Chrome does not offer the "Proceed to x.x.x.x (unsafe)" option when you open the Kubernetes Dashboard, the following steps work around it. 1. On the master node where you deployed the Dashboard, delete the default secret and recreate it from the self-signed certificate (adjust the certificate paths if they differ in your environment):

```bash
kubectl delete secret kubernetes-dashboard-certs -n kubernetes-dashboard
kubectl create secret generic kubernetes-dashboard-certs --from-file=/opt/kubernetes/ssl/apiserver-key.pem --from-file=/opt/kubernetes/ssl/apiserver.pem -n kubernetes-dashboard
```

2. Edit /root/recommended.yaml and specify the certificate and key files under args (search for auto-generate-certificates to jump to the right place):

```yaml
args:
  - --auto-generate-certificates
  - --tls-key-file=apiserver-key.pem
  - --tls-cert-file=apiserver.pem
```

3. Re-apply recommended.yaml:

```bash
kubectl apply -f /root/recommended.yaml
```

4. Confirm that the Kubernetes Dashboard pods are all Running again:

```bash
watch kubectl get pods -n kubernetes-dashboard
```

5. Visit https://<NODE_IP>:30001 again.

At this point a highly available Kubernetes cluster with 3 masters and 3 nodes is fully deployed. If you also want kubectl command completion, ingress-nginx or Helm, continue with the following sections.
Configure kubectl Command Completion

Run the following on all master nodes:

```bash
yum install bash-completion -y
source /usr/share/bash-completion/bash_completion
sed -i '$a\source <(kubectl completion bash)' /etc/profile
source /etc/profile
```

Install ingress-nginx

Go to the project's GitHub page, switch to the version you need, and grab the deploy/static/provider/baremetal/deploy.yaml or deploy/static/mandatory.yaml manifest. Download it (or paste its contents into a new file) to the /root directory of any master node and rename it ingress.yaml.
Note: this article uses 0.30.0 as an example.
Edit ingress.yaml: in Deployment.spec.template.spec enable hostNetwork, and add the Service definition shown in the snippet that follows.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 ... apiVersion: apps/v1 kind: Deployment metadata: name: nginx-ingress-controller namespace: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx spec: replicas: 1 selector: matchLabels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx template: metadata: labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/part-of: ingress-nginx annotations: prometheus.io/port: "10254" prometheus.io/scrape: "true" spec: hostNetwork: true terminationGracePeriodSeconds: 300 serviceAccountName: nginx-ingress-serviceaccount nodeSelector: kubernetes.io/os: linux ... --- apiVersion: v1 kind: Service metadata: name: ingress-nginx namespace: ingress-nginx spec: type: ClusterIP ports: - name: http port: 80 targetPort: 80 protocol: TCP - name: https port: 443 targetPort: 443 protocol: TCP selector: app.kubernetes.io/name: ingress-nginx
Run the following to install ingress-nginx:

```bash
cd /root/
kubectl apply -f ingress.yaml
```

Check the ingress-nginx pods; once they are all in Completed or Running state, ingress-nginx is deployed:

```bash
watch kubectl get pods -n ingress-nginx
```

Verify ingress-nginx:

```bash
kubectl create deployment web --image=nginx
kubectl expose deployment web --port=80

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

kubectl create ingress web --class=nginx \
  --rule="test.koenli.com/*=web:80"

kubectl get pod -o wide -n ingress-nginx | grep ingress-nginx-controller

curl --resolve test.koenli.com:80:10.211.55.8 http://test.koenli.com
```

Note: remember to clean up the test resources after the verification succeeds.

```bash
kubectl delete ingress web
kubectl delete service web
kubectl delete deployment web
```
Install Helm

Note: Helm currently has two major versions, Helm 2 and Helm 3, which are not compatible with each other. Choose one according to the Helm/Kubernetes version compatibility matrix and your actual requirements.
Helm 2: download the Helm package you need (2.17.0 in this example), upload it to the /root/helm directory of every master node (create the directory first if it does not exist), then run the following commands to install the Helm client:

```bash
cd /root/helm/
tar zxf helm-v2.17.0-linux-amd64.tar.gz
cd linux-amd64/
cp -a helm /usr/local/bin/
```

Create the Tiller RBAC manifest and apply it:

```bash
cat > /root/helm/tiller-rbac.yaml << EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller-cluster-rule
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
EOF

kubectl apply -f /root/helm/tiller-rbac.yaml
```

On one of the master nodes, initialize the Helm server side (Tiller):

```bash
helm init --service-account tiller --skip-refresh
```

Check the Tiller pod:

```bash
kubectl get pods -n kube-system | grep tiller
```

Once it is Running, run helm version; output like the following means both the Helm client and server are installed:

```
Client: &version.Version{SemVer:"v2.17.0", GitCommit:"a690bad98af45b015bd3da1a41f6218b1a451dbe", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.17.0", GitCommit:"a690bad98af45b015bd3da1a41f6218b1a451dbe", GitTreeState:"clean"}
```

Note: if the STATUS is ImagePullBackOff, the image pull failed. You can try kubectl edit pods tiller-deploy-xxxxxxxx -n kube-system to edit Tiller's pod and replace the image with sapcc/tiller:[tag], which is a mirror of https://gcr.io/kubernetes-helm/tiller/:

```yaml
...
image: sapcc/tiller:v2.17.0
...
```

After saving and exiting, check the Tiller pod again; once it is Running, run helm version to confirm the installation:

```bash
kubectl get pods -n kube-system | grep tiller
```

Note: if helm version fails with an error similar to the following

```
Client: &version.Version{SemVer:"v2.17.0", GitCommit:"a690bad98af45b015bd3da1a41f6218b1a451dbe", GitTreeState:"clean"}
E1213 15:58:40.605638   10274 portforward.go:400] an error occurred forwarding 34583 -> 44134: error forwarding port 44134 to pod 1e92153b279110f9464193c4ea7d6314ac69e70ce60e7319df9443e379b52ed4, uid : unable to do port forwarding: socat not found
```

Fix: install socat on all Node nodes, for example as shown below.
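The install command itself is not shown in the original; on CentOS it is simply the following (socat is already in the base-package list from the System Initialization section, so this only matters if that step was skipped):

```bash
yum install -y socat
```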
Helm 3: download the Helm package you need (3.14.2 in this example), upload it to the /root/helm directory of every master node (create the directory first if it does not exist), then run the following commands to install Helm and verify it:

```bash
cd /root/helm/
tar zxf helm-v3.14.2-linux-amd64.tar.gz
cd linux-amd64/
cp -a helm /usr/local/bin/
helm version
```