Set Up K8s with Ansible from Zero

1. Tear down the previous environment

ois@ois:~/data/k8s-cilium-lab$ cd ..
ois@ois:~/data$ ./07-undefine-vms.sh
Domain 'k8s-node-1' destroyed

Domain 'k8s-node-1' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/k8s-node-1.qcow2) removed.

Domain 'k8s-node-2' destroyed

Domain 'k8s-node-2' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/k8s-node-2.qcow2) removed.

Domain 'k8s-node-3' destroyed

Domain 'k8s-node-3' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/k8s-node-3.qcow2) removed.

Domain 'dns-bgp-server' destroyed

Domain 'dns-bgp-server' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/dns-bgp-server.qcow2) removed.

ois@ois:~/data$ rm -rf k8s-cilium-lab/
ois@ois:~/data$
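
The contents of 07-undefine-vms.sh are not shown, but its output indicates it stops each domain, undefines it, and removes the backing qcow2 volume. A minimal Ansible sketch of the same cleanup, assuming the VM names seen in the output above (the real script is plain shell around virsh):

- name: Tear down the lab VMs (sketch, not the original script)
  hosts: localhost
  gather_facts: false
  vars:
    lab_vms: [k8s-node-1, k8s-node-2, k8s-node-3, dns-bgp-server]
  tasks:
    - name: Force off any domain that is still running
      ansible.builtin.command: virsh destroy {{ item }}
      loop: "{{ lab_vms }}"
      failed_when: false          # an already-stopped or missing domain is fine

    - name: Undefine each domain and remove its storage volumes
      ansible.builtin.command: virsh undefine {{ item }} --remove-all-storage
      loop: "{{ lab_vms }}"
      failed_when: false          # an already-removed domain is fine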

2. Rebuild the project file structure

ois@ois:~/data$ ./00-create-project-structure.sh 
--- K8s + Cilium Lab Setup (Warning-Free) ---
This script will prepare a new Ansible project directory named 'k8s-cilium-lab'.

--- Step 1: Checking Prerequisites ---
✅ OS is Debian-based.
✅ Ansible is already installed.
--> Ensuring libvirt and whois are up-to-date...
✅ Dependencies are present.
✅ SSH key already exists at ~/.ssh/id_rsa. Skipping generation.

--- Step 2: Creating Project Structure ---
✅ Project directory structure created in 'k8s-cilium-lab/'.

--- Step 3: Configuring User Password Hash ---
A secure password hash is required for the 'ubuntu' user on the VMs.
Enter the password for the 'ubuntu' user (input will be hidden):
--> Generating password hash...
✅ Password hash generated.

--- Step 4: Generating Configuration Files ---
✅ Created ansible.cfg
✅ Created inventory.ini
✅ Created group_vars/all.yml with password hash.
✅ Created host_vars/k8s-node-1.yml
✅ Created host_vars/k8s-node-2.yml
✅ Created host_vars/k8s-node-3.yml
✅ Created host_vars/dns-bgp-server.yml

--- Setup Complete! ---
✅ Project 'k8s-cilium-lab' has been successfully configured.

Next Steps:
1. Run the next script to generate the VM creation playbook:
../01-create-vms.sh
2. Run the playbook to create your lab VMs (no vault password needed):
ansible-playbook playbooks/1_create_vms.yml
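
The generated configuration files are not printed by the script. As a rough illustration, here is a hedged sketch of what group_vars/all.yml and one host_vars file might contain; every key name here is an assumption for illustration, except that the IP matches the address k8s-node-1 answers on later in the run:

# group_vars/all.yml (sketch; the real keys may differ)
ubuntu_password_hash: "$6$..."                                  # produced by mkpasswd in Step 3
ssh_public_key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"     # key found/reused in Step 1
base_image_path: /var/lib/libvirt/images/noble-server-cloudimg-amd64.img   # assumed base image

# host_vars/k8s-node-1.yml (sketch)
ansible_host: 10.75.59.81     # matches the SSH banner seen for this node later
vm_vcpus: 2                   # assumed sizing
vm_memory_mb: 4096
vm_disk_gb: 40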

3. Generate the playbook that builds the lab VMs

ois@ois:~/data$ ./01-create-vms.sh 
--- Lab VM Playbook Generator (Final Corrected Version) ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Ensuring Directories Exist ---
✅ Directories 'playbooks/' and 'templates/' are ready.

--- Step 3: Generating Files ---
✅ Generated playbook: playbooks/1_create_vms.yml
✅ Generated template: templates/user-data.j2
✅ Generated template: templates/network-config.j2
✅ Generated template: templates/meta-data.j2

--- Generation Complete! ---
✅ All necessary files have been created inside the 'k8s-cilium-lab' directory.

Next Step:
1. Change into the project directory: cd k8s-cilium-lab
2. Run the playbook to create your lab VMs:
ansible-playbook playbooks/1_create_vms.yml
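
The cloud-init templates themselves are not shown. A hedged sketch of what templates/user-data.j2 might render, reusing the password hash and SSH key from the previous step (the variable names are assumptions):

#cloud-config
# templates/user-data.j2 (sketch)
hostname: {{ inventory_hostname }}
users:
  - name: ubuntu
    passwd: "{{ ubuntu_password_hash }}"     # hash generated by 00-create-project-structure.sh
    lock_passwd: false
    shell: /bin/bash
    groups: [ sudo ]
    ssh_authorized_keys:
      - "{{ ssh_public_key }}"
ssh_pwauth: true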

4. Run the playbook to build the VM environment

This automatically creates the k8s-node VMs and the FRR VM, probes SSH connectivity, and adds their host keys to the known_hosts list.

ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$
ois@ois:~/data/k8s-cilium-lab$
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/1_create_vms.yml

PLAY [Play 1 - Pre-flight Check for Existing VMs] ******************************************************************************************************************************************************

TASK [Check status of each VM with virsh] **************************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

PLAY [Play 2 - Decide if Provisioning is Needed] *******************************************************************************************************************************************************

TASK [Initialize an empty list for missing VMs] ********************************************************************************************************************************************************
ok: [localhost]

TASK [Populate the list of missing VMs] ****************************************************************************************************************************************************************
ok: [localhost] => (item=dns-bgp-server)
ok: [localhost] => (item=k8s-node-1)
ok: [localhost] => (item=k8s-node-2)
ok: [localhost] => (item=k8s-node-3)

TASK [Set global flag if provisioning is required] *****************************************************************************************************************************************************
ok: [localhost]

TASK [Report status] ***********************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "Provisioning needed: True. Missing VMs: ['dns-bgp-server', 'k8s-node-1', 'k8s-node-2', 'k8s-node-3']"
}

PLAY [Play 3 - Prepare VM Assets in Parallel] **********************************************************************************************************************************************************
[WARNING]: Using run_once with the free strategy is not currently supported. This task will still be executed for every host in the inventory list.

TASK [Ensure VM directories exist] *********************************************************************************************************************************************************************
changed: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
ok: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
ok: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
ok: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
changed: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
ok: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
ok: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
ok: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)

TASK [Check if VM disk image already exists] ***********************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [Create VM disk image from base image] ************************************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [Resize VM disk image] ****************************************************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [Generate cloud-init files] ***********************************************************************************************************************************************************************
changed: [k8s-node-2] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_user-data'})
changed: [k8s-node-3] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_user-data'})
changed: [dns-bgp-server] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_user-data'})
changed: [k8s-node-1] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_user-data'})
changed: [k8s-node-3] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_network-config'})
changed: [k8s-node-2] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_network-config'})
changed: [k8s-node-1] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_network-config'})
changed: [dns-bgp-server] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_network-config'})
changed: [k8s-node-3] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_meta-data'})
changed: [k8s-node-2] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_meta-data'})
changed: [k8s-node-1] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_meta-data'})
changed: [dns-bgp-server] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_meta-data'})

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [dns-bgp-server]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [k8s-node-1]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [k8s-node-2]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [k8s-node-3]

PLAY [Play 5 - Verify VM Connectivity in Parallel] *****************************************************************************************************************************************************

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
ok: [dns-bgp-server -> localhost]
ok: [k8s-node-1 -> localhost]
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
changed: [k8s-node-1 -> localhost]
changed: [dns-bgp-server -> localhost]

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
ok: [k8s-node-3 -> localhost]
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
changed: [k8s-node-3 -> localhost]

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
ok: [k8s-node-2 -> localhost]
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
changed: [k8s-node-2 -> localhost]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=9 changed=6 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-1 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-2 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-3 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
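
The "# 10.75.59.8x:22 SSH-2.0-OpenSSH..." lines are the stderr banners of ssh-keyscan, so the connectivity check apparently scans each new VM's host key and records it locally. A hedged sketch of the two verification tasks, assuming the inventory sets ansible_host to each VM's IP:

- name: Wait for VMs to boot and SSH to become available
  ansible.builtin.wait_for:
    host: "{{ ansible_host }}"
    port: 22
    delay: 10
    timeout: 600
  delegate_to: localhost

- name: Add host keys to known_hosts file
  ansible.builtin.known_hosts:
    name: "{{ ansible_host }}"
    key: "{{ lookup('pipe', 'ssh-keyscan -t ed25519 ' + ansible_host) }}"
    state: present
  delegate_to: localhost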

Run the playbook again to test idempotency. Repeated runs do not change the result: tasks that have already been executed skip themselves.

ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/1_create_vms.yml

PLAY [Play 1 - Pre-flight Check for Existing VMs] ******************************************************************************************************************************************************

TASK [Check status of each VM with virsh] **************************************************************************************************************************************************************
ok: [k8s-node-3]
ok: [k8s-node-1]
ok: [dns-bgp-server]
ok: [k8s-node-2]

PLAY [Play 2 - Decide if Provisioning is Needed] *******************************************************************************************************************************************************

TASK [Initialize an empty list for missing VMs] ********************************************************************************************************************************************************
ok: [localhost]

TASK [Populate the list of missing VMs] ****************************************************************************************************************************************************************
skipping: [localhost] => (item=dns-bgp-server)
skipping: [localhost] => (item=k8s-node-1)
skipping: [localhost] => (item=k8s-node-2)
skipping: [localhost] => (item=k8s-node-3)
skipping: [localhost]

TASK [Set global flag if provisioning is required] *****************************************************************************************************************************************************
ok: [localhost]

TASK [Report status] ***********************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "Provisioning needed: False. Missing VMs: []"
}

PLAY [Play 3 - Prepare VM Assets in Parallel] **********************************************************************************************************************************************************
[WARNING]: Using run_once with the free strategy is not currently supported. This task will still be executed for every host in the inventory list.

TASK [Ensure VM directories exist] *********************************************************************************************************************************************************************
skipping: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [dns-bgp-server]
skipping: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Check if VM disk image already exists] ***********************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Create VM disk image from base image] ************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Resize VM disk image] ****************************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Generate cloud-init files] ***********************************************************************************************************************************************************************
skipping: [k8s-node-1] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_user-data'})
skipping: [k8s-node-1] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_network-config'})
skipping: [k8s-node-2] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_user-data'})
skipping: [k8s-node-1] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_meta-data'})
skipping: [k8s-node-1]
skipping: [k8s-node-2] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_network-config'})
skipping: [dns-bgp-server] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_user-data'})
skipping: [k8s-node-2] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_meta-data'})
skipping: [k8s-node-2]
skipping: [dns-bgp-server] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_network-config'})
skipping: [dns-bgp-server] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_meta-data'})
skipping: [k8s-node-3] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_user-data'})
skipping: [dns-bgp-server]
skipping: [k8s-node-3] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_network-config'})
skipping: [k8s-node-3] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_meta-data'})
skipping: [k8s-node-3]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [dns-bgp-server]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [k8s-node-1]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [k8s-node-2]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [k8s-node-3]

PLAY [Play 5 - Verify VM Connectivity in Parallel] *****************************************************************************************************************************************************

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
k8s-node-1 : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
k8s-node-2 : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
k8s-node-3 : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
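
The second run skips everything because of the pre-flight check: Play 1 records whether each libvirt domain already exists, Play 2 turns that into a provisioning decision, and the later tasks carry a when-guard. A minimal sketch of that pattern, with assumed variable names:

# Play 1 (sketch): record whether the domain exists, without failing or reporting a change
- name: Check status of each VM with virsh
  ansible.builtin.command: virsh dominfo {{ inventory_hostname }}
  register: vm_check
  failed_when: false
  changed_when: false
  delegate_to: localhost

# Later plays (sketch): only act on hosts whose domain was missing
- name: Example of a task guarded by the pre-flight result
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }} is missing and would be provisioned here"
  when: vm_check.rc != 0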

5. Generate the playbook that installs containerd and the Kubernetes tools

ois@ois:~/data/k8s-cilium-lab$ cd ..
ois@ois:~/data$ ./02-prepare-nodes.sh
--- Node Preparation Playbook Generator (with cloud-init wait) ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Ensuring Role Directories Exist ---
✅ Role directories created.

--- Step 3: Generating Role Task Files ---
✅ Created tasks for 'common' role.
✅ Created tasks for 'k8s_node' role.
✅ Created tasks for 'infra_server' role.

--- Step 4: Generating Config Templates ---
✅ Created /etc/hosts template.
✅ Created containerd config template.
✅ Created FRR config template.

--- Step 5: Generating Main Playbook ---
✅ Created main playbook: playbooks/2_prepare_nodes.yml

--- Generation Complete! ---
✅ All necessary files for node preparation have been created.

Next Step:
1. Change into the project directory: cd k8s-cilium-lab
2. Run the playbook to prepare your nodes. You will be prompted for the sudo password:
ansible-playbook playbooks/2_prepare_nodes.yml --ask-become-pass
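
The generated role tasks are not printed. Based on the task names that appear in the run below, the 'common' role loads the overlay and br_netfilter kernel modules and sets the usual Kubernetes networking sysctls; a hedged sketch of those two tasks (module choice and file names are assumptions):

# roles/common/tasks/main.yml (excerpt, sketch)
- name: Load required kernel modules
  community.general.modprobe:
    name: "{{ item }}"
    state: present
  loop: [overlay, br_netfilter]

- name: Configure sysctl parameters for Kubernetes networking
  ansible.posix.sysctl:
    name: "{{ item }}"
    value: "1"
    sysctl_file: /etc/sysctl.d/99-kubernetes.conf    # assumed file name
    reload: true
  loop:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-ip6tables
    - net.ipv4.ip_forward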

6. Run the playbook to install containerd and the Kubernetes tools

ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/2_prepare_nodes.yml --ask-become-pass
BECOME password:

PLAY [Play 1 - Prepare All Nodes] **********************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-3]
ok: [k8s-node-2]
ok: [k8s-node-1]

TASK [common : Wait for cloud-init to complete] ********************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]
ok: [k8s-node-1]

TASK [common : Update apt cache and upgrade all packages] **********************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-3]
changed: [k8s-node-1]
changed: [k8s-node-2]

TASK [common : Configure /etc/hosts from template] *****************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [dns-bgp-server]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [common : Turn off all swap devices] **************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Comment out swap entries in /etc/fstab] *************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Load required kernel modules] ***********************************************************************************************************************************************************
changed: [dns-bgp-server] => (item=overlay)
changed: [k8s-node-1] => (item=overlay)
changed: [k8s-node-2] => (item=overlay)
changed: [k8s-node-3] => (item=overlay)
changed: [k8s-node-2] => (item=br_netfilter)
changed: [dns-bgp-server] => (item=br_netfilter)
changed: [k8s-node-1] => (item=br_netfilter)
changed: [k8s-node-3] => (item=br_netfilter)

TASK [common : Ensure kernel modules are loaded on boot] ***********************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-1]
changed: [k8s-node-3]
changed: [k8s-node-2]

TASK [common : Configure sysctl parameters for Kubernetes networking] **********************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-1]
changed: [k8s-node-3]
changed: [k8s-node-2]

TASK [common : Apply sysctl settings without reboot] ***************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-3]
ok: [k8s-node-2]

PLAY [Play 2 - Prepare Kubernetes Nodes] ***************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-3]
ok: [k8s-node-1]
ok: [k8s-node-2]

TASK [k8s_node : Install prerequisite packages] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Ensure apt keyrings directory exists] *************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-3]
ok: [k8s-node-2]

TASK [k8s_node : Add Docker's official GPG key] ********************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [k8s_node : Add Docker's repository to Apt sources] ***********************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Install containerd] *******************************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Configure containerd from template] ***************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Install prerequisite packages for Kubernetes repo] ************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [k8s_node : Download the Kubernetes public signing key] *******************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Dearmor the Kubernetes GPG key] *******************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Add Kubernetes APT repository] ********************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Clean up temporary key file] **********************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Install kubelet, kubeadm, and kubectl] ************************************************************************************************************************************************
changed: [k8s-node-3]
changed: [k8s-node-1]
changed: [k8s-node-2]

TASK [k8s_node : Pin Kubernetes package versions] ******************************************************************************************************************************************************
changed: [k8s-node-2] => (item=kubelet)
changed: [k8s-node-3] => (item=kubelet)
changed: [k8s-node-1] => (item=kubelet)
changed: [k8s-node-2] => (item=kubeadm)
changed: [k8s-node-1] => (item=kubeadm)
changed: [k8s-node-3] => (item=kubeadm)
changed: [k8s-node-2] => (item=kubectl)
changed: [k8s-node-3] => (item=kubectl)
changed: [k8s-node-1] => (item=kubectl)

TASK [k8s_node : Enable and start kubelet service] *****************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

RUNNING HANDLER [Restart containerd] *******************************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

PLAY [Play 3 - Prepare Infrastructure Server] **********************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Install dnsmasq and FRR] **********************************************************************************************************************************************************
changed: [dns-bgp-server]

TASK [infra_server : Configure dnsmasq] ****************************************************************************************************************************************************************
changed: [dns-bgp-server]

TASK [infra_server : Configure FRR daemons] ************************************************************************************************************************************************************
ok: [dns-bgp-server] => (item=zebra)
changed: [dns-bgp-server] => (item=bgpd)

TASK [infra_server : Configure frr.conf] ***************************************************************************************************************************************************************
changed: [dns-bgp-server]

TASK [infra_server : Ensure FRR config has correct permissions] ****************************************************************************************************************************************
ok: [dns-bgp-server]

RUNNING HANDLER [Restart dnsmasq] **********************************************************************************************************************************************************************
changed: [dns-bgp-server]

RUNNING HANDLER [Restart frr] **************************************************************************************************************************************************************************
changed: [dns-bgp-server]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=16 changed=11 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-1 : ok=24 changed=18 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-2 : ok=24 changed=18 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-3 : ok=24 changed=18 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
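
Note how "RUNNING HANDLER [Restart containerd]" only fires because the containerd template task reported changed; on an unchanged re-run the handler will not run. A hedged sketch of that notify/handler wiring (the template file name and path are assumptions):

# roles/k8s_node/tasks/main.yml (excerpt, sketch)
- name: Configure containerd from template
  ansible.builtin.template:
    src: ../templates/containerd-config.j2     # assumed template name/location
    dest: /etc/containerd/config.toml
  notify: Restart containerd

# roles/k8s_node/handlers/main.yml (sketch)
- name: Restart containerd
  ansible.builtin.service:
    name: containerd
    state: restarted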

Run the playbook again to verify idempotency.

ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/2_prepare_nodes.yml --ask-become-pass
BECOME password:

PLAY [Play 1 - Prepare All Nodes] **********************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [dns-bgp-server]
ok: [k8s-node-3]
ok: [k8s-node-1]

TASK [common : Wait for cloud-init to complete] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [common : Update apt cache and upgrade all packages] **********************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]
ok: [k8s-node-1]

TASK [common : Configure /etc/hosts from template] *****************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-3]
ok: [dns-bgp-server]
ok: [k8s-node-2]

TASK [common : Turn off all swap devices] **************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Comment out swap entries in /etc/fstab] *************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Load required kernel modules] ***********************************************************************************************************************************************************
ok: [k8s-node-2] => (item=overlay)
ok: [dns-bgp-server] => (item=overlay)
ok: [k8s-node-3] => (item=overlay)
ok: [k8s-node-1] => (item=overlay)
ok: [k8s-node-2] => (item=br_netfilter)
ok: [dns-bgp-server] => (item=br_netfilter)
ok: [k8s-node-3] => (item=br_netfilter)
ok: [k8s-node-1] => (item=br_netfilter)

TASK [common : Ensure kernel modules are loaded on boot] ***********************************************************************************************************************************************
ok: [k8s-node-1]
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [common : Configure sysctl parameters for Kubernetes networking] **********************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [common : Apply sysctl settings without reboot] ***************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

PLAY [Play 2 - Prepare Kubernetes Nodes] ***************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Install prerequisite packages] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Ensure apt keyrings directory exists] *************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Add Docker's official GPG key] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Add Docker's repository to Apt sources] ***********************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Install containerd] *******************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Configure containerd from template] ***************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Install prerequisite packages for Kubernetes repo] ************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Download the Kubernetes public signing key] *******************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Dearmor the Kubernetes GPG key] *******************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Add Kubernetes APT repository] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Clean up temporary key file] **********************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Install kubelet, kubeadm, and kubectl] ************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Pin Kubernetes package versions] ******************************************************************************************************************************************************
ok: [k8s-node-1] => (item=kubelet)
ok: [k8s-node-2] => (item=kubelet)
ok: [k8s-node-3] => (item=kubelet)
ok: [k8s-node-1] => (item=kubeadm)
ok: [k8s-node-2] => (item=kubeadm)
ok: [k8s-node-3] => (item=kubeadm)
ok: [k8s-node-1] => (item=kubectl)
ok: [k8s-node-2] => (item=kubectl)
ok: [k8s-node-3] => (item=kubectl)

TASK [k8s_node : Enable and start kubelet service] *****************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

PLAY [Play 3 - Prepare Infrastructure Server] **********************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Install dnsmasq and FRR] **********************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Configure dnsmasq] ****************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Configure FRR daemons] ************************************************************************************************************************************************************
ok: [dns-bgp-server] => (item=zebra)
ok: [dns-bgp-server] => (item=bgpd)

TASK [infra_server : Configure frr.conf] ***************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Ensure FRR config has correct permissions] ****************************************************************************************************************************************
ok: [dns-bgp-server]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=14 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-1 : ok=23 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-2 : ok=23 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-3 : ok=23 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
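
The re-run is idempotent apart from two tasks on each k8s node: downloading the Kubernetes signing key and cleaning up the temporary key file still report 'changed', because they fetch and then delete a temp file every time. If fully clean re-runs are wanted, one option (a sketch, not the author's code; the repo version is an assumption) is to download the key to a fixed path and guard the dearmor step with 'creates':

- name: Download the Kubernetes public signing key (idempotent variant)
  ansible.builtin.get_url:
    url: https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key   # adjust the version as needed
    dest: /etc/apt/keyrings/kubernetes-apt-keyring.asc
    mode: "0644"

- name: Dearmor the Kubernetes GPG key
  ansible.builtin.command: >
    gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
    /etc/apt/keyrings/kubernetes-apt-keyring.asc
  args:
    creates: /etc/apt/keyrings/kubernetes-apt-keyring.gpg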

7. Run the script that generates the Kubernetes cluster setup playbook

ois@ois:~/data/k8s-cilium-lab$ cd ..
ois@ois:~/data$ ./03-setup-cluster.sh
--- Kubernetes Cluster Setup Playbook Generator ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Checking for 'community.kubernetes' Ansible Collection ---
'community.kubernetes' collection is already installed.

--- Step 3: Generating Cilium BGP Template ---
✅ Created Cilium BGP config template.

--- Step 4: Generating Main Playbook ---
✅ Created main playbook: playbooks/3_setup_cluster.yml

--- Generation Complete! ---
✅ All necessary files for cluster setup have been created.

Next Step:
1. IMPORTANT: Reset your cluster nodes to ensure a clean state for this new workflow.
On each K8s VM, run: sudo kubeadm reset -f
2. Change into the project directory: cd k8s-cilium-lab
3. Run the playbook to build your Kubernetes cluster. You will be prompted for the sudo password:
ansible-playbook playbooks/3_setup_cluster.yml --ask-become-pass
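
The Cilium BGP template generated here is not displayed, and the output does not reveal which BGP CRD it uses. As a hedged sketch only, a minimal CiliumBGPPeeringPolicy that peers with the FRR box could look like this (the ASNs, labels, and peer address are assumptions; 10.75.59.86 appears to be the dns-bgp-server in the earlier SSH output):

# templates/cilium-bgp.yml.j2 (sketch)
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: lab-bgp
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
  virtualRouters:
    - localASN: 65001                     # assumed cluster ASN
      exportPodCIDR: true
      neighbors:
        - peerAddress: "10.75.59.86/32"   # FRR on the dns-bgp-server (assumed)
          peerASN: 65000                  # assumed FRR ASN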

8. Run the playbook to set up the Kubernetes cluster and Cilium

ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/3_setup_cluster.yml --ask-become-pass
BECOME password:
[DEPRECATION WARNING]: community.kubernetes.helm_repository has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core
instead. This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.helm has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead. This
feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s_info has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.

PLAY [Play 1 - Initialize and Configure Control Plane] *************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Check if cluster is already initialized] *********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Initialize the cluster] **************************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Create .kube directory for ubuntu user] **********************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Copy admin.conf to user's kube config] ***********************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Set KUBECONFIG for root user permanently] ********************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Install prerequisites for Kubernetes modules] ****************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Install Helm] ************************************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Install Cilium CLI] ******************************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Add Helm repositories] ***************************************************************************************************************************************************************************
changed: [k8s-node-1] => (item={'name': 'cilium', 'url': 'https://helm.cilium.io/'})
changed: [k8s-node-1] => (item={'name': 'isovalent', 'url': 'https://helm.isovalent.com/'})

TASK [Deploy Cilium and Hubble with Helm] **************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Expose Hubble UI service via NodePort] ***********************************************************************************************************************************************************
[WARNING]: kubernetes<24.2.0 is not supported or tested. Some features may not work.
changed: [k8s-node-1]

TASK [Wait for Cilium CRDs to become available] ********************************************************************************************************************************************************
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (20 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (19 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (18 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (17 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (16 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (15 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (14 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (13 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (12 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (11 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (10 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (9 retries left).
ok: [k8s-node-1]

TASK [Apply Cilium BGP Configuration from template] ****************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Generate a token for workers to join] ************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Store the join command for other hosts to access] ************************************************************************************************************************************************
ok: [k8s-node-1]

PLAY [Play 2 - Join Worker Nodes to the Fully Configured Cluster] **************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Check if node has already joined] ****************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Join the cluster] ********************************************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]

PLAY [Play 3 - Display Final Access Information] *******************************************************************************************************************************************************

TASK [Get Hubble UI service details] *******************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Display the final access URL] ********************************************************************************************************************************************************************
ok: [k8s-node-1] => {
"msg": "========================================================\n🚀 Your Kubernetes Lab is Ready!\n\nAccess the Hubble UI at:\nhttp://10.75.59.81:31708\n========================================================\n"
}

PLAY RECAP *********************************************************************************************************************************************************************************************
k8s-node-1 : ok=18 changed=12 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-2 : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-3 : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

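The long retry loop on "Wait for Cilium CRDs to become available" is expected on a first run: the task keeps polling until the Cilium operator has registered its CRDs, which only happens a minute or two after the Helm release is installed. As a minimal sketch of that pattern, assuming the community.kubernetes.k8s_info module reported in the deprecation warnings and the CiliumBGPPeeringPolicy CRD as the resource being waited for (the actual task in 3_setup_cluster.yml may check something else):

- name: Wait for Cilium CRDs to become available
  community.kubernetes.k8s_info:
    api_version: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    name: ciliumbgppeeringpolicies.cilium.io   # assumed CRD to wait for
  register: cilium_crd
  until: cilium_crd.resources | length > 0
  retries: 20      # matches the "20 retries left" countdown above
  delay: 15        # seconds between polls (assumed)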
9. Run the Playbook Multiple Times to Verify Idempotency

ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/3_setup_cluster.yml --ask-become-pass
BECOME password:
[DEPRECATION WARNING]: community.kubernetes.helm_repository has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core
instead. This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.helm has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead. This
feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s_info has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.

PLAY [Play 1 - Initialize and Configure Control Plane] *************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Check if cluster is already initialized] *********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Initialize the cluster] **************************************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Create .kube directory for ubuntu user] **********************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Copy admin.conf to user's kube config] ***********************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Set KUBECONFIG for root user permanently] ********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Install prerequisites for Kubernetes modules] ****************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Install Helm] ************************************************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Install Cilium CLI] ******************************************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Add Helm repositories] ***************************************************************************************************************************************************************************
ok: [k8s-node-1] => (item={'name': 'cilium', 'url': 'https://helm.cilium.io/'})
ok: [k8s-node-1] => (item={'name': 'isovalent', 'url': 'https://helm.isovalent.com/'})

TASK [Deploy Cilium and Hubble with Helm] **************************************************************************************************************************************************************
[WARNING]: The default idempotency check can fail to report changes in certain cases. Install helm diff >= 3.4.1 for better results.
ok: [k8s-node-1]

TASK [Expose Hubble UI service via NodePort] ***********************************************************************************************************************************************************
[WARNING]: kubernetes<24.2.0 is not supported or tested. Some features may not work.
ok: [k8s-node-1]

TASK [Wait for Cilium CRDs to become available] ********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Apply Cilium BGP Configuration from template] ****************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Generate a token for workers to join] ************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Store the join command for other hosts to access] ************************************************************************************************************************************************
ok: [k8s-node-1]

PLAY [Play 2 - Join Worker Nodes to the Fully Configured Cluster] **************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Check if node has already joined] ****************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Join the cluster] ********************************************************************************************************************************************************************************
skipping: [k8s-node-2]
skipping: [k8s-node-3]

PLAY [Play 3 - Display Final Access Information] *******************************************************************************************************************************************************

TASK [Get Hubble UI service details] *******************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Display the final access URL] ********************************************************************************************************************************************************************
ok: [k8s-node-1] => {
"msg": "========================================================\n🚀 Your Kubernetes Lab is Ready!\n\nAccess the Hubble UI at:\nhttp://10.75.59.81:31708\n========================================================\n"
}

PLAY RECAP *********************************************************************************************************************************************************************************************
k8s-node-1 : ok=13 changed=1 unreachable=0 failed=0 skipped=5 rescued=0 ignored=0
k8s-node-2 : ok=2 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
k8s-node-3 : ok=2 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0

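On this second run nearly every task reports ok or skipping, and the recap shows changed=1 only because the join-token generation produces a new token each time. The skips come from small guard tasks such as "Check if cluster is already initialized" and "Check if node has already joined". A minimal sketch of that guard pattern, assuming /etc/kubernetes/admin.conf is used as the marker file (the real playbook may key off something else):

- name: Check if cluster is already initialized
  ansible.builtin.stat:
    path: /etc/kubernetes/admin.conf        # assumed marker on the control plane
  register: kubeadm_marker

- name: Initialize the cluster
  ansible.builtin.command: kubeadm init --pod-network-cidr=172.16.0.0/20
  when: not kubeadm_marker.stat.exists      # skipped on every rerun

Workers can use the same trick with /etc/kubernetes/kubelet.conf before running kubeadm join.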
10. Deploy the Demo App

ois@ois:~/data/k8s-cilium-lab$ cd ../
ois@ois:~/data$ ./04-deploy-star-wars.sh
--- Star Wars Demo Script Deployer ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Generating the script template ---
✅ Created template: templates/deploy-star-wars.sh.j2

--- Step 3: Generating the deployment playbook ---
✅ Created playbook: playbooks/4_deploy_app.yml

--- Generation Complete! ---
✅ All necessary files for deploying the application script have been created.

Next Steps:
1. Change into the project directory: cd k8s-cilium-lab
2. Run the playbook to copy the script to your control plane node:
ansible-playbook playbooks/4_deploy_app.yml --ask-become-pass
3. SSH into the control plane and run the script:
ssh ubuntu@k8s-node-1
sudo /root/deploy-star-wars.sh
ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/4_deploy_app.yml --ask-become-pass
BECOME password:

PLAY [Deploy Star Wars Demo Script] ********************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Copy the Star Wars deployment script to the control plane] ***************************************************************************************************************************************
changed: [k8s-node-1]

PLAY RECAP *********************************************************************************************************************************************************************************************
k8s-node-1 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

ois@ois:~/data/k8s-cilium-lab$
ois@ois:~/data/k8s-cilium-lab$ ssh ubuntu@10.75.59.81
Welcome to Ubuntu 24.04.2 LTS (GNU/Linux 6.8.0-63-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/pro

System information as of Mon Aug 4 05:00:52 PM CST 2025

System load: 0.26 Processes: 195
Usage of /: 33.4% of 18.33GB Users logged in: 0
Memory usage: 16% IPv4 address for enp1s0: 10.75.59.81
Swap usage: 0%


Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status


*** System restart required ***
Last login: Mon Aug 4 16:55:53 2025 from 10.75.59.129
ubuntu@k8s-node-1:~$ sudo su
[sudo] password for ubuntu:
root@k8s-node-1:/home/ubuntu# cd
root@k8s-node-1:~# cilium status
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: OK
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled

DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet cilium-envoy Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-envoy Running: 3
cilium-operator Running: 2
clustermesh-apiserver
hubble-relay Running: 1
hubble-ui Running: 1
Cluster Pods: 8/8 managed by Cilium
Helm chart version: 1.17.6
Image versions cilium quay.io/isovalent/cilium:v1.17.6-cee.1@sha256:2d01daf4f25f7d644889b49ca856e1a4269981fc963e50bd3962665b41b6adb3: 3
cilium-envoy quay.io/isovalent/cilium-envoy:v1.17.6-cee.1@sha256:318eff387835ca2717baab42a84f35a83a5f9e7d519253df87269f80b9ff0171: 3
cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265: 2
hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e: 1
hubble-ui quay.io/isovalent/hubble-ui-backend:v1.17.6-cee.1@sha256:a034b7e98e6ea796ed26df8f4e71f83fc16465a19d166eff67a03b822c0bfa15: 1
hubble-ui quay.io/isovalent/hubble-ui:v1.17.6-cee.1@sha256:9e37c1296b802830834cc87342a9182ccbb71ffebb711971e849221bd9d59392: 1
root@k8s-node-1:~#
root@k8s-node-1:~# ./deploy-star-wars.sh
🚀 Starting Star Wars Demo Application Deployment...
=================================================

--- Step 1: Ensuring 'star-wars' namespace exists ---

▶️ Running command:
kubectl create namespace star-wars
namespace/star-wars created
✅ Namespace 'star-wars' created.

--- Step 2: Applying application manifest from GitHub ---

▶️ Running command:
kubectl apply -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
service/deathstar created
deployment.apps/deathstar created
pod/tiefighter created
pod/xwing created

--- Step 3: Waiting for all application pods to be ready ---
(This may take a minute as images are pulled...)

▶️ Running command:
kubectl wait --for=condition=ready pod --all -n star-wars --timeout=120s
pod/deathstar-86f85ffb4d-8xbb4 condition met
pod/deathstar-86f85ffb4d-dwfx5 condition met
pod/tiefighter condition met
pod/xwing condition met
✅ All pods are running and ready.

--- Step 4: Displaying pod status ---

▶️ Running command:
kubectl -n star-wars get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
deathstar-86f85ffb4d-8xbb4 1/1 Running 0 24s 172.16.2.92 k8s-node-3 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
deathstar-86f85ffb4d-dwfx5 1/1 Running 0 24s 172.16.1.198 k8s-node-2 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
tiefighter 1/1 Running 0 24s 172.16.2.180 k8s-node-3 <none> <none> app.kubernetes.io/name=tiefighter,class=tiefighter,org=empire
xwing 1/1 Running 0 24s 172.16.2.156 k8s-node-3 <none> <none> app.kubernetes.io/name=xwing,class=xwing,org=alliance

--- Step 5: Exposing 'deathstar' service via NodePort ---

▶️ Running command:
kubectl -n star-wars patch service deathstar -p {"spec":{"type":"NodePort"}}
service/deathstar patched
✅ Service 'deathstar' patched to NodePort.

--- Step 6: Testing connectivity from client pods ---
(A 'Ship landed' message indicates success)

▶️ Running command:
kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed

▶️ Running command:
kubectl -n star-wars exec xwing -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed

--- Step 7: Displaying external access information ---

=================================================
🎉 Star Wars Demo Application Deployed Successfully!
You can access the Deathstar service from outside the cluster at:
curl -XPOST http://10.75.59.81:30719/v1/request-landing
=================================================
root@k8s-node-1:~# curl -XPOST http://10.75.59.81:30719/v1/request-landing
Ship landed
root@k8s-node-1:~# exit
exit
ubuntu@k8s-node-1:~$ curl -XPOST http://10.75.59.81:30719/v1/request-landing
Ship landed
root@k8s-node-1:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-node-1 Ready control-plane 21m v1.33.3 10.75.59.81 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27
k8s-node-2 Ready <none> 19m v1.33.3 10.75.59.82 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27
k8s-node-3 Ready <none> 19m v1.33.3 10.75.59.83 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27
root@k8s-node-1:~# kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system cilium-45qr7 1/1 Running 0 19m 10.75.59.83 k8s-node-3 <none> <none>
kube-system cilium-9q5s7 1/1 Running 0 19m 10.75.59.82 k8s-node-2 <none> <none>
kube-system cilium-envoy-72jj7 1/1 Running 0 19m 10.75.59.82 k8s-node-2 <none> <none>
kube-system cilium-envoy-d8hb4 1/1 Running 0 20m 10.75.59.81 k8s-node-1 <none> <none>
kube-system cilium-envoy-vvsms 1/1 Running 0 19m 10.75.59.83 k8s-node-3 <none> <none>
kube-system cilium-operator-d67c55dc8-lfpjb 1/1 Running 0 20m 10.75.59.81 k8s-node-1 <none> <none>
kube-system cilium-operator-d67c55dc8-rpfv8 1/1 Running 0 20m 10.75.59.82 k8s-node-2 <none> <none>
kube-system cilium-xbjqm 1/1 Running 0 20m 10.75.59.81 k8s-node-1 <none> <none>
kube-system coredns-674b8bbfcf-n9wgt 1/1 Running 0 21m 172.16.0.105 k8s-node-1 <none> <none>
kube-system coredns-674b8bbfcf-ntssg 1/1 Running 0 21m 172.16.0.161 k8s-node-1 <none> <none>
kube-system etcd-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
kube-system hubble-relay-cfb755899-46pzc 1/1 Running 0 20m 172.16.1.115 k8s-node-2 <none> <none>
kube-system hubble-ui-68c64498c4-p2nq4 2/2 Running 0 20m 172.16.1.105 k8s-node-2 <none> <none>
kube-system kube-apiserver-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
kube-system kube-controller-manager-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
kube-system kube-scheduler-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
star-wars deathstar-86f85ffb4d-8xbb4 1/1 Running 0 12m 172.16.2.92 k8s-node-3 <none> <none>
star-wars deathstar-86f85ffb4d-dwfx5 1/1 Running 0 12m 172.16.1.198 k8s-node-2 <none> <none>
star-wars tiefighter 1/1 Running 0 12m 172.16.2.180 k8s-node-3 <none> <none>
star-wars xwing 1/1 Running 0 12m 172.16.2.156 k8s-node-3 <none> <none>
root@k8s-node-1:~# kubectl get deployment -A -o wide
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
kube-system cilium-operator 2/2 2 2 21m cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265 io.cilium/app=operator,name=cilium-operator
kube-system coredns 2/2 2 2 22m coredns registry.k8s.io/coredns/coredns:v1.12.0 k8s-app=kube-dns
kube-system hubble-relay 1/1 1 1 21m hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e k8s-app=hubble-relay
kube-system hubble-ui 1/1 1 1 21m frontend,backend quay.io/isovalent/hubble-ui:v1.17.6-cee.1@sha256:9e37c1296b802830834cc87342a9182ccbb71ffebb711971e849221bd9d59392,quay.io/isovalent/hubble-ui-backend:v1.17.6-cee.1@sha256:a034b7e98e6ea796ed26df8f4e71f83fc16465a19d166eff67a03b822c0bfa15 k8s-app=hubble-ui
star-wars deathstar 2/2 2 2 12m deathstar quay.io/cilium/starwars@sha256:896dc536ec505778c03efedb73c3b7b83c8de11e74264c8c35291ff6d5fe8ada class=deathstar,org=empire

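Both the empire tiefighter and the alliance xwing are currently allowed to land because no network policy is applied yet. The upstream Cilium Star Wars demo usually continues with an L3/L4 CiliumNetworkPolicy that only admits org=empire pods to the deathstar on TCP/80; applied to the star-wars namespace it would leave the tiefighter request working while the xwing request times out. This step is not part of the generated deploy-star-wars.sh, but the standard upstream policy looks like this:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: rule1
  namespace: star-wars
spec:
  description: "L3-L4 policy to restrict deathstar access to empire ships only"
  endpointSelector:
    matchLabels:
      org: empire
      class: deathstar
  ingress:
    - fromEndpoints:
        - matchLabels:
            org: empire
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP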
11. Exploring System Information

11.1 Kubernetes and Cilium Information

root@k8s-node-1:~# helm get values cilium -n kube-system
USER-SUPPLIED VALUES:
autoDirectNodeRoutes: true
bgpControlPlane:
  announce:
    podCIDR: true
  enabled: true
bpf:
  lb:
    externalClusterIP: true
    sock: true
  masquerade: true
enableIPv4Masquerade: true
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
ipam:
  mode: kubernetes
ipv4NativeRoutingCIDR: 172.16.0.0/20
k8s:
  requireIPv4PodCIDR: true
k8sServiceHost: 10.75.59.81
k8sServicePort: 6443
kubeProxyReplacement: true
routingMode: native
root@k8s-node-1:~#
root@k8s-node-1:~# kubectl -n kube-system get configmap cilium-config -o yaml
apiVersion: v1
data:
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: "true"
bgp-secrets-namespace: kube-system
bpf-distributed-lru: "false"
bpf-events-drop-enabled: "true"
bpf-events-policy-verdict-enabled: "true"
bpf-events-trace-enabled: "true"
bpf-lb-acceleration: disabled
bpf-lb-algorithm-annotation: "false"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-lb-mode-annotation: "false"
bpf-lb-sock: "false"
bpf-lb-source-range-all-types: "false"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: "0"
cluster-name: default
clustermesh-enable-endpoint-sync: "false"
clustermesh-enable-mcs-api: "false"
cni-exclusive: "true"
cni-log-file: /var/run/cilium/cilium-cni.log
custom-cni-conf: "false"
datapath-mode: veth
debug: "false"
debug-verbose: ""
default-lb-service-ipam: lbipam
direct-routing-skip-unreachable: "false"
dnsproxy-enable-transparent-mode: "true"
dnsproxy-socket-linger-timeout: "10"
egress-gateway-ha-reconciliation-trigger-interval: 1s
egress-gateway-reconciliation-trigger-interval: 1s
enable-auto-protect-node-port-range: "true"
enable-bfd: "false"
enable-bgp-control-plane: "true"
enable-bgp-control-plane-status-report: "true"
enable-bpf-clock-probe: "false"
enable-bpf-masquerade: "true"
enable-cluster-aware-addressing: "false"
enable-egress-gateway-ha-socket-termination: "false"
enable-endpoint-health-checking: "true"
enable-endpoint-lockdown-on-policy-overflow: "false"
enable-experimental-lb: "false"
enable-health-check-loadbalancer-ip: "false"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-hubble: "true"
enable-inter-cluster-snat: "false"
enable-internal-traffic-policy: "true"
enable-ipv4: "true"
enable-ipv4-big-tcp: "false"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-big-tcp: "false"
enable-ipv6-masquerade: "true"
enable-k8s-networkpolicy: "true"
enable-k8s-terminating-endpoint: "true"
enable-l2-neigh-discovery: "true"
enable-l7-proxy: "true"
enable-lb-ipam: "true"
enable-local-redirect-policy: "false"
enable-masquerade-to-route-source: "false"
enable-metrics: "true"
enable-node-selector-labels: "false"
enable-non-default-deny-policies: "false"
enable-phantom-services: "false"
enable-policy: default
enable-policy-secrets-sync: "true"
enable-runtime-device-detection: "true"
enable-sctp: "false"
enable-source-ip-verification: "true"
enable-srv6: "false"
enable-svc-source-range-check: "true"
enable-tcx: "true"
enable-vtep: "false"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
envoy-access-log-buffer-size: "4096"
envoy-base-id: "0"
envoy-keep-cap-netbindservice: "false"
export-aggregation: ""
export-aggregation-renew-ttl: "true"
export-aggregation-state-filter: ""
export-file-path: ""
external-envoy-proxy: "true"
feature-gates-approved: ""
feature-gates-strict: "true"
health-check-icmp-failure-threshold: "3"
http-retry-count: "3"
hubble-disable-tls: "false"
hubble-export-file-max-backups: "5"
hubble-export-file-max-size-mb: "10"
hubble-listen-address: :4244
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
identity-gc-interval: 15m0s
identity-heartbeat-timeout: 30m0s
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
ipam-cilium-node-update-rate: 15s
iptables-random-fully: "false"
ipv4-native-routing-cidr: 172.16.0.0/20
k8s-require-ipv4-pod-cidr: "true"
k8s-require-ipv6-pod-cidr: "false"
kube-proxy-replacement: "true"
kube-proxy-replacement-healthz-bind-address: ""
max-connected-clusters: "255"
mesh-auth-enabled: "true"
mesh-auth-gc-interval: 5m0s
mesh-auth-queue-size: "1024"
mesh-auth-rotated-identities-queue-size: "1024"
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
multicast-enabled: "false"
nat-map-stats-entries: "32"
nat-map-stats-interval: 30s
node-port-bind-protection: "true"
nodeport-addresses: ""
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
operator-prometheus-serve-addr: :9963
policy-cidr-match-mode: ""
policy-secrets-namespace: cilium-secrets
policy-secrets-only-from-secrets-namespace: "true"
preallocate-bpf-maps: "false"
procfs: /host/proc
proxy-connect-timeout: "2"
proxy-idle-timeout-seconds: "60"
proxy-initial-fetch-timeout: "30"
proxy-max-concurrent-retries: "128"
proxy-max-connection-duration-seconds: "0"
proxy-max-requests-per-connection: "0"
proxy-xff-num-trusted-hops-egress: "0"
proxy-xff-num-trusted-hops-ingress: "0"
remove-cilium-node-taints: "true"
routing-mode: native
service-no-backend-response: reject
set-cilium-is-up-condition: "true"
set-cilium-node-taints: "true"
srv6-encap-mode: reduced
srv6-locator-pool-enabled: "false"
synchronize-k8s-nodes: "true"
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: "true"
tofqdns-endpoint-max-ip-per-hostname: "1000"
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: "10000"
tofqdns-proxy-response-max-delay: 100ms
tunnel-protocol: vxlan
tunnel-source-port-range: 0-0
unmanaged-pod-watcher-interval: "15"
vtep-cidr: ""
vtep-endpoint: ""
vtep-mac: ""
vtep-mask: ""
write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
kind: ConfigMap
metadata:
annotations:
meta.helm.sh/release-name: cilium
meta.helm.sh/release-namespace: kube-system
creationTimestamp: "2025-08-04T08:52:46Z"
labels:
app.kubernetes.io/managed-by: Helm
name: cilium-config
namespace: kube-system
resourceVersion: "436"
uid: 614c6b36-9112-4fd0-bebf-e92741fa28da
root@k8s-node-1:~# kubectl exec -n kube-system -it cilium-45qr7 -- /bin/sh
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
/home/cilium # cilium status
KVStore: Disabled
Kubernetes: Ok 1.33 (v1.33.3) [linux/amd64]
Kubernetes APIs: ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "isovalent/v1alpha1::IsovalentClusterwideNetworkPolicy", "isovalent/v1alpha1::IsovalentNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: True [enp1s0 10.75.59.83 fe80::5054:ff:fead:b814 (Direct Routing)]
Host firewall: Disabled
SRv6: Disabled
CNI Chaining: none
CNI Config file: successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium: Ok 1.17.6-cee.1 (v1.17.6-cee.1-a33b0b85)
NodeMonitor: Listening for events on 4 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 5/254 allocated from 172.16.2.0/24,
IPv4 BIG TCP: Disabled
IPv6 BIG TCP: Disabled
BandwidthManager: Disabled
Routing: Network: Native Host: BPF
Attach Mode: TCX
Device Mode: veth
Masquerading: BPF [enp1s0] 172.16.0.0/20 [IPv4: Enabled, IPv6: Disabled]
Controller Status: 36/36 healthy
Proxy Status: OK, ip 172.16.2.88, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range: min 256, max 65535
Hubble: Ok Current/Max Flows: 2522/4095 (61.59%), Flows/s: 1.51 Metrics: Disabled
Encryption: Disabled
Cluster health: 3/3 reachable (2025-08-04T09:21:04Z)
Name IP Node Endpoints
Modules Health: Stopped(0) Degraded(0) OK(68)
/home/cilium # cilium status --verbose
KVStore: Disabled
Kubernetes: Ok 1.33 (v1.33.3) [linux/amd64]
Kubernetes APIs: ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "isovalent/v1alpha1::IsovalentClusterwideNetworkPolicy", "isovalent/v1alpha1::IsovalentNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: True [enp1s0 10.75.59.83 fe80::5054:ff:fead:b814 (Direct Routing)]
Host firewall: Disabled
SRv6: Disabled
CNI Chaining: none
CNI Config file: successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium: Ok 1.17.6-cee.1 (v1.17.6-cee.1-a33b0b85)
NodeMonitor: Listening for events on 4 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 5/254 allocated from 172.16.2.0/24,
Allocated addresses:
172.16.2.156 (star-wars/xwing)
172.16.2.180 (star-wars/tiefighter)
172.16.2.35 (health)
172.16.2.88 (router)
172.16.2.92 (star-wars/deathstar-86f85ffb4d-8xbb4)
IPv4 BIG TCP: Disabled
IPv6 BIG TCP: Disabled
BandwidthManager: Disabled
Routing: Network: Native Host: BPF
Attach Mode: TCX
Device Mode: veth
Masquerading: BPF [enp1s0] 172.16.0.0/20 [IPv4: Enabled, IPv6: Disabled]
Clock Source for BPF: ktime
Controller Status: 36/36 healthy
Name Last success Last error Count Message
cilium-health-ep 1m0s ago never 0 no error
ct-map-pressure 1s ago never 0 no error
daemon-validate-config 35s ago never 0 no error
dns-garbage-collector-job 3s ago never 0 no error
endpoint-1375-regeneration-recovery never never 0 no error
endpoint-196-regeneration-recovery never never 0 no error
endpoint-2338-regeneration-recovery never never 0 no error
endpoint-523-regeneration-recovery never never 0 no error
endpoint-640-regeneration-recovery never never 0 no error
endpoint-gc 2m3s ago never 0 no error
endpoint-periodic-regeneration 1m3s ago never 0 no error
ep-bpf-prog-watchdog 1s ago never 0 no error
ipcache-inject-labels 1s ago never 0 no error
k8s-heartbeat 3s ago never 0 no error
link-cache 3s ago never 0 no error
node-neighbor-link-updater 1s ago never 0 no error
proxy-ports-checkpoint 27m1s ago never 0 no error
resolve-identity-1375 2m0s ago never 0 no error
resolve-identity-196 1m41s ago never 0 no error
resolve-identity-2338 1m41s ago never 0 no error
resolve-identity-523 2m1s ago never 0 no error
resolve-identity-640 1m41s ago never 0 no error
resolve-labels-star-wars/deathstar-86f85ffb4d-8xbb4 21m41s ago never 0 no error
resolve-labels-star-wars/tiefighter 21m41s ago never 0 no error
resolve-labels-star-wars/xwing 21m41s ago never 0 no error
sync-lb-maps-with-k8s-services 27m1s ago never 0 no error
sync-policymap-1375 11m56s ago never 0 no error
sync-policymap-196 6m41s ago never 0 no error
sync-policymap-2338 6m41s ago never 0 no error
sync-policymap-523 11m57s ago never 0 no error
sync-policymap-640 6m41s ago never 0 no error
sync-to-k8s-ciliumendpoint (196) 1s ago never 0 no error
sync-to-k8s-ciliumendpoint (2338) 1s ago never 0 no error
sync-to-k8s-ciliumendpoint (640) 1s ago never 0 no error
sync-utime 1s ago never 0 no error
write-cni-file 27m3s ago never 0 no error
Proxy Status: OK, ip 172.16.2.88, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range: min 256, max 65535
Hubble: Ok Current/Max Flows: 2570/4095 (62.76%), Flows/s: 1.51 Metrics: Disabled
KubeProxyReplacement Details:
Status: True
Socket LB: Enabled
Socket LB Tracing: Enabled
Socket LB Coverage: Full
Devices: enp1s0 10.75.59.83 fe80::5054:ff:fead:b814 (Direct Routing)
Mode: SNAT
Backend Selection: Random
Session Affinity: Enabled
Graceful Termination: Enabled
NAT46/64 Support: Disabled
XDP Acceleration: Disabled
Services:
- ClusterIP: Enabled
- NodePort: Enabled (Range: 30000-32767)
- LoadBalancer: Enabled
- externalIPs: Enabled
- HostPort: Enabled
Annotations:
- service.cilium.io/node
- service.cilium.io/src-ranges-policy
- service.cilium.io/type
BPF Maps: dynamic sizing: on (ratio: 0.002500)
Name Size
Auth 524288
Non-TCP connection tracking 65536
TCP connection tracking 131072
Endpoint policy 65535
IP cache 512000
IPv4 masquerading agent 16384
IPv6 masquerading agent 16384
IPv4 fragmentation 8192
IPv4 service 65536
IPv6 service 65536
IPv4 service backend 65536
IPv6 service backend 65536
IPv4 service reverse NAT 65536
IPv6 service reverse NAT 65536
Metrics 1024
Ratelimit metrics 64
NAT 131072
Neighbor table 131072
Global policy 16384
Session affinity 65536
Sock reverse NAT 65536
Encryption: Disabled
Cluster health: 3/3 reachable (2025-08-04T09:21:04Z)
Name IP Node Endpoints
k8s-node-3 (localhost):
Host connectivity to 10.75.59.83:
ICMP to stack: OK, RTT=451.197µs
HTTP to agent: OK, RTT=625.346µs
Endpoint connectivity to 172.16.2.35:
ICMP to stack: OK, RTT=376.939µs
HTTP to agent: OK, RTT=882.754µs
k8s-node-1:
Host connectivity to 10.75.59.81:
ICMP to stack: OK, RTT=582.892µs
HTTP to agent: OK, RTT=1.042743ms
Endpoint connectivity to 172.16.0.116:
ICMP to stack: OK, RTT=703.331µs
HTTP to agent: OK, RTT=1.533329ms
k8s-node-2:
Host connectivity to 10.75.59.82:
ICMP to stack: OK, RTT=632.658µs
HTTP to agent: OK, RTT=1.156736ms
Endpoint connectivity to 172.16.1.173:
ICMP to stack: OK, RTT=636.518µs
HTTP to agent: OK, RTT=1.37198ms
Modules Health:
enterprise-agent
├── agent
│ ├── controlplane
│ │ ├── auth
│ │ │ ├── observer-job-auth-gc-identity-events [OK] OK (1.812µs) [5] (21m, x1)
│ │ │ ├── observer-job-auth-request-authentication [OK] Primed (27m, x1)
│ │ │ └── timer-job-auth-gc-cleanup [OK] OK (15.847µs) (2m3s, x1)
│ │ ├── bgp-control-plane
│ │ │ ├── job-bgp-controller [OK] Running (27m, x1)
│ │ │ ├── job-bgp-crd-status-initialize [OK] Running (27m, x1)
│ │ │ ├── job-bgp-crd-status-update-job [OK] Running (27m, x1)
│ │ │ ├── job-bgp-policy-observer [OK] Running (27m, x1)
│ │ │ ├── job-bgp-reconcile-error-statedb-tracker [OK] Running (27m, x1)
│ │ │ ├── job-bgp-state-observer [OK] Running (27m, x1)
│ │ │ ├── job-bgpcp-resource-store-events [OK] Running (27m, x5)
│ │ │ └── job-diffstore-events [OK] Running (27m, x2)
│ │ ├── ciliumenvoyconfig
│ │ │ └── experimental
│ │ │ ├── job-reconcile [OK] OK, 0 object(s) (27m, x3)
│ │ │ └── job-refresh [OK] Next refresh in 30m0s (27m, x1)
│ │ ├── daemon
│ │ │ ├── [OK] daemon-validate-config (35s, x27)
│ │ │ ├── ep-bpf-prog-watchdog
│ │ │ │ └── ep-bpf-prog-watchdog [OK] ep-bpf-prog-watchdog (1s, x55)
│ │ │ └── job-sync-hostips [OK] Synchronized (1s, x29)
│ │ ├── dynamic-lifecycle-manager
│ │ │ ├── job-reconcile [OK] OK, 0 object(s) (27m, x3)
│ │ │ └── job-refresh [OK] Next refresh in 30m0s (27m, x1)
│ │ ├── enabled-features
│ │ │ └── job-update-config-metric [OK] Waiting for agent config (27m, x1)
│ │ ├── endpoint-manager
│ │ │ ├── cilium-endpoint-1375 (/)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x15)
│ │ │ │ └── policymap-sync [OK] sync-policymap-1375 (11m, x2)
│ │ │ ├── cilium-endpoint-196 (star-wars/xwing)
│ │ │ │ ├── cep-k8s-sync [OK] sync-to-k8s-ciliumendpoint (196) (1s, x132)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x12)
│ │ │ │ └── policymap-sync [OK] sync-policymap-196 (6m41s, x2)
│ │ │ ├── cilium-endpoint-2338 (star-wars/tiefighter)
│ │ │ │ ├── cep-k8s-sync [OK] sync-to-k8s-ciliumendpoint (2338) (1s, x132)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x12)
│ │ │ │ └── policymap-sync [OK] sync-policymap-2338 (6m41s, x2)
│ │ │ ├── cilium-endpoint-523 (/)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x16)
│ │ │ │ └── policymap-sync [OK] sync-policymap-523 (11m, x2)
│ │ │ ├── cilium-endpoint-640 (star-wars/deathstar-86f85ffb4d-8xbb4)
│ │ │ │ ├── cep-k8s-sync [OK] sync-to-k8s-ciliumendpoint (640) (1s, x132)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x12)
│ │ │ │ └── policymap-sync [OK] sync-policymap-640 (6m41s, x2)
│ │ │ └── endpoint-gc [OK] endpoint-gc (2m3s, x6)
│ │ ├── envoy-proxy
│ │ │ ├── observer-job-k8s-secrets-resource-events-cilium-secrets [OK] Primed (27m, x1)
│ │ │ └── timer-job-version-check [OK] OK (13.805158ms) (2m1s, x1)
│ │ ├── hubble
│ │ │ └── job-hubble [OK] Running (27m, x1)
│ │ ├── identity
│ │ │ └── timer-job-id-alloc-update-policy-maps [OK] OK (103.031µs) (21m, x1)
│ │ ├── l2-announcer
│ │ │ └── job-l2-announcer-lease-gc [OK] Running (27m, x1)
│ │ ├── nat-stats
│ │ │ └── timer-job-nat-stats [OK] OK (2.520538ms) (1s, x1)
│ │ ├── node-manager
│ │ │ ├── background-sync [OK] Node validation successful (66s, x19)
│ │ │ ├── neighbor-link-updater
│ │ │ │ ├── k8s-node-1 [OK] Node neighbor link update successful (61s, x20)
│ │ │ │ └── k8s-node-2 [OK] Node neighbor link update successful (31s, x20)
│ │ │ ├── node-checkpoint-writer [OK] node checkpoint written (25m, x3)
│ │ │ └── nodes-add [OK] Node adds successful (27m, x3)
│ │ ├── policy
│ │ │ └── observer-job-policy-importer [OK] Primed (27m, x1)
│ │ ├── service-manager
│ │ │ ├── job-health-check-event-watcher [OK] Waiting for health check events (27m, x1)
│ │ │ └── job-service-reconciler [OK] 1 NodePort frontend addresses (27m, x1)
│ │ ├── service-resolver
│ │ │ └── job-service-reloader-initializer [OK] Running (27m, x1)
│ │ └── stale-endpoint-cleanup
│ │ └── job-endpoint-cleanup [OK] Running (27m, x1)
│ ├── datapath
│ │ ├── agent-liveness-updater
│ │ │ └── timer-job-agent-liveness-updater [OK] OK (82.885µs) (0s, x1)
│ │ ├── iptables
│ │ │ ├── ipset
│ │ │ │ ├── job-ipset-init-finalizer [OK] Running (27m, x1)
│ │ │ │ ├── job-reconcile [OK] OK, 0 object(s) (27m, x2)
│ │ │ │ └── job-refresh [OK] Next refresh in 30m0s (27m, x1)
│ │ │ └── job-iptables-reconciliation-loop [OK] iptables rules full reconciliation completed (27m, x1)
│ │ ├── l2-responder
│ │ │ └── job-l2-responder-reconciler [OK] Running (27m, x1)
│ │ ├── maps
│ │ │ └── bwmap
│ │ │ └── timer-job-pressure-metric-throttle [OK] OK (18.336µs) (1s, x1)
│ │ ├── mtu
│ │ │ ├── job-endpoint-mtu-updater [OK] Endpoint MTU updated (27m, x1)
│ │ │ └── job-mtu-updater [OK] MTU updated (1500) (27m, x1)
│ │ ├── node-address
│ │ │ └── job-node-address-update [OK] 172.16.2.88 (primary), fe80::7019:6fff:febf:e8a7 (primary) (27m, x1)
│ │ ├── orchestrator
│ │ │ └── job-reinitialize [OK] OK (26m, x2)
│ │ └── sysctl
│ │ ├── job-reconcile [OK] OK, 16 object(s) (6m56s, x35)
│ │ └── job-refresh [OK] Next refresh in 9m53.185634443s (6m56s, x1)
│ └── infra
│ ├── k8s-synced-crdsync
│ │ └── job-sync-crds [OK] Running (27m, x1)
│ ├── metrics
│ │ ├── job-collect [OK] Sampled 24 metrics in 4.183045ms, next collection at 2025-08-04 09:26:00.386029804 +0000 UTC m=+1803.177514891 (2m1s, x1)
│ │ └── timer-job-cleanup [OK] Primed (27m, x1)
│ └── shell
│ └── job-listener [OK] Listening on /var/run/cilium/shell.sock (27m, x1)
└── enterprise-controlplane
└── cec-ingress-policy
└── timer-job-enterprise-endpoint-policy-periodic-regeneration [OK] OK (21.899µs) (1s, x1)

/home/cilium # cilium service list
ID Frontend Service Type Backend
1 172.16.32.1:443/TCP ClusterIP 1 => 10.75.59.81:6443/TCP (active)
2 172.16.42.186:443/TCP ClusterIP 1 => 10.75.59.83:4244/TCP (active)
3 172.16.42.14:80/TCP ClusterIP 1 => 172.16.1.115:4245/TCP (active)
4 172.16.38.134:80/TCP ClusterIP 1 => 172.16.1.105:8081/TCP (active)
5 10.75.59.83:31708/TCP NodePort 1 => 172.16.1.105:8081/TCP (active)
6 0.0.0.0:31708/TCP NodePort 1 => 172.16.1.105:8081/TCP (active)
7 172.16.32.10:53/TCP ClusterIP 1 => 172.16.0.161:53/TCP (active)
2 => 172.16.0.105:53/TCP (active)
8 172.16.32.10:9153/TCP ClusterIP 1 => 172.16.0.161:9153/TCP (active)
2 => 172.16.0.105:9153/TCP (active)
9 172.16.32.10:53/UDP ClusterIP 1 => 172.16.0.161:53/UDP (active)
2 => 172.16.0.105:53/UDP (active)
10 172.16.34.108:80/TCP ClusterIP 1 => 172.16.1.198:80/TCP (active)
2 => 172.16.2.92:80/TCP (active)
11 10.75.59.83:30719/TCP NodePort 1 => 172.16.1.198:80/TCP (active)
2 => 172.16.2.92:80/TCP (active)
12 0.0.0.0:30719/TCP NodePort 1 => 172.16.1.198:80/TCP (active)
2 => 172.16.2.92:80/TCP (active)
/home/cilium # cilium bpf nat list
TCP IN 10.75.59.82:4240 -> 10.75.59.83:35986 XLATE_DST 10.75.59.83:35986 Created=1076sec ago NeedsCT=1
ICMP IN 10.75.59.81:0 -> 10.75.59.83:63865 XLATE_DST 10.75.59.83:63865 Created=156sec ago NeedsCT=1
TCP IN 10.75.59.81:4240 -> 10.75.59.83:58402 XLATE_DST 10.75.59.83:58402 Created=56sec ago NeedsCT=1
ICMP OUT 10.75.59.83:47633 -> 10.75.59.81:0 XLATE_SRC 10.75.59.83:47633 Created=56sec ago NeedsCT=1
ICMP OUT 10.75.59.83:56308 -> 172.16.0.116:0 XLATE_SRC 10.75.59.83:56308 Created=206sec ago NeedsCT=1
ICMP OUT 10.75.59.83:40570 -> 10.75.59.82:0 XLATE_SRC 10.75.59.83:40570 Created=226sec ago NeedsCT=1
TCP IN 172.16.0.116:4240 -> 10.75.59.83:33274 XLATE_DST 10.75.59.83:33274 Created=706sec ago NeedsCT=1
ICMP IN 172.16.0.116:0 -> 10.75.59.83:56308 XLATE_DST 10.75.59.83:56308 Created=206sec ago NeedsCT=1
TCP OUT 10.75.59.83:37066 -> 10.75.59.81:6443 XLATE_SRC 10.75.59.83:37066 Created=1655sec ago NeedsCT=1
TCP OUT 10.75.59.83:44064 -> 172.16.1.173:4240 XLATE_SRC 10.75.59.83:44064 Created=1066sec ago NeedsCT=1
TCP OUT 10.75.59.83:46184 -> 10.75.59.81:6443 XLATE_SRC 10.75.59.83:46184 Created=1651sec ago NeedsCT=1
TCP OUT 10.75.59.83:43981 -> 10.75.59.86:179 XLATE_SRC 10.75.59.83:43981 Created=1648sec ago NeedsCT=1
TCP OUT 10.75.59.83:57802 -> 10.75.59.81:4240 XLATE_SRC 10.75.59.83:57802 Created=716sec ago NeedsCT=1
ICMP IN 10.75.59.82:0 -> 10.75.59.83:43531 XLATE_DST 10.75.59.83:43531 Created=36sec ago NeedsCT=1
TCP IN 10.75.59.86:179 -> 10.75.59.83:43981 XLATE_DST 10.75.59.83:43981 Created=1648sec ago NeedsCT=1
TCP OUT 10.75.59.83:58402 -> 10.75.59.81:4240 XLATE_SRC 10.75.59.83:58402 Created=56sec ago NeedsCT=1
ICMP OUT 10.75.59.83:43619 -> 10.75.59.82:0 XLATE_SRC 10.75.59.83:43619 Created=176sec ago NeedsCT=1
ICMP OUT 10.75.59.83:40760 -> 10.75.59.81:0 XLATE_SRC 10.75.59.83:40760 Created=216sec ago NeedsCT=1
TCP IN 142.250.199.100:443 -> 10.75.59.83:40732 XLATE_DST 172.16.2.180:40732 Created=205sec ago NeedsCT=0
TCP OUT 10.75.59.83:34202 -> 172.16.0.116:4240 XLATE_SRC 10.75.59.83:34202 Created=46sec ago NeedsCT=1
ICMP IN 10.75.59.81:0 -> 10.75.59.83:47633 XLATE_DST 10.75.59.83:47633 Created=56sec ago NeedsCT=1
TCP OUT 10.75.59.83:33274 -> 172.16.0.116:4240 XLATE_SRC 10.75.59.83:33274 Created=706sec ago NeedsCT=1
ICMP OUT 10.75.59.83:43531 -> 10.75.59.82:0 XLATE_SRC 10.75.59.83:43531 Created=36sec ago NeedsCT=1
ICMP IN 10.75.59.82:0 -> 10.75.59.83:43619 XLATE_DST 10.75.59.83:43619 Created=176sec ago NeedsCT=1
TCP IN 10.75.59.81:6443 -> 10.75.59.83:37066 XLATE_DST 10.75.59.83:37066 Created=1655sec ago NeedsCT=1
ICMP OUT 10.75.59.83:36675 -> 172.16.1.173:0 XLATE_SRC 10.75.59.83:36675 Created=106sec ago NeedsCT=1
TCP OUT 172.16.2.180:40732 -> 142.250.199.100:443 XLATE_SRC 10.75.59.83:40732 Created=205sec ago NeedsCT=0
ICMP IN 172.16.0.116:0 -> 10.75.59.83:38433 XLATE_DST 10.75.59.83:38433 Created=276sec ago NeedsCT=1
TCP OUT 10.75.59.83:35986 -> 10.75.59.82:4240 XLATE_SRC 10.75.59.83:35986 Created=1076sec ago NeedsCT=1
ICMP IN 172.16.1.173:0 -> 10.75.59.83:36675 XLATE_DST 10.75.59.83:36675 Created=106sec ago NeedsCT=1
TCP IN 10.75.59.81:6443 -> 10.75.59.83:46184 XLATE_DST 10.75.59.83:46184 Created=1651sec ago NeedsCT=1
TCP IN 172.16.0.116:4240 -> 10.75.59.83:34202 XLATE_DST 10.75.59.83:34202 Created=46sec ago NeedsCT=1
ICMP OUT 10.75.59.83:63865 -> 10.75.59.81:0 XLATE_SRC 10.75.59.83:63865 Created=156sec ago NeedsCT=1
ICMP OUT 10.75.59.83:38433 -> 172.16.0.116:0 XLATE_SRC 10.75.59.83:38433 Created=276sec ago NeedsCT=1
ICMP IN 10.75.59.82:0 -> 10.75.59.83:40570 XLATE_DST 10.75.59.83:40570 Created=226sec ago NeedsCT=1
ICMP OUT 10.75.59.83:36899 -> 172.16.1.173:0 XLATE_SRC 10.75.59.83:36899 Created=236sec ago NeedsCT=1
ICMP IN 172.16.1.173:0 -> 10.75.59.83:36899 XLATE_DST 10.75.59.83:36899 Created=236sec ago NeedsCT=1
TCP IN 10.75.59.81:4240 -> 10.75.59.83:57802 XLATE_DST 10.75.59.83:57802 Created=716sec ago NeedsCT=1
ICMP IN 10.75.59.81:0 -> 10.75.59.83:40760 XLATE_DST 10.75.59.83:40760 Created=216sec ago NeedsCT=1
TCP IN 172.16.1.173:4240 -> 10.75.59.83:44064 XLATE_DST 10.75.59.83:44064 Created=1066sec ago NeedsCT=1

/home/cilium # cilium config
##### Read-write configurations #####
ConntrackAccounting : Disabled
ConntrackLocal : Disabled
Debug : Disabled
DebugLB : Disabled
DropNotification : Enabled
MonitorAggregationLevel : Medium
PolicyAccounting : Enabled
PolicyAuditMode : Disabled
PolicyTracing : Disabled
PolicyVerdictNotification : Enabled
SourceIPVerification : Enabled
TraceNotification : Enabled
MonitorNumPages : 64
PolicyEnforcement : default
/home/cilium # exit
root@k8s-node-1:~#

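The USER-SUPPLIED VALUES at the top of this section are what the playbook's "Deploy Cilium and Hubble with Helm" task passes to the chart, and the cilium-config ConfigMap is the rendered result of those values. A sketch of that task, assuming the deprecated community.kubernetes.helm module from the earlier warnings and the isovalent/cilium chart implied by the image registry (the actual task and values may differ):

- name: Deploy Cilium and Hubble with Helm
  community.kubernetes.helm:
    name: cilium
    chart_ref: isovalent/cilium        # assumed; the "isovalent" repo was added earlier
    chart_version: "1.17.6"            # matches the reported chart version
    release_namespace: kube-system
    values:
      kubeProxyReplacement: true
      routingMode: native
      autoDirectNodeRoutes: true
      ipv4NativeRoutingCIDR: 172.16.0.0/20
      ipam:
        mode: kubernetes
      bgpControlPlane:
        enabled: true
      hubble:
        enabled: true
        relay:
          enabled: true
        ui:
          enabled: true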
11.2 BGP Information

root@k8s-node-1:~# cilium bgp peers
Node Local AS Peer AS Peer Address Session State Uptime Family Received Advertised
k8s-node-1 65000 65000 10.75.59.86 established 41m41s ipv4/unicast 1 8
k8s-node-2 65000 65000 10.75.59.86 established 39m53s ipv4/unicast 1 8
k8s-node-3 65000 65000 10.75.59.86 established 39m52s ipv4/unicast 1 8
root@k8s-node-1:~# cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node VRouter Prefix NextHop Age Attrs
k8s-node-1 65000 172.16.0.0/24 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.34.108/32 0.0.0.0 34m39s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.38.134/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.14/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.186/32 0.0.0.0 41m47s [{Origin: i} {Nexthop: 0.0.0.0}]
k8s-node-2 65000 172.16.1.0/24 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.34.108/32 0.0.0.0 34m39s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.38.134/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.14/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.186/32 0.0.0.0 39m56s [{Origin: i} {Nexthop: 0.0.0.0}]
k8s-node-3 65000 172.16.2.0/24 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.34.108/32 0.0.0.0 34m39s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.38.134/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.14/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.186/32 0.0.0.0 39m55s [{Origin: i} {Nexthop: 0.0.0.0}]

root@dns-bgp-server:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.86
172.16.0.0/24 nhid 8 via 10.75.59.81 dev enp1s0 proto bgp metric 20
172.16.1.0/24 nhid 12 via 10.75.59.82 dev enp1s0 proto bgp metric 20
172.16.2.0/24 nhid 19 via 10.75.59.83 dev enp1s0 proto bgp metric 20
172.16.32.1 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.32.10 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.34.108 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.38.134 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.42.14 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.42.186 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
root@dns-bgp-server:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.75.59.1 0.0.0.0 UG 0 0 0 enp1s0
10.75.59.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
172.16.0.0 10.75.59.81 255.255.255.0 UG 20 0 0 enp1s0
172.16.1.0 10.75.59.82 255.255.255.0 UG 20 0 0 enp1s0
172.16.2.0 10.75.59.83 255.255.255.0 UG 20 0 0 enp1s0
172.16.32.1 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.32.10 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.34.108 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.38.134 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.42.14 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.42.186 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
root@dns-bgp-server:~# vtysh -c "show ip bgp"
BGP table version is 15, local router ID is 10.75.59.86, vrf id 0
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

Network Next Hop Metric LocPrf Weight Path
*> 10.75.59.0/24 0.0.0.0 0 32768 ?
*>i172.16.0.0/24 10.75.59.81 100 0 i
*>i172.16.1.0/24 10.75.59.82 100 0 i
*>i172.16.2.0/24 10.75.59.83 100 0 i
*=i172.16.32.1/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*=i172.16.32.10/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*>i172.16.34.108/32 10.75.59.81 100 0 i
*=i 10.75.59.82 100 0 i
*=i 10.75.59.83 100 0 i
*=i172.16.38.134/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*=i172.16.42.14/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*=i172.16.42.186/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i

Displayed 10 routes and 22 total paths
root@dns-bgp-server:~# vtysh -c "show ip bgp summary"

IPv4 Unicast Summary (VRF default):
BGP router identifier 10.75.59.86, local AS number 65000 vrf-id 0
BGP table version 15
RIB entries 19, using 3648 bytes of memory
Peers 3, using 2172 KiB of memory

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.75.59.81 4 65000 247 244 0 0 0 00:40:09 7 1 N/A
10.75.59.82 4 65000 240 234 0 0 0 00:38:20 7 1 N/A
10.75.59.83 4 65000 239 233 0 0 0 00:38:19 7 1 N/A

Total number of neighbors 3

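The /24 PodCIDRs and the /32 ClusterIPs that show up in the FRR RIB come from the resource the "Apply Cilium BGP Configuration from template" task rendered out of templates/cilium-bgp.yaml.j2. A minimal sketch that would produce the sessions above (iBGP, AS 65000, peer 10.75.59.86), assuming the v2alpha1 CiliumBGPPeeringPolicy API; the actual template may differ:

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: bgp-peering
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux          # assumed: select every node
  virtualRouters:
    - localASN: 65000
      exportPodCIDR: true              # advertises 172.16.{0,1,2}.0/24
      serviceSelector:                 # match-everything selector so ClusterIPs are advertised
        matchExpressions:
          - { key: unused, operator: NotIn, values: ["never"] }
      serviceAdvertisements:
        - ClusterIP
      neighbors:
        - peerAddress: "10.75.59.86/32"
          peerASN: 65000

On the dns-bgp-server side, frr.conf.j2 plausibly renders to something like the snippet below; "redistribute connected" would explain why 10.75.59.0/24 appears with origin "?" (incomplete), and because every peer shares AS 65000 these are iBGP sessions, so FRR does not re-advertise one node's routes to the others (hence PfxSnt stays at 1):

router bgp 65000
 bgp router-id 10.75.59.86
 neighbor 10.75.59.81 remote-as 65000
 neighbor 10.75.59.82 remote-as 65000
 neighbor 10.75.59.83 remote-as 65000
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family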
12. Project Directory Structure

ois@ois:~/data/k8s-cilium-lab$ tree
.
├── ansible.cfg
├── group_vars
│   └── all.yml
├── host_vars
│   ├── dns-bgp-server.yml
│   ├── k8s-node-1.yml
│   ├── k8s-node-2.yml
│   └── k8s-node-3.yml
├── inventory.ini
├── nodevm_cfg
│   ├── dns-bgp-server_meta-data
│   ├── dns-bgp-server_network-config
│   ├── dns-bgp-server_user-data
│   ├── k8s-node-1_meta-data
│   ├── k8s-node-1_network-config
│   ├── k8s-node-1_user-data
│   ├── k8s-node-2_meta-data
│   ├── k8s-node-2_network-config
│   ├── k8s-node-2_user-data
│   ├── k8s-node-3_meta-data
│   ├── k8s-node-3_network-config
│   └── k8s-node-3_user-data
├── nodevms
│   ├── dns-bgp-server.qcow2
│   ├── k8s-node-1.qcow2
│   ├── k8s-node-2.qcow2
│   └── k8s-node-3.qcow2
├── playbooks
│   ├── 1_create_vms.yml
│   ├── 2_prepare_nodes.yml
│   ├── 3_setup_cluster.yml
│   └── 4_deploy_app.yml
├── roles
│   ├── common
│   │   └── tasks
│   │       └── main.yml
│   ├── infra_server
│   │   └── tasks
│   │       └── main.yml
│   └── k8s_node
│       └── tasks
│           └── main.yml
└── templates
    ├── cilium-bgp.yaml.j2
    ├── containerd.toml.j2
    ├── deploy-star-wars.sh.j2
    ├── frr.conf.j2
    ├── hosts.j2
    ├── meta-data.j2
    ├── network-config.j2
    └── user-data.j2

14 directories, 38 files

Learning Kubernetes Installation with the Cilium CNI

1. Install Ubuntu 24.04 VMs with an Automated Script

1.1 Preparation

Generate the password hash:
ois@ois:~$ mkpasswd --method=SHA-512 --rounds=4096
Password:
$6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1

Generate an SSH key to enable passwordless login:
ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa
Your public key has been saved in /root/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:1+BaD0K3fe6saxPFf41r0SyZEpqhq29AVeRwz+WEXiU
The key's randomart image is:
+---[RSA 4096]----+
| .o+ .E..|
| .+ o.+.. |
| .. +.oo. |
| .. o.=o o |
| . S.*+oo.B.|
| . .=oooo* *|
| ... .o.+.|
| o ooo |
| .+. .o=o |
+----[SHA256]-----+

1.2 Automated Script to Create the VM Nodes — Generated by Gemini

create-vms.sh

#!/bin/bash

# --- Configuration ---
BASE_IMAGE_PATH="/home/ois/data/vmimages/noble-server-cloudimg-amd64.img"
VM_IMAGE_DIR="/home/ois/data/k8s/nodevms"
VM_CONFIG_DIR="/home/ois/data/k8s/nodevm_cfg"
RAM_MB=8192
VCPUS=4
DISK_SIZE_GB=20 # <--- ADDED: Increased disk size to 20 GB
BRIDGE_INTERFACE="br0"
BASE_IP="10.75.59"
NETWORK_PREFIX="/24"
GATEWAY="10.75.59.1"
NAMESERVER1="64.104.76.247"
NAMESERVER2="64.104.14.184"
SEARCH_DOMAIN="cisco.com"
VNC_PORT_START=5905 # VNC ports will be 5905, 5906, 5907
PASSWORD_HASH='$6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1'
SSH_PUB_KEY=$(cat ~/.ssh/id_rsa.pub)

# --- Loop to create 3 VMs ---
for i in {1..3}; do
  VM_NAME="kube-node-$i"
  VM_IP="${BASE_IP}.$((70 + i))" # IPs will be 10.75.59.71, 10.75.59.72, 10.75.59.73
  VM_IMAGE_PATH="${VM_IMAGE_DIR}/${VM_NAME}.qcow2"
  VM_VNC_PORT=$((VNC_PORT_START + i - 1)) # VNC ports will be 5905, 5906, 5907

  echo "--- Preparing for $VM_NAME (IP: $VM_IP) ---"

  # Create directories if they don't exist
  mkdir -p "$VM_IMAGE_DIR"
  mkdir -p "$VM_CONFIG_DIR"

  # Create a fresh image for each VM
  if [ -f "$VM_IMAGE_PATH" ]; then
    echo "Removing existing image for $VM_NAME..."
    rm "$VM_IMAGE_PATH"
  fi
  echo "Copying base image to $VM_IMAGE_PATH..."
  cp "$BASE_IMAGE_PATH" "$VM_IMAGE_PATH"

  # --- NEW: Resize the copied image before virt-install ---
  echo "Resizing VM image to ${DISK_SIZE_GB}GB..."
  qemu-img resize "$VM_IMAGE_PATH" "${DISK_SIZE_GB}G"
  # --- END NEW ---

  # Generate user-data for the current VM
  USER_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_user-data"
  cat <<EOF > "$USER_DATA_FILE"
#cloud-config

locale: en_US
keyboard:
  layout: us
timezone: Asia/Shanghai
hostname: ${VM_NAME}
create_hostname_file: true

ssh_pwauth: yes

groups:
  - ubuntu

users:
  - name: ubuntu
    gecos: ubuntu
    primary_group: ubuntu
    groups: sudo, cdrom
    sudo: ALL=(ALL:ALL) ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: ${PASSWORD_HASH}
    ssh_authorized_keys:
      - "${SSH_PUB_KEY}"

apt:
  primary:
    - arches: [default]
      uri: http://us.archive.ubuntu.com/ubuntu/

packages:
  - openssh-server
  - net-tools
  - iftop
  - htop
  - iperf3
  - vim
  - curl
  - wget
  - cloud-guest-utils # Ensure growpart is available

ntp:
  servers: ['ntp.esl.cisco.com']

runcmd:
  - echo "Attempting to resize root partition and filesystem..."
  - growpart /dev/vda 1 # Expand the first partition on /dev/vda
  - resize2fs /dev/vda1 # Expand the ext4 filesystem on /dev/vda1
  - echo "Disk resize commands executed. Verify with 'df -h' after boot."
EOF

  # Generate network-config for the current VM
  NETWORK_CONFIG_FILE="${VM_CONFIG_DIR}/${VM_NAME}_network-config"
  cat <<EOF > "$NETWORK_CONFIG_FILE"
network:
  version: 2
  ethernets:
    enp1s0:
      addresses:
        - "${VM_IP}${NETWORK_PREFIX}"
      nameservers:
        addresses:
          - ${NAMESERVER1}
          - ${NAMESERVER2}
        search:
          - ${SEARCH_DOMAIN}
      routes:
        - to: "default"
          via: "${GATEWAY}"
EOF

  # Generate meta-data (can be static for now)
  META_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_meta-data"
  cat <<EOF > "$META_DATA_FILE"
instance-id: ${VM_NAME}
local-hostname: ${VM_NAME}
EOF

  echo "--- Installing $VM_NAME ---"
  virt-install --name "${VM_NAME}" --ram "${RAM_MB}" --vcpus "${VCPUS}" --noreboot \
    --os-variant ubuntu24.04 \
    --network bridge="${BRIDGE_INTERFACE}" \
    --graphics vnc,listen=0.0.0.0,port="${VM_VNC_PORT}" \
    --disk path="${VM_IMAGE_PATH}",format=qcow2 \
    --console pty,target_type=serial \
    --cloud-init user-data="${USER_DATA_FILE}",meta-data="${META_DATA_FILE}",network-config="${NETWORK_CONFIG_FILE}" \
    --import \
    --wait 0

  echo "Successfully initiated creation of $VM_NAME."
  echo "You can connect to VNC on port ${VM_VNC_PORT} to monitor installation (optional)."
  echo "Wait a few minutes for the VM to boot and cloud-init to run."
  echo "--------------------------------------------------------"
done

echo "All 3 VMs have been initiated. Please wait for them to fully provision."
echo "You can SSH into them using 'ssh ubuntu@<IP_ADDRESS>' where IP addresses are 10.75.59.71, 10.75.59.72, 10.75.59.73."

The script above automatically creates the three VMs; once they are provisioned, you can log in with ssh ubuntu@10.75.59.71.

chmod +x create-vms.sh

ois@ois:~/data/k8s$ ./create-vms.sh
--- Preparing for kube-node-1 (IP: 10.75.59.71) ---
Removing existing image for kube-node-1...
Copying base image to /home/ois/data/k8s/nodevms/kube-node-1.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing kube-node-1 ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-3jev55ba-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-3jev55ba-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of kube-node-1.
You can connect to VNC on port 5905 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
--- Preparing for kube-node-2 (IP: 10.75.59.72) ---
Removing existing image for kube-node-2...
Copying base image to /home/ois/data/k8s/nodevms/kube-node-2.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing kube-node-2 ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-c4ruhql3-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-c4ruhql3-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of kube-node-2.
You can connect to VNC on port 5906 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
--- Preparing for kube-node-3 (IP: 10.75.59.73) ---
Removing existing image for kube-node-3...
Copying base image to /home/ois/data/k8s/nodevms/kube-node-3.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing kube-node-3 ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-u5e8k9a9-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-u5e8k9a9-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of kube-node-3.
You can connect to VNC on port 5907 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
All 3 VMs have been initiated. Please wait for them to fully provision.
You can SSH into them using 'ssh ubuntu@<IP_ADDRESS>' where IP addresses are 10.75.59.71, 10.75.59.72, 10.75.59.73.

ois@ois:~/data/k8s$ virsh list
Id Name State
-------------------------------------
83 kube-node-1 running
84 kube-node-2 running
85 kube-node-3 running
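
Once the domains are running, a quick way to confirm that cloud-init has actually finished (packages installed, root filesystem grown) is to poll its status over SSH. A minimal sketch, assuming the SSH key injected by the user-data above:

for ip in 10.75.59.71 10.75.59.72 10.75.59.73; do
    ssh -o StrictHostKeyChecking=accept-new ubuntu@$ip 'cloud-init status --wait && df -h /'
done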

Sample user-data

#cloud-config

locale: en_US
keyboard:
  layout: us
timezone: Asia/Shanghai
hostname: kube-node-1
create_hostname_file: true

ssh_pwauth: yes

groups:
  - ubuntu

users:
  - name: ubuntu
    gecos: ubuntu
    primary_group: ubuntu
    groups: sudo, cdrom
    sudo: ALL=(ALL:ALL) ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: $6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1
    ssh_authorized_keys:
      - "ssh-rsa AAAAB3NzaC1yXXXXXXXXXX=="

apt:
  primary:
    - arches: [default]
      uri: http://us.archive.ubuntu.com/ubuntu/

packages:
  - openssh-server
  - net-tools
  - iftop
  - htop
  - iperf3
  - vim
  - curl
  - wget
  - cloud-guest-utils   # Ensure growpart is available

ntp:
  servers: ['ntp.esl.cisco.com']

runcmd:
  - echo "Attempting to resize root partition and filesystem..."
  - growpart /dev/vda 1    # Expand the first partition on /dev/vda
  - resize2fs /dev/vda1    # Expand the ext4 filesystem on /dev/vda1
  - echo "Disk resize commands executed. Verify with 'df -h' after boot."

network-config

network:
  version: 2
  ethernets:
    enp1s0:
      addresses:
        - "10.75.59.71/24"
      nameservers:
        addresses:
          - 64.104.76.247
          - 64.104.14.184
        search:
          - cisco.com
      routes:
        - to: "default"
          via: "10.75.59.1"

meta-data

instance-id: kube-node-1
local-hostname: kube-node-1
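
These three files feed the NoCloud datasource that virt-install's --cloud-init option packages into a seed ISO. The user-data can be sanity-checked on the host before creating a VM; a small sketch, assuming a reasonably recent cloud-init (older releases expose the same check as 'cloud-init devel schema'):

cloud-init schema --config-file nodevm_cfg/kube-node-1_user-data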

2. Install Kubernetes with Ansible

2.1 Script contents

The following script automates the installation of Ansible and generates a playbook that can be run directly to perform the installation tasks.

#!/bin/bash

# This script automates the setup of an Ansible environment for Kubernetes cluster deployment.
# It installs Ansible, creates the project directory, inventory, configuration,
# and defines common Kubernetes setup tasks.
# This version stops after installing Kubernetes components, allowing manual kubeadm init/join.
# Includes a robust fix for Containerd's SystemdCgroup configuration and CRI plugin enabling,
# defines the necessary handler for restarting Containerd, dynamically adds host entries to /etc/hosts,
# and updates the pause image version in the manual instructions.
# This update also addresses the runc runtime root configuration in containerd and fixes
# YAML escape character issues in the hosts file regex, and updates the sandbox image in containerd config.

# --- Configuration ---
PROJECT_DIR="k8s_cluster_setup"
MASTER_NODE_IP="10.75.59.71" # Based on your previous script's IP assignment for kube-node-1
WORKER_NODE_IP_1="10.75.59.72" # Based on your previous script's IP assignment for kube-node-2
WORKER_NODE_IP_2="10.75.59.73" # Based on your previous script's IP address for kube-node-3
ANSIBLE_USER="ubuntu" # The user created by cloud-init on your VMs
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Path to your SSH private key on the Ansible control machine

# --- Functions ---

# Function to install Ansible
install_ansible() {
echo "--- Installing Ansible ---"
if ! command -v ansible &> /dev/null; then
sudo apt update -y
sudo apt install -y ansible
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create project directory and navigate into it
create_project_dir() {
echo "--- Creating project directory: ${PROJECT_DIR} ---"
mkdir -p "${PROJECT_DIR}"
cd "${PROJECT_DIR}" || { echo "Failed to change directory to ${PROJECT_DIR}. Exiting."; exit 1; }
echo "Changed to directory: $(pwd)"
}

# Function to create ansible.cfg
create_ansible_cfg() {
echo "--- Creating ansible.cfg ---"
cat <<EOF > ansible.cfg
[defaults]
inventory = inventory.ini
roles_path = ./roles
host_key_checking = False # WARNING: Disable host key checking for convenience during initial setup. Re-enable for production!
EOF
echo "ansible.cfg created."
}

# Function to create inventory.ini (UPDATED with IP variables)
create_inventory() {
echo "--- Creating inventory.ini ---"
cat <<EOF > inventory.ini
[master]
kube-node-1 ansible_host=${MASTER_NODE_IP}

[workers]
kube-node-2 ansible_host=${WORKER_NODE_IP_1}
kube-node-3 ansible_host=${WORKER_NODE_IP_2}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH}
ansible_python_interpreter=/usr/bin/python3
# These variables are now primarily for documentation/script clarity,
# as the hosts file task will dynamically read from inventory groups.
master_node_ip=${MASTER_NODE_IP}
worker_node_ip_1=${WORKER_NODE_IP_1}
worker_node_ip_2=${WORKER_NODE_IP_2}
EOF
echo "inventory.ini created."
}

# Function to create main playbook.yml (Modified to only include common setup)
create_playbook() {
echo "--- Creating playbook.yml ---"
cat <<EOF > playbook.yml
---
- name: Common Kubernetes Setup for all nodes
hosts: all
become: yes
roles:
- common_k8s_setup
EOF
echo "playbook.yml created (only common setup included)."
}

# Function to create roles and their tasks
create_roles() {
echo "--- Creating Ansible roles and tasks ---"

# common_k8s_setup role
mkdir -p roles/common_k8s_setup/tasks
# UPDATED: main.yml to include the new hosts entry task first
cat <<EOF > roles/common_k8s_setup/tasks/main.yml
---
- name: Include add hosts entries task
ansible.builtin.include_tasks: 00_add_hosts_entries.yml

- name: Include disable swap task
ansible.builtin.include_tasks: 01_disable_swap.yml

- name: Include containerd setup task
ansible.builtin.include_tasks: 02_containerd_setup.yml

- name: Include kernel modules and sysctl task
ansible.builtin.include_tasks: 03_kernel_modules_sysctl.yml

- name: Include kube repo, install, and hold task
ansible.builtin.include_tasks: 04_kube_repo_install_hold.yml

- name: Include initial apt upgrade task
ansible.builtin.include_tasks: 05_initial_upgrade.yml

- name: Include configure weekly updates task
ansible.builtin.include_tasks: 06_configure_weekly_updates.yml
EOF

# NEW FILE: 00_add_hosts_entries.yml (Dynamically adds hosts from inventory, FIXED: Escaped backslashes in regex)
cat <<EOF > roles/common_k8s_setup/tasks/00_add_hosts_entries.yml
---
- name: Add all inventory hosts to /etc/hosts on each node
ansible.builtin.lineinfile:
path: /etc/hosts
regexp: "^{{ hostvars[item]['ansible_host'] }}\\\\s+{{ item }}" # Fixed: \\\\s for escaped backslash in regex
line: "{{ hostvars[item]['ansible_host'] }} {{ item }}"
state: present
create: yes
mode: '0644'
owner: root
group: root
loop: "{{ groups['all'] }}" # Loop over all hosts defined in the inventory
EOF

# Create handlers directory and file
mkdir -p roles/common_k8s_setup/handlers
cat <<EOF > roles/common_k8s_setup/handlers/main.yml
---
- name: Restart containerd service
ansible.builtin.systemd:
name: containerd
state: restarted
daemon_reload: yes
EOF

# 01_disable_swap.yml
cat <<EOF > roles/common_k8s_setup/tasks/01_disable_swap.yml
---
- name: Check if swap is active
ansible.builtin.command: swapon --show
register: swap_check_result
changed_when: false # This command itself doesn't change state
failed_when: false # Don't fail if swapon --show returns non-zero (e.g., no swap enabled)

- name: Disable swap
ansible.builtin.command: swapoff -a
when: swap_check_result.rc == 0 # Only run if swapon --show indicated swap is active

- name: Persistently disable swap (comment out swapfile in fstab)
ansible.builtin.replace:
path: /etc/fstab
regexp: '^(/swapfile.*)$'
replace: '#\1'
when: swap_check_result.rc == 0 # Only run if swap was found to be active
EOF

# 02_containerd_setup.yml (UPDATED for sandbox_image)
cat <<EOF > roles/common_k8s_setup/tasks/02_containerd_setup.yml
---
- name: Install required packages for Containerd
ansible.builtin.apt:
name:
- ca-certificates
- curl
- gnupg
- lsb-release
- apt-transport-https
- software-properties-common
state: present
update_cache: yes

- name: Add Docker GPG key
ansible.builtin.apt_key:
url: https://download.docker.com/linux/ubuntu/gpg
state: present
keyring: /etc/apt/keyrings/docker.gpg # Use keyring for modern apt

- name: Add Docker APT repository
ansible.builtin.apt_repository:
repo: "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu {{ ansible_distribution_release }} stable"
state: present
filename: docker

- name: Install Containerd
ansible.builtin.apt:
name: containerd.io
state: present
update_cache: yes

- name: Create containerd configuration directory
ansible.builtin.file:
path: /etc/containerd
state: directory
mode: '0755'

- name: Generate default containerd configuration directly to final path
ansible.builtin.shell: containerd config default > /etc/containerd/config.toml
changed_when: true # Always report change as we're ensuring a default state

- name: Ensure CRI plugin is enabled (remove any disabled_plugins line containing "cri")
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*disabled_plugins = \[.*"cri".*\]' # More general regexp
state: absent
backup: yes
notify: Restart containerd service

- name: Remove top-level systemd_cgroup from CRI plugin section
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*systemd_cgroup = (true|false)' # Matches the 'systemd_cgroup' directly under [plugins."io.containerd.grpc.v1.cri"]
state: absent # Remove this line
backup: yes
notify: Restart containerd service

- name: Remove old runtime_root from runc runtime section
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*runtime_root = ".*"' # Matches runtime_root line
state: absent
backup: yes
notify: Restart containerd service

- name: Configure runc runtime to use SystemdCgroup = true
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*#?\s*SystemdCgroup = (true|false)' # Matches the 'SystemdCgroup' under runc.options
line: ' SystemdCgroup = true' # Ensure correct indentation
insertafter: '^\s*\[plugins\."io\.containerd\.grpc\.v1\.cri"\.containerd\.runtimes\.runc\.options\]'
backup: yes
notify: Restart containerd service

- name: Add Root path to runc options
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*Root = ".*"' # Matches existing Root line if any
line: ' Root = "/run/containerd/runc"' # New Root path
insertafter: '^\s*\[plugins\."io\.containerd\.grpc\.v1\.cri"\.containerd\.runtimes\.runc\.options\]'
backup: yes
notify: Restart containerd service

- name: Update sandbox_image to pause:3.10
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*sandbox_image = "registry.k8s.io/pause:.*"'
line: ' sandbox_image = "registry.k8s.io/pause:3.10"'
insertafter: '^\s*\[plugins\."io\.containerd\.grpc\.v1\.cri"\]' # Insert after the CRI plugin section start
backup: yes
notify: Restart containerd service
EOF

# 03_kernel_modules_sysctl.yml
cat <<EOF > roles/common_k8s_setup/tasks/03_kernel_modules_sysctl.yml
---
- name: Load overlay module
ansible.builtin.command: modprobe overlay
args:
creates: /sys/module/overlay # Check if module is loaded
changed_when: false

- name: Load br_netfilter module
ansible.builtin.command: modprobe br_netfilter
args:
creates: /sys/module/br_netfilter # Check if module is loaded
changed_when: false

- name: Add modules to /etc/modules-load.d/k8s.conf
ansible.builtin.copy:
dest: /etc/modules-load.d/k8s.conf
content: |
overlay
br_netfilter

- name: Configure sysctl parameters for Kubernetes networking
ansible.builtin.copy:
dest: /etc/sysctl.d/k8s.conf
content: |
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1

- name: Apply sysctl parameters
ansible.builtin.command: sysctl --system
changed_when: false
EOF

# 04_kube_repo_install_hold.yml
cat <<EOF > roles/common_k8s_setup/tasks/04_kube_repo_install_hold.yml
---
- name: Create Kubernetes apt keyring directory
ansible.builtin.file:
path: /etc/apt/keyrings
state: directory
mode: '0755'

- name: Download Kubernetes GPG key and dearmor
ansible.builtin.shell: |
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.33/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
args:
creates: /etc/apt/keyrings/kubernetes-apt-keyring.gpg
changed_when: false # This command is idempotent enough for our purposes

- name: Add Kubernetes APT repository source list
ansible.builtin.copy:
dest: /etc/apt/sources.list.d/kubernetes.list
content: "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.33/deb/ /\n"
mode: '0644'
backup: yes

- name: Update apt cache after adding Kubernetes repo
ansible.builtin.apt:
update_cache: yes

- name: Install kubelet, kubeadm, kubectl
ansible.builtin.apt:
name:
- kubelet
- kubeadm
- kubectl
state: present
update_cache: yes # Ensure apt cache is updated after adding repo.

- name: Hold kubelet, kubeadm, kubectl packages
ansible.builtin.dpkg_selections:
name: "{{ item }}"
selection: hold
loop:
- kubelet
- kubeadm
- kubectl

- name: Enable and start kubelet service
ansible.builtin.systemd:
name: kubelet
state: started
enabled: yes
EOF

# NEW FILE: 05_initial_upgrade.yml
cat <<EOF > roles/common_k8s_setup/tasks/05_initial_upgrade.yml
---
- name: Perform initial apt update and upgrade
ansible.builtin.apt:
update_cache: yes
upgrade: yes
autoremove: yes
purge: yes
EOF

# NEW FILE: 06_configure_weekly_updates.yml
cat <<EOF > roles/common_k8s_setup/tasks/06_configure_weekly_updates.yml
---
- name: Configure weekly apt update and upgrade cron job
ansible.builtin.cron:
name: "weekly apt update and upgrade"
weekday: "0" # Sunday
hour: "3" # 3 AM
minute: "0"
job: "/usr/bin/apt update && /usr/bin/apt upgrade -y && /usr/bin/apt autoremove -y && /usr/bin/apt clean"
user: root
state: present
EOF

# Master and Worker roles are still created for structure, but not called by playbook.yml
mkdir -p roles/k8s-master/tasks
cat <<EOF > roles/k8s-master/tasks/main.yml
---
- name: This role is intentionally skipped by the main playbook for manual setup.
ansible.builtin.debug:
msg: "This master role is not executed by default. Run 'kubeadm init' manually on the master node."
EOF

mkdir -p roles/k8s-worker/tasks
cat <<EOF > roles/k8s-worker/tasks/main.yml
---
- name: This role is intentionally skipped by the main playbook for manual setup.
ansible.builtin.debug:
msg: "This worker role is not executed by default. Run 'kubeadm join' manually on worker nodes."
EOF

echo "Ansible roles and tasks created."
}

# --- Main execution ---
install_ansible
create_project_dir
create_ansible_cfg
create_inventory
create_playbook
create_roles

echo ""
echo "--- Ansible setup for Kubernetes installation is complete! ---"
echo "Navigate to the project directory:"
echo "cd ${PROJECT_DIR}"
echo ""
echo "Then, run the Ansible playbook to install Kubernetes components on all nodes:"
echo "ansible-playbook playbook.yml -K"
echo ""
echo "After the playbook finishes, you will need to manually initialize the Kubernetes cluster:"
echo "1. SSH into the master node (kube-node-1):"
echo " ssh ubuntu@${MASTER_NODE_IP}"
echo ""
echo "2. Initialize the Kubernetes control plane on the master node:"
echo " sudo kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy --pod-infra-container-image=registry.k8s.io/pause:3.10"
echo ""
echo "3. After 'kubeadm init' completes, it will print instructions to set up kubectl and the 'kubeadm join' command."
echo " Follow the instructions to set up kubectl for the 'ubuntu' user:"
echo " mkdir -p \$HOME/.kube"
echo " sudo cp -i /etc/kubernetes/admin.conf \$HOME/.kube/config"
echo " sudo chown \$(id -u):\$(id -g) \$HOME/.kube/config"
echo ""
echo "4. Copy the 'kubeadm join' command (including the token and discovery-token-ca-cert-hash) printed by 'kubeadm init'."
echo " It will look something like: 'kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>'"
echo ""
echo "5. SSH into each worker node (kube-node-2, kube-node-3) and run the join command:"
echo " ssh ubuntu@${WORKER_NODE_IP_1} (for kube-node-2)"
echo " sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>"
echo ""
echo " ssh ubuntu@${WORKER_NODE_IP_2} (for kube-node-3)"
echo " sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>"
echo ""
echo "6. Verify your cluster status from the master node:"
echo " ssh ubuntu@${MASTER_NODE_IP}"
echo " kubectl get nodes"
echo " kubectl get pods --all-namespaces"

After the script above finishes, the following project files are created:

root@ois:/home/ois/data/k8s/k8s_cluster_setup# ls -R1
.:
ansible.cfg
inventory.ini
playbook.yml
roles

./roles:
common_k8s_setup
k8s-master
k8s-worker

./roles/common_k8s_setup:
handlers
tasks

./roles/common_k8s_setup/handlers:
main.yml

./roles/common_k8s_setup/tasks:
00_add_hosts_entries.yml
01_disable_swap.yml
02_containerd_setup.yml
03_kernel_modules_sysctl.yml
04_kube_repo_install_hold.yml
05_initial_upgrade.yml
06_configure_weekly_updates.yml
main.yml

./roles/k8s-master:
tasks

./roles/k8s-master/tasks:
main.yml

./roles/k8s-worker:
tasks

./roles/k8s-worker/tasks:
main.yml
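
Before running anything against the nodes, the generated project can be sanity-checked with standard Ansible commands; a minimal sketch, run from inside k8s_cluster_setup:

ansible-playbook playbook.yml --syntax-check
ansible-inventory -i inventory.ini --graph
ansible all -m ping          # confirms SSH access and the Python interpreter on every node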

2.2 Execution

Before running the playbook, the SSH host keys of these nodes need to be added to ~/.ssh/known_hosts.

#!/bin/bash

# This script prepares the Ansible control node by adding the SSH host keys
# of the target nodes to the ~/.ssh/known_hosts file.
# It first removes any outdated keys for the specified hosts before scanning
# for the new ones, preventing "REMOTE HOST IDENTIFICATION HAS CHANGED" errors.

# --- Configuration ---
# List of hosts (IPs or FQDNs) to scan.
# These should be the same hosts you have in your Ansible inventory.
HOSTS=(
"10.75.59.71"
"10.75.59.72"
"10.75.59.73"
)

# The location of your known_hosts file.
KNOWN_HOSTS_FILE=~/.ssh/known_hosts

# --- Main Logic ---
echo "Starting SSH host key scan to update ${KNOWN_HOSTS_FILE}..."
echo ""

# Ensure the .ssh directory exists with the correct permissions.
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# Loop through each host defined in the HOSTS array.
for host in "${HOSTS[@]}"; do
echo "--- Processing host: ${host} ---"

# 1. Remove the old host key (if it exists).
# This is the key step to ensure we replace outdated entries.
# The command is silent if no key is found.
echo "Step 1: Removing any old key for ${host}..."
ssh-keygen -R "${host}"

# 2. Scan for the new host key and append it.
# The -H flag hashes the hostname, which is a security best practice.
echo "Step 2: Scanning for new key and adding it to known_hosts..."
ssh-keyscan -H "${host}" >> "${KNOWN_HOSTS_FILE}"

echo "Successfully updated key for ${host}."
echo ""
done

# Set the correct permissions for the known_hosts file, as SSH is strict about this.
chmod 600 "${KNOWN_HOSTS_FILE}"

echo "✅ All hosts have been scanned and keys have been updated."
echo "You can now run your Ansible playbook without host key verification prompts."
chmod +x ansible-k8s-v2.sh 
ois@ois:~/data/k8s$ ./ansible-k8s-v2.sh
--- Installing Ansible ---
Ansible is already installed.
--- Creating project directory: k8s_cluster_setup ---
Changed to directory: /home/ois/data/k8s/k8s_cluster_setup
--- Creating ansible.cfg ---
ansible.cfg created.
--- Creating inventory.ini ---
inventory.ini created.
--- Creating playbook.yml ---
playbook.yml created (only common setup included).
--- Creating Ansible roles and tasks ---
Ansible roles and tasks created.

--- Ansible setup for Kubernetes installation is complete! ---
Navigate to the project directory:
cd k8s_cluster_setup

Then, run the Ansible playbook to install Kubernetes components on all nodes:
ansible-playbook playbook.yml -K

After the playbook finishes, you will need to manually initialize the Kubernetes cluster:
1. SSH into the master node (kube-node-1):
ssh ubuntu@10.75.59.71

2. Initialize the Kubernetes control plane on the master node:
sudo kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy --pod-infra-container-image=registry.k8s.io/pause:3.10

3. After 'kubeadm init' completes, it will print instructions to set up kubectl and the 'kubeadm join' command.
Follow the instructions to set up kubectl for the 'ubuntu' user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

4. Copy the 'kubeadm join' command (including the token and discovery-token-ca-cert-hash) printed by 'kubeadm init'.
It will look something like: 'kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>'

5. SSH into each worker node (kube-node-2, kube-node-3) and run the join command:
ssh ubuntu@10.75.59.72 (for kube-node-2)
sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>

ssh ubuntu@10.75.59.73 (for kube-node-3)
sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>

6. Verify your cluster status from the master node:
ssh ubuntu@10.75.59.71
kubectl get nodes
kubectl get pods --all-namespaces
ois@ois:~/data/k8s$ cd k8s_cluster_setup/
ois@ois:~/data/k8s/k8s_cluster_setup$ ansible-playbook playbook.yml -K
BECOME password:

PLAY [Common Kubernetes Setup for all nodes] *******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Include add hosts entries task] *******************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/00_add_hosts_entries.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Add all inventory hosts to /etc/hosts on each node] ***********************************************************************************************************************************************************************************
changed: [kube-node-2] => (item=kube-node-1)
changed: [kube-node-1] => (item=kube-node-1)
changed: [kube-node-3] => (item=kube-node-1)
changed: [kube-node-1] => (item=kube-node-2)
changed: [kube-node-2] => (item=kube-node-2)
changed: [kube-node-3] => (item=kube-node-2)
changed: [kube-node-2] => (item=kube-node-3)
changed: [kube-node-1] => (item=kube-node-3)
changed: [kube-node-3] => (item=kube-node-3)

TASK [common_k8s_setup : Include disable swap task] ************************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/01_disable_swap.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Check if swap is active] **************************************************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-1]
ok: [kube-node-3]

TASK [common_k8s_setup : Disable swap] *************************************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Persistently disable swap (comment out swapfile in fstab)] ****************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-3]
ok: [kube-node-1]

TASK [common_k8s_setup : Include containerd setup task] ********************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/02_containerd_setup.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Install required packages for Containerd] *********************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

TASK [common_k8s_setup : Add Docker GPG key] *******************************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Add Docker APT repository] ************************************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

TASK [common_k8s_setup : Install Containerd] *******************************************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-1]
changed: [kube-node-2]

TASK [common_k8s_setup : Create containerd configuration directory] ********************************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-3]
ok: [kube-node-1]

TASK [common_k8s_setup : Generate default containerd configuration directly to final path] *********************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

TASK [common_k8s_setup : Ensure CRI plugin is enabled (remove any disabled_plugins line containing "cri")] *****************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Remove top-level systemd_cgroup from CRI plugin section] ******************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Remove old runtime_root from runc runtime section] ************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

TASK [common_k8s_setup : Configure runc runtime to use SystemdCgroup = true] ***********************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Add Root path to runc options] ********************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

TASK [common_k8s_setup : Update sandbox_image to pause:3.10] ***************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Include kernel modules and sysctl task] ***********************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/03_kernel_modules_sysctl.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Load overlay module] ******************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Load br_netfilter module] *************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Add modules to /etc/modules-load.d/k8s.conf] ******************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Configure sysctl parameters for Kubernetes networking] ********************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Apply sysctl parameters] **************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Include kube repo, install, and hold task] ********************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/04_kube_repo_install_hold.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Create Kubernetes apt keyring directory] **********************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Download Kubernetes GPG key and dearmor] **********************************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-1]
ok: [kube-node-3]

TASK [common_k8s_setup : Add Kubernetes APT repository source list] ********************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-3]
changed: [kube-node-1]

TASK [common_k8s_setup : Update apt cache after adding Kubernetes repo] ****************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

TASK [common_k8s_setup : Install kubelet, kubeadm, kubectl] ****************************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-2]
changed: [kube-node-1]

TASK [common_k8s_setup : Hold kubelet, kubeadm, kubectl packages] **********************************************************************************************************************************************************************************************
changed: [kube-node-1] => (item=kubelet)
changed: [kube-node-3] => (item=kubelet)
changed: [kube-node-2] => (item=kubelet)
changed: [kube-node-3] => (item=kubeadm)
changed: [kube-node-1] => (item=kubeadm)
changed: [kube-node-2] => (item=kubeadm)
changed: [kube-node-3] => (item=kubectl)
changed: [kube-node-1] => (item=kubectl)
changed: [kube-node-2] => (item=kubectl)

TASK [common_k8s_setup : Enable and start kubelet service] *****************************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-2]
changed: [kube-node-1]

TASK [common_k8s_setup : Include initial apt upgrade task] *****************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/05_initial_upgrade.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Perform initial apt update and upgrade] ***********************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-2]
changed: [kube-node-1]

TASK [common_k8s_setup : Include configure weekly updates task] ************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/06_configure_weekly_updates.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Configure weekly apt update and upgrade cron job] *************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

RUNNING HANDLER [common_k8s_setup : Restart containerd service] ************************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
kube-node-1 : ok=39 changed=22 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
kube-node-2 : ok=39 changed=22 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
kube-node-3 : ok=39 changed=22 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
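
Because several of the containerd settings are applied with lineinfile edits, it is worth spot-checking the resulting configuration on one node before moving on to kubeadm init. A small verification sketch, assuming crictl was pulled in alongside kubeadm (cri-tools):

ssh ubuntu@10.75.59.71 'grep -E "SystemdCgroup|sandbox_image" /etc/containerd/config.toml'
ssh ubuntu@10.75.59.71 'systemctl is-active containerd'    # kubelet keeps restarting until kubeadm init runs
ssh ubuntu@10.75.59.71 'sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock version'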

3. Set up local DNS and BGP

Set up a local DNS and BGP server for testing.

3.1 create-dns.sh

#!/bin/bash

# --- Configuration ---
BASE_IMAGE_PATH="/home/ois/data/vmimages/noble-server-cloudimg-amd64.img"
VM_IMAGE_DIR="/home/ois/data/k8s/nodevms"
VM_CONFIG_DIR="/home/ois/data/k8s/nodevm_cfg"
RAM_MB=8192
VCPUS=4
DISK_SIZE_GB=20
BRIDGE_INTERFACE="br0"
# Specific IP for the single VM
VM_IP="10.75.59.76"
NETWORK_PREFIX="/24"
GATEWAY="10.75.59.1"
# DNS servers for the VM's initial resolution (for internet access)
VM_NAMESERVER1="64.104.76.247"
VM_NAMESERVER2="64.104.14.184"
SEARCH_DOMAIN="cisco.com"
VNC_PORT=5909 # Fixed VNC port for the single VM
PASSWORD_HASH='$6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1'
SSH_PUB_KEY=$(cat ~/.ssh/id_rsa.pub)

# --- VM Details ---
VM_NAME="dns-server-vm"
VM_IMAGE_PATH="${VM_IMAGE_DIR}/${VM_NAME}.qcow2"

echo "--- Preparing for $VM_NAME (IP: $VM_IP) ---"

# Create directories if they don't exist
mkdir -p "$VM_IMAGE_DIR"
mkdir -p "$VM_CONFIG_DIR"

# Create a fresh image for the VM
if [ -f "$VM_IMAGE_PATH" ]; then
echo "Removing existing image for $VM_NAME..."
rm "$VM_IMAGE_PATH"
fi
echo "Copying base image to $VM_IMAGE_PATH..."
cp "$BASE_IMAGE_PATH" "$VM_IMAGE_PATH"

# Resize the copied image before virt-install
echo "Resizing VM image to ${DISK_SIZE_GB}GB..."
qemu-img resize "$VM_IMAGE_PATH" "${DISK_SIZE_GB}G"

# Generate user-data for the VM (installing dnsmasq, no UFW, no dnsmasq config here)
USER_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_user-data"
cat <<EOF > "$USER_DATA_FILE"
#cloud-config

locale: en_US
keyboard:
layout: us
timezone: Asia/Tokyo
hostname: ${VM_NAME}
create_hostname_file: true

ssh_pwauth: yes

groups:
- ubuntu

users:
- name: ubuntu
gecos: ubuntu
primary_group: ubuntu
groups: sudo, cdrom
sudo: ALL=(ALL:ALL) ALL
shell: /bin/bash
lock_passwd: false
passwd: ${PASSWORD_HASH}
ssh_authorized_keys:
- "${SSH_PUB_KEY}"

apt:
primary:
- arches: [default]
uri: http://us.archive.ubuntu.com/ubuntu/

packages:
- openssh-server
- net-tools
- iftop
- htop
- iperf3
- vim
- curl
- wget
- cloud-guest-utils # Ensure growpart is available
- dnsmasq # Install dnsmasq for DNS server functionality

ntp:
servers: ['ntp.esl.cisco.com']

runcmd:
- echo "Attempting to resize root partition and filesystem..."
- growpart /dev/vda 1 # Expand the first partition on /dev/vda
- resize2fs /dev/vda1 # Expand the ext4 filesystem on /dev/vda1
- echo "Disk resize commands executed. Verify with 'df -h' after boot."
EOF

# Generate network-config for the VM (pointing to external DNS for initial connectivity)
NETWORK_CONFIG_FILE="${VM_CONFIG_DIR}/${VM_NAME}_network-config"
cat <<EOF > "$NETWORK_CONFIG_FILE"
network:
version: 2
ethernets:
enp1s0:
addresses:
- "${VM_IP}${NETWORK_PREFIX}"
nameservers:
addresses:
- ${VM_NAMESERVER1} # Point VM to external DNS for initial internet access
- ${VM_NAMESERVER2}
search:
- ${SEARCH_DOMAIN}
routes:
- to: "default"
via: "${GATEWAY}"
EOF

# Generate meta-data
META_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_meta-data"
cat <<EOF > "$META_DATA_FILE"
instance-id: ${VM_NAME}
local-hostname: ${VM_NAME}
EOF

echo "--- Installing $VM_NAME ---"
virt-install --name "${VM_NAME}" --ram "${RAM_MB}" --vcpus "${VCPUS}" --noreboot \
--os-variant ubuntu24.04 \
--network bridge="${BRIDGE_INTERFACE}" \
--graphics vnc,listen=0.0.0.0,port="${VNC_PORT}" \
--disk path="${VM_IMAGE_PATH}",format=qcow2 \
--console pty,target_type=serial \
--cloud-init user-data="${USER_DATA_FILE}",meta-data="${META_DATA_FILE}",network-config="${NETWORK_CONFIG_FILE}" \
--import \
--wait 0

echo "Successfully initiated creation of $VM_NAME."
echo "You can connect to VNC on port ${VNC_PORT} to monitor installation (optional)."
echo "Wait a few minutes for the VM to boot and cloud-init to run."
echo "--------------------------------------------------------"

echo "The DNS server VM has been initiated. Please wait for it to fully provision."
echo "You can SSH into it using 'ssh ubuntu@${VM_IP}'."
echo "Once provisioned, proceed to use the 'setup-dnsmasq-ansible.sh' script to configure DNS using Ansible."

ois@ois:~/data/k8s$ ./create-dns.sh
--- Preparing for dns-server-vm (IP: 10.75.59.76) ---
Copying base image to /home/ois/data/k8s/nodevms/dns-server-vm.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing dns-server-vm ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-y1pxxrj5-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-y1pxxrj5-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of dns-server-vm.
You can connect to VNC on port 5909 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
The DNS server VM has been initiated. Please wait for it to fully provision.
You can SSH into it using 'ssh ubuntu@10.75.59.76'.
Once provisioned, you can test the DNS forwarding by configuring another machine to use 10.75.59.76 as its DNS server and performing a DNS query.
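
Before handing the VM over to Ansible, it can be worth confirming that cloud-init completed and that the dnsmasq package requested in the user-data actually landed; a short optional check:

ssh ubuntu@10.75.59.76 'cloud-init status --wait && dpkg -s dnsmasq | grep -i ^status'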

3.2 Configure dnsmasq with Ansible

#!/bin/bash

# --- Configuration for Ansible and DNSmasq ---
VM_IP="10.75.59.76"
SSH_USER="ubuntu"
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Ensure this path is correct for your setup
FORWARD_NAMESERVER1="64.104.76.247"
FORWARD_NAMESERVER2="64.104.14.184"
SEARCH_DOMAIN="cisco.com"
ANSIBLE_DIR="ansible_dnsmasq_setup"

echo "--- Setting up Ansible environment and configuring DNSmasq on ${VM_IP} ---"

# Create a directory for Ansible files
mkdir -p "$ANSIBLE_DIR"
cd "$ANSIBLE_DIR" || exit 1

# --- Install Ansible if not already installed ---
if ! command -v ansible &> /dev/null
then
echo "Ansible not found. Installing Ansible..."
# Check OS and install accordingly
if [ -f /etc/os-release ]; then
. /etc/os-release
if [[ "$ID" == "ubuntu" || "$ID" == "debian" ]]; then
sudo apt update
sudo apt install -y software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
elif [[ "$ID" == "centos" || "$ID" == "rhel" || "$ID" == "fedora" ]]; then
sudo yum install -y epel-release
sudo yum install -y ansible
else
echo "Unsupported OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
else
echo "Could not determine OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
else
echo "Ansible is already installed."
fi

# --- Create Ansible Inventory File ---
echo "Creating Ansible inventory file: inventory.ini"
cat <<EOF > inventory.ini
[dns_server]
${VM_IP} ansible_user=${SSH_USER} ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH} ansible_python_interpreter=/usr/bin/python3
EOF

# --- Create Ansible Playbook (setup-dnsmasq.yml) ---
echo "Creating Ansible playbook: setup-dnsmasq.yml"
cat <<EOF > setup-dnsmasq.yml
---
- name: Configure DNSmasq Server on Ubuntu VM
hosts: dns_server
become: yes # Run tasks with sudo privileges
vars:
dns_forwarder_1: "${FORWARD_NAMESERVER1}"
dns_forwarder_2: "${FORWARD_NAMESERVER2}"
vm_ip: "${VM_IP}"
search_domain: "${SEARCH_DOMAIN}"

tasks:
- name: Ensure apt cache is updated
ansible.builtin.apt:
update_cache: yes
cache_valid_time: 3600 # Cache for 1 hour

- name: Install dnsmasq package
ansible.builtin.apt:
name: dnsmasq
state: present

- name: Stop dnsmasq service before configuration
ansible.builtin.systemd:
name: dnsmasq
state: stopped
ignore_errors: yes # Ignore if it's not running initially

- name: Backup original dnsmasq.conf
ansible.builtin.command: mv /etc/dnsmasq.conf /etc/dnsmasq.conf.bak
args:
removes: /etc/dnsmasq.conf # Only run if dnsmasq.conf exists
ignore_errors: yes

- name: Configure dnsmasq for forwarding
ansible.builtin.template:
src: dnsmasq.conf.j2
dest: /etc/dnsmasq.conf
owner: root
group: root
mode: '0644'
notify: Restart dnsmasq

- name: Set VM's /etc/resolv.conf to point to itself (local DNS)
ansible.builtin.template:
src: resolv.conf.j2
dest: /etc/resolv.conf
owner: root
group: root
mode: '0644'
vars:
local_dns_ip: "127.0.0.1" # dnsmasq listens on 127.0.0.1
# Removed: search_domain: "{{ search_domain }}" - it's already available from play vars
notify: Restart systemd-resolved # Or NetworkManager, depending on Ubuntu version

handlers:
- name: Restart dnsmasq
ansible.builtin.systemd:
name: dnsmasq
state: restarted
enabled: yes # Ensure it's enabled to start on boot

- name: Restart systemd-resolved
ansible.builtin.systemd:
name: systemd-resolved
state: restarted
ignore_errors: yes # systemd-resolved might not be used on server installs
EOF

# --- Create dnsmasq.conf.j2 template ---
echo "Creating dnsmasq.conf.j2 template"
cat <<EOF > dnsmasq.conf.j2
# This file is managed by Ansible. Do not edit manually.

# Do not read /etc/resolv.conf, use the servers below
no-resolv

# Specify upstream DNS servers for forwarding
server={{ dns_forwarder_1 }}
server={{ dns_forwarder_2 }}

# Listen on localhost and the VM's primary IP
listen-address=127.0.0.1,{{ vm_ip }}

# Allow queries from any interface
interface={{ ansible_default_ipv4.interface }} # Listen on the primary network interface

# Bind to the interfaces to prevent dnsmasq from listening on all interfaces
bind-interfaces

# Cache DNS results
cache-size=150
EOF

# --- Create resolv.conf.j2 template ---
echo "Creating resolv.conf.j2 template"
cat <<EOF > resolv.conf.j2
# This file is managed by Ansible. Do not edit manually.
nameserver {{ local_dns_ip }}
search {{ search_domain }}
EOF

# --- Run the Ansible Playbook ---
echo "Running Ansible playbook to configure DNSmasq..."
ansible-playbook -i inventory.ini setup-dnsmasq.yml -K

# --- Final Instructions ---
echo "--------------------------------------------------------"
echo "DNSmasq configuration complete on ${VM_IP}."
echo "You can now test the DNS server from the VM or from another machine."
echo "From the VM, run: dig @127.0.0.1 www.cisco.com"
echo "From another machine on the same network, configure its DNS to ${VM_IP} and run: dig www.cisco.com"
echo "Remember to exit the '${ANSIBLE_DIR}' directory when done (cd ..)."

Execution output

ois@ois:~/data/k8s$ ./ansible-dnsmasq.sh 
--- Setting up Ansible environment and configuring DNSmasq on 10.75.59.76 ---
Ansible is already installed.
Creating Ansible inventory file: inventory.ini
Creating Ansible playbook: setup-dnsmasq.yml
Creating dnsmasq.conf.j2 template
Creating resolv.conf.j2 template
Running Ansible playbook to configure DNSmasq...
BECOME password:

PLAY [Configure DNSmasq Server on Ubuntu VM] *******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Ensure apt cache is updated] *****************************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Install dnsmasq package] *********************************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Stop dnsmasq service before configuration] ***************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Backup original dnsmasq.conf] ****************************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

TASK [Configure dnsmasq for forwarding] ************************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

TASK [Set VM's /etc/resolv.conf to point to itself (local DNS)] ************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

RUNNING HANDLER [Restart dnsmasq] ******************************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

RUNNING HANDLER [Restart systemd-resolved] *********************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
10.75.59.76 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

--------------------------------------------------------
DNSmasq configuration complete on 10.75.59.76.
You can now test the DNS server from the VM or from another machine.
From the VM, run: dig @127.0.0.1 www.cisco.com
From another machine on the same network, configure its DNS to 10.75.59.76 and run: dig www.cisco.com
Remember to exit the 'ansible_dnsmasq_setup' directory when done (cd ..).
root@dns-server-vm:~# dig @127.0.0.1 www.cisco.com

; <<>> DiG 9.18.30-0ubuntu0.24.04.2-Ubuntu <<>> @127.0.0.1 www.cisco.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10545
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: ba16202024b9dc8301000000688b40f90daaa0a500a50d2d (good)
;; QUESTION SECTION:
;www.cisco.com. IN A

;; ANSWER SECTION:
www.cisco.com. 3600 IN CNAME origin-www.cisco.com.
origin-www.cisco.com. 1800 IN CNAME origin-www.xgslb-v3.cisco.com.
origin-www.xgslb-v3.CISCO.com. 10 IN A 72.163.4.161

;; Query time: 71 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Thu Jul 31 19:10:01 JST 2025
;; MSG SIZE rcvd: 183
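
Two extra checks, suggested here and not part of the original run, confirm that dnsmasq is serving on the LAN address as well as on loopback:

ss -ulpn | grep ':53 '                    # dnsmasq should be bound to 127.0.0.1:53 and 10.75.59.76:53
dig @10.75.59.76 www.cisco.com +short     # run from another host on the 10.75.59.0/24 network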

3.3 Updating the Nodes' DNS Configuration with Ansible

#!/bin/bash

# --- Configuration ---
ANSIBLE_DIR="ansible_dns_update"
INVENTORY_FILE="${ANSIBLE_DIR}/hosts.ini"
PLAYBOOK_FILE="${ANSIBLE_DIR}/update_dns.yml"

# Kubernetes Node IPs (ensure these match your actual VM IPs)
KUBE_NODE_1_IP="10.75.59.71"
KUBE_NODE_2_IP="10.75.59.72"
KUBE_NODE_3_IP="10.75.59.73"

# Common Ansible user and Python interpreter
ANSIBLE_USER="ubuntu"
ANSIBLE_PYTHON_INTERPRETER="/usr/bin/python3"

# --- Functions ---

# Function to check and install Ansible
install_ansible() {
if ! command -v ansible &> /dev/null
then
echo "Ansible not found. Attempting to install Ansible..."
if [ -f /etc/debian_version ]; then
# Debian/Ubuntu
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
elif [ -f /etc/redhat-release ]; then
# CentOS/RHEL/Fedora
sudo yum install -y epel-release
sudo yum install -y ansible
else
echo "Unsupported OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
if ! command -v ansible &> /dev/null; then
echo "Ansible installation failed. Please install it manually and re-run this script."
exit 1
fi
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create Ansible inventory file
create_inventory() {
echo "Creating Ansible inventory file: ${INVENTORY_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<EOF > "$INVENTORY_FILE"
[kubernetes_nodes]
kube-node-1 ansible_host=${KUBE_NODE_1_IP}
kube-node-2 ansible_host=${KUBE_NODE_2_IP}
kube-node-3 ansible_host=${KUBE_NODE_3_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_python_interpreter=${ANSIBLE_PYTHON_INTERPRETER}
EOF
echo "Inventory file created."
}

# Function to create Ansible playbook file
create_playbook() {
echo "Creating Ansible playbook file: ${PLAYBOOK_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<'EOF' > "$PLAYBOOK_FILE"
---
- name: Update DNS server on Kubernetes nodes to use local DNS only
hosts: kubernetes_nodes
become: yes # This allows Ansible to run commands with sudo privileges

tasks:
- name: Ensure netplan configuration directory exists
ansible.builtin.file:
path: /etc/netplan
state: directory
mode: '0755'

- name: Get current network configuration file (e.g., 00-installer-config.yaml)
ansible.builtin.find:
paths: /etc/netplan
patterns: '*.yaml'
# We assume there's only one primary netplan config file for simplicity.
# If there are multiple, you might need to specify which one.
register: netplan_files

- name: Set network config file variable
ansible.builtin.set_fact:
netplan_config_file: "{{ netplan_files.files[0].path }}"
when: netplan_files.files | length > 0

- name: Fail if no netplan config file found
ansible.builtin.fail:
msg: "No Netplan configuration file found in /etc/netplan. Cannot proceed."
when: netplan_files.files | length == 0

- name: Read current netplan configuration
ansible.builtin.slurp:
src: "{{ netplan_config_file }}"
register: current_netplan_config

- name: Parse current netplan configuration
ansible.builtin.set_fact:
parsed_netplan: "{{ current_netplan_config['content'] | b64decode | from_yaml }}"

- name: Update nameservers in netplan configuration to local DNS only
ansible.builtin.set_fact:
updated_netplan: "{{ parsed_netplan | combine(
{
'network': {
'ethernets': {
'enp1s0': {
'nameservers': {
'addresses': ['10.75.59.76'],
'search': ['cisco.com']
}
}
}
}
}, recursive=True) }}"

- name: Write updated netplan configuration
ansible.builtin.copy:
content: "{{ updated_netplan | to_yaml }}"
dest: "{{ netplan_config_file }}"
mode: '0600'
notify: Apply Netplan Configuration

handlers:
- name: Apply Netplan Configuration
ansible.builtin.command: netplan apply
listen: "Apply Netplan Configuration"
EOF
echo "Playbook file created."
}

# --- Main Script Execution ---

echo "Starting Ansible DNS update process..."

# 1. Install Ansible if not present
install_ansible

# 2. Create Ansible inventory file
create_inventory

# 3. Create Ansible playbook file
create_playbook

# 4. Run the Ansible playbook
echo "Running Ansible playbook to update DNS on Kubernetes nodes..."
echo "You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs."
ansible-playbook -i "$INVENTORY_FILE" "$PLAYBOOK_FILE" --ask-become-pass

if [ $? -eq 0 ]; then
echo "Ansible playbook executed successfully."
echo "Your Kubernetes nodes should now be configured to use 10.75.59.76 as their only DNS server."
else
echo "Ansible playbook failed. Please check the output for errors."
fi

echo "Process complete."
ois@ois:~/data/k8s$ ./updatedns.sh 
Starting Ansible DNS update process...
Ansible is already installed.
Creating Ansible inventory file: ansible_dns_update/hosts.ini
Inventory file created.
Creating Ansible playbook file: ansible_dns_update/update_dns.yml
Playbook file created.
Running Ansible playbook to update DNS on Kubernetes nodes...
You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs.
BECOME password:

PLAY [Update DNS server on Kubernetes nodes to use local DNS only] *********************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Ensure netplan configuration directory exists] ***********************************************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-2]
ok: [kube-node-1]

TASK [Get current network configuration file (e.g., 00-installer-config.yaml)] *********************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-1]
ok: [kube-node-2]

TASK [Set network config file variable] ************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Fail if no netplan config file found] ********************************************************************************************************************************************************************************************************************
skipping: [kube-node-1]
skipping: [kube-node-2]
skipping: [kube-node-3]

TASK [Read current netplan configuration] **********************************************************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-1]
ok: [kube-node-2]

TASK [Parse current netplan configuration] *********************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Update nameservers in netplan configuration to local DNS only] *******************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Write updated netplan configuration] *********************************************************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-1]
ok: [kube-node-2]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
kube-node-1 : ok=8 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
kube-node-2 : ok=8 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
kube-node-3 : ok=8 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0

Ansible playbook executed successfully.
Your Kubernetes nodes should now be configured to use 10.75.59.76 as their only DNS server.
Process complete.

root@kube-node-1:~# cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search cisco.com
root@kube-node-1:~# cat /etc/netplan/50-cloud-init.yaml
network:
ethernets:
enp1s0:
addresses: [10.75.59.71/24]
nameservers:
addresses: [10.75.59.76]
search: [cisco.com]
routes:
- {to: default, via: 10.75.59.1}
version: 2
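
Because /etc/resolv.conf on the nodes still points at the systemd-resolved stub (127.0.0.53), the effective upstream is only visible through resolvectl. A quick check (assumed commands, not from the original transcript):

resolvectl status enp1s0          # should list 10.75.59.76 as the DNS server for the interface
resolvectl query www.cisco.com    # resolves via the stub -> 10.75.59.76 -> upstream forwarders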

3.4 Configuring FRR BGP with Ansible

#!/bin/bash

# This script automates the setup of an Ansible environment for installing and configuring FRRouting (FRR).
# It creates the project directory, inventory, configuration, and the playbook
# with an idempotent role to install and configure FRR.

# --- Configuration ---
PROJECT_DIR="ansible-frr-setup" # Changed project directory name
FRR_NODE_IP="10.75.59.76" # IP address of your FRR VM (frr-server-vm)
ANSIBLE_USER="ubuntu" # The user created by cloud-init on your VMs
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Path to your SSH private key on the Ansible control machine

# FRR specific configuration
FRR_AS=65000 # The Autonomous System number for this FRR node (example AS, choose your own)
K8S_MASTER_IP="10.75.59.71" # From your create-vms.sh script
K8S_WORKER_1_IP="10.75.59.72" # From your create-vms.sh script
K8S_WORKER_2_IP="10.75.59.73" # From your create-vms.sh script
CILIUM_BGP_AS=65000 # AS for Cilium as per your CiliumBGPClusterConfig

# --- Functions ---

# Function to install Ansible (if not already installed)
install_ansible() {
echo "--- Installing Ansible ---"
if ! command -v ansible &> /dev/null; then
sudo apt update -y
sudo apt install -y ansible
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create project directory and navigate into it
create_project_dir() {
echo "--- Creating project directory: ${PROJECT_DIR} ---"
# Check if directory exists, if so, just navigate, otherwise create and navigate
if [ ! -d "${PROJECT_DIR}" ]; then
mkdir -p "${PROJECT_DIR}"
echo "Created new directory: ${PROJECT_DIR}"
else
echo "Directory ${PROJECT_DIR} already exists."
fi
cd "${PROJECT_DIR}" || { echo "Failed to change directory to ${PROJECT_DIR}. Exiting."; exit 1; }
echo "Changed to directory: $(pwd)"
}

# Function to create ansible.cfg
create_ansible_cfg() {
echo "--- Creating ansible.cfg ---"
cat <<EOF > ansible.cfg
[defaults]
inventory = inventory.ini
roles_path = ./roles
host_key_checking = False # WARNING: Disable host key checking for convenience. Re-enable for production!
EOF
echo "ansible.cfg created."
}

# Function to create inventory.ini
create_inventory() {
echo "--- Creating inventory.ini ---"
cat <<EOF > inventory.ini
[frr_nodes]
frr-node-1 ansible_host=${FRR_NODE_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH}
ansible_python_interpreter=/usr/bin/python3
FRR_AS=${FRR_AS}
K8S_MASTER_IP=${K8S_MASTER_IP}
K8S_WORKER_1_IP=${K8S_WORKER_1_IP}
K8S_WORKER_2_IP=${K8S_WORKER_2_IP}
CILIUM_BGP_AS=${CILIUM_BGP_AS}
EOF
echo "inventory.ini created."
}

# Function to create the main playbook.yml
create_playbook() {
echo "--- Creating playbook.yml ---"
cat <<EOF > playbook.yml
---
- name: Install and Configure FRRouting (FRR)
hosts: frr_nodes
become: yes
roles:
- frr_setup # Changed role name to frr_setup
EOF
echo "playbook.yml created."
}

# Function to create the FRR installation and configuration role
create_frr_role() { # Changed function name from create_gobgp_role
echo "--- Creating Ansible role for FRR setup ---"
mkdir -p roles/frr_setup/tasks
cat <<EOF > roles/frr_setup/tasks/main.yml
---
- name: Install FRRouting (FRR)
ansible.builtin.apt:
name: frr
state: present
update_cache: yes

- name: Configure FRR daemons (enable zebra and bgpd)
ansible.builtin.lineinfile:
path: /etc/frr/daemons
regexp: '^(zebra|bgpd)='
line: '\1=yes'
state: present
backrefs: yes # Required to make regexp work for replacement
notify: Restart FRR service

- name: Configure frr.conf
ansible.builtin.copy:
dest: /etc/frr/frr.conf
content: |
!
hostname {{ ansible_hostname }}
password zebra
enable password zebra
!
log syslog informational
!
router bgp {{ FRR_AS }}
bgp router-id {{ ansible_host }}
!
neighbor {{ K8S_MASTER_IP }} remote-as {{ CILIUM_BGP_AS }}
neighbor {{ K8S_WORKER_1_IP }} remote-as {{ CILIUM_BGP_AS }}
neighbor {{ K8S_WORKER_2_IP }} remote-as {{ CILIUM_BGP_AS }}
!
address-family ipv4 unicast
# Crucial: Redistribute BGP learned routes into the kernel
redistribute connected
redistribute static
redistribute kernel
exit-address-family
!
line vty
!
mode: '0644'
notify: Restart FRR service # Handler only runs if file content changes

- name: Set permissions for frr.conf
ansible.builtin.file:
path: /etc/frr/frr.conf
owner: frr
group: frr
mode: '0640'

- name: Enable and start FRR service
ansible.builtin.systemd:
name: frr
state: started
enabled: yes
daemon_reload: yes # Ensure systemd reloads unit files if service file changed

EOF

mkdir -p roles/frr_setup/handlers
cat <<EOF > roles/frr_setup/handlers/main.yml
---
- name: Restart FRR service
ansible.builtin.systemd:
name: frr
state: restarted
EOF
echo "FRR Ansible role created."
}

# --- Main execution ---
install_ansible
create_project_dir
create_ansible_cfg
create_inventory
create_playbook
create_frr_role # Changed function call

echo ""
echo "--- Ansible setup for FRR installation is complete! ---"
echo "Navigate to the new project directory:"
echo "cd ${PROJECT_DIR}"
echo ""
echo "Then, run the Ansible playbook to install and configure FRR on your VM:"
echo "ansible-playbook playbook.yml -K"
echo ""
echo "After the playbook finishes, FRR should be running and configured on ${FRR_NODE_IP}."
echo "You can SSH into the VM and verify with 'sudo vtysh -c \"show ip bgp summary\"' and 'sudo ip route show'."

ois@ois:~/data/k8s$ ./ansible-frr.sh
--- Installing Ansible ---
Ansible is already installed.
--- Creating project directory: ansible-frr-setup ---
Created new directory: ansible-frr-setup
Changed to directory: /home/ois/data/k8s/ansible-frr-setup
--- Creating ansible.cfg ---
ansible.cfg created.
--- Creating inventory.ini ---
inventory.ini created.
--- Creating playbook.yml ---
playbook.yml created.
--- Creating Ansible role for FRR setup ---
FRR Ansible role created.

--- Ansible setup for FRR installation is complete! ---
Navigate to the new project directory:
cd ansible-frr-setup

Then, run the Ansible playbook to install and configure FRR on your VM:
ansible-playbook playbook.yml -K

After the playbook finishes, FRR should be running and configured on 10.75.59.76.
You can SSH into the VM and verify with 'sudo vtysh -c "show ip bgp summary"' and 'sudo ip route show'.
ois@ois:~/data/k8s$ cd ansible-frr-setup/
ois@ois:~/data/k8s/ansible-frr-setup$ ansible-playbook playbook.yml -K
BECOME password:

PLAY [Install and Configure FRRouting (FRR)] *******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [frr-node-1]

TASK [frr_setup : Install FRRouting (FRR)] *********************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Configure FRR daemons (enable zebra and bgpd)] ***********************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Configure frr.conf] **************************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Set permissions for frr.conf] ****************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Enable and start FRR service] ****************************************************************************************************************************************************************************************************************
ok: [frr-node-1]

RUNNING HANDLER [frr_setup : Restart FRR service] **************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
frr-node-1 : ok=7 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

root@dns-server-vm:~# cat /etc/frr/frr.conf 
!
hostname dns-server-vm
password zebra
enable password zebra
!
log syslog informational
!
router bgp 65000
bgp router-id 10.75.59.76
!
neighbor 10.75.59.71 remote-as 65000
neighbor 10.75.59.72 remote-as 65000
neighbor 10.75.59.73 remote-as 65000
!
address-family ipv4 unicast
# Crucial: Redistribute BGP learned routes into the kernel
redistribute connected
redistribute static
redistribute kernel
exit-address-family
!
line vty
!
root@dns-server-vm:~# systemctl status frr
* frr.service - FRRouting
Loaded: loaded (/usr/lib/systemd/system/frr.service; enabled; preset: enabled)
Active: active (running) since Wed 2025-07-23 12:16:58 JST; 1 week 1 day ago
Docs: https://frrouting.readthedocs.io/en/latest/setup.html
Process: 15611 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
Main PID: 15623 (watchfrr)
Status: "FRR Operational"
Tasks: 13 (limit: 9486)
Memory: 21.1M (peak: 28.3M)
CPU: 5min 23.845s
CGroup: /system.slice/frr.service
|-15623 /usr/lib/frr/watchfrr -d -F traditional zebra bgpd staticd
|-15636 /usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -s 90000000
|-15641 /usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1
`-15648 /usr/lib/frr/staticd -d -F traditional -A 127.0.0.1

Jul 31 16:26:25 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
Jul 31 16:27:24 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.73 in vrf default
Jul 31 16:27:24 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.72 in vrf default
Jul 31 16:27:24 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
Jul 31 16:46:48 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.72 in vrf default
Jul 31 16:46:48 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.73 in vrf default
Jul 31 16:46:48 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
Jul 31 16:47:54 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.73 in vrf default
Jul 31 16:47:54 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.72 in vrf default
Jul 31 16:47:54 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
root@dns-server-vm:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.76
172.16.0.0/24 nhid 95 via 10.75.59.71 dev enp1s0 proto bgp metric 20
172.16.1.0/24 nhid 90 via 10.75.59.72 dev enp1s0 proto bgp metric 20
172.16.2.0/24 nhid 100 via 10.75.59.73 dev enp1s0 proto bgp metric 20
172.16.16.1 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.16.10 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.20.119 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.22.26 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.23.18 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.30.170 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
root@dns-server-vm:~# vtysh -c 'show ip bgp summary'

IPv4 Unicast Summary (VRF default):
BGP router identifier 10.75.59.76, local AS number 65000 vrf-id 0
BGP table version 222
RIB entries 19, using 3648 bytes of memory
Peers 3, using 2172 KiB of memory

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.75.59.71 4 65000 71265 71182 0 0 0 03:21:08 7 2 N/A
10.75.59.72 4 65000 71344 71264 0 0 0 03:21:09 7 2 N/A
10.75.59.73 4 65000 71240 71162 0 0 0 03:21:09 7 2 N/A

Total number of neighbors 3
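
Beyond the summary, a few more vtysh commands (suggested here, assuming the default FRR install) show exactly what is exchanged with each Cilium node:

vtysh -c 'show ip bgp'                                          # full BGP table: PodCIDR and Service prefixes
vtysh -c 'show ip bgp neighbors 10.75.59.71 routes'             # prefixes accepted from kube-node-1
vtysh -c 'show ip bgp neighbors 10.75.59.71 advertised-routes'  # prefixes FRR sends back to kube-node-1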

4. Setting Up the Kubernetes Cluster and Installing Cilium as the CNI

4.1 Setting Up Kubernetes with kubeadm

Initialize the cluster with the following commands:

kubeadm config images pull

kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy

--skip-phases=addon/kube-proxy skips the kube-proxy add-on; Cilium will take over its role.
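
The same flags can also be captured in a kubeadm configuration file, which is easier to keep under version control. A minimal sketch, assuming the v1beta4 kubeadm API shipped with v1.33 (this file is not part of the original setup):

# kubeadm-init.yaml -- equivalent to the command-line flags above
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
skipPhases:
  - addon/kube-proxy
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
controlPlaneEndpoint: kube-node-1
networking:
  podSubnet: 172.16.0.0/20
  serviceSubnet: 172.16.32.0/20

# then: kubeadm init --config kubeadm-init.yaml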
ubuntu@kube-node-1:~$ sudo kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy[sudo] password for ubuntu: 
[init] Using Kubernetes version: v1.33.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-node-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.248.0.1 10.75.59.71]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002649961s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://10.75.59.71:6443/livez
[control-plane-check] Checking kube-controller-manager at https://127.0.0.1:10257/healthz
[control-plane-check] Checking kube-scheduler at https://127.0.0.1:10259/livez
[control-plane-check] kube-controller-manager is healthy after 1.813351787s
[control-plane-check] kube-scheduler is healthy after 3.309147352s
[control-plane-check] kube-apiserver is healthy after 5.505049123s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 1r5ugd.o2pjzipcq69z71l8
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join kube-node-1:6443 --token 1r5ugd.o2pjzipcq69z71l8 \
--discovery-token-ca-cert-hash sha256:e29fb62581a4d21268585c3b345f9e060827c52a8325b1d28b8437c792ba7923 \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join kube-node-1:6443 --token 1r5ugd.o2pjzipcq69z71l8 \
--discovery-token-ca-cert-hash sha256:e29fb62581a4d21268585c3b345f9e060827c52a8325b1d28b8437c792ba7923
ubuntu@kube-node-1:~$
ubuntu@kube-node-1:~$ mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
ubuntu@kube-node-1:~$ sudo su
root@kube-node-1:/home/ubuntu# cd

Add the following line to root's .bashrc:
root@kube-node-1:~# cat .bashrc | grep export
export KUBECONFIG=/etc/kubernetes/admin.conf

This lets root run kubectl commands against the API server.

root@kube-node-1:~# kubectl cluster-info
Kubernetes control plane is running at https://kube-node-1:6443
CoreDNS is running at https://kube-node-1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
root@kube-node-1:~# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube-node-1 NotReady control-plane 68s v1.33.3 10.75.59.71 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27

The node is NotReady because no CNI has been installed yet.

4.2 Installing Helm with Ansible

Helm is set up here only in order to install Cilium; using Ansible for this step is optional and shown for reference.
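
For a single control-plane node, the official installer script is an equally valid shortcut; it is shown here only as a reference alternative to the Ansible role:

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 -o get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh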

#!/bin/bash

# This script automates the setup of an Ansible environment for installing Helm.
# It creates the project directory, inventory, configuration, and the playbook
# with an idempotent role to install Helm.

# --- Configuration ---
PROJECT_DIR="ansible-helm"
MASTER_NODE_IP="10.75.59.71" # IP address of your Kubernetes master node (kube-node-1)
ANSIBLE_USER="ubuntu" # The user created by cloud-init on your VMs
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Path to your SSH private key on the Ansible control machine

# Helm version to install
HELM_VERSION="v3.18.4" # You can change this to a desired stable version

# --- Functions ---

# Function to create project directory and navigate into it
create_project_dir() {
echo "--- Creating project directory: ${PROJECT_DIR} ---"
# Check if directory exists, if so, just navigate, otherwise create and navigate
if [ ! -d "${PROJECT_DIR}" ]; then
mkdir -p "${PROJECT_DIR}"
echo "Created new directory: ${PROJECT_DIR}"
else
echo "Directory ${PROJECT_DIR} already exists."
fi
cd "${PROJECT_DIR}" || { echo "Failed to change directory to ${PROJECT_DIR}. Exiting."; exit 1; }
echo "Changed to directory: $(pwd)"
}

# Function to create ansible.cfg
create_ansible_cfg() {
echo "--- Creating ansible.cfg ---"
cat <<EOF > ansible.cfg
[defaults]
inventory = inventory.ini
roles_path = ./roles
host_key_checking = False # WARNING: Disable host key checking for convenience. Re-enable for production!
EOF
echo "ansible.cfg created."
}

# Function to create inventory.ini
create_inventory() {
echo "--- Creating inventory.ini ---"
cat <<EOF > inventory.ini
[kubernetes_master]
kube-node-1 ansible_host=${MASTER_NODE_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH}
ansible_python_interpreter=/usr/bin/python3
HELM_VERSION=${HELM_VERSION}
EOF
echo "inventory.ini created."
}

# Function to create the main playbook.yml
create_playbook() {
echo "--- Creating playbook.yml ---"
cat <<EOF > playbook.yml
---
- name: Install Helm on Kubernetes Master Node
hosts: kubernetes_master
become: yes
environment: # Ensure KUBECONFIG is set for helm commands run with become
KUBECONFIG: /etc/kubernetes/admin.conf # Use the admin kubeconfig on the master
roles:
- helm_install
EOF
echo "playbook.yml created."
}

# Function to create the Helm installation role (with idempotent check)
create_helm_role() {
echo "--- Creating Ansible role for Helm installation ---"
mkdir -p roles/helm_install/tasks
cat <<EOF > roles/helm_install/tasks/main.yml
---
- name: Check if Helm is installed and get version
ansible.builtin.command: helm version --short
register: helm_version_raw
ignore_errors: yes
changed_when: false

- name: Set installed Helm version fact
ansible.builtin.set_fact:
installed_helm_version: "{{ (helm_version_raw.stdout | default('') | regex_findall('^(v[0-9]+\\\\.[0-9]+\\\\.[0-9]+)') | first | default('') | trim) }}"
changed_when: false

- name: Debug installed Helm version
ansible.builtin.debug:
msg: "Current installed Helm version: {{ installed_helm_version | default('Not installed') }}"

- name: Debug raw Helm version output
ansible.builtin.debug:
msg: "Raw Helm version output: {{ helm_version_raw.stdout | default('No output') }}"
when: helm_version_raw.stdout is defined and helm_version_raw.stdout | length > 0

- name: Check if Helm binary exists
ansible.builtin.stat:
path: /usr/local/bin/helm
register: helm_binary_stat
when: installed_helm_version == HELM_VERSION

- name: Download Helm tarball
ansible.builtin.get_url:
url: "https://get.helm.sh/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
dest: "/tmp/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
mode: '0644'
checksum: "sha256:{{ lookup('url', 'https://get.helm.sh/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz.sha256sum', wantlist=True)[0].split(' ')[0] }}"
register: download_helm_result
until: download_helm_result is success
retries: 5
delay: 5
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Create Helm installation directory
ansible.builtin.file:
path: /usr/local/bin
state: directory
mode: '0755'
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Extract Helm binary
ansible.builtin.unarchive:
src: "/tmp/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
dest: "/tmp"
remote_src: yes
creates: "/tmp/linux-amd64/helm"
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Move Helm binary to /usr/local/bin
ansible.builtin.copy:
src: "/tmp/linux-amd64/helm"
dest: "/usr/local/bin/helm"
mode: '0755'
remote_src: yes
owner: root
group: root
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Clean up Helm tarball and extracted directory
ansible.builtin.file:
path: "{{ item }}"
state: absent
loop:
- "/tmp/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
- "/tmp/linux-amd64"
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Verify Helm installation
ansible.builtin.command: helm version --client
register: helm_version_output
changed_when: false

- name: Display Helm version
ansible.builtin.debug:
msg: "{{ helm_version_output.stdout }}"
EOF
echo "Helm installation role created."
}

# --- Main execution ---
create_project_dir
create_ansible_cfg
create_inventory
create_playbook
create_helm_role

echo ""
echo "--- Ansible setup for Helm installation is complete! ---"
echo "Navigate to the new project directory:"
echo "cd ${PROJECT_DIR}"
echo ""
echo "Then, run the Ansible playbook to install only Helm on your master node:"
echo "ansible-playbook playbook.yml -K"
echo ""
echo "After Helm is installed, you can SSH into your master node (kube-node-1) and manage Cilium Enterprise installation directly using Helm."
echo "Remember to use the correct Cilium chart version and your custom values file."
echo "Example steps for manual Cilium installation via Helm:"
echo "ssh ubuntu@${MASTER_NODE_IP}"
echo "sudo helm repo add cilium https://helm.cilium.io/"
echo "sudo helm repo add isovalent https://helm.isovalent.com"
echo "sudo helm repo update"
echo "sudo helm install cilium isovalent/cilium --version 1.17.6 --namespace kube-system -f <path_to_your_cilium_values_file.yaml> --wait"
echo "Example content for /tmp/cilium-enterprise-values.yaml:"
echo "hubble:"
echo " enabled: true"
echo " relay:"
echo " enabled: true"
echo " ui:"
echo " enabled: false"
echo "kubeProxyReplacement: strict"
echo "ipam:"
echo " mode: kubernetes"
echo "ipv4NativeRoutingCIDR: 10.244.0.0/16"
echo "k8s:"
echo " requireIPv4PodCIDR: true"
echo "routingMode: native"
echo "autoDirectNodeRoutes: false"
echo "bgpControlPlane:"
echo " enabled: true"

The execution output is as follows:

ois@ois:~/data/k8s$ ./ansible-helm.sh 
--- Creating project directory: ansible-helm ---
Created new directory: ansible-helm
Changed to directory: /home/ois/data/k8s/ansible-helm
--- Creating ansible.cfg ---
ansible.cfg created.
--- Creating inventory.ini ---
inventory.ini created.
--- Creating playbook.yml ---
playbook.yml created.
--- Creating Ansible role for Helm installation ---
Helm installation role created.

--- Ansible setup for Helm installation is complete! ---
Navigate to the new project directory:
cd ansible-helm

Then, run the Ansible playbook to install only Helm on your master node:
ansible-playbook playbook.yml -K

After Helm is installed, you can SSH into your master node (kube-node-1) and manage Cilium Enterprise installation directly using Helm.
Remember to use the correct Cilium chart version and your custom values file.
Example steps for manual Cilium installation via Helm:
ssh ubuntu@10.75.59.71
sudo helm repo add cilium https://helm.cilium.io/
sudo helm repo add isovalent https://helm.isovalent.com
sudo helm repo update
sudo helm install cilium isovalent/cilium --version 1.17.6 --namespace kube-system -f <path_to_your_cilium_values_file.yaml> --wait
Example content for /tmp/cilium-enterprise-values.yaml:
hubble:
enabled: true
relay:
enabled: true
ui:
enabled: false
kubeProxyReplacement: strict
ipam:
mode: kubernetes
ipv4NativeRoutingCIDR: 172.16.0.0/20
k8s:
requireIPv4PodCIDR: true
routingMode: native
autoDirectNodeRoutes: false
bgpControlPlane:
enabled: true
ois@ois:~/data/k8s$ cd ansible-helm
ois@ois:~/data/k8s/ansible-helm$ ansible-playbook playbook.yml -K
BECOME password:

PLAY [Install Helm on Kubernetes Master Node] ******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Check if Helm is installed and get version] ***********************************************************************************************************************************************************************************************
fatal: [kube-node-1]: FAILED! => {"changed": false, "cmd": "helm version --short", "msg": "[Errno 2] No such file or directory: b'helm'", "rc": 2, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

TASK [helm_install : Set installed Helm version fact] **********************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Debug installed Helm version] *************************************************************************************************************************************************************************************************************
ok: [kube-node-1] => {
"msg": "Current installed Helm version: "
}

TASK [helm_install : Debug raw Helm version output] ************************************************************************************************************************************************************************************************************
skipping: [kube-node-1]

TASK [helm_install : Check if Helm binary exists] **************************************************************************************************************************************************************************************************************
skipping: [kube-node-1]

TASK [helm_install : Download Helm tarball] ********************************************************************************************************************************************************************************************************************
changed: [kube-node-1]

TASK [helm_install : Create Helm installation directory] *******************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Extract Helm binary] **********************************************************************************************************************************************************************************************************************
changed: [kube-node-1]

TASK [helm_install : Move Helm binary to /usr/local/bin] *******************************************************************************************************************************************************************************************************
changed: [kube-node-1]

TASK [helm_install : Clean up Helm tarball and extracted directory] ********************************************************************************************************************************************************************************************
changed: [kube-node-1] => (item=/tmp/helm-v3.18.4-linux-amd64.tar.gz)
changed: [kube-node-1] => (item=/tmp/linux-amd64)

TASK [helm_install : Verify Helm installation] *****************************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Display Helm version] *********************************************************************************************************************************************************************************************************************
ok: [kube-node-1] => {
"msg": "version.BuildInfo{Version:\"v3.18.4\", GitCommit:\"d80839cf37d860c8aa9a0503fe463278f26cd5e2\", GitTreeState:\"clean\", GoVersion:\"go1.24.4\"}"
}

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
kube-node-1 : ok=11 changed=4 unreachable=0 failed=0 skipped=2 rescued=0 ignored=1

4.3 Installing Cilium with Helm


root@kube-node-1:~# helm repo add cilium https://helm.cilium.io/
"cilium" has been added to your repositories
root@kube-node-1:~# helm repo add isovalent https://helm.isovalent.com
"isovalent" has been added to your repositories
root@kube-node-1:~#

Prepare the values file as follows:

root@kube-node-1:~# cat > cilium-enterprise-values.yaml <<EOF
hubble:
enabled: true
relay:
enabled: true
ui:
enabled: false

# Enable Gateway API
#gatewayAPI:
# enabled: true

# Explicitly disable Egress Gateway
#egressGateway:
# enabled: false

# BGP native-routing configuration
ipam:
mode: kubernetes
ipv4NativeRoutingCIDR: 172.16.0.0/20 # Advertises all pod CIDRs; ensure BGP router supports this
k8s:
requireIPv4PodCIDR: true
routingMode: native
autoDirectNodeRoutes: true
bgpControlPlane:
enabled: true
# Configure BGP peers (replace with your BGP router details)
announce:
podCIDR: true # Advertise pod CIDRs to BGP peers
enableIPv4Masquerade: true

# Enable kube-proxy replacement
kubeProxyReplacement: true

bpf:
masquerade: true
lb:
externalClusterIP: true
sock: true
EOF

root@kube-node-1:~# helm install cilium isovalent/cilium --version 1.17.6 \
--namespace kube-system \
--set kubeProxyReplacement=true \
--set k8sServiceHost=10.75.59.71 \
--set k8sServicePort=6443 \
-f cilium-enterprise-values.yaml
NAME: cilium
LAST DEPLOYED: Fri Aug 1 09:54:27 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay.

Your release version is 1.17.6.

For any further help, visit https://docs.isovalent.com/v1.17

The parameters k8sServiceHost=10.75.59.71 and k8sServicePort=6443 in the command above must not be omitted.
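
To double-check that the two parameters were picked up, the rendered agent configuration can be inspected; the key names below are those used by recent Cilium charts and are listed here as an assumption, not taken from the original session:

kubectl -n kube-system get configmap cilium-config -o yaml \
  | grep -E 'kube-proxy-replacement|k8s-service-host|k8s-service-port'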
Run kubeadm join on the other nodes:

ubuntu@kube-node-2:~$ sudo su
[sudo] password for ubuntu:
root@kube-node-2:/home/ubuntu# cd
root@kube-node-2:~# kubeadm join kube-node-1:6443 --token wnc2sl.st6g6c4o0cd42bi4 \
--discovery-token-ca-cert-hash sha256:381868d3e0faab6dbd3e240d8f40e0e81ab46cb54b2f15ffbfe0f587fac5d982
[preflight] Running pre-flight checks
[preflight] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[preflight] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0801 09:56:27.830756 47851 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:wnc2sl" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.503396779s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

After a short wait, Cilium and Hubble Relay become Ready.
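
While waiting, the rollout can be watched with (assumed commands, not part of the original session):

kubectl get nodes -w
kubectl -n kube-system rollout status daemonset/cilium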

root@kube-node-1:~# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-2vrgj 1/1 Running 0 3m58s 10.75.59.73 kube-node-3 <none> <none>
cilium-65kvc 1/1 Running 0 4m14s 10.75.59.72 kube-node-2 <none> <none>
cilium-envoy-24sd7 1/1 Running 0 4m14s 10.75.59.72 kube-node-2 <none> <none>
cilium-envoy-7pr4g 1/1 Running 0 6m12s 10.75.59.71 kube-node-1 <none> <none>
cilium-envoy-k86tp 1/1 Running 0 3m58s 10.75.59.73 kube-node-3 <none> <none>
cilium-operator-867fb7f659-2vnld 1/1 Running 0 6m12s 10.75.59.72 kube-node-2 <none> <none>
cilium-operator-867fb7f659-5998x 1/1 Running 0 6m12s 10.75.59.71 kube-node-1 <none> <none>
cilium-x4pr7 1/1 Running 0 6m12s 10.75.59.71 kube-node-1 <none> <none>
coredns-674b8bbfcf-6t8np 1/1 Running 0 13m 172.16.1.40 kube-node-2 <none> <none>
coredns-674b8bbfcf-bx8xd 1/1 Running 0 13m 172.16.1.64 kube-node-2 <none> <none>
etcd-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>
hubble-relay-cfb755899-gch8w 1/1 Running 0 6m12s 172.16.1.81 kube-node-2 <none> <none>
kube-apiserver-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>
kube-controller-manager-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>
kube-scheduler-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>

4.4 Installing the Enterprise cilium-cli

curl -L --remote-name-all https://github.com/isovalent/cilium-cli-releases/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}

sha256sum --check cilium-linux-amd64.tar.gz.sha256sum

tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

root@kube-node-1:~# cilium status
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: OK
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled

DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet cilium-envoy Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-envoy Running: 3
cilium-operator Running: 2
clustermesh-apiserver
hubble-relay Running: 1
Cluster Pods: 3/3 managed by Cilium
Helm chart version: 1.17.6
Image versions cilium quay.io/isovalent/cilium:v1.17.6-cee.1@sha256:2d01daf4f25f7d644889b49ca856e1a4269981fc963e50bd3962665b41b6adb3: 3
cilium-envoy quay.io/isovalent/cilium-envoy:v1.17.6-cee.1@sha256:318eff387835ca2717baab42a84f35a83a5f9e7d519253df87269f80b9ff0171: 3
cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265: 2
hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e: 1
root@kube-node-1:~#

4.5 Configure Cilium BGP

root@kube-node-1:~# cat > cilium-bgp.yaml << EOF
--- # BGP advertisement policy: announce Pod CIDRs and Service IPs
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: "PodCIDR"   # Only for Kubernetes or ClusterPool IPAM cluster-pool
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
          - ExternalIP
          #- LoadBalancerIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']} # effectively selects all Services

--- # BGP peer template (similar to a "template peer"); it references the advertisement policy above
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  timers:
    holdTimeSeconds: 30          # default 90s
    keepAliveTimeSeconds: 10     # default 30s
    connectRetryTimeSeconds: 40  # default 120s
  gracefulRestart:
    enabled: true
    restartTimeSeconds: 120      # default 120s
  #transport:
  #  peerPort: 179
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp"

--- # BGP neighbor configuration
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-default
spec:
  bgpInstances:
    - name: "instance-65000"
      localASN: 65000
      peers:
        - name: "GoBGP"
          peerASN: 65000
          peerAddress: 10.75.59.76
          peerConfigRef:
            name: "cilium-peer"
EOF
root@kube-node-1:~#
root@kube-node-1:~#
root@kube-node-1:~# kubectl apply -f cilium-bgp.yaml
ciliumbgpadvertisement.cilium.io/bgp-advertisements created
ciliumbgppeerconfig.cilium.io/cilium-peer created
ciliumbgpclusterconfig.cilium.io/cilium-bgp-default created
root@kube-node-1:~# cilium bgp peers
Node Local AS Peer AS Peer Address Session State Uptime Family Received Advertised
kube-node-1 65000 65000 10.75.59.76 established 7s ipv4/unicast 2 6
kube-node-2 65000 65000 10.75.59.76 established 6s ipv4/unicast 2 6
kube-node-3 65000 65000 10.75.59.76 established 6s ipv4/unicast 2 6
root@kube-node-1:~# cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node VRouter Prefix NextHop Age Attrs
kube-node-1 65000 172.16.0.0/24 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.37.239/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-2 65000 172.16.1.0/24 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.37.239/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-3 65000 172.16.3.0/24 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.37.239/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
root@kube-node-1:~#
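The advertised prefixes should also show up on the peer side. A quick check on the BGP peer VM at 10.75.59.76 (a sketch; it assumes the peer runs FRR with vtysh available, which is not shown in this capture):

vtysh -c 'show bgp summary'
vtysh -c 'show bgp ipv4 unicast'
ip route show proto bgp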

4.6 Install Hubble UI

root@kube-node-1:~# helm search repo isovalent/hubble-ui -l
NAME CHART VERSION APP VERSION DESCRIPTION
isovalent/hubble-ui 1.3.6 1.3.6 Hubble UI Enterprise
isovalent/hubble-ui 1.3.5 1.3.5 Hubble UI Enterprise

root@kube-node-1:~# cat > hubble-ui-values.yaml << EOF
relay:
address: "hubble-relay.kube-system.svc.cluster.local"
EOF
root@kube-node-1:~#
root@kube-node-1:~# helm install hubble-ui isovalent/hubble-ui --version 1.3.6 --namespace kube-system --values hubble-ui-values.yaml --wait
NAME: hubble-ui
LAST DEPLOYED: Fri Aug 1 10:47:58 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Hubble-Ui.
Your release version is 1.3.6.

For any further help, visit https://docs.isovalent.com
root@kube-node-1:~# kubectl patch service hubble-ui -n kube-system -p '{"spec": {"type": "NodePort"}}'
service/hubble-ui patched
root@kube-node-1:~# kubectl get svc -n kube-system -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
cilium-envoy ClusterIP None <none> 9964/TCP 54m k8s-app=cilium-envoy
hubble-peer ClusterIP 172.16.43.10 <none> 443/TCP 54m k8s-app=cilium
hubble-relay NodePort 172.16.37.239 <none> 80:31234/TCP 54m k8s-app=hubble-relay
hubble-ui NodePort 172.16.35.177 <none> 80:31225/TCP 64s k8s-app=hubble-ui
kube-dns ClusterIP 172.16.32.10 <none> 53/UDP,53/TCP,9153/TCP 61m k8s-app=kube-dns
root@kube-node-1:~# kubectl -n kube-system exec ds/cilium -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID Frontend Service Type Backend
1 172.16.32.1:443/TCP ClusterIP 1 => 10.75.59.71:6443/TCP (active)
2 172.16.43.10:443/TCP ClusterIP 1 => 10.75.59.71:4244/TCP (active)
3 172.16.37.239:80/TCP ClusterIP 1 => 172.16.1.81:4245/TCP (active)
4 10.75.59.71:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
5 0.0.0.0:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
6 172.16.32.10:53/TCP ClusterIP 1 => 172.16.1.64:53/TCP (active)
2 => 172.16.1.40:53/TCP (active)
7 172.16.32.10:9153/TCP ClusterIP 1 => 172.16.1.64:9153/TCP (active)
2 => 172.16.1.40:9153/TCP (active)
8 172.16.32.10:53/UDP ClusterIP 1 => 172.16.1.64:53/UDP (active)
2 => 172.16.1.40:53/UDP (active)
9 172.16.35.177:80/TCP ClusterIP 1 => 172.16.3.127:8081/TCP (active)
10 10.75.59.71:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)
11 0.0.0.0:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)

The Hubble UI is now reachable in a browser at http://10.75.59.71:31225/.

root@dns-server-vm:~# curl http://10.75.59.71:31225/
<!doctype html><html><head><meta charset="utf-8"/><title>Hubble UI Enterprise</title><meta http-equiv="X-UA-Compatible" content="IE=edge"/><meta name="viewport" content="width=device-width,user-scalable=0,initial-scale=1,minimum-scale=1,maximum-scale=1"/><link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png"/><link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png"/><link rel="shortcut icon" href="/favicon.ico"/><link rel="stylesheet" href="/fonts/inter/stylesheet.css"/><link rel="stylesheet" href="/fonts/roboto-mono/stylesheet.css"/><script defer="defer" src="/bundle.app.77bec96f333a96efe6ea.js"></script><link href="/bundle.app.f1e6c0c33f1535bc8508.css" rel="stylesheet"><script type="text/template" id="hubble-ui/feature-flags">[10, 0, 18, 0, 26, 0, 34, 0, 42, 0, 50, 0]</script><script type="text/template" id="hubble-ui/authorization">[8, 1, 26, 4, 24, 1, 32, 1]</script></head><body><div id="test-process-tree-char" style="font-family: 'Roboto Mono', monospace;
font-size: 16px;
position: absolute;
visibility: hidden;
height: auto;
width: auto;
white-space: nowrap;">a</div><div id="app"></div></body></html>root@dns-server-vm:~#
root@dns-server-vm:~#
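If exposing a NodePort is not wanted, the UI can also be reached with a plain port-forward from any machine that has kubectl access (a sketch, not part of the run above):

kubectl -n kube-system port-forward svc/hubble-ui 12000:80
# then browse to http://localhost:12000/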

5. Deploy the Star Wars App

5.1 Deploy the app

Isovalent provides a demo app; the deployment steps are shown below.

root@kube-node-1:~# kubectl create namespace star-wars
namespace/star-wars created
root@kube-node-1:~# kubectl apply -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
service/deathstar created
deployment.apps/deathstar created
pod/tiefighter created
pod/xwing created
root@kube-node-1:~# kubectl -n star-wars get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
deathstar-86f85ffb4d-4ldsj 1/1 Running 0 39s 172.16.1.231 kube-node-2 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
deathstar-86f85ffb4d-dbzft 1/1 Running 0 39s 172.16.3.161 kube-node-3 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
tiefighter 1/1 Running 0 39s 172.16.3.247 kube-node-3 <none> <none> app.kubernetes.io/name=tiefighter,class=tiefighter,org=empire
xwing 1/1 Running 0 39s 172.16.3.155 kube-node-3 <none> <none> app.kubernetes.io/name=xwing,class=xwing,org=alliance
root@kube-node-1:~# kubectl -n star-wars get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
deathstar ClusterIP 172.16.39.138 <none> 80/TCP 82s class=deathstar,org=empire
root@kube-node-1:~# kubectl -n star-wars patch service deathstar -p '{"spec":{"type":"NodePort"}}'
service/deathstar patched
root@kube-node-1:~# kubectl -n star-wars get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
deathstar NodePort 172.16.39.138 <none> 80:32271/TCP 112s class=deathstar,org=empire
root@kube-node-1:~# kubectl -n kube-system exec ds/cilium -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID Frontend Service Type Backend
1 172.16.32.1:443/TCP ClusterIP 1 => 10.75.59.71:6443/TCP (active)
2 172.16.43.10:443/TCP ClusterIP 1 => 10.75.59.71:4244/TCP (active)
3 172.16.37.239:80/TCP ClusterIP 1 => 172.16.1.81:4245/TCP (active)
4 10.75.59.71:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
5 0.0.0.0:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
6 172.16.32.10:53/TCP ClusterIP 1 => 172.16.1.64:53/TCP (active)
2 => 172.16.1.40:53/TCP (active)
7 172.16.32.10:9153/TCP ClusterIP 1 => 172.16.1.64:9153/TCP (active)
2 => 172.16.1.40:9153/TCP (active)
8 172.16.32.10:53/UDP ClusterIP 1 => 172.16.1.64:53/UDP (active)
2 => 172.16.1.40:53/UDP (active)
9 172.16.35.177:80/TCP ClusterIP 1 => 172.16.3.127:8081/TCP (active)
10 10.75.59.71:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)
11 0.0.0.0:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)
12 172.16.39.138:80/TCP ClusterIP 1 => 172.16.3.161:80/TCP (active)
2 => 172.16.1.231:80/TCP (active)
13 10.75.59.71:32271/TCP NodePort 1 => 172.16.3.161:80/TCP (active)
2 => 172.16.1.231:80/TCP (active)
14 0.0.0.0:32271/TCP NodePort 1 => 172.16.3.161:80/TCP (active)
2 => 172.16.1.231:80/TCP (active)
At this point the deployment is complete.

root@kube-node-1:~# kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed
root@kube-node-1:~# kubectl -n star-wars exec xwing -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed
root@kube-node-1:~#

External hosts can also reach the service through any node IP and the NodePort.

root@dns-server-vm:~# curl -s -XPOST http://10.75.59.72:32271/v1/request-landing
Ship landed
root@dns-server-vm:~# curl -s -XPOST http://10.75.59.71:32271/v1/request-landing
Ship landed
root@dns-server-vm:~# curl -s -XPOST http://10.75.59.73:32271/v1/request-landing
Ship landed
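Because the Service ClusterIPs are advertised over BGP (see the CiliumBGPAdvertisement above), the deathstar Service should in principle also be reachable from the BGP peer by its ClusterIP; a sketch, assuming the 172.16.32.0/20 routes are installed on that host and external ClusterIP access is enabled in the Cilium Helm values:

curl -s -XPOST http://172.16.39.138/v1/request-landing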

5.2 Packet capture outside the nodes (Pod to Pod)

Find the vNIC that attaches each VM to the bridge.

root@ois:/home/ois/data/k8s# virsh list
Id Name State
--------------------------------------
4 win1 running
17 r1 running
69 ubuntu-2404-desktop running
96 dns-server-vm running
97 u1 running
98 ubuntu24042 running
99 kube-node-1 running
100 kube-node-2 running
101 kube-node-3 running

root@ois:/home/ois/data/k8s# virsh domiflist 99
Interface Type Source Model MAC
-----------------------------------------------------------
vnet90 bridge br0 virtio 52:54:00:90:8c:cf

root@ois:/home/ois/data/k8s# virsh domiflist 100
Interface Type Source Model MAC
-----------------------------------------------------------
vnet91 bridge br0 virtio 52:54:00:ba:d4:1f

root@ois:/home/ois/data/k8s# virsh domiflist 101
Interface Type Source Model MAC
-----------------------------------------------------------
vnet92 bridge br0 virtio 52:54:00:37:e0:96


Start a request from inside a Pod.

root@kube-node-1:~# kubectl -n star-wars get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deathstar-86f85ffb4d-4ldsj 1/1 Running 0 4m1s 172.16.1.231 kube-node-2 <none> <none>
deathstar-86f85ffb4d-dbzft 1/1 Running 0 4m1s 172.16.3.161 kube-node-3 <none> <none>
tiefighter 1/1 Running 0 4m1s 172.16.3.247 kube-node-3 <none> <none>
xwing 1/1 Running 0 4m1s 172.16.3.155 kube-node-3 <none> <none>
root@kube-node-1:~# kubectl -n star-wars exec xwing -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
11: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether f2:2a:b8:da:e7:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.16.3.155/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::f02a:b8ff:feda:e7d2/64 scope link
valid_lft forever preferred_lft forever
root@kube-node-1:~# kubectl -n star-wars exec xwing -- ping 172.16.1.231
error: Internal error occurred: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "1b1120795a5b60f35bac5a4056a1714a5c0df32762e2a79f272cf2c82089e970": OCI runtime exec failed: exec failed: unable to start container process: exec: "ping": executable file not found in $PATH: unknown
root@kube-node-1:~# kubectl -n star-wars exec xwing -- curl -s http://172.16.1.231/v1
{
"name": "Death Star",
"hostname": "deathstar-86f85ffb4d-4ldsj",
"model": "DS-1 Orbital Battle Station",
"manufacturer": "Imperial Department of Military Research, Sienar Fleet Systems",
"cost_in_credits": "1000000000000",
"length": "120000",
"crew": "342953",
"passengers": "843342",
"cargo_capacity": "1000000000000",
"hyperdrive_rating": "4.0",
"starship_class": "Deep Space Mobile Battlestation",
"api": [
"GET /v1",
"GET /v1/healthz",
"POST /v1/request-landing",
"PUT /v1/cargobay",
"GET /v1/hyper-matter-reactor/status",
"PUT /v1/exhaust-port"
]
}

Capturing on the hypervisor shows that the two Pods talk to each other via direct routing, with the original Pod IPs on the wire.

root@ois:/home/ois/data/k8s# tcpdump -i vnet92 -vn 'tcp port 80'
tcpdump: listening on vnet92, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:11:11.478887 IP (tos 0x0, ttl 63, id 25688, offset 0, flags [DF], proto TCP (6), length 60)
172.16.3.155.37162 > 172.16.1.231.80: Flags [S], cksum 0x5dd1 (incorrect -> 0xdb07), seq 542023762, win 64240, options [mss 1460,sackOK,TS val 1624400247 ecr 0,nop,wscale 7], length 0
11:11:11.479178 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
172.16.1.231.80 > 172.16.3.155.37162: Flags [S.], cksum 0x5dd1 (incorrect -> 0x7b14), seq 3892089835, ack 542023763, win 65160, options [mss 1460,sackOK,TS val 727758081 ecr 1624400247,nop,wscale 7], length 0
11:11:11.479479 IP (tos 0x0, ttl 63, id 25689, offset 0, flags [DF], proto TCP (6), length 52)
172.16.3.155.37162 > 172.16.1.231.80: Flags [.], cksum 0x5dc9 (incorrect -> 0xa673), ack 1, win 502, options [nop,nop,TS val 1624400247 ecr 727758081], length 0
11:11:11.479552 IP (tos 0x0, ttl 63, id 25690, offset 0, flags [DF], proto TCP (6), length 130)
172.16.3.155.37162 > 172.16.1.231.80: Flags [P.], cksum 0x5e17 (incorrect -> 0x366d), seq 1:79, ack 1, win 502, options [nop,nop,TS val 1624400247 ecr 727758081], length 78: HTTP, length: 78
GET /v1 HTTP/1.1
Host: 172.16.1.231
User-Agent: curl/7.88.1
Accept: */*

11:11:11.479684 IP (tos 0x0, ttl 63, id 13836, offset 0, flags [DF], proto TCP (6), length 52)
172.16.1.231.80 > 172.16.3.155.37162: Flags [.], cksum 0x5dc9 (incorrect -> 0xa61e), ack 79, win 509, options [nop,nop,TS val 727758081 ecr 1624400247], length 0
11:11:11.480420 IP (tos 0x0, ttl 63, id 13837, offset 0, flags [DF], proto TCP (6), length 746)
172.16.1.231.80 > 172.16.3.155.37162: Flags [P.], cksum 0x607f (incorrect -> 0xc657), seq 1:695, ack 79, win 509, options [nop,nop,TS val 727758082 ecr 1624400247], length 694: HTTP, length: 694
HTTP/1.1 200 OK
Content-Type: text/plain
Date: Fri, 01 Aug 2025 03:11:11 GMT
Content-Length: 591

{
"name": "Death Star",
"hostname": "deathstar-86f85ffb4d-4ldsj",
"model": "DS-1 Orbital Battle Station",
"manufacturer": "Imperial Department of Military Research, Sienar Fleet Systems",
"cost_in_credits": "1000000000000",
"length": "120000",
"crew": "342953",
"passengers": "843342",
"cargo_capacity": "1000000000000",
"hyperdrive_rating": "4.0",
"starship_class": "Deep Space Mobile Battlestation",
"api": [
"GET /v1",
"GET /v1/healthz",
"POST /v1/request-landing",
"PUT /v1/cargobay",
"GET /v1/hyper-matter-reactor/status",
"PUT /v1/exhaust-port"
]
}

Cilium automatically installed the routes to each node's Pod CIDR:

root@kube-node-3:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.73
172.16.0.0/24 via 10.75.59.71 dev enp1s0 proto kernel
172.16.1.0/24 via 10.75.59.72 dev enp1s0 proto kernel
172.16.3.0/24 via 172.16.3.22 dev cilium_host proto kernel src 172.16.3.22
172.16.3.22 dev cilium_host proto kernel scope link
root@kube-node-3:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.75.59.1 0.0.0.0 UG 0 0 0 enp1s0
10.75.59.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
172.16.0.0 10.75.59.71 255.255.255.0 UG 0 0 0 enp1s0
172.16.1.0 10.75.59.72 255.255.255.0 UG 0 0 0 enp1s0
172.16.3.0 172.16.3.22 255.255.255.0 UG 0 0 0 cilium_host
172.16.3.22 0.0.0.0 255.255.255.255 UH 0 0 0 cilium_host

root@kube-node-2:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.72
172.16.0.0/24 via 10.75.59.71 dev enp1s0 proto kernel
172.16.1.0/24 via 172.16.1.128 dev cilium_host proto kernel src 172.16.1.128
172.16.1.128 dev cilium_host proto kernel scope link
172.16.3.0/24 via 10.75.59.73 dev enp1s0 proto kernel
root@kube-node-2:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.75.59.1 0.0.0.0 UG 0 0 0 enp1s0
10.75.59.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
172.16.0.0 10.75.59.71 255.255.255.0 UG 0 0 0 enp1s0
172.16.1.0 172.16.1.128 255.255.255.0 UG 0 0 0 cilium_host
172.16.1.128 0.0.0.0 255.255.255.255 UH 0 0 0 cilium_host
172.16.3.0 10.75.59.73 255.255.255.0 UG 0 0 0 enp1s0
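These are the routes that native routing with autoDirectNodeRoutes installs: each node has a direct route to every other node's Pod CIDR via the node IP. The datapath settings can be double-checked from the agent (a sketch):

kubectl -n kube-system exec ds/cilium -- cilium status | grep -i -E 'routing|masquerading'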

In the Hubble UI the flows can be inspected in detail; an example flow record is shown below.

{
  "uuid": "2af76725-f6f9-49b4-b9f1-8d01b3f14930",
  "verdict": 1,
  "drop_reason": 0,
  "auth_type": 0,
  "Type": 1,
  "node_name": "kube-node-2",
  "node_labels": [
    "beta.kubernetes.io/arch=amd64",
    "beta.kubernetes.io/os=linux",
    "kubernetes.io/arch=amd64",
    "kubernetes.io/hostname=kube-node-2",
    "kubernetes.io/os=linux"
  ],
  "source_names": [],
  "destination_names": [],
  "reply": false,
  "traffic_direction": 2,
  "policy_match_type": 0,
  "trace_observation_point": 101,
  "trace_reason": 1,
  "drop_reason_desc": 0,
  "debug_capture_point": 0,
  "proxy_port": 0,
  "sock_xlate_point": 0,
  "socket_cookie": 0,
  "cgroup_id": 0,
  "Summary": "TCP Flags: SYN",
  "egress_allowed_by": [],
  "ingress_allowed_by": [],
  "egress_denied_by": [],
  "ingress_denied_by": [],
  "time": {
    "seconds": 1754022736,
    "nanos": 962926009
  },
  "ethernet": {
    "source": "f2:64:9f:b5:8e:81",
    "destination": "82:31:36:d1:5a:00"
  },
  "IP": {
    "source": "172.16.3.155",
    "source_xlated": "",
    "destination": "172.16.1.231",
    "ipVersion": 1,
    "encrypted": false
  },
  "l4": {
    "protocol": {
      "oneofKind": "TCP",
      "TCP": {
        "source_port": 38680,
        "destination_port": 80,
        "flags": {
          "FIN": false,
          "SYN": true,
          "RST": false,
          "PSH": false,
          "ACK": false,
          "URG": false,
          "ECE": false,
          "CWR": false,
          "NS": false
        }
      }
    }
  },
  "source": {
    "ID": 0,
    "identity": 36770,
    "cluster_name": "default",
    "namespace": "star-wars",
    "labels": [
      "k8s:app.kubernetes.io/name=xwing",
      "k8s:class=xwing",
      "k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=star-wars",
      "k8s:io.cilium.k8s.policy.cluster=default",
      "k8s:io.cilium.k8s.policy.serviceaccount=default",
      "k8s:io.kubernetes.pod.namespace=star-wars",
      "k8s:org=alliance"
    ],
    "pod_name": "xwing",
    "workloads": []
  },
  "destination": {
    "ID": 284,
    "identity": 15153,
    "cluster_name": "default",
    "namespace": "star-wars",
    "labels": [
      "k8s:app.kubernetes.io/name=deathstar",
      "k8s:class=deathstar",
      "k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=star-wars",
      "k8s:io.cilium.k8s.policy.cluster=default",
      "k8s:io.cilium.k8s.policy.serviceaccount=default",
      "k8s:io.kubernetes.pod.namespace=star-wars",
      "k8s:org=empire"
    ],
    "pod_name": "deathstar-86f85ffb4d-4ldsj",
    "workloads": [
      {
        "name": "deathstar",
        "kind": "Deployment"
      }
    ]
  },
  "event_type": {
    "type": 4,
    "sub_type": 0
  },
  "is_reply": {
    "value": false
  },
  "interface": {
    "index": 14,
    "name": "lxcf734e435e18a"
  }
}
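The same flow records can be pulled from the command line through Hubble Relay; a sketch, assuming the hubble CLI is installed (this lab did not install it explicitly):

cilium hubble port-forward &
hubble observe --namespace star-wars --pod xwing --follow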

5.3 Packet capture outside the nodes (Pod to External)

Next, test Internet access from inside a Pod.

root@kube-node-1:~# kubectl get configmap cilium-config -n kube-system -o yaml | grep -E 'enable-ipv4-masquerade|enable-bpf-masquerade'
enable-bpf-masquerade: "true"
enable-ipv4-masquerade: "true"
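With both masquerade options enabled, traffic leaving the cluster from a Pod IP is SNAT-ed to the node IP by the BPF datapath. This can also be confirmed from the agent itself (a sketch):

kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -i masquerading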

Pod xwing runs on kube-node-3, so look up the Cilium agent on that node.

root@kube-node-1:~# kubectl get pods -n kube-system -l k8s-app=cilium -o wide | grep kube-node-3
cilium-2vrgj 1/1 Running 0 106m 10.75.59.73 kube-node-3 <none> <none>

root@kube-node-1:~# kubectl -n star-wars exec xwing -- curl -s https://echo.free.beeceptor.com
{
"method": "GET",
"protocol": "https",
"host": "echo.free.beeceptor.com",
"path": "/",
"ip": "64.104.44.105:35834",
"headers": {
"Host": "echo.free.beeceptor.com",
"User-Agent": "curl/7.88.1",
"Accept": "*/*",
"Via": "2.0 Caddy",
"Accept-Encoding": "gzip"
},
"parsedQueryParams": {}
}root@kube-node-1:~#
root@kube-node-1:~#

Use cilium bpf nat list to inspect the NAT table.

root@kube-node-1:~# kubectl exec -n kube-system cilium-2vrgj -- cilium bpf nat list | grep 172.16.3.155
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
TCP OUT 172.16.3.155:35834 -> 147.182.252.2:443 XLATE_SRC 10.75.59.73:35834 Created=4sec ago NeedsCT=0
TCP IN 147.182.252.2:443 -> 10.75.59.73:35834 XLATE_DST 172.16.3.155:35834 Created=4sec ago NeedsCT=0
root@kube-node-1:~#
root@kube-node-1:~#

Packet capture outside the node:
root@ois:/home/ois/data/k8s# tcpdump -i vnet92 -vn 'tcp port 443'
tcpdump: listening on vnet92, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:28:22.307306 IP (tos 0x0, ttl 63, id 45759, offset 0, flags [DF], proto TCP (6), length 60)
10.75.59.73.49208 > 147.182.252.2.443: Flags [S], cksum 0xd57b (incorrect -> 0x363a), seq 3275650602, win 64240, options [mss 1460,sackOK,TS val 493037768 ecr 0,nop,wscale 7], length 0
12:28:22.477754 IP (tos 0x0, ttl 42, id 0, offset 0, flags [DF], proto TCP (6), length 60)
147.182.252.2.443 > 10.75.59.73.49208: Flags [S.], cksum 0xed16 (correct), seq 450982431, ack 3275650603, win 65160, options [mss 1254,sackOK,TS val 1573215106 ecr 493037768,nop,wscale 7], length 0
12:28:22.478053 IP (tos 0x0, ttl 63, id 45760, offset 0, flags [DF], proto TCP (6), length 52)
10.75.59.73.49208 > 147.182.252.2.443: Flags [.], cksum 0xd573 (incorrect -> 0x16fd), ack 1, win 502, options [nop,nop,TS val 493037939 ecr 1573215106], length 0
12:28:22.490922 IP (tos 0x0, ttl 63, id 45761, offset 0, flags [DF], proto TCP (6), length 569)
10.75.59.73.49208 > 147.182.252.2.443: Flags [P.], cksum 0xd778 (incorrect -> 0x1d27), seq 1:518, ack 1, win 502, options [nop,nop,TS val 493037952 ecr 1573215106], length 517

6. Automating the K8s and Cilium installation

The next step is to automate the K8s and Cilium installation with a script.

First, kube-node-1 must be able to log in to kube-node-2 and kube-node-3 without a password.

6.1 Passwordless SSH setup

#!/bin/bash

# --- Configuration ---
ANSIBLE_DIR="ansible_ssh_setup"
INVENTORY_FILE="${ANSIBLE_DIR}/hosts.ini"
PLAYBOOK_FILE="${ANSIBLE_DIR}/setup_ssh.yml"

# Kubernetes Node IPs
KUBE_NODE_1_IP="10.75.59.71"
KUBE_NODE_2_IP="10.75.59.72"
KUBE_NODE_3_IP="10.75.59.73"

# Common Ansible user and Python interpreter
ANSIBLE_USER="ubuntu"
ANSIBLE_PYTHON_INTERPRETER="/usr/bin/python3"

# --- Functions ---

# Function to check and install Ansible
install_ansible() {
if ! command -v ansible &> /dev/null
then
echo "Ansible not found. Attempting to install Ansible..."
if [ -f /etc/debian_version ]; then
# Debian/Ubuntu
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
elif [ -f /etc/redhat-release ]; then
# CentOS/RHEL/Fedora
sudo yum install -y epel-release
sudo yum install -y ansible
else
echo "Unsupported OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
if ! command -v ansible &> /dev/null; then
echo "Ansible installation failed. Please install it manually and re-run this script."
exit 1
fi
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create Ansible inventory file
create_inventory() {
echo "Creating Ansible inventory file: ${INVENTORY_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<EOF > "$INVENTORY_FILE"
[kubernetes_nodes]
kube-node-1 ansible_host=${KUBE_NODE_1_IP}
kube-node-2 ansible_host=${KUBE_NODE_2_IP}
kube-node-3 ansible_host=${KUBE_NODE_3_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_python_interpreter=${ANSIBLE_PYTHON_INTERPRETER}
EOF
echo "Inventory file created."
}

# Function to create Ansible playbook file
create_playbook() {
echo "Creating Ansible playbook file: ${PLAYBOOK_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<'EOF' > "$PLAYBOOK_FILE"
---
- name: Generate SSH key on kube-node-1 and distribute to other nodes
  hosts: kubernetes_nodes
  become: yes

  tasks:
    - name: Generate SSH key on kube-node-1
      ansible.builtin.command:
        cmd: ssh-keygen -t rsa -b 4096 -N "" -f /root/.ssh/id_rsa
        creates: /root/.ssh/id_rsa
      when: inventory_hostname == 'kube-node-1'

    - name: Ensure .ssh directory exists on all nodes
      ansible.builtin.file:
        path: /root/.ssh
        state: directory
        mode: '0700'

    - name: Ensure authorized_keys file exists
      ansible.builtin.file:
        path: /root/.ssh/authorized_keys
        state: touch
        mode: '0600'

    - name: Fetch public key from kube-node-1
      ansible.builtin.slurp:
        src: /root/.ssh/id_rsa.pub
      register: ssh_public_key
      when: inventory_hostname == 'kube-node-1'

    - name: Distribute public key to kube-node-2 and kube-node-3
      ansible.builtin.lineinfile:
        path: /root/.ssh/authorized_keys
        line: "{{ hostvars['kube-node-1']['ssh_public_key']['content'] | b64decode }}"
        state: present
      when: inventory_hostname in ['kube-node-2', 'kube-node-3']
EOF
echo "Playbook file created."
}

# --- Main Script Execution ---

echo "Starting Ansible SSH key setup process..."

# 1. Install Ansible if not present
install_ansible

# 2. Create Ansible inventory file
create_inventory

# 3. Create Ansible playbook file
create_playbook

echo "Setup complete. You can now run the Ansible playbook manually using:"
echo "ansible-playbook -i \"$INVENTORY_FILE\" \"$PLAYBOOK_FILE\" --ask-become-pass"
echo "You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs."
echo "Process complete."
chmod +x ansible_ssh.sh
./ansible_ssh.sh

ois@ois:~/data/k8s$ ./ansible_ssh.sh
Starting Ansible SSH key setup process...
Ansible is already installed.
Creating Ansible inventory file: ansible_ssh_setup/hosts.ini
Inventory file created.
Creating Ansible playbook file: ansible_ssh_setup/setup_ssh.yml
Playbook file created.
Setup complete. You can now run the Ansible playbook manually using:
ansible-playbook -i "ansible_ssh_setup/hosts.ini" "ansible_ssh_setup/setup_ssh.yml" --ask-become-pass
You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs.
Process complete.
ois@ois:~/data/k8s$ cd ansible_ssh_setup/
ois@ois:~/data/k8s/ansible_ssh_setup$ ansible-playbook setup_ssh.yml -i hosts.ini -K
BECOME password:

PLAY [Generate SSH key on kube-node-1 and distribute to other nodes] ********************************************************************************************************

TASK [Gathering Facts] ******************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-3]
ok: [kube-node-2]

TASK [Generate SSH key on kube-node-1] **************************************************************************************************************************************
skipping: [kube-node-2]
skipping: [kube-node-3]
changed: [kube-node-1]

TASK [Ensure .ssh directory exists on all nodes] ****************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-1]
ok: [kube-node-3]

TASK [Ensure authorized_keys file exists] ***********************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [Fetch public key from kube-node-1] ************************************************************************************************************************************
skipping: [kube-node-2]
skipping: [kube-node-3]
ok: [kube-node-1]

TASK [Distribute public key to kube-node-2 and kube-node-3] *****************************************************************************************************************
skipping: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

PLAY RECAP ******************************************************************************************************************************************************************
kube-node-1 : ok=5 changed=2 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
kube-node-2 : ok=4 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
kube-node-3 : ok=4 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0

Result:

root@kube-node-1:~# ssh root@10.75.59.72
Welcome to Ubuntu 24.04.2 LTS (GNU/Linux 6.8.0-63-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/pro

System information as of Fri Aug 1 02:28:23 PM CST 2025

System load: 0.13 Processes: 188
Usage of /: 30.2% of 18.33GB Users logged in: 1
Memory usage: 10% IPv4 address for enp1s0: 10.75.59.72
Swap usage: 0%

* Strictly confined Kubernetes makes edge and IoT secure. Learn how MicroK8s
just raised the bar for easy, resilient and secure K8s cluster deployment.

https://ubuntu.com/engage/secure-kubernetes-at-the-edge

Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status

*** System restart required ***
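For reference, the playbook is roughly equivalent to the following manual steps on kube-node-1 (a sketch; it assumes the key can be copied as root, e.g. with password authentication temporarily enabled, which the playbook avoids by using Ansible's become):

ssh-keygen -t rsa -b 4096 -N "" -f /root/.ssh/id_rsa
for ip in 10.75.59.72 10.75.59.73; do
  ssh-copy-id -i /root/.ssh/id_rsa.pub root@${ip}
done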

6.2 Automated installation and setup script

#!/bin/bash

# ==============================================================================
# Idempotent Kubernetes and Cilium Setup Script
#
# This script can be run multiple times. It checks the current state
# at each step and only performs actions if necessary.
# It must be run as root on the primary control-plane node.
# ==============================================================================

# --- Configuration ---
CONTROL_PLANE_ENDPOINT="kube-node-1"
CONTROL_PLANE_IP="10.75.59.71"
WORKER_NODES=("10.75.59.72" "10.75.59.73")
POD_CIDR="172.16.0.0/20"
SERVICE_CIDR="172.16.32.0/20"
BGP_PEER_IP="10.75.59.76"
LOCAL_ASN=65000
PEER_ASN=65000
CILIUM_VERSION="1.17.6"
HUBBLE_UI_VERSION="1.3.6"

# ==============================================================================
# Helper Function
# ==============================================================================
print_header() { echo -e "\n### $1 ###"; }

# ==============================================================================
# STEP 1: Initialize Kubernetes Control-Plane
# ==============================================================================
print_header "STEP 1: Initializing Kubernetes Control-Plane"

if kubectl get nodes &> /dev/null; then
echo "✅ Kubernetes cluster is already running. Skipping kubeadm init."
else
echo "--> Kubernetes cluster not found. Initializing..."
kubeadm config images pull
kubeadm init \
--control-plane-endpoint=${CONTROL_PLANE_ENDPOINT} \
--pod-network-cidr=${POD_CIDR} \
--service-cidr=${SERVICE_CIDR} \
--skip-phases=addon/kube-proxy
mkdir -p /root/.kube
cp -i /etc/kubernetes/admin.conf /root/.kube/config
echo "✅ Control-Plane initialization complete."
fi

# ==============================================================================
# STEP 2: Install or Upgrade Cilium CNI
# ==============================================================================
print_header "STEP 2: Installing or Upgrading Cilium CNI"

helm repo add cilium https://helm.cilium.io/ &> /dev/null
helm repo add isovalent https://helm.isovalent.com/ &> /dev/null
helm repo update > /dev/null

cat > cilium-values.yaml <<EOF
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: false
ipam:
  mode: kubernetes
ipv4NativeRoutingCIDR: ${POD_CIDR}
k8s:
  requireIPv4PodCIDR: true
routingMode: native
autoDirectNodeRoutes: true
enableIPv4Masquerade: true
bgpControlPlane:
  enabled: true
  announce:
    podCIDR: true
kubeProxyReplacement: true
bpf:
  masquerade: true
  lb:
    externalClusterIP: true
    sock: true
EOF

if helm status cilium -n kube-system &> /dev/null; then
echo "--> Cilium is already installed. Upgrading to apply latest configuration..."
helm upgrade cilium isovalent/cilium --version ${CILIUM_VERSION} --namespace kube-system --set k8sServiceHost=${CONTROL_PLANE_IP},k8sServicePort=6443 -f cilium-values.yaml
else
echo "--> Cilium not found. Installing..."
helm install cilium isovalent/cilium --version ${CILIUM_VERSION} --namespace kube-system --set k8sServiceHost=${CONTROL_PLANE_IP},k8sServicePort=6443 -f cilium-values.yaml
fi
echo "--> Waiting for Cilium pods to become ready..."
kubectl -n kube-system wait --for=condition=Ready pod -l k8s-app=cilium --timeout=5m
echo "✅ Cilium is configured."

# ==============================================================================
# STEP 3: Join Worker Nodes to the Cluster
# ==============================================================================
print_header "STEP 3: Joining Worker Nodes"

for NODE_IP in "${WORKER_NODES[@]}"; do
if kubectl get nodes -o wide | grep -q "$NODE_IP"; then
echo "✅ Node ${NODE_IP} is already in the cluster. Skipping join."
else
echo "--> Node ${NODE_IP} not found in cluster. Attempting to join..."
JOIN_COMMAND=$(kubeadm token create --print-join-command)
ssh -o StrictHostKeyChecking=no root@${NODE_IP} "${JOIN_COMMAND}"
if [ $? -ne 0 ]; then
echo "❌ Failed to join node ${NODE_IP}. Please check SSH connectivity and logs." >&2
exit 1
fi
echo "✅ Node ${NODE_IP} joined successfully."
fi
done

# ==============================================================================
# STEP 4: Install Cilium CLI
# ==============================================================================
print_header "STEP 4: Installing Cilium CLI"

if command -v cilium &> /dev/null; then
echo "✅ Cilium CLI is already installed. Skipping."
else
echo "--> Installing Cilium CLI..."
curl -L --silent --remote-name-all https://github.com/isovalent/cilium-cli-releases/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-amd64.tar.gz.sha256sum > /dev/null
tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin > /dev/null
rm cilium-linux-amd64.tar.gz cilium-linux-amd64.tar.gz.sha256sum
echo "✅ Cilium CLI installed."
fi

# ==============================================================================
# STEP 5: Configure Cilium BGP Peering
# ==============================================================================
print_header "STEP 5: Configuring Cilium BGP Peering"

echo "--> Applying BGP configuration. 'unchanged' means it's already correct."
# CORRECTION: Using the correct schema with matchLabels.
cat > cilium-bgp.yaml << EOF
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: "PodCIDR"   # Only for Kubernetes or ClusterPool IPAM cluster-pool
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
          - ExternalIP
          #- LoadBalancerIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']}

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  timers:
    holdTimeSeconds: 30          # default 90s
    keepAliveTimeSeconds: 10     # default 30s
    connectRetryTimeSeconds: 40  # default 120s
  gracefulRestart:
    enabled: true
    restartTimeSeconds: 120      # default 120s
  #transport:
  #  peerPort: 179
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp"

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-default
spec:
  bgpInstances:
    - name: "instance-65000"
      localASN: ${LOCAL_ASN}
      peers:
        - name: "FRR_BGP"
          peerASN: ${PEER_ASN}
          peerAddress: ${BGP_PEER_IP}
          peerConfigRef:
            name: "cilium-peer"
EOF

# Apply the configuration and check for errors
kubectl apply -f cilium-bgp.yaml
if [ $? -ne 0 ]; then
echo "❌ Failed to apply BGP configuration. Please check the errors above." >&2
exit 1
fi
echo "✅ BGP configuration applied."

# ==============================================================================
# STEP 6: Install or Upgrade Hubble UI
# ==============================================================================
print_header "STEP 6: Installing or Upgrading Hubble UI"

cat > hubble-ui-values.yaml << EOF
relay:
address: "hubble-relay.kube-system.svc.cluster.local"
EOF

if helm status hubble-ui -n kube-system &> /dev/null; then
echo "--> Hubble UI is already installed. Upgrading..."
helm upgrade hubble-ui isovalent/hubble-ui --version ${HUBBLE_UI_VERSION} --namespace kube-system --values hubble-ui-values.yaml --wait
else
echo "--> Hubble UI not found. Installing..."
helm install hubble-ui isovalent/hubble-ui --version ${HUBBLE_UI_VERSION} --namespace kube-system --values hubble-ui-values.yaml --wait
fi

SERVICE_TYPE=$(kubectl get service hubble-ui -n kube-system -o jsonpath='{.spec.type}')
if [ "$SERVICE_TYPE" != "NodePort" ]; then
echo "--> Patching Hubble UI service to NodePort..."
kubectl patch service hubble-ui -n kube-system -p '{"spec": {"type": "NodePort"}}'
else
echo "--> Hubble UI service is already of type NodePort."
fi
echo "✅ Hubble UI is configured."

# ==============================================================================
# STEP 7: Final Verification
# ==============================================================================
print_header "STEP 7: Final Verification"
echo "--> Waiting for all nodes to be ready..."
kubectl wait --for=condition=Ready node --all --timeout=5m
echo "--> Checking Cilium status..."
cilium status --wait

HUBBLE_UI_PORT=$(kubectl get service hubble-ui -n kube-system -o jsonpath='{.spec.ports[0].nodePort}')
echo -e "\n----------------------------------------------------------------"
echo "🚀 Cluster setup is complete and verified!"
echo "Access Hubble UI at: http://${CONTROL_PLANE_IP}:${HUBBLE_UI_PORT}"
echo "----------------------------------------------------------------"
root@kube-node-1:~# ./k8s-cilium-setup.sh 
### STEP 1: Initializing Kubernetes Control-Plane on kube-node-1... ###
--> Pulling Kubernetes container images...
[config/images] Pulled registry.k8s.io/kube-apiserver:v1.33.3
[config/images] Pulled registry.k8s.io/kube-controller-manager:v1.33.3
[config/images] Pulled registry.k8s.io/kube-scheduler:v1.33.3
[config/images] Pulled registry.k8s.io/kube-proxy:v1.33.3
[config/images] Pulled registry.k8s.io/coredns/coredns:v1.12.0
[config/images] Pulled registry.k8s.io/pause:3.10
[config/images] Pulled registry.k8s.io/etcd:3.5.21-0
--> Running kubeadm init...
[init] Using Kubernetes version: v1.33.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-node-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [172.16.32.1 10.75.59.71]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.001873284s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://10.75.59.71:6443/livez
[control-plane-check] Checking kube-controller-manager at https://127.0.0.1:10257/healthz
[control-plane-check] Checking kube-scheduler at https://127.0.0.1:10259/livez
[control-plane-check] kube-controller-manager is healthy after 2.272937689s
[control-plane-check] kube-scheduler is healthy after 3.04977489s
[control-plane-check] kube-apiserver is healthy after 5.003048769s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 0q5m6l.5pc7hz15orcc0b6b
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join kube-node-1:6443 --token 0q5m6l.5pc7hz15orcc0b6b \
--discovery-token-ca-cert-hash sha256:4795595e8237c54f1bf20c7fb56feea9a1960af5802c08c410733d54b5e317a6 \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join kube-node-1:6443 --token 0q5m6l.5pc7hz15orcc0b6b \
--discovery-token-ca-cert-hash sha256:4795595e8237c54f1bf20c7fb56feea9a1960af5802c08c410733d54b5e317a6
--> Configuring kubectl...
✅ Control-Plane initialization complete.
### STEP 2: Installing Cilium... ###
--> Adding Helm repositories...
"cilium" has been added to your repositories
"isovalent" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "cilium" chart repository
...Successfully got an update from the "isovalent" chart repository
Update Complete. ⎈Happy Helming!⎈
--> Creating cilium-enterprise-values.yaml...
--> Installing Cilium with Helm...
NAME: cilium
LAST DEPLOYED: Fri Aug 1 14:16:16 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay.

Your release version is 1.17.6.

For any further help, visit https://docs.isovalent.com/v1.17
--> Waiting for Cilium pods to become ready...
pod/cilium-5ngcm condition met
✅ Cilium installation complete.
### STEP 3: Joining Worker Nodes... ###
--> Generated join command: kubeadm join kube-node-1:6443 --token whltm8.4zs5af6hiht167da --discovery-token-ca-cert-hash sha256:4795595e8237c54f1bf20c7fb56feea9a1960af5802c08c410733d54b5e317a6
--> Joining node 10.75.59.72 to the cluster...
[preflight] Running pre-flight checks
[preflight] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[preflight] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0801 14:18:01.674767 20870 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:whltm8" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.501212696s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

✅ Node 10.75.59.72 joined successfully.
--> Joining node 10.75.59.73 to the cluster...
[preflight] Running pre-flight checks
[preflight] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[preflight] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0801 14:18:07.347008 20684 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:whltm8" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002187345s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

✅ Node 10.75.59.73 joined successfully.
### STEP 4: Installing the Cilium CLI... ###
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
100 59.2M 100 59.2M 0 0 13.1M 0 0:00:04 0:00:04 --:--:-- 23.8M
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
100 92 100 92 0 0 70 0 0:00:01 0:00:01 --:--:-- 70
cilium-linux-amd64.tar.gz: OK
cilium
✅ Cilium CLI installed.
### STEP 5: Configuring Cilium BGP Peering ###
--> Applying BGP configuration. 'unchanged' means it's already correct.
ciliumbgpadvertisement.cilium.io/bgp-advertisements unchanged
ciliumbgppeerconfig.cilium.io/cilium-peer unchanged
ciliumbgpclusterconfig.cilium.io/cilium-bgp-default configured
✅ BGP configuration applied.
### STEP 6: Installing Hubble UI... ###
--> Creating hubble-ui-values.yaml...
--> Installing Hubble UI with Helm...
NAME: hubble-ui
LAST DEPLOYED: Fri Aug 1 14:18:18 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Hubble-Ui.
Your release version is 1.3.6.

For any further help, visit https://docs.isovalent.com
--> Exposing Hubble UI service via NodePort...
service/hubble-ui patched
✅ Hubble UI installed.
### STEP 7: Verifying the Setup... ###
--> Waiting for all nodes to be ready...
node/kube-node-1 condition met
node/kube-node-2 condition met
node/kube-node-3 condition met
--> Checking Cilium status...
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: OK
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled

DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet cilium-envoy Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-envoy Running: 3
cilium-operator Running: 2
clustermesh-apiserver
hubble-relay Running: 1
hubble-ui Running: 1
Cluster Pods: 8/8 managed by Cilium
Helm chart version: 1.17.6
Image versions cilium quay.io/isovalent/cilium:v1.17.6-cee.1@sha256:2d01daf4f25f7d644889b49ca856e1a4269981fc963e50bd3962665b41b6adb3: 3
cilium-envoy quay.io/isovalent/cilium-envoy:v1.17.6-cee.1@sha256:318eff387835ca2717baab42a84f35a83a5f9e7d519253df87269f80b9ff0171: 3
cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265: 2
hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e: 1
hubble-ui quay.io/isovalent/hubble-ui-enterprise-backend:v1.3.6: 1
hubble-ui quay.io/isovalent/hubble-ui-enterprise:v1.3.6: 1

----------------------------------------------------------------
🚀 Cluster setup is complete and verified!
Access Hubble UI at: http://10.75.59.71:30583
----------------------------------------------------------------
root@kube-node-1:~# cilium bgp peers
Node Local AS Peer AS Peer Address Session State Uptime Family Received Advertised
kube-node-1 65000 65000 10.75.59.76 established 19m1s ipv4/unicast 2 8
kube-node-2 65000 65000 10.75.59.76 established 19m2s ipv4/unicast 2 8
kube-node-3 65000 65000 10.75.59.76 established 19m2s ipv4/unicast 2 8
root@kube-node-1:~# cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node VRouter Prefix NextHop Age Attrs
kube-node-1 65000 172.16.0.0/24 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.36.130/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.40.165/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.51/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.47.30/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-2 65000 172.16.1.0/24 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.36.130/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.40.165/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.51/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.47.30/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-3 65000 172.16.2.0/24 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.36.130/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.40.165/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.51/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.47.30/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
root@kube-node-1:~#
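
The BGP objects applied in STEP 5 are not printed in the log above. Below is a minimal sketch of what they could look like, reconstructed only from the cilium bgp peers output (iBGP, local AS 65000, peer 10.75.59.76). The resource names match the log, but the advertisement contents, the selector and the apiVersion (cilium.io/v2alpha1 here; newer Cilium releases also serve cilium.io/v2) are assumptions, so check them against your Cilium version before applying.

kubectl apply -f - <<'EOF'
# Hypothetical reconstruction of the STEP 5 resources -- not the setup script's actual content.
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: PodCIDR            # the per-node 172.16.x.0/24 prefixes seen above
    - advertisementType: Service            # assumed source of the /32 routes seen above
      service:
        addresses: [ClusterIP]
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ["never-used-value"]}  # match all services
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: bgp
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-default
spec:
  bgpInstances:
    - name: instance-65000
      localASN: 65000
      peers:
        - name: frr-router                  # the FRR peer at 10.75.59.76
          peerASN: 65000
          peerAddress: 10.75.59.76
          peerConfigRef:
            name: cilium-peer
EOF

On the FRR side of these sessions (the peer at 10.75.59.76), the same state should be visible from vtysh, for example:

vtysh -c "show bgp ipv4 unicast summary"    # three established neighbours (the k8s nodes)
vtysh -c "show ip route bgp"                # PodCIDRs and /32 service IPs learned over iBGP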

7. Common kubectl commands

kubectl describe pod hubble-ui-5fdd8b4495-dv7nr -n kube-system

# Cluster and workload overview
kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide
kubectl get pods --all-namespaces
kubectl get service -n kube-system -o wide
kubectl get daemonset -n kube-system cilium
kubectl get deployment -o wide

# DNS endpoints and Cilium configuration
kubectl get endpoints -n kube-system kube-dns
kubectl get cm cilium-config -n kube-system -o yaml
kubectl get endpoints -n kube-system
kubectl get pods -n kube-system -l k8s-app=cilium -o wide
# kubectl describe pod hubble-relay-cfb755899-r42l8

# Debugging from inside the Cilium agents
kubectl -n kube-system get pods -l k8s-app=cilium
kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf ipmasq list
kubectl -n kube-system exec ds/cilium -- cilium-dbg status --verbose
kubectl -n kube-system exec ds/cilium -- cilium status
kubectl -n kube-system exec ds/cilium -- cilium service list
kubectl -n kube-system exec ds/cilium -- cilium bpf nat list
kubectl exec -n kube-system cilium-2vrgj -- cilium bpf nat list

kubectl -n kube-system get configmap cilium-config -o yaml

# Star Wars demo (a CiliumNetworkPolicy sketch follows this command list)
kubectl create namespace star-wars
kubectl apply -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
kubectl -n star-wars get pod -o wide --show-labels
kubectl -n star-wars patch service deathstar -p '{"spec":{"type":"NodePort"}}'
kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
kubectl -n star-wars exec xwing -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing

# Remove the demo application
# kubectl delete -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/1.18.0/examples/minikube/http-sw-app.yaml

# Destructive commands, for reference only
kubectl delete pod,svc,daemonset -n kube-system -l k8s-app=cilium
kubectl delete daemonset -n kube-system kube-proxy

# Quick test workloads
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=ClusterIP

kubectl run -it --rm busybox --image=busybox --restart=Never -- sh

kubectl run -it --rm curl --image=curlimages/curl --restart=Never -- sh

# Repeated requests against the deathstar service (it lives in the star-wars namespace)
for i in {1..10}; do kubectl exec -n star-wars tiefighter -- curl -s http://deathstar.star-wars.svc.cluster.local/v1 | jq -r '.hostname'; done
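
As a follow-up to the Star Wars demo commands above, here is the L7 policy from the upstream Cilium http-sw-app example, adjusted only for the star-wars namespace used here: it allows empire ships to POST /v1/request-landing on the deathstar on port 80. After applying it, the tiefighter landing request should still succeed, the xwing request should be dropped (it hangs and times out), and other API paths from empire ships should return "Access denied".

kubectl apply -n star-wars -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: rule1
spec:
  description: "L7 policy: only empire ships may POST /v1/request-landing on the deathstar"
  endpointSelector:
    matchLabels:
      org: empire
      class: deathstar
  ingress:
    - fromEndpoints:
        - matchLabels:
            org: empire
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "POST"
                path: "/v1/request-landing"
EOF

kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing   # Ship landed
kubectl -n star-wars exec xwing -- curl -s --max-time 5 -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing   # dropped, times out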

#cloud-config
# Update the packages on boot.
package_update: true

# ncurses-compat-libs is required on Amazon Linux 2.
packages:
  - libaio
  - numactl
  - tzdata
  - ncurses-compat-libs

write_files:
  - path: /etc/profile.d/lang.sh
    content: |
      export LANG=en_US.UTF-8
      export LC_ALL=en_US.UTF-8

  - path: /etc/security/limits.conf
    content: |
      root hard nofile 65535
      root soft nofile 65535
      root hard nproc 8192
      root soft nproc 8192

  # Response-file template for the Enterprise Console silent installer;
  # the HOST_NAME and ENTER_PASSWORD placeholders are filled in at install time.
  - path: /opt/appdynamics/response.varfile.bak
    content: |
      serverHostName=HOST_NAME
      sys.languageId=en
      disableEULA=true
      platformAdmin.port=9191
      platformAdmin.databasePort=3377
      platformAdmin.dataDir=/opt/appdynamics/platform/mysql/data
      platformAdmin.databasePassword=ENTER_PASSWORD
      platformAdmin.databaseRootPassword=ENTER_PASSWORD
      platformAdmin.adminPassword=ENTER_PASSWORD
      platformAdmin.useHttps$Boolean=false
      sys.installationDir=/opt/appdynamics/platform

  - path: /etc/systemd/system/appd.console.service
    permissions: '0644'
    content: |
      [Unit]
      Description=AppDynamics Enterprise Console
      After=network.target

      [Service]
      Type=forking
      ExecStart=/opt/appdynamics/platform/platform-admin/bin/platform-admin.sh start-platform-admin
      ExecStop=/opt/appdynamics/platform/platform-admin/bin/platform-admin.sh stop-platform-admin
      User=root
      Restart=always

      [Install]
      WantedBy=multi-user.target

  # One-shot unit that fills in the response file from EC2 instance metadata,
  # runs the silent installer, then enables and starts the console service.
  - path: /etc/systemd/system/appd.console.install.service
    permissions: '0644'
    content: |
      [Unit]
      Description=AppDynamics Enterprise Console Installation
      After=network.target

      [Service]
      Type=oneshot
      RemainAfterExit=no
      ExecStart=/bin/sh -c 'sleep 5 && cp /opt/appdynamics/response.varfile.bak /opt/appdynamics/response.varfile && sed -i \"s/ENTER_PASSWORD/`curl http://169.254.169.254/latest/meta-data/instance-id`/g\" /opt/appdynamics/response.varfile && sed -i \"s/HOST_NAME/`curl http://169.254.169.254/latest/meta-data/hostname`/g\" /opt/appdynamics/response.varfile && /opt/appdynamics/platform-setup-x64-linux-23.1.1.18.sh -q -varfile /opt/appdynamics/response.varfile && systemctl daemon-reload && systemctl enable appd.console.service && systemctl start appd.console.service'

      [Install]
      WantedBy=multi-user.target

runcmd:
  # Create the directory and copy the Cisco AppDynamics Enterprise Console setup file
  - aws s3 cp s3://ciscoappdnx/platform-setup-x64-linux-23.1.1.18.sh /opt/appdynamics/ --region cn-northwest-1
  - chmod +x /opt/appdynamics/platform-setup-x64-linux-23.1.1.18.sh
  - systemctl daemon-reload
  - systemctl enable appd.console.install.service
  # Harden SSH and strip credentials and host keys so the baked AMI ships clean
  - sed -i 's/#PermitRootLogin yes/PermitRootLogin no/g' /etc/ssh/sshd_config
  - rm -rf /root/.ssh/authorized_keys
  - rm -rf /home/ec2-user/.ssh/authorized_keys
  - shred -u /etc/ssh/*_key /etc/ssh/*_key.pub
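
The cloud-config above can be syntax-checked locally and then attached as user data to a builder instance. A minimal sketch, assuming the file is saved as user-data.yaml and that an instance profile with read access to the ciscoappdnx bucket already exists; the AMI ID, instance type and profile name below are placeholders:

cloud-init schema --config-file user-data.yaml        # validate the cloud-config locally (recent cloud-init releases)
aws ec2 run-instances \
  --region cn-northwest-1 \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type m5.xlarge \
  --iam-instance-profile Name=appd-s3-read \
  --user-data file://user-data.yaml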

Hand-crafted workflows are out of place in the "cloud"

Cisco AppDynamics is a powerful, easy-to-use application performance management (APM) solution that monitors applications on AWS end to end, including microservices and Docker, and that supports EC2, DynamoDB, Lambda and more through its CloudWatch integration. AppDynamics can compare and validate the customer-to-business optimizations before and after a cloud migration, which accelerates customers' move to the cloud and makes it very popular with users.

To help users install and deploy AppDynamics on AWS more efficiently, we need to build a pre-packaged installation image, an Amazon Machine Image (AMI). A user who launches a VM from this AMI lands directly in the AppDynamics setup screen, which saves a great deal of download, installation and debugging time and greatly improves the installation experience.
So what approach should we take to build the AMI?

Building it entirely by hand is certainly possible, but the AMI captures the whole VM disk, operating system and software packages included. Whenever AppDynamics releases a new version, or a security vulnerability is found in the operating system, the software has to be upgraded or the system patched. A manual workflow cannot keep up with that; in other words, in the world of the cloud there is no longer any room for hand-crafted processes, and automation is the only option.

The next question, then, is that automation needs tools and code to back it up, so how should that code be written?

I can write some simple Python and shell scripts, but producing a comprehensive piece of code like this would likely take me two weeks, plus a fair amount of lost hair.


Author: 饶维波

This article documents how ChatGPT was used to generate the cloud-init configuration for building an AppDynamics installation AMI on AWS, and turns the process into an operational document that can serve as a reference guide.

Task overview and overall approach

Task overview

Cisco AppDynamics provides a powerful, easy-to-use application performance management (APM) solution that monitors applications on AWS end to end, including microservices and Docker, and supports EC2, DynamoDB, Lambda and more through its CloudWatch integration. AppDynamics can compare and validate the customer-to-business optimizations before and after a cloud migration, which accelerates customers' move to the cloud and makes it very popular with users.

To help users install and deploy AppDynamics on AWS more efficiently, we need to build a pre-packaged installation image called an Amazon Machine Image (AMI). A user who launches a VM from this AMI lands directly in the AppDynamics setup screen, which saves a great deal of download, installation and debugging time and greatly improves the installation experience.

The goal of this task is to build the AppDynamics AMI in a way that keeps later maintenance easy. That maintenance consists mainly of patching operating-system security vulnerabilities and upgrading the AppDynamics software version, so the process should be automated as far as possible, saving time and effort while avoiding human error.

The key to the task is writing an automation script, implemented either as a shell script or as a cloud-init configuration.
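
As a rough sketch of the final step (not covered in this excerpt), once a builder instance launched with the cloud-config above has finished its runcmd phase, it can be baked into the distributable AMI; the instance ID and image name below are placeholders:

aws ec2 create-image \
  --region cn-northwest-1 \
  --instance-id i-0123456789abcdef0 \
  --name "appdynamics-enterprise-console-23.1.1.18" \
  --description "AppDynamics Enterprise Console, installer pre-staged via cloud-init"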
