Set Up K8s with Ansible from Zero

1. Tear down the previous environment

ois@ois:~/data/k8s-cilium-lab$ cd ..
ois@ois:~/data$ ./07-undefine-vms.sh
Domain 'k8s-node-1' destroyed

Domain 'k8s-node-1' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/k8s-node-1.qcow2) removed.

Domain 'k8s-node-2' destroyed

Domain 'k8s-node-2' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/k8s-node-2.qcow2) removed.

Domain 'k8s-node-3' destroyed

Domain 'k8s-node-3' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/k8s-node-3.qcow2) removed.

Domain 'dns-bgp-server' destroyed

Domain 'dns-bgp-server' has been undefined
Volume 'vda'(/home/ois/data/k8s-cilium-lab/nodevms/dns-bgp-server.qcow2) removed.

ois@ois:~/data$ rm -rf k8s-cilium-lab/
ois@ois:~/data$
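
The contents of 07-undefine-vms.sh are not shown, but its output indicates it stops each domain, undefines it, and removes the backing qcow2 volume. A minimal Ansible sketch of the same cleanup, assuming the VM names seen in the output above (the real script is plain shell around virsh):

- name: Tear down the lab VMs (sketch, not the original script)
  hosts: localhost
  gather_facts: false
  vars:
    lab_vms: [k8s-node-1, k8s-node-2, k8s-node-3, dns-bgp-server]
  tasks:
    - name: Force off any domain that is still running
      ansible.builtin.command: virsh destroy {{ item }}
      loop: "{{ lab_vms }}"
      failed_when: false          # an already-stopped or missing domain is fine

    - name: Undefine each domain and remove its storage volumes
      ansible.builtin.command: virsh undefine {{ item }} --remove-all-storage
      loop: "{{ lab_vms }}"
      failed_when: false          # an already-removed domain is fine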

2. Rebuild the project file structure

ois@ois:~/data$ ./00-create-project-structure.sh 
--- K8s + Cilium Lab Setup (Warning-Free) ---
This script will prepare a new Ansible project directory named 'k8s-cilium-lab'.

--- Step 1: Checking Prerequisites ---
✅ OS is Debian-based.
✅ Ansible is already installed.
--> Ensuring libvirt and whois are up-to-date...
✅ Dependencies are present.
✅ SSH key already exists at ~/.ssh/id_rsa. Skipping generation.

--- Step 2: Creating Project Structure ---
✅ Project directory structure created in 'k8s-cilium-lab/'.

--- Step 3: Configuring User Password Hash ---
A secure password hash is required for the 'ubuntu' user on the VMs.
Enter the password for the 'ubuntu' user (input will be hidden):
--> Generating password hash...
✅ Password hash generated.

--- Step 4: Generating Configuration Files ---
✅ Created ansible.cfg
✅ Created inventory.ini
✅ Created group_vars/all.yml with password hash.
✅ Created host_vars/k8s-node-1.yml
✅ Created host_vars/k8s-node-2.yml
✅ Created host_vars/k8s-node-3.yml
✅ Created host_vars/dns-bgp-server.yml

--- Setup Complete! ---
✅ Project 'k8s-cilium-lab' has been successfully configured.

Next Steps:
1. Run the next script to generate the VM creation playbook:
../01-create-vms.sh
2. Run the playbook to create your lab VMs (no vault password needed):
ansible-playbook playbooks/1_create_vms.yml
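
The generated configuration files are not printed by the script. As a rough illustration, here is a hedged sketch of what group_vars/all.yml and one host_vars file might contain; every key name here is an assumption for illustration, except that the IP matches the address k8s-node-1 answers on later in the run:

# group_vars/all.yml (sketch; the real keys may differ)
ubuntu_password_hash: "$6$..."                                  # produced by mkpasswd in Step 3
ssh_public_key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"     # key found/reused in Step 1
base_image_path: /var/lib/libvirt/images/noble-server-cloudimg-amd64.img   # assumed base image

# host_vars/k8s-node-1.yml (sketch)
ansible_host: 10.75.59.81     # matches the SSH banner seen for this node later
vm_vcpus: 2                   # assumed sizing
vm_memory_mb: 4096
vm_disk_gb: 40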

3. Generate the playbook that builds the lab VMs

ois@ois:~/data$ ./01-create-vms.sh 
--- Lab VM Playbook Generator (Final Corrected Version) ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Ensuring Directories Exist ---
✅ Directories 'playbooks/' and 'templates/' are ready.

--- Step 3: Generating Files ---
✅ Generated playbook: playbooks/1_create_vms.yml
✅ Generated template: templates/user-data.j2
✅ Generated template: templates/network-config.j2
✅ Generated template: templates/meta-data.j2

--- Generation Complete! ---
✅ All necessary files have been created inside the 'k8s-cilium-lab' directory.

Next Step:
1. Change into the project directory: cd k8s-cilium-lab
2. Run the playbook to create your lab VMs:
ansible-playbook playbooks/1_create_vms.yml
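
The cloud-init templates themselves are not shown. A hedged sketch of what templates/user-data.j2 might render, reusing the password hash and SSH key from the previous step (the variable names are assumptions):

#cloud-config
# templates/user-data.j2 (sketch)
hostname: {{ inventory_hostname }}
users:
  - name: ubuntu
    passwd: "{{ ubuntu_password_hash }}"     # hash generated by 00-create-project-structure.sh
    lock_passwd: false
    shell: /bin/bash
    groups: [ sudo ]
    ssh_authorized_keys:
      - "{{ ssh_public_key }}"
ssh_pwauth: true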

4. Run the playbook to build the VM environment

This automatically creates the k8s-node VMs and the FRR VM, probes SSH connectivity, and adds their host keys to the known_hosts list.

ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$
ois@ois:~/data/k8s-cilium-lab$
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/1_create_vms.yml

PLAY [Play 1 - Pre-flight Check for Existing VMs] ******************************************************************************************************************************************************

TASK [Check status of each VM with virsh] **************************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

PLAY [Play 2 - Decide if Provisioning is Needed] *******************************************************************************************************************************************************

TASK [Initialize an empty list for missing VMs] ********************************************************************************************************************************************************
ok: [localhost]

TASK [Populate the list of missing VMs] ****************************************************************************************************************************************************************
ok: [localhost] => (item=dns-bgp-server)
ok: [localhost] => (item=k8s-node-1)
ok: [localhost] => (item=k8s-node-2)
ok: [localhost] => (item=k8s-node-3)

TASK [Set global flag if provisioning is required] *****************************************************************************************************************************************************
ok: [localhost]

TASK [Report status] ***********************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "Provisioning needed: True. Missing VMs: ['dns-bgp-server', 'k8s-node-1', 'k8s-node-2', 'k8s-node-3']"
}

PLAY [Play 3 - Prepare VM Assets in Parallel] **********************************************************************************************************************************************************
[WARNING]: Using run_once with the free strategy is not currently supported. This task will still be executed for every host in the inventory list.

TASK [Ensure VM directories exist] *********************************************************************************************************************************************************************
changed: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
ok: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
ok: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
ok: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
changed: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
ok: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
ok: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
ok: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)

TASK [Check if VM disk image already exists] ***********************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [Create VM disk image from base image] ************************************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [Resize VM disk image] ****************************************************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [Generate cloud-init files] ***********************************************************************************************************************************************************************
changed: [k8s-node-2] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_user-data'})
changed: [k8s-node-3] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_user-data'})
changed: [dns-bgp-server] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_user-data'})
changed: [k8s-node-1] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_user-data'})
changed: [k8s-node-3] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_network-config'})
changed: [k8s-node-2] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_network-config'})
changed: [k8s-node-1] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_network-config'})
changed: [dns-bgp-server] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_network-config'})
changed: [k8s-node-3] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_meta-data'})
changed: [k8s-node-2] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_meta-data'})
changed: [k8s-node-1] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_meta-data'})
changed: [dns-bgp-server] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_meta-data'})

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [dns-bgp-server]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [k8s-node-1]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [k8s-node-2]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
changed: [k8s-node-3]

PLAY [Play 5 - Verify VM Connectivity in Parallel] *****************************************************************************************************************************************************

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
ok: [dns-bgp-server -> localhost]
ok: [k8s-node-1 -> localhost]
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.81:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.86:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
changed: [k8s-node-1 -> localhost]
changed: [dns-bgp-server -> localhost]

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
ok: [k8s-node-3 -> localhost]
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.83:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
changed: [k8s-node-3 -> localhost]

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
ok: [k8s-node-2 -> localhost]
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12
# 10.75.59.82:22 SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.12

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
changed: [k8s-node-2 -> localhost]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=9 changed=6 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-1 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-2 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-3 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
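
The "# 10.75.59.8x:22 SSH-2.0-OpenSSH..." lines are the stderr banners of ssh-keyscan, so the connectivity check apparently scans each new VM's host key and records it locally. A hedged sketch of the two verification tasks, assuming the inventory sets ansible_host to each VM's IP:

- name: Wait for VMs to boot and SSH to become available
  ansible.builtin.wait_for:
    host: "{{ ansible_host }}"
    port: 22
    delay: 10
    timeout: 600
  delegate_to: localhost

- name: Add host keys to known_hosts file
  ansible.builtin.known_hosts:
    name: "{{ ansible_host }}"
    key: "{{ lookup('pipe', 'ssh-keyscan -t ed25519 ' + ansible_host) }}"
    state: present
  delegate_to: localhost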

Run the playbook again to test idempotency. Repeated runs do not change the result: tasks that have already been executed skip themselves.

ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/1_create_vms.yml

PLAY [Play 1 - Pre-flight Check for Existing VMs] ******************************************************************************************************************************************************

TASK [Check status of each VM with virsh] **************************************************************************************************************************************************************
ok: [k8s-node-3]
ok: [k8s-node-1]
ok: [dns-bgp-server]
ok: [k8s-node-2]

PLAY [Play 2 - Decide if Provisioning is Needed] *******************************************************************************************************************************************************

TASK [Initialize an empty list for missing VMs] ********************************************************************************************************************************************************
ok: [localhost]

TASK [Populate the list of missing VMs] ****************************************************************************************************************************************************************
skipping: [localhost] => (item=dns-bgp-server)
skipping: [localhost] => (item=k8s-node-1)
skipping: [localhost] => (item=k8s-node-2)
skipping: [localhost] => (item=k8s-node-3)
skipping: [localhost]

TASK [Set global flag if provisioning is required] *****************************************************************************************************************************************************
ok: [localhost]

TASK [Report status] ***********************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "Provisioning needed: False. Missing VMs: []"
}

PLAY [Play 3 - Prepare VM Assets in Parallel] **********************************************************************************************************************************************************
[WARNING]: Using run_once with the free strategy is not currently supported. This task will still be executed for every host in the inventory list.

TASK [Ensure VM directories exist] *********************************************************************************************************************************************************************
skipping: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [dns-bgp-server] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [dns-bgp-server]
skipping: [k8s-node-1] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [k8s-node-2] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevms)
skipping: [k8s-node-3] => (item=/home/ois/data/k8s-cilium-lab/nodevm_cfg)
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Check if VM disk image already exists] ***********************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Create VM disk image from base image] ************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Resize VM disk image] ****************************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Generate cloud-init files] ***********************************************************************************************************************************************************************
skipping: [k8s-node-1] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_user-data'})
skipping: [k8s-node-1] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_network-config'})
skipping: [k8s-node-2] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_user-data'})
skipping: [k8s-node-1] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-1_meta-data'})
skipping: [k8s-node-1]
skipping: [k8s-node-2] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_network-config'})
skipping: [dns-bgp-server] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_user-data'})
skipping: [k8s-node-2] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-2_meta-data'})
skipping: [k8s-node-2]
skipping: [dns-bgp-server] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_network-config'})
skipping: [dns-bgp-server] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/dns-bgp-server_meta-data'})
skipping: [k8s-node-3] => (item={'src': '../templates/user-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_user-data'})
skipping: [dns-bgp-server]
skipping: [k8s-node-3] => (item={'src': '../templates/network-config.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_network-config'})
skipping: [k8s-node-3] => (item={'src': '../templates/meta-data.j2', 'dest': '/home/ois/data/k8s-cilium-lab/nodevm_cfg/k8s-node-3_meta-data'})
skipping: [k8s-node-3]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [dns-bgp-server]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [k8s-node-1]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [k8s-node-2]

PLAY [Play 4 - Install VMs Sequentially to Avoid Race Condition] ***************************************************************************************************************************************

TASK [Create and start the VM with virt-install] *******************************************************************************************************************************************************
skipping: [k8s-node-3]

PLAY [Play 5 - Verify VM Connectivity in Parallel] *****************************************************************************************************************************************************

TASK [Wait for VMs to boot and SSH to become available] ************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [Add host keys to known_hosts file] ***************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
k8s-node-1 : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
k8s-node-2 : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
k8s-node-3 : ok=1 changed=0 unreachable=0 failed=0 skipped=8 rescued=0 ignored=0
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
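
The second run skips everything because of the pre-flight check: Play 1 records whether each libvirt domain already exists, Play 2 turns that into a provisioning decision, and the later tasks carry a when-guard. A minimal sketch of that pattern, with assumed variable names:

# Play 1 (sketch): record whether the domain exists, without failing or reporting a change
- name: Check status of each VM with virsh
  ansible.builtin.command: virsh dominfo {{ inventory_hostname }}
  register: vm_check
  failed_when: false
  changed_when: false
  delegate_to: localhost

# Later plays (sketch): only act on hosts whose domain was missing
- name: Example of a task guarded by the pre-flight result
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }} is missing and would be provisioned here"
  when: vm_check.rc != 0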

5. Generate the playbook that installs containerd and the Kubernetes tools

ois@ois:~/data/k8s-cilium-lab$ cd ..
ois@ois:~/data$ ./02-prepare-nodes.sh
--- Node Preparation Playbook Generator (with cloud-init wait) ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Ensuring Role Directories Exist ---
✅ Role directories created.

--- Step 3: Generating Role Task Files ---
✅ Created tasks for 'common' role.
✅ Created tasks for 'k8s_node' role.
✅ Created tasks for 'infra_server' role.

--- Step 4: Generating Config Templates ---
✅ Created /etc/hosts template.
✅ Created containerd config template.
✅ Created FRR config template.

--- Step 5: Generating Main Playbook ---
✅ Created main playbook: playbooks/2_prepare_nodes.yml

--- Generation Complete! ---
✅ All necessary files for node preparation have been created.

Next Step:
1. Change into the project directory: cd k8s-cilium-lab
2. Run the playbook to prepare your nodes. You will be prompted for the sudo password:
ansible-playbook playbooks/2_prepare_nodes.yml --ask-become-pass
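
The generated role tasks are not printed. Based on the task names that appear in the run below, the 'common' role loads the overlay and br_netfilter kernel modules and sets the usual Kubernetes networking sysctls; a hedged sketch of those two tasks (module choice and file names are assumptions):

# roles/common/tasks/main.yml (excerpt, sketch)
- name: Load required kernel modules
  community.general.modprobe:
    name: "{{ item }}"
    state: present
  loop: [overlay, br_netfilter]

- name: Configure sysctl parameters for Kubernetes networking
  ansible.posix.sysctl:
    name: "{{ item }}"
    value: "1"
    sysctl_file: /etc/sysctl.d/99-kubernetes.conf    # assumed file name
    reload: true
  loop:
    - net.bridge.bridge-nf-call-iptables
    - net.bridge.bridge-nf-call-ip6tables
    - net.ipv4.ip_forward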

6. Run the playbook to install containerd and the Kubernetes tools

ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/2_prepare_nodes.yml --ask-become-pass
BECOME password:

PLAY [Play 1 - Prepare All Nodes] **********************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-3]
ok: [k8s-node-2]
ok: [k8s-node-1]

TASK [common : Wait for cloud-init to complete] ********************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]
ok: [k8s-node-1]

TASK [common : Update apt cache and upgrade all packages] **********************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-3]
changed: [k8s-node-1]
changed: [k8s-node-2]

TASK [common : Configure /etc/hosts from template] *****************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [dns-bgp-server]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [common : Turn off all swap devices] **************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Comment out swap entries in /etc/fstab] *************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Load required kernel modules] ***********************************************************************************************************************************************************
changed: [dns-bgp-server] => (item=overlay)
changed: [k8s-node-1] => (item=overlay)
changed: [k8s-node-2] => (item=overlay)
changed: [k8s-node-3] => (item=overlay)
changed: [k8s-node-2] => (item=br_netfilter)
changed: [dns-bgp-server] => (item=br_netfilter)
changed: [k8s-node-1] => (item=br_netfilter)
changed: [k8s-node-3] => (item=br_netfilter)

TASK [common : Ensure kernel modules are loaded on boot] ***********************************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-1]
changed: [k8s-node-3]
changed: [k8s-node-2]

TASK [common : Configure sysctl parameters for Kubernetes networking] **********************************************************************************************************************************
changed: [dns-bgp-server]
changed: [k8s-node-1]
changed: [k8s-node-3]
changed: [k8s-node-2]

TASK [common : Apply sysctl settings without reboot] ***************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-3]
ok: [k8s-node-2]

PLAY [Play 2 - Prepare Kubernetes Nodes] ***************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-3]
ok: [k8s-node-1]
ok: [k8s-node-2]

TASK [k8s_node : Install prerequisite packages] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Ensure apt keyrings directory exists] *************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-3]
ok: [k8s-node-2]

TASK [k8s_node : Add Docker's official GPG key] ********************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [k8s_node : Add Docker's repository to Apt sources] ***********************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Install containerd] *******************************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Configure containerd from template] ***************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Install prerequisite packages for Kubernetes repo] ************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

TASK [k8s_node : Download the Kubernetes public signing key] *******************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Dearmor the Kubernetes GPG key] *******************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Add Kubernetes APT repository] ********************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Clean up temporary key file] **********************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Install kubelet, kubeadm, and kubectl] ************************************************************************************************************************************************
changed: [k8s-node-3]
changed: [k8s-node-1]
changed: [k8s-node-2]

TASK [k8s_node : Pin Kubernetes package versions] ******************************************************************************************************************************************************
changed: [k8s-node-2] => (item=kubelet)
changed: [k8s-node-3] => (item=kubelet)
changed: [k8s-node-1] => (item=kubelet)
changed: [k8s-node-2] => (item=kubeadm)
changed: [k8s-node-1] => (item=kubeadm)
changed: [k8s-node-3] => (item=kubeadm)
changed: [k8s-node-2] => (item=kubectl)
changed: [k8s-node-3] => (item=kubectl)
changed: [k8s-node-1] => (item=kubectl)

TASK [k8s_node : Enable and start kubelet service] *****************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]
changed: [k8s-node-1]

RUNNING HANDLER [Restart containerd] *******************************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

PLAY [Play 3 - Prepare Infrastructure Server] **********************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Install dnsmasq and FRR] **********************************************************************************************************************************************************
changed: [dns-bgp-server]

TASK [infra_server : Configure dnsmasq] ****************************************************************************************************************************************************************
changed: [dns-bgp-server]

TASK [infra_server : Configure FRR daemons] ************************************************************************************************************************************************************
ok: [dns-bgp-server] => (item=zebra)
changed: [dns-bgp-server] => (item=bgpd)

TASK [infra_server : Configure frr.conf] ***************************************************************************************************************************************************************
changed: [dns-bgp-server]

TASK [infra_server : Ensure FRR config has correct permissions] ****************************************************************************************************************************************
ok: [dns-bgp-server]

RUNNING HANDLER [Restart dnsmasq] **********************************************************************************************************************************************************************
changed: [dns-bgp-server]

RUNNING HANDLER [Restart frr] **************************************************************************************************************************************************************************
changed: [dns-bgp-server]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=16 changed=11 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-1 : ok=24 changed=18 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-2 : ok=24 changed=18 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-3 : ok=24 changed=18 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
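
Note how "RUNNING HANDLER [Restart containerd]" only fires because the containerd template task reported changed; on an unchanged re-run the handler will not run. A hedged sketch of that notify/handler wiring (the template file name and path are assumptions):

# roles/k8s_node/tasks/main.yml (excerpt, sketch)
- name: Configure containerd from template
  ansible.builtin.template:
    src: ../templates/containerd-config.j2     # assumed template name/location
    dest: /etc/containerd/config.toml
  notify: Restart containerd

# roles/k8s_node/handlers/main.yml (sketch)
- name: Restart containerd
  ansible.builtin.service:
    name: containerd
    state: restarted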

Run the playbook again to verify idempotency.

ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/2_prepare_nodes.yml --ask-become-pass
BECOME password:

PLAY [Play 1 - Prepare All Nodes] **********************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [dns-bgp-server]
ok: [k8s-node-3]
ok: [k8s-node-1]

TASK [common : Wait for cloud-init to complete] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [common : Update apt cache and upgrade all packages] **********************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]
ok: [k8s-node-1]

TASK [common : Configure /etc/hosts from template] *****************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-3]
ok: [dns-bgp-server]
ok: [k8s-node-2]

TASK [common : Turn off all swap devices] **************************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Comment out swap entries in /etc/fstab] *************************************************************************************************************************************************
skipping: [dns-bgp-server]
skipping: [k8s-node-1]
skipping: [k8s-node-2]
skipping: [k8s-node-3]

TASK [common : Load required kernel modules] ***********************************************************************************************************************************************************
ok: [k8s-node-2] => (item=overlay)
ok: [dns-bgp-server] => (item=overlay)
ok: [k8s-node-3] => (item=overlay)
ok: [k8s-node-1] => (item=overlay)
ok: [k8s-node-2] => (item=br_netfilter)
ok: [dns-bgp-server] => (item=br_netfilter)
ok: [k8s-node-3] => (item=br_netfilter)
ok: [k8s-node-1] => (item=br_netfilter)

TASK [common : Ensure kernel modules are loaded on boot] ***********************************************************************************************************************************************
ok: [k8s-node-1]
ok: [dns-bgp-server]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [common : Configure sysctl parameters for Kubernetes networking] **********************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [common : Apply sysctl settings without reboot] ***************************************************************************************************************************************************
ok: [dns-bgp-server]
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

PLAY [Play 2 - Prepare Kubernetes Nodes] ***************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Install prerequisite packages] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Ensure apt keyrings directory exists] *************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Add Docker's official GPG key] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Add Docker's repository to Apt sources] ***********************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Install containerd] *******************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Configure containerd from template] ***************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Install prerequisite packages for Kubernetes repo] ************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-1]
ok: [k8s-node-3]

TASK [k8s_node : Download the Kubernetes public signing key] *******************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-1]
changed: [k8s-node-3]

TASK [k8s_node : Dearmor the Kubernetes GPG key] *******************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Add Kubernetes APT repository] ********************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Clean up temporary key file] **********************************************************************************************************************************************************
changed: [k8s-node-1]
changed: [k8s-node-2]
changed: [k8s-node-3]

TASK [k8s_node : Install kubelet, kubeadm, and kubectl] ************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [k8s_node : Pin Kubernetes package versions] ******************************************************************************************************************************************************
ok: [k8s-node-1] => (item=kubelet)
ok: [k8s-node-2] => (item=kubelet)
ok: [k8s-node-3] => (item=kubelet)
ok: [k8s-node-1] => (item=kubeadm)
ok: [k8s-node-2] => (item=kubeadm)
ok: [k8s-node-3] => (item=kubeadm)
ok: [k8s-node-1] => (item=kubectl)
ok: [k8s-node-2] => (item=kubectl)
ok: [k8s-node-3] => (item=kubectl)

TASK [k8s_node : Enable and start kubelet service] *****************************************************************************************************************************************************
ok: [k8s-node-1]
ok: [k8s-node-2]
ok: [k8s-node-3]

PLAY [Play 3 - Prepare Infrastructure Server] **********************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Install dnsmasq and FRR] **********************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Configure dnsmasq] ****************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Configure FRR daemons] ************************************************************************************************************************************************************
ok: [dns-bgp-server] => (item=zebra)
ok: [dns-bgp-server] => (item=bgpd)

TASK [infra_server : Configure frr.conf] ***************************************************************************************************************************************************************
ok: [dns-bgp-server]

TASK [infra_server : Ensure FRR config has correct permissions] ****************************************************************************************************************************************
ok: [dns-bgp-server]

PLAY RECAP *********************************************************************************************************************************************************************************************
dns-bgp-server : ok=14 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-1 : ok=23 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-2 : ok=23 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
k8s-node-3 : ok=23 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
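
The re-run is idempotent apart from two tasks on each k8s node: downloading the Kubernetes signing key and cleaning up the temporary key file still report 'changed', because they fetch and then delete a temp file every time. If fully clean re-runs are wanted, one option (a sketch, not the author's code; the repo version is an assumption) is to download the key to a fixed path and guard the dearmor step with 'creates':

- name: Download the Kubernetes public signing key (idempotent variant)
  ansible.builtin.get_url:
    url: https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key   # adjust the version as needed
    dest: /etc/apt/keyrings/kubernetes-apt-keyring.asc
    mode: "0644"

- name: Dearmor the Kubernetes GPG key
  ansible.builtin.command: >
    gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
    /etc/apt/keyrings/kubernetes-apt-keyring.asc
  args:
    creates: /etc/apt/keyrings/kubernetes-apt-keyring.gpg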

7. Run the script that generates the Kubernetes cluster setup playbook

ois@ois:~/data/k8s-cilium-lab$ cd ..
ois@ois:~/data$ ./03-setup-cluster.sh
--- Kubernetes Cluster Setup Playbook Generator ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Checking for 'community.kubernetes' Ansible Collection ---
'community.kubernetes' collection is already installed.

--- Step 3: Generating Cilium BGP Template ---
✅ Created Cilium BGP config template.

--- Step 4: Generating Main Playbook ---
✅ Created main playbook: playbooks/3_setup_cluster.yml

--- Generation Complete! ---
✅ All necessary files for cluster setup have been created.

Next Step:
1. IMPORTANT: Reset your cluster nodes to ensure a clean state for this new workflow.
On each K8s VM, run: sudo kubeadm reset -f
2. Change into the project directory: cd k8s-cilium-lab
3. Run the playbook to build your Kubernetes cluster. You will be prompted for the sudo password:
ansible-playbook playbooks/3_setup_cluster.yml --ask-become-pass
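
The Cilium BGP template generated here is not displayed, and the output does not reveal which BGP CRD it uses. As a hedged sketch only, a minimal CiliumBGPPeeringPolicy that peers with the FRR box could look like this (the ASNs, labels, and peer address are assumptions; 10.75.59.86 appears to be the dns-bgp-server in the earlier SSH output):

# templates/cilium-bgp.yml.j2 (sketch)
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: lab-bgp
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux
  virtualRouters:
    - localASN: 65001                     # assumed cluster ASN
      exportPodCIDR: true
      neighbors:
        - peerAddress: "10.75.59.86/32"   # FRR on the dns-bgp-server (assumed)
          peerASN: 65000                  # assumed FRR ASN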

8. Run the playbook to set up the Kubernetes cluster and Cilium

ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/3_setup_cluster.yml --ask-become-pass
BECOME password:
[DEPRECATION WARNING]: community.kubernetes.helm_repository has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core
instead. This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.helm has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead. This
feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s_info has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.

PLAY [Play 1 - Initialize and Configure Control Plane] *************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Check if cluster is already initialized] *********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Initialize the cluster] **************************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Create .kube directory for ubuntu user] **********************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Copy admin.conf to user's kube config] ***********************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Set KUBECONFIG for root user permanently] ********************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Install prerequisites for Kubernetes modules] ****************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Install Helm] ************************************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Install Cilium CLI] ******************************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Add Helm repositories] ***************************************************************************************************************************************************************************
changed: [k8s-node-1] => (item={'name': 'cilium', 'url': 'https://helm.cilium.io/'})
changed: [k8s-node-1] => (item={'name': 'isovalent', 'url': 'https://helm.isovalent.com/'})

TASK [Deploy Cilium and Hubble with Helm] **************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Expose Hubble UI service via NodePort] ***********************************************************************************************************************************************************
[WARNING]: kubernetes<24.2.0 is not supported or tested. Some features may not work.
changed: [k8s-node-1]

TASK [Wait for Cilium CRDs to become available] ********************************************************************************************************************************************************
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (20 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (19 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (18 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (17 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (16 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (15 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (14 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (13 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (12 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (11 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (10 retries left).
FAILED - RETRYING: [k8s-node-1]: Wait for Cilium CRDs to become available (9 retries left).
ok: [k8s-node-1]

TASK [Apply Cilium BGP Configuration from template] ****************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Generate a token for workers to join] ************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Store the join command for other hosts to access] ************************************************************************************************************************************************
ok: [k8s-node-1]

PLAY [Play 2 - Join Worker Nodes to the Fully Configured Cluster] **************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Check if node has already joined] ****************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Join the cluster] ********************************************************************************************************************************************************************************
changed: [k8s-node-2]
changed: [k8s-node-3]

PLAY [Play 3 - Display Final Access Information] *******************************************************************************************************************************************************

TASK [Get Hubble UI service details] *******************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Display the final access URL] ********************************************************************************************************************************************************************
ok: [k8s-node-1] => {
"msg": "========================================================\n🚀 Your Kubernetes Lab is Ready!\n\nAccess the Hubble UI at:\nhttp://10.75.59.81:31708\n========================================================\n"
}

PLAY RECAP *********************************************************************************************************************************************************************************************
k8s-node-1 : ok=18 changed=12 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-2 : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
k8s-node-3 : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

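The long retry loop on "Wait for Cilium CRDs to become available" is expected on a first run: the task keeps polling until the Cilium operator has registered its CRDs, which only happens a minute or two after the Helm release is installed. As a minimal sketch of that pattern, assuming the community.kubernetes.k8s_info module reported in the deprecation warnings and the CiliumBGPPeeringPolicy CRD as the resource being waited for (the actual task in 3_setup_cluster.yml may check something else):

- name: Wait for Cilium CRDs to become available
  community.kubernetes.k8s_info:
    api_version: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    name: ciliumbgppeeringpolicies.cilium.io   # assumed CRD to wait for
  register: cilium_crd
  until: cilium_crd.resources | length > 0
  retries: 20      # matches the "20 retries left" countdown above
  delay: 15        # seconds between polls (assumed)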
9. Run the Playbook Multiple Times to Verify Idempotency

ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/3_setup_cluster.yml --ask-become-pass
BECOME password:
[DEPRECATION WARNING]: community.kubernetes.helm_repository has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core
instead. This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.helm has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead. This
feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s_info has been deprecated. The community.kubernetes collection is being renamed to kubernetes.core. Please update your FQCNs to kubernetes.core instead.
This feature will be removed from community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.

PLAY [Play 1 - Initialize and Configure Control Plane] *************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Check if cluster is already initialized] *********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Initialize the cluster] **************************************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Create .kube directory for ubuntu user] **********************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Copy admin.conf to user's kube config] ***********************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Set KUBECONFIG for root user permanently] ********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Install prerequisites for Kubernetes modules] ****************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Install Helm] ************************************************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Install Cilium CLI] ******************************************************************************************************************************************************************************
skipping: [k8s-node-1]

TASK [Add Helm repositories] ***************************************************************************************************************************************************************************
ok: [k8s-node-1] => (item={'name': 'cilium', 'url': 'https://helm.cilium.io/'})
ok: [k8s-node-1] => (item={'name': 'isovalent', 'url': 'https://helm.isovalent.com/'})

TASK [Deploy Cilium and Hubble with Helm] **************************************************************************************************************************************************************
[WARNING]: The default idempotency check can fail to report changes in certain cases. Install helm diff >= 3.4.1 for better results.
ok: [k8s-node-1]

TASK [Expose Hubble UI service via NodePort] ***********************************************************************************************************************************************************
[WARNING]: kubernetes<24.2.0 is not supported or tested. Some features may not work.
ok: [k8s-node-1]

TASK [Wait for Cilium CRDs to become available] ********************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Apply Cilium BGP Configuration from template] ****************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Generate a token for workers to join] ************************************************************************************************************************************************************
changed: [k8s-node-1]

TASK [Store the join command for other hosts to access] ************************************************************************************************************************************************
ok: [k8s-node-1]

PLAY [Play 2 - Join Worker Nodes to the Fully Configured Cluster] **************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Check if node has already joined] ****************************************************************************************************************************************************************
ok: [k8s-node-2]
ok: [k8s-node-3]

TASK [Join the cluster] ********************************************************************************************************************************************************************************
skipping: [k8s-node-2]
skipping: [k8s-node-3]

PLAY [Play 3 - Display Final Access Information] *******************************************************************************************************************************************************

TASK [Get Hubble UI service details] *******************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Display the final access URL] ********************************************************************************************************************************************************************
ok: [k8s-node-1] => {
"msg": "========================================================\n🚀 Your Kubernetes Lab is Ready!\n\nAccess the Hubble UI at:\nhttp://10.75.59.81:31708\n========================================================\n"
}

PLAY RECAP *********************************************************************************************************************************************************************************************
k8s-node-1 : ok=13 changed=1 unreachable=0 failed=0 skipped=5 rescued=0 ignored=0
k8s-node-2 : ok=2 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
k8s-node-3 : ok=2 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0

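On this second run nearly every task reports ok or skipping, and the recap shows changed=1 only because the join-token generation produces a new token each time. The skips come from small guard tasks such as "Check if cluster is already initialized" and "Check if node has already joined". A minimal sketch of that guard pattern, assuming /etc/kubernetes/admin.conf is used as the marker file (the real playbook may key off something else):

- name: Check if cluster is already initialized
  ansible.builtin.stat:
    path: /etc/kubernetes/admin.conf        # assumed marker on the control plane
  register: kubeadm_marker

- name: Initialize the cluster
  ansible.builtin.command: kubeadm init --pod-network-cidr=172.16.0.0/20
  when: not kubeadm_marker.stat.exists      # skipped on every rerun

Workers can use the same trick with /etc/kubernetes/kubelet.conf before running kubeadm join.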
10. Deploy the Demo App

ois@ois:~/data/k8s-cilium-lab$ cd ../
ois@ois:~/data$ ./04-deploy-star-wars.sh
--- Star Wars Demo Script Deployer ---

--- Step 1: Verifying Project Context ---
✅ Working inside project directory: /home/ois/data/k8s-cilium-lab

--- Step 2: Generating the script template ---
✅ Created template: templates/deploy-star-wars.sh.j2

--- Step 3: Generating the deployment playbook ---
✅ Created playbook: playbooks/4_deploy_app.yml

--- Generation Complete! ---
✅ All necessary files for deploying the application script have been created.

Next Steps:
1. Change into the project directory: cd k8s-cilium-lab
2. Run the playbook to copy the script to your control plane node:
ansible-playbook playbooks/4_deploy_app.yml --ask-become-pass
3. SSH into the control plane and run the script:
ssh ubuntu@k8s-node-1
sudo /root/deploy-star-wars.sh
ois@ois:~/data$ cd k8s-cilium-lab/
ois@ois:~/data/k8s-cilium-lab$ ansible-playbook playbooks/4_deploy_app.yml --ask-become-pass
BECOME password:

PLAY [Deploy Star Wars Demo Script] ********************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************
ok: [k8s-node-1]

TASK [Copy the Star Wars deployment script to the control plane] ***************************************************************************************************************************************
changed: [k8s-node-1]

PLAY RECAP *********************************************************************************************************************************************************************************************
k8s-node-1 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

ois@ois:~/data/k8s-cilium-lab$
ois@ois:~/data/k8s-cilium-lab$ ssh ubuntu@10.75.59.81
Welcome to Ubuntu 24.04.2 LTS (GNU/Linux 6.8.0-63-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/pro

System information as of Mon Aug 4 05:00:52 PM CST 2025

System load: 0.26 Processes: 195
Usage of /: 33.4% of 18.33GB Users logged in: 0
Memory usage: 16% IPv4 address for enp1s0: 10.75.59.81
Swap usage: 0%


Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status


*** System restart required ***
Last login: Mon Aug 4 16:55:53 2025 from 10.75.59.129
ubuntu@k8s-node-1:~$ sudo su
[sudo] password for ubuntu:
root@k8s-node-1:/home/ubuntu# cd
root@k8s-node-1:~# cilium status
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: OK
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled

DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet cilium-envoy Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-envoy Running: 3
cilium-operator Running: 2
clustermesh-apiserver
hubble-relay Running: 1
hubble-ui Running: 1
Cluster Pods: 8/8 managed by Cilium
Helm chart version: 1.17.6
Image versions cilium quay.io/isovalent/cilium:v1.17.6-cee.1@sha256:2d01daf4f25f7d644889b49ca856e1a4269981fc963e50bd3962665b41b6adb3: 3
cilium-envoy quay.io/isovalent/cilium-envoy:v1.17.6-cee.1@sha256:318eff387835ca2717baab42a84f35a83a5f9e7d519253df87269f80b9ff0171: 3
cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265: 2
hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e: 1
hubble-ui quay.io/isovalent/hubble-ui-backend:v1.17.6-cee.1@sha256:a034b7e98e6ea796ed26df8f4e71f83fc16465a19d166eff67a03b822c0bfa15: 1
hubble-ui quay.io/isovalent/hubble-ui:v1.17.6-cee.1@sha256:9e37c1296b802830834cc87342a9182ccbb71ffebb711971e849221bd9d59392: 1
root@k8s-node-1:~#
root@k8s-node-1:~# ./deploy-star-wars.sh
🚀 Starting Star Wars Demo Application Deployment...
=================================================

--- Step 1: Ensuring 'star-wars' namespace exists ---

▶️ Running command:
kubectl create namespace star-wars
namespace/star-wars created
✅ Namespace 'star-wars' created.

--- Step 2: Applying application manifest from GitHub ---

▶️ Running command:
kubectl apply -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
service/deathstar created
deployment.apps/deathstar created
pod/tiefighter created
pod/xwing created

--- Step 3: Waiting for all application pods to be ready ---
(This may take a minute as images are pulled...)

▶️ Running command:
kubectl wait --for=condition=ready pod --all -n star-wars --timeout=120s
pod/deathstar-86f85ffb4d-8xbb4 condition met
pod/deathstar-86f85ffb4d-dwfx5 condition met
pod/tiefighter condition met
pod/xwing condition met
✅ All pods are running and ready.

--- Step 4: Displaying pod status ---

▶️ Running command:
kubectl -n star-wars get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
deathstar-86f85ffb4d-8xbb4 1/1 Running 0 24s 172.16.2.92 k8s-node-3 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
deathstar-86f85ffb4d-dwfx5 1/1 Running 0 24s 172.16.1.198 k8s-node-2 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
tiefighter 1/1 Running 0 24s 172.16.2.180 k8s-node-3 <none> <none> app.kubernetes.io/name=tiefighter,class=tiefighter,org=empire
xwing 1/1 Running 0 24s 172.16.2.156 k8s-node-3 <none> <none> app.kubernetes.io/name=xwing,class=xwing,org=alliance

--- Step 5: Exposing 'deathstar' service via NodePort ---

▶️ Running command:
kubectl -n star-wars patch service deathstar -p {"spec":{"type":"NodePort"}}
service/deathstar patched
✅ Service 'deathstar' patched to NodePort.

--- Step 6: Testing connectivity from client pods ---
(A 'Ship landed' message indicates success)

▶️ Running command:
kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed

▶️ Running command:
kubectl -n star-wars exec xwing -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed

--- Step 7: Displaying external access information ---

=================================================
🎉 Star Wars Demo Application Deployed Successfully!
You can access the Deathstar service from outside the cluster at:
curl -XPOST http://10.75.59.81:30719/v1/request-landing
=================================================
root@k8s-node-1:~# curl -XPOST http://10.75.59.81:30719/v1/request-landing
Ship landed
root@k8s-node-1:~# exit
exit
ubuntu@k8s-node-1:~$ curl -XPOST http://10.75.59.81:30719/v1/request-landing
Ship landed
root@k8s-node-1:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-node-1 Ready control-plane 21m v1.33.3 10.75.59.81 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27
k8s-node-2 Ready <none> 19m v1.33.3 10.75.59.82 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27
k8s-node-3 Ready <none> 19m v1.33.3 10.75.59.83 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27
root@k8s-node-1:~# kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system cilium-45qr7 1/1 Running 0 19m 10.75.59.83 k8s-node-3 <none> <none>
kube-system cilium-9q5s7 1/1 Running 0 19m 10.75.59.82 k8s-node-2 <none> <none>
kube-system cilium-envoy-72jj7 1/1 Running 0 19m 10.75.59.82 k8s-node-2 <none> <none>
kube-system cilium-envoy-d8hb4 1/1 Running 0 20m 10.75.59.81 k8s-node-1 <none> <none>
kube-system cilium-envoy-vvsms 1/1 Running 0 19m 10.75.59.83 k8s-node-3 <none> <none>
kube-system cilium-operator-d67c55dc8-lfpjb 1/1 Running 0 20m 10.75.59.81 k8s-node-1 <none> <none>
kube-system cilium-operator-d67c55dc8-rpfv8 1/1 Running 0 20m 10.75.59.82 k8s-node-2 <none> <none>
kube-system cilium-xbjqm 1/1 Running 0 20m 10.75.59.81 k8s-node-1 <none> <none>
kube-system coredns-674b8bbfcf-n9wgt 1/1 Running 0 21m 172.16.0.105 k8s-node-1 <none> <none>
kube-system coredns-674b8bbfcf-ntssg 1/1 Running 0 21m 172.16.0.161 k8s-node-1 <none> <none>
kube-system etcd-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
kube-system hubble-relay-cfb755899-46pzc 1/1 Running 0 20m 172.16.1.115 k8s-node-2 <none> <none>
kube-system hubble-ui-68c64498c4-p2nq4 2/2 Running 0 20m 172.16.1.105 k8s-node-2 <none> <none>
kube-system kube-apiserver-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
kube-system kube-controller-manager-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
kube-system kube-scheduler-k8s-node-1 1/1 Running 0 21m 10.75.59.81 k8s-node-1 <none> <none>
star-wars deathstar-86f85ffb4d-8xbb4 1/1 Running 0 12m 172.16.2.92 k8s-node-3 <none> <none>
star-wars deathstar-86f85ffb4d-dwfx5 1/1 Running 0 12m 172.16.1.198 k8s-node-2 <none> <none>
star-wars tiefighter 1/1 Running 0 12m 172.16.2.180 k8s-node-3 <none> <none>
star-wars xwing 1/1 Running 0 12m 172.16.2.156 k8s-node-3 <none> <none>
root@k8s-node-1:~# kubectl get deployment -A -o wide
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
kube-system cilium-operator 2/2 2 2 21m cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265 io.cilium/app=operator,name=cilium-operator
kube-system coredns 2/2 2 2 22m coredns registry.k8s.io/coredns/coredns:v1.12.0 k8s-app=kube-dns
kube-system hubble-relay 1/1 1 1 21m hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e k8s-app=hubble-relay
kube-system hubble-ui 1/1 1 1 21m frontend,backend quay.io/isovalent/hubble-ui:v1.17.6-cee.1@sha256:9e37c1296b802830834cc87342a9182ccbb71ffebb711971e849221bd9d59392,quay.io/isovalent/hubble-ui-backend:v1.17.6-cee.1@sha256:a034b7e98e6ea796ed26df8f4e71f83fc16465a19d166eff67a03b822c0bfa15 k8s-app=hubble-ui
star-wars deathstar 2/2 2 2 12m deathstar quay.io/cilium/starwars@sha256:896dc536ec505778c03efedb73c3b7b83c8de11e74264c8c35291ff6d5fe8ada class=deathstar,org=empire

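Both the empire tiefighter and the alliance xwing are currently allowed to land because no network policy is applied yet. The upstream Cilium Star Wars demo usually continues with an L3/L4 CiliumNetworkPolicy that only admits org=empire pods to the deathstar on TCP/80; applied to the star-wars namespace it would leave the tiefighter request working while the xwing request times out. This step is not part of the generated deploy-star-wars.sh, but the standard upstream policy looks like this:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: rule1
  namespace: star-wars
spec:
  description: "L3-L4 policy to restrict deathstar access to empire ships only"
  endpointSelector:
    matchLabels:
      org: empire
      class: deathstar
  ingress:
    - fromEndpoints:
        - matchLabels:
            org: empire
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP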
11. Exploring System Information

11.1 Kubernetes and Cilium Information

root@k8s-node-1:~# helm get values cilium -n kube-system
USER-SUPPLIED VALUES:
autoDirectNodeRoutes: true
bgpControlPlane:
  announce:
    podCIDR: true
  enabled: true
bpf:
  lb:
    externalClusterIP: true
    sock: true
  masquerade: true
enableIPv4Masquerade: true
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
ipam:
  mode: kubernetes
ipv4NativeRoutingCIDR: 172.16.0.0/20
k8s:
  requireIPv4PodCIDR: true
k8sServiceHost: 10.75.59.81
k8sServicePort: 6443
kubeProxyReplacement: true
routingMode: native
root@k8s-node-1:~#
root@k8s-node-1:~# kubectl -n kube-system get configmap cilium-config -o yaml
apiVersion: v1
data:
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: "true"
bgp-secrets-namespace: kube-system
bpf-distributed-lru: "false"
bpf-events-drop-enabled: "true"
bpf-events-policy-verdict-enabled: "true"
bpf-events-trace-enabled: "true"
bpf-lb-acceleration: disabled
bpf-lb-algorithm-annotation: "false"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-lb-mode-annotation: "false"
bpf-lb-sock: "false"
bpf-lb-source-range-all-types: "false"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: "0"
cluster-name: default
clustermesh-enable-endpoint-sync: "false"
clustermesh-enable-mcs-api: "false"
cni-exclusive: "true"
cni-log-file: /var/run/cilium/cilium-cni.log
custom-cni-conf: "false"
datapath-mode: veth
debug: "false"
debug-verbose: ""
default-lb-service-ipam: lbipam
direct-routing-skip-unreachable: "false"
dnsproxy-enable-transparent-mode: "true"
dnsproxy-socket-linger-timeout: "10"
egress-gateway-ha-reconciliation-trigger-interval: 1s
egress-gateway-reconciliation-trigger-interval: 1s
enable-auto-protect-node-port-range: "true"
enable-bfd: "false"
enable-bgp-control-plane: "true"
enable-bgp-control-plane-status-report: "true"
enable-bpf-clock-probe: "false"
enable-bpf-masquerade: "true"
enable-cluster-aware-addressing: "false"
enable-egress-gateway-ha-socket-termination: "false"
enable-endpoint-health-checking: "true"
enable-endpoint-lockdown-on-policy-overflow: "false"
enable-experimental-lb: "false"
enable-health-check-loadbalancer-ip: "false"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-hubble: "true"
enable-inter-cluster-snat: "false"
enable-internal-traffic-policy: "true"
enable-ipv4: "true"
enable-ipv4-big-tcp: "false"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-big-tcp: "false"
enable-ipv6-masquerade: "true"
enable-k8s-networkpolicy: "true"
enable-k8s-terminating-endpoint: "true"
enable-l2-neigh-discovery: "true"
enable-l7-proxy: "true"
enable-lb-ipam: "true"
enable-local-redirect-policy: "false"
enable-masquerade-to-route-source: "false"
enable-metrics: "true"
enable-node-selector-labels: "false"
enable-non-default-deny-policies: "false"
enable-phantom-services: "false"
enable-policy: default
enable-policy-secrets-sync: "true"
enable-runtime-device-detection: "true"
enable-sctp: "false"
enable-source-ip-verification: "true"
enable-srv6: "false"
enable-svc-source-range-check: "true"
enable-tcx: "true"
enable-vtep: "false"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
envoy-access-log-buffer-size: "4096"
envoy-base-id: "0"
envoy-keep-cap-netbindservice: "false"
export-aggregation: ""
export-aggregation-renew-ttl: "true"
export-aggregation-state-filter: ""
export-file-path: ""
external-envoy-proxy: "true"
feature-gates-approved: ""
feature-gates-strict: "true"
health-check-icmp-failure-threshold: "3"
http-retry-count: "3"
hubble-disable-tls: "false"
hubble-export-file-max-backups: "5"
hubble-export-file-max-size-mb: "10"
hubble-listen-address: :4244
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
identity-gc-interval: 15m0s
identity-heartbeat-timeout: 30m0s
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
ipam-cilium-node-update-rate: 15s
iptables-random-fully: "false"
ipv4-native-routing-cidr: 172.16.0.0/20
k8s-require-ipv4-pod-cidr: "true"
k8s-require-ipv6-pod-cidr: "false"
kube-proxy-replacement: "true"
kube-proxy-replacement-healthz-bind-address: ""
max-connected-clusters: "255"
mesh-auth-enabled: "true"
mesh-auth-gc-interval: 5m0s
mesh-auth-queue-size: "1024"
mesh-auth-rotated-identities-queue-size: "1024"
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
multicast-enabled: "false"
nat-map-stats-entries: "32"
nat-map-stats-interval: 30s
node-port-bind-protection: "true"
nodeport-addresses: ""
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
operator-prometheus-serve-addr: :9963
policy-cidr-match-mode: ""
policy-secrets-namespace: cilium-secrets
policy-secrets-only-from-secrets-namespace: "true"
preallocate-bpf-maps: "false"
procfs: /host/proc
proxy-connect-timeout: "2"
proxy-idle-timeout-seconds: "60"
proxy-initial-fetch-timeout: "30"
proxy-max-concurrent-retries: "128"
proxy-max-connection-duration-seconds: "0"
proxy-max-requests-per-connection: "0"
proxy-xff-num-trusted-hops-egress: "0"
proxy-xff-num-trusted-hops-ingress: "0"
remove-cilium-node-taints: "true"
routing-mode: native
service-no-backend-response: reject
set-cilium-is-up-condition: "true"
set-cilium-node-taints: "true"
srv6-encap-mode: reduced
srv6-locator-pool-enabled: "false"
synchronize-k8s-nodes: "true"
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: "true"
tofqdns-endpoint-max-ip-per-hostname: "1000"
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: "10000"
tofqdns-proxy-response-max-delay: 100ms
tunnel-protocol: vxlan
tunnel-source-port-range: 0-0
unmanaged-pod-watcher-interval: "15"
vtep-cidr: ""
vtep-endpoint: ""
vtep-mac: ""
vtep-mask: ""
write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
kind: ConfigMap
metadata:
annotations:
meta.helm.sh/release-name: cilium
meta.helm.sh/release-namespace: kube-system
creationTimestamp: "2025-08-04T08:52:46Z"
labels:
app.kubernetes.io/managed-by: Helm
name: cilium-config
namespace: kube-system
resourceVersion: "436"
uid: 614c6b36-9112-4fd0-bebf-e92741fa28da
root@k8s-node-1:~# kubectl exec -n kube-system -it cilium-45qr7 -- /bin/sh
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
/home/cilium # cilium status
KVStore: Disabled
Kubernetes: Ok 1.33 (v1.33.3) [linux/amd64]
Kubernetes APIs: ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "isovalent/v1alpha1::IsovalentClusterwideNetworkPolicy", "isovalent/v1alpha1::IsovalentNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: True [enp1s0 10.75.59.83 fe80::5054:ff:fead:b814 (Direct Routing)]
Host firewall: Disabled
SRv6: Disabled
CNI Chaining: none
CNI Config file: successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium: Ok 1.17.6-cee.1 (v1.17.6-cee.1-a33b0b85)
NodeMonitor: Listening for events on 4 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 5/254 allocated from 172.16.2.0/24,
IPv4 BIG TCP: Disabled
IPv6 BIG TCP: Disabled
BandwidthManager: Disabled
Routing: Network: Native Host: BPF
Attach Mode: TCX
Device Mode: veth
Masquerading: BPF [enp1s0] 172.16.0.0/20 [IPv4: Enabled, IPv6: Disabled]
Controller Status: 36/36 healthy
Proxy Status: OK, ip 172.16.2.88, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range: min 256, max 65535
Hubble: Ok Current/Max Flows: 2522/4095 (61.59%), Flows/s: 1.51 Metrics: Disabled
Encryption: Disabled
Cluster health: 3/3 reachable (2025-08-04T09:21:04Z)
Name IP Node Endpoints
Modules Health: Stopped(0) Degraded(0) OK(68)
/home/cilium # cilium status --verbose
KVStore: Disabled
Kubernetes: Ok 1.33 (v1.33.3) [linux/amd64]
Kubernetes APIs: ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "isovalent/v1alpha1::IsovalentClusterwideNetworkPolicy", "isovalent/v1alpha1::IsovalentNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: True [enp1s0 10.75.59.83 fe80::5054:ff:fead:b814 (Direct Routing)]
Host firewall: Disabled
SRv6: Disabled
CNI Chaining: none
CNI Config file: successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium: Ok 1.17.6-cee.1 (v1.17.6-cee.1-a33b0b85)
NodeMonitor: Listening for events on 4 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 5/254 allocated from 172.16.2.0/24,
Allocated addresses:
172.16.2.156 (star-wars/xwing)
172.16.2.180 (star-wars/tiefighter)
172.16.2.35 (health)
172.16.2.88 (router)
172.16.2.92 (star-wars/deathstar-86f85ffb4d-8xbb4)
IPv4 BIG TCP: Disabled
IPv6 BIG TCP: Disabled
BandwidthManager: Disabled
Routing: Network: Native Host: BPF
Attach Mode: TCX
Device Mode: veth
Masquerading: BPF [enp1s0] 172.16.0.0/20 [IPv4: Enabled, IPv6: Disabled]
Clock Source for BPF: ktime
Controller Status: 36/36 healthy
Name Last success Last error Count Message
cilium-health-ep 1m0s ago never 0 no error
ct-map-pressure 1s ago never 0 no error
daemon-validate-config 35s ago never 0 no error
dns-garbage-collector-job 3s ago never 0 no error
endpoint-1375-regeneration-recovery never never 0 no error
endpoint-196-regeneration-recovery never never 0 no error
endpoint-2338-regeneration-recovery never never 0 no error
endpoint-523-regeneration-recovery never never 0 no error
endpoint-640-regeneration-recovery never never 0 no error
endpoint-gc 2m3s ago never 0 no error
endpoint-periodic-regeneration 1m3s ago never 0 no error
ep-bpf-prog-watchdog 1s ago never 0 no error
ipcache-inject-labels 1s ago never 0 no error
k8s-heartbeat 3s ago never 0 no error
link-cache 3s ago never 0 no error
node-neighbor-link-updater 1s ago never 0 no error
proxy-ports-checkpoint 27m1s ago never 0 no error
resolve-identity-1375 2m0s ago never 0 no error
resolve-identity-196 1m41s ago never 0 no error
resolve-identity-2338 1m41s ago never 0 no error
resolve-identity-523 2m1s ago never 0 no error
resolve-identity-640 1m41s ago never 0 no error
resolve-labels-star-wars/deathstar-86f85ffb4d-8xbb4 21m41s ago never 0 no error
resolve-labels-star-wars/tiefighter 21m41s ago never 0 no error
resolve-labels-star-wars/xwing 21m41s ago never 0 no error
sync-lb-maps-with-k8s-services 27m1s ago never 0 no error
sync-policymap-1375 11m56s ago never 0 no error
sync-policymap-196 6m41s ago never 0 no error
sync-policymap-2338 6m41s ago never 0 no error
sync-policymap-523 11m57s ago never 0 no error
sync-policymap-640 6m41s ago never 0 no error
sync-to-k8s-ciliumendpoint (196) 1s ago never 0 no error
sync-to-k8s-ciliumendpoint (2338) 1s ago never 0 no error
sync-to-k8s-ciliumendpoint (640) 1s ago never 0 no error
sync-utime 1s ago never 0 no error
write-cni-file 27m3s ago never 0 no error
Proxy Status: OK, ip 172.16.2.88, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range: min 256, max 65535
Hubble: Ok Current/Max Flows: 2570/4095 (62.76%), Flows/s: 1.51 Metrics: Disabled
KubeProxyReplacement Details:
Status: True
Socket LB: Enabled
Socket LB Tracing: Enabled
Socket LB Coverage: Full
Devices: enp1s0 10.75.59.83 fe80::5054:ff:fead:b814 (Direct Routing)
Mode: SNAT
Backend Selection: Random
Session Affinity: Enabled
Graceful Termination: Enabled
NAT46/64 Support: Disabled
XDP Acceleration: Disabled
Services:
- ClusterIP: Enabled
- NodePort: Enabled (Range: 30000-32767)
- LoadBalancer: Enabled
- externalIPs: Enabled
- HostPort: Enabled
Annotations:
- service.cilium.io/node
- service.cilium.io/src-ranges-policy
- service.cilium.io/type
BPF Maps: dynamic sizing: on (ratio: 0.002500)
Name Size
Auth 524288
Non-TCP connection tracking 65536
TCP connection tracking 131072
Endpoint policy 65535
IP cache 512000
IPv4 masquerading agent 16384
IPv6 masquerading agent 16384
IPv4 fragmentation 8192
IPv4 service 65536
IPv6 service 65536
IPv4 service backend 65536
IPv6 service backend 65536
IPv4 service reverse NAT 65536
IPv6 service reverse NAT 65536
Metrics 1024
Ratelimit metrics 64
NAT 131072
Neighbor table 131072
Global policy 16384
Session affinity 65536
Sock reverse NAT 65536
Encryption: Disabled
Cluster health: 3/3 reachable (2025-08-04T09:21:04Z)
Name IP Node Endpoints
k8s-node-3 (localhost):
Host connectivity to 10.75.59.83:
ICMP to stack: OK, RTT=451.197µs
HTTP to agent: OK, RTT=625.346µs
Endpoint connectivity to 172.16.2.35:
ICMP to stack: OK, RTT=376.939µs
HTTP to agent: OK, RTT=882.754µs
k8s-node-1:
Host connectivity to 10.75.59.81:
ICMP to stack: OK, RTT=582.892µs
HTTP to agent: OK, RTT=1.042743ms
Endpoint connectivity to 172.16.0.116:
ICMP to stack: OK, RTT=703.331µs
HTTP to agent: OK, RTT=1.533329ms
k8s-node-2:
Host connectivity to 10.75.59.82:
ICMP to stack: OK, RTT=632.658µs
HTTP to agent: OK, RTT=1.156736ms
Endpoint connectivity to 172.16.1.173:
ICMP to stack: OK, RTT=636.518µs
HTTP to agent: OK, RTT=1.37198ms
Modules Health:
enterprise-agent
├── agent
│ ├── controlplane
│ │ ├── auth
│ │ │ ├── observer-job-auth-gc-identity-events [OK] OK (1.812µs) [5] (21m, x1)
│ │ │ ├── observer-job-auth-request-authentication [OK] Primed (27m, x1)
│ │ │ └── timer-job-auth-gc-cleanup [OK] OK (15.847µs) (2m3s, x1)
│ │ ├── bgp-control-plane
│ │ │ ├── job-bgp-controller [OK] Running (27m, x1)
│ │ │ ├── job-bgp-crd-status-initialize [OK] Running (27m, x1)
│ │ │ ├── job-bgp-crd-status-update-job [OK] Running (27m, x1)
│ │ │ ├── job-bgp-policy-observer [OK] Running (27m, x1)
│ │ │ ├── job-bgp-reconcile-error-statedb-tracker [OK] Running (27m, x1)
│ │ │ ├── job-bgp-state-observer [OK] Running (27m, x1)
│ │ │ ├── job-bgpcp-resource-store-events [OK] Running (27m, x5)
│ │ │ └── job-diffstore-events [OK] Running (27m, x2)
│ │ ├── ciliumenvoyconfig
│ │ │ └── experimental
│ │ │ ├── job-reconcile [OK] OK, 0 object(s) (27m, x3)
│ │ │ └── job-refresh [OK] Next refresh in 30m0s (27m, x1)
│ │ ├── daemon
│ │ │ ├── [OK] daemon-validate-config (35s, x27)
│ │ │ ├── ep-bpf-prog-watchdog
│ │ │ │ └── ep-bpf-prog-watchdog [OK] ep-bpf-prog-watchdog (1s, x55)
│ │ │ └── job-sync-hostips [OK] Synchronized (1s, x29)
│ │ ├── dynamic-lifecycle-manager
│ │ │ ├── job-reconcile [OK] OK, 0 object(s) (27m, x3)
│ │ │ └── job-refresh [OK] Next refresh in 30m0s (27m, x1)
│ │ ├── enabled-features
│ │ │ └── job-update-config-metric [OK] Waiting for agent config (27m, x1)
│ │ ├── endpoint-manager
│ │ │ ├── cilium-endpoint-1375 (/)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x15)
│ │ │ │ └── policymap-sync [OK] sync-policymap-1375 (11m, x2)
│ │ │ ├── cilium-endpoint-196 (star-wars/xwing)
│ │ │ │ ├── cep-k8s-sync [OK] sync-to-k8s-ciliumendpoint (196) (1s, x132)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x12)
│ │ │ │ └── policymap-sync [OK] sync-policymap-196 (6m41s, x2)
│ │ │ ├── cilium-endpoint-2338 (star-wars/tiefighter)
│ │ │ │ ├── cep-k8s-sync [OK] sync-to-k8s-ciliumendpoint (2338) (1s, x132)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x12)
│ │ │ │ └── policymap-sync [OK] sync-policymap-2338 (6m41s, x2)
│ │ │ ├── cilium-endpoint-523 (/)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x16)
│ │ │ │ └── policymap-sync [OK] sync-policymap-523 (11m, x2)
│ │ │ ├── cilium-endpoint-640 (star-wars/deathstar-86f85ffb4d-8xbb4)
│ │ │ │ ├── cep-k8s-sync [OK] sync-to-k8s-ciliumendpoint (640) (1s, x132)
│ │ │ │ ├── datapath-regenerate [OK] Endpoint regeneration successful (63s, x12)
│ │ │ │ └── policymap-sync [OK] sync-policymap-640 (6m41s, x2)
│ │ │ └── endpoint-gc [OK] endpoint-gc (2m3s, x6)
│ │ ├── envoy-proxy
│ │ │ ├── observer-job-k8s-secrets-resource-events-cilium-secrets [OK] Primed (27m, x1)
│ │ │ └── timer-job-version-check [OK] OK (13.805158ms) (2m1s, x1)
│ │ ├── hubble
│ │ │ └── job-hubble [OK] Running (27m, x1)
│ │ ├── identity
│ │ │ └── timer-job-id-alloc-update-policy-maps [OK] OK (103.031µs) (21m, x1)
│ │ ├── l2-announcer
│ │ │ └── job-l2-announcer-lease-gc [OK] Running (27m, x1)
│ │ ├── nat-stats
│ │ │ └── timer-job-nat-stats [OK] OK (2.520538ms) (1s, x1)
│ │ ├── node-manager
│ │ │ ├── background-sync [OK] Node validation successful (66s, x19)
│ │ │ ├── neighbor-link-updater
│ │ │ │ ├── k8s-node-1 [OK] Node neighbor link update successful (61s, x20)
│ │ │ │ └── k8s-node-2 [OK] Node neighbor link update successful (31s, x20)
│ │ │ ├── node-checkpoint-writer [OK] node checkpoint written (25m, x3)
│ │ │ └── nodes-add [OK] Node adds successful (27m, x3)
│ │ ├── policy
│ │ │ └── observer-job-policy-importer [OK] Primed (27m, x1)
│ │ ├── service-manager
│ │ │ ├── job-health-check-event-watcher [OK] Waiting for health check events (27m, x1)
│ │ │ └── job-service-reconciler [OK] 1 NodePort frontend addresses (27m, x1)
│ │ ├── service-resolver
│ │ │ └── job-service-reloader-initializer [OK] Running (27m, x1)
│ │ └── stale-endpoint-cleanup
│ │ └── job-endpoint-cleanup [OK] Running (27m, x1)
│ ├── datapath
│ │ ├── agent-liveness-updater
│ │ │ └── timer-job-agent-liveness-updater [OK] OK (82.885µs) (0s, x1)
│ │ ├── iptables
│ │ │ ├── ipset
│ │ │ │ ├── job-ipset-init-finalizer [OK] Running (27m, x1)
│ │ │ │ ├── job-reconcile [OK] OK, 0 object(s) (27m, x2)
│ │ │ │ └── job-refresh [OK] Next refresh in 30m0s (27m, x1)
│ │ │ └── job-iptables-reconciliation-loop [OK] iptables rules full reconciliation completed (27m, x1)
│ │ ├── l2-responder
│ │ │ └── job-l2-responder-reconciler [OK] Running (27m, x1)
│ │ ├── maps
│ │ │ └── bwmap
│ │ │ └── timer-job-pressure-metric-throttle [OK] OK (18.336µs) (1s, x1)
│ │ ├── mtu
│ │ │ ├── job-endpoint-mtu-updater [OK] Endpoint MTU updated (27m, x1)
│ │ │ └── job-mtu-updater [OK] MTU updated (1500) (27m, x1)
│ │ ├── node-address
│ │ │ └── job-node-address-update [OK] 172.16.2.88 (primary), fe80::7019:6fff:febf:e8a7 (primary) (27m, x1)
│ │ ├── orchestrator
│ │ │ └── job-reinitialize [OK] OK (26m, x2)
│ │ └── sysctl
│ │ ├── job-reconcile [OK] OK, 16 object(s) (6m56s, x35)
│ │ └── job-refresh [OK] Next refresh in 9m53.185634443s (6m56s, x1)
│ └── infra
│ ├── k8s-synced-crdsync
│ │ └── job-sync-crds [OK] Running (27m, x1)
│ ├── metrics
│ │ ├── job-collect [OK] Sampled 24 metrics in 4.183045ms, next collection at 2025-08-04 09:26:00.386029804 +0000 UTC m=+1803.177514891 (2m1s, x1)
│ │ └── timer-job-cleanup [OK] Primed (27m, x1)
│ └── shell
│ └── job-listener [OK] Listening on /var/run/cilium/shell.sock (27m, x1)
└── enterprise-controlplane
└── cec-ingress-policy
└── timer-job-enterprise-endpoint-policy-periodic-regeneration [OK] OK (21.899µs) (1s, x1)

/home/cilium # cilium service list
ID Frontend Service Type Backend
1 172.16.32.1:443/TCP ClusterIP 1 => 10.75.59.81:6443/TCP (active)
2 172.16.42.186:443/TCP ClusterIP 1 => 10.75.59.83:4244/TCP (active)
3 172.16.42.14:80/TCP ClusterIP 1 => 172.16.1.115:4245/TCP (active)
4 172.16.38.134:80/TCP ClusterIP 1 => 172.16.1.105:8081/TCP (active)
5 10.75.59.83:31708/TCP NodePort 1 => 172.16.1.105:8081/TCP (active)
6 0.0.0.0:31708/TCP NodePort 1 => 172.16.1.105:8081/TCP (active)
7 172.16.32.10:53/TCP ClusterIP 1 => 172.16.0.161:53/TCP (active)
2 => 172.16.0.105:53/TCP (active)
8 172.16.32.10:9153/TCP ClusterIP 1 => 172.16.0.161:9153/TCP (active)
2 => 172.16.0.105:9153/TCP (active)
9 172.16.32.10:53/UDP ClusterIP 1 => 172.16.0.161:53/UDP (active)
2 => 172.16.0.105:53/UDP (active)
10 172.16.34.108:80/TCP ClusterIP 1 => 172.16.1.198:80/TCP (active)
2 => 172.16.2.92:80/TCP (active)
11 10.75.59.83:30719/TCP NodePort 1 => 172.16.1.198:80/TCP (active)
2 => 172.16.2.92:80/TCP (active)
12 0.0.0.0:30719/TCP NodePort 1 => 172.16.1.198:80/TCP (active)
2 => 172.16.2.92:80/TCP (active)
/home/cilium # cilium bpf nat list
TCP IN 10.75.59.82:4240 -> 10.75.59.83:35986 XLATE_DST 10.75.59.83:35986 Created=1076sec ago NeedsCT=1
ICMP IN 10.75.59.81:0 -> 10.75.59.83:63865 XLATE_DST 10.75.59.83:63865 Created=156sec ago NeedsCT=1
TCP IN 10.75.59.81:4240 -> 10.75.59.83:58402 XLATE_DST 10.75.59.83:58402 Created=56sec ago NeedsCT=1
ICMP OUT 10.75.59.83:47633 -> 10.75.59.81:0 XLATE_SRC 10.75.59.83:47633 Created=56sec ago NeedsCT=1
ICMP OUT 10.75.59.83:56308 -> 172.16.0.116:0 XLATE_SRC 10.75.59.83:56308 Created=206sec ago NeedsCT=1
ICMP OUT 10.75.59.83:40570 -> 10.75.59.82:0 XLATE_SRC 10.75.59.83:40570 Created=226sec ago NeedsCT=1
TCP IN 172.16.0.116:4240 -> 10.75.59.83:33274 XLATE_DST 10.75.59.83:33274 Created=706sec ago NeedsCT=1
ICMP IN 172.16.0.116:0 -> 10.75.59.83:56308 XLATE_DST 10.75.59.83:56308 Created=206sec ago NeedsCT=1
TCP OUT 10.75.59.83:37066 -> 10.75.59.81:6443 XLATE_SRC 10.75.59.83:37066 Created=1655sec ago NeedsCT=1
TCP OUT 10.75.59.83:44064 -> 172.16.1.173:4240 XLATE_SRC 10.75.59.83:44064 Created=1066sec ago NeedsCT=1
TCP OUT 10.75.59.83:46184 -> 10.75.59.81:6443 XLATE_SRC 10.75.59.83:46184 Created=1651sec ago NeedsCT=1
TCP OUT 10.75.59.83:43981 -> 10.75.59.86:179 XLATE_SRC 10.75.59.83:43981 Created=1648sec ago NeedsCT=1
TCP OUT 10.75.59.83:57802 -> 10.75.59.81:4240 XLATE_SRC 10.75.59.83:57802 Created=716sec ago NeedsCT=1
ICMP IN 10.75.59.82:0 -> 10.75.59.83:43531 XLATE_DST 10.75.59.83:43531 Created=36sec ago NeedsCT=1
TCP IN 10.75.59.86:179 -> 10.75.59.83:43981 XLATE_DST 10.75.59.83:43981 Created=1648sec ago NeedsCT=1
TCP OUT 10.75.59.83:58402 -> 10.75.59.81:4240 XLATE_SRC 10.75.59.83:58402 Created=56sec ago NeedsCT=1
ICMP OUT 10.75.59.83:43619 -> 10.75.59.82:0 XLATE_SRC 10.75.59.83:43619 Created=176sec ago NeedsCT=1
ICMP OUT 10.75.59.83:40760 -> 10.75.59.81:0 XLATE_SRC 10.75.59.83:40760 Created=216sec ago NeedsCT=1
TCP IN 142.250.199.100:443 -> 10.75.59.83:40732 XLATE_DST 172.16.2.180:40732 Created=205sec ago NeedsCT=0
TCP OUT 10.75.59.83:34202 -> 172.16.0.116:4240 XLATE_SRC 10.75.59.83:34202 Created=46sec ago NeedsCT=1
ICMP IN 10.75.59.81:0 -> 10.75.59.83:47633 XLATE_DST 10.75.59.83:47633 Created=56sec ago NeedsCT=1
TCP OUT 10.75.59.83:33274 -> 172.16.0.116:4240 XLATE_SRC 10.75.59.83:33274 Created=706sec ago NeedsCT=1
ICMP OUT 10.75.59.83:43531 -> 10.75.59.82:0 XLATE_SRC 10.75.59.83:43531 Created=36sec ago NeedsCT=1
ICMP IN 10.75.59.82:0 -> 10.75.59.83:43619 XLATE_DST 10.75.59.83:43619 Created=176sec ago NeedsCT=1
TCP IN 10.75.59.81:6443 -> 10.75.59.83:37066 XLATE_DST 10.75.59.83:37066 Created=1655sec ago NeedsCT=1
ICMP OUT 10.75.59.83:36675 -> 172.16.1.173:0 XLATE_SRC 10.75.59.83:36675 Created=106sec ago NeedsCT=1
TCP OUT 172.16.2.180:40732 -> 142.250.199.100:443 XLATE_SRC 10.75.59.83:40732 Created=205sec ago NeedsCT=0
ICMP IN 172.16.0.116:0 -> 10.75.59.83:38433 XLATE_DST 10.75.59.83:38433 Created=276sec ago NeedsCT=1
TCP OUT 10.75.59.83:35986 -> 10.75.59.82:4240 XLATE_SRC 10.75.59.83:35986 Created=1076sec ago NeedsCT=1
ICMP IN 172.16.1.173:0 -> 10.75.59.83:36675 XLATE_DST 10.75.59.83:36675 Created=106sec ago NeedsCT=1
TCP IN 10.75.59.81:6443 -> 10.75.59.83:46184 XLATE_DST 10.75.59.83:46184 Created=1651sec ago NeedsCT=1
TCP IN 172.16.0.116:4240 -> 10.75.59.83:34202 XLATE_DST 10.75.59.83:34202 Created=46sec ago NeedsCT=1
ICMP OUT 10.75.59.83:63865 -> 10.75.59.81:0 XLATE_SRC 10.75.59.83:63865 Created=156sec ago NeedsCT=1
ICMP OUT 10.75.59.83:38433 -> 172.16.0.116:0 XLATE_SRC 10.75.59.83:38433 Created=276sec ago NeedsCT=1
ICMP IN 10.75.59.82:0 -> 10.75.59.83:40570 XLATE_DST 10.75.59.83:40570 Created=226sec ago NeedsCT=1
ICMP OUT 10.75.59.83:36899 -> 172.16.1.173:0 XLATE_SRC 10.75.59.83:36899 Created=236sec ago NeedsCT=1
ICMP IN 172.16.1.173:0 -> 10.75.59.83:36899 XLATE_DST 10.75.59.83:36899 Created=236sec ago NeedsCT=1
TCP IN 10.75.59.81:4240 -> 10.75.59.83:57802 XLATE_DST 10.75.59.83:57802 Created=716sec ago NeedsCT=1
ICMP IN 10.75.59.81:0 -> 10.75.59.83:40760 XLATE_DST 10.75.59.83:40760 Created=216sec ago NeedsCT=1
TCP IN 172.16.1.173:4240 -> 10.75.59.83:44064 XLATE_DST 10.75.59.83:44064 Created=1066sec ago NeedsCT=1

/home/cilium # cilium config
##### Read-write configurations #####
ConntrackAccounting : Disabled
ConntrackLocal : Disabled
Debug : Disabled
DebugLB : Disabled
DropNotification : Enabled
MonitorAggregationLevel : Medium
PolicyAccounting : Enabled
PolicyAuditMode : Disabled
PolicyTracing : Disabled
PolicyVerdictNotification : Enabled
SourceIPVerification : Enabled
TraceNotification : Enabled
MonitorNumPages : 64
PolicyEnforcement : default
/home/cilium # exit
root@k8s-node-1:~#

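The USER-SUPPLIED VALUES at the top of this section are what the playbook's "Deploy Cilium and Hubble with Helm" task passes to the chart, and the cilium-config ConfigMap is the rendered result of those values. A sketch of that task, assuming the deprecated community.kubernetes.helm module from the earlier warnings and the isovalent/cilium chart implied by the image registry (the actual task and values may differ):

- name: Deploy Cilium and Hubble with Helm
  community.kubernetes.helm:
    name: cilium
    chart_ref: isovalent/cilium        # assumed; the "isovalent" repo was added earlier
    chart_version: "1.17.6"            # matches the reported chart version
    release_namespace: kube-system
    values:
      kubeProxyReplacement: true
      routingMode: native
      autoDirectNodeRoutes: true
      ipv4NativeRoutingCIDR: 172.16.0.0/20
      ipam:
        mode: kubernetes
      bgpControlPlane:
        enabled: true
      hubble:
        enabled: true
        relay:
          enabled: true
        ui:
          enabled: true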
11.2 BGP Information

root@k8s-node-1:~# cilium bgp peers
Node Local AS Peer AS Peer Address Session State Uptime Family Received Advertised
k8s-node-1 65000 65000 10.75.59.86 established 41m41s ipv4/unicast 1 8
k8s-node-2 65000 65000 10.75.59.86 established 39m53s ipv4/unicast 1 8
k8s-node-3 65000 65000 10.75.59.86 established 39m52s ipv4/unicast 1 8
root@k8s-node-1:~# cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node VRouter Prefix NextHop Age Attrs
k8s-node-1 65000 172.16.0.0/24 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.34.108/32 0.0.0.0 34m39s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.38.134/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.14/32 0.0.0.0 41m48s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.186/32 0.0.0.0 41m47s [{Origin: i} {Nexthop: 0.0.0.0}]
k8s-node-2 65000 172.16.1.0/24 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.34.108/32 0.0.0.0 34m39s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.38.134/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.14/32 0.0.0.0 39m59s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.186/32 0.0.0.0 39m56s [{Origin: i} {Nexthop: 0.0.0.0}]
k8s-node-3 65000 172.16.2.0/24 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.34.108/32 0.0.0.0 34m39s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.38.134/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.14/32 0.0.0.0 39m58s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.42.186/32 0.0.0.0 39m55s [{Origin: i} {Nexthop: 0.0.0.0}]

root@dns-bgp-server:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.86
172.16.0.0/24 nhid 8 via 10.75.59.81 dev enp1s0 proto bgp metric 20
172.16.1.0/24 nhid 12 via 10.75.59.82 dev enp1s0 proto bgp metric 20
172.16.2.0/24 nhid 19 via 10.75.59.83 dev enp1s0 proto bgp metric 20
172.16.32.1 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.32.10 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.34.108 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.38.134 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.42.14 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
172.16.42.186 nhid 18 proto bgp metric 20
nexthop via 10.75.59.81 dev enp1s0 weight 1
nexthop via 10.75.59.82 dev enp1s0 weight 1
nexthop via 10.75.59.83 dev enp1s0 weight 1
root@dns-bgp-server:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.75.59.1 0.0.0.0 UG 0 0 0 enp1s0
10.75.59.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
172.16.0.0 10.75.59.81 255.255.255.0 UG 20 0 0 enp1s0
172.16.1.0 10.75.59.82 255.255.255.0 UG 20 0 0 enp1s0
172.16.2.0 10.75.59.83 255.255.255.0 UG 20 0 0 enp1s0
172.16.32.1 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.32.10 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.34.108 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.38.134 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.42.14 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
172.16.42.186 10.75.59.81 255.255.255.255 UGH 20 0 0 enp1s0
root@dns-bgp-server:~# vtysh -c "show ip bgp"
BGP table version is 15, local router ID is 10.75.59.86, vrf id 0
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

Network Next Hop Metric LocPrf Weight Path
*> 10.75.59.0/24 0.0.0.0 0 32768 ?
*>i172.16.0.0/24 10.75.59.81 100 0 i
*>i172.16.1.0/24 10.75.59.82 100 0 i
*>i172.16.2.0/24 10.75.59.83 100 0 i
*=i172.16.32.1/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*=i172.16.32.10/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*>i172.16.34.108/32 10.75.59.81 100 0 i
*=i 10.75.59.82 100 0 i
*=i 10.75.59.83 100 0 i
*=i172.16.38.134/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*=i172.16.42.14/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i
*=i172.16.42.186/32 10.75.59.83 100 0 i
*=i 10.75.59.82 100 0 i
*>i 10.75.59.81 100 0 i

Displayed 10 routes and 22 total paths
root@dns-bgp-server:~# vtysh -c "show ip bgp summary"

IPv4 Unicast Summary (VRF default):
BGP router identifier 10.75.59.86, local AS number 65000 vrf-id 0
BGP table version 15
RIB entries 19, using 3648 bytes of memory
Peers 3, using 2172 KiB of memory

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.75.59.81 4 65000 247 244 0 0 0 00:40:09 7 1 N/A
10.75.59.82 4 65000 240 234 0 0 0 00:38:20 7 1 N/A
10.75.59.83 4 65000 239 233 0 0 0 00:38:19 7 1 N/A

Total number of neighbors 3

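The /24 PodCIDRs and the /32 ClusterIPs that show up in the FRR RIB come from the resource the "Apply Cilium BGP Configuration from template" task rendered out of templates/cilium-bgp.yaml.j2. A minimal sketch that would produce the sessions above (iBGP, AS 65000, peer 10.75.59.86), assuming the v2alpha1 CiliumBGPPeeringPolicy API; the actual template may differ:

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: bgp-peering
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: linux          # assumed: select every node
  virtualRouters:
    - localASN: 65000
      exportPodCIDR: true              # advertises 172.16.{0,1,2}.0/24
      serviceSelector:                 # match-everything selector so ClusterIPs are advertised
        matchExpressions:
          - { key: unused, operator: NotIn, values: ["never"] }
      serviceAdvertisements:
        - ClusterIP
      neighbors:
        - peerAddress: "10.75.59.86/32"
          peerASN: 65000

On the dns-bgp-server side, frr.conf.j2 plausibly renders to something like the snippet below; "redistribute connected" would explain why 10.75.59.0/24 appears with origin "?" (incomplete), and because every peer shares AS 65000 these are iBGP sessions, so FRR does not re-advertise one node's routes to the others (hence PfxSnt stays at 1):

router bgp 65000
 bgp router-id 10.75.59.86
 neighbor 10.75.59.81 remote-as 65000
 neighbor 10.75.59.82 remote-as 65000
 neighbor 10.75.59.83 remote-as 65000
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family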
12. Project Directory Structure

ois@ois:~/data/k8s-cilium-lab$ tree
.
├── ansible.cfg
├── group_vars
│   └── all.yml
├── host_vars
│   ├── dns-bgp-server.yml
│   ├── k8s-node-1.yml
│   ├── k8s-node-2.yml
│   └── k8s-node-3.yml
├── inventory.ini
├── nodevm_cfg
│   ├── dns-bgp-server_meta-data
│   ├── dns-bgp-server_network-config
│   ├── dns-bgp-server_user-data
│   ├── k8s-node-1_meta-data
│   ├── k8s-node-1_network-config
│   ├── k8s-node-1_user-data
│   ├── k8s-node-2_meta-data
│   ├── k8s-node-2_network-config
│   ├── k8s-node-2_user-data
│   ├── k8s-node-3_meta-data
│   ├── k8s-node-3_network-config
│   └── k8s-node-3_user-data
├── nodevms
│   ├── dns-bgp-server.qcow2
│   ├── k8s-node-1.qcow2
│   ├── k8s-node-2.qcow2
│   └── k8s-node-3.qcow2
├── playbooks
│   ├── 1_create_vms.yml
│   ├── 2_prepare_nodes.yml
│   ├── 3_setup_cluster.yml
│   └── 4_deploy_app.yml
├── roles
│   ├── common
│   │   └── tasks
│   │       └── main.yml
│   ├── infra_server
│   │   └── tasks
│   │       └── main.yml
│   └── k8s_node
│       └── tasks
│           └── main.yml
└── templates
    ├── cilium-bgp.yaml.j2
    ├── containerd.toml.j2
    ├── deploy-star-wars.sh.j2
    ├── frr.conf.j2
    ├── hosts.j2
    ├── meta-data.j2
    ├── network-config.j2
    └── user-data.j2

14 directories, 38 files

Learning Kubernetes Installation with the Cilium CNI

1. Install Ubuntu 24.04 VMs with an Automated Script

1.1 Preparation

Generate the password hash:
ois@ois:~$ mkpasswd --method=SHA-512 --rounds=4096
Password:
$6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1

Generate an SSH key to enable passwordless login:
ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa
Your public key has been saved in /root/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:1+BaD0K3fe6saxPFf41r0SyZEpqhq29AVeRwz+WEXiU
The key's randomart image is:
+---[RSA 4096]----+
| .o+ .E..|
| .+ o.+.. |
| .. +.oo. |
| .. o.=o o |
| . S.*+oo.B.|
| . .=oooo* *|
| ... .o.+.|
| o ooo |
| .+. .o=o |
+----[SHA256]-----+

1.2 Automated Script to Create the VM Nodes — Generated by Gemini

create-vms.sh

#!/bin/bash

# --- Configuration ---
BASE_IMAGE_PATH="/home/ois/data/vmimages/noble-server-cloudimg-amd64.img"
VM_IMAGE_DIR="/home/ois/data/k8s/nodevms"
VM_CONFIG_DIR="/home/ois/data/k8s/nodevm_cfg"
RAM_MB=8192
VCPUS=4
DISK_SIZE_GB=20 # <--- ADDED: Increased disk size to 20 GB
BRIDGE_INTERFACE="br0"
BASE_IP="10.75.59"
NETWORK_PREFIX="/24"
GATEWAY="10.75.59.1"
NAMESERVER1="64.104.76.247"
NAMESERVER2="64.104.14.184"
SEARCH_DOMAIN="cisco.com"
VNC_PORT_START=5905 # VNC ports will be 5905, 5906, 5907
PASSWORD_HASH='$6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1'
SSH_PUB_KEY=$(cat ~/.ssh/id_rsa.pub)

# --- Loop to create 3 VMs ---
for i in {1..3}; do
  VM_NAME="kube-node-$i"
  VM_IP="${BASE_IP}.$((70 + i))" # IPs will be 10.75.59.71, 10.75.59.72, 10.75.59.73
  VM_IMAGE_PATH="${VM_IMAGE_DIR}/${VM_NAME}.qcow2"
  VM_VNC_PORT=$((VNC_PORT_START + i - 1)) # VNC ports will be 5905, 5906, 5907

  echo "--- Preparing for $VM_NAME (IP: $VM_IP) ---"

  # Create directories if they don't exist
  mkdir -p "$VM_IMAGE_DIR"
  mkdir -p "$VM_CONFIG_DIR"

  # Create a fresh image for each VM
  if [ -f "$VM_IMAGE_PATH" ]; then
    echo "Removing existing image for $VM_NAME..."
    rm "$VM_IMAGE_PATH"
  fi
  echo "Copying base image to $VM_IMAGE_PATH..."
  cp "$BASE_IMAGE_PATH" "$VM_IMAGE_PATH"

  # --- NEW: Resize the copied image before virt-install ---
  echo "Resizing VM image to ${DISK_SIZE_GB}GB..."
  qemu-img resize "$VM_IMAGE_PATH" "${DISK_SIZE_GB}G"
  # --- END NEW ---

  # Generate user-data for the current VM
  USER_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_user-data"
  cat <<EOF > "$USER_DATA_FILE"
#cloud-config

locale: en_US
keyboard:
  layout: us
timezone: Asia/Shanghai
hostname: ${VM_NAME}
create_hostname_file: true

ssh_pwauth: yes

groups:
  - ubuntu

users:
  - name: ubuntu
    gecos: ubuntu
    primary_group: ubuntu
    groups: sudo, cdrom
    sudo: ALL=(ALL:ALL) ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: ${PASSWORD_HASH}
    ssh_authorized_keys:
      - "${SSH_PUB_KEY}"

apt:
  primary:
    - arches: [default]
      uri: http://us.archive.ubuntu.com/ubuntu/

packages:
  - openssh-server
  - net-tools
  - iftop
  - htop
  - iperf3
  - vim
  - curl
  - wget
  - cloud-guest-utils # Ensure growpart is available

ntp:
  servers: ['ntp.esl.cisco.com']

runcmd:
  - echo "Attempting to resize root partition and filesystem..."
  - growpart /dev/vda 1 # Expand the first partition on /dev/vda
  - resize2fs /dev/vda1 # Expand the ext4 filesystem on /dev/vda1
  - echo "Disk resize commands executed. Verify with 'df -h' after boot."
EOF

  # Generate network-config for the current VM
  NETWORK_CONFIG_FILE="${VM_CONFIG_DIR}/${VM_NAME}_network-config"
  cat <<EOF > "$NETWORK_CONFIG_FILE"
network:
  version: 2
  ethernets:
    enp1s0:
      addresses:
        - "${VM_IP}${NETWORK_PREFIX}"
      nameservers:
        addresses:
          - ${NAMESERVER1}
          - ${NAMESERVER2}
        search:
          - ${SEARCH_DOMAIN}
      routes:
        - to: "default"
          via: "${GATEWAY}"
EOF

  # Generate meta-data (can be static for now)
  META_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_meta-data"
  cat <<EOF > "$META_DATA_FILE"
instance-id: ${VM_NAME}
local-hostname: ${VM_NAME}
EOF

  echo "--- Installing $VM_NAME ---"
  virt-install --name "${VM_NAME}" --ram "${RAM_MB}" --vcpus "${VCPUS}" --noreboot \
    --os-variant ubuntu24.04 \
    --network bridge="${BRIDGE_INTERFACE}" \
    --graphics vnc,listen=0.0.0.0,port="${VM_VNC_PORT}" \
    --disk path="${VM_IMAGE_PATH}",format=qcow2 \
    --console pty,target_type=serial \
    --cloud-init user-data="${USER_DATA_FILE}",meta-data="${META_DATA_FILE}",network-config="${NETWORK_CONFIG_FILE}" \
    --import \
    --wait 0

  echo "Successfully initiated creation of $VM_NAME."
  echo "You can connect to VNC on port ${VM_VNC_PORT} to monitor installation (optional)."
  echo "Wait a few minutes for the VM to boot and cloud-init to run."
  echo "--------------------------------------------------------"
done

echo "All 3 VMs have been initiated. Please wait for them to fully provision."
echo "You can SSH into them using 'ssh ubuntu@<IP_ADDRESS>' where IP addresses are 10.75.59.71, 10.75.59.72, 10.75.59.73."

The script above automatically creates the three VMs; once they are provisioned, you can log in with ssh ubuntu@10.75.59.71.

chmod +x create-vms.sh

ois@ois:~/data/k8s$ ./create-vms.sh
--- Preparing for kube-node-1 (IP: 10.75.59.71) ---
Removing existing image for kube-node-1...
Copying base image to /home/ois/data/k8s/nodevms/kube-node-1.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing kube-node-1 ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-3jev55ba-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-3jev55ba-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of kube-node-1.
You can connect to VNC on port 5905 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
--- Preparing for kube-node-2 (IP: 10.75.59.72) ---
Removing existing image for kube-node-2...
Copying base image to /home/ois/data/k8s/nodevms/kube-node-2.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing kube-node-2 ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-c4ruhql3-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-c4ruhql3-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of kube-node-2.
You can connect to VNC on port 5906 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
--- Preparing for kube-node-3 (IP: 10.75.59.73) ---
Removing existing image for kube-node-3...
Copying base image to /home/ois/data/k8s/nodevms/kube-node-3.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing kube-node-3 ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-u5e8k9a9-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-u5e8k9a9-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of kube-node-3.
You can connect to VNC on port 5907 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
All 3 VMs have been initiated. Please wait for them to fully provision.
You can SSH into them using 'ssh ubuntu@<IP_ADDRESS>' where IP addresses are 10.75.59.71, 10.75.59.72, 10.75.59.73.

ois@ois:~/data/k8s$ virsh list
Id Name State
-------------------------------------
83 kube-node-1 running
84 kube-node-2 running
85 kube-node-3 running
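
Once the domains are running, a quick way to confirm that cloud-init has actually finished (packages installed, root filesystem grown) is to poll its status over SSH. A minimal sketch, assuming the SSH key injected by the user-data above:

for ip in 10.75.59.71 10.75.59.72 10.75.59.73; do
    ssh -o StrictHostKeyChecking=accept-new ubuntu@$ip 'cloud-init status --wait && df -h /'
done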

Sample user-data

#cloud-config

locale: en_US
keyboard:
  layout: us
timezone: Asia/Shanghai
hostname: kube-node-1
create_hostname_file: true

ssh_pwauth: yes

groups:
  - ubuntu

users:
  - name: ubuntu
    gecos: ubuntu
    primary_group: ubuntu
    groups: sudo, cdrom
    sudo: ALL=(ALL:ALL) ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: $6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1
    ssh_authorized_keys:
      - "ssh-rsa AAAAB3NzaC1yXXXXXXXXXX=="

apt:
  primary:
    - arches: [default]
      uri: http://us.archive.ubuntu.com/ubuntu/

packages:
  - openssh-server
  - net-tools
  - iftop
  - htop
  - iperf3
  - vim
  - curl
  - wget
  - cloud-guest-utils   # Ensure growpart is available

ntp:
  servers: ['ntp.esl.cisco.com']

runcmd:
  - echo "Attempting to resize root partition and filesystem..."
  - growpart /dev/vda 1    # Expand the first partition on /dev/vda
  - resize2fs /dev/vda1    # Expand the ext4 filesystem on /dev/vda1
  - echo "Disk resize commands executed. Verify with 'df -h' after boot."

network-config

network:
  version: 2
  ethernets:
    enp1s0:
      addresses:
        - "10.75.59.71/24"
      nameservers:
        addresses:
          - 64.104.76.247
          - 64.104.14.184
        search:
          - cisco.com
      routes:
        - to: "default"
          via: "10.75.59.1"

meta-data

instance-id: kube-node-1
local-hostname: kube-node-1
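
These three files feed the NoCloud datasource that virt-install's --cloud-init option packages into a seed ISO. The user-data can be sanity-checked on the host before creating a VM; a small sketch, assuming a reasonably recent cloud-init (older releases expose the same check as 'cloud-init devel schema'):

cloud-init schema --config-file nodevm_cfg/kube-node-1_user-data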

2. Install Kubernetes with Ansible

2.1 Script contents

The following script automates the installation of Ansible and generates a playbook that can be run directly to perform the installation tasks.

#!/bin/bash

# This script automates the setup of an Ansible environment for Kubernetes cluster deployment.
# It installs Ansible, creates the project directory, inventory, configuration,
# and defines common Kubernetes setup tasks.
# This version stops after installing Kubernetes components, allowing manual kubeadm init/join.
# Includes a robust fix for Containerd's SystemdCgroup configuration and CRI plugin enabling,
# defines the necessary handler for restarting Containerd, dynamically adds host entries to /etc/hosts,
# and updates the pause image version in the manual instructions.
# This update also addresses the runc runtime root configuration in containerd and fixes
# YAML escape character issues in the hosts file regex, and updates the sandbox image in containerd config.

# --- Configuration ---
PROJECT_DIR="k8s_cluster_setup"
MASTER_NODE_IP="10.75.59.71" # Based on your previous script's IP assignment for kube-node-1
WORKER_NODE_IP_1="10.75.59.72" # Based on your previous script's IP assignment for kube-node-2
WORKER_NODE_IP_2="10.75.59.73" # Based on your previous script's IP address for kube-node-3
ANSIBLE_USER="ubuntu" # The user created by cloud-init on your VMs
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Path to your SSH private key on the Ansible control machine

# --- Functions ---

# Function to install Ansible
install_ansible() {
echo "--- Installing Ansible ---"
if ! command -v ansible &> /dev/null; then
sudo apt update -y
sudo apt install -y ansible
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create project directory and navigate into it
create_project_dir() {
echo "--- Creating project directory: ${PROJECT_DIR} ---"
mkdir -p "${PROJECT_DIR}"
cd "${PROJECT_DIR}" || { echo "Failed to change directory to ${PROJECT_DIR}. Exiting."; exit 1; }
echo "Changed to directory: $(pwd)"
}

# Function to create ansible.cfg
create_ansible_cfg() {
echo "--- Creating ansible.cfg ---"
cat <<EOF > ansible.cfg
[defaults]
inventory = inventory.ini
roles_path = ./roles
host_key_checking = False # WARNING: Disable host key checking for convenience during initial setup. Re-enable for production!
EOF
echo "ansible.cfg created."
}

# Function to create inventory.ini (UPDATED with IP variables)
create_inventory() {
echo "--- Creating inventory.ini ---"
cat <<EOF > inventory.ini
[master]
kube-node-1 ansible_host=${MASTER_NODE_IP}

[workers]
kube-node-2 ansible_host=${WORKER_NODE_IP_1}
kube-node-3 ansible_host=${WORKER_NODE_IP_2}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH}
ansible_python_interpreter=/usr/bin/python3
# These variables are now primarily for documentation/script clarity,
# as the hosts file task will dynamically read from inventory groups.
master_node_ip=${MASTER_NODE_IP}
worker_node_ip_1=${WORKER_NODE_IP_1}
worker_node_ip_2=${WORKER_NODE_IP_2}
EOF
echo "inventory.ini created."
}

# Function to create main playbook.yml (Modified to only include common setup)
create_playbook() {
echo "--- Creating playbook.yml ---"
cat <<EOF > playbook.yml
---
- name: Common Kubernetes Setup for all nodes
hosts: all
become: yes
roles:
- common_k8s_setup
EOF
echo "playbook.yml created (only common setup included)."
}

# Function to create roles and their tasks
create_roles() {
echo "--- Creating Ansible roles and tasks ---"

# common_k8s_setup role
mkdir -p roles/common_k8s_setup/tasks
# UPDATED: main.yml to include the new hosts entry task first
cat <<EOF > roles/common_k8s_setup/tasks/main.yml
---
- name: Include add hosts entries task
ansible.builtin.include_tasks: 00_add_hosts_entries.yml

- name: Include disable swap task
ansible.builtin.include_tasks: 01_disable_swap.yml

- name: Include containerd setup task
ansible.builtin.include_tasks: 02_containerd_setup.yml

- name: Include kernel modules and sysctl task
ansible.builtin.include_tasks: 03_kernel_modules_sysctl.yml

- name: Include kube repo, install, and hold task
ansible.builtin.include_tasks: 04_kube_repo_install_hold.yml

- name: Include initial apt upgrade task
ansible.builtin.include_tasks: 05_initial_upgrade.yml

- name: Include configure weekly updates task
ansible.builtin.include_tasks: 06_configure_weekly_updates.yml
EOF

# NEW FILE: 00_add_hosts_entries.yml (Dynamically adds hosts from inventory, FIXED: Escaped backslashes in regex)
cat <<EOF > roles/common_k8s_setup/tasks/00_add_hosts_entries.yml
---
- name: Add all inventory hosts to /etc/hosts on each node
ansible.builtin.lineinfile:
path: /etc/hosts
regexp: "^{{ hostvars[item]['ansible_host'] }}\\\\s+{{ item }}" # Fixed: \\\\s for escaped backslash in regex
line: "{{ hostvars[item]['ansible_host'] }} {{ item }}"
state: present
create: yes
mode: '0644'
owner: root
group: root
loop: "{{ groups['all'] }}" # Loop over all hosts defined in the inventory
EOF

# Create handlers directory and file
mkdir -p roles/common_k8s_setup/handlers
cat <<EOF > roles/common_k8s_setup/handlers/main.yml
---
- name: Restart containerd service
ansible.builtin.systemd:
name: containerd
state: restarted
daemon_reload: yes
EOF

# 01_disable_swap.yml
cat <<EOF > roles/common_k8s_setup/tasks/01_disable_swap.yml
---
- name: Check if swap is active
ansible.builtin.command: swapon --show
register: swap_check_result
changed_when: false # This command itself doesn't change state
failed_when: false # Don't fail if swapon --show returns non-zero (e.g., no swap enabled)

- name: Disable swap
ansible.builtin.command: swapoff -a
when: swap_check_result.rc == 0 # Only run if swapon --show indicated swap is active

- name: Persistently disable swap (comment out swapfile in fstab)
ansible.builtin.replace:
path: /etc/fstab
regexp: '^(/swapfile.*)$'
replace: '#\1'
when: swap_check_result.rc == 0 # Only run if swap was found to be active
EOF

# 02_containerd_setup.yml (UPDATED for sandbox_image)
cat <<EOF > roles/common_k8s_setup/tasks/02_containerd_setup.yml
---
- name: Install required packages for Containerd
ansible.builtin.apt:
name:
- ca-certificates
- curl
- gnupg
- lsb-release
- apt-transport-https
- software-properties-common
state: present
update_cache: yes

- name: Add Docker GPG key
ansible.builtin.apt_key:
url: https://download.docker.com/linux/ubuntu/gpg
state: present
keyring: /etc/apt/keyrings/docker.gpg # Use keyring for modern apt

- name: Add Docker APT repository
ansible.builtin.apt_repository:
repo: "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu {{ ansible_distribution_release }} stable"
state: present
filename: docker

- name: Install Containerd
ansible.builtin.apt:
name: containerd.io
state: present
update_cache: yes

- name: Create containerd configuration directory
ansible.builtin.file:
path: /etc/containerd
state: directory
mode: '0755'

- name: Generate default containerd configuration directly to final path
ansible.builtin.shell: containerd config default > /etc/containerd/config.toml
changed_when: true # Always report change as we're ensuring a default state

- name: Ensure CRI plugin is enabled (remove any disabled_plugins line containing "cri")
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*disabled_plugins = \[.*"cri".*\]' # More general regexp
state: absent
backup: yes
notify: Restart containerd service

- name: Remove top-level systemd_cgroup from CRI plugin section
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*systemd_cgroup = (true|false)' # Matches the 'systemd_cgroup' directly under [plugins."io.containerd.grpc.v1.cri"]
state: absent # Remove this line
backup: yes
notify: Restart containerd service

- name: Remove old runtime_root from runc runtime section
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*runtime_root = ".*"' # Matches runtime_root line
state: absent
backup: yes
notify: Restart containerd service

- name: Configure runc runtime to use SystemdCgroup = true
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*#?\s*SystemdCgroup = (true|false)' # Matches the 'SystemdCgroup' under runc.options
line: ' SystemdCgroup = true' # Ensure correct indentation
insertafter: '^\s*\[plugins\."io\.containerd\.grpc\.v1\.cri"\.containerd\.runtimes\.runc\.options\]'
backup: yes
notify: Restart containerd service

- name: Add Root path to runc options
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*Root = ".*"' # Matches existing Root line if any
line: ' Root = "/run/containerd/runc"' # New Root path
insertafter: '^\s*\[plugins\."io\.containerd\.grpc\.v1\.cri"\.containerd\.runtimes\.runc\.options\]'
backup: yes
notify: Restart containerd service

- name: Update sandbox_image to pause:3.10
ansible.builtin.lineinfile:
path: /etc/containerd/config.toml
regexp: '^\s*sandbox_image = "registry.k8s.io/pause:.*"'
line: ' sandbox_image = "registry.k8s.io/pause:3.10"'
insertafter: '^\s*\[plugins\."io\.containerd\.grpc\.v1\.cri"\]' # Insert after the CRI plugin section start
backup: yes
notify: Restart containerd service
EOF

# 03_kernel_modules_sysctl.yml
cat <<EOF > roles/common_k8s_setup/tasks/03_kernel_modules_sysctl.yml
---
- name: Load overlay module
ansible.builtin.command: modprobe overlay
args:
creates: /sys/module/overlay # Check if module is loaded
changed_when: false

- name: Load br_netfilter module
ansible.builtin.command: modprobe br_netfilter
args:
creates: /sys/module/br_netfilter # Check if module is loaded
changed_when: false

- name: Add modules to /etc/modules-load.d/k8s.conf
ansible.builtin.copy:
dest: /etc/modules-load.d/k8s.conf
content: |
overlay
br_netfilter

- name: Configure sysctl parameters for Kubernetes networking
ansible.builtin.copy:
dest: /etc/sysctl.d/k8s.conf
content: |
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1

- name: Apply sysctl parameters
ansible.builtin.command: sysctl --system
changed_when: false
EOF

# 04_kube_repo_install_hold.yml
cat <<EOF > roles/common_k8s_setup/tasks/04_kube_repo_install_hold.yml
---
- name: Create Kubernetes apt keyring directory
ansible.builtin.file:
path: /etc/apt/keyrings
state: directory
mode: '0755'

- name: Download Kubernetes GPG key and dearmor
ansible.builtin.shell: |
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.33/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
args:
creates: /etc/apt/keyrings/kubernetes-apt-keyring.gpg
changed_when: false # This command is idempotent enough for our purposes

- name: Add Kubernetes APT repository source list
ansible.builtin.copy:
dest: /etc/apt/sources.list.d/kubernetes.list
content: "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.33/deb/ /\n"
mode: '0644'
backup: yes

- name: Update apt cache after adding Kubernetes repo
ansible.builtin.apt:
update_cache: yes

- name: Install kubelet, kubeadm, kubectl
ansible.builtin.apt:
name:
- kubelet
- kubeadm
- kubectl
state: present
update_cache: yes # Ensure apt cache is updated after adding repo.

- name: Hold kubelet, kubeadm, kubectl packages
ansible.builtin.dpkg_selections:
name: "{{ item }}"
selection: hold
loop:
- kubelet
- kubeadm
- kubectl

- name: Enable and start kubelet service
ansible.builtin.systemd:
name: kubelet
state: started
enabled: yes
EOF

# NEW FILE: 05_initial_upgrade.yml
cat <<EOF > roles/common_k8s_setup/tasks/05_initial_upgrade.yml
---
- name: Perform initial apt update and upgrade
ansible.builtin.apt:
update_cache: yes
upgrade: yes
autoremove: yes
purge: yes
EOF

# NEW FILE: 06_configure_weekly_updates.yml
cat <<EOF > roles/common_k8s_setup/tasks/06_configure_weekly_updates.yml
---
- name: Configure weekly apt update and upgrade cron job
ansible.builtin.cron:
name: "weekly apt update and upgrade"
weekday: "0" # Sunday
hour: "3" # 3 AM
minute: "0"
job: "/usr/bin/apt update && /usr/bin/apt upgrade -y && /usr/bin/apt autoremove -y && /usr/bin/apt clean"
user: root
state: present
EOF

# Master and Worker roles are still created for structure, but not called by playbook.yml
mkdir -p roles/k8s-master/tasks
cat <<EOF > roles/k8s-master/tasks/main.yml
---
- name: This role is intentionally skipped by the main playbook for manual setup.
ansible.builtin.debug:
msg: "This master role is not executed by default. Run 'kubeadm init' manually on the master node."
EOF

mkdir -p roles/k8s-worker/tasks
cat <<EOF > roles/k8s-worker/tasks/main.yml
---
- name: This role is intentionally skipped by the main playbook for manual setup.
ansible.builtin.debug:
msg: "This worker role is not executed by default. Run 'kubeadm join' manually on worker nodes."
EOF

echo "Ansible roles and tasks created."
}

# --- Main execution ---
install_ansible
create_project_dir
create_ansible_cfg
create_inventory
create_playbook
create_roles

echo ""
echo "--- Ansible setup for Kubernetes installation is complete! ---"
echo "Navigate to the project directory:"
echo "cd ${PROJECT_DIR}"
echo ""
echo "Then, run the Ansible playbook to install Kubernetes components on all nodes:"
echo "ansible-playbook playbook.yml -K"
echo ""
echo "After the playbook finishes, you will need to manually initialize the Kubernetes cluster:"
echo "1. SSH into the master node (kube-node-1):"
echo " ssh ubuntu@${MASTER_NODE_IP}"
echo ""
echo "2. Initialize the Kubernetes control plane on the master node:"
echo " sudo kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy --pod-infra-container-image=registry.k8s.io/pause:3.10"
echo ""
echo "3. After 'kubeadm init' completes, it will print instructions to set up kubectl and the 'kubeadm join' command."
echo " Follow the instructions to set up kubectl for the 'ubuntu' user:"
echo " mkdir -p \$HOME/.kube"
echo " sudo cp -i /etc/kubernetes/admin.conf \$HOME/.kube/config"
echo " sudo chown \$(id -u):\$(id -g) \$HOME/.kube/config"
echo ""
echo "4. Copy the 'kubeadm join' command (including the token and discovery-token-ca-cert-hash) printed by 'kubeadm init'."
echo " It will look something like: 'kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>'"
echo ""
echo "5. SSH into each worker node (kube-node-2, kube-node-3) and run the join command:"
echo " ssh ubuntu@${WORKER_NODE_IP_1} (for kube-node-2)"
echo " sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>"
echo ""
echo " ssh ubuntu@${WORKER_NODE_IP_2} (for kube-node-3)"
echo " sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>"
echo ""
echo "6. Verify your cluster status from the master node:"
echo " ssh ubuntu@${MASTER_NODE_IP}"
echo " kubectl get nodes"
echo " kubectl get pods --all-namespaces"

After the script above finishes, the following project files are created:

root@ois:/home/ois/data/k8s/k8s_cluster_setup# ls -R1
.:
ansible.cfg
inventory.ini
playbook.yml
roles

./roles:
common_k8s_setup
k8s-master
k8s-worker

./roles/common_k8s_setup:
handlers
tasks

./roles/common_k8s_setup/handlers:
main.yml

./roles/common_k8s_setup/tasks:
00_add_hosts_entries.yml
01_disable_swap.yml
02_containerd_setup.yml
03_kernel_modules_sysctl.yml
04_kube_repo_install_hold.yml
05_initial_upgrade.yml
06_configure_weekly_updates.yml
main.yml

./roles/k8s-master:
tasks

./roles/k8s-master/tasks:
main.yml

./roles/k8s-worker:
tasks

./roles/k8s-worker/tasks:
main.yml
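
Before running anything against the nodes, the generated project can be sanity-checked with standard Ansible commands; a minimal sketch, run from inside k8s_cluster_setup:

ansible-playbook playbook.yml --syntax-check
ansible-inventory -i inventory.ini --graph
ansible all -m ping          # confirms SSH access and the Python interpreter on every node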

2.2 Execution

Before running the playbook, the SSH host keys of these nodes need to be added to ~/.ssh/known_hosts.

#!/bin/bash

# This script prepares the Ansible control node by adding the SSH host keys
# of the target nodes to the ~/.ssh/known_hosts file.
# It first removes any outdated keys for the specified hosts before scanning
# for the new ones, preventing "REMOTE HOST IDENTIFICATION HAS CHANGED" errors.

# --- Configuration ---
# List of hosts (IPs or FQDNs) to scan.
# These should be the same hosts you have in your Ansible inventory.
HOSTS=(
"10.75.59.71"
"10.75.59.72"
"10.75.59.73"
)

# The location of your known_hosts file.
KNOWN_HOSTS_FILE=~/.ssh/known_hosts

# --- Main Logic ---
echo "Starting SSH host key scan to update ${KNOWN_HOSTS_FILE}..."
echo ""

# Ensure the .ssh directory exists with the correct permissions.
mkdir -p ~/.ssh
chmod 700 ~/.ssh

# Loop through each host defined in the HOSTS array.
for host in "${HOSTS[@]}"; do
echo "--- Processing host: ${host} ---"

# 1. Remove the old host key (if it exists).
# This is the key step to ensure we replace outdated entries.
# The command is silent if no key is found.
echo "Step 1: Removing any old key for ${host}..."
ssh-keygen -R "${host}"

# 2. Scan for the new host key and append it.
# The -H flag hashes the hostname, which is a security best practice.
echo "Step 2: Scanning for new key and adding it to known_hosts..."
ssh-keyscan -H "${host}" >> "${KNOWN_HOSTS_FILE}"

echo "Successfully updated key for ${host}."
echo ""
done

# Set the correct permissions for the known_hosts file, as SSH is strict about this.
chmod 600 "${KNOWN_HOSTS_FILE}"

echo "✅ All hosts have been scanned and keys have been updated."
echo "You can now run your Ansible playbook without host key verification prompts."
chmod +x ansible-k8s-v2.sh 
ois@ois:~/data/k8s$ ./ansible-k8s-v2.sh
--- Installing Ansible ---
Ansible is already installed.
--- Creating project directory: k8s_cluster_setup ---
Changed to directory: /home/ois/data/k8s/k8s_cluster_setup
--- Creating ansible.cfg ---
ansible.cfg created.
--- Creating inventory.ini ---
inventory.ini created.
--- Creating playbook.yml ---
playbook.yml created (only common setup included).
--- Creating Ansible roles and tasks ---
Ansible roles and tasks created.

--- Ansible setup for Kubernetes installation is complete! ---
Navigate to the project directory:
cd k8s_cluster_setup

Then, run the Ansible playbook to install Kubernetes components on all nodes:
ansible-playbook playbook.yml -K

After the playbook finishes, you will need to manually initialize the Kubernetes cluster:
1. SSH into the master node (kube-node-1):
ssh ubuntu@10.75.59.71

2. Initialize the Kubernetes control plane on the master node:
sudo kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy --pod-infra-container-image=registry.k8s.io/pause:3.10

3. After 'kubeadm init' completes, it will print instructions to set up kubectl and the 'kubeadm join' command.
Follow the instructions to set up kubectl for the 'ubuntu' user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

4. Copy the 'kubeadm join' command (including the token and discovery-token-ca-cert-hash) printed by 'kubeadm init'.
It will look something like: 'kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>'

5. SSH into each worker node (kube-node-2, kube-node-3) and run the join command:
ssh ubuntu@10.75.59.72 (for kube-node-2)
sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>

ssh ubuntu@10.75.59.73 (for kube-node-3)
sudo <PASTE_YOUR_KUBEADM_JOIN_COMMAND_HERE>

6. Verify your cluster status from the master node:
ssh ubuntu@10.75.59.71
kubectl get nodes
kubectl get pods --all-namespaces
ois@ois:~/data/k8s$ cd k8s_cluster_setup/
ois@ois:~/data/k8s/k8s_cluster_setup$ ansible-playbook playbook.yml -K
BECOME password:

PLAY [Common Kubernetes Setup for all nodes] *******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Include add hosts entries task] *******************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/00_add_hosts_entries.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Add all inventory hosts to /etc/hosts on each node] ***********************************************************************************************************************************************************************************
changed: [kube-node-2] => (item=kube-node-1)
changed: [kube-node-1] => (item=kube-node-1)
changed: [kube-node-3] => (item=kube-node-1)
changed: [kube-node-1] => (item=kube-node-2)
changed: [kube-node-2] => (item=kube-node-2)
changed: [kube-node-3] => (item=kube-node-2)
changed: [kube-node-2] => (item=kube-node-3)
changed: [kube-node-1] => (item=kube-node-3)
changed: [kube-node-3] => (item=kube-node-3)

TASK [common_k8s_setup : Include disable swap task] ************************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/01_disable_swap.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Check if swap is active] **************************************************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-1]
ok: [kube-node-3]

TASK [common_k8s_setup : Disable swap] *************************************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Persistently disable swap (comment out swapfile in fstab)] ****************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-3]
ok: [kube-node-1]

TASK [common_k8s_setup : Include containerd setup task] ********************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/02_containerd_setup.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Install required packages for Containerd] *********************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

TASK [common_k8s_setup : Add Docker GPG key] *******************************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Add Docker APT repository] ************************************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

TASK [common_k8s_setup : Install Containerd] *******************************************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-1]
changed: [kube-node-2]

TASK [common_k8s_setup : Create containerd configuration directory] ********************************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-3]
ok: [kube-node-1]

TASK [common_k8s_setup : Generate default containerd configuration directly to final path] *********************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

TASK [common_k8s_setup : Ensure CRI plugin is enabled (remove any disabled_plugins line containing "cri")] *****************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Remove top-level systemd_cgroup from CRI plugin section] ******************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Remove old runtime_root from runc runtime section] ************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

TASK [common_k8s_setup : Configure runc runtime to use SystemdCgroup = true] ***********************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Add Root path to runc options] ********************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

TASK [common_k8s_setup : Update sandbox_image to pause:3.10] ***************************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Include kernel modules and sysctl task] ***********************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/03_kernel_modules_sysctl.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Load overlay module] ******************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Load br_netfilter module] *************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Add modules to /etc/modules-load.d/k8s.conf] ******************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Configure sysctl parameters for Kubernetes networking] ********************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [common_k8s_setup : Apply sysctl parameters] **************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Include kube repo, install, and hold task] ********************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/04_kube_repo_install_hold.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Create Kubernetes apt keyring directory] **********************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [common_k8s_setup : Download Kubernetes GPG key and dearmor] **********************************************************************************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-1]
ok: [kube-node-3]

TASK [common_k8s_setup : Add Kubernetes APT repository source list] ********************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-3]
changed: [kube-node-1]

TASK [common_k8s_setup : Update apt cache after adding Kubernetes repo] ****************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

TASK [common_k8s_setup : Install kubelet, kubeadm, kubectl] ****************************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-2]
changed: [kube-node-1]

TASK [common_k8s_setup : Hold kubelet, kubeadm, kubectl packages] **********************************************************************************************************************************************************************************************
changed: [kube-node-1] => (item=kubelet)
changed: [kube-node-3] => (item=kubelet)
changed: [kube-node-2] => (item=kubelet)
changed: [kube-node-3] => (item=kubeadm)
changed: [kube-node-1] => (item=kubeadm)
changed: [kube-node-2] => (item=kubeadm)
changed: [kube-node-3] => (item=kubectl)
changed: [kube-node-1] => (item=kubectl)
changed: [kube-node-2] => (item=kubectl)

TASK [common_k8s_setup : Enable and start kubelet service] *****************************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-2]
changed: [kube-node-1]

TASK [common_k8s_setup : Include initial apt upgrade task] *****************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/05_initial_upgrade.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Perform initial apt update and upgrade] ***********************************************************************************************************************************************************************************************
changed: [kube-node-3]
changed: [kube-node-2]
changed: [kube-node-1]

TASK [common_k8s_setup : Include configure weekly updates task] ************************************************************************************************************************************************************************************************
included: /home/ois/data/k8s/k8s_cluster_setup/roles/common_k8s_setup/tasks/06_configure_weekly_updates.yml for kube-node-1, kube-node-2, kube-node-3

TASK [common_k8s_setup : Configure weekly apt update and upgrade cron job] *************************************************************************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

RUNNING HANDLER [common_k8s_setup : Restart containerd service] ************************************************************************************************************************************************************************************************
changed: [kube-node-2]
changed: [kube-node-1]
changed: [kube-node-3]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
kube-node-1 : ok=39 changed=22 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
kube-node-2 : ok=39 changed=22 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
kube-node-3 : ok=39 changed=22 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
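
Because several of the containerd settings are applied with lineinfile edits, it is worth spot-checking the resulting configuration on one node before moving on to kubeadm init. A small verification sketch, assuming crictl was pulled in alongside kubeadm (cri-tools):

ssh ubuntu@10.75.59.71 'grep -E "SystemdCgroup|sandbox_image" /etc/containerd/config.toml'
ssh ubuntu@10.75.59.71 'systemctl is-active containerd'    # kubelet keeps restarting until kubeadm init runs
ssh ubuntu@10.75.59.71 'sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock version'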

3. Set up local DNS and BGP

Set up a local DNS and BGP server for testing.

3.1 create-dns.sh

#!/bin/bash

# --- Configuration ---
BASE_IMAGE_PATH="/home/ois/data/vmimages/noble-server-cloudimg-amd64.img"
VM_IMAGE_DIR="/home/ois/data/k8s/nodevms"
VM_CONFIG_DIR="/home/ois/data/k8s/nodevm_cfg"
RAM_MB=8192
VCPUS=4
DISK_SIZE_GB=20
BRIDGE_INTERFACE="br0"
# Specific IP for the single VM
VM_IP="10.75.59.76"
NETWORK_PREFIX="/24"
GATEWAY="10.75.59.1"
# DNS servers for the VM's initial resolution (for internet access)
VM_NAMESERVER1="64.104.76.247"
VM_NAMESERVER2="64.104.14.184"
SEARCH_DOMAIN="cisco.com"
VNC_PORT=5909 # Fixed VNC port for the single VM
PASSWORD_HASH='$6$rounds=4096$LDu9pXXXXXXXXXXXXXXOh/Iunw372/TVfst1'
SSH_PUB_KEY=$(cat ~/.ssh/id_rsa.pub)

# --- VM Details ---
VM_NAME="dns-server-vm"
VM_IMAGE_PATH="${VM_IMAGE_DIR}/${VM_NAME}.qcow2"

echo "--- Preparing for $VM_NAME (IP: $VM_IP) ---"

# Create directories if they don't exist
mkdir -p "$VM_IMAGE_DIR"
mkdir -p "$VM_CONFIG_DIR"

# Create a fresh image for the VM
if [ -f "$VM_IMAGE_PATH" ]; then
echo "Removing existing image for $VM_NAME..."
rm "$VM_IMAGE_PATH"
fi
echo "Copying base image to $VM_IMAGE_PATH..."
cp "$BASE_IMAGE_PATH" "$VM_IMAGE_PATH"

# Resize the copied image before virt-install
echo "Resizing VM image to ${DISK_SIZE_GB}GB..."
qemu-img resize "$VM_IMAGE_PATH" "${DISK_SIZE_GB}G"

# Generate user-data for the VM (installing dnsmasq, no UFW, no dnsmasq config here)
USER_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_user-data"
cat <<EOF > "$USER_DATA_FILE"
#cloud-config

locale: en_US
keyboard:
layout: us
timezone: Asia/Tokyo
hostname: ${VM_NAME}
create_hostname_file: true

ssh_pwauth: yes

groups:
- ubuntu

users:
- name: ubuntu
gecos: ubuntu
primary_group: ubuntu
groups: sudo, cdrom
sudo: ALL=(ALL:ALL) ALL
shell: /bin/bash
lock_passwd: false
passwd: ${PASSWORD_HASH}
ssh_authorized_keys:
- "${SSH_PUB_KEY}"

apt:
primary:
- arches: [default]
uri: http://us.archive.ubuntu.com/ubuntu/

packages:
- openssh-server
- net-tools
- iftop
- htop
- iperf3
- vim
- curl
- wget
- cloud-guest-utils # Ensure growpart is available
- dnsmasq # Install dnsmasq for DNS server functionality

ntp:
servers: ['ntp.esl.cisco.com']

runcmd:
- echo "Attempting to resize root partition and filesystem..."
- growpart /dev/vda 1 # Expand the first partition on /dev/vda
- resize2fs /dev/vda1 # Expand the ext4 filesystem on /dev/vda1
- echo "Disk resize commands executed. Verify with 'df -h' after boot."
EOF

# Generate network-config for the VM (pointing to external DNS for initial connectivity)
NETWORK_CONFIG_FILE="${VM_CONFIG_DIR}/${VM_NAME}_network-config"
cat <<EOF > "$NETWORK_CONFIG_FILE"
network:
version: 2
ethernets:
enp1s0:
addresses:
- "${VM_IP}${NETWORK_PREFIX}"
nameservers:
addresses:
- ${VM_NAMESERVER1} # Point VM to external DNS for initial internet access
- ${VM_NAMESERVER2}
search:
- ${SEARCH_DOMAIN}
routes:
- to: "default"
via: "${GATEWAY}"
EOF

# Generate meta-data
META_DATA_FILE="${VM_CONFIG_DIR}/${VM_NAME}_meta-data"
cat <<EOF > "$META_DATA_FILE"
instance-id: ${VM_NAME}
local-hostname: ${VM_NAME}
EOF

echo "--- Installing $VM_NAME ---"
virt-install --name "${VM_NAME}" --ram "${RAM_MB}" --vcpus "${VCPUS}" --noreboot \
--os-variant ubuntu24.04 \
--network bridge="${BRIDGE_INTERFACE}" \
--graphics vnc,listen=0.0.0.0,port="${VNC_PORT}" \
--disk path="${VM_IMAGE_PATH}",format=qcow2 \
--console pty,target_type=serial \
--cloud-init user-data="${USER_DATA_FILE}",meta-data="${META_DATA_FILE}",network-config="${NETWORK_CONFIG_FILE}" \
--import \
--wait 0

echo "Successfully initiated creation of $VM_NAME."
echo "You can connect to VNC on port ${VNC_PORT} to monitor installation (optional)."
echo "Wait a few minutes for the VM to boot and cloud-init to run."
echo "--------------------------------------------------------"

echo "The DNS server VM has been initiated. Please wait for it to fully provision."
echo "You can SSH into it using 'ssh ubuntu@${VM_IP}'."
echo "Once provisioned, proceed to use the 'setup-dnsmasq-ansible.sh' script to configure DNS using Ansible."

ois@ois:~/data/k8s$ ./create-dns.sh
--- Preparing for dns-server-vm (IP: 10.75.59.76) ---
Copying base image to /home/ois/data/k8s/nodevms/dns-server-vm.qcow2...
Resizing VM image to 20GB...
Image resized.
--- Installing dns-server-vm ---
WARNING Treating --wait 0 as --noautoconsole

Starting install...
Allocating 'virtinst-y1pxxrj5-cloudinit.iso' | 0 B 00:00:00 ...
Transferring 'virtinst-y1pxxrj5-cloudinit.iso' | 0 B 00:00:00 ...
Creating domain... | 0 B 00:00:00
Domain creation completed.
Successfully initiated creation of dns-server-vm.
You can connect to VNC on port 5909 to monitor installation (optional).
Wait a few minutes for the VM to boot and cloud-init to run.
--------------------------------------------------------
The DNS server VM has been initiated. Please wait for it to fully provision.
You can SSH into it using 'ssh ubuntu@10.75.59.76'.
Once provisioned, you can test the DNS forwarding by configuring another machine to use 10.75.59.76 as its DNS server and performing a DNS query.
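
Before handing the VM over to Ansible, it can be worth confirming that cloud-init completed and that the dnsmasq package requested in the user-data actually landed; a short optional check:

ssh ubuntu@10.75.59.76 'cloud-init status --wait && dpkg -s dnsmasq | grep -i ^status'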

3.2 Configure dnsmasq with Ansible

#!/bin/bash

# --- Configuration for Ansible and DNSmasq ---
VM_IP="10.75.59.76"
SSH_USER="ubuntu"
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Ensure this path is correct for your setup
FORWARD_NAMESERVER1="64.104.76.247"
FORWARD_NAMESERVER2="64.104.14.184"
SEARCH_DOMAIN="cisco.com"
ANSIBLE_DIR="ansible_dnsmasq_setup"

echo "--- Setting up Ansible environment and configuring DNSmasq on ${VM_IP} ---"

# Create a directory for Ansible files
mkdir -p "$ANSIBLE_DIR"
cd "$ANSIBLE_DIR" || exit 1

# --- Install Ansible if not already installed ---
if ! command -v ansible &> /dev/null
then
echo "Ansible not found. Installing Ansible..."
# Check OS and install accordingly
if [ -f /etc/os-release ]; then
. /etc/os-release
if [[ "$ID" == "ubuntu" || "$ID" == "debian" ]]; then
sudo apt update
sudo apt install -y software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
elif [[ "$ID" == "centos" || "$ID" == "rhel" || "$ID" == "fedora" ]]; then
sudo yum install -y epel-release
sudo yum install -y ansible
else
echo "Unsupported OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
else
echo "Could not determine OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
else
echo "Ansible is already installed."
fi

# --- Create Ansible Inventory File ---
echo "Creating Ansible inventory file: inventory.ini"
cat <<EOF > inventory.ini
[dns_server]
${VM_IP} ansible_user=${SSH_USER} ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH} ansible_python_interpreter=/usr/bin/python3
EOF

# --- Create Ansible Playbook (setup-dnsmasq.yml) ---
echo "Creating Ansible playbook: setup-dnsmasq.yml"
cat <<EOF > setup-dnsmasq.yml
---
- name: Configure DNSmasq Server on Ubuntu VM
hosts: dns_server
become: yes # Run tasks with sudo privileges
vars:
dns_forwarder_1: "${FORWARD_NAMESERVER1}"
dns_forwarder_2: "${FORWARD_NAMESERVER2}"
vm_ip: "${VM_IP}"
search_domain: "${SEARCH_DOMAIN}"

tasks:
- name: Ensure apt cache is updated
ansible.builtin.apt:
update_cache: yes
cache_valid_time: 3600 # Cache for 1 hour

- name: Install dnsmasq package
ansible.builtin.apt:
name: dnsmasq
state: present

- name: Stop dnsmasq service before configuration
ansible.builtin.systemd:
name: dnsmasq
state: stopped
ignore_errors: yes # Ignore if it's not running initially

- name: Backup original dnsmasq.conf
ansible.builtin.command: mv /etc/dnsmasq.conf /etc/dnsmasq.conf.bak
args:
removes: /etc/dnsmasq.conf # Only run if dnsmasq.conf exists
ignore_errors: yes

- name: Configure dnsmasq for forwarding
ansible.builtin.template:
src: dnsmasq.conf.j2
dest: /etc/dnsmasq.conf
owner: root
group: root
mode: '0644'
notify: Restart dnsmasq

- name: Set VM's /etc/resolv.conf to point to itself (local DNS)
ansible.builtin.template:
src: resolv.conf.j2
dest: /etc/resolv.conf
owner: root
group: root
mode: '0644'
vars:
local_dns_ip: "127.0.0.1" # dnsmasq listens on 127.0.0.1
# Removed: search_domain: "{{ search_domain }}" - it's already available from play vars
notify: Restart systemd-resolved # Or NetworkManager, depending on Ubuntu version

handlers:
- name: Restart dnsmasq
ansible.builtin.systemd:
name: dnsmasq
state: restarted
enabled: yes # Ensure it's enabled to start on boot

- name: Restart systemd-resolved
ansible.builtin.systemd:
name: systemd-resolved
state: restarted
ignore_errors: yes # systemd-resolved might not be used on server installs
EOF

# --- Create dnsmasq.conf.j2 template ---
echo "Creating dnsmasq.conf.j2 template"
cat <<EOF > dnsmasq.conf.j2
# This file is managed by Ansible. Do not edit manually.

# Do not read /etc/resolv.conf, use the servers below
no-resolv

# Specify upstream DNS servers for forwarding
server={{ dns_forwarder_1 }}
server={{ dns_forwarder_2 }}

# Listen on localhost and the VM's primary IP
listen-address=127.0.0.1,{{ vm_ip }}

# Allow queries from any interface
interface={{ ansible_default_ipv4.interface }} # Listen on the primary network interface

# Bind to the interfaces to prevent dnsmasq from listening on all interfaces
bind-interfaces

# Cache DNS results
cache-size=150
EOF

# --- Create resolv.conf.j2 template ---
echo "Creating resolv.conf.j2 template"
cat <<EOF > resolv.conf.j2
# This file is managed by Ansible. Do not edit manually.
nameserver {{ local_dns_ip }}
search {{ search_domain }}
EOF

# --- Run the Ansible Playbook ---
echo "Running Ansible playbook to configure DNSmasq..."
ansible-playbook -i inventory.ini setup-dnsmasq.yml -K

# --- Final Instructions ---
echo "--------------------------------------------------------"
echo "DNSmasq configuration complete on ${VM_IP}."
echo "You can now test the DNS server from the VM or from another machine."
echo "From the VM, run: dig @127.0.0.1 www.cisco.com"
echo "From another machine on the same network, configure its DNS to ${VM_IP} and run: dig www.cisco.com"
echo "Remember to exit the '${ANSIBLE_DIR}' directory when done (cd ..)."

Execution output

ois@ois:~/data/k8s$ ./ansible-dnsmasq.sh 
--- Setting up Ansible environment and configuring DNSmasq on 10.75.59.76 ---
Ansible is already installed.
Creating Ansible inventory file: inventory.ini
Creating Ansible playbook: setup-dnsmasq.yml
Creating dnsmasq.conf.j2 template
Creating resolv.conf.j2 template
Running Ansible playbook to configure DNSmasq...
BECOME password:

PLAY [Configure DNSmasq Server on Ubuntu VM] *******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Ensure apt cache is updated] *****************************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Install dnsmasq package] *********************************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Stop dnsmasq service before configuration] ***************************************************************************************************************************************************************************************************************
ok: [10.75.59.76]

TASK [Backup original dnsmasq.conf] ****************************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

TASK [Configure dnsmasq for forwarding] ************************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

TASK [Set VM's /etc/resolv.conf to point to itself (local DNS)] ************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

RUNNING HANDLER [Restart dnsmasq] ******************************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

RUNNING HANDLER [Restart systemd-resolved] *********************************************************************************************************************************************************************************************************************
changed: [10.75.59.76]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
10.75.59.76 : ok=9 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

--------------------------------------------------------
DNSmasq configuration complete on 10.75.59.76.
You can now test the DNS server from the VM or from another machine.
From the VM, run: dig @127.0.0.1 www.cisco.com
From another machine on the same network, configure its DNS to 10.75.59.76 and run: dig www.cisco.com
Remember to exit the 'ansible_dnsmasq_setup' directory when done (cd ..).
root@dns-server-vm:~# dig @127.0.0.1 www.cisco.com

; <<>> DiG 9.18.30-0ubuntu0.24.04.2-Ubuntu <<>> @127.0.0.1 www.cisco.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10545
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: ba16202024b9dc8301000000688b40f90daaa0a500a50d2d (good)
;; QUESTION SECTION:
;www.cisco.com. IN A

;; ANSWER SECTION:
www.cisco.com. 3600 IN CNAME origin-www.cisco.com.
origin-www.cisco.com. 1800 IN CNAME origin-www.xgslb-v3.cisco.com.
origin-www.xgslb-v3.CISCO.com. 10 IN A 72.163.4.161

;; Query time: 71 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Thu Jul 31 19:10:01 JST 2025
;; MSG SIZE rcvd: 183
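
Two extra checks, suggested here and not part of the original run, confirm that dnsmasq is serving on the LAN address as well as on loopback:

ss -ulpn | grep ':53 '                    # dnsmasq should be bound to 127.0.0.1:53 and 10.75.59.76:53
dig @10.75.59.76 www.cisco.com +short     # run from another host on the 10.75.59.0/24 network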

3.3 Updating the Nodes' DNS Configuration with Ansible

#!/bin/bash

# --- Configuration ---
ANSIBLE_DIR="ansible_dns_update"
INVENTORY_FILE="${ANSIBLE_DIR}/hosts.ini"
PLAYBOOK_FILE="${ANSIBLE_DIR}/update_dns.yml"

# Kubernetes Node IPs (ensure these match your actual VM IPs)
KUBE_NODE_1_IP="10.75.59.71"
KUBE_NODE_2_IP="10.75.59.72"
KUBE_NODE_3_IP="10.75.59.73"

# Common Ansible user and Python interpreter
ANSIBLE_USER="ubuntu"
ANSIBLE_PYTHON_INTERPRETER="/usr/bin/python3"

# --- Functions ---

# Function to check and install Ansible
install_ansible() {
if ! command -v ansible &> /dev/null
then
echo "Ansible not found. Attempting to install Ansible..."
if [ -f /etc/debian_version ]; then
# Debian/Ubuntu
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
elif [ -f /etc/redhat-release ]; then
# CentOS/RHEL/Fedora
sudo yum install -y epel-release
sudo yum install -y ansible
else
echo "Unsupported OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
if ! command -v ansible &> /dev/null; then
echo "Ansible installation failed. Please install it manually and re-run this script."
exit 1
fi
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create Ansible inventory file
create_inventory() {
echo "Creating Ansible inventory file: ${INVENTORY_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<EOF > "$INVENTORY_FILE"
[kubernetes_nodes]
kube-node-1 ansible_host=${KUBE_NODE_1_IP}
kube-node-2 ansible_host=${KUBE_NODE_2_IP}
kube-node-3 ansible_host=${KUBE_NODE_3_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_python_interpreter=${ANSIBLE_PYTHON_INTERPRETER}
EOF
echo "Inventory file created."
}

# Function to create Ansible playbook file
create_playbook() {
echo "Creating Ansible playbook file: ${PLAYBOOK_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<'EOF' > "$PLAYBOOK_FILE"
---
- name: Update DNS server on Kubernetes nodes to use local DNS only
hosts: kubernetes_nodes
become: yes # This allows Ansible to run commands with sudo privileges

tasks:
- name: Ensure netplan configuration directory exists
ansible.builtin.file:
path: /etc/netplan
state: directory
mode: '0755'

- name: Get current network configuration file (e.g., 00-installer-config.yaml)
ansible.builtin.find:
paths: /etc/netplan
patterns: '*.yaml'
# We assume there's only one primary netplan config file for simplicity.
# If there are multiple, you might need to specify which one.
register: netplan_files

- name: Set network config file variable
ansible.builtin.set_fact:
netplan_config_file: "{{ netplan_files.files[0].path }}"
when: netplan_files.files | length > 0

- name: Fail if no netplan config file found
ansible.builtin.fail:
msg: "No Netplan configuration file found in /etc/netplan. Cannot proceed."
when: netplan_files.files | length == 0

- name: Read current netplan configuration
ansible.builtin.slurp:
src: "{{ netplan_config_file }}"
register: current_netplan_config

- name: Parse current netplan configuration
ansible.builtin.set_fact:
parsed_netplan: "{{ current_netplan_config['content'] | b64decode | from_yaml }}"

- name: Update nameservers in netplan configuration to local DNS only
ansible.builtin.set_fact:
updated_netplan: "{{ parsed_netplan | combine(
{
'network': {
'ethernets': {
'enp1s0': {
'nameservers': {
'addresses': ['10.75.59.76'],
'search': ['cisco.com']
}
}
}
}
}, recursive=True) }}"

- name: Write updated netplan configuration
ansible.builtin.copy:
content: "{{ updated_netplan | to_yaml }}"
dest: "{{ netplan_config_file }}"
mode: '0600'
notify: Apply Netplan Configuration

handlers:
- name: Apply Netplan Configuration
ansible.builtin.command: netplan apply
listen: "Apply Netplan Configuration"
EOF
echo "Playbook file created."
}

# --- Main Script Execution ---

echo "Starting Ansible DNS update process..."

# 1. Install Ansible if not present
install_ansible

# 2. Create Ansible inventory file
create_inventory

# 3. Create Ansible playbook file
create_playbook

# 4. Run the Ansible playbook
echo "Running Ansible playbook to update DNS on Kubernetes nodes..."
echo "You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs."
ansible-playbook -i "$INVENTORY_FILE" "$PLAYBOOK_FILE" --ask-become-pass

if [ $? -eq 0 ]; then
echo "Ansible playbook executed successfully."
echo "Your Kubernetes nodes should now be configured to use 10.75.59.76 as their only DNS server."
else
echo "Ansible playbook failed. Please check the output for errors."
fi

echo "Process complete."
ois@ois:~/data/k8s$ ./updatedns.sh 
Starting Ansible DNS update process...
Ansible is already installed.
Creating Ansible inventory file: ansible_dns_update/hosts.ini
Inventory file created.
Creating Ansible playbook file: ansible_dns_update/update_dns.yml
Playbook file created.
Running Ansible playbook to update DNS on Kubernetes nodes...
You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs.
BECOME password:

PLAY [Update DNS server on Kubernetes nodes to use local DNS only] *********************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Ensure netplan configuration directory exists] ***********************************************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-2]
ok: [kube-node-1]

TASK [Get current network configuration file (e.g., 00-installer-config.yaml)] *********************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-1]
ok: [kube-node-2]

TASK [Set network config file variable] ************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Fail if no netplan config file found] ********************************************************************************************************************************************************************************************************************
skipping: [kube-node-1]
skipping: [kube-node-2]
skipping: [kube-node-3]

TASK [Read current netplan configuration] **********************************************************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-1]
ok: [kube-node-2]

TASK [Parse current netplan configuration] *********************************************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Update nameservers in netplan configuration to local DNS only] *******************************************************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-2]
ok: [kube-node-3]

TASK [Write updated netplan configuration] *********************************************************************************************************************************************************************************************************************
ok: [kube-node-3]
ok: [kube-node-1]
ok: [kube-node-2]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
kube-node-1 : ok=8 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
kube-node-2 : ok=8 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
kube-node-3 : ok=8 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0

Ansible playbook executed successfully.
Your Kubernetes nodes should now be configured to use 10.75.59.76 as their only DNS server.
Process complete.

root@kube-node-1:~# cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search cisco.com
root@kube-node-1:~# cat /etc/netplan/50-cloud-init.yaml
network:
ethernets:
enp1s0:
addresses: [10.75.59.71/24]
nameservers:
addresses: [10.75.59.76]
search: [cisco.com]
routes:
- {to: default, via: 10.75.59.1}
version: 2
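
Because /etc/resolv.conf on the nodes still points at the systemd-resolved stub (127.0.0.53), the effective upstream is only visible through resolvectl. A quick check (assumed commands, not from the original transcript):

resolvectl status enp1s0          # should list 10.75.59.76 as the DNS server for the interface
resolvectl query www.cisco.com    # resolves via the stub -> 10.75.59.76 -> upstream forwarders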

3.4 Configuring FRR BGP with Ansible

#!/bin/bash

# This script automates the setup of an Ansible environment for installing and configuring FRRouting (FRR).
# It creates the project directory, inventory, configuration, and the playbook
# with an idempotent role to install and configure FRR.

# --- Configuration ---
PROJECT_DIR="ansible-frr-setup" # Changed project directory name
FRR_NODE_IP="10.75.59.76" # IP address of your FRR VM (frr-server-vm)
ANSIBLE_USER="ubuntu" # The user created by cloud-init on your VMs
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Path to your SSH private key on the Ansible control machine

# FRR specific configuration
FRR_AS=65000 # The Autonomous System number for this FRR node (example AS, choose your own)
K8S_MASTER_IP="10.75.59.71" # From your create-vms.sh script
K8S_WORKER_1_IP="10.75.59.72" # From your create-vms.sh script
K8S_WORKER_2_IP="10.75.59.73" # From your create-vms.sh script
CILIUM_BGP_AS=65000 # AS for Cilium as per your CiliumBGPClusterConfig

# --- Functions ---

# Function to install Ansible (if not already installed)
install_ansible() {
echo "--- Installing Ansible ---"
if ! command -v ansible &> /dev/null; then
sudo apt update -y
sudo apt install -y ansible
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create project directory and navigate into it
create_project_dir() {
echo "--- Creating project directory: ${PROJECT_DIR} ---"
# Check if directory exists, if so, just navigate, otherwise create and navigate
if [ ! -d "${PROJECT_DIR}" ]; then
mkdir -p "${PROJECT_DIR}"
echo "Created new directory: ${PROJECT_DIR}"
else
echo "Directory ${PROJECT_DIR} already exists."
fi
cd "${PROJECT_DIR}" || { echo "Failed to change directory to ${PROJECT_DIR}. Exiting."; exit 1; }
echo "Changed to directory: $(pwd)"
}

# Function to create ansible.cfg
create_ansible_cfg() {
echo "--- Creating ansible.cfg ---"
cat <<EOF > ansible.cfg
[defaults]
inventory = inventory.ini
roles_path = ./roles
host_key_checking = False # WARNING: Disable host key checking for convenience. Re-enable for production!
EOF
echo "ansible.cfg created."
}

# Function to create inventory.ini
create_inventory() {
echo "--- Creating inventory.ini ---"
cat <<EOF > inventory.ini
[frr_nodes]
frr-node-1 ansible_host=${FRR_NODE_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH}
ansible_python_interpreter=/usr/bin/python3
FRR_AS=${FRR_AS}
K8S_MASTER_IP=${K8S_MASTER_IP}
K8S_WORKER_1_IP=${K8S_WORKER_1_IP}
K8S_WORKER_2_IP=${K8S_WORKER_2_IP}
CILIUM_BGP_AS=${CILIUM_BGP_AS}
EOF
echo "inventory.ini created."
}

# Function to create the main playbook.yml
create_playbook() {
echo "--- Creating playbook.yml ---"
cat <<EOF > playbook.yml
---
- name: Install and Configure FRRouting (FRR)
hosts: frr_nodes
become: yes
roles:
- frr_setup # Changed role name to frr_setup
EOF
echo "playbook.yml created."
}

# Function to create the FRR installation and configuration role
create_frr_role() { # Changed function name from create_gobgp_role
echo "--- Creating Ansible role for FRR setup ---"
mkdir -p roles/frr_setup/tasks
cat <<EOF > roles/frr_setup/tasks/main.yml
---
- name: Install FRRouting (FRR)
ansible.builtin.apt:
name: frr
state: present
update_cache: yes

- name: Configure FRR daemons (enable zebra and bgpd)
ansible.builtin.lineinfile:
path: /etc/frr/daemons
regexp: '^(zebra|bgpd)='
line: '\1=yes'
state: present
backrefs: yes # Required to make regexp work for replacement
notify: Restart FRR service

- name: Configure frr.conf
ansible.builtin.copy:
dest: /etc/frr/frr.conf
content: |
!
hostname {{ ansible_hostname }}
password zebra
enable password zebra
!
log syslog informational
!
router bgp {{ FRR_AS }}
bgp router-id {{ ansible_host }}
!
neighbor {{ K8S_MASTER_IP }} remote-as {{ CILIUM_BGP_AS }}
neighbor {{ K8S_WORKER_1_IP }} remote-as {{ CILIUM_BGP_AS }}
neighbor {{ K8S_WORKER_2_IP }} remote-as {{ CILIUM_BGP_AS }}
!
address-family ipv4 unicast
# Crucial: Redistribute BGP learned routes into the kernel
redistribute connected
redistribute static
redistribute kernel
exit-address-family
!
line vty
!
mode: '0644'
notify: Restart FRR service # Handler only runs if file content changes

- name: Set permissions for frr.conf
ansible.builtin.file:
path: /etc/frr/frr.conf
owner: frr
group: frr
mode: '0640'

- name: Enable and start FRR service
ansible.builtin.systemd:
name: frr
state: started
enabled: yes
daemon_reload: yes # Ensure systemd reloads unit files if service file changed

EOF

mkdir -p roles/frr_setup/handlers
cat <<EOF > roles/frr_setup/handlers/main.yml
---
- name: Restart FRR service
ansible.builtin.systemd:
name: frr
state: restarted
EOF
echo "FRR Ansible role created."
}

# --- Main execution ---
install_ansible
create_project_dir
create_ansible_cfg
create_inventory
create_playbook
create_frr_role # Changed function call

echo ""
echo "--- Ansible setup for FRR installation is complete! ---"
echo "Navigate to the new project directory:"
echo "cd ${PROJECT_DIR}"
echo ""
echo "Then, run the Ansible playbook to install and configure FRR on your VM:"
echo "ansible-playbook playbook.yml -K"
echo ""
echo "After the playbook finishes, FRR should be running and configured on ${FRR_NODE_IP}."
echo "You can SSH into the VM and verify with 'sudo vtysh -c \"show ip bgp summary\"' and 'sudo ip route show'."

ois@ois:~/data/k8s$ ./ansible-frr.sh
--- Installing Ansible ---
Ansible is already installed.
--- Creating project directory: ansible-frr-setup ---
Created new directory: ansible-frr-setup
Changed to directory: /home/ois/data/k8s/ansible-frr-setup
--- Creating ansible.cfg ---
ansible.cfg created.
--- Creating inventory.ini ---
inventory.ini created.
--- Creating playbook.yml ---
playbook.yml created.
--- Creating Ansible role for FRR setup ---
FRR Ansible role created.

--- Ansible setup for FRR installation is complete! ---
Navigate to the new project directory:
cd ansible-frr-setup

Then, run the Ansible playbook to install and configure FRR on your VM:
ansible-playbook playbook.yml -K

After the playbook finishes, FRR should be running and configured on 10.75.59.76.
You can SSH into the VM and verify with 'sudo vtysh -c "show ip bgp summary"' and 'sudo ip route show'.
ois@ois:~/data/k8s$ cd ansible-frr-setup/
ois@ois:~/data/k8s/ansible-frr-setup$ ansible-playbook playbook.yml -K
BECOME password:

PLAY [Install and Configure FRRouting (FRR)] *******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [frr-node-1]

TASK [frr_setup : Install FRRouting (FRR)] *********************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Configure FRR daemons (enable zebra and bgpd)] ***********************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Configure frr.conf] **************************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Set permissions for frr.conf] ****************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

TASK [frr_setup : Enable and start FRR service] ****************************************************************************************************************************************************************************************************************
ok: [frr-node-1]

RUNNING HANDLER [frr_setup : Restart FRR service] **************************************************************************************************************************************************************************************************************
changed: [frr-node-1]

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
frr-node-1 : ok=7 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

root@dns-server-vm:~# cat /etc/frr/frr.conf 
!
hostname dns-server-vm
password zebra
enable password zebra
!
log syslog informational
!
router bgp 65000
bgp router-id 10.75.59.76
!
neighbor 10.75.59.71 remote-as 65000
neighbor 10.75.59.72 remote-as 65000
neighbor 10.75.59.73 remote-as 65000
!
address-family ipv4 unicast
# Crucial: Redistribute BGP learned routes into the kernel
redistribute connected
redistribute static
redistribute kernel
exit-address-family
!
line vty
!
root@dns-server-vm:~# systemctl status frr
* frr.service - FRRouting
Loaded: loaded (/usr/lib/systemd/system/frr.service; enabled; preset: enabled)
Active: active (running) since Wed 2025-07-23 12:16:58 JST; 1 week 1 day ago
Docs: https://frrouting.readthedocs.io/en/latest/setup.html
Process: 15611 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
Main PID: 15623 (watchfrr)
Status: "FRR Operational"
Tasks: 13 (limit: 9486)
Memory: 21.1M (peak: 28.3M)
CPU: 5min 23.845s
CGroup: /system.slice/frr.service
|-15623 /usr/lib/frr/watchfrr -d -F traditional zebra bgpd staticd
|-15636 /usr/lib/frr/zebra -d -F traditional -A 127.0.0.1 -s 90000000
|-15641 /usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1
`-15648 /usr/lib/frr/staticd -d -F traditional -A 127.0.0.1

Jul 31 16:26:25 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
Jul 31 16:27:24 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.73 in vrf default
Jul 31 16:27:24 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.72 in vrf default
Jul 31 16:27:24 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
Jul 31 16:46:48 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.72 in vrf default
Jul 31 16:46:48 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.73 in vrf default
Jul 31 16:46:48 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
Jul 31 16:47:54 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.73 in vrf default
Jul 31 16:47:54 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.72 in vrf default
Jul 31 16:47:54 dns-server-vm bgpd[15641]: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.75.59.71 in vrf default
root@dns-server-vm:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.76
172.16.0.0/24 nhid 95 via 10.75.59.71 dev enp1s0 proto bgp metric 20
172.16.1.0/24 nhid 90 via 10.75.59.72 dev enp1s0 proto bgp metric 20
172.16.2.0/24 nhid 100 via 10.75.59.73 dev enp1s0 proto bgp metric 20
172.16.16.1 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.16.10 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.20.119 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.22.26 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.23.18 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
172.16.30.170 nhid 202 proto bgp metric 20
nexthop via 10.75.59.72 dev enp1s0 weight 1
nexthop via 10.75.59.71 dev enp1s0 weight 1
nexthop via 10.75.59.73 dev enp1s0 weight 1
root@dns-server-vm:~# vtysh -c 'show ip bgp summary'

IPv4 Unicast Summary (VRF default):
BGP router identifier 10.75.59.76, local AS number 65000 vrf-id 0
BGP table version 222
RIB entries 19, using 3648 bytes of memory
Peers 3, using 2172 KiB of memory

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.75.59.71 4 65000 71265 71182 0 0 0 03:21:08 7 2 N/A
10.75.59.72 4 65000 71344 71264 0 0 0 03:21:09 7 2 N/A
10.75.59.73 4 65000 71240 71162 0 0 0 03:21:09 7 2 N/A

Total number of neighbors 3
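
Beyond the summary, a few more vtysh commands (suggested here, assuming the default FRR install) show exactly what is exchanged with each Cilium node:

vtysh -c 'show ip bgp'                                          # full BGP table: PodCIDR and Service prefixes
vtysh -c 'show ip bgp neighbors 10.75.59.71 routes'             # prefixes accepted from kube-node-1
vtysh -c 'show ip bgp neighbors 10.75.59.71 advertised-routes'  # prefixes FRR sends back to kube-node-1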

4. Setting Up the Kubernetes Cluster and Installing Cilium as the CNI

4.1 Setting Up Kubernetes with kubeadm

Initialize the cluster with the following commands:

kubeadm config images pull

kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy

--skip-phases=addon/kube-proxy skips the kube-proxy add-on; Cilium will take over its role.
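
The same flags can also be captured in a kubeadm configuration file, which is easier to keep under version control. A minimal sketch, assuming the v1beta4 kubeadm API shipped with v1.33 (this file is not part of the original setup):

# kubeadm-init.yaml -- equivalent to the command-line flags above
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
skipPhases:
  - addon/kube-proxy
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
controlPlaneEndpoint: kube-node-1
networking:
  podSubnet: 172.16.0.0/20
  serviceSubnet: 172.16.32.0/20

# then: kubeadm init --config kubeadm-init.yaml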
ubuntu@kube-node-1:~$ sudo kubeadm init --control-plane-endpoint=kube-node-1 --pod-network-cidr=172.16.0.0/20 --service-cidr=172.16.32.0/20 --skip-phases=addon/kube-proxy[sudo] password for ubuntu: 
[init] Using Kubernetes version: v1.33.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-node-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.248.0.1 10.75.59.71]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002649961s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://10.75.59.71:6443/livez
[control-plane-check] Checking kube-controller-manager at https://127.0.0.1:10257/healthz
[control-plane-check] Checking kube-scheduler at https://127.0.0.1:10259/livez
[control-plane-check] kube-controller-manager is healthy after 1.813351787s
[control-plane-check] kube-scheduler is healthy after 3.309147352s
[control-plane-check] kube-apiserver is healthy after 5.505049123s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 1r5ugd.o2pjzipcq69z71l8
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join kube-node-1:6443 --token 1r5ugd.o2pjzipcq69z71l8 \
--discovery-token-ca-cert-hash sha256:e29fb62581a4d21268585c3b345f9e060827c52a8325b1d28b8437c792ba7923 \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join kube-node-1:6443 --token 1r5ugd.o2pjzipcq69z71l8 \
--discovery-token-ca-cert-hash sha256:e29fb62581a4d21268585c3b345f9e060827c52a8325b1d28b8437c792ba7923
ubuntu@kube-node-1:~$
ubuntu@kube-node-1:~$ mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
ubuntu@kube-node-1:~$ sudo su
root@kube-node-1:/home/ubuntu# cd

Add the following line to root's .bashrc:
root@kube-node-1:~# cat .bashrc | grep export
export KUBECONFIG=/etc/kubernetes/admin.conf

This lets root run kubectl commands against the API server.

root@kube-node-1:~# kubectl cluster-info
Kubernetes control plane is running at https://kube-node-1:6443
CoreDNS is running at https://kube-node-1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
root@kube-node-1:~# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube-node-1 NotReady control-plane 68s v1.33.3 10.75.59.71 <none> Ubuntu 24.04.2 LTS 6.8.0-63-generic containerd://1.7.27

The node is NotReady because no CNI has been installed yet.

4.2 Installing Helm with Ansible

Helm is set up here only in order to install Cilium; using Ansible for this step is optional and shown for reference.
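
For a single control-plane node, the official installer script is an equally valid shortcut; it is shown here only as a reference alternative to the Ansible role:

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 -o get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh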

#!/bin/bash

# This script automates the setup of an Ansible environment for installing Helm.
# It creates the project directory, inventory, configuration, and the playbook
# with an idempotent role to install Helm.

# --- Configuration ---
PROJECT_DIR="ansible-helm"
MASTER_NODE_IP="10.75.59.71" # IP address of your Kubernetes master node (kube-node-1)
ANSIBLE_USER="ubuntu" # The user created by cloud-init on your VMs
SSH_PRIVATE_KEY_PATH="~/.ssh/id_rsa" # Path to your SSH private key on the Ansible control machine

# Helm version to install
HELM_VERSION="v3.18.4" # You can change this to a desired stable version

# --- Functions ---

# Function to create project directory and navigate into it
create_project_dir() {
echo "--- Creating project directory: ${PROJECT_DIR} ---"
# Check if directory exists, if so, just navigate, otherwise create and navigate
if [ ! -d "${PROJECT_DIR}" ]; then
mkdir -p "${PROJECT_DIR}"
echo "Created new directory: ${PROJECT_DIR}"
else
echo "Directory ${PROJECT_DIR} already exists."
fi
cd "${PROJECT_DIR}" || { echo "Failed to change directory to ${PROJECT_DIR}. Exiting."; exit 1; }
echo "Changed to directory: $(pwd)"
}

# Function to create ansible.cfg
create_ansible_cfg() {
echo "--- Creating ansible.cfg ---"
cat <<EOF > ansible.cfg
[defaults]
inventory = inventory.ini
roles_path = ./roles
host_key_checking = False # WARNING: Disable host key checking for convenience. Re-enable for production!
EOF
echo "ansible.cfg created."
}

# Function to create inventory.ini
create_inventory() {
echo "--- Creating inventory.ini ---"
cat <<EOF > inventory.ini
[kubernetes_master]
kube-node-1 ansible_host=${MASTER_NODE_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_ssh_private_key_file=${SSH_PRIVATE_KEY_PATH}
ansible_python_interpreter=/usr/bin/python3
HELM_VERSION=${HELM_VERSION}
EOF
echo "inventory.ini created."
}

# Function to create the main playbook.yml
create_playbook() {
echo "--- Creating playbook.yml ---"
cat <<EOF > playbook.yml
---
- name: Install Helm on Kubernetes Master Node
hosts: kubernetes_master
become: yes
environment: # Ensure KUBECONFIG is set for helm commands run with become
KUBECONFIG: /etc/kubernetes/admin.conf # Use the admin kubeconfig on the master
roles:
- helm_install
EOF
echo "playbook.yml created."
}

# Function to create the Helm installation role (with idempotent check)
create_helm_role() {
echo "--- Creating Ansible role for Helm installation ---"
mkdir -p roles/helm_install/tasks
cat <<EOF > roles/helm_install/tasks/main.yml
---
- name: Check if Helm is installed and get version
ansible.builtin.command: helm version --short
register: helm_version_raw
ignore_errors: yes
changed_when: false

- name: Set installed Helm version fact
ansible.builtin.set_fact:
installed_helm_version: "{{ (helm_version_raw.stdout | default('') | regex_findall('^(v[0-9]+\\\\.[0-9]+\\\\.[0-9]+)') | first | default('') | trim) }}"
changed_when: false

- name: Debug installed Helm version
ansible.builtin.debug:
msg: "Current installed Helm version: {{ installed_helm_version | default('Not installed') }}"

- name: Debug raw Helm version output
ansible.builtin.debug:
msg: "Raw Helm version output: {{ helm_version_raw.stdout | default('No output') }}"
when: helm_version_raw.stdout is defined and helm_version_raw.stdout | length > 0

- name: Check if Helm binary exists
ansible.builtin.stat:
path: /usr/local/bin/helm
register: helm_binary_stat
when: installed_helm_version == HELM_VERSION

- name: Download Helm tarball
ansible.builtin.get_url:
url: "https://get.helm.sh/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
dest: "/tmp/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
mode: '0644'
checksum: "sha256:{{ lookup('url', 'https://get.helm.sh/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz.sha256sum', wantlist=True)[0].split(' ')[0] }}"
register: download_helm_result
until: download_helm_result is success
retries: 5
delay: 5
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Create Helm installation directory
ansible.builtin.file:
path: /usr/local/bin
state: directory
mode: '0755'
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Extract Helm binary
ansible.builtin.unarchive:
src: "/tmp/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
dest: "/tmp"
remote_src: yes
creates: "/tmp/linux-amd64/helm"
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Move Helm binary to /usr/local/bin
ansible.builtin.copy:
src: "/tmp/linux-amd64/helm"
dest: "/usr/local/bin/helm"
mode: '0755'
remote_src: yes
owner: root
group: root
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Clean up Helm tarball and extracted directory
ansible.builtin.file:
path: "{{ item }}"
state: absent
loop:
- "/tmp/helm-{{ HELM_VERSION }}-linux-amd64.tar.gz"
- "/tmp/linux-amd64"
when: installed_helm_version != HELM_VERSION or not helm_binary_stat.stat.exists

- name: Verify Helm installation
ansible.builtin.command: helm version --client
register: helm_version_output
changed_when: false

- name: Display Helm version
ansible.builtin.debug:
msg: "{{ helm_version_output.stdout }}"
EOF
echo "Helm installation role created."
}

# --- Main execution ---
create_project_dir
create_ansible_cfg
create_inventory
create_playbook
create_helm_role

echo ""
echo "--- Ansible setup for Helm installation is complete! ---"
echo "Navigate to the new project directory:"
echo "cd ${PROJECT_DIR}"
echo ""
echo "Then, run the Ansible playbook to install only Helm on your master node:"
echo "ansible-playbook playbook.yml -K"
echo ""
echo "After Helm is installed, you can SSH into your master node (kube-node-1) and manage Cilium Enterprise installation directly using Helm."
echo "Remember to use the correct Cilium chart version and your custom values file."
echo "Example steps for manual Cilium installation via Helm:"
echo "ssh ubuntu@${MASTER_NODE_IP}"
echo "sudo helm repo add cilium https://helm.cilium.io/"
echo "sudo helm repo add isovalent https://helm.isovalent.com"
echo "sudo helm repo update"
echo "sudo helm install cilium isovalent/cilium --version 1.17.6 --namespace kube-system -f <path_to_your_cilium_values_file.yaml> --wait"
echo "Example content for /tmp/cilium-enterprise-values.yaml:"
echo "hubble:"
echo " enabled: true"
echo " relay:"
echo " enabled: true"
echo " ui:"
echo " enabled: false"
echo "kubeProxyReplacement: strict"
echo "ipam:"
echo " mode: kubernetes"
echo "ipv4NativeRoutingCIDR: 10.244.0.0/16"
echo "k8s:"
echo " requireIPv4PodCIDR: true"
echo "routingMode: native"
echo "autoDirectNodeRoutes: false"
echo "bgpControlPlane:"
echo " enabled: true"

The execution output is as follows:

ois@ois:~/data/k8s$ ./ansible-helm.sh 
--- Creating project directory: ansible-helm ---
Created new directory: ansible-helm
Changed to directory: /home/ois/data/k8s/ansible-helm
--- Creating ansible.cfg ---
ansible.cfg created.
--- Creating inventory.ini ---
inventory.ini created.
--- Creating playbook.yml ---
playbook.yml created.
--- Creating Ansible role for Helm installation ---
Helm installation role created.

--- Ansible setup for Helm installation is complete! ---
Navigate to the new project directory:
cd ansible-helm

Then, run the Ansible playbook to install only Helm on your master node:
ansible-playbook playbook.yml -K

After Helm is installed, you can SSH into your master node (kube-node-1) and manage Cilium Enterprise installation directly using Helm.
Remember to use the correct Cilium chart version and your custom values file.
Example steps for manual Cilium installation via Helm:
ssh ubuntu@10.75.59.71
sudo helm repo add cilium https://helm.cilium.io/
sudo helm repo add isovalent https://helm.isovalent.com
sudo helm repo update
sudo helm install cilium isovalent/cilium --version 1.17.6 --namespace kube-system -f <path_to_your_cilium_values_file.yaml> --wait
Example content for /tmp/cilium-enterprise-values.yaml:
hubble:
enabled: true
relay:
enabled: true
ui:
enabled: false
kubeProxyReplacement: strict
ipam:
mode: kubernetes
ipv4NativeRoutingCIDR: 172.16.0.0/20
k8s:
requireIPv4PodCIDR: true
routingMode: native
autoDirectNodeRoutes: false
bgpControlPlane:
enabled: true
ois@ois:~/data/k8s$ cd ansible-helm
ois@ois:~/data/k8s/ansible-helm$ ansible-playbook playbook.yml -K
BECOME password:

PLAY [Install Helm on Kubernetes Master Node] ******************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Check if Helm is installed and get version] ***********************************************************************************************************************************************************************************************
fatal: [kube-node-1]: FAILED! => {"changed": false, "cmd": "helm version --short", "msg": "[Errno 2] No such file or directory: b'helm'", "rc": 2, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

TASK [helm_install : Set installed Helm version fact] **********************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Debug installed Helm version] *************************************************************************************************************************************************************************************************************
ok: [kube-node-1] => {
"msg": "Current installed Helm version: "
}

TASK [helm_install : Debug raw Helm version output] ************************************************************************************************************************************************************************************************************
skipping: [kube-node-1]

TASK [helm_install : Check if Helm binary exists] **************************************************************************************************************************************************************************************************************
skipping: [kube-node-1]

TASK [helm_install : Download Helm tarball] ********************************************************************************************************************************************************************************************************************
changed: [kube-node-1]

TASK [helm_install : Create Helm installation directory] *******************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Extract Helm binary] **********************************************************************************************************************************************************************************************************************
changed: [kube-node-1]

TASK [helm_install : Move Helm binary to /usr/local/bin] *******************************************************************************************************************************************************************************************************
changed: [kube-node-1]

TASK [helm_install : Clean up Helm tarball and extracted directory] ********************************************************************************************************************************************************************************************
changed: [kube-node-1] => (item=/tmp/helm-v3.18.4-linux-amd64.tar.gz)
changed: [kube-node-1] => (item=/tmp/linux-amd64)

TASK [helm_install : Verify Helm installation] *****************************************************************************************************************************************************************************************************************
ok: [kube-node-1]

TASK [helm_install : Display Helm version] *********************************************************************************************************************************************************************************************************************
ok: [kube-node-1] => {
"msg": "version.BuildInfo{Version:\"v3.18.4\", GitCommit:\"d80839cf37d860c8aa9a0503fe463278f26cd5e2\", GitTreeState:\"clean\", GoVersion:\"go1.24.4\"}"
}

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************
kube-node-1 : ok=11 changed=4 unreachable=0 failed=0 skipped=2 rescued=0 ignored=1

4.3 Installing Cilium with Helm


root@kube-node-1:~# helm repo add cilium https://helm.cilium.io/
"cilium" has been added to your repositories
root@kube-node-1:~# helm repo add isovalent https://helm.isovalent.com
"isovalent" has been added to your repositories
root@kube-node-1:~#

Prepare the values file as follows:

root@kube-node-1:~# cat > cilium-enterprise-values.yaml <<EOF
hubble:
enabled: true
relay:
enabled: true
ui:
enabled: false

# Enable Gateway API
#gatewayAPI:
# enabled: true

# Explicitly disable Egress Gateway
#egressGateway:
# enabled: false

# BGP native-routing configuration
ipam:
mode: kubernetes
ipv4NativeRoutingCIDR: 172.16.0.0/20 # Advertises all pod CIDRs; ensure BGP router supports this
k8s:
requireIPv4PodCIDR: true
routingMode: native
autoDirectNodeRoutes: true
bgpControlPlane:
enabled: true
# Configure BGP peers (replace with your BGP router details)
announce:
podCIDR: true # Advertise pod CIDRs to BGP peers
enableIPv4Masquerade: true

# Enable kube-proxy replacement
kubeProxyReplacement: true

bpf:
masquerade: true
lb:
externalClusterIP: true
sock: true
EOF

root@kube-node-1:~# helm install cilium isovalent/cilium --version 1.17.6 \
--namespace kube-system \
--set kubeProxyReplacement=true \
--set k8sServiceHost=10.75.59.71 \
--set k8sServicePort=6443 \
-f cilium-enterprise-values.yaml
NAME: cilium
LAST DEPLOYED: Fri Aug 1 09:54:27 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay.

Your release version is 1.17.6.

For any further help, visit https://docs.isovalent.com/v1.17

The parameters k8sServiceHost=10.75.59.71 and k8sServicePort=6443 in the command above must not be omitted.
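
To double-check that the two parameters were picked up, the rendered agent configuration can be inspected; the key names below are those used by recent Cilium charts and are listed here as an assumption, not taken from the original session:

kubectl -n kube-system get configmap cilium-config -o yaml \
  | grep -E 'kube-proxy-replacement|k8s-service-host|k8s-service-port'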
Run kubeadm join on the other nodes:

ubuntu@kube-node-2:~$ sudo su
[sudo] password for ubuntu:
root@kube-node-2:/home/ubuntu# cd
root@kube-node-2:~# kubeadm join kube-node-1:6443 --token wnc2sl.st6g6c4o0cd42bi4 \
--discovery-token-ca-cert-hash sha256:381868d3e0faab6dbd3e240d8f40e0e81ab46cb54b2f15ffbfe0f587fac5d982
[preflight] Running pre-flight checks
[preflight] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[preflight] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0801 09:56:27.830756 47851 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:wnc2sl" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.503396779s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

After a short wait, Cilium and Hubble Relay become Ready.
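
While waiting, the rollout can be watched with (assumed commands, not part of the original session):

kubectl get nodes -w
kubectl -n kube-system rollout status daemonset/cilium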

root@kube-node-1:~# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-2vrgj 1/1 Running 0 3m58s 10.75.59.73 kube-node-3 <none> <none>
cilium-65kvc 1/1 Running 0 4m14s 10.75.59.72 kube-node-2 <none> <none>
cilium-envoy-24sd7 1/1 Running 0 4m14s 10.75.59.72 kube-node-2 <none> <none>
cilium-envoy-7pr4g 1/1 Running 0 6m12s 10.75.59.71 kube-node-1 <none> <none>
cilium-envoy-k86tp 1/1 Running 0 3m58s 10.75.59.73 kube-node-3 <none> <none>
cilium-operator-867fb7f659-2vnld 1/1 Running 0 6m12s 10.75.59.72 kube-node-2 <none> <none>
cilium-operator-867fb7f659-5998x 1/1 Running 0 6m12s 10.75.59.71 kube-node-1 <none> <none>
cilium-x4pr7 1/1 Running 0 6m12s 10.75.59.71 kube-node-1 <none> <none>
coredns-674b8bbfcf-6t8np 1/1 Running 0 13m 172.16.1.40 kube-node-2 <none> <none>
coredns-674b8bbfcf-bx8xd 1/1 Running 0 13m 172.16.1.64 kube-node-2 <none> <none>
etcd-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>
hubble-relay-cfb755899-gch8w 1/1 Running 0 6m12s 172.16.1.81 kube-node-2 <none> <none>
kube-apiserver-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>
kube-controller-manager-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>
kube-scheduler-kube-node-1 1/1 Running 1 13m 10.75.59.71 kube-node-1 <none> <none>

4.4 Installing the Enterprise cilium-cli

curl -L --remote-name-all https://github.com/isovalent/cilium-cli-releases/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}

sha256sum --check cilium-linux-amd64.tar.gz.sha256sum

tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

root@kube-node-1:~# cilium status
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: OK
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled

DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet cilium-envoy Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-envoy Running: 3
cilium-operator Running: 2
clustermesh-apiserver
hubble-relay Running: 1
Cluster Pods: 3/3 managed by Cilium
Helm chart version: 1.17.6
Image versions cilium quay.io/isovalent/cilium:v1.17.6-cee.1@sha256:2d01daf4f25f7d644889b49ca856e1a4269981fc963e50bd3962665b41b6adb3: 3
cilium-envoy quay.io/isovalent/cilium-envoy:v1.17.6-cee.1@sha256:318eff387835ca2717baab42a84f35a83a5f9e7d519253df87269f80b9ff0171: 3
cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265: 2
hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e: 1
root@kube-node-1:~#

4.5 Configure Cilium BGP

root@kube-node-1:~# cat > cilium-bgp.yaml << EOF
--- # BGP advertisement policy: announce Pod CIDRs and Service IPs
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: "PodCIDR"   # Only for Kubernetes or ClusterPool IPAM cluster-pool
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
          - ExternalIP
          #- LoadBalancerIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']} # effectively selects all Services

--- # BGP peer template (similar to a "template peer"); it references the advertisement policy above
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  timers:
    holdTimeSeconds: 30          # default 90s
    keepAliveTimeSeconds: 10     # default 30s
    connectRetryTimeSeconds: 40  # default 120s
  gracefulRestart:
    enabled: true
    restartTimeSeconds: 120      # default 120s
  #transport:
  #  peerPort: 179
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp"

--- # BGP neighbor configuration
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-default
spec:
  bgpInstances:
    - name: "instance-65000"
      localASN: 65000
      peers:
        - name: "GoBGP"
          peerASN: 65000
          peerAddress: 10.75.59.76
          peerConfigRef:
            name: "cilium-peer"
EOF
root@kube-node-1:~#
root@kube-node-1:~#
root@kube-node-1:~# kubectl apply -f cilium-bgp.yaml
ciliumbgpadvertisement.cilium.io/bgp-advertisements created
ciliumbgppeerconfig.cilium.io/cilium-peer created
ciliumbgpclusterconfig.cilium.io/cilium-bgp-default created
root@kube-node-1:~# cilium bgp peers
Node Local AS Peer AS Peer Address Session State Uptime Family Received Advertised
kube-node-1 65000 65000 10.75.59.76 established 7s ipv4/unicast 2 6
kube-node-2 65000 65000 10.75.59.76 established 6s ipv4/unicast 2 6
kube-node-3 65000 65000 10.75.59.76 established 6s ipv4/unicast 2 6
root@kube-node-1:~# cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node VRouter Prefix NextHop Age Attrs
kube-node-1 65000 172.16.0.0/24 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.37.239/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-2 65000 172.16.1.0/24 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.37.239/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-3 65000 172.16.3.0/24 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.37.239/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.10/32 0.0.0.0 12s [{Origin: i} {Nexthop: 0.0.0.0}]
root@kube-node-1:~#
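The advertised prefixes should also show up on the peer side. A quick check on the BGP peer VM at 10.75.59.76 (a sketch; it assumes the peer runs FRR with vtysh available, which is not shown in this capture):

vtysh -c 'show bgp summary'
vtysh -c 'show bgp ipv4 unicast'
ip route show proto bgp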

4.6 Install Hubble UI

root@kube-node-1:~# helm search repo isovalent/hubble-ui -l
NAME CHART VERSION APP VERSION DESCRIPTION
isovalent/hubble-ui 1.3.6 1.3.6 Hubble UI Enterprise
isovalent/hubble-ui 1.3.5 1.3.5 Hubble UI Enterprise

root@kube-node-1:~# cat > hubble-ui-values.yaml << EOF
relay:
address: "hubble-relay.kube-system.svc.cluster.local"
EOF
root@kube-node-1:~#
root@kube-node-1:~# helm install hubble-ui isovalent/hubble-ui --version 1.3.6 --namespace kube-system --values hubble-ui-values.yaml --wait
NAME: hubble-ui
LAST DEPLOYED: Fri Aug 1 10:47:58 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Hubble-Ui.
Your release version is 1.3.6.

For any further help, visit https://docs.isovalent.com
root@kube-node-1:~# kubectl patch service hubble-ui -n kube-system -p '{"spec": {"type": "NodePort"}}'
service/hubble-ui patched
root@kube-node-1:~# kubectl get svc -n kube-system -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
cilium-envoy ClusterIP None <none> 9964/TCP 54m k8s-app=cilium-envoy
hubble-peer ClusterIP 172.16.43.10 <none> 443/TCP 54m k8s-app=cilium
hubble-relay NodePort 172.16.37.239 <none> 80:31234/TCP 54m k8s-app=hubble-relay
hubble-ui NodePort 172.16.35.177 <none> 80:31225/TCP 64s k8s-app=hubble-ui
kube-dns ClusterIP 172.16.32.10 <none> 53/UDP,53/TCP,9153/TCP 61m k8s-app=kube-dns
root@kube-node-1:~# kubectl -n kube-system exec ds/cilium -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID Frontend Service Type Backend
1 172.16.32.1:443/TCP ClusterIP 1 => 10.75.59.71:6443/TCP (active)
2 172.16.43.10:443/TCP ClusterIP 1 => 10.75.59.71:4244/TCP (active)
3 172.16.37.239:80/TCP ClusterIP 1 => 172.16.1.81:4245/TCP (active)
4 10.75.59.71:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
5 0.0.0.0:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
6 172.16.32.10:53/TCP ClusterIP 1 => 172.16.1.64:53/TCP (active)
2 => 172.16.1.40:53/TCP (active)
7 172.16.32.10:9153/TCP ClusterIP 1 => 172.16.1.64:9153/TCP (active)
2 => 172.16.1.40:9153/TCP (active)
8 172.16.32.10:53/UDP ClusterIP 1 => 172.16.1.64:53/UDP (active)
2 => 172.16.1.40:53/UDP (active)
9 172.16.35.177:80/TCP ClusterIP 1 => 172.16.3.127:8081/TCP (active)
10 10.75.59.71:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)
11 0.0.0.0:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)

The Hubble UI is now reachable in a browser at http://10.75.59.71:31225/.

root@dns-server-vm:~# curl http://10.75.59.71:31225/
<!doctype html><html><head><meta charset="utf-8"/><title>Hubble UI Enterprise</title><meta http-equiv="X-UA-Compatible" content="IE=edge"/><meta name="viewport" content="width=device-width,user-scalable=0,initial-scale=1,minimum-scale=1,maximum-scale=1"/><link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png"/><link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png"/><link rel="shortcut icon" href="/favicon.ico"/><link rel="stylesheet" href="/fonts/inter/stylesheet.css"/><link rel="stylesheet" href="/fonts/roboto-mono/stylesheet.css"/><script defer="defer" src="/bundle.app.77bec96f333a96efe6ea.js"></script><link href="/bundle.app.f1e6c0c33f1535bc8508.css" rel="stylesheet"><script type="text/template" id="hubble-ui/feature-flags">[10, 0, 18, 0, 26, 0, 34, 0, 42, 0, 50, 0]</script><script type="text/template" id="hubble-ui/authorization">[8, 1, 26, 4, 24, 1, 32, 1]</script></head><body><div id="test-process-tree-char" style="font-family: 'Roboto Mono', monospace;
font-size: 16px;
position: absolute;
visibility: hidden;
height: auto;
width: auto;
white-space: nowrap;">a</div><div id="app"></div></body></html>root@dns-server-vm:~#
root@dns-server-vm:~#
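If exposing a NodePort is not wanted, the UI can also be reached with a plain port-forward from any machine that has kubectl access (a sketch, not part of the run above):

kubectl -n kube-system port-forward svc/hubble-ui 12000:80
# then browse to http://localhost:12000/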

5. Deploy the Star Wars App

5.1 Deploy the app

Isovalent provides a demo app; the deployment steps are shown below.

root@kube-node-1:~# kubectl create namespace star-wars
namespace/star-wars created
root@kube-node-1:~# kubectl apply -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
service/deathstar created
deployment.apps/deathstar created
pod/tiefighter created
pod/xwing created
root@kube-node-1:~# kubectl -n star-wars get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
deathstar-86f85ffb4d-4ldsj 1/1 Running 0 39s 172.16.1.231 kube-node-2 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
deathstar-86f85ffb4d-dbzft 1/1 Running 0 39s 172.16.3.161 kube-node-3 <none> <none> app.kubernetes.io/name=deathstar,class=deathstar,org=empire,pod-template-hash=86f85ffb4d
tiefighter 1/1 Running 0 39s 172.16.3.247 kube-node-3 <none> <none> app.kubernetes.io/name=tiefighter,class=tiefighter,org=empire
xwing 1/1 Running 0 39s 172.16.3.155 kube-node-3 <none> <none> app.kubernetes.io/name=xwing,class=xwing,org=alliance
root@kube-node-1:~# kubectl -n star-wars get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
deathstar ClusterIP 172.16.39.138 <none> 80/TCP 82s class=deathstar,org=empire
root@kube-node-1:~# kubectl -n star-wars patch service deathstar -p '{"spec":{"type":"NodePort"}}'
service/deathstar patched
root@kube-node-1:~# kubectl -n star-wars get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
deathstar NodePort 172.16.39.138 <none> 80:32271/TCP 112s class=deathstar,org=empire
root@kube-node-1:~# kubectl -n kube-system exec ds/cilium -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
ID Frontend Service Type Backend
1 172.16.32.1:443/TCP ClusterIP 1 => 10.75.59.71:6443/TCP (active)
2 172.16.43.10:443/TCP ClusterIP 1 => 10.75.59.71:4244/TCP (active)
3 172.16.37.239:80/TCP ClusterIP 1 => 172.16.1.81:4245/TCP (active)
4 10.75.59.71:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
5 0.0.0.0:31234/TCP NodePort 1 => 172.16.1.81:4245/TCP (active)
6 172.16.32.10:53/TCP ClusterIP 1 => 172.16.1.64:53/TCP (active)
2 => 172.16.1.40:53/TCP (active)
7 172.16.32.10:9153/TCP ClusterIP 1 => 172.16.1.64:9153/TCP (active)
2 => 172.16.1.40:9153/TCP (active)
8 172.16.32.10:53/UDP ClusterIP 1 => 172.16.1.64:53/UDP (active)
2 => 172.16.1.40:53/UDP (active)
9 172.16.35.177:80/TCP ClusterIP 1 => 172.16.3.127:8081/TCP (active)
10 10.75.59.71:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)
11 0.0.0.0:31225/TCP NodePort 1 => 172.16.3.127:8081/TCP (active)
12 172.16.39.138:80/TCP ClusterIP 1 => 172.16.3.161:80/TCP (active)
2 => 172.16.1.231:80/TCP (active)
13 10.75.59.71:32271/TCP NodePort 1 => 172.16.3.161:80/TCP (active)
2 => 172.16.1.231:80/TCP (active)
14 0.0.0.0:32271/TCP NodePort 1 => 172.16.3.161:80/TCP (active)
2 => 172.16.1.231:80/TCP (active)
At this point the deployment is complete.

root@kube-node-1:~# kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed
root@kube-node-1:~# kubectl -n star-wars exec xwing -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
Ship landed
root@kube-node-1:~#

External hosts can also reach the service through any node IP and the NodePort.

root@dns-server-vm:~# curl -s -XPOST http://10.75.59.72:32271/v1/request-landing
Ship landed
root@dns-server-vm:~# curl -s -XPOST http://10.75.59.71:32271/v1/request-landing
Ship landed
root@dns-server-vm:~# curl -s -XPOST http://10.75.59.73:32271/v1/request-landing
Ship landed
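Because the Service ClusterIPs are advertised over BGP (see the CiliumBGPAdvertisement above), the deathstar Service should in principle also be reachable from the BGP peer by its ClusterIP; a sketch, assuming the 172.16.32.0/20 routes are installed on that host and external ClusterIP access is enabled in the Cilium Helm values:

curl -s -XPOST http://172.16.39.138/v1/request-landing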

5.2 Packet capture outside the nodes (Pod to Pod)

Find the vNIC that attaches each VM to the bridge.

root@ois:/home/ois/data/k8s# virsh list
Id Name State
--------------------------------------
4 win1 running
17 r1 running
69 ubuntu-2404-desktop running
96 dns-server-vm running
97 u1 running
98 ubuntu24042 running
99 kube-node-1 running
100 kube-node-2 running
101 kube-node-3 running

root@ois:/home/ois/data/k8s# virsh domiflist 99
Interface Type Source Model MAC
-----------------------------------------------------------
vnet90 bridge br0 virtio 52:54:00:90:8c:cf

root@ois:/home/ois/data/k8s# virsh domiflist 100
Interface Type Source Model MAC
-----------------------------------------------------------
vnet91 bridge br0 virtio 52:54:00:ba:d4:1f

root@ois:/home/ois/data/k8s# virsh domiflist 101
Interface Type Source Model MAC
-----------------------------------------------------------
vnet92 bridge br0 virtio 52:54:00:37:e0:96


Start a request from inside a Pod.

root@kube-node-1:~# kubectl -n star-wars get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deathstar-86f85ffb4d-4ldsj 1/1 Running 0 4m1s 172.16.1.231 kube-node-2 <none> <none>
deathstar-86f85ffb4d-dbzft 1/1 Running 0 4m1s 172.16.3.161 kube-node-3 <none> <none>
tiefighter 1/1 Running 0 4m1s 172.16.3.247 kube-node-3 <none> <none>
xwing 1/1 Running 0 4m1s 172.16.3.155 kube-node-3 <none> <none>
root@kube-node-1:~# kubectl -n star-wars exec xwing -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
11: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether f2:2a:b8:da:e7:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.16.3.155/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::f02a:b8ff:feda:e7d2/64 scope link
valid_lft forever preferred_lft forever
root@kube-node-1:~# kubectl -n star-wars exec xwing -- ping 172.16.1.231
error: Internal error occurred: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "1b1120795a5b60f35bac5a4056a1714a5c0df32762e2a79f272cf2c82089e970": OCI runtime exec failed: exec failed: unable to start container process: exec: "ping": executable file not found in $PATH: unknown
root@kube-node-1:~# kubectl -n star-wars exec xwing -- curl -s http://172.16.1.231/v1
{
"name": "Death Star",
"hostname": "deathstar-86f85ffb4d-4ldsj",
"model": "DS-1 Orbital Battle Station",
"manufacturer": "Imperial Department of Military Research, Sienar Fleet Systems",
"cost_in_credits": "1000000000000",
"length": "120000",
"crew": "342953",
"passengers": "843342",
"cargo_capacity": "1000000000000",
"hyperdrive_rating": "4.0",
"starship_class": "Deep Space Mobile Battlestation",
"api": [
"GET /v1",
"GET /v1/healthz",
"POST /v1/request-landing",
"PUT /v1/cargobay",
"GET /v1/hyper-matter-reactor/status",
"PUT /v1/exhaust-port"
]
}

Capturing on the hypervisor shows that the two Pods talk to each other via direct routing, with the original Pod IPs on the wire.

root@ois:/home/ois/data/k8s# tcpdump -i vnet92 -vn 'tcp port 80'
tcpdump: listening on vnet92, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:11:11.478887 IP (tos 0x0, ttl 63, id 25688, offset 0, flags [DF], proto TCP (6), length 60)
172.16.3.155.37162 > 172.16.1.231.80: Flags [S], cksum 0x5dd1 (incorrect -> 0xdb07), seq 542023762, win 64240, options [mss 1460,sackOK,TS val 1624400247 ecr 0,nop,wscale 7], length 0
11:11:11.479178 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
172.16.1.231.80 > 172.16.3.155.37162: Flags [S.], cksum 0x5dd1 (incorrect -> 0x7b14), seq 3892089835, ack 542023763, win 65160, options [mss 1460,sackOK,TS val 727758081 ecr 1624400247,nop,wscale 7], length 0
11:11:11.479479 IP (tos 0x0, ttl 63, id 25689, offset 0, flags [DF], proto TCP (6), length 52)
172.16.3.155.37162 > 172.16.1.231.80: Flags [.], cksum 0x5dc9 (incorrect -> 0xa673), ack 1, win 502, options [nop,nop,TS val 1624400247 ecr 727758081], length 0
11:11:11.479552 IP (tos 0x0, ttl 63, id 25690, offset 0, flags [DF], proto TCP (6), length 130)
172.16.3.155.37162 > 172.16.1.231.80: Flags [P.], cksum 0x5e17 (incorrect -> 0x366d), seq 1:79, ack 1, win 502, options [nop,nop,TS val 1624400247 ecr 727758081], length 78: HTTP, length: 78
GET /v1 HTTP/1.1
Host: 172.16.1.231
User-Agent: curl/7.88.1
Accept: */*

11:11:11.479684 IP (tos 0x0, ttl 63, id 13836, offset 0, flags [DF], proto TCP (6), length 52)
172.16.1.231.80 > 172.16.3.155.37162: Flags [.], cksum 0x5dc9 (incorrect -> 0xa61e), ack 79, win 509, options [nop,nop,TS val 727758081 ecr 1624400247], length 0
11:11:11.480420 IP (tos 0x0, ttl 63, id 13837, offset 0, flags [DF], proto TCP (6), length 746)
172.16.1.231.80 > 172.16.3.155.37162: Flags [P.], cksum 0x607f (incorrect -> 0xc657), seq 1:695, ack 79, win 509, options [nop,nop,TS val 727758082 ecr 1624400247], length 694: HTTP, length: 694
HTTP/1.1 200 OK
Content-Type: text/plain
Date: Fri, 01 Aug 2025 03:11:11 GMT
Content-Length: 591

{
"name": "Death Star",
"hostname": "deathstar-86f85ffb4d-4ldsj",
"model": "DS-1 Orbital Battle Station",
"manufacturer": "Imperial Department of Military Research, Sienar Fleet Systems",
"cost_in_credits": "1000000000000",
"length": "120000",
"crew": "342953",
"passengers": "843342",
"cargo_capacity": "1000000000000",
"hyperdrive_rating": "4.0",
"starship_class": "Deep Space Mobile Battlestation",
"api": [
"GET /v1",
"GET /v1/healthz",
"POST /v1/request-landing",
"PUT /v1/cargobay",
"GET /v1/hyper-matter-reactor/status",
"PUT /v1/exhaust-port"
]
}

Cilium automatically installed the routes to each node's Pod CIDR:

root@kube-node-3:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.73
172.16.0.0/24 via 10.75.59.71 dev enp1s0 proto kernel
172.16.1.0/24 via 10.75.59.72 dev enp1s0 proto kernel
172.16.3.0/24 via 172.16.3.22 dev cilium_host proto kernel src 172.16.3.22
172.16.3.22 dev cilium_host proto kernel scope link
root@kube-node-3:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.75.59.1 0.0.0.0 UG 0 0 0 enp1s0
10.75.59.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
172.16.0.0 10.75.59.71 255.255.255.0 UG 0 0 0 enp1s0
172.16.1.0 10.75.59.72 255.255.255.0 UG 0 0 0 enp1s0
172.16.3.0 172.16.3.22 255.255.255.0 UG 0 0 0 cilium_host
172.16.3.22 0.0.0.0 255.255.255.255 UH 0 0 0 cilium_host

root@kube-node-2:~# ip route show
default via 10.75.59.1 dev enp1s0 proto static
10.75.59.0/24 dev enp1s0 proto kernel scope link src 10.75.59.72
172.16.0.0/24 via 10.75.59.71 dev enp1s0 proto kernel
172.16.1.0/24 via 172.16.1.128 dev cilium_host proto kernel src 172.16.1.128
172.16.1.128 dev cilium_host proto kernel scope link
172.16.3.0/24 via 10.75.59.73 dev enp1s0 proto kernel
root@kube-node-2:~# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.75.59.1 0.0.0.0 UG 0 0 0 enp1s0
10.75.59.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
172.16.0.0 10.75.59.71 255.255.255.0 UG 0 0 0 enp1s0
172.16.1.0 172.16.1.128 255.255.255.0 UG 0 0 0 cilium_host
172.16.1.128 0.0.0.0 255.255.255.255 UH 0 0 0 cilium_host
172.16.3.0 10.75.59.73 255.255.255.0 UG 0 0 0 enp1s0
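These are the routes that native routing with autoDirectNodeRoutes installs: each node has a direct route to every other node's Pod CIDR via the node IP. The datapath settings can be double-checked from the agent (a sketch):

kubectl -n kube-system exec ds/cilium -- cilium status | grep -i -E 'routing|masquerading'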

In the Hubble UI the flows can be inspected in detail; an example flow record is shown below.

{
  "uuid": "2af76725-f6f9-49b4-b9f1-8d01b3f14930",
  "verdict": 1,
  "drop_reason": 0,
  "auth_type": 0,
  "Type": 1,
  "node_name": "kube-node-2",
  "node_labels": [
    "beta.kubernetes.io/arch=amd64",
    "beta.kubernetes.io/os=linux",
    "kubernetes.io/arch=amd64",
    "kubernetes.io/hostname=kube-node-2",
    "kubernetes.io/os=linux"
  ],
  "source_names": [],
  "destination_names": [],
  "reply": false,
  "traffic_direction": 2,
  "policy_match_type": 0,
  "trace_observation_point": 101,
  "trace_reason": 1,
  "drop_reason_desc": 0,
  "debug_capture_point": 0,
  "proxy_port": 0,
  "sock_xlate_point": 0,
  "socket_cookie": 0,
  "cgroup_id": 0,
  "Summary": "TCP Flags: SYN",
  "egress_allowed_by": [],
  "ingress_allowed_by": [],
  "egress_denied_by": [],
  "ingress_denied_by": [],
  "time": {
    "seconds": 1754022736,
    "nanos": 962926009
  },
  "ethernet": {
    "source": "f2:64:9f:b5:8e:81",
    "destination": "82:31:36:d1:5a:00"
  },
  "IP": {
    "source": "172.16.3.155",
    "source_xlated": "",
    "destination": "172.16.1.231",
    "ipVersion": 1,
    "encrypted": false
  },
  "l4": {
    "protocol": {
      "oneofKind": "TCP",
      "TCP": {
        "source_port": 38680,
        "destination_port": 80,
        "flags": {
          "FIN": false,
          "SYN": true,
          "RST": false,
          "PSH": false,
          "ACK": false,
          "URG": false,
          "ECE": false,
          "CWR": false,
          "NS": false
        }
      }
    }
  },
  "source": {
    "ID": 0,
    "identity": 36770,
    "cluster_name": "default",
    "namespace": "star-wars",
    "labels": [
      "k8s:app.kubernetes.io/name=xwing",
      "k8s:class=xwing",
      "k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=star-wars",
      "k8s:io.cilium.k8s.policy.cluster=default",
      "k8s:io.cilium.k8s.policy.serviceaccount=default",
      "k8s:io.kubernetes.pod.namespace=star-wars",
      "k8s:org=alliance"
    ],
    "pod_name": "xwing",
    "workloads": []
  },
  "destination": {
    "ID": 284,
    "identity": 15153,
    "cluster_name": "default",
    "namespace": "star-wars",
    "labels": [
      "k8s:app.kubernetes.io/name=deathstar",
      "k8s:class=deathstar",
      "k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=star-wars",
      "k8s:io.cilium.k8s.policy.cluster=default",
      "k8s:io.cilium.k8s.policy.serviceaccount=default",
      "k8s:io.kubernetes.pod.namespace=star-wars",
      "k8s:org=empire"
    ],
    "pod_name": "deathstar-86f85ffb4d-4ldsj",
    "workloads": [
      {
        "name": "deathstar",
        "kind": "Deployment"
      }
    ]
  },
  "event_type": {
    "type": 4,
    "sub_type": 0
  },
  "is_reply": {
    "value": false
  },
  "interface": {
    "index": 14,
    "name": "lxcf734e435e18a"
  }
}
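The same flow records can be pulled from the command line through Hubble Relay; a sketch, assuming the hubble CLI is installed (this lab did not install it explicitly):

cilium hubble port-forward &
hubble observe --namespace star-wars --pod xwing --follow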

5.3 Packet capture outside the nodes (Pod to External)

Next, test Internet access from inside a Pod.

root@kube-node-1:~# kubectl get configmap cilium-config -n kube-system -o yaml | grep -E 'enable-ipv4-masquerade|enable-bpf-masquerade'
enable-bpf-masquerade: "true"
enable-ipv4-masquerade: "true"
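With both masquerade options enabled, traffic leaving the cluster from a Pod IP is SNAT-ed to the node IP by the BPF datapath. This can also be confirmed from the agent itself (a sketch):

kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -i masquerading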

Pod xwing runs on kube-node-3, so look up the Cilium agent on that node.

root@kube-node-1:~# kubectl get pods -n kube-system -l k8s-app=cilium -o wide | grep kube-node-3
cilium-2vrgj 1/1 Running 0 106m 10.75.59.73 kube-node-3 <none> <none>

root@kube-node-1:~# kubectl -n star-wars exec xwing -- curl -s https://echo.free.beeceptor.com
{
"method": "GET",
"protocol": "https",
"host": "echo.free.beeceptor.com",
"path": "/",
"ip": "64.104.44.105:35834",
"headers": {
"Host": "echo.free.beeceptor.com",
"User-Agent": "curl/7.88.1",
"Accept": "*/*",
"Via": "2.0 Caddy",
"Accept-Encoding": "gzip"
},
"parsedQueryParams": {}
}root@kube-node-1:~#
root@kube-node-1:~#

Use cilium bpf nat list to inspect the NAT table.

root@kube-node-1:~# kubectl exec -n kube-system cilium-2vrgj -- cilium bpf nat list | grep 172.16.3.155
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
TCP OUT 172.16.3.155:35834 -> 147.182.252.2:443 XLATE_SRC 10.75.59.73:35834 Created=4sec ago NeedsCT=0
TCP IN 147.182.252.2:443 -> 10.75.59.73:35834 XLATE_DST 172.16.3.155:35834 Created=4sec ago NeedsCT=0
root@kube-node-1:~#
root@kube-node-1:~#

Packet capture outside the node:
root@ois:/home/ois/data/k8s# tcpdump -i vnet92 -vn 'tcp port 443'
tcpdump: listening on vnet92, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:28:22.307306 IP (tos 0x0, ttl 63, id 45759, offset 0, flags [DF], proto TCP (6), length 60)
10.75.59.73.49208 > 147.182.252.2.443: Flags [S], cksum 0xd57b (incorrect -> 0x363a), seq 3275650602, win 64240, options [mss 1460,sackOK,TS val 493037768 ecr 0,nop,wscale 7], length 0
12:28:22.477754 IP (tos 0x0, ttl 42, id 0, offset 0, flags [DF], proto TCP (6), length 60)
147.182.252.2.443 > 10.75.59.73.49208: Flags [S.], cksum 0xed16 (correct), seq 450982431, ack 3275650603, win 65160, options [mss 1254,sackOK,TS val 1573215106 ecr 493037768,nop,wscale 7], length 0
12:28:22.478053 IP (tos 0x0, ttl 63, id 45760, offset 0, flags [DF], proto TCP (6), length 52)
10.75.59.73.49208 > 147.182.252.2.443: Flags [.], cksum 0xd573 (incorrect -> 0x16fd), ack 1, win 502, options [nop,nop,TS val 493037939 ecr 1573215106], length 0
12:28:22.490922 IP (tos 0x0, ttl 63, id 45761, offset 0, flags [DF], proto TCP (6), length 569)
10.75.59.73.49208 > 147.182.252.2.443: Flags [P.], cksum 0xd778 (incorrect -> 0x1d27), seq 1:518, ack 1, win 502, options [nop,nop,TS val 493037952 ecr 1573215106], length 517

6. Automating the K8s and Cilium installation

The next step is to automate the K8s and Cilium installation with a script.

First, kube-node-1 must be able to log in to kube-node-2 and kube-node-3 without a password.

6.1 Passwordless SSH setup

#!/bin/bash

# --- Configuration ---
ANSIBLE_DIR="ansible_ssh_setup"
INVENTORY_FILE="${ANSIBLE_DIR}/hosts.ini"
PLAYBOOK_FILE="${ANSIBLE_DIR}/setup_ssh.yml"

# Kubernetes Node IPs
KUBE_NODE_1_IP="10.75.59.71"
KUBE_NODE_2_IP="10.75.59.72"
KUBE_NODE_3_IP="10.75.59.73"

# Common Ansible user and Python interpreter
ANSIBLE_USER="ubuntu"
ANSIBLE_PYTHON_INTERPRETER="/usr/bin/python3"

# --- Functions ---

# Function to check and install Ansible
install_ansible() {
if ! command -v ansible &> /dev/null
then
echo "Ansible not found. Attempting to install Ansible..."
if [ -f /etc/debian_version ]; then
# Debian/Ubuntu
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
elif [ -f /etc/redhat-release ]; then
# CentOS/RHEL/Fedora
sudo yum install -y epel-release
sudo yum install -y ansible
else
echo "Unsupported OS for automatic Ansible installation. Please install Ansible manually."
exit 1
fi
if ! command -v ansible &> /dev/null; then
echo "Ansible installation failed. Please install it manually and re-run this script."
exit 1
fi
echo "Ansible installed successfully."
else
echo "Ansible is already installed."
fi
}

# Function to create Ansible inventory file
create_inventory() {
echo "Creating Ansible inventory file: ${INVENTORY_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<EOF > "$INVENTORY_FILE"
[kubernetes_nodes]
kube-node-1 ansible_host=${KUBE_NODE_1_IP}
kube-node-2 ansible_host=${KUBE_NODE_2_IP}
kube-node-3 ansible_host=${KUBE_NODE_3_IP}

[all:vars]
ansible_user=${ANSIBLE_USER}
ansible_python_interpreter=${ANSIBLE_PYTHON_INTERPRETER}
EOF
echo "Inventory file created."
}

# Function to create Ansible playbook file
create_playbook() {
echo "Creating Ansible playbook file: ${PLAYBOOK_FILE}"
mkdir -p "$ANSIBLE_DIR"
cat <<'EOF' > "$PLAYBOOK_FILE"
---
- name: Generate SSH key on kube-node-1 and distribute to other nodes
  hosts: kubernetes_nodes
  become: yes

  tasks:
    - name: Generate SSH key on kube-node-1
      ansible.builtin.command:
        cmd: ssh-keygen -t rsa -b 4096 -N "" -f /root/.ssh/id_rsa
        creates: /root/.ssh/id_rsa
      when: inventory_hostname == 'kube-node-1'

    - name: Ensure .ssh directory exists on all nodes
      ansible.builtin.file:
        path: /root/.ssh
        state: directory
        mode: '0700'

    - name: Ensure authorized_keys file exists
      ansible.builtin.file:
        path: /root/.ssh/authorized_keys
        state: touch
        mode: '0600'

    - name: Fetch public key from kube-node-1
      ansible.builtin.slurp:
        src: /root/.ssh/id_rsa.pub
      register: ssh_public_key
      when: inventory_hostname == 'kube-node-1'

    - name: Distribute public key to kube-node-2 and kube-node-3
      ansible.builtin.lineinfile:
        path: /root/.ssh/authorized_keys
        line: "{{ hostvars['kube-node-1']['ssh_public_key']['content'] | b64decode }}"
        state: present
      when: inventory_hostname in ['kube-node-2', 'kube-node-3']
EOF
echo "Playbook file created."
}

# --- Main Script Execution ---

echo "Starting Ansible SSH key setup process..."

# 1. Install Ansible if not present
install_ansible

# 2. Create Ansible inventory file
create_inventory

# 3. Create Ansible playbook file
create_playbook

echo "Setup complete. You can now run the Ansible playbook manually using:"
echo "ansible-playbook -i \"$INVENTORY_FILE\" \"$PLAYBOOK_FILE\" --ask-become-pass"
echo "You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs."
echo "Process complete."
chmod +x ansible_ssh.sh
./ansible_ssh.sh

ois@ois:~/data/k8s$ ./ansible_ssh.sh
Starting Ansible SSH key setup process...
Ansible is already installed.
Creating Ansible inventory file: ansible_ssh_setup/hosts.ini
Inventory file created.
Creating Ansible playbook file: ansible_ssh_setup/setup_ssh.yml
Playbook file created.
Setup complete. You can now run the Ansible playbook manually using:
ansible-playbook -i "ansible_ssh_setup/hosts.ini" "ansible_ssh_setup/setup_ssh.yml" --ask-become-pass
You will be prompted for the 'sudo' password for the 'ubuntu' user on your VMs.
Process complete.
ois@ois:~/data/k8s$ cd ansible_ssh_setup/
ois@ois:~/data/k8s/ansible_ssh_setup$ ansible-playbook setup_ssh.yml -i hosts.ini -K
BECOME password:

PLAY [Generate SSH key on kube-node-1 and distribute to other nodes] ********************************************************************************************************

TASK [Gathering Facts] ******************************************************************************************************************************************************
ok: [kube-node-1]
ok: [kube-node-3]
ok: [kube-node-2]

TASK [Generate SSH key on kube-node-1] **************************************************************************************************************************************
skipping: [kube-node-2]
skipping: [kube-node-3]
changed: [kube-node-1]

TASK [Ensure .ssh directory exists on all nodes] ****************************************************************************************************************************
ok: [kube-node-2]
ok: [kube-node-1]
ok: [kube-node-3]

TASK [Ensure authorized_keys file exists] ***********************************************************************************************************************************
changed: [kube-node-1]
changed: [kube-node-2]
changed: [kube-node-3]

TASK [Fetch public key from kube-node-1] ************************************************************************************************************************************
skipping: [kube-node-2]
skipping: [kube-node-3]
ok: [kube-node-1]

TASK [Distribute public key to kube-node-2 and kube-node-3] *****************************************************************************************************************
skipping: [kube-node-1]
changed: [kube-node-3]
changed: [kube-node-2]

PLAY RECAP ******************************************************************************************************************************************************************
kube-node-1 : ok=5 changed=2 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
kube-node-2 : ok=4 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
kube-node-3 : ok=4 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0

Result:

root@kube-node-1:~# ssh root@10.75.59.72
Welcome to Ubuntu 24.04.2 LTS (GNU/Linux 6.8.0-63-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/pro

System information as of Fri Aug 1 02:28:23 PM CST 2025

System load: 0.13 Processes: 188
Usage of /: 30.2% of 18.33GB Users logged in: 1
Memory usage: 10% IPv4 address for enp1s0: 10.75.59.72
Swap usage: 0%

* Strictly confined Kubernetes makes edge and IoT secure. Learn how MicroK8s
just raised the bar for easy, resilient and secure K8s cluster deployment.

https://ubuntu.com/engage/secure-kubernetes-at-the-edge

Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status

*** System restart required ***
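For reference, the playbook is roughly equivalent to the following manual steps on kube-node-1 (a sketch; it assumes the key can be copied as root, e.g. with password authentication temporarily enabled, which the playbook avoids by using Ansible's become):

ssh-keygen -t rsa -b 4096 -N "" -f /root/.ssh/id_rsa
for ip in 10.75.59.72 10.75.59.73; do
  ssh-copy-id -i /root/.ssh/id_rsa.pub root@${ip}
done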

6.2 Automated installation and setup script

#!/bin/bash

# ==============================================================================
# Idempotent Kubernetes and Cilium Setup Script
#
# This script can be run multiple times. It checks the current state
# at each step and only performs actions if necessary.
# It must be run as root on the primary control-plane node.
# ==============================================================================

# --- Configuration ---
CONTROL_PLANE_ENDPOINT="kube-node-1"
CONTROL_PLANE_IP="10.75.59.71"
WORKER_NODES=("10.75.59.72" "10.75.59.73")
POD_CIDR="172.16.0.0/20"
SERVICE_CIDR="172.16.32.0/20"
BGP_PEER_IP="10.75.59.76"
LOCAL_ASN=65000
PEER_ASN=65000
CILIUM_VERSION="1.17.6"
HUBBLE_UI_VERSION="1.3.6"

# ==============================================================================
# Helper Function
# ==============================================================================
print_header() { echo -e "\n### $1 ###"; }

# ==============================================================================
# STEP 1: Initialize Kubernetes Control-Plane
# ==============================================================================
print_header "STEP 1: Initializing Kubernetes Control-Plane"

if kubectl get nodes &> /dev/null; then
echo "✅ Kubernetes cluster is already running. Skipping kubeadm init."
else
echo "--> Kubernetes cluster not found. Initializing..."
kubeadm config images pull
kubeadm init \
--control-plane-endpoint=${CONTROL_PLANE_ENDPOINT} \
--pod-network-cidr=${POD_CIDR} \
--service-cidr=${SERVICE_CIDR} \
--skip-phases=addon/kube-proxy
mkdir -p /root/.kube
cp -i /etc/kubernetes/admin.conf /root/.kube/config
echo "✅ Control-Plane initialization complete."
fi

# ==============================================================================
# STEP 2: Install or Upgrade Cilium CNI
# ==============================================================================
print_header "STEP 2: Installing or Upgrading Cilium CNI"

helm repo add cilium https://helm.cilium.io/ &> /dev/null
helm repo add isovalent https://helm.isovalent.com/ &> /dev/null
helm repo update > /dev/null

cat > cilium-values.yaml <<EOF
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: false
ipam:
  mode: kubernetes
ipv4NativeRoutingCIDR: ${POD_CIDR}
k8s:
  requireIPv4PodCIDR: true
routingMode: native
autoDirectNodeRoutes: true
enableIPv4Masquerade: true
bgpControlPlane:
  enabled: true
  announce:
    podCIDR: true
kubeProxyReplacement: true
bpf:
  masquerade: true
  lb:
    externalClusterIP: true
    sock: true
EOF

if helm status cilium -n kube-system &> /dev/null; then
echo "--> Cilium is already installed. Upgrading to apply latest configuration..."
helm upgrade cilium isovalent/cilium --version ${CILIUM_VERSION} --namespace kube-system --set k8sServiceHost=${CONTROL_PLANE_IP},k8sServicePort=6443 -f cilium-values.yaml
else
echo "--> Cilium not found. Installing..."
helm install cilium isovalent/cilium --version ${CILIUM_VERSION} --namespace kube-system --set k8sServiceHost=${CONTROL_PLANE_IP},k8sServicePort=6443 -f cilium-values.yaml
fi
echo "--> Waiting for Cilium pods to become ready..."
kubectl -n kube-system wait --for=condition=Ready pod -l k8s-app=cilium --timeout=5m
echo "✅ Cilium is configured."

# ==============================================================================
# STEP 3: Join Worker Nodes to the Cluster
# ==============================================================================
print_header "STEP 3: Joining Worker Nodes"

for NODE_IP in "${WORKER_NODES[@]}"; do
if kubectl get nodes -o wide | grep -q "$NODE_IP"; then
echo "✅ Node ${NODE_IP} is already in the cluster. Skipping join."
else
echo "--> Node ${NODE_IP} not found in cluster. Attempting to join..."
JOIN_COMMAND=$(kubeadm token create --print-join-command)
ssh -o StrictHostKeyChecking=no root@${NODE_IP} "${JOIN_COMMAND}"
if [ $? -ne 0 ]; then
echo "❌ Failed to join node ${NODE_IP}. Please check SSH connectivity and logs." >&2
exit 1
fi
echo "✅ Node ${NODE_IP} joined successfully."
fi
done

# ==============================================================================
# STEP 4: Install Cilium CLI
# ==============================================================================
print_header "STEP 4: Installing Cilium CLI"

if command -v cilium &> /dev/null; then
echo "✅ Cilium CLI is already installed. Skipping."
else
echo "--> Installing Cilium CLI..."
curl -L --silent --remote-name-all https://github.com/isovalent/cilium-cli-releases/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-amd64.tar.gz.sha256sum > /dev/null
tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin > /dev/null
rm cilium-linux-amd64.tar.gz cilium-linux-amd64.tar.gz.sha256sum
echo "✅ Cilium CLI installed."
fi

# ==============================================================================
# STEP 5: Configure Cilium BGP Peering
# ==============================================================================
print_header "STEP 5: Configuring Cilium BGP Peering"

echo "--> Applying BGP configuration. 'unchanged' means it's already correct."
# CORRECTION: Using the correct schema with matchLabels.
cat > cilium-bgp.yaml << EOF
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: "PodCIDR"   # Only for Kubernetes or ClusterPool IPAM cluster-pool
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
          - ExternalIP
          #- LoadBalancerIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']}

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  timers:
    holdTimeSeconds: 30          # default 90s
    keepAliveTimeSeconds: 10     # default 30s
    connectRetryTimeSeconds: 40  # default 120s
  gracefulRestart:
    enabled: true
    restartTimeSeconds: 120      # default 120s
  #transport:
  #  peerPort: 179
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp"

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-default
spec:
  bgpInstances:
    - name: "instance-65000"
      localASN: ${LOCAL_ASN}
      peers:
        - name: "FRR_BGP"
          peerASN: ${PEER_ASN}
          peerAddress: ${BGP_PEER_IP}
          peerConfigRef:
            name: "cilium-peer"
EOF

# Apply the configuration and check for errors
kubectl apply -f cilium-bgp.yaml
if [ $? -ne 0 ]; then
echo "❌ Failed to apply BGP configuration. Please check the errors above." >&2
exit 1
fi
echo "✅ BGP configuration applied."

# ==============================================================================
# STEP 6: Install or Upgrade Hubble UI
# ==============================================================================
print_header "STEP 6: Installing or Upgrading Hubble UI"

cat > hubble-ui-values.yaml << EOF
relay:
address: "hubble-relay.kube-system.svc.cluster.local"
EOF

if helm status hubble-ui -n kube-system &> /dev/null; then
echo "--> Hubble UI is already installed. Upgrading..."
helm upgrade hubble-ui isovalent/hubble-ui --version ${HUBBLE_UI_VERSION} --namespace kube-system --values hubble-ui-values.yaml --wait
else
echo "--> Hubble UI not found. Installing..."
helm install hubble-ui isovalent/hubble-ui --version ${HUBBLE_UI_VERSION} --namespace kube-system --values hubble-ui-values.yaml --wait
fi

SERVICE_TYPE=$(kubectl get service hubble-ui -n kube-system -o jsonpath='{.spec.type}')
if [ "$SERVICE_TYPE" != "NodePort" ]; then
echo "--> Patching Hubble UI service to NodePort..."
kubectl patch service hubble-ui -n kube-system -p '{"spec": {"type": "NodePort"}}'
else
echo "--> Hubble UI service is already of type NodePort."
fi
echo "✅ Hubble UI is configured."

# ==============================================================================
# STEP 7: Final Verification
# ==============================================================================
print_header "STEP 7: Final Verification"
echo "--> Waiting for all nodes to be ready..."
kubectl wait --for=condition=Ready node --all --timeout=5m
echo "--> Checking Cilium status..."
cilium status --wait

HUBBLE_UI_PORT=$(kubectl get service hubble-ui -n kube-system -o jsonpath='{.spec.ports[0].nodePort}')
echo -e "\n----------------------------------------------------------------"
echo "🚀 Cluster setup is complete and verified!"
echo "Access Hubble UI at: http://${CONTROL_PLANE_IP}:${HUBBLE_UI_PORT}"
echo "----------------------------------------------------------------"
root@kube-node-1:~# ./k8s-cilium-setup.sh 
### STEP 1: Initializing Kubernetes Control-Plane on kube-node-1... ###
--> Pulling Kubernetes container images...
[config/images] Pulled registry.k8s.io/kube-apiserver:v1.33.3
[config/images] Pulled registry.k8s.io/kube-controller-manager:v1.33.3
[config/images] Pulled registry.k8s.io/kube-scheduler:v1.33.3
[config/images] Pulled registry.k8s.io/kube-proxy:v1.33.3
[config/images] Pulled registry.k8s.io/coredns/coredns:v1.12.0
[config/images] Pulled registry.k8s.io/pause:3.10
[config/images] Pulled registry.k8s.io/etcd:3.5.21-0
--> Running kubeadm init...
[init] Using Kubernetes version: v1.33.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-node-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [172.16.32.1 10.75.59.71]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-node-1 localhost] and IPs [10.75.59.71 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.001873284s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://10.75.59.71:6443/livez
[control-plane-check] Checking kube-controller-manager at https://127.0.0.1:10257/healthz
[control-plane-check] Checking kube-scheduler at https://127.0.0.1:10259/livez
[control-plane-check] kube-controller-manager is healthy after 2.272937689s
[control-plane-check] kube-scheduler is healthy after 3.04977489s
[control-plane-check] kube-apiserver is healthy after 5.003048769s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node kube-node-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 0q5m6l.5pc7hz15orcc0b6b
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join kube-node-1:6443 --token 0q5m6l.5pc7hz15orcc0b6b \
--discovery-token-ca-cert-hash sha256:4795595e8237c54f1bf20c7fb56feea9a1960af5802c08c410733d54b5e317a6 \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join kube-node-1:6443 --token 0q5m6l.5pc7hz15orcc0b6b \
--discovery-token-ca-cert-hash sha256:4795595e8237c54f1bf20c7fb56feea9a1960af5802c08c410733d54b5e317a6
--> Configuring kubectl...
✅ Control-Plane initialization complete.
### STEP 2: Installing Cilium... ###
--> Adding Helm repositories...
"cilium" has been added to your repositories
"isovalent" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "cilium" chart repository
...Successfully got an update from the "isovalent" chart repository
Update Complete. ⎈Happy Helming!⎈
--> Creating cilium-enterprise-values.yaml...
--> Installing Cilium with Helm...
NAME: cilium
LAST DEPLOYED: Fri Aug 1 14:16:16 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay.

Your release version is 1.17.6.

For any further help, visit https://docs.isovalent.com/v1.17
--> Waiting for Cilium pods to become ready...
pod/cilium-5ngcm condition met
✅ Cilium installation complete.
### STEP 3: Joining Worker Nodes... ###
--> Generated join command: kubeadm join kube-node-1:6443 --token whltm8.4zs5af6hiht167da --discovery-token-ca-cert-hash sha256:4795595e8237c54f1bf20c7fb56feea9a1960af5802c08c410733d54b5e317a6
--> Joining node 10.75.59.72 to the cluster...
[preflight] Running pre-flight checks
[preflight] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[preflight] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0801 14:18:01.674767 20870 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:whltm8" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.501212696s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

✅ Node 10.75.59.72 joined successfully.
--> Joining node 10.75.59.73 to the cluster...
[preflight] Running pre-flight checks
[preflight] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[preflight] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0801 14:18:07.347008 20684 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:whltm8" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002187345s
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

✅ Node 10.75.59.73 joined successfully.
### STEP 4: Installing the Cilium CLI... ###
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
100 59.2M 100 59.2M 0 0 13.1M 0 0:00:04 0:00:04 --:--:-- 23.8M
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
100 92 100 92 0 0 70 0 0:00:01 0:00:01 --:--:-- 70
cilium-linux-amd64.tar.gz: OK
cilium
✅ Cilium CLI installed.
### STEP 5: Configuring Cilium BGP Peering ###
--> Applying BGP configuration. 'unchanged' means it's already correct.
ciliumbgpadvertisement.cilium.io/bgp-advertisements unchanged
ciliumbgppeerconfig.cilium.io/cilium-peer unchanged
ciliumbgpclusterconfig.cilium.io/cilium-bgp-default configured
✅ BGP configuration applied.
### STEP 6: Installing Hubble UI... ###
--> Creating hubble-ui-values.yaml...
--> Installing Hubble UI with Helm...
NAME: hubble-ui
LAST DEPLOYED: Fri Aug 1 14:18:18 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Hubble-Ui.
Your release version is 1.3.6.

For any further help, visit https://docs.isovalent.com
--> Exposing Hubble UI service via NodePort...
service/hubble-ui patched
✅ Hubble UI installed.
### STEP 7: Verifying the Setup... ###
--> Waiting for all nodes to be ready...
node/kube-node-1 condition met
node/kube-node-2 condition met
node/kube-node-3 condition met
--> Checking Cilium status...
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: OK
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled

DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet cilium-envoy Desired: 3, Ready: 3/3, Available: 3/3
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 3
cilium-envoy Running: 3
cilium-operator Running: 2
clustermesh-apiserver
hubble-relay Running: 1
hubble-ui Running: 1
Cluster Pods: 8/8 managed by Cilium
Helm chart version: 1.17.6
Image versions cilium quay.io/isovalent/cilium:v1.17.6-cee.1@sha256:2d01daf4f25f7d644889b49ca856e1a4269981fc963e50bd3962665b41b6adb3: 3
cilium-envoy quay.io/isovalent/cilium-envoy:v1.17.6-cee.1@sha256:318eff387835ca2717baab42a84f35a83a5f9e7d519253df87269f80b9ff0171: 3
cilium-operator quay.io/isovalent/operator-generic:v1.17.6-cee.1@sha256:2e602710a7c4f101831df679e5d8251bae8bf0f9fe26c20bbef87f1966ea8265: 2
hubble-relay quay.io/isovalent/hubble-relay:v1.17.6-cee.1@sha256:d378e3607f7492374e65e2bd854cc0ec87480c63ba49a96dadcd75a6946b586e: 1
hubble-ui quay.io/isovalent/hubble-ui-enterprise-backend:v1.3.6: 1
hubble-ui quay.io/isovalent/hubble-ui-enterprise:v1.3.6: 1

----------------------------------------------------------------
🚀 Cluster setup is complete and verified!
Access Hubble UI at: http://10.75.59.71:30583
----------------------------------------------------------------
root@kube-node-1:~# cilium bgp peers
Node Local AS Peer AS Peer Address Session State Uptime Family Received Advertised
kube-node-1 65000 65000 10.75.59.76 established 19m1s ipv4/unicast 2 8
kube-node-2 65000 65000 10.75.59.76 established 19m2s ipv4/unicast 2 8
kube-node-3 65000 65000 10.75.59.76 established 19m2s ipv4/unicast 2 8
root@kube-node-1:~# cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node VRouter Prefix NextHop Age Attrs
kube-node-1 65000 172.16.0.0/24 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.36.130/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.40.165/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.51/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.47.30/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-2 65000 172.16.1.0/24 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.36.130/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.40.165/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.51/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.47.30/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
kube-node-3 65000 172.16.2.0/24 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.1/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.32.10/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.36.130/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.40.165/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.43.51/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
65000 172.16.47.30/32 0.0.0.0 19m8s [{Origin: i} {Nexthop: 0.0.0.0}]
root@kube-node-1:~#
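
The BGP objects applied in STEP 5 are not printed in the log above. Below is a minimal sketch of what they could look like, reconstructed only from the cilium bgp peers output (iBGP, local AS 65000, peer 10.75.59.76). The resource names match the log, but the advertisement contents, the selector and the apiVersion (cilium.io/v2alpha1 here; newer Cilium releases also serve cilium.io/v2) are assumptions, so check them against your Cilium version before applying.

kubectl apply -f - <<'EOF'
# Hypothetical reconstruction of the STEP 5 resources -- not the setup script's actual content.
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: PodCIDR            # the per-node 172.16.x.0/24 prefixes seen above
    - advertisementType: Service            # assumed source of the /32 routes seen above
      service:
        addresses: [ClusterIP]
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ["never-used-value"]}  # match all services
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: bgp
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-default
spec:
  bgpInstances:
    - name: instance-65000
      localASN: 65000
      peers:
        - name: frr-router                  # the FRR peer at 10.75.59.76
          peerASN: 65000
          peerAddress: 10.75.59.76
          peerConfigRef:
            name: cilium-peer
EOF

On the FRR side of these sessions (the peer at 10.75.59.76), the same state should be visible from vtysh, for example:

vtysh -c "show bgp ipv4 unicast summary"    # three established neighbours (the k8s nodes)
vtysh -c "show ip route bgp"                # PodCIDRs and /32 service IPs learned over iBGP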

7. Common kubectl commands

kubectl describe pod hubble-ui-5fdd8b4495-dv7nr -n kube-system

# Cluster and workload overview
kubectl get nodes -o wide
kubectl get pods -n kube-system -o wide
kubectl get pods --all-namespaces
kubectl get service -n kube-system -o wide
kubectl get daemonset -n kube-system cilium
kubectl get deployment -o wide

# DNS endpoints and Cilium configuration
kubectl get endpoints -n kube-system kube-dns
kubectl get cm cilium-config -n kube-system -o yaml
kubectl get endpoints -n kube-system
kubectl get pods -n kube-system -l k8s-app=cilium -o wide
# kubectl describe pod hubble-relay-cfb755899-r42l8

# Debugging from inside the Cilium agents
kubectl -n kube-system get pods -l k8s-app=cilium
kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf ipmasq list
kubectl -n kube-system exec ds/cilium -- cilium-dbg status --verbose
kubectl -n kube-system exec ds/cilium -- cilium status
kubectl -n kube-system exec ds/cilium -- cilium service list
kubectl -n kube-system exec ds/cilium -- cilium bpf nat list
kubectl exec -n kube-system cilium-2vrgj -- cilium bpf nat list

kubectl -n kube-system get configmap cilium-config -o yaml

# Star Wars demo (a CiliumNetworkPolicy sketch follows this command list)
kubectl create namespace star-wars
kubectl apply -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
kubectl -n star-wars get pod -o wide --show-labels
kubectl -n star-wars patch service deathstar -p '{"spec":{"type":"NodePort"}}'
kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing
kubectl -n star-wars exec xwing -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing

# Remove the demo application
# kubectl delete -n star-wars -f https://raw.githubusercontent.com/cilium/cilium/1.18.0/examples/minikube/http-sw-app.yaml

# Destructive commands, for reference only
kubectl delete pod,svc,daemonset -n kube-system -l k8s-app=cilium
kubectl delete daemonset -n kube-system kube-proxy

# Quick test workloads
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=ClusterIP

kubectl run -it --rm busybox --image=busybox --restart=Never -- sh

kubectl run -it --rm curl --image=curlimages/curl --restart=Never -- sh

# Repeated requests against the deathstar service (it lives in the star-wars namespace)
for i in {1..10}; do kubectl exec -n star-wars tiefighter -- curl -s http://deathstar.star-wars.svc.cluster.local/v1 | jq -r '.hostname'; done
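
As a follow-up to the Star Wars demo commands above, here is the L7 policy from the upstream Cilium http-sw-app example, adjusted only for the star-wars namespace used here: it allows empire ships to POST /v1/request-landing on the deathstar on port 80. After applying it, the tiefighter landing request should still succeed, the xwing request should be dropped (it hangs and times out), and other API paths from empire ships should return "Access denied".

kubectl apply -n star-wars -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: rule1
spec:
  description: "L7 policy: only empire ships may POST /v1/request-landing on the deathstar"
  endpointSelector:
    matchLabels:
      org: empire
      class: deathstar
  ingress:
    - fromEndpoints:
        - matchLabels:
            org: empire
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "POST"
                path: "/v1/request-landing"
EOF

kubectl -n star-wars exec tiefighter -- curl -s -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing   # Ship landed
kubectl -n star-wars exec xwing -- curl -s --max-time 5 -XPOST http://deathstar.star-wars.svc.cluster.local/v1/request-landing   # dropped, times out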

#cloud-config
# Update the packages on boot.
package_update: true

# ncurses-compat-libs is required on Amazon Linux 2.
packages:
  - libaio
  - numactl
  - tzdata
  - ncurses-compat-libs

write_files:
  - path: /etc/profile.d/lang.sh
    content: |
      export LANG=en_US.UTF-8
      export LC_ALL=en_US.UTF-8

  - path: /etc/security/limits.conf
    content: |
      root hard nofile 65535
      root soft nofile 65535
      root hard nproc 8192
      root soft nproc 8192

  # Response-file template for the Enterprise Console silent installer;
  # the HOST_NAME and ENTER_PASSWORD placeholders are filled in at install time.
  - path: /opt/appdynamics/response.varfile.bak
    content: |
      serverHostName=HOST_NAME
      sys.languageId=en
      disableEULA=true
      platformAdmin.port=9191
      platformAdmin.databasePort=3377
      platformAdmin.dataDir=/opt/appdynamics/platform/mysql/data
      platformAdmin.databasePassword=ENTER_PASSWORD
      platformAdmin.databaseRootPassword=ENTER_PASSWORD
      platformAdmin.adminPassword=ENTER_PASSWORD
      platformAdmin.useHttps$Boolean=false
      sys.installationDir=/opt/appdynamics/platform

  - path: /etc/systemd/system/appd.console.service
    permissions: '0644'
    content: |
      [Unit]
      Description=AppDynamics Enterprise Console
      After=network.target

      [Service]
      Type=forking
      ExecStart=/opt/appdynamics/platform/platform-admin/bin/platform-admin.sh start-platform-admin
      ExecStop=/opt/appdynamics/platform/platform-admin/bin/platform-admin.sh stop-platform-admin
      User=root
      Restart=always

      [Install]
      WantedBy=multi-user.target

  # One-shot unit that fills in the response file from EC2 instance metadata,
  # runs the silent installer, then enables and starts the console service.
  - path: /etc/systemd/system/appd.console.install.service
    permissions: '0644'
    content: |
      [Unit]
      Description=AppDynamics Enterprise Console Installation
      After=network.target

      [Service]
      Type=oneshot
      RemainAfterExit=no
      ExecStart=/bin/sh -c 'sleep 5 && cp /opt/appdynamics/response.varfile.bak /opt/appdynamics/response.varfile && sed -i \"s/ENTER_PASSWORD/`curl http://169.254.169.254/latest/meta-data/instance-id`/g\" /opt/appdynamics/response.varfile && sed -i \"s/HOST_NAME/`curl http://169.254.169.254/latest/meta-data/hostname`/g\" /opt/appdynamics/response.varfile && /opt/appdynamics/platform-setup-x64-linux-23.1.1.18.sh -q -varfile /opt/appdynamics/response.varfile && systemctl daemon-reload && systemctl enable appd.console.service && systemctl start appd.console.service'

      [Install]
      WantedBy=multi-user.target

runcmd:
  # Create the directory and copy the Cisco AppDynamics Enterprise Console setup file
  - aws s3 cp s3://ciscoappdnx/platform-setup-x64-linux-23.1.1.18.sh /opt/appdynamics/ --region cn-northwest-1
  - chmod +x /opt/appdynamics/platform-setup-x64-linux-23.1.1.18.sh
  - systemctl daemon-reload
  - systemctl enable appd.console.install.service
  # Harden SSH and strip credentials and host keys so the baked AMI ships clean
  - sed -i 's/#PermitRootLogin yes/PermitRootLogin no/g' /etc/ssh/sshd_config
  - rm -rf /root/.ssh/authorized_keys
  - rm -rf /home/ec2-user/.ssh/authorized_keys
  - shred -u /etc/ssh/*_key /etc/ssh/*_key.pub
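
The cloud-config above can be syntax-checked locally and then attached as user data to a builder instance. A minimal sketch, assuming the file is saved as user-data.yaml and that an instance profile with read access to the ciscoappdnx bucket already exists; the AMI ID, instance type and profile name below are placeholders:

cloud-init schema --config-file user-data.yaml        # validate the cloud-config locally (recent cloud-init releases)
aws ec2 run-instances \
  --region cn-northwest-1 \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type m5.xlarge \
  --iam-instance-profile Name=appd-s3-read \
  --user-data file://user-data.yaml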

Hand-crafted workflows are out of place in the "cloud"

Cisco AppDynamics is a powerful, easy-to-use application performance management (APM) solution that monitors applications on AWS end to end, including microservices and Docker, and that supports EC2, DynamoDB, Lambda and more through its CloudWatch integration. AppDynamics can compare and validate the customer-to-business optimizations before and after a cloud migration, which accelerates customers' move to the cloud and makes it very popular with users.

To help users install and deploy AppDynamics on AWS more efficiently, we need to build a pre-packaged installation image, an Amazon Machine Image (AMI). A user who launches a VM from this AMI lands directly in the AppDynamics setup screen, which saves a great deal of download, installation and debugging time and greatly improves the installation experience.
So what approach should we take to build the AMI?

Building it entirely by hand is certainly possible, but the AMI captures the whole VM disk, operating system and software packages included. Whenever AppDynamics releases a new version, or a security vulnerability is found in the operating system, the software has to be upgraded or the system patched. A manual workflow cannot keep up with that; in other words, in the world of the cloud there is no longer any room for hand-crafted processes, and automation is the only option.

The next question, then, is that automation needs tools and code to back it up, so how should that code be written?

I can write some simple Python and shell scripts, but producing a comprehensive piece of code like this would likely take me two weeks, plus a fair amount of lost hair.


Author: 饶维波

This article documents how ChatGPT was used to generate the cloud-init configuration for building an AppDynamics installation AMI on AWS, and turns the process into an operational document that can serve as a reference guide.

Task overview and overall approach

Task overview

Cisco AppDynamics provides a powerful, easy-to-use application performance management (APM) solution that monitors applications on AWS end to end, including microservices and Docker, and supports EC2, DynamoDB, Lambda and more through its CloudWatch integration. AppDynamics can compare and validate the customer-to-business optimizations before and after a cloud migration, which accelerates customers' move to the cloud and makes it very popular with users.

To help users install and deploy AppDynamics on AWS more efficiently, we need to build a pre-packaged installation image called an Amazon Machine Image (AMI). A user who launches a VM from this AMI lands directly in the AppDynamics setup screen, which saves a great deal of download, installation and debugging time and greatly improves the installation experience.

The goal of this task is to build the AppDynamics AMI in a way that keeps later maintenance easy. That maintenance consists mainly of patching operating-system security vulnerabilities and upgrading the AppDynamics software version, so the process should be automated as far as possible, saving time and effort while avoiding human error.

The key to the task is writing an automation script, implemented either as a shell script or as a cloud-init configuration.
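
As a rough sketch of the final step (not covered in this excerpt), once a builder instance launched with the cloud-config above has finished its runcmd phase, it can be baked into the distributable AMI; the instance ID and image name below are placeholders:

aws ec2 create-image \
  --region cn-northwest-1 \
  --instance-id i-0123456789abcdef0 \
  --name "appdynamics-enterprise-console-23.1.1.18" \
  --description "AppDynamics Enterprise Console, installer pre-staged via cloud-init"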
