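Deployment Rolling Update Flow

This document describes the end-to-end flow by which a Kubernetes Deployment carries out a rolling update through the ReplicaSet Controller.

Overview

A Deployment rolling update involves the following core components:

  • Deployment Controller: manages the Deployment lifecycle
  • ReplicaSet Controller: maintains the Pod replica count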
  • API Server: stores resource state and distributes it to watchers

flowchart TD
    subgraph User["User action"]
        A["kubectl apply -f deployment.yaml"]
    end

    subgraph APIServer["API Server"]
        B["Deployment resource"]
        C["ReplicaSet resource"]
        D["Pod resource"]
    end

    subgraph DeploymentController["Deployment Controller"]
        E["Watch Deployment"]
        F["Create/update ReplicaSet"]
        G["Compute desired replica counts"]
    end

    subgraph ReplicaSetController["ReplicaSet Controller"]
        H["Watch ReplicaSet"]
        I["Create/delete Pods"]
        J["Maintain desired replica count"]
    end

    A --> B
    B -->|triggers| E
    E --> F
    F --> C
    C -->|triggers| H
    H --> I
    I --> D

    G -->|controls| F
    J -->|controls| I

    style A fill:#e1f5fe
    style E fill:#c8e6c9
    style H fill:#fff9c4

Detailed Flow

1. Deployment Controller Architecture

Code path: pkg/controller/deployment/deployment_controller.go

type DeploymentController struct {
	// rsControl is used to adopt/release ReplicaSets.
	rsControl controller.RSControlInterface
	client    clientset.Interface

	// Event broadcasting
	eventBroadcaster record.EventBroadcaster
	eventRecorder    record.EventRecorder

	// Sync handler (injectable for testing)
	syncHandler func(ctx context.Context, dKey string) error

	// Listers read resources from the informer cache.
	dLister   appslisters.DeploymentLister
	rsLister  appslisters.ReplicaSetLister
	podLister corelisters.PodLister

	// Work queue
	queue workqueue.RateLimitingInterface
}
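
The work queue is what decouples watch events from reconciliation: event handlers enqueue a Deployment key, and workers pass each key to syncHandler. The following is a minimal, standard-library-only sketch of that pattern. The names keyQueue, Add, and Get are hypothetical and chosen for illustration; this is not the client-go workqueue API, and rate limiting and retries are left out.

```go
package main

import "fmt"

// keyQueue is a hypothetical, simplified stand-in for a controller work queue.
// It deduplicates keys that are already waiting to be processed.
type keyQueue struct {
	pending map[string]bool
	order   []string
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already pending.
func (q *keyQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newKeyQueue()
	// Three watch events, two of them for the same Deployment.
	q.Add("default/web")
	q.Add("default/api")
	q.Add("default/web")

	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("sync", key) // a real controller would call syncHandler here
	}
}
```

Deduplication is the point of the sketch: several events for the same Deployment collapse into one sync, which then works from the latest cached state.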

2. How a Rolling Update Is Triggered

sequenceDiagram
    participant User as User
    participant API as API Server
    participant DC as Deployment Controller
    participant RC as ReplicaSet Controller
    participant Pod as Pod

    User->>API: Update Deployment (change image version)
    API->>DC: Watch event fires
    DC->>DC: syncDeployment()

    Note over DC: Fetch all associated ReplicaSets
    DC->>API: List ReplicaSets by label selector

    Note over DC: Identify the new and old ReplicaSets and sync revisions
    DC->>DC: getAllReplicaSetsAndSyncRevision()

    alt RollingUpdate strategy
        DC->>API: Create the new ReplicaSet (initial replicas bounded by maxSurge)
        DC->>API: Gradually adjust new and old RS replica counts
    else Recreate strategy
        DC->>API: Scale old ReplicaSets down to 0
        DC->>API: Create the new ReplicaSet
    end

    API->>RC: ReplicaSet change event
    RC->>Pod: Create/delete Pods
    Pod->>API: Update Pod status
    API->>DC: Status change event
    DC->>API: Update Deployment Status

3. Core Logic of syncDeployment

Code path: pkg/controller/deployment/deployment_controller.go:585

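// Simplified excerpt: the paused, rollback, and scaling-event branches are omitted.
func (dc *DeploymentController) syncDeployment(ctx context.Context, key string) error {
	namespace, name, err := cache.SplitMetaNamespaceKey(key)
	if err != nil {
		return err
	}

	// Fetch the Deployment from the informer cache.
	deployment, err := dc.dLister.Deployments(namespace).Get(name)
	if errors.IsNotFound(err) {
		return nil // the Deployment has been deleted
	}
	if err != nil {
		return err
	}

	// Work on a copy so the shared cache is never mutated.
	d := deployment.DeepCopy()

	// Fetch all ReplicaSets associated with the Deployment.
	rsList, err := dc.getReplicaSetsForDeployment(ctx, d)
	if err != nil {
		return err
	}

	// Fetch all Pods associated with the Deployment, grouped by ReplicaSet.
	podMap, err := dc.getPodMapForDeployment(d, rsList)
	if err != nil {
		return err
	}

	// A Deployment that is being deleted only has its status synced.
	if d.DeletionTimestamp != nil {
		return dc.syncStatusOnly(ctx, d, rsList)
	}

	// Roll out according to the Deployment strategy.
	switch d.Spec.Strategy.Type {
	case apps.RecreateDeploymentStrategyType:
		return dc.rolloutRecreate(ctx, d, rsList, podMap)
	case apps.RollingUpdateDeploymentStrategyType:
		return dc.rolloutRolling(ctx, d, rsList)
	}
	return fmt.Errorf("unexpected deployment strategy type: %s", d.Spec.Strategy.Type)
}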

4. Update Strategies

4.1 RollingUpdate Strategy

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # max Pods above the desired count (percentages round up)
      maxUnavailable: 25%  # max Pods that may be unavailable (percentages round down)

Calculation logic:

Desired replicas:   desiredReplicas = deployment.spec.replicas

Minimum available:  minAvailable = ceil(desiredReplicas * (1 - maxUnavailable%))
Maximum total:      maxTotal = desiredReplicas + maxSurge

New RS replicas:    newRSReplicas = min(maxTotal - oldRSReplicas, desiredReplicas)
Old RS replicas:    scaled down only while the available Pod count stays >= minAvailable
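
To make the rounding concrete, here is a small sketch that resolves the two percentages into the ceiling on total Pods and the floor on available Pods. It assumes both fields are given as whole-number percentages; resolvePercent and bounds are hypothetical helper names, not functions from the Kubernetes code base.

```go
package main

import "fmt"

// resolvePercent converts a percentage of desired replicas into an absolute
// count using integer arithmetic, rounding up or down as requested.
func resolvePercent(desired, percent int, roundUp bool) int {
	if roundUp {
		return (desired*percent + 99) / 100
	}
	return desired * percent / 100
}

// bounds returns the Pod-count ceiling and the availability floor of a rollout.
// maxSurge rounds up and maxUnavailable rounds down.
func bounds(desired, surgePct, unavailablePct int) (maxTotal, minAvailable int) {
	maxSurge := resolvePercent(desired, surgePct, true)
	maxUnavailable := resolvePercent(desired, unavailablePct, false)
	return desired + maxSurge, desired - maxUnavailable
}

func main() {
	maxTotal, minAvailable := bounds(10, 25, 25)
	fmt.Println("maxTotal:", maxTotal, "minAvailable:", minAvailable)
}
```

With 10 replicas and 25%/25%, 2.5 rounds up to a surge of 3 and down to 2 unavailable, so the rollout may run up to 13 Pods and must keep at least 8 available.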

Code path: pkg/controller/deployment/rolling.go

// Abridged: see rolling.go for the full scale-up and scale-down helpers.
func (dc *DeploymentController) rolloutRolling(ctx context.Context, d *apps.Deployment, rsList []*apps.ReplicaSet) error {
	// Find the new ReplicaSet (creating it if needed) and the old ReplicaSets.
	newRS, oldRSs, err := dc.getAllReplicaSetsAndSyncRevision(ctx, d, rsList, true)
	if err != nil {
		return err
	}
	allRSs := append(oldRSs, newRS)

	// Scale up the new ReplicaSet if maxSurge leaves room.
	scaledUp, err := dc.reconcileNewReplicaSet(ctx, allRSs, newRS, d)
	if err != nil {
		return err
	}
	if scaledUp {
		return dc.syncRolloutStatus(ctx, allRSs, newRS, d)
	}

	// Scale down the old ReplicaSets if maxUnavailable allows it.
	scaledDown, err := dc.reconcileOldReplicaSets(ctx, allRSs, controller.FilterActiveReplicaSets(oldRSs), newRS, d)
	if err != nil {
		return err
	}
	if scaledDown {
		return dc.syncRolloutStatus(ctx, allRSs, newRS, d)
	}

	// Once the rollout is complete, prune old ReplicaSets beyond the history limit.
	if deploymentutil.DeploymentComplete(d, &d.Status) {
		if err := dc.cleanupDeployment(ctx, oldRSs, d); err != nil {
			return err
		}
	}

	// Sync the Deployment status.
	return dc.syncRolloutStatus(ctx, allRSs, newRS, d)
}
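
The scale-up decision for the new ReplicaSet can be sketched in isolation. The function below follows the rule described above (fill the headroom under maxTotal, never exceed the desired count); newRSReplicas is a hypothetical name, and the inputs are plain integers rather than API objects.

```go
package main

import "fmt"

// newRSReplicas sketches the scale-up rule for the new ReplicaSet: grow as far
// as maxTotal allows, but never beyond the desired count.
// totalPods is the current replica sum across all ReplicaSets.
func newRSReplicas(desired, maxSurge, totalPods, newReplicas int) int {
	maxTotal := desired + maxSurge
	if totalPods >= maxTotal {
		return newReplicas // no headroom, cannot scale up yet
	}
	scaleUp := maxTotal - totalPods
	if room := desired - newReplicas; scaleUp > room {
		scaleUp = room
	}
	return newReplicas + scaleUp
}

func main() {
	// 10 desired, maxSurge 3, the old ReplicaSet holds 10 Pods, the new one is empty.
	fmt.Println(newRSReplicas(10, 3, 10, 0))
}
```

When the Pod total already sits at the ceiling, the function returns the current count unchanged, which is why scale-up has to wait for the old ReplicaSets to shrink.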

4.2 Recreate Strategy

spec:
  strategy:
    type: Recreate

The Recreate strategy first removes all old Pods and then creates the new ones:

flowchart LR
    A["Old RS (3 replicas)"] -->|delete all Pods| B["Old RS (0 replicas)"]
    B -->|create new RS| C["New RS (3 replicas)"]

5. Role of the ReplicaSet Controller

Code path: pkg/controller/replicaset/replica_set.go

type ReplicaSetController struct {
	// Embedded kind of the controlled resource.
	schema.GroupVersionKind

	kubeClient clientset.Interface
	podControl controller.PodControlInterface

	// Expectations cache: tracks Pod creations/deletions still in flight.
	expectations *controller.UIDTrackingControllerExpectations

	// Work queue
	queue workqueue.RateLimitingInterface
}

Core sync logic:

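// Simplified pseudocode: call arguments are elided and the expectations check is omitted.
func (rsc *ReplicaSetController) syncReplicaSet(ctx context.Context, key string) error {
	namespace, name, err := cache.SplitMetaNamespaceKey(key)
	if err != nil {
		return err
	}

	// Fetch the ReplicaSet from the informer cache.
	rs, err := rsc.rsLister.ReplicaSets(namespace).Get(name)
	if err != nil {
		return err
	}

	// Fetch all Pods associated with the ReplicaSet.
	filteredPods, err := rsc.getPodsForReplicaSet(ctx, rs)
	if err != nil {
		return err
	}

	// Negative diff: too few Pods. Positive diff: too many Pods.
	diff := len(filteredPods) - int(*(rs.Spec.Replicas))

	if diff < 0 {
		// Create the missing Pods.
		rsc.podControl.CreatePods(ctx, ...)
	} else if diff > 0 {
		// Delete the surplus Pods.
		rsc.podControl.DeletePods(ctx, ...)
	}

	// Update the ReplicaSet status.
	rsc.updateReplicaSetStatus(ctx, rs, filteredPods)
	return nil
}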
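
The diff computation above decides the direction and size of the change. A minimal sketch of that decision follows, with a burst cap on how many Pods one sync may touch; plan and its burst parameter are hypothetical names used for illustration only.

```go
package main

import "fmt"

// plan reports how many Pods to create or delete so that the number of active
// Pods converges on the desired replica count. A single sync never touches
// more than burst Pods.
func plan(activePods, desired, burst int) (create, del int) {
	diff := activePods - desired
	switch {
	case diff < 0:
		create = -diff
		if create > burst {
			create = burst
		}
	case diff > 0:
		del = diff
		if del > burst {
			del = burst
		}
	}
	return create, del
}

func main() {
	create, del := plan(3, 5, 500)
	fmt.Println("create:", create, "delete:", del)
}
```

Capping each sync means a very large scale-up is spread over several reconcile passes instead of being issued in one go.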

6. Tracking Rolling Update Status

Code path: pkg/controller/deployment/sync.go:476

func (dc *DeploymentController) syncDeploymentStatus(ctx context.Context, allRSs []*apps.ReplicaSet, newRS *apps.ReplicaSet, d *apps.Deployment) error {
	newStatus := calculateStatus(allRSs, newRS, d)

	if reflect.DeepEqual(d.Status, newStatus) {
		return nil
	}

	newDeployment := d
	newDeployment.Status = newStatus
	_, err := dc.client.AppsV1().Deployments(newDeployment.Namespace).UpdateStatus(ctx, newDeployment, metav1.UpdateOptions{})
	return err
}

func calculateStatus(allRSs []*apps.ReplicaSet, newRS *apps.ReplicaSet, deployment *apps.Deployment) apps.DeploymentStatus {
	availableReplicas := deploymentutil.GetAvailableReplicaCountForReplicaSets(allRSs)
	totalReplicas := deploymentutil.GetReplicaCountForReplicaSets(allRSs)

	status := apps.DeploymentStatus{
		ObservedGeneration:  deployment.Generation,
		Replicas:            deploymentutil.GetActualReplicaCountForReplicaSets(allRSs),
		UpdatedReplicas:     deploymentutil.GetActualReplicaCountForReplicaSets([]*apps.ReplicaSet{newRS}),
		ReadyReplicas:       deploymentutil.GetReadyReplicaCountForReplicaSets(allRSs),
		AvailableReplicas:   availableReplicas,
		UnavailableReplicas: totalReplicas - availableReplicas,
	}
	return status
}
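
The status fields are plain sums over the ReplicaSets. The sketch below reproduces that aggregation with local stand-in types so it runs without Kubernetes packages; replicaSet, deploymentStatus, and calcStatus are hypothetical, and clamping UnavailableReplicas at zero is a choice made in this sketch.

```go
package main

import "fmt"

// replicaSet is a hypothetical stand-in holding only the counters used here.
type replicaSet struct {
	SpecReplicas      int32 // replicas requested for this ReplicaSet
	StatusReplicas    int32 // Pods actually observed
	ReadyReplicas     int32
	AvailableReplicas int32
}

type deploymentStatus struct {
	Replicas, UpdatedReplicas, ReadyReplicas int32
	AvailableReplicas, UnavailableReplicas   int32
}

// calcStatus sums the counters of all ReplicaSets; UpdatedReplicas counts only
// the Pods of the new ReplicaSet.
func calcStatus(all []replicaSet, newRS replicaSet) deploymentStatus {
	var st deploymentStatus
	var requested int32
	for _, rs := range all {
		requested += rs.SpecReplicas
		st.Replicas += rs.StatusReplicas
		st.ReadyReplicas += rs.ReadyReplicas
		st.AvailableReplicas += rs.AvailableReplicas
	}
	st.UpdatedReplicas = newRS.StatusReplicas
	if diff := requested - st.AvailableReplicas; diff > 0 {
		st.UnavailableReplicas = diff // clamped at zero in this sketch
	}
	return st
}

func main() {
	oldRS := replicaSet{8, 8, 8, 8}
	newRS := replicaSet{3, 3, 0, 0} // created but not yet ready
	fmt.Printf("%+v\n", calcStatus([]replicaSet{oldRS, newRS}, newRS))
}
```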

Key Code Anchors

Function                        File path
Deployment Controller struct    pkg/controller/deployment/deployment_controller.go:66
syncDeployment entry point      pkg/controller/deployment/deployment_controller.go:585
RollingUpdate logic             pkg/controller/deployment/rolling.go
Recreate logic                  pkg/controller/deployment/recreate.go
Status calculation              pkg/controller/deployment/sync.go:476
ReplicaSet Controller           pkg/controller/replicaset/replica_set.go:81

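Rolling Update Example

Assume a Deployment with 10 desired replicas, maxSurge=25%, and maxUnavailable=25%. These resolve to maxSurge=3 (rounded up) and maxUnavailable=2 (rounded down), so at most 13 Pods exist and at least 8 stay available. The intermediate counts below are illustrative; actual values depend on how quickly new Pods become ready.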

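Initial state:
  Old RS (v1): 10 replicas

Step 1:
  New RS (v2): 3 replicas (maxSurge=3)
  Old RS (v1): 8 replicas (at least 8 stay available)

Step 2:
  New RS (v2): 6 replicas
  Old RS (v1): 5 replicas

Step 3:
  New RS (v2): 9 replicas
  Old RS (v1): 2 replicas

Final state:
  New RS (v2): 10 replicas
  Old RS (v1): 0 replicas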
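
A lock-step simulation can show how the two limits interact over a whole rollout. The sketch below is a simplified model, not the controller's algorithm: it assumes old Pods are always available and that every new Pod becomes available at the end of the round in which it was created. The names simulate and step are hypothetical.

```go
package main

import "fmt"

type step struct{ New, Old int }

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// simulate runs rounds of "scale up new, scale down old, new Pods turn
// available" until the new ReplicaSet owns all desired replicas.
func simulate(desired, maxSurge, maxUnavailable int) []step {
	if maxSurge == 0 && maxUnavailable == 0 {
		return nil // no progress is possible with both limits at zero
	}
	maxTotal := desired + maxSurge
	minAvailable := desired - maxUnavailable
	oldRS, newRS, newAvailable := desired, 0, 0
	var steps []step
	for newRS < desired || oldRS > 0 {
		// Scale up the new ReplicaSet within the maxTotal ceiling.
		if target := minInt(maxTotal-oldRS, desired); target > newRS {
			newRS = target
		}
		// Scale down the old ReplicaSet without dropping below minAvailable.
		down := minInt(oldRS+newAvailable-minAvailable, oldRS)
		if down > 0 {
			oldRS -= down
		}
		// Pods created in this round become available.
		newAvailable = newRS
		steps = append(steps, step{newRS, oldRS})
	}
	return steps
}

func main() {
	for i, s := range simulate(10, 3, 2) {
		fmt.Printf("round %d: new RS=%d, old RS=%d\n", i+1, s.New, s.Old)
	}
}
```

Under these assumptions, 10 replicas with maxSurge=3 and maxUnavailable=2 pass through (3, 8), (5, 5), (8, 3), and (10, 0). The intermediate pairs differ from the illustrative steps above because the real controller reacts to individual Pod readiness events rather than to synchronized rounds.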

Rollback Mechanism

# View the revision history
kubectl rollout history deployment/my-deployment

# Roll back to the previous revision
kubectl rollout undo deployment/my-deployment

# Roll back to a specific revision
kubectl rollout undo deployment/my-deployment --to-revision=2

Rollback works because old ReplicaSets are retained, each carrying a revision annotation:

metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
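
Locating the rollback target then comes down to matching that annotation. The following sketch uses a local stand-in type instead of the real API object; replicaSet and findByRevision are hypothetical names, while the annotation key is the one shown above.

```go
package main

import "fmt"

const revisionAnnotation = "deployment.kubernetes.io/revision"

// replicaSet is a hypothetical stand-in with only the fields needed here.
type replicaSet struct {
	Name        string
	Annotations map[string]string
}

// findByRevision returns the ReplicaSet whose revision annotation equals rev.
func findByRevision(all []replicaSet, rev string) (replicaSet, bool) {
	for _, rs := range all {
		if rs.Annotations[revisionAnnotation] == rev {
			return rs, true
		}
	}
	return replicaSet{}, false
}

func main() {
	history := []replicaSet{
		{Name: "web-5d9c", Annotations: map[string]string{revisionAnnotation: "1"}},
		{Name: "web-7f4b", Annotations: map[string]string{revisionAnnotation: "2"}},
	}
	if rs, ok := findByRevision(history, "2"); ok {
		fmt.Println("rollback target:", rs.Name)
	}
}
```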

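Frequently Asked Interview Questions

Q1: What does the Deployment rolling update flow look like?

Reference answer:

  1. The user updates the Deployment (for example, changes the image version)
  2. The Deployment Controller observes the change and runs syncDeployment()
  3. It fetches all associated ReplicaSets and identifies the new and old ones
  4. It creates the new ReplicaSet (if it does not exist yet)
  5. It gradually adjusts replica counts according to the RollingUpdate strategy:
    • Scale up the new RS (bounded by maxSurge)
    • Scale down the old RSs (bounded by maxUnavailable)
  6. The ReplicaSet Controller observes the RS changes and creates/deletes Pods
  7. The Deployment Status is updated until all replicas have been updated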

Q2: How do maxSurge and maxUnavailable work?

Reference answer:

spec:
  strategy:
    rollingUpdate:
      maxSurge: 25%        # max Pods above the desired replica count
      maxUnavailable: 25%  # max unavailable Pods

Assume replicas=10:

  • maxSurge=25% → rounds up to 3, so at most 10+3=13 Pods exist at the same time
  • maxUnavailable=25% → rounds down to 2, so at least 10-2=8 Pods stay available

Calculation logic:

minAvailable = ceil(desiredReplicas * (1 - maxUnavailable%))
maxTotal = desiredReplicas + maxSurge

newRSReplicas = min(maxTotal - oldRSReplicas, desiredReplicas)

Example (10 replicas, 25%/25%):

Initial: old RS=10
Step 1:  new RS=3, old RS=8 (total 11, available 8)
Step 2:  new RS=6, old RS=5 (total 11, available 11)
Step 3:  new RS=9, old RS=2 (total 11, available 11)
Final:   new RS=10, old RS=0

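Q3: What is the difference between the RollingUpdate and Recreate strategies?

Reference answer:

Property            RollingUpdate                             Recreate
Service disruption  None (gradual replacement)                Yes (delete first, then create)
Resource usage      May exceed the desired replica count      Never exceeds it
Update speed        Slower                                    Faster
Rollback            Keeps historical ReplicaSets              Keeps historical ReplicaSets
Typical use         Production, high availability required    Development, single-replica apps

Recreate flow:

  1. Scale all old ReplicaSets down to 0
  2. Wait for all old Pods to terminate
  3. Create the new ReplicaSet and scale it up to the target replica count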
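
The ordering of the three Recreate steps can be expressed as a small decision function over observed state. This is a sketch under simplifying assumptions (plain counters instead of API objects); nextRecreateAction and its return strings are hypothetical.

```go
package main

import "fmt"

// nextRecreateAction picks the next Recreate step from the observed state.
// The new ReplicaSet is only scaled up once no old Pod is left running.
func nextRecreateAction(oldReplicas, oldPodsRunning, newReplicas, desired int) string {
	switch {
	case oldReplicas > 0:
		return "scale old ReplicaSets to 0"
	case oldPodsRunning > 0:
		return "wait for old Pods to terminate"
	case newReplicas < desired:
		return "scale new ReplicaSet to desired"
	}
	return "done"
}

func main() {
	fmt.Println(nextRecreateAction(3, 3, 0, 3))
	fmt.Println(nextRecreateAction(0, 2, 0, 3))
	fmt.Println(nextRecreateAction(0, 0, 0, 3))
	fmt.Println(nextRecreateAction(0, 0, 3, 3))
}
```

The middle case is what causes the service gap: between scaling the old ReplicaSets to zero and the new Pods becoming ready, nothing serves traffic.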

Q4: How does a Deployment implement rollback?

Reference answer:

# View the revision history
kubectl rollout history deployment/my-deployment

# Roll back to the previous revision
kubectl rollout undo deployment/my-deployment

# Roll back to a specific revision
kubectl rollout undo deployment/my-deployment --to-revision=2

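How it works:

  1. Every ReplicaSet carries a deployment.kubernetes.io/revision annotation
  2. On rollback, the ReplicaSet with the target revision is located
  3. That ReplicaSet's PodTemplate is copied back into the Deployment
  4. The normal rolling update flow is triggered

Number of retained historical revisions: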

spec:
  revisionHistoryLimit: 10  # keeps 10 old ReplicaSets by default
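
The effect of the limit can be sketched as a selection over old ReplicaSets: only those already scaled to zero are candidates, and the oldest revisions go first. The types and the toPrune function below are hypothetical and use plain values instead of API objects.

```go
package main

import (
	"fmt"
	"sort"
)

// oldRS is a hypothetical stand-in for an old ReplicaSet.
type oldRS struct {
	Name     string
	Revision int
	Replicas int
}

// toPrune returns the names of scaled-down old ReplicaSets that exceed the
// history limit, oldest revision first.
func toPrune(olds []oldRS, limit int) []string {
	var idle []oldRS
	for _, rs := range olds {
		if rs.Replicas == 0 { // ReplicaSets that still run Pods are never pruned
			idle = append(idle, rs)
		}
	}
	sort.Slice(idle, func(i, j int) bool { return idle[i].Revision < idle[j].Revision })

	var names []string
	for i := 0; i < len(idle)-limit; i++ {
		names = append(names, idle[i].Name)
	}
	return names
}

func main() {
	olds := []oldRS{
		{"web-3", 3, 0}, {"web-1", 1, 0}, {"web-2", 2, 0}, {"web-4", 4, 2},
	}
	fmt.Println(toPrune(olds, 2))
}
```

A smaller limit saves API objects but shortens how far back `kubectl rollout undo --to-revision` can reach.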

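Q5: How does the ReplicaSet Controller work?

Reference answer:

  1. Watch ReplicaSet and Pod changes
  2. Compute the difference: diff = current Pod count - desired replica count
  3. Create Pods (diff < 0):
    • Create in batches with slowStartBatch to avoid creating too many at once
    • Capped by burstReplicas (default 500)
  4. Delete Pods (diff > 0):
    • Prefer deleting unhealthy Pods (Pending, Unknown, NotReady)
    • Sort by deletion priority, then delete
  5. Expectations mechanism:
    • Record the Pod creations/deletions that are expected
    • Run the sync only when SatisfiedExpectations() returns true
    • Avoid duplicate operations while Pod creations/deletions are still in flight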
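
The slow-start batching from step 3 can be illustrated by computing the batch sizes alone. The sketch assumes every batch succeeds and that the size doubles after each batch; batchSizes is a hypothetical helper, and the actual Pod creation calls are left out.

```go
package main

import "fmt"

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// batchSizes lists the batch sizes used to create count Pods when each batch
// succeeds: start at initial and double after every batch.
func batchSizes(count, initial int) []int {
	var sizes []int
	remaining := count
	for batch := minInt(remaining, initial); batch > 0; batch = minInt(2*batch, remaining) {
		sizes = append(sizes, batch)
		remaining -= batch
	}
	return sizes
}

func main() {
	fmt.Println(batchSizes(10, 1))
}
```

For 10 Pods and an initial batch of 1 the sizes are 1, 2, 4, and 3. Starting small means that a Pod template that always fails costs one rejected request instead of hundreds.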

Q6: How do you troubleshoot a Deployment update that is stuck?

Reference answer:

# 1. Check the Deployment status
kubectl describe deployment <name>
kubectl get deployment <name> -o yaml

# 2. Check the ReplicaSet status
kubectl get rs -l app=<app-name>

# 3. Check the Pod status
kubectl get pods -l app=<app-name>
kubectl describe pod <pod-name>

# 4. Check events
kubectl get events --field-selector involvedObject.name=<deployment-name>

# 5. Common causes:
# - Image pull failures (imagePullPolicy, image does not exist)
# - Insufficient resources (CPU/memory limits)
# - Failing Pod health checks (readinessProbe/livenessProbe)
# - maxUnavailable=0 while new Pods cannot start
# - PodDisruptionBudget (PDB) constraints

Further Reading