Kube-controller-manager(GarbageController)
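Based on Kubernetes 1.25
The GarbageController is responsible for cascading deletion (garbage collection, GC) of resource objects
- In K8s, ownership relationships between resources are recorded mainly through OwnerReference
OwnerReference struct definition:
UID: the UID of the owner (parent resource) of this resource object
BlockOwnerDeletion: when true and the owner is being deleted with the Foreground policy, the owner cannot be removed until this resource has been deleted
The GC Controller supports three deletion policies
- Orphan: deletes only the current resource object; its dependents are not cascade-deleted
- Foreground: deletes the dependents of the current resource object first, then the object itself
- Background: deletes the current resource object first, then cascade-deletes its dependents
- With Orphan and Foreground, a Finalizer is added to the resource object, and the GC Controller removes it during deletion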
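The relationship between the three policies and their Finalizers can be sketched as below. This is a minimal illustration, not the controller's code: `propagationPolicy` and `finalizerFor` are hypothetical names, while the strings `orphan` and `foregroundDeletion` are the finalizer values that Kubernetes uses.

```go
package main

import "fmt"

// propagationPolicy is a hypothetical local mirror of the three deletion policies.
type propagationPolicy string

const (
	policyOrphan     propagationPolicy = "Orphan"
	policyForeground propagationPolicy = "Foreground"
	policyBackground propagationPolicy = "Background"
)

// finalizerFor returns the finalizer that guards an object deleted with the
// given policy, or "" when the policy needs none (Background).
func finalizerFor(p propagationPolicy) string {
	switch p {
	case policyOrphan:
		return "orphan"
	case policyForeground:
		return "foregroundDeletion"
	default:
		return ""
	}
}

func main() {
	for _, p := range []propagationPolicy{policyOrphan, policyForeground, policyBackground} {
		fmt.Printf("%s -> %q\n", p, finalizerFor(p))
	}
}
```

Background needs no Finalizer because the object itself can be removed immediately; only its dependents are cleaned up afterwards.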
-
// OwnerReference contains enough information to let you identify an owning
// object. An owning object must be in the same namespace as the dependent, or
// be cluster-scoped, so there is no namespace field.
// +structType=atomic
type OwnerReference struct {
	// API version of the referent.
	APIVersion string `json:"apiVersion" protobuf:"bytes,5,opt,name=apiVersion"`
	// Kind of the referent.
	// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
	Kind string `json:"kind" protobuf:"bytes,1,opt,name=kind"`
	// Name of the referent.
	// More info: http://kubernetes.io/docs/user-guide/identifiers#names
	Name string `json:"name" protobuf:"bytes,3,opt,name=name"`
	// UID of the referent.
	// More info: http://kubernetes.io/docs/user-guide/identifiers#uids
	UID types.UID `json:"uid" protobuf:"bytes,4,opt,name=uid,casttype=k8s.io/apimachinery/pkg/types.UID"`
	// If true, this reference points to the managing controller.
	// +optional
	Controller *bool `json:"controller,omitempty" protobuf:"varint,6,opt,name=controller"`
	// If true, AND if the owner has the "foregroundDeletion" finalizer, then
	// the owner cannot be deleted from the key-value store until this
	// reference is removed.
	// See https://kubernetes.io/docs/concepts/architecture/garbage-collection/#foreground-deletion
	// for how the garbage collector interacts with this field and enforces the foreground deletion.
	// Defaults to false.
	// To set this field, a user needs "delete" permission of the owner,
	// otherwise 422 (Unprocessable Entity) will be returned.
	// +optional
	BlockOwnerDeletion *bool `json:"blockOwnerDeletion,omitempty" protobuf:"varint,7,opt,name=blockOwnerDeletion"`
}
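To make the optional pointer fields concrete, here is a small sketch that uses a trimmed local copy of the struct. `ownerRef` and `blocksOwnerDeletion` are hypothetical names introduced for illustration, the object names and UID are made up, and the real type lives in `k8s.io/apimachinery`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ownerRef is a trimmed, hypothetical mirror of metav1.OwnerReference
// (protobuf tags dropped, UID simplified to a string).
type ownerRef struct {
	APIVersion         string `json:"apiVersion"`
	Kind               string `json:"kind"`
	Name               string `json:"name"`
	UID                string `json:"uid"`
	Controller         *bool  `json:"controller,omitempty"`
	BlockOwnerDeletion *bool  `json:"blockOwnerDeletion,omitempty"`
}

// blocksOwnerDeletion treats a nil pointer as false, matching the
// "Defaults to false" note in the field comment.
func blocksOwnerDeletion(r ownerRef) bool {
	return r.BlockOwnerDeletion != nil && *r.BlockOwnerDeletion
}

func main() {
	yes := true
	ref := ownerRef{
		APIVersion:         "apps/v1",
		Kind:               "ReplicaSet",
		Name:               "web-abc", // made-up object name
		UID:                "uid-1",   // made-up UID
		BlockOwnerDeletion: &yes,
	}
	out, err := json.Marshal(ref)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println(blocksOwnerDeletion(ref), blocksOwnerDeletion(ownerRef{}))
}
```

Because both boolean fields are pointers with `omitempty`, an unset field is omitted from the serialized object rather than written as `false`.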
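Controller initialization
Only Event resource objects are skipped
Main execution logic
gc.dependencyGraphBuilder.Run
This goroutine takes resource objects and event types off the graphChanges work queue and updates the resource object dependency graph maintained by the GC Controller.
If a change to the graph means that some resource objects need to be deleted, they are placed on the attemptToOrphan or attemptToDelete work queue
gc.runAttemptToOrphanWorker
Takes items off the attemptToOrphan queue and performs Orphan-policy deletion
gc.runAttemptToDeleteWorker
Takes items off the attemptToDelete queue and deletes them according to the Foreground or Background policy
Updating the resource object dependency graph
concurrentUIDToNode uses a map, keyed by UID, to record the resource objects in the dependency graph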
-
type concurrentUIDToNode struct {
	uidToNodeLock sync.RWMutex
	uidToNode     map[types.UID]*node
}
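A map guarded by a read-write lock is typically wrapped in small accessor methods. The sketch below shows one way to do that. `uidToNodeMap`, its methods, and the simplified `node` are illustrative stand-ins, not a copy of the upstream code.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a simplified stand-in for the graph node type.
type node struct{ uid string }

// uidToNodeMap is a hypothetical mirror of concurrentUIDToNode.
type uidToNodeMap struct {
	mu    sync.RWMutex
	nodes map[string]*node
}

func newUIDToNodeMap() *uidToNodeMap {
	return &uidToNodeMap{nodes: make(map[string]*node)}
}

// Write stores or overwrites a node under an exclusive lock.
func (m *uidToNodeMap) Write(n *node) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.nodes[n.uid] = n
}

// Read looks a node up under a shared lock, so readers do not block each other.
func (m *uidToNodeMap) Read(uid string) (*node, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	n, ok := m.nodes[uid]
	return n, ok
}

// Delete removes a node under an exclusive lock.
func (m *uidToNodeMap) Delete(uid string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.nodes, uid)
}

func main() {
	m := newUIDToNodeMap()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.Write(&node{uid: fmt.Sprintf("uid-%d", i)})
		}(i)
	}
	wg.Wait()
	_, ok := m.Read("uid-2")
	fmt.Println("found before delete:", ok)
	m.Delete("uid-2")
	_, ok = m.Read("uid-2")
	fmt.Println("found after delete:", ok)
}
```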
The node struct:
-
// The single-threaded GraphBuilder.processGraphChanges() is the sole writer of the
// nodes. The multi-threaded GarbageCollector.attemptToDeleteItem() reads the nodes.
// WARNING: node has different locks on different fields. setters and getters
// use the respective locks, so the return values of the getters can be
// inconsistent.
type node struct {
	// the resource object itself
	identity objectReference
	// dependents will be read by the orphan() routine, we need to protect it with a lock.
	dependentsLock sync.RWMutex
	// dependents are the nodes that have node.identity as a
	// metadata.ownerReference.
	// the dependents (child resource objects) of this resource object
	dependents map[*node]struct{}
	// this is set by processGraphChanges() if the object has non-nil DeletionTimestamp
	// and has the FinalizerDeleteDependents.
	deletingDependents     bool
	deletingDependentsLock sync.RWMutex
	// this records if the object's deletionTimestamp is non-nil.
	beingDeleted     bool
	beingDeletedLock sync.RWMutex
	// this records if the object was constructed virtually and never observed via informer event
	virtual     bool
	virtualLock sync.RWMutex
	// when processing an Update event, we need to compare the updated
	// ownerReferences with the owners recorded in the graph.
	owners []metav1.OwnerReference
}
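How the dependency graph is updated:
Event is Add or Update and the resource object is not yet in the dependency graph
gb.insertNode
The GC Controller creates a node and inserts it into the dependency graph. Insertion does more than add the node to the graph: the node's owners are updated as well, so that each owner records the node as a dependent
gb.processTransitions
If the resource object has the Orphan Finalizer and a non-nil DeletionTimestamp, it is being deleted with the Orphan policy, and the node is placed on the attemptToOrphan work queue
Event is Add or Update and the resource object is already in the dependency graph
referencesDiffs
Compares the owners recorded on the node with the resource object's current OwnerReferences to find the added, removed, and changed owners. The changes are then handled as follows:
gb.addUnblockedOwnersToDeleteQueue
For removed owners, the GC Controller checks whether the current resource object was previously blocking the foreground deletion of those owners
gb.addDependentToOwners
If the resource object has gained owners, the nodes of those owners are updated in the dependency graph: the resource object is added to the dependents of each new owner's node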
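The owner comparison described above can be sketched as a diff keyed by UID. This is a simplified illustration under the assumption that UIDs are unique within each list. `ownerRef` and `diffOwners` are hypothetical names, and the upstream function works on `metav1.OwnerReference` values instead.

```go
package main

import "fmt"

// ownerRef is a simplified, comparable stand-in for an owner reference.
type ownerRef struct {
	UID                string
	Name               string
	BlockOwnerDeletion bool
}

// diffOwners reports which owners were added, removed, or changed between the
// owners recorded in the graph (old) and the object's current owners (cur).
func diffOwners(old, cur []ownerRef) (added, removed, changed []ownerRef) {
	oldByUID := make(map[string]ownerRef, len(old))
	for _, r := range old {
		oldByUID[r.UID] = r
	}
	seen := make(map[string]bool, len(cur))
	for _, r := range cur {
		seen[r.UID] = true
		prev, existed := oldByUID[r.UID]
		switch {
		case !existed:
			added = append(added, r)
		case prev != r:
			changed = append(changed, r)
		}
	}
	// Walk old in its original order so the result is deterministic.
	for _, r := range old {
		if !seen[r.UID] {
			removed = append(removed, r)
		}
	}
	return added, removed, changed
}

func main() {
	old := []ownerRef{{UID: "a", Name: "A", BlockOwnerDeletion: true}, {UID: "b", Name: "B"}}
	cur := []ownerRef{{UID: "b", Name: "B", BlockOwnerDeletion: true}, {UID: "c", Name: "C"}}
	added, removed, changed := diffOwners(old, cur)
	fmt.Println("added:", added)
	fmt.Println("removed:", removed)
	fmt.Println("changed:", changed)
}
```

In this example owner `a` was removed while it had BlockOwnerDeletion set, which is the kind of case where a previously blocked owner may need to be re-examined.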
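markBeingDeleted
If the resource object's DeletionTimestamp is non-nil, the node's beingDeleted is set to true. This flag affects how the later deletion steps run
gb.processTransitions
Based on the resource object's DeletionTimestamp and Finalizers, adds it to attemptToDelete or attemptToOrphan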
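The routing decision can be sketched as a pure function over the old and new object state. This is a simplified model, assuming an object only needs routing at the moment it first appears as "being deleted with the relevant finalizer". `objState`, `deletionStartsWith`, and `route` are hypothetical names.

```go
package main

import "fmt"

const (
	finalizerOrphan     = "orphan"
	finalizerForeground = "foregroundDeletion"
)

// objState is a hypothetical reduction of an object to the two facts that matter here.
type objState struct {
	deleting   bool     // DeletionTimestamp is non-nil
	finalizers []string // metadata.finalizers
}

// pending reports whether the object is being deleted and still carries the finalizer.
func (s objState) pending(finalizer string) bool {
	if !s.deleting {
		return false
	}
	for _, f := range s.finalizers {
		if f == finalizer {
			return true
		}
	}
	return false
}

// deletionStartsWith is true only on the transition into the pending state,
// so an object is not re-queued on every later update.
func deletionStartsWith(old *objState, cur objState, finalizer string) bool {
	if !cur.pending(finalizer) {
		return false
	}
	return old == nil || !old.pending(finalizer)
}

// route names the work queue the node should go to, or "" for none.
func route(old *objState, cur objState) string {
	switch {
	case deletionStartsWith(old, cur, finalizerOrphan):
		return "attemptToOrphan"
	case deletionStartsWith(old, cur, finalizerForeground):
		return "attemptToDelete"
	default:
		return ""
	}
}

func main() {
	live := objState{finalizers: []string{finalizerOrphan}}
	dying := objState{deleting: true, finalizers: []string{finalizerOrphan}}
	fmt.Printf("%q\n", route(&live, dying))  // transition into orphan deletion
	fmt.Printf("%q\n", route(&dying, dying)) // already pending, nothing new to do
	fmt.Printf("%q\n", route(nil, objState{deleting: true, finalizers: []string{finalizerForeground}}))
}
```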
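Event is Delete
gb.removeNode
The GC Controller removes the current node from the dependency graph. As the inverse of insertNode, it also removes the node from the dependents of its owners
markBeingDeleted
Based on the resource object's DeletionTimestamp and Finalizers, adds it to the attemptToDelete or attemptToOrphan work queue
existingNode.dependents
Adds the dependents of the current node to the attemptToDelete work queue
If a resource object is deleted with the Background policy, this is the point at which each of its dependents is added to the attemptToDelete work queue
existingNode.owners
Checks each owner of the current node; if an owner is in foreground deletion, that owner is also added to the attemptToDelete work queue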
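The Delete branch can be sketched with a toy graph. All types and names here (`graph`, `handleDelete`, a string slice standing in for the work queue) are assumptions made for illustration. Dependents are sorted only to keep the example output stable.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a simplified graph node keyed by UID.
type node struct {
	uid                string
	ownerUIDs          []string
	dependents         map[string]*node
	deletingDependents bool // owner is in foreground deletion
}

// graph is a hypothetical stand-in for the dependency graph plus one queue.
type graph struct {
	nodes           map[string]*node
	attemptToDelete []string
}

// add inserts a node and registers it as a dependent of its known owners.
func (g *graph) add(n *node) {
	if n.dependents == nil {
		n.dependents = make(map[string]*node)
	}
	g.nodes[n.uid] = n
	for _, o := range n.ownerUIDs {
		if owner, ok := g.nodes[o]; ok {
			owner.dependents[n.uid] = n
		}
	}
}

// handleDelete removes the node, then queues its dependents and any owner
// that is waiting on its dependents (foreground deletion).
func (g *graph) handleDelete(uid string) {
	n, ok := g.nodes[uid]
	if !ok {
		return
	}
	delete(g.nodes, uid)
	for _, o := range n.ownerUIDs {
		if owner, ok := g.nodes[o]; ok {
			delete(owner.dependents, uid)
		}
	}
	deps := make([]string, 0, len(n.dependents))
	for d := range n.dependents {
		deps = append(deps, d)
	}
	sort.Strings(deps)
	g.attemptToDelete = append(g.attemptToDelete, deps...)
	for _, o := range n.ownerUIDs {
		if owner, ok := g.nodes[o]; ok && owner.deletingDependents {
			g.attemptToDelete = append(g.attemptToDelete, o)
		}
	}
}

func main() {
	g := &graph{nodes: make(map[string]*node)}
	g.add(&node{uid: "deploy", deletingDependents: true})
	g.add(&node{uid: "rs", ownerUIDs: []string{"deploy"}})
	g.add(&node{uid: "pod", ownerUIDs: []string{"rs"}})
	g.handleDelete("rs")
	fmt.Println(g.attemptToDelete) // [pod deploy]
}
```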
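Orphan deletion
The Orphan deletion policy is carried out in runAttemptToOrphanWorker
The flow is as follows:
gc.orphanDependents
Severs the relationship between the resource object and its dependents.
gc.removeFinalizer
Removes the Orphan Finalizer.
The Finalizer gives the GC Controller a chance to clean up the resource object's ownership relationships before the object is deleted.
Once that cleanup is complete, the GC Controller removes the Orphan Finalizer, after which the resource object is actually deleted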
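The two steps can be sketched on in-memory objects. This is an illustrative model only: `object`, `removeString`, and `orphan` are hypothetical names, and the real controller performs these changes through API calls against each object.

```go
package main

import "fmt"

// object is a hypothetical reduction of a resource to the fields involved here.
type object struct {
	uid        string
	ownerUIDs  []string
	finalizers []string
}

// removeString returns a copy of list without any entry equal to s.
func removeString(list []string, s string) []string {
	var out []string
	for _, v := range list {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

// orphan first detaches every dependent from the owner, and only then
// removes the owner's "orphan" finalizer so the owner can be deleted.
func orphan(owner *object, dependents []*object) {
	for _, d := range dependents {
		d.ownerUIDs = removeString(d.ownerUIDs, owner.uid)
	}
	owner.finalizers = removeString(owner.finalizers, "orphan")
}

func main() {
	rs := &object{uid: "rs-1", finalizers: []string{"orphan"}}
	pods := []*object{
		{uid: "pod-a", ownerUIDs: []string{"rs-1"}},
		{uid: "pod-b", ownerUIDs: []string{"rs-1", "other"}},
	}
	orphan(rs, pods)
	fmt.Println(len(rs.finalizers), pods[0].ownerUIDs, pods[1].ownerUIDs)
}
```

The ordering matters: if the Finalizer were removed first, the owner could disappear while dependents still pointed at it, and they would then be collected as dangling.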
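Cascading deletion
Cascading deletion is carried out in runAttemptToDeleteWorker
The main flow is as follows:
item.isBeingDeleted() && !item.isDeletingDependents()
Checks whether the resource object is already being deleted and is not in foreground deletion
item.isDeletingDependents()
Checks whether the resource is in foreground deletion; if so, foreground deletion is performed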
-
// process item that's waiting for its dependents to be deleted
func (gc *GarbageCollector) processDeletingDependentsItem(item *node) error {
	blockingDependents := item.blockingDependents()
	if len(blockingDependents) == 0 {
		klog.V(2).Infof("remove DeleteDependents finalizer for item %s", item.identity)
		return gc.removeFinalizer(item, metav1.FinalizerDeleteDependents)
	}
	for _, dep := range blockingDependents {
		if !dep.isDeletingDependents() {
			klog.V(2).Infof("adding %s to attemptToDelete, because its owner %s is deletingDependents", dep.identity, item.identity)
			gc.attemptToDelete.Add(dep)
		}
	}
	return nil
}
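The function above relies on `item.blockingDependents()`. A plausible reading, sketched here with simplified types, is that a dependent blocks its owner when one of its owner references points at that owner with BlockOwnerDeletion set to true. The `node` and `ownerRef` types below are assumptions, not the upstream definitions.

```go
package main

import "fmt"

// ownerRef keeps only the two fields relevant to blocking.
type ownerRef struct {
	UID                string
	BlockOwnerDeletion *bool
}

// node is a simplified graph node; dependents is a slice to keep output ordered.
type node struct {
	uid        string
	owners     []ownerRef
	dependents []*node
}

// blockingDependents returns the dependents that reference n with
// BlockOwnerDeletion == true. A nil pointer counts as false.
func (n *node) blockingDependents() []*node {
	var blocking []*node
	for _, dep := range n.dependents {
		for _, ref := range dep.owners {
			if ref.UID == n.uid && ref.BlockOwnerDeletion != nil && *ref.BlockOwnerDeletion {
				blocking = append(blocking, dep)
				break
			}
		}
	}
	return blocking
}

func main() {
	yes := true
	owner := &node{uid: "rs"}
	blocker := &node{uid: "pod-a", owners: []ownerRef{{UID: "rs", BlockOwnerDeletion: &yes}}}
	free := &node{uid: "pod-b", owners: []ownerRef{{UID: "rs"}}}
	owner.dependents = []*node{blocker, free}
	for _, d := range owner.blockingDependents() {
		fmt.Println(d.uid) // pod-a
	}
}
```

Once this list is empty, nothing holds the owner back, which is when the foregroundDeletion Finalizer is removed.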
-
gc.classifyReferences
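Classifies the owners listed in the current resource object's OwnerReferences to decide how cascading deletion proceeds
gc.classifyReferences sorts them into three categories:
- Solid: owners that exist and are not waiting for other resource objects to finish foreground deletion
- WaitingForDependentsDeletion: owners that are waiting for other resource objects to finish foreground deletion
- Dangling: owners that no longer exist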
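The three categories can be sketched as a classification over a lookup table of live owners. This is a simplified model: the `live` map stands in for looking each owner up in the cluster, and `ownerState` and `classify` are hypothetical names.

```go
package main

import "fmt"

// ownerState is a hypothetical snapshot of an owner that still exists.
type ownerState struct {
	deleting   bool     // DeletionTimestamp is non-nil
	finalizers []string // metadata.finalizers
}

func hasFinalizer(list []string, f string) bool {
	for _, v := range list {
		if v == f {
			return true
		}
	}
	return false
}

// classify splits owner UIDs into solid, dangling, and waiting
// (waiting for dependents to finish foreground deletion).
func classify(ownerUIDs []string, live map[string]ownerState) (solid, dangling, waiting []string) {
	for _, uid := range ownerUIDs {
		st, exists := live[uid]
		switch {
		case !exists:
			dangling = append(dangling, uid)
		case st.deleting && hasFinalizer(st.finalizers, "foregroundDeletion"):
			waiting = append(waiting, uid)
		default:
			solid = append(solid, uid)
		}
	}
	return solid, dangling, waiting
}

func main() {
	live := map[string]ownerState{
		"rs-ok":   {},
		"rs-wait": {deleting: true, finalizers: []string{"foregroundDeletion"}},
	}
	solid, dangling, waiting := classify([]string{"rs-ok", "rs-gone", "rs-wait"}, live)
	fmt.Println("solid:", solid)
	fmt.Println("dangling:", dangling)
	fmt.Println("waiting:", waiting)
}
```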