K8s Service Discovery and Load Balancing: Service

Based on Kubernetes 1.25

What is a Service

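In K8s, Pods can die at any time and the set of Pods managed by a controller keeps changing, so a caller cannot know which IP to send requests to. Service exists to solve this problem and provides the following capabilities:

  • Each Pod has its own IP address
  • Each Service is assigned a unique DNS name
  • A Service selects a group of Pods through a label selector
  • A Service provides load balancing and spreads requests across that group of Pods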
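
The label-selector step can be sketched in a few lines of Go. This is only an illustration of equality-based matching: the `pod` type, the helper names, and the sample IPs are hypothetical and are not taken from the Kubernetes code base.

```go
package main

import "fmt"

// pod is a simplified, hypothetical stand-in for a Pod object.
type pod struct {
	name   string
	ip     string
	labels map[string]string
}

// matches reports whether the labels satisfy an equality-based selector:
// every key/value pair in the selector must be present in the labels.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// selectPods returns the IPs of the Pods whose labels match the selector.
// An empty selector selects nothing here, mirroring the rule that a Service
// without a selector has its endpoints managed outside Kubernetes.
func selectPods(selector map[string]string, pods []pod) []string {
	if len(selector) == 0 {
		return nil
	}
	var ips []string
	for _, p := range pods {
		if matches(selector, p.labels) {
			ips = append(ips, p.ip)
		}
	}
	return ips
}

func main() {
	pods := []pod{
		{"web-1", "10.0.0.1", map[string]string{"app": "web", "tier": "frontend"}},
		{"web-2", "10.0.0.2", map[string]string{"app": "web"}},
		{"db-1", "10.0.0.3", map[string]string{"app": "db"}},
	}
	fmt.Println(selectPods(map[string]string{"app": "web"}, pods))
}
```

The returned IPs correspond to the endpoint set that requests to the Service are spread across.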

ServiceSpec

// ServiceSpec describes the attributes that a user creates on a service.
type ServiceSpec struct {
// The list of ports that are exposed by this service.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +patchMergeKey=port
// +patchStrategy=merge
// +listType=map
// +listMapKey=port
// +listMapKey=protocol
// The list of ports that the Service exposes.
Ports []ServicePort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"port" protobuf:"bytes,1,rep,name=ports"`

// Route service traffic to pods with label keys and values matching this
// selector. If empty or not present, the service is assumed to have an
// external process managing its endpoints, which Kubernetes will not
// modify. Only applies to types ClusterIP, NodePort, and LoadBalancer.
// Ignored if type is ExternalName.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/
// +optional
// +mapType=atomic
// Label selector.
// Does not apply to ExternalName.
Selector map[string]string `json:"selector,omitempty" protobuf:"bytes,2,rep,name=selector"`

// clusterIP is the IP address of the service and is usually assigned
// randomly. If an address is specified manually, is in-range (as per
// system configuration), and is not in use, it will be allocated to the
// service; otherwise creation of the service will fail. This field may not
// be changed through updates unless the type field is also being changed
// to ExternalName (which requires this field to be blank) or the type
// field is being changed from ExternalName (in which case this field may
// optionally be specified, as describe above). Valid values are "None",
// empty string (""), or a valid IP address. Setting this to "None" makes a
// "headless service" (no virtual IP), which is useful when direct endpoint
// connections are preferred and proxying is not required. Only applies to
// types ClusterIP, NodePort, and LoadBalancer. If this field is specified
// when creating a Service of type ExternalName, creation will fail. This
// field will be wiped when updating a Service to type ExternalName.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +optional
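// The Service's IP address, usually assigned randomly. A manually specified IP is allocated if it is in range and not in use; otherwise Service creation fails.
// This field cannot be changed through updates unless the type is also being changed to ExternalName (which requires this field to be empty) or from ExternalName.
// Valid values are "None", the empty string (""), or a valid IP address.
// Setting it to "None" creates a headless Service, which is useful when clients connect to endpoints directly and no proxying is needed.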
ClusterIP string `json:"clusterIP,omitempty" protobuf:"bytes,3,opt,name=clusterIP"`

// ClusterIPs is a list of IP addresses assigned to this service, and are
// usually assigned randomly. If an address is specified manually, is
// in-range (as per system configuration), and is not in use, it will be
// allocated to the service; otherwise creation of the service will fail.
// This field may not be changed through updates unless the type field is
// also being changed to ExternalName (which requires this field to be
// empty) or the type field is being changed from ExternalName (in which
// case this field may optionally be specified, as describe above). Valid
// values are "None", empty string (""), or a valid IP address. Setting
// this to "None" makes a "headless service" (no virtual IP), which is
// useful when direct endpoint connections are preferred and proxying is
// not required. Only applies to types ClusterIP, NodePort, and
// LoadBalancer. If this field is specified when creating a Service of type
// ExternalName, creation will fail. This field will be wiped when updating
// a Service to type ExternalName. If this field is not specified, it will
// be initialized from the clusterIP field. If this field is specified,
// clients must ensure that clusterIPs[0] and clusterIP have the same
// value.
//
// This field may hold a maximum of two entries (dual-stack IPs, in either order).
// These IPs must correspond to the values of the ipFamilies field. Both
// clusterIPs and ipFamilies are governed by the ipFamilyPolicy field.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +listType=atomic
// +optional
// The Service's IP addresses; at most two entries (dual-stack IPs).
// Both clusterIPs and ipFamilies are governed by the ipFamilyPolicy field.
ClusterIPs []string `json:"clusterIPs,omitempty" protobuf:"bytes,18,opt,name=clusterIPs"`

// type determines how the Service is exposed. Defaults to ClusterIP. Valid
// options are ExternalName, ClusterIP, NodePort, and LoadBalancer.
// "ClusterIP" allocates a cluster-internal IP address for load-balancing
// to endpoints. Endpoints are determined by the selector or if that is not
// specified, by manual construction of an Endpoints object or
// EndpointSlice objects. If clusterIP is "None", no virtual IP is
// allocated and the endpoints are published as a set of endpoints rather
// than a virtual IP.
// "NodePort" builds on ClusterIP and allocates a port on every node which
// routes to the same endpoints as the clusterIP.
// "LoadBalancer" builds on NodePort and creates an external load-balancer
// (if supported in the current cloud) which routes to the same endpoints
// as the clusterIP.
// "ExternalName" aliases this service to the specified externalName.
// Several other fields do not apply to ExternalName services.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
// +optional
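// How the Service is exposed; defaults to ClusterIP.
// Valid values: ExternalName, ClusterIP, NodePort, LoadBalancer.
// ClusterIP allocates a cluster-internal IP; endpoints are chosen by the label selector, or by manually created Endpoints or EndpointSlice objects.
// If clusterIP is "None", no virtual IP is allocated.
// NodePort builds on ClusterIP and allocates a port on every node that routes to the same endpoints.
// LoadBalancer builds on NodePort and creates an external load balancer that routes to the same endpoints.
// ExternalName aliases this Service to the specified externalName.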
Type ServiceType `json:"type,omitempty" protobuf:"bytes,4,opt,name=type,casttype=ServiceType"`

// externalIPs is a list of IP addresses for which nodes in the cluster
// will also accept traffic for this service. These IPs are not managed by
// Kubernetes. The user is responsible for ensuring that traffic arrives
// at a node with this IP. A common example is external load-balancers
// that are not part of the Kubernetes system.
// +optional
// A list of IP addresses on which cluster nodes also accept traffic for this Service.
// These IPs are not managed by K8s.
ExternalIPs []string `json:"externalIPs,omitempty" protobuf:"bytes,5,rep,name=externalIPs"`

// Supports "ClientIP" and "None". Used to maintain session affinity.
// Enable client IP based session affinity.
// Must be ClientIP or None.
// Defaults to None.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +optional
// Maintains session affinity; supports ClientIP and None.
// Defaults to None.
SessionAffinity ServiceAffinity `json:"sessionAffinity,omitempty" protobuf:"bytes,7,opt,name=sessionAffinity,casttype=ServiceAffinity"`

// Only applies to Service Type: LoadBalancer.
// This feature depends on whether the underlying cloud-provider supports specifying
// the loadBalancerIP when a load balancer is created.
// This field will be ignored if the cloud-provider does not support the feature.
// Deprecated: This field was under-specified and its meaning varies across implementations,
// and it cannot support dual-stack.
// As of Kubernetes v1.24, users are encouraged to use implementation-specific annotations when available.
// This field may be removed in a future API version.
// +optional
// Applies only to Services of type LoadBalancer.
// Deprecated; may be removed in a future API version.
LoadBalancerIP string `json:"loadBalancerIP,omitempty" protobuf:"bytes,8,opt,name=loadBalancerIP"`

// If specified and supported by the platform, this will restrict traffic through the cloud-provider
// load-balancer will be restricted to the specified client IPs. This field will be ignored if the
// cloud-provider does not support the feature."
// More info: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/
// +optional
// Restricts traffic through the load balancer to the specified client IPs.
LoadBalancerSourceRanges []string `json:"loadBalancerSourceRanges,omitempty" protobuf:"bytes,9,opt,name=loadBalancerSourceRanges"`

// externalName is the external reference that discovery mechanisms will
// return as an alias for this service (e.g. a DNS CNAME record). No
// proxying will be involved. Must be a lowercase RFC-1123 hostname
// (https://tools.ietf.org/html/rfc1123) and requires `type` to be "ExternalName".
// +optional
// The external reference that discovery mechanisms return as an alias for this Service.
ExternalName string `json:"externalName,omitempty" protobuf:"bytes,10,opt,name=externalName"`

// externalTrafficPolicy describes how nodes distribute service traffic they
// receive on one of the Service's "externally-facing" addresses (NodePorts,
// ExternalIPs, and LoadBalancer IPs). If set to "Local", the proxy will configure
// the service in a way that assumes that external load balancers will take care
// of balancing the service traffic between nodes, and so each node will deliver
// traffic only to the node-local endpoints of the service, without masquerading
// the client source IP. (Traffic mistakenly sent to a node with no endpoints will
// be dropped.) The default value, "Cluster", uses the standard behavior of
// routing to all endpoints evenly (possibly modified by topology and other
// features). Note that traffic sent to an External IP or LoadBalancer IP from
// within the cluster will always get "Cluster" semantics, but clients sending to
// a NodePort from within the cluster may need to take traffic policy into account
// when picking a node.
// +optional
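// Describes how nodes distribute traffic received on the Service's externally facing addresses.
// Local: assumes an external load balancer balances traffic between nodes, so each node delivers traffic only to its node-local endpoints.
// Cluster (default): routes traffic evenly to all endpoints.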
ExternalTrafficPolicy ServiceExternalTrafficPolicyType `json:"externalTrafficPolicy,omitempty" protobuf:"bytes,11,opt,name=externalTrafficPolicy"`

// healthCheckNodePort specifies the healthcheck nodePort for the service.
// This only applies when type is set to LoadBalancer and
// externalTrafficPolicy is set to Local. If a value is specified, is
// in-range, and is not in use, it will be used. If not specified, a value
// will be automatically allocated. External systems (e.g. load-balancers)
// can use this port to determine if a given node holds endpoints for this
// service or not. If this field is specified when creating a Service
// which does not need it, creation will fail. This field will be wiped
// when updating a Service to no longer need it (e.g. changing type).
// This field cannot be updated once set.
// +optional
// The health check node port for the Service.
// Applies only when type is LoadBalancer and externalTrafficPolicy is Local.
HealthCheckNodePort int32 `json:"healthCheckNodePort,omitempty" protobuf:"bytes,12,opt,name=healthCheckNodePort"`

// publishNotReadyAddresses indicates that any agent which deals with endpoints for this
// Service should disregard any indications of ready/not-ready.
// The primary use case for setting this field is for a StatefulSet's Headless Service to
// propagate SRV DNS records for its Pods for the purpose of peer discovery.
// The Kubernetes controllers that generate Endpoints and EndpointSlice resources for
// Services interpret this to mean that all endpoints are considered "ready" even if the
// Pods themselves are not. Agents which consume only Kubernetes generated endpoints
// through the Endpoints or EndpointSlice resources can safely assume this behavior.
// +optional
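// Whether to ignore Pod readiness when publishing endpoints; defaults to false.
// false (default): a Pod must be ready before it is added to the endpoints.
// true: a Pod is added to the endpoints even if it is not ready.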
PublishNotReadyAddresses bool `json:"publishNotReadyAddresses,omitempty" protobuf:"varint,13,opt,name=publishNotReadyAddresses"`

// sessionAffinityConfig contains the configurations of session affinity.
// +optional
// Session affinity configuration.
SessionAffinityConfig *SessionAffinityConfig `json:"sessionAffinityConfig,omitempty" protobuf:"bytes,14,opt,name=sessionAffinityConfig"`

// TopologyKeys is tombstoned to show why 16 is reserved protobuf tag.
//TopologyKeys []string `json:"topologyKeys,omitempty" protobuf:"bytes,16,opt,name=topologyKeys"`

// IPFamily is tombstoned to show why 15 is a reserved protobuf tag.
// IPFamily *IPFamily `json:"ipFamily,omitempty" protobuf:"bytes,15,opt,name=ipFamily,Configcasttype=IPFamily"`

// IPFamilies is a list of IP families (e.g. IPv4, IPv6) assigned to this
// service. This field is usually assigned automatically based on cluster
// configuration and the ipFamilyPolicy field. If this field is specified
// manually, the requested family is available in the cluster,
// and ipFamilyPolicy allows it, it will be used; otherwise creation of
// the service will fail. This field is conditionally mutable: it allows
// for adding or removing a secondary IP family, but it does not allow
// changing the primary IP family of the Service. Valid values are "IPv4"
// and "IPv6". This field only applies to Services of types ClusterIP,
// NodePort, and LoadBalancer, and does apply to "headless" services.
// This field will be wiped when updating a Service to type ExternalName.
//
// This field may hold a maximum of two entries (dual-stack families, in
// either order). These families must correspond to the values of the
// clusterIPs field, if specified. Both clusterIPs and ipFamilies are
// governed by the ipFamilyPolicy field.
// +listType=atomic
// +optional
// The list of IP families (for example, IPv4 and IPv6).
IPFamilies []IPFamily `json:"ipFamilies,omitempty" protobuf:"bytes,19,opt,name=ipFamilies,casttype=IPFamily"`

// IPFamilyPolicy represents the dual-stack-ness requested or required by
// this Service. If there is no value provided, then this field will be set
// to SingleStack. Services can be "SingleStack" (a single IP family),
// "PreferDualStack" (two IP families on dual-stack configured clusters or
// a single IP family on single-stack clusters), or "RequireDualStack"
// (two IP families on dual-stack configured clusters, otherwise fail). The
// ipFamilies and clusterIPs fields depend on the value of this field. This
// field will be wiped when updating a service to type ExternalName.
// +optional
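// The dual-stack behavior requested or required by this Service.
// Defaults to SingleStack.
// SingleStack: a single IP family.
// PreferDualStack: two IP families on dual-stack clusters, a single IP family on single-stack clusters.
// RequireDualStack: two IP families on dual-stack clusters; otherwise creation fails.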
IPFamilyPolicy *IPFamilyPolicy `json:"ipFamilyPolicy,omitempty" protobuf:"bytes,17,opt,name=ipFamilyPolicy,casttype=IPFamilyPolicy"`

// allocateLoadBalancerNodePorts defines if NodePorts will be automatically
// allocated for services with type LoadBalancer. Default is "true". It
// may be set to "false" if the cluster load-balancer does not rely on
// NodePorts. If the caller requests specific NodePorts (by specifying a
// value), those requests will be respected, regardless of this field.
// This field may only be set for services with type LoadBalancer and will
// be cleared if the type is changed to any other type.
// +optional
// Whether NodePorts are automatically allocated for Services of type LoadBalancer; defaults to true.
AllocateLoadBalancerNodePorts *bool `json:"allocateLoadBalancerNodePorts,omitempty" protobuf:"bytes,20,opt,name=allocateLoadBalancerNodePorts"`

// loadBalancerClass is the class of the load balancer implementation this Service belongs to.
// If specified, the value of this field must be a label-style identifier, with an optional prefix,
// e.g. "internal-vip" or "example.com/internal-vip". Unprefixed names are reserved for end-users.
// This field can only be set when the Service type is 'LoadBalancer'. If not set, the default load
// balancer implementation is used, today this is typically done through the cloud provider integration,
// but should apply for any default implementation. If set, it is assumed that a load balancer
// implementation is watching for Services with a matching class. Any default load balancer
// implementation (e.g. cloud providers) should ignore Services that set this field.
// This field can only be set when creating or updating a Service to type 'LoadBalancer'.
// Once set, it can not be changed. This field will be wiped when a service is updated to a non 'LoadBalancer' type.
// +featureGate=LoadBalancerClass
// +optional
// The class of the load balancer implementation this Service belongs to.
LoadBalancerClass *string `json:"loadBalancerClass,omitempty" protobuf:"bytes,21,opt,name=loadBalancerClass"`

// InternalTrafficPolicy describes how nodes distribute service traffic they
// receive on the ClusterIP. If set to "Local", the proxy will assume that pods
// only want to talk to endpoints of the service on the same node as the pod,
// dropping the traffic if there are no local endpoints. The default value,
// "Cluster", uses the standard behavior of routing to all endpoints evenly
// (possibly modified by topology and other features).
// +featureGate=ServiceInternalTrafficPolicy
// +optional
// Describes how nodes distribute the traffic they receive on the ClusterIP.
InternalTrafficPolicy *ServiceInternalTrafficPolicyType `json:"internalTrafficPolicy,omitempty" protobuf:"bytes,22,opt,name=internalTrafficPolicy"`
}
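
The relationship between clusterIP and clusterIPs described in the field comments can be sketched as a small check. This is an illustrative helper written for these notes, not the validation code the API server runs; the function names and sample addresses are assumptions, and the handling of "None" as a sole entry is a simplification.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// isIPv4 reports whether the parsed address belongs to the IPv4 family.
func isIPv4(ip net.IP) bool { return ip.To4() != nil }

// validateClusterIPs checks the rules stated in the field comments:
// at most two entries, clusterIPs[0] equal to clusterIP, and for two
// entries one address per IP family (dual-stack, in either order).
func validateClusterIPs(clusterIP string, clusterIPs []string) error {
	if len(clusterIPs) == 0 {
		return nil // when unset, clusterIPs is initialized from clusterIP
	}
	if len(clusterIPs) > 2 {
		return errors.New("clusterIPs may hold at most two entries")
	}
	if clusterIPs[0] != clusterIP {
		return errors.New("clusterIPs[0] must equal clusterIP")
	}
	if clusterIPs[0] == "None" {
		// Headless Service; this sketch accepts "None" only as a sole entry.
		if len(clusterIPs) > 1 {
			return errors.New(`"None" must be the only entry`)
		}
		return nil
	}
	parsed := make([]net.IP, 0, len(clusterIPs))
	for _, s := range clusterIPs {
		ip := net.ParseIP(s)
		if ip == nil {
			return fmt.Errorf("%q is not a valid IP address", s)
		}
		parsed = append(parsed, ip)
	}
	if len(parsed) == 2 && isIPv4(parsed[0]) == isIPv4(parsed[1]) {
		return errors.New("two entries must be one IPv4 and one IPv6 address")
	}
	return nil
}

func main() {
	fmt.Println(validateClusterIPs("10.96.0.10", []string{"10.96.0.10", "fd00::10"}))
	fmt.Println(validateClusterIPs("10.96.0.10", []string{"10.96.0.11"}))
}
```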

Service types

Service Type currently supports four types: ClusterIP, NodePort, LoadBalancer, and ExternalName.

ClusterIP

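The default Service type.

  • The basis for NodePort and LoadBalancer
  • Uses its own dedicated IP range
  • Traffic first reaches the VIP and is then forwarded by kube-proxy to the individual Pods
  • In ipvs mode, the load-balancing forwarding rules can be inspected with the ipvsadm command
  • A ClusterIP Service gets DNS records for its domain name, including the corresponding SRV records
  • Advantage: the VIP sits in front of the Pods, so clients do not have to resolve Pod addresses directly through DNS
  • Disadvantage: under heavy request volume kube-proxy can become a performance bottleneck, and load balancing may not take effect for long-lived connections such as gRPC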
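
The long-lived connection drawback follows from the backend being chosen when a connection is opened, not per request. The simulation below is a rough sketch of that effect under simplifying assumptions: it uses plain round-robin at connect time, whereas kube-proxy's real selection depends on the proxy mode, and all names and IPs are made up for illustration.

```go
package main

import "fmt"

// balancer picks a backend once per new connection, in round-robin order.
// This is a simplification of what happens behind the VIP.
type balancer struct {
	backends []string
	next     int
}

// connect simulates opening a new connection through the VIP and returns
// the backend that the connection stays pinned to.
func (b *balancer) connect() string {
	backend := b.backends[b.next%len(b.backends)]
	b.next++
	return backend
}

// simulate spreads `requests` requests over `conns` connections and
// returns how many requests each backend received.
func simulate(backends []string, conns, requests int) map[string]int {
	b := &balancer{backends: backends}
	pinned := make([]string, conns)
	for i := range pinned {
		pinned[i] = b.connect()
	}
	counts := make(map[string]int)
	for r := 0; r < requests; r++ {
		counts[pinned[r%conns]]++
	}
	return counts
}

func main() {
	backends := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	fmt.Println(simulate(backends, 1, 6)) // one long-lived connection
	fmt.Println(simulate(backends, 6, 6)) // one connection per request
}
```

With a single long-lived connection every request lands on the same Pod, while one connection per request spreads them evenly, which is why multiplexed protocols such as gRPC often need client-side or L7 balancing.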

NodePort

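A NodePort Service exposes the Service on each node's IP address at a static port (the NodePort).

  • The K8s control plane allocates the port number from the range set by --service-node-port-range (default: 30000-32767)
  • The Service can be reached through any cluster node's IP plus the NodePort
  • A request to a NodePort Service goes from the node IP straight to a Pod without passing through the ClusterIP; the forwarding is still implemented by kube-proxy
  • The domain name of a NodePort Service resolves to a CLUSTER-IP, so in-cluster requests are load-balanced the same way as for a ClusterIP Service
  • Disadvantages:
    • The NodePort range is limited, and under heavy request volume kube-proxy becomes a bottleneck
    • Access is tightly bound to node IP addresses, which is a poor fit when node IPs change frequently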
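
To make the port-range constraint concrete, here is a minimal sketch that parses a range in the "min-max" form and checks whether a requested nodePort falls inside it. The helper names are hypothetical, and the sketch handles only the "min-max" form.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortRange parses a range written as "min-max", for example the
// default "30000-32767" of --service-node-port-range.
func parsePortRange(s string) (lo, hi int, err error) {
	a, b, ok := strings.Cut(s, "-")
	if !ok {
		return 0, 0, fmt.Errorf("invalid port range %q: expected min-max", s)
	}
	if lo, err = strconv.Atoi(strings.TrimSpace(a)); err != nil {
		return 0, 0, fmt.Errorf("invalid lower bound in %q: %w", s, err)
	}
	if hi, err = strconv.Atoi(strings.TrimSpace(b)); err != nil {
		return 0, 0, fmt.Errorf("invalid upper bound in %q: %w", s, err)
	}
	if lo < 1 || hi > 65535 || lo > hi {
		return 0, 0, fmt.Errorf("port range %q is out of bounds", s)
	}
	return lo, hi, nil
}

// inRange reports whether port lies within [lo, hi].
func inRange(port, lo, hi int) bool { return port >= lo && port <= hi }

func main() {
	lo, hi, err := parsePortRange("30000-32767")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, p := range []int{8080, 30080} {
		fmt.Printf("nodePort %d allowed: %v\n", p, inRange(p, lo, hi))
	}
}
```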

LoadBalancer

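A LoadBalancer Service exposes the Service externally through a cloud provider's load balancer.

The external load balancer can route traffic to the automatically created NodePort Service.

Open-source LoadBalancer implementations: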

Headless Service

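Setting spec.clusterIP to None creates a headless Service.

  • The Service's domain name then resolves to the IP addresses of all Pods associated with the Service
  • Requests made through that domain name reach the Pods directly, so load balancing is done by DNS alone rather than by K8s's kube-proxy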
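
Because a headless Service name resolves to Pod IPs, the client decides which Pod to use. The sketch below shows one possible client-side round-robin over the resolved addresses. The `picker` type and the Service name are hypothetical; the resolver is injected so the example runs outside a cluster, and inside a cluster `net.LookupHost` has the same signature and could be passed instead.

```go
package main

import (
	"errors"
	"fmt"
)

// resolver looks up the addresses behind a host name.
// net.LookupHost matches this signature.
type resolver func(host string) ([]string, error)

// picker does client-side round-robin over the Pod IPs that a headless
// Service name resolves to.
type picker struct {
	resolve resolver
	host    string
	next    int
}

// pick re-resolves the name on every call (no caching, for simplicity)
// and returns the next address in turn.
func (p *picker) pick() (string, error) {
	addrs, err := p.resolve(p.host)
	if err != nil {
		return "", err
	}
	if len(addrs) == 0 {
		return "", errors.New("no addresses for " + p.host)
	}
	addr := addrs[p.next%len(addrs)]
	p.next++
	return addr, nil
}

func main() {
	// Fake resolver standing in for cluster DNS; the IPs are example values.
	fake := func(string) ([]string, error) {
		return []string{"10.244.1.5", "10.244.2.7"}, nil
	}
	// Hypothetical headless Service "db" in namespace "default".
	p := &picker{resolve: fake, host: "db.default.svc.cluster.local"}
	for i := 0; i < 3; i++ {
		addr, err := p.pick()
		if err != nil {
			fmt.Println("lookup failed:", err)
			return
		}
		fmt.Println("connect directly to Pod", addr)
	}
}
```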

Differences between the port fields

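  • containerPort: applies to the container inside the Pod

    • Tells K8s which port the container serves on
  • nodePort:

    • Exists only on LoadBalancer and NodePort Services, and is used to reach the Service through a node
  • port: applies only to the CLUSTER-IP and EXTERNAL-IP

    • Used by LoadBalancer, NodePort, and ClusterIP Services
    • From outside the cluster, the Service is reachable at EXTERNAL-IP:port
    • From inside the cluster, the Service is reachable at ClusterIP:port
  • targetPort: the port on the Pod that traffic is forwarded to; the forwarding paths are:

      1. NodeIP:nodePort -> PodIP:targetPort
      2. CLUSTER-IP:port -> PodIP:targetPort
      3. EXTERNAL-IP:port -> PodIP:targetPort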

In practice, it is best to keep targetPort, containerPort, and the port the Pod actually listens on the same.
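
The three forwarding paths can be written out for a concrete port mapping. This is a small sketch with a simplified, hypothetical `servicePort` type; all addresses are example values, and it formats IPv4 addresses only.

```go
package main

import "fmt"

// servicePort is a simplified stand-in for the port fields discussed above.
type servicePort struct {
	port       int // Service port, used with CLUSTER-IP and EXTERNAL-IP
	targetPort int // port on the Pod that receives the traffic
	nodePort   int // port on every node; 0 when the Service has none
}

// forwardingPaths lists the forwarding paths for one Service port.
// A path is skipped when its entry point (nodePort or external IP) is absent.
func forwardingPaths(sp servicePort, nodeIP, clusterIP, externalIP, podIP string) []string {
	dst := fmt.Sprintf("%s:%d", podIP, sp.targetPort)
	var paths []string
	if sp.nodePort != 0 {
		paths = append(paths, fmt.Sprintf("%s:%d -> %s", nodeIP, sp.nodePort, dst))
	}
	paths = append(paths, fmt.Sprintf("%s:%d -> %s", clusterIP, sp.port, dst))
	if externalIP != "" {
		paths = append(paths, fmt.Sprintf("%s:%d -> %s", externalIP, sp.port, dst))
	}
	return paths
}

func main() {
	sp := servicePort{port: 80, targetPort: 8080, nodePort: 30080}
	for _, p := range forwardingPaths(sp, "192.168.1.10", "10.96.0.10", "203.0.113.5", "10.244.1.5") {
		fmt.Println(p)
	}
}
```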

DNS for Services and Pods

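K8s creates DNS records for Services and Pods, so Services can be reached through stable DNS names instead of IP addresses.

  • The kubelet configures each Pod's DNS so that running containers can look up Services by name rather than by IP address
  • Every Service in the cluster is assigned a DNS name; by default, a client Pod's DNS search list covers the Pod's own namespace and the cluster's default domain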
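
How the search list turns a short name into fully qualified candidates can be sketched as follows. The cluster domain "cluster.local" is assumed as the common default, the helper names are hypothetical, and the sketch leaves out resolver details such as the ndots option and search domains inherited from the node.

```go
package main

import (
	"fmt"
	"strings"
)

// searchList returns the cluster-related DNS search domains for a Pod in
// the given namespace: its own namespace first, then the cluster-wide ones.
func searchList(namespace, clusterDomain string) []string {
	return []string{
		namespace + ".svc." + clusterDomain,
		"svc." + clusterDomain,
		clusterDomain,
	}
}

// candidates expands a Service name into the fully qualified names a
// resolver would try, in search-list order.
func candidates(name, namespace, clusterDomain string) []string {
	if strings.HasSuffix(name, ".") {
		return []string{name} // already fully qualified
	}
	var out []string
	for _, d := range searchList(namespace, clusterDomain) {
		out = append(out, name+"."+d+".")
	}
	return out
}

func main() {
	// A Pod in namespace "prod" looking up "db" and the cross-namespace "db.other".
	fmt.Println(candidates("db", "prod", "cluster.local"))
	fmt.Println(candidates("db.other", "prod", "cluster.local"))
}
```

The second call shows why a Service in another namespace is addressed as `<service>.<namespace>`: the name only matches once the `svc.cluster.local` suffix is appended.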

ExternalTrafficPolicy

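ExternalTrafficPolicy indicates whether the Service wants external traffic routed to node-local endpoints or to endpoints across the whole cluster. It has two options:

  • Cluster: hides the client source IP and may add a second hop to another node, but spreads load well
    • Any node can accept the traffic, whether or not it hosts a Pod of the Service; if needed, the traffic is forwarded to another node and then to the target Pod
  • Local: preserves the client source IP and avoids the second hop for LoadBalancer and NodePort Services, at the risk of uneven load distribution
    • Only nodes that host a Pod of the Service can accept traffic; other nodes cannot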
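
The difference between the two policies comes down to which endpoints the receiving node may forward to. The following sketch models only that choice; the `endpoint` type, node names, and IPs are hypothetical, and source-IP handling is not simulated.

```go
package main

import "fmt"

// endpoint is a simplified, hypothetical view of a Service endpoint.
type endpoint struct {
	podIP string
	node  string
}

// eligible returns the endpoints that the node receiving external traffic
// may forward to under the given externalTrafficPolicy.
func eligible(policy, receivingNode string, eps []endpoint) []endpoint {
	if policy != "Local" {
		return eps // "Cluster": any endpoint, possibly on another node
	}
	var local []endpoint
	for _, ep := range eps {
		if ep.node == receivingNode {
			local = append(local, ep)
		}
	}
	return local // empty means traffic arriving at this node is dropped
}

func main() {
	eps := []endpoint{{"10.244.1.5", "node-a"}, {"10.244.2.7", "node-b"}}
	for _, policy := range []string{"Cluster", "Local"} {
		for _, node := range []string{"node-a", "node-c"} {
			fmt.Printf("%s on %s: %v\n", policy, node, eligible(policy, node, eps))
		}
	}
}
```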