K8s Service Discovery and Load Balancing: Service

Based on Kubernetes 1.25

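What is a Service

In K8s, Pods can die at any time and the set of Pods behind a controller keeps changing, so a caller cannot know which IP to send requests to. Service exists to solve this problem and relies on the following:

  • Each Pod has its own IP address
  • Each Service is assigned a unique DNS name
  • A Service selects a group of Pods through a label selector
  • A Service load-balances, spreading requests across that group of Pods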
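The label-selector step can be illustrated with a small sketch. This is not Kubernetes code: `matches` and the sample Pods are hypothetical names chosen for illustration, and the sketch only models equality-based matching of a `map[string]string` selector against Pod labels.

```go
package main

import "fmt"

// matches reports whether every key/value pair in selector is present in labels.
// An empty selector matches nothing in this sketch, modelling a Service whose
// endpoints are managed by something other than Kubernetes.
func matches(selector, labels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, want := range selector {
		got, ok := labels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"app": "web"}
	// Hypothetical Pods and labels, for illustration only.
	pods := []struct {
		name   string
		labels map[string]string
	}{
		{"web-1", map[string]string{"app": "web", "tier": "frontend"}},
		{"db-1", map[string]string{"app": "db"}},
	}
	for _, p := range pods {
		fmt.Printf("%s selected=%v\n", p.name, matches(selector, p.labels))
	}
}
```

Extra labels on a Pod do not prevent a match; only the keys named in the selector are compared.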

ServiceSpec

// ServiceSpec describes the attributes that a user creates on a service.
type ServiceSpec struct {
// The list of ports that are exposed by this service.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +patchMergeKey=port
// +patchStrategy=merge
// +listType=map
// +listMapKey=port
// +listMapKey=protocol
// Summary: the list of ports the Service exposes.
Ports []ServicePort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"port" protobuf:"bytes,1,rep,name=ports"`

// Route service traffic to pods with label keys and values matching this
// selector. If empty or not present, the service is assumed to have an
// external process managing its endpoints, which Kubernetes will not
// modify. Only applies to types ClusterIP, NodePort, and LoadBalancer.
// Ignored if type is ExternalName.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/
// +optional
// +mapType=atomic
// Summary: the label selector used to pick the backend Pods.
// Ignored for type ExternalName.
Selector map[string]string `json:"selector,omitempty" protobuf:"bytes,2,rep,name=selector"`

// clusterIP is the IP address of the service and is usually assigned
// randomly. If an address is specified manually, is in-range (as per
// system configuration), and is not in use, it will be allocated to the
// service; otherwise creation of the service will fail. This field may not
// be changed through updates unless the type field is also being changed
// to ExternalName (which requires this field to be blank) or the type
// field is being changed from ExternalName (in which case this field may
// optionally be specified, as describe above). Valid values are "None",
// empty string (""), or a valid IP address. Setting this to "None" makes a
// "headless service" (no virtual IP), which is useful when direct endpoint
// connections are preferred and proxying is not required. Only applies to
// types ClusterIP, NodePort, and LoadBalancer. If this field is specified
// when creating a Service of type ExternalName, creation will fail. This
// field will be wiped when updating a Service to type ExternalName.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +optional
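// Summary: the Service IP, usually assigned randomly. A manually specified IP is
// allocated if it is in range and not in use; otherwise Service creation fails.
// The field cannot be changed by an update unless the type is being changed to
// ExternalName (which requires it to be empty) or from ExternalName.
// Valid values are "None", the empty string (""), or a valid IP address.
// "None" creates a headless Service, which is useful when clients connect to
// endpoints directly and no proxying is needed.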
ClusterIP string `json:"clusterIP,omitempty" protobuf:"bytes,3,opt,name=clusterIP"`

// ClusterIPs is a list of IP addresses assigned to this service, and are
// usually assigned randomly. If an address is specified manually, is
// in-range (as per system configuration), and is not in use, it will be
// allocated to the service; otherwise creation of the service will fail.
// This field may not be changed through updates unless the type field is
// also being changed to ExternalName (which requires this field to be
// empty) or the type field is being changed from ExternalName (in which
// case this field may optionally be specified, as describe above). Valid
// values are "None", empty string (""), or a valid IP address. Setting
// this to "None" makes a "headless service" (no virtual IP), which is
// useful when direct endpoint connections are preferred and proxying is
// not required. Only applies to types ClusterIP, NodePort, and
// LoadBalancer. If this field is specified when creating a Service of type
// ExternalName, creation will fail. This field will be wiped when updating
// a Service to type ExternalName. If this field is not specified, it will
// be initialized from the clusterIP field. If this field is specified,
// clients must ensure that clusterIPs[0] and clusterIP have the same
// value.
//
// This field may hold a maximum of two entries (dual-stack IPs, in either order).
// These IPs must correspond to the values of the ipFamilies field. Both
// clusterIPs and ipFamilies are governed by the ipFamilyPolicy field.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +listType=atomic
// +optional
// Summary: the IP addresses of the Service, at most two entries (dual-stack IPs).
// Both clusterIPs and ipFamilies are governed by the ipFamilyPolicy field.
ClusterIPs []string `json:"clusterIPs,omitempty" protobuf:"bytes,18,opt,name=clusterIPs"`

// type determines how the Service is exposed. Defaults to ClusterIP. Valid
// options are ExternalName, ClusterIP, NodePort, and LoadBalancer.
// "ClusterIP" allocates a cluster-internal IP address for load-balancing
// to endpoints. Endpoints are determined by the selector or if that is not
// specified, by manual construction of an Endpoints object or
// EndpointSlice objects. If clusterIP is "None", no virtual IP is
// allocated and the endpoints are published as a set of endpoints rather
// than a virtual IP.
// "NodePort" builds on ClusterIP and allocates a port on every node which
// routes to the same endpoints as the clusterIP.
// "LoadBalancer" builds on NodePort and creates an external load-balancer
// (if supported in the current cloud) which routes to the same endpoints
// as the clusterIP.
// "ExternalName" aliases this service to the specified externalName.
// Several other fields do not apply to ExternalName services.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
// +optional
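// Summary: how the Service is exposed; defaults to ClusterIP.
// Valid values: ExternalName, ClusterIP, NodePort, LoadBalancer.
// ClusterIP allocates a cluster-internal IP; endpoints come from the label
// selector or from manually created Endpoints or EndpointSlice objects.
// If clusterIP is "None", no virtual IP is allocated.
// NodePort builds on ClusterIP and allocates a port on every node.
// LoadBalancer builds on NodePort and creates an external load balancer
// that routes to the same endpoints as the clusterIP.
// ExternalName aliases the Service to the specified externalName.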
Type ServiceType `json:"type,omitempty" protobuf:"bytes,4,opt,name=type,casttype=ServiceType"`

// externalIPs is a list of IP addresses for which nodes in the cluster
// will also accept traffic for this service. These IPs are not managed by
// Kubernetes. The user is responsible for ensuring that traffic arrives
// at a node with this IP. A common example is external load-balancers
// that are not part of the Kubernetes system.
// +optional
// Summary: a list of IP addresses on which cluster nodes also accept traffic
// for this Service. These IPs are not managed by Kubernetes.
ExternalIPs []string `json:"externalIPs,omitempty" protobuf:"bytes,5,rep,name=externalIPs"`

// Supports "ClientIP" and "None". Used to maintain session affinity.
// Enable client IP based session affinity.
// Must be ClientIP or None.
// Defaults to None.
// More info: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies
// +optional
// Summary: session affinity; supports "ClientIP" and "None".
// Defaults to "None".
SessionAffinity ServiceAffinity `json:"sessionAffinity,omitempty" protobuf:"bytes,7,opt,name=sessionAffinity,casttype=ServiceAffinity"`

// Only applies to Service Type: LoadBalancer.
// This feature depends on whether the underlying cloud-provider supports specifying
// the loadBalancerIP when a load balancer is created.
// This field will be ignored if the cloud-provider does not support the feature.
// Deprecated: This field was under-specified and its meaning varies across implementations,
// and it cannot support dual-stack.
// As of Kubernetes v1.24, users are encouraged to use implementation-specific annotations when available.
// This field may be removed in a future API version.
// +optional
// Summary: applies only to type LoadBalancer.
// Deprecated; may be removed in a future API version.
LoadBalancerIP string `json:"loadBalancerIP,omitempty" protobuf:"bytes,8,opt,name=loadBalancerIP"`

// If specified and supported by the platform, this will restrict traffic through the cloud-provider
// load-balancer will be restricted to the specified client IPs. This field will be ignored if the
// cloud-provider does not support the feature."
// More info: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/
// +optional
// Summary: restricts traffic through the load balancer to the specified client IPs.
LoadBalancerSourceRanges []string `json:"loadBalancerSourceRanges,omitempty" protobuf:"bytes,9,opt,name=loadBalancerSourceRanges"`

// externalName is the external reference that discovery mechanisms will
// return as an alias for this service (e.g. a DNS CNAME record). No
// proxying will be involved. Must be a lowercase RFC-1123 hostname
// (https://tools.ietf.org/html/rfc1123) and requires `type` to be "ExternalName".
// +optional
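// Summary: the external reference that discovery mechanisms return as an alias
// for this Service (for example, a DNS CNAME record).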
ExternalName string `json:"externalName,omitempty" protobuf:"bytes,10,opt,name=externalName"`

// externalTrafficPolicy describes how nodes distribute service traffic they
// receive on one of the Service's "externally-facing" addresses (NodePorts,
// ExternalIPs, and LoadBalancer IPs). If set to "Local", the proxy will configure
// the service in a way that assumes that external load balancers will take care
// of balancing the service traffic between nodes, and so each node will deliver
// traffic only to the node-local endpoints of the service, without masquerading
// the client source IP. (Traffic mistakenly sent to a node with no endpoints will
// be dropped.) The default value, "Cluster", uses the standard behavior of
// routing to all endpoints evenly (possibly modified by topology and other
// features). Note that traffic sent to an External IP or LoadBalancer IP from
// within the cluster will always get "Cluster" semantics, but clients sending to
// a NodePort from within the cluster may need to take traffic policy into account
// when picking a node.
// +optional
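// Summary: how nodes distribute traffic received on the Service's externally
// facing addresses (NodePorts, ExternalIPs, and LoadBalancer IPs).
// "Local" assumes the external load balancer balances traffic between nodes, so
// each node delivers traffic only to its node-local endpoints and keeps the
// client source IP.
// The default, "Cluster", routes to all endpoints evenly.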
ExternalTrafficPolicy ServiceExternalTrafficPolicyType `json:"externalTrafficPolicy,omitempty" protobuf:"bytes,11,opt,name=externalTrafficPolicy"`

// healthCheckNodePort specifies the healthcheck nodePort for the service.
// This only applies when type is set to LoadBalancer and
// externalTrafficPolicy is set to Local. If a value is specified, is
// in-range, and is not in use, it will be used. If not specified, a value
// will be automatically allocated. External systems (e.g. load-balancers)
// can use this port to determine if a given node holds endpoints for this
// service or not. If this field is specified when creating a Service
// which does not need it, creation will fail. This field will be wiped
// when updating a Service to no longer need it (e.g. changing type).
// This field cannot be updated once set.
// +optional
// Summary: the health check nodePort for the Service.
// Applies only when type is LoadBalancer and externalTrafficPolicy is Local.
HealthCheckNodePort int32 `json:"healthCheckNodePort,omitempty" protobuf:"bytes,12,opt,name=healthCheckNodePort"`

// publishNotReadyAddresses indicates that any agent which deals with endpoints for this
// Service should disregard any indications of ready/not-ready.
// The primary use case for setting this field is for a StatefulSet's Headless Service to
// propagate SRV DNS records for its Pods for the purpose of peer discovery.
// The Kubernetes controllers that generate Endpoints and EndpointSlice resources for
// Services interpret this to mean that all endpoints are considered "ready" even if the
// Pods themselves are not. Agents which consume only Kubernetes generated endpoints
// through the Endpoints or EndpointSlice resources can safely assume this behavior.
// +optional
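// Summary: whether Pod readiness is disregarded; defaults to false.
// false: only ready Pods are added to the endpoints.
// true: Pods are added to the endpoints even when they are not ready.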
PublishNotReadyAddresses bool `json:"publishNotReadyAddresses,omitempty" protobuf:"varint,13,opt,name=publishNotReadyAddresses"`

// sessionAffinityConfig contains the configurations of session affinity.
// +optional
// Summary: configuration for session affinity.
SessionAffinityConfig *SessionAffinityConfig `json:"sessionAffinityConfig,omitempty" protobuf:"bytes,14,opt,name=sessionAffinityConfig"`

// TopologyKeys is tombstoned to show why 16 is reserved protobuf tag.
//TopologyKeys []string `json:"topologyKeys,omitempty" protobuf:"bytes,16,opt,name=topologyKeys"`

// IPFamily is tombstoned to show why 15 is a reserved protobuf tag.
// IPFamily *IPFamily `json:"ipFamily,omitempty" protobuf:"bytes,15,opt,name=ipFamily,Configcasttype=IPFamily"`

// IPFamilies is a list of IP families (e.g. IPv4, IPv6) assigned to this
// service. This field is usually assigned automatically based on cluster
// configuration and the ipFamilyPolicy field. If this field is specified
// manually, the requested family is available in the cluster,
// and ipFamilyPolicy allows it, it will be used; otherwise creation of
// the service will fail. This field is conditionally mutable: it allows
// for adding or removing a secondary IP family, but it does not allow
// changing the primary IP family of the Service. Valid values are "IPv4"
// and "IPv6". This field only applies to Services of types ClusterIP,
// NodePort, and LoadBalancer, and does apply to "headless" services.
// This field will be wiped when updating a Service to type ExternalName.
//
// This field may hold a maximum of two entries (dual-stack families, in
// either order). These families must correspond to the values of the
// clusterIPs field, if specified. Both clusterIPs and ipFamilies are
// governed by the ipFamilyPolicy field.
// +listType=atomic
// +optional
// Summary: the list of IP families (for example IPv4, IPv6) assigned to the Service.
IPFamilies []IPFamily `json:"ipFamilies,omitempty" protobuf:"bytes,19,opt,name=ipFamilies,casttype=IPFamily"`

// IPFamilyPolicy represents the dual-stack-ness requested or required by
// this Service. If there is no value provided, then this field will be set
// to SingleStack. Services can be "SingleStack" (a single IP family),
// "PreferDualStack" (two IP families on dual-stack configured clusters or
// a single IP family on single-stack clusters), or "RequireDualStack"
// (two IP families on dual-stack configured clusters, otherwise fail). The
// ipFamilies and clusterIPs fields depend on the value of this field. This
// field will be wiped when updating a service to type ExternalName.
// +optional
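// Summary: the dual-stack behavior requested or required by the Service.
// Defaults to SingleStack.
// SingleStack: a single IP family.
// PreferDualStack: two IP families on dual-stack clusters, one on single-stack clusters.
// RequireDualStack: two IP families on dual-stack clusters; otherwise creation fails.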
IPFamilyPolicy *IPFamilyPolicy `json:"ipFamilyPolicy,omitempty" protobuf:"bytes,17,opt,name=ipFamilyPolicy,casttype=IPFamilyPolicy"`

// allocateLoadBalancerNodePorts defines if NodePorts will be automatically
// allocated for services with type LoadBalancer. Default is "true". It
// may be set to "false" if the cluster load-balancer does not rely on
// NodePorts. If the caller requests specific NodePorts (by specifying a
// value), those requests will be respected, regardless of this field.
// This field may only be set for services with type LoadBalancer and will
// be cleared if the type is changed to any other type.
// +optional
// Summary: whether NodePorts are allocated automatically for type LoadBalancer; defaults to true.
AllocateLoadBalancerNodePorts *bool `json:"allocateLoadBalancerNodePorts,omitempty" protobuf:"bytes,20,opt,name=allocateLoadBalancerNodePorts"`

// loadBalancerClass is the class of the load balancer implementation this Service belongs to.
// If specified, the value of this field must be a label-style identifier, with an optional prefix,
// e.g. "internal-vip" or "example.com/internal-vip". Unprefixed names are reserved for end-users.
// This field can only be set when the Service type is 'LoadBalancer'. If not set, the default load
// balancer implementation is used, today this is typically done through the cloud provider integration,
// but should apply for any default implementation. If set, it is assumed that a load balancer
// implementation is watching for Services with a matching class. Any default load balancer
// implementation (e.g. cloud providers) should ignore Services that set this field.
// This field can only be set when creating or updating a Service to type 'LoadBalancer'.
// Once set, it can not be changed. This field will be wiped when a service is updated to a non 'LoadBalancer' type.
// +featureGate=LoadBalancerClass
// +optional
// Summary: the class of the load balancer implementation this Service belongs to.
LoadBalancerClass *string `json:"loadBalancerClass,omitempty" protobuf:"bytes,21,opt,name=loadBalancerClass"`

// InternalTrafficPolicy describes how nodes distribute service traffic they
// receive on the ClusterIP. If set to "Local", the proxy will assume that pods
// only want to talk to endpoints of the service on the same node as the pod,
// dropping the traffic if there are no local endpoints. The default value,
// "Cluster", uses the standard behavior of routing to all endpoints evenly
// (possibly modified by topology and other features).
// +featureGate=ServiceInternalTrafficPolicy
// +optional
// Summary: how nodes distribute traffic received on the ClusterIP.
InternalTrafficPolicy *ServiceInternalTrafficPolicyType `json:"internalTrafficPolicy,omitempty" protobuf:"bytes,22,opt,name=internalTrafficPolicy"`
}
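
To make the JSON tags above more concrete, here is a minimal sketch that serializes a cut-down mirror of the struct. The types `servicePort` and `serviceSpec` and the helper `render` are local simplifications invented for this example, not the real `k8s.io/api` types; in particular, `TargetPort` is modelled as a plain `int32`, whereas the real field also accepts a named port.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// servicePort is a simplified stand-in for the real ServicePort type.
type servicePort struct {
	Name       string `json:"name,omitempty"`
	Protocol   string `json:"protocol,omitempty"`
	Port       int32  `json:"port"`
	TargetPort int32  `json:"targetPort,omitempty"` // simplification: numeric only
	NodePort   int32  `json:"nodePort,omitempty"`
}

// serviceSpec mirrors only a handful of the ServiceSpec fields.
type serviceSpec struct {
	Ports     []servicePort     `json:"ports,omitempty"`
	Selector  map[string]string `json:"selector,omitempty"`
	ClusterIP string            `json:"clusterIP,omitempty"`
	Type      string            `json:"type,omitempty"`
}

// render serializes the spec; fields tagged omitempty vanish when left unset.
func render(spec serviceSpec) (string, error) {
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	spec := serviceSpec{
		Ports:    []servicePort{{Port: 80, TargetPort: 8080}},
		Selector: map[string]string{"app": "web"},
		Type:     "ClusterIP",
	}
	out, err := render(spec)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```

Because `clusterIP` is left empty it is omitted from the output, which corresponds to letting the cluster assign the address.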

Service Types

Service currently supports four types: ClusterIP, NodePort, LoadBalancer, and ExternalName

ClusterIP

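The default Service type

  • It is the foundation for NodePort and LoadBalancer
  • Its addresses come from a dedicated IP range
  • Traffic reaches the VIP first and is then forwarded by kube-proxy to the individual Pods
  • With ipvs, the load-balancing forwarding rules can be inspected with the ipvsadm command
  • A ClusterIP Service gets SRV records created for its domain name
  • Advantage: the VIP sits in front of the Pods, so clients do not need to resolve Pod addresses through DNS directly
  • Drawback: under heavy request volume kube-proxy can become a performance bottleneck, and load balancing may not take effect for long-lived connections such as gRPC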
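
The long-lived connection drawback is easier to see in a toy model. The sketch below is not kube-proxy's implementation; `balancer`, `connect`, and `distribute` are hypothetical names, and round-robin is chosen only to keep the result deterministic. The point it illustrates is that the backend is picked once per connection, so every request multiplexed over one connection lands on the same Pod.

```go
package main

import "fmt"

// balancer hands out backends round-robin, once per new connection.
type balancer struct {
	backends []string
	next     int
}

// connect simulates opening a connection to the VIP: the backend is chosen
// here and stays fixed for the lifetime of that connection.
func (b *balancer) connect() string {
	if len(b.backends) == 0 {
		return ""
	}
	backend := b.backends[b.next%len(b.backends)]
	b.next++
	return backend
}

// distribute opens conns connections, sends requestsPerConn requests over
// each one, and counts how many requests every backend receives.
func distribute(b *balancer, conns, requestsPerConn int) map[string]int {
	hits := make(map[string]int)
	for c := 0; c < conns; c++ {
		if backend := b.connect(); backend != "" {
			hits[backend] += requestsPerConn
		}
	}
	return hits
}

func main() {
	pods := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"} // illustrative Pod IPs
	// Short-lived connections: 300 connections carrying one request each.
	fmt.Println(distribute(&balancer{backends: pods}, 300, 1))
	// One long-lived connection carrying 300 requests.
	fmt.Println(distribute(&balancer{backends: pods}, 1, 300))
}
```

In the first case the requests spread across all three Pods; in the second, a single Pod receives all of them.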

NodePort

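NodePort exposes the Service on each node's IP address at a static port (the NodePort)

  • The K8s control plane allocates the port number from the range set by --service-node-port-range (default 30000-32767)
  • The Service can be reached through any cluster node's IP plus the NodePort
  • A NodePort Service request travels from the K8s node IP directly to the Pod without passing through the Cluster IP; the forwarding logic is still implemented by kube-proxy
  • The domain name of a NodePort Service resolves to a CLUSTER-IP, so load balancing for requests from inside the cluster works the same way as for a ClusterIP Service
  • Drawbacks:
    • The NodePort range itself is limited, and under heavy request volume kube-proxy becomes a bottleneck
    • Access is tightly bound to node IP addresses, which is a poor fit when node IPs change frequently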
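
As a small illustration of the port range, the sketch below parses a `min-max` string in the same shape as the `--service-node-port-range` value and checks whether a requested port falls inside it. `parseRange` and `inRange` are hypothetical helpers written for this example; they do not reproduce the API server's actual validation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRange parses a "min-max" string such as "30000-32767".
func parseRange(s string) (lo, hi int, err error) {
	a, b, ok := strings.Cut(s, "-")
	if !ok {
		return 0, 0, fmt.Errorf("range %q is not in min-max form", s)
	}
	if lo, err = strconv.Atoi(strings.TrimSpace(a)); err != nil {
		return 0, 0, err
	}
	if hi, err = strconv.Atoi(strings.TrimSpace(b)); err != nil {
		return 0, 0, err
	}
	if lo > hi {
		return 0, 0, fmt.Errorf("range %q has min greater than max", s)
	}
	return lo, hi, nil
}

// inRange reports whether port lies within [lo, hi], bounds included.
func inRange(port, lo, hi int) bool {
	return port >= lo && port <= hi
}

func main() {
	lo, hi, err := parseRange("30000-32767")
	if err != nil {
		fmt.Println("bad range:", err)
		return
	}
	for _, p := range []int{30080, 8080} {
		fmt.Printf("port %d allowed=%v\n", p, inRange(p, lo, hi))
	}
}
```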

LoadBalancer

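LoadBalancer exposes the Service externally through the cloud provider's load balancer

The external load balancer routes traffic to the automatically created NodePort Service

Open-source LoadBalancer implementations: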

Headless Service

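Setting spec.clusterIP to None creates a Headless Service

  • The Service domain name then resolves to the IP addresses of all Pods associated with the Service
  • Requests sent to that domain name reach a Pod directly; load balancing is done only through DNS rather than by K8s kube-proxy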
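
Since a headless Service leaves the choice of Pod to the client, a client-side rotation might look like the sketch below. The `resolver` type has the same signature as Go's `net.LookupHost`, so the real resolver could be passed in when running inside a cluster; here a fake resolver with made-up Pod IPs and a hypothetical Service name is used so the example runs anywhere. `pick` is an illustrative helper, not a library function.

```go
package main

import (
	"errors"
	"fmt"
)

// resolver has the same signature as net.LookupHost.
type resolver func(host string) ([]string, error)

// pick resolves host and returns the address for the n-th request, rotating
// through the returned Pod IPs on the client side.
func pick(resolve resolver, host string, n uint) (string, error) {
	addrs, err := resolve(host)
	if err != nil {
		return "", err
	}
	if len(addrs) == 0 {
		return "", errors.New("no addresses for " + host)
	}
	return addrs[n%uint(len(addrs))], nil
}

func main() {
	// Fake resolver standing in for cluster DNS; the IPs are illustrative.
	fake := func(host string) ([]string, error) {
		return []string{"10.244.1.5", "10.244.2.7"}, nil
	}
	for n := uint(0); n < 4; n++ {
		addr, err := pick(fake, "web.default.svc.cluster.local", n)
		if err != nil {
			fmt.Println("lookup failed:", err)
			return
		}
		fmt.Printf("request %d -> %s\n", n, addr)
	}
}
```

A real client would usually cache the resolved addresses for a while instead of resolving on every request.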

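Differences Between the Port Fields

  • containerPort: applies to the container inside the Pod

    • Tells K8s which port the container serves on
  • nodePort:

    • Exists only on Services of type LoadBalancer and NodePort; used to reach the Service through a node
  • port: applies only to the CLUSTER-IP and EXTERNAL-IP

    • Used by the LoadBalancer, NodePort, and ClusterIP types
    • From outside the cluster, the Service is reachable at EXTERNAL-IP:port
    • From inside the cluster, the Service is reachable at ClusterIP:port
  • targetPort: the port on the Pod that the Service forwards traffic to. Forwarding paths:

      1. NodeIP:nodePort->PodIP:targetPort
      2. CLUSTER-IP:port->PodIP:targetPort
      3. EXTERNAL-IP:port->PodIP:targetPort

In practice, keep targetPort, containerPort, and the port the process in the Pod actually listens on the same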
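
The three forwarding paths can be spelled out with a short sketch. The `servicePort` type and the `effectiveTargetPort` and `paths` helpers are hypothetical and exist only for this example, as are the IP addresses. The fallback of targetPort to port when targetPort is unset reflects the documented default.

```go
package main

import "fmt"

// servicePort is a simplified view of one Service port entry.
type servicePort struct {
	Port       int // reachable on CLUSTER-IP and EXTERNAL-IP
	TargetPort int // port on the Pod; 0 means "not set"
	NodePort   int // port on every node; 0 means the Service has none
}

// effectiveTargetPort returns TargetPort, falling back to Port when unset.
func effectiveTargetPort(p servicePort) int {
	if p.TargetPort == 0 {
		return p.Port
	}
	return p.TargetPort
}

// paths lists the forwarding paths available for the given addresses.
// An empty address means that entry point does not exist for the Service.
func paths(p servicePort, nodeIP, clusterIP, externalIP, podIP string) []string {
	dst := fmt.Sprintf("%s:%d", podIP, effectiveTargetPort(p))
	var out []string
	if p.NodePort != 0 && nodeIP != "" {
		out = append(out, fmt.Sprintf("%s:%d -> %s", nodeIP, p.NodePort, dst))
	}
	if clusterIP != "" {
		out = append(out, fmt.Sprintf("%s:%d -> %s", clusterIP, p.Port, dst))
	}
	if externalIP != "" {
		out = append(out, fmt.Sprintf("%s:%d -> %s", externalIP, p.Port, dst))
	}
	return out
}

func main() {
	p := servicePort{Port: 80, TargetPort: 8080, NodePort: 30080}
	for _, line := range paths(p, "192.168.1.10", "10.96.0.20", "203.0.113.5", "10.244.1.5") {
		fmt.Println(line)
	}
}
```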

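DNS for Services and Pods

K8s creates DNS records for Services and Pods, so a Service can be reached through a consistent DNS name instead of an IP address

  • The kubelet configures each Pod's DNS so that running containers can look up a Service by name rather than by IP address
  • Every Service in the cluster is given a DNS name; by default, a client Pod's DNS search list covers the Pod's own namespace and the cluster's default domain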
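
The effect of the search list can be sketched as follows. The helper names (`serviceFQDN`, `searchList`, `candidates`) are invented for this example, `cluster.local` is assumed as the cluster domain, and the three search entries reflect the usual layout for a Pod using cluster DNS; a real Pod's resolv.conf may contain additional entries inherited from the node.

```go
package main

import (
	"fmt"
	"strings"
)

// serviceFQDN builds <service>.<namespace>.svc.<clusterDomain>.
func serviceFQDN(service, namespace, clusterDomain string) string {
	return strings.Join([]string{service, namespace, "svc", clusterDomain}, ".")
}

// searchList returns the search domains assumed for a Pod in namespace.
func searchList(namespace, clusterDomain string) []string {
	return []string{
		namespace + ".svc." + clusterDomain,
		"svc." + clusterDomain,
		clusterDomain,
	}
}

// candidates expands a short name against the search list, in lookup order.
func candidates(name string, search []string) []string {
	out := make([]string, 0, len(search))
	for _, domain := range search {
		out = append(out, name+"."+domain)
	}
	return out
}

func main() {
	// A Pod in namespace "prod" looking up the short name "web".
	for _, c := range candidates("web", searchList("prod", "cluster.local")) {
		fmt.Println(c)
	}
	// A Service in another namespace needs at least <service>.<namespace>.
	fmt.Println(serviceFQDN("web", "staging", "cluster.local"))
}
```

This is why a bare Service name resolves only within the caller's own namespace, while `<service>.<namespace>` works across namespaces.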

ExternalTrafficPolicy

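ExternalTrafficPolicy indicates whether the Service wants external traffic routed to node-local endpoints or to endpoints across the whole cluster. There are two options:

  • Cluster: hides the client source IP and may cause a second hop to another node, but spreads load well
    • Any node can accept the traffic whether or not it runs a backend Pod; the traffic may be forwarded to another node to reach a Pod
  • Local: preserves the client source IP and avoids the second hop for LoadBalancer and NodePort Services, but risks uneven load distribution
    • Only nodes that run a backend Pod can serve the traffic; other nodes cannot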
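
The difference between the two policies comes down to which endpoints a receiving node may use. The sketch below models only that selection step; `endpoint`, `eligible`, the node names, and the Pod IPs are all hypothetical, and the actual forwarding rules are programmed by kube-proxy, not computed like this.

```go
package main

import "fmt"

// endpoint is one backend Pod together with the node it runs on.
type endpoint struct {
	PodIP string
	Node  string
}

// eligible returns the endpoints a node may forward external traffic to.
// "Local" keeps only endpoints on the receiving node; any other value is
// treated as "Cluster" and keeps all endpoints.
func eligible(policy, receivingNode string, eps []endpoint) []endpoint {
	if policy != "Local" {
		return eps
	}
	var local []endpoint
	for _, ep := range eps {
		if ep.Node == receivingNode {
			local = append(local, ep)
		}
	}
	return local
}

func main() {
	eps := []endpoint{
		{"10.244.1.5", "node-a"},
		{"10.244.1.6", "node-a"},
		{"10.244.2.7", "node-b"},
	}
	for _, node := range []string{"node-a", "node-b", "node-c"} {
		fmt.Printf("%s Cluster=%d Local=%d\n", node,
			len(eligible("Cluster", node, eps)),
			len(eligible("Local", node, eps)))
	}
}
```

With Local, a node that runs no backend Pod ends up with zero eligible endpoints, so traffic arriving there is dropped; if an external load balancer spreads traffic evenly per node, Pods on a node with fewer replicas each receive a larger share.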