Exposing custom metrics in k8s to drive HPA autoscaling

Reference blog: https://blog.51cto.com/14143894/2458468

1. Modify the code

Reference: https://prometheus.io/docs/instrumenting/clientlibs/

Python client: https://github.com/prometheus/client_python

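import prometheus_client
from flask import Flask, Response
from prometheus_client import Counter

app = Flask(__name__)  # or reuse your existing Flask application object

# Counter of all HTTP requests handled by the service
request_total = Counter("http_requests_total", "Total request cout of the service")

@app.route('/metrics')
def requests_count():
    # Return the counter in the Prometheus text exposition format
    return Response(prometheus_client.generate_latest(request_total), mimetype='text/plain')

@app.before_request
def request_stat():
    # Count every incoming request (scrapes of /metrics included)
    request_total.inc()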

2. If you have a local environment, you can test it there:

C:\Users\luogu>curl http://192.168.11.5:8081/metrics
# HELP http_requests_total Total request cout of the service
# TYPE http_requests_total counter
http_requests_total 18225.0
# HELP http_requests_created Total request cout of the service
# TYPE http_requests_created gauge
http_requests_created 1.6097487772094784e+09

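client_python appends the _total suffix to counter names automatically, so a counter declared as http_requests would still be exposed as http_requests_total.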
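To automate this check instead of reading the curl output by eye, the text returned by /metrics can be parsed with a few lines of Python. This is a minimal sketch that only handles simple unlabelled samples like the ones in this post; `parse_samples` is a hypothetical helper, not part of prometheus_client.

```python
def parse_samples(text):
    """Parse unlabelled samples from Prometheus text exposition output."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines
        if not line or line.startswith("#"):
            continue
        name, value = line.split()[:2]
        samples[name] = float(value)
    return samples


# Sample text in the same shape as the /metrics response of the service
sample = """\
# HELP http_requests_total Total request cout of the service
# TYPE http_requests_total counter
http_requests_total 18225.0
# HELP http_requests_created Total request cout of the service
# TYPE http_requests_created gauge
http_requests_created 1.6097487772094784e+09
"""

metrics = parse_samples(sample)
print(metrics["http_requests_total"])  # 18225.0
```

Samples that carry labels, such as name{code="200"}, would need a real parser. This sketch is only meant as a smoke test.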

3. Deploy the application:

Below is part of my application's Deployment manifest. Adding the annotations is all that is needed.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-admin
  namespace: ms-test
  labels:
    app: flask-admin
spec:
  minReadySeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # maximum number of extra pods allowed during an update
      maxSurge: 1
      # maximum number of unavailable pods allowed during an update
      maxUnavailable: 0
  replicas: 1
  selector:
    matchLabels:
      app: flask-admin
  template:
    metadata:
      labels:
        app: flask-admin
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8081"
        prometheus.io/path: "/metrics"

Add these annotations:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8081"
        prometheus.io/path: "/metrics"
so that Prometheus can automatically discover the pod's metrics.
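These annotations are a convention, not something Prometheus understands natively. They only take effect because relabeling rules in the scrape configuration read them. A rough sketch of how the three values combine into a scrape target follows; `scrape_url` and the pod IP 10.0.0.5 are hypothetical and used purely for illustration.

```python
def scrape_url(pod_ip, annotations):
    """Return the URL that would be scraped, or None if scraping is not enabled."""
    # Pods without prometheus.io/scrape: "true" are dropped by the "keep" rule
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = annotations["prometheus.io/port"]
    # /metrics is the default path when no path annotation is present
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


annotations = {
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "8081",
    "prometheus.io/path": "/metrics",
}
print(scrape_url("10.0.0.5", annotations))  # http://10.0.0.5:8081/metrics
```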

4. Next, use the ClusterIP to check that the metrics can be fetched:

[root@node1 ~]# kubectl -n ms-test get svc|grep flask-admin
flask-admin NodePort 10.68.192.130 <none> 8081:34622/TCP 17d
[root@node1 ~]# curl http://10.68.192.130:8081/metrics
# HELP http_requests_total Total request cout of the service
# TYPE http_requests_total counter
http_requests_total 3.0
# HELP http_requests_created Total request cout of the service
# TYPE http_requests_created gauge
http_requests_created 1.6097489733610158e+09

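5. View the metric in Prometheus

My Prometheus was installed with the operator and does not find the metric by default, because no ServiceMonitor is configured for the pods. Here I took the YAML from istio and adapted it: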

cat serviceMonitor.yaml


apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  #labels:
  #  monitoring: kube-pods
  name: kubernetes-pods-custom-metrics
  namespace: monitoring
spec:
  endpoints:
  - interval: 15s
    relabelings:
    - action: keep
      regex: "true"
      sourceLabels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape
    - action: replace
      regex: (.+)
      sourceLabels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
      targetLabel: __metrics_path__
    - action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      sourceLabels:
      - __address__
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      targetLabel: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - action: replace
      sourceLabels:
      - __meta_kubernetes_namespace
      targetLabel: namespace
    - action: replace
      sourceLabels:
      - __meta_kubernetes_pod_name
      targetLabel: pod_name
  jobLabel: kubernetes-pods-custom-metrics
  namespaceSelector:
    any: true
  selector:
    matchExpressions:
    - key: prometheus-ignore
      operator: DoesNotExist
    #matchLabels:
    #  app: flask-admin

kubectl apply -f serviceMonitor.yaml

Now the http_requests_total metric can be found when you query Prometheus again.
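The least obvious relabeling rule in the ServiceMonitor is the one that rewrites the scrape address so it uses the port from the prometheus.io/port annotation. In its standard form the rule uses the regex ([^:]+)(?::\d+)?;(\d+) with the replacement $1:$2. Prometheus joins the source label values with ";" and requires the regex to match the whole joined string. The sketch below reproduces that behaviour with Python's re module. `relabel_address` is a hypothetical name, and the addresses are made up for illustration.

```python
import re

# Same pattern as the relabeling rule; fullmatch() mimics Prometheus anchoring
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def relabel_address(address, port_annotation):
    """Apply the address-rewriting rule to one target."""
    joined = f"{address};{port_annotation}"  # source labels joined with ";"
    match = ADDRESS_RE.fullmatch(joined)
    if match is None:
        return address  # no match: the label is left unchanged
    return match.expand(r"\1:\2")  # equivalent of the "$1:$2" replacement


print(relabel_address("10.0.0.5:8080", "8081"))  # 10.0.0.5:8081
print(relabel_address("10.0.0.5", "8081"))       # 10.0.0.5:8081
print(relabel_address("10.0.0.5:8080", ""))      # 10.0.0.5:8080
```

The last call shows why a pod without the port annotation keeps its discovered address: the regex requires digits after the ";" separator.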

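6. Deploy the Custom Metrics Adapter

The metrics Prometheus collects cannot be consumed by k8s directly, because the two data formats are incompatible. Another component, k8s-prometheus-adapter, is needed to convert Prometheus metrics into a format the k8s API can understand. Because this is a custom API, it also has to be registered with the main APIServer through the Kubernetes aggregator so that it can be reached directly under /apis/.

In short, the adapter does two things: it registers itself with the api-server, and it converts the data into a form the API can understand.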

https://github.com/DirectXMan12/k8s-prometheus-adapter

The Prometheus Adapter has a stable Helm chart, so we use it directly.

First prepare the helm environment:

[root@k8s-master1 helm]# wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
[root@k8s-master1 helm]# tar xf helm-v3.0.0-linux-amd64.tar.gz
[root@k8s-master1 helm]# mv linux-amd64/helm /usr/bin
helm is now ready to use. Next, configure a helm repository, which is where the adapter chart is hosted.

I recommend adding the Microsoft Azure mirror:

[root@k8s-master1 helm]# helm repo add stable http://mirror.azure.cn/kubernetes/charts
"stable" has been added to your repositories
[root@k8s-master1 helm]# helm repo ls
NAME URL
stable http://mirror.azure.cn/kubernetes/charts
Now helm install can be used to install the adapter.

The Prometheus URL in the adapter chart has to be changed to match your own environment, as follows:

[root@k8s-master1 helm]# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus-k8s.monitoring,prometheus.port=9090
NAME: prometheus-adapter
LAST DEPLOYED: Fri Dec 13 15:22:42 2019
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

In practice, helm warns that this repository is deprecated. I am making do with it for now.

[root@k8s-master1 helm]# helm list -n kube-system
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
prometheus-adapter kube-system 1 2019-12-13 15:22:42.043441232 +0800 CST deployed prometheus-adapter-1.4.0 v0.5.0

Check that the pod has been deployed successfully:

[root@k8s-master1 helm]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE

prometheus-adapter-77b7b4dd8b-9rv26 1/1 Running 0 2m36s
Check that the pod is working properly. Here the API is already registered with the aggregation layer:

[root@k8s-master1 helm]# kubectl get apiservice
v1beta1.custom.metrics.k8s.io kube-system/prometheus-adapter True 13m
The endpoint can now be tested through a raw URL:

[root@k8s-master1 helm]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" |jq

Create the HPA policy

[root@node1 prometheus-adapter]# cat flask-admin-hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: flask-admin-hpa
  namespace: ms-test
spec:
  minReplicas: 1
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-admin
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 40000m # /1000 = 40/s

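Note that the unit is m (milli), so divide by 1000 to get requests per second. The setting above means the Deployment is scaled out once the average request rate per pod exceeds 40/s.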
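The milli-unit conversion is easy to get wrong when reading HPA output, so here is a small sketch of it. `parse_quantity` is a hypothetical helper that only covers plain numbers and the "m" suffix. Kubernetes quantities support more suffixes than this.

```python
def parse_quantity(quantity):
    """Convert a Kubernetes quantity such as '40000m' or '40' to a float.

    Only plain numbers and the milli suffix 'm' are handled in this sketch.
    """
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


print(parse_quantity("40000m"))  # 40.0 requests per second
print(parse_quantity("66m"))     # 0.066 requests per second
```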

Check the HPA. No value is available yet:
[root@node1 prometheus-adapter]# kubectl -n ms-test get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
flask-admin-hpa Deployment/flask-admin <unknown>/20 1 3 1 49s

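This is because the adapter does not yet know which metric you want (http_requests_per_second), so the HPA cannot get the metric the pods provide.
Edit the prometheus-adapter ConfigMap in its namespace and add a seriesQuery at the top of the rules section to collect the QPS value we want, as follows: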

[root@k8s-master1 hpa]# kubectl edit cm prometheus-adapter -n kube-system

rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^(.*)_total"
    as: "${1}_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

This query computes an average: the per-second rate of HTTP requests averaged over the last 2 minutes.

rate(http_requests_total{namespace!="",pod!=""}[2m])
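To make the averaging concrete, the sketch below computes a per-second rate from the first and last counter samples in a window. It is a deliberate simplification: Prometheus's real rate() also handles counter resets and extrapolates to the window boundaries. `simple_rate` and the sample values are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value) samples."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter samples (seconds, counter value) over a 2-minute window
window = [(0, 1000.0), (60, 3400.0), (120, 5800.0)]
print(simple_rate(window))  # 40.0 requests per second
```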

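Because there are multiple pods, we sum the rates to expose a single metric, and then add a by clause with a label so the result can be queried by that label.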

sum(rate(http_requests_total{namespace!="",pod!=""}[2m]))

Use by to name the label to group on, which makes querying easier:

sum(rate(http_requests_total{namespace!="",pod!=""}[2m])) by (pod)
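The adapter rule ties these pieces together: the name section renames the series, and metricsQuery is a template whose placeholders are filled in per request. The sketch below imitates both steps with plain string operations. The adapter itself uses Go templates, and the label matcher string with its pod names is a hypothetical example of what it might generate.

```python
import re

TEMPLATE = "sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)"


def metric_name(series):
    """Mirror name.matches '^(.*)_total' with as '${1}_per_second'."""
    return re.sub(r"^(.*)_total", r"\1_per_second", series)


def expand_query(template, series, label_matchers, group_by):
    """Fill the three placeholders used by the rule's metricsQuery."""
    return (template
            .replace("<<.Series>>", series)
            .replace("<<.LabelMatchers>>", label_matchers)
            .replace("<<.GroupBy>>", group_by))


print(metric_name("http_requests_total"))  # http_requests_per_second
query = expand_query(
    TEMPLATE,
    "http_requests_total",
    'namespace="ms-test",pod=~"flask-admin-a|flask-admin-b"',  # hypothetical pods
    "pod",
)
print(query)
```

The printed query has the same shape as the hand-written sum(rate(...)) by (pod) expression, restricted to the namespace and pods the HPA asks about.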

Test the API:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" |grep "http_requests_per_second"

The output is large, so I wrote it to a file and then searched it for http_requests_per_second.
The value is now being reported:
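As an alternative to searching the raw output, the JSON can be parsed and reduced to one value per pod. The sketch below assumes the v1beta1 MetricValueList layout (items with describedObject, metricName and value). The sample document, pod name and timestamp are made up for illustration, and `pod_values` is a hypothetical helper.

```python
import json


def pod_values(raw_json, metric):
    """Map pod name to the reported value string for one metric."""
    doc = json.loads(raw_json)
    return {
        item["describedObject"]["name"]: item["value"]
        for item in doc.get("items", [])
        if item.get("metricName") == metric
    }


# Hypothetical response in the assumed v1beta1 shape
raw = json.dumps({
    "kind": "MetricValueList",
    "apiVersion": "custom.metrics.k8s.io/v1beta1",
    "items": [{
        "describedObject": {"kind": "Pod", "namespace": "ms-test",
                            "name": "flask-admin-example"},
        "metricName": "http_requests_per_second",
        "timestamp": "2021-01-04T08:00:00Z",
        "value": "66m",
    }],
})

print(pod_values(raw, "http_requests_per_second"))
```

In practice the input would be the saved output of the kubectl get --raw command instead of the inline sample.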

[root@node1 ~]# kubectl -n ms-test get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
flask-admin-hpa Deployment/flask-admin 66m/40 1 3 1 4h41m

Load test: ab -c 3 -n 10000 -H 'token: eyJhbGciOiJIUzI1NiIsImlhdCI6MTYwOTU4NzE2MiwiZXhwIjoxNjEyMTc5MTYyfQ.eyJpZCI6M30.spvwRMBdf5Cz5AxOa-d31ar2x5hfKkRBL-2AxH5XI3I' http://test-gateway.kkkk.com/worksheet/version_list (replace this with your own target URL; mine requires a token, so I added the header)

Check the scaling status

Once the load stops, wait about 5 minutes and the replicas are scaled back in.
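The scaling decision follows the formula documented for the HPA: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the min and max replica counts and skipped when the ratio is within a tolerance (10% by default). The roughly 5-minute delay before scale-in matches the default downscale stabilization window. The sketch below is a simplified model of that formula, not the controller's actual code, and the input values are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_avg, target_avg,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified model of the HPA replica calculation."""
    ratio = current_avg / target_avg
    # Within tolerance of the target: keep the current replica count
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Under load: 1 pod averaging 100 req/s against a 40 req/s target
print(desired_replicas(1, 100.0, 40.0, 1, 3))  # 3
# After the load stops: 3 pods averaging 0.066 req/s
print(desired_replicas(3, 0.066, 40.0, 1, 3))  # 1
```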

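Summary of the flow: expose the metric in code => Prometheus picks it up through automatic discovery (an operator install needs a ServiceMonitor; a plain install needs a config change, for which examples are easy to find online) => install prometheus-adapter, which registers its API with the apiserver so the HPA can query it when scaling, a role analogous to metrics-server => write the HPA => verify the result with a load test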
