Monitoring Kubernetes Cluster Performance Metrics: kube-prometheus-stack (Helm) + Metrics Server Installation Demo

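Preface
I ran into this topic while learning K8s and wrote these notes to help me remember it.
This post is hands-on and covers:
A brief introduction to Metrics Server, the monitoring tool for cluster core metrics
A Metrics Server installation demo
A brief introduction to the monitoring platform for cluster custom metrics:
Prometheus
Grafana
NodeExporter
A demo of installing the platform with Helm (kube-prometheus-stack)
「Carrying the dreams of the mortal world, they rest their sullied souls against the pure edge of the sky. They are the eternal symbol of all wandering, seeking, longing, and homesickness. (Hermann Hesse, Peter Camenzind)」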

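Kubernetes cluster performance monitoring
「Once a Kubernetes platform is up and running, understanding the health of the platform and of the applications deployed on it, and handling major alerts and performance bottlenecks, all depend on a monitoring system.」
「Early versions of Kubernetes relied on Heapster for performance data collection and monitoring. Starting with version 1.8, performance data is exposed through a standardized interface, the Metrics API, and starting with version 1.10 Heapster was replaced by Metrics Server.」
「In the new Kubernetes monitoring architecture, Metrics Server provides the core metrics: CPU and memory usage of Nodes and Pods. Other, custom metrics are handled by components such as Prometheus.」
「To monitor the state of a node when using Docker directly, we can run docker stats.」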

┌──[root@vms81.liruilongs.github.io]-[~]
└─$docker stats
CONTAINER ID   NAME                                                                                                                             CPU %     MEM USAGE / LIMIT     MEM %     NET I/O   BLOCK I/O     PIDS
781c898eea19   k8s_kube-scheduler_kube-scheduler-vms81.liruilongs.github.io_kube-system_5bd71ffab3a1f1d18cb589aa74fe082b_18                     0.15%     23.22MiB / 3.843GiB   0.59%     0B / 0B   0B / 0B       7
acac8b21bb57   k8s_kube-controller-manager_kube-controller-manager-vms81.liruilongs.github.io_kube-system_93d9ae7b5a4ccec4429381d493b5d475_18   1.18%     59.16MiB / 3.843GiB   1.50%     0B / 0B   0B / 0B       6
fe97754d3dab   k8s_calico-node_calico-node-skzjp_kube-system_a211c8be-3ee1-44a0-a4ce-3573922b65b2_14                                            4.89%     94.25MiB / 3.843GiB   2.39%     0B / 0B   0B / 4.1kB    40
「With K8s, we can use Metrics Server to monitor the CPU and memory usage of Pods and Nodes.」

Metrics Server: cluster performance monitoring
「Once deployed, Metrics Server serves Pod and Node monitoring data through the Kubernetes core API Server under the path /apis/metrics.k8s.io/v1beta1.」
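As a rough illustration of that API path, the sketch below builds the URL for node metrics or for the Pod metrics of one namespace. The helper name metrics_path is hypothetical (it is not part of kubectl); the commented kubectl calls assume a cluster where Metrics Server is already running.

```shell
# Build the Metrics API path for nodes, or for the pods of one namespace.
# metrics_path is a hypothetical helper, not a kubectl feature.
metrics_path() {
  base="/apis/metrics.k8s.io/v1beta1"
  case "$1" in
    nodes) printf '%s/nodes\n' "$base" ;;
    pods)  printf '%s/namespaces/%s/pods\n' "$base" "${2:-default}" ;;
    *)     echo "usage: metrics_path nodes | pods [namespace]" >&2; return 1 ;;
  esac
}

# Against a live cluster (needs kubectl access and a running Metrics Server):
#   kubectl get --raw "$(metrics_path nodes)"
#   kubectl get --raw "$(metrics_path pods kube-system)"
```

The raw responses are JSON, so they can be piped to any JSON tool for inspection.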
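Installing Metrics Server
「The Metrics Server source code and deployment manifests are available in its GitHub repository. Download the v0.3.6 source tarball:」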

curl -Ls https://api.github.com/repos/kubernetes-sigs/metrics-server/tarball/v0.3.6 -o metrics-server-v0.3.6.tar.gz
「Related image」

docker pull mirrorgooglecontainers/metrics-server-amd64:v0.3.6
You can pull the image yourself. I had already downloaded it, so here I simply copy the tarball to every node and import it.

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible all -m copy -a "src=./metrics-img.tar dest=/root/metrics-img.tar"
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible all -m shell -a "systemctl restart docker "
192.168.26.82 | CHANGED | rc=0 >>

192.168.26.83 | CHANGED | rc=0 >>

192.168.26.81 | CHANGED | rc=0 >>
「Import the image with docker load」

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible all -m shell -a "docker load -i /root/metrics-img.tar"
192.168.26.83 | CHANGED | rc=0 >>
Loaded image: k8s.gcr.io/metrics-server-amd64:v0.3.6
192.168.26.81 | CHANGED | rc=0 >>
Loaded image: k8s.gcr.io/metrics-server-amd64:v0.3.6
192.168.26.82 | CHANGED | rc=0 >>
Loaded image: k8s.gcr.io/metrics-server-amd64:v0.3.6
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
「Modify metrics-server-deployment.yaml」

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$mv kubernetes-sigs-metrics-server-d1f4f6f/ metrics
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cd metrics/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics]
└─$ls
cmd                 deploy      hack      OWNERS          README.md          version
code-of-conduct.md  Gopkg.lock  LICENSE   OWNERS_ALIASES  SECURITY_CONTACTS
CONTRIBUTING.md     Gopkg.toml  Makefile  pkg             vendor
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics]
└─$cd deploy/1.8+/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$ls
aggregated-metrics-reader.yaml  metrics-apiservice.yaml         resource-reader.yaml
auth-delegator.yaml             metrics-server-deployment.yaml
auth-reader.yaml                metrics-server-service.yaml
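「Here we change the image pull policy. The image is hard or impossible to pull from k8s.gcr.io in this environment, so we loaded it onto the nodes in advance.」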

┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$vim metrics-server-deployment.yaml
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        #imagePullPolicy: Always
        imagePullPolicy: IfNotPresent
        command:
        - /metrics-server
        - --metric-resolution=30s
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        volumeMounts:
「Apply the manifests to create the resource objects」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$kubectl apply -f .
「List the pods: metrics-server was created successfully」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$kubectl  get pods -n kube-system
NAME                                                 READY   STATUS    RESTARTS   AGE
calico-kube-controllers-78d6f96c7b-79xx4             1/1     Running   2          3h15m
calico-node-ntm7v                                    1/1     Running   1          12h
calico-node-skzjp                                    1/1     Running   4          12h
calico-node-v7pj5                                    1/1     Running   1          12h
coredns-545d6fc579-9h2z4                             1/1     Running   2          3h15m
coredns-545d6fc579-xgn8x                             1/1     Running   2          3h16m
etcd-vms81.liruilongs.github.io                      1/1     Running   1          13h
kube-apiserver-vms81.liruilongs.github.io            1/1     Running   2          13h
kube-controller-manager-vms81.liruilongs.github.io   1/1     Running   4          13h
kube-proxy-rbhgf                                     1/1     Running   1          13h
kube-proxy-vm2sf                                     1/1     Running   1          13h
kube-proxy-zzbh9                                     1/1     Running   1          13h
kube-scheduler-vms81.liruilongs.github.io            1/1     Running   5          13h
metrics-server-bcfb98c76-gttkh                       1/1     Running   0          70m
「Test with the kubectl top nodes command」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$kubectl top nodes
W1007 14:23:06.102605  102831 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
vms81.liruilongs.github.io   555m         27%    2025Mi          52%
vms82.liruilongs.github.io   204m         10%    595Mi           15%
vms83.liruilongs.github.io   214m         10%    553Mi           14%
┌──[root@vms81.liruilongs.github.io]-[~/ansible/metrics/deploy/1.8+]
└─$
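The kubectl top nodes table is easy to post-process. As a small sketch, the hypothetical helper busy_nodes below reads that output on stdin and prints the nodes whose MEMORY% is at or above a limit; the sample input is copied from the output above, and the 50% threshold is only an example.

```shell
# Print the names of nodes whose memory usage percentage is at or above a limit.
# Reads `kubectl top nodes` output on stdin; busy_nodes is a hypothetical helper.
busy_nodes() {
  awk -v limit="$1" '
    $1 == "NAME" { seen = 1; next }   # skip everything up to and including the header
    seen {
      pct = $5; sub(/%/, "", pct)     # MEMORY% column, e.g. "52%" becomes 52
      if (pct + 0 >= limit) print $1
    }'
}

# Sample input copied from the kubectl top nodes output above.
sample='NAME                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
vms81.liruilongs.github.io   555m         27%    2025Mi          52%
vms82.liruilongs.github.io   204m         10%    595Mi           15%
vms83.liruilongs.github.io   214m         10%    553Mi           14%'

printf '%s\n' "$sample" | busy_nodes 50   # prints vms81.liruilongs.github.io
```

On a live cluster the same helper can be fed directly: kubectl top nodes | busy_nodes 50.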
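Prometheus + Grafana + NodeExporter: cluster monitoring platform
「NodeExporter is deployed on every compute node to collect CPU, memory, disk, and I/O data. The data is sent to the Prometheus server on the monitoring node for storage and analysis, and Grafana provides the visualization.」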

Prometheus
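「Prometheus is an open source monitoring system originally developed at SoundCloud. It was the second project, after Kubernetes, to graduate from CNCF incubation, and it is widely used for containers and microservices. It can monitor a Kubernetes platform and the applications deployed on it at the same time, and it provides a set of tools and multi-dimensional monitoring metrics. Prometheus is typically paired with Grafana for data visualization.」
「The main features of Prometheus are:」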

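A multi-dimensional data model identified by metric name and key/value pairs.
A flexible query language, PromQL.
No dependency on distributed storage: each server node is autonomous.
Monitoring data is collected by pulling over HTTP.
Pushing time series is supported through a gateway.
Multiple graphing and dashboard options are supported, for example Grafana.
「The Prometheus ecosystem consists of several components that extend its functionality:」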

Component                  Description
Prometheus Server          Collects monitoring data, stores the time series, and serves queries.
Client SDKs                Development kits for integrating with Prometheus.
Push Gateway               Gateway component for pushed data.
Third-party Exporters      External metric collectors whose data Prometheus can scrape.
AlertManager               Alert manager.
Other auxiliary tools      --
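「The main functions of Prometheus Server, the core component of Prometheus, include, among others:」

Discovering the resources or services to monitor from the Kubernetes Master
Scraping (pulling) metrics from the various Exporters and storing them in its time series database (TSDB)
Exposing an HTTP API that other systems can query
Supporting data queries written in PromQL
Pushing alert data to AlertManager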
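To make the HTTP API and PromQL items concrete, here is a minimal sketch of an instant query using curl. The helper name prom_query and the base URL http://127.0.0.1:9090 are assumptions for illustration: they presume the Prometheus server is reachable locally (for example through a port-forward), which the text above does not set up.

```shell
# Run an instant PromQL query against the Prometheus HTTP API.
# $1 = base URL of the Prometheus server, $2 = PromQL expression.
# prom_query is a hypothetical helper; /api/v1/query is the instant-query endpoint.
prom_query() {
  # -G sends the url-encoded data as a GET query string.
  curl -sG "$1/api/v1/query" --data-urlencode "query=$2"
}

# Example (assumes Prometheus is reachable on 127.0.0.1:9090):
#   prom_query http://127.0.0.1:9090 'up'
```

The response is a JSON document; the expression up is a common first query because it reports whether each scrape target responded.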



[Figure: Prometheus system architecture]

NodeExporter
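「NodeExporter mainly collects server CPU, memory, disk, and I/O data and is a general-purpose solution for host-level metrics. Install NodeExporter and a cAdvisor container on each host and let Prometheus scrape them. Its function is similar to Zabbix.」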

Grafana
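「Grafana is a dashboard tool written in Go and JavaScript. It is the presentation layer for time series databases: it queries metrics from a data source in that source's query language (PromQL for Prometheus) and displays the results. It supports many kinds of custom dashboards and makes it easy to present monitoring data covering multiple Docker hosts.」
Setting up the Prometheus + Grafana + NodeExporter platform
Here we install with Helm, which is simple and quick. After the install finishes, all the related Pods are created. Below is the Pod list after a successful install.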

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME                                                    READY   STATUS    RESTARTS      AGE
alertmanager-liruilong-kube-prometheus-alertmanager-0   2/2     Running   0             61m
liruilong-grafana-5955564c75-zpbjq                      3/3     Running   0             62m
liruilong-kube-prometheus-operator-5cb699b469-fbkw5     1/1     Running   0             62m
liruilong-kube-state-metrics-5dcf758c47-bbwt4           1/1     Running   7 (32m ago)   62m
liruilong-prometheus-node-exporter-rfsc5                1/1     Running   0             62m
liruilong-prometheus-node-exporter-vm7s9                1/1     Running   0             62m
liruilong-prometheus-node-exporter-z9j8b                1/1     Running   0             62m
prometheus-liruilong-kube-prometheus-prometheus-0       2/2     Running   0             61m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$
Environment versions
「My K8s cluster version」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$kubectl  get nodes
NAME                         STATUS   ROLES                  AGE   VERSION
vms81.liruilongs.github.io   Ready    control-plane,master   34d   v1.22.2
vms82.liruilongs.github.io   Ready    <none>                 34d   v1.22.2
vms83.liruilongs.github.io   Ready    <none>                 34d   v1.22.2
「Helm version」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm version
version.BuildInfo{Version:"v3.2.1", GitCommit:"fe51cd1e31e6a202cba7dead9552a6d418ded79a", GitTreeState:"clean", GoVersion:"go1.13.10"}
Problems installing prometheus-operator (the old chart name)
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm  search repo prometheus-operator
NAME                            CHART VERSION   APP VERSION     DESCRIPTION
ali/prometheus-operator         8.7.0           0.35.0          Provides easy monitoring definitions for Kubern...
azure/prometheus-operator       9.3.2           0.38.1          DEPRECATED Provides easy monitoring definitions...
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm install liruilong  ali/prometheus-operator
Error: failed to install CRD crds/crd-alertmanager.yaml: unable to recognize "": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm pull  ali/prometheus-operator
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$
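The error above is consistent with the cluster version: Kubernetes 1.22 no longer serves apiextensions.k8s.io/v1beta1, which the old chart's CRDs use. As a quick check, the hypothetical helper serves_api below tests whether a group/version appears in a list such as the one printed by kubectl api-versions. The two-line sample list is an illustrative subset, not real cluster output.

```shell
# Succeed if the given group/version appears in a list of API versions on stdin
# (one per line, as printed by `kubectl api-versions`).
# serves_api is a hypothetical helper.
serves_api() {
  grep -qxF "$1"
}

# Illustrative subset of what a 1.22 cluster might list (not captured output).
versions='apiextensions.k8s.io/v1
apps/v1'

if printf '%s\n' "$versions" | serves_api apiextensions.k8s.io/v1beta1; then
  echo "served"
else
  echo "not served"   # this branch runs for the sample list
fi
```

On a live cluster: kubectl api-versions | serves_api apiextensions.k8s.io/v1beta1 || echo "not served".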
Fix: install the new chart
「Download the chart package for kube-prometheus-stack (the new name) directly and install it from the command line:」

https://github.com/prometheus-community/helm-charts/releases/download/kube-prometheus-stack-30.0.1/kube-prometheus-stack-30.0.1.tgz
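For scripting the download, a small sketch follows. The helper chart_url is hypothetical, and it assumes other chart versions follow the same release URL pattern as 30.0.1 above. The commented helm commands show the usual repository-based alternative; the repository alias prometheus-community is a local name you choose.

```shell
# Build the GitHub release download URL for a given kube-prometheus-stack version.
# chart_url is a hypothetical helper; the URL pattern is inferred from the 30.0.1 link.
chart_url() {
  printf 'https://github.com/prometheus-community/helm-charts/releases/download/kube-prometheus-stack-%s/kube-prometheus-stack-%s.tgz\n' "$1" "$1"
}

# Download the package:
#   curl -LO "$(chart_url 30.0.1)"
# Or fetch it through a Helm repository instead:
#   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
#   helm repo update
#   helm pull prometheus-community/kube-prometheus-stack --version 30.0.1
```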

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$ls
index.yaml  kube-prometheus-stack-30.0.1.tgz  liruilonghelm  liruilonghelm-0.1.0.tgz  mysql  mysql-1.6.4.tgz
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$helm list
NAME    NAMESPACE       REVISION        UPDATED STATUS  CHART   APP VERSION
「Extract the chart package kube-prometheus-stack-30.0.1.tgz」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$tar -zxf kube-prometheus-stack-30.0.1.tgz
「Create a new namespace and make it the default for the current context」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$cd kube-prometheus-stack/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl create ns monitoring
namespace/monitoring created
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl config  set-context $(kubectl config current-context) --namespace=monitoring
Context "kubernetes-admin@kubernetes" modified.
「From inside the chart directory, install with helm install liruilong .」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$ls
Chart.lock  charts  Chart.yaml  CONTRIBUTING.md  crds  README.md  templates  values.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$helm install liruilong .
「Problem: the image for the kube-prometheus-admission-create Pod cannot be pulled」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$kubectl get pods
NAME                                                  READY   STATUS             RESTARTS   AGE
liruilong-kube-prometheus-admission-create--1-bn7x2   0/1     ImagePullBackOff   0          33s
「The Pod details show that the image is hosted on Google's registry (k8s.gcr.io), which cannot be reached from mainland China」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$kubectl  describe pod  liruilong-kube-prometheus-admission-create--1-bn7x2
Name:         liruilong-kube-prometheus-admission-create--1-bn7x2
Namespace:    monitoring
Priority:     0
Node:         vms83.liruilongs.github.io/192.168.26.83
Start Time:   Sun, 16 Jan 2022 02:43:07 +0800
Labels:       app=kube-prometheus-stack-admission-create
              app.kubernetes.io/instance=liruilong
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/part-of=kube-prometheus-stack
              app.kubernetes.io/version=30.0.1
              chart=kube-prometheus-stack-30.0.1
              controller-uid=2ce48cd2-a118-4e23-a27f-0228ef6c45e7
              heritage=Helm
              job-name=liruilong-kube-prometheus-admission-create
              release=liruilong
Annotations:  cni.projectcalico.org/podIP: 10.244.70.8/32
              cni.projectcalico.org/podIPs: 10.244.70.8/32
Status:       Pending
IP:           10.244.70.8
IPs:
  IP:           10.244.70.8
Controlled By:  Job/liruilong-kube-prometheus-admission-create
Containers:
  create:
    Container ID:
    Image:         k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068
    Image ID:
    Port:          <none>
    Host Port:
    ......
「I found an equivalent image on Docker Hub and switched to it with kubectl edit」

image: k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0  replaced with: docker.io/liangjw/kube-webhook-certgen:v1.1.1
「Alternatively, edit values.yaml and reinstall (remember to comment out the sha)」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$ls
index.yaml  kube-prometheus-stack  kube-prometheus-stack-30.0.1.tgz  liruilonghelm  liruilonghelm-0.1.0.tgz  mysql  mysql-1.6.4.tgz
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create]
└─$cd kube-prometheus-stack/
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$ls
Chart.lock  charts  Chart.yaml  CONTRIBUTING.md  crds  README.md  templates  values.yaml
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$cat values.yaml | grep -A 3 -B 2 kube-webhook-certgen
      enabled: true
      image:
        repository: docker.io/liangjw/kube-webhook-certgen
        tag: v1.1.1
        #sha: "f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068"
        pullPolicy: IfNotPresent
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$helm del liruilong;helm install liruilong .
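Instead of editing the chart's own values.yaml, the same change can be kept in a separate override file. The sketch below writes one; the helper name write_certgen_override is hypothetical, and the key path prometheusOperator.admissionWebhooks.patch.image is an assumption about this chart version's layout that should be confirmed against values.yaml before use. Setting sha to an empty string is meant to have the same effect as commenting it out.

```shell
# Write a small Helm values override for the webhook certgen image.
# write_certgen_override is a hypothetical helper. The key path below is an
# assumption; verify it in the chart's values.yaml (search for kube-webhook-certgen).
write_certgen_override() {
  cat > "$1" <<'EOF'
prometheusOperator:
  admissionWebhooks:
    patch:
      image:
        repository: docker.io/liangjw/kube-webhook-certgen
        tag: v1.1.1
        sha: ""
EOF
}

# Usage, from the chart directory:
#   write_certgen_override ./certgen-override.yaml
#   helm install liruilong . -n monitoring -f ./certgen-override.yaml
```

Keeping overrides in their own file leaves the extracted chart untouched, which makes later chart upgrades easier to compare.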
「After that, the other Pods are being created normally」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME                                                  READY   STATUS              RESTARTS   AGE
liruilong-grafana-5955564c75-zpbjq                    0/3     ContainerCreating   0          27s
liruilong-kube-prometheus-operator-5cb699b469-fbkw5   0/1     ContainerCreating   0          27s
liruilong-kube-state-metrics-5dcf758c47-bbwt4         0/1     ContainerCreating   0          27s
liruilong-prometheus-node-exporter-rfsc5              0/1     ContainerCreating   0          28s
liruilong-prometheus-node-exporter-vm7s9              0/1     ContainerCreating   0          28s
liruilong-prometheus-node-exporter-z9j8b              0/1     ContainerCreating   0          28s
「The image for the kube-state-metrics Pod was not pulled either, presumably for the same reason」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME                                                    READY   STATUS             RESTARTS   AGE
alertmanager-liruilong-kube-prometheus-alertmanager-0   2/2     Running            0          3m35s
liruilong-grafana-5955564c75-zpbjq                      3/3     Running            0          4m46s
liruilong-kube-prometheus-operator-5cb699b469-fbkw5     1/1     Running            0          4m46s
liruilong-kube-state-metrics-5dcf758c47-bbwt4           0/1     ImagePullBackOff   0          4m46s
liruilong-prometheus-node-exporter-rfsc5                1/1     Running            0          4m47s
liruilong-prometheus-node-exporter-vm7s9                1/1     Running            0          4m47s
liruilong-prometheus-node-exporter-z9j8b                1/1     Running            0          4m47s
prometheus-liruilong-kube-prometheus-prometheus-0       2/2     Running            0          3m34s
「Likewise, the image k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0 cannot be pulled」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl  describe  pod liruilong-kube-state-metrics-5dcf758c47-bbwt4
Name:         liruilong-kube-state-metrics-5dcf758c47-bbwt4
Namespace:    monitoring
Priority:     0
Node:         vms82.liruilongs.github.io/192.168.26.82
Start Time:   Sun, 16 Jan 2022 02:59:53 +0800
Labels:       app.kubernetes.io/component=metrics
              app.kubernetes.io/instance=liruilong
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=kube-state-metrics
              app.kubernetes.io/part-of=kube-state-metrics
              app.kubernetes.io/version=2.3.0
              helm.sh/chart=kube-state-metrics-4.3.0
              pod-template-hash=5dcf758c47
              release=liruilong
Annotations:  cni.projectcalico.org/podIP: 10.244.171.153/32
              cni.projectcalico.org/podIPs: 10.244.171.153/32
Status:       Pending
IP:           10.244.171.153
IPs:
  IP:           10.244.171.153
Controlled By:  ReplicaSet/liruilong-kube-state-metrics-5dcf758c47
Containers:
  kube-state-metrics:
    Container ID:
    Image:         k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0
    Image ID:
    Port:          8080/TCP
    ......
「As before, find an equivalent image on Docker Hub, then change the Pod's image with kubectl edit pod」

k8s.gcr.io/kube-state-metrics/kube-state-metrics replaced with: docker.io/dyrnq/kube-state-metrics:v2.3.0
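Because this Pod is owned by a ReplicaSet, an image edited on the Pod itself is lost if the Pod is ever recreated. A non-interactive alternative is to change the owning Deployment with kubectl set image. In this sketch the helper name use_mirror_image is hypothetical, and the Deployment name liruilong-kube-state-metrics is inferred from the ReplicaSet name in the describe output rather than shown in the original.

```shell
# Point one container of a Deployment in the monitoring namespace at another image.
# $1 = Deployment name, $2 = container name, $3 = new image reference.
# use_mirror_image is a hypothetical wrapper around `kubectl set image`.
use_mirror_image() {
  kubectl -n monitoring set image "deployment/$1" "$2=$3"
}

# Usage (Deployment name inferred from the ReplicaSet name, container name from
# the describe output):
#   use_mirror_image liruilong-kube-state-metrics kube-state-metrics \
#     docker.io/dyrnq/kube-state-metrics:v2.3.0
```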
「You can pull the image on the worker nodes first」

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible node -m shell -a "docker pull dyrnq/kube-state-metrics:v2.3.0"
192.168.26.82 | CHANGED | rc=0 >>
v2.3.0: Pulling from dyrnq/kube-state-metrics
e8614d09b7be: Pulling fs layer
53ccb90bafd7: Pulling fs layer
e8614d09b7be: Verifying Checksum
e8614d09b7be: Download complete
e8614d09b7be: Pull complete
53ccb90bafd7: Verifying Checksum
53ccb90bafd7: Download complete
53ccb90bafd7: Pull complete
Digest: sha256:c9137505edaef138cc23479c73e46e9a3ef7ec6225b64789a03609c973b99030
Status: Downloaded newer image for dyrnq/kube-state-metrics:v2.3.0
docker.io/dyrnq/kube-state-metrics:v2.3.0
192.168.26.83 | CHANGED | rc=0 >>
v2.3.0: Pulling from dyrnq/kube-state-metrics
e8614d09b7be: Pulling fs layer
53ccb90bafd7: Pulling fs layer
e8614d09b7be: Verifying Checksum
e8614d09b7be: Download complete
e8614d09b7be: Pull complete
53ccb90bafd7: Verifying Checksum
53ccb90bafd7: Download complete
53ccb90bafd7: Pull complete
Digest: sha256:c9137505edaef138cc23479c73e46e9a3ef7ec6225b64789a03609c973b99030
Status: Downloaded newer image for dyrnq/kube-state-metrics:v2.3.0
docker.io/dyrnq/kube-state-metrics:v2.3.0
「After the change, all the Pods are created successfully」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl get pods
NAME                                                    READY   STATUS    RESTARTS      AGE
alertmanager-liruilong-kube-prometheus-alertmanager-0   2/2     Running   0             61m
liruilong-grafana-5955564c75-zpbjq                      3/3     Running   0             62m
liruilong-kube-prometheus-operator-5cb699b469-fbkw5     1/1     Running   0             62m
liruilong-kube-state-metrics-5dcf758c47-bbwt4           1/1     Running   7 (32m ago)   62m
liruilong-prometheus-node-exporter-rfsc5                1/1     Running   0             62m
liruilong-prometheus-node-exporter-vm7s9                1/1     Running   0             62m
liruilong-prometheus-node-exporter-z9j8b                1/1     Running   0             62m
prometheus-liruilong-kube-prometheus-prometheus-0       2/2     Running   0             61m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$
「Next, change the type of the liruilong-grafana Service to NodePort so that it can be reached from the physical host」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl  get svc
NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                    ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   33m
liruilong-grafana                        ClusterIP   10.99.220.121    <none>        80/TCP                       34m
liruilong-kube-prometheus-alertmanager   ClusterIP   10.97.193.228    <none>        9093/TCP                     34m
liruilong-kube-prometheus-operator       ClusterIP   10.101.106.93    <none>        443/TCP                      34m
liruilong-kube-prometheus-prometheus     ClusterIP   10.105.176.19    <none>        9090/TCP                     34m
liruilong-kube-state-metrics             ClusterIP   10.98.94.55      <none>        8080/TCP                     34m
liruilong-prometheus-node-exporter       ClusterIP   10.110.216.215   <none>        9100/TCP                     34m
prometheus-operated                      ClusterIP   None             <none>        9090/TCP                     33m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack]
└─$kubectl edit svc liruilong-grafana
service/liruilong-grafana edited
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl  get svc
NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                    ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   35m
liruilong-grafana                        NodePort    10.99.220.121    <none>        80:30443/TCP                 36m
liruilong-kube-prometheus-alertmanager   ClusterIP   10.97.193.228    <none>        9093/TCP                     36m
liruilong-kube-prometheus-operator       ClusterIP   10.101.106.93    <none>        443/TCP                      36m
liruilong-kube-prometheus-prometheus     ClusterIP   10.105.176.19    <none>        9090/TCP                     36m
liruilong-kube-state-metrics             ClusterIP   10.98.94.55      <none>        8080/TCP                     36m
liruilong-prometheus-node-exporter       ClusterIP   10.110.216.215   <none>        9100/TCP                     36m
prometheus-operated                      ClusterIP   None             <none>        9090/TCP                     35m
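The interactive kubectl edit step can also be done non-interactively with kubectl patch. The helper name expose_nodeport below is hypothetical. Note one difference from the transcript: this patch lets Kubernetes pick the node port, whereas the 30443 shown above was set by hand during the edit.

```shell
# Switch a Service in the monitoring namespace to type NodePort.
# $1 = Service name. expose_nodeport is a hypothetical wrapper around `kubectl patch`.
expose_nodeport() {
  kubectl -n monitoring patch svc "$1" --type merge -p '{"spec":{"type":"NodePort"}}'
}

# Usage:
#   expose_nodeport liruilong-grafana
# Read back the allocated port:
#   kubectl -n monitoring get svc liruilong-grafana -o jsonpath='{.spec.ports[0].nodePort}'
```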
Access from the physical host

「Get the user name and password by base64-decoding the Secret」

┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets | grep grafana
liruilong-grafana                                                  Opaque                                3      38m
liruilong-grafana-test-token-q8z8j                                 kubernetes.io/service-account-token   3      38m
liruilong-grafana-token-j94p8                                      kubernetes.io/service-account-token   3      38m
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets liruilong-grafana -o yaml
apiVersion: v1
data:
  admin-password: cHJvbS1vcGVyYXRvcg==
  admin-user: YWRtaW4=
  ldap-toml: ""
kind: Secret
metadata:
  annotations:
    meta.helm.sh/release-name: liruilong
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2022-01-15T18:59:40Z"
  labels:
    app.kubernetes.io/instance: liruilong
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: grafana
    app.kubernetes.io/version: 8.3.3
    helm.sh/chart: grafana-6.20.5
  name: liruilong-grafana
  namespace: monitoring
  resourceVersion: "1105663"
  uid: c03ff5f3-deb5-458c-8583-787f41034469
type: Opaque
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets liruilong-grafana -o jsonpath='{.data.admin-user}'| base64 -d
admin
┌──[root@vms81.liruilongs.github.io]-[~/ansible/k8s-helm-create/kube-prometheus-stack/templates]
└─$kubectl get secrets liruilong-grafana -o jsonpath='{.data.admin-password}'| base64 -d
prom-operator
The resulting user name and password are admin/prom-operator
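Secret values are only base64-encoded, not encrypted, so the two strings from the Secret's YAML above can be checked offline. This minimal sketch assumes a base64 that accepts -d (GNU coreutils does); decode_b64 is a hypothetical helper name.

```shell
# Decode one base64 string. printf avoids feeding a trailing newline to base64.
# decode_b64 is a hypothetical helper; `base64 -d` is the GNU coreutils spelling.
decode_b64() {
  printf '%s' "$1" | base64 -d
}

# Values copied from the liruilong-grafana Secret shown above.
decode_b64 'YWRtaW4='; echo               # prints admin
decode_b64 'cHJvbS1vcGVyYXRvcg=='; echo   # prints prom-operator
```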

Log in as usual and view the monitoring data

Author: 山河已無恙

