[HOWTO] Ceph+grafana+prometheus

Documentation
Name: [HOWTO] Ceph+grafana+prometheus
Description: How to set up Ceph with Prometheus and Grafana for advanced statistics
Modification date : 24/12/2021
Owner:dodger
Notify changes to:Owner
Tags:ceph, object storage
Escalate to:The_fucking_bofh

Documentation

Pre-Requisites

Prometheus node exporter

From the salt-master:

export THEHOSTNAME='avmlp-os*'
salt "${THEHOSTNAME}" test.ping
salt "${THEHOSTNAME}" pkg.install golang-github-prometheus-node-exporter
salt "${THEHOSTNAME}" service.start node_exporter
salt "${THEHOSTNAME}" service.enable node_exporter
salt "${THEHOSTNAME}" service.status node_exporter

Check:

salt "${THEHOSTNAME}" cmd.run "netstat -nap | egrep 9100 | egrep LISTEN"
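A listening port only proves that something is bound to 9100. As a rough sketch, you can also check that the exporter really serves metrics. The helper name has_metric is made up for this example, and node_uname_info is assumed to be exposed by the default node_exporter collectors:

```shell
# Hypothetical helper (not part of node_exporter or Prometheus):
# reads Prometheus text-format metrics on stdin and succeeds if at
# least one sample of the given metric name is present.
has_metric() {
    grep -Eq "^${1}(\{| )"
}

# Possible usage against one of the nodes (hostname taken from the example list):
#   curl -s http://bvmlm-osd-001.ciberterminal.net:9100/metrics \
#       | has_metric node_uname_info && echo "exporter OK"
```

The pattern anchors on the metric name followed by either a label block or a space, so a longer metric name sharing the same prefix does not count as a match.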

Obtain the list of nodes for configuring prometheus to scrape the node_exporter:

salt "${THEHOSTNAME}" service.status node_exporter | grep "^${THEHOSTNAME}" |  awk -F\: '{print "\047"$1":9100\047,"}'

Example:

root@avmlm-salt-001 /home/bofher/scripts/nutanix_buster $ salt "${THEHOSTNAME}" service.status node_exporter | grep "^${THEHOSTNAME}" |  awk -F\: '{print "\047"$1":9100\047,"}'
'bvmlm-osd-001.ciberterminal.net:9100',
'bvmlm-osd-019.ciberterminal.net:9100',
'bvmlm-osd-013.ciberterminal.net:9100',
'bvmlm-osm-003.ciberterminal.net:9100',
'bvmlm-osd-005.ciberterminal.net:9100',
'bvmlm-oslb-001.ciberterminal.net:9100',
'bvmlm-osd-010.ciberterminal.net:9100',
'bvmlm-osd-003.ciberterminal.net:9100',
'bvmlm-osd-020.ciberterminal.net:9100',
'bvmlm-osfs-003.ciberterminal.net:9100',
'bvmlm-osd-002.ciberterminal.net:9100',
'bvmlm-osm-001.ciberterminal.net:9100',
'bvmlm-osm-004.ciberterminal.net:9100',
'bvmlm-osd-015.ciberterminal.net:9100',
'bvmlm-osd-018.ciberterminal.net:9100',
'bvmlm-osgw-001.ciberterminal.net:9100',
'bvmlm-osd-017.ciberterminal.net:9100',
'bvmlm-osd-011.ciberterminal.net:9100',
'bvmlm-osd-007.ciberterminal.net:9100',
'bvmlm-osgw-004.ciberterminal.net:9100',
'bvmlm-osgw-003.ciberterminal.net:9100',
'bvmlm-osd-006.ciberterminal.net:9100',
'bvmlm-osfs-004.ciberterminal.net:9100',
'bvmlm-osm-002.ciberterminal.net:9100',
'bvmlm-osd-008.ciberterminal.net:9100',
'bvmlm-osfs-002.ciberterminal.net:9100',
'bvmlm-osfs-001.ciberterminal.net:9100',
'bvmlm-osd-004.ciberterminal.net:9100',
'bvmlm-oslb-002.ciberterminal.net:9100',
'bvmlm-osd-012.ciberterminal.net:9100',
'bvmlm-osd-009.ciberterminal.net:9100',
'bvmlm-osgw-002.ciberterminal.net:9100',
'bvmlm-osd-014.ciberterminal.net:9100',
'bvmlm-osm-005.ciberterminal.net:9100',
'bvmlm-osnx-002.ciberterminal.net:9100',
'bvmlm-osd-016.ciberterminal.net:9100',
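The awk one-liner above leaves a trailing comma on the last entry and keeps the order salt returned. If you prefer a sorted list that can be pasted into prometheus.yml as-is, something like the following sketch could do it. The function name hosts_to_targets is hypothetical, and the usage line assumes that salt's txt outputter prints one "minion: result" line per host:

```shell
# Hypothetical helper: read hostnames (one per line) on stdin and print
# the body of a Prometheus `targets: [ ... ]` list, sorted, de-duplicated
# and without a trailing comma on the last entry.
hosts_to_targets() {
    local port="${1:-9100}"
    sort -u | awk -v port="${port}" '
        NF { hosts[++n] = $1 }
        END {
            for (i = 1; i <= n; i++)
                printf "        \047%s:%s\047%s\n", hosts[i], port, (i < n ? "," : "")
        }'
}

# Possible usage from the salt-master (assumes the txt outputter format):
#   salt --out=txt "${THEHOSTNAME}" test.ping | awk -F: '{print $1}' | hosts_to_targets
```

The eight-space indentation matches the target list in the prometheus.yml shown further down.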

Prometheus

Bare-minimum install instructions:

# The heredoc delimiter is quoted so that $releasever and $basearch reach the file unexpanded
cat >/etc/yum.repos.d/prometheus.repo <<'EOF'
[prometheus]
name=prometheus
baseurl=https://packagecloud.io/prometheus-rpm/release/el/$releasever/$basearch
repo_gpgcheck=1
enabled=1
gpgkey=https://packagecloud.io/prometheus-rpm/release/gpgkey
       https://raw.githubusercontent.com/lest/prometheus-rpm/master/RPM-GPG-KEY-prometheus-rpm
gpgcheck=1
metadata_expire=300
EOF
 
 
yum install prometheus2.x86_64 \
            apache_exporter.x86_64 \
            collectd_exporter.x86_64 \
            consul_exporter.x86_64 \
            elasticsearch_exporter.x86_64 \
            graphite_exporter.x86_64 \
            haproxy_exporter.x86_64 \
            kafka_exporter.x86_64 \
            memcached_exporter.x86_64 \
            mysqld_exporter.x86_64 \
            nginx_exporter.x86_64 \
            node_exporter.x86_64 \
            postgres_exporter.x86_64 \
            process_exporter.x86_64 \
            pushgateway.x86_64 \
            rabbitmq_exporter.x86_64 \
            redis_exporter.x86_64 \
            sachet.x86_64 \
            smokeping_prober.x86_64 \
            snmp_exporter.x86_64 \
            statsd_exporter.x86_64 \
            thanos.x86_64
systemctl start prometheus
systemctl enable prometheus
systemctl status prometheus

Prometheus setup: add a scrape config for Ceph. For example, in dev with larry:

prometheus.yml
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
 
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
 
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
 
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['0.0.0.0:9090']
  - job_name: 'ceph-larry'
    static_configs:
    - targets: ['larry.ciberterminal.net:9283']
  - job_name: 'node-exporter'
    static_configs:
    - targets: [
        'bvmlm-osd-001.ciberterminal.net:9100',
        'bvmlm-osd-019.ciberterminal.net:9100',
        'bvmlm-osd-013.ciberterminal.net:9100',
        'bvmlm-osm-003.ciberterminal.net:9100',
        'bvmlm-osd-005.ciberterminal.net:9100',
        'bvmlm-oslb-001.ciberterminal.net:9100',
        'bvmlm-osd-010.ciberterminal.net:9100',
        'bvmlm-osd-003.ciberterminal.net:9100',
        'bvmlm-osd-020.ciberterminal.net:9100',
        'bvmlm-osfs-003.ciberterminal.net:9100',
        'bvmlm-osd-002.ciberterminal.net:9100',
        'bvmlm-osm-001.ciberterminal.net:9100',
        'bvmlm-osm-004.ciberterminal.net:9100',
        'bvmlm-osd-015.ciberterminal.net:9100',
        'bvmlm-osd-018.ciberterminal.net:9100',
        'bvmlm-osgw-001.ciberterminal.net:9100',
        'bvmlm-osd-017.ciberterminal.net:9100',
        'bvmlm-osd-011.ciberterminal.net:9100',
        'bvmlm-osd-007.ciberterminal.net:9100',
        'bvmlm-osgw-004.ciberterminal.net:9100',
        'bvmlm-osgw-003.ciberterminal.net:9100',
        'bvmlm-osd-006.ciberterminal.net:9100',
        'bvmlm-osfs-004.ciberterminal.net:9100',
        'bvmlm-osm-002.ciberterminal.net:9100',
        'bvmlm-osd-008.ciberterminal.net:9100',
        'bvmlm-osfs-002.ciberterminal.net:9100',
        'bvmlm-osfs-001.ciberterminal.net:9100',
        'bvmlm-osd-004.ciberterminal.net:9100',
        'bvmlm-oslb-002.ciberterminal.net:9100',
        'bvmlm-osd-012.ciberterminal.net:9100',
        'bvmlm-osd-009.ciberterminal.net:9100',
        'bvmlm-osgw-002.ciberterminal.net:9100',
        'bvmlm-osd-014.ciberterminal.net:9100',
        'bvmlm-osm-005.ciberterminal.net:9100',
        'bvmlm-osnx-002.ciberterminal.net:9100',
        'bvmlm-osd-016.ciberterminal.net:9100'
        ]
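A long hand-edited target list is easy to break with a stray comma or bad indentation. As a sketch, the config can be validated before any restart with promtool. This assumes promtool was installed alongside Prometheus and that the config lives in /etc/prometheus/prometheus.yml; the wrapper name is made up:

```shell
# Hypothetical wrapper: validate a Prometheus config file with promtool.
# Assumption: promtool is in PATH and the default config path below
# matches your installation.
check_prometheus_config() {
    local cfg="${1:-/etc/prometheus/prometheus.yml}"
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found in PATH" >&2
        return 127
    fi
    promtool check config "${cfg}"
}

# Possible usage:
#   check_prometheus_config && echo "config looks sane"
```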

We will restart and check after setting up the rest of the elements :-)

Grafana

  • Grafana working

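I didn't set it up myself, so I can't give instructions here xD

Additional setup for Grafana to work with Ceph (note that the diff below runs from the modified grafana.ini to grafana.ini.orig, so the lines marked with - are the ones added to the stock config):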

--- grafana.ini 2021-12-24 10:38:20.669668776 +0100
+++ grafana.ini.orig    2021-12-24 12:36:44.083311253 +0100
@@ -185,7 +185,6 @@
 
 # set to true if you want to allow browsers to render Grafana in a <frame>, <iframe>, <embed> or <object>. default is false.
 ;allow_embedding = false
-allow_embedding = true
 
 # Set to true if you want to enable http strict transport security (HSTS) response header.
 # This is only sent when HTTPS is enabled in this configuration.
@@ -308,16 +307,12 @@
 [auth.anonymous]
 # enable anonymous access
 ;enabled = false
-enabled = true
 
 # specify organization name that should be used for unauthenticated users
 ;org_name = Main Org.
-;org_name = ciberterminal.net
-org_name = ciberterminal DEMO
 
 # specify role for unauthenticated users
 ;org_role = Viewer
-org_role = Viewer
 
 #################################### Github Auth ##########################
 [auth.github]
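Since grafana.ini ships with most keys commented out, it is easy to edit the wrong line. The sketch below prints the effective (uncommented) value of a key so the changes above can be double-checked. The helper name ini_get and the /etc/grafana/grafana.ini path in the usage lines are assumptions, and it simply prints the last assignment it sees in the section:

```shell
# Hypothetical helper: print the uncommented value of <key> inside
# [<section>] of an INI file; exits non-zero if the key is not set.
# usage: ini_get <section> <key> <file>
ini_get() {
    awk -v section="$1" -v key="$2" '
        /^[ \t]*[;#]/ { next }                  # skip commented-out defaults
        /^[ \t]*\[/ {                           # section header
            gsub(/^[ \t]*\[|\][ \t]*$/, "")
            current = $0
            next
        }
        current == section && (pos = index($0, "=")) {
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { value = v; found = 1 }
        }
        END { if (!found) exit 1; print value }
    ' "$3"
}

# Possible usage (the path is an assumption):
#   ini_get security allow_embedding /etc/grafana/grafana.ini
#   ini_get auth.anonymous enabled /etc/grafana/grafana.ini
```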


You'll also need the following Grafana plugins:

grafana-cli plugins install vonage-status-panel
grafana-cli plugins install grafana-piechart-panel


Import all of the official dashboards :-)
Here are some nice one-liners to simplify the process:

wget "https://github.com/ceph/ceph/tree/master/monitoring/grafana/dashboards"
for i in $(cat dashboards| egrep json |egrep "dashboard" | awk -F\" '{print $6}' | egrep "\.json") ; do wget "https://raw.githubusercontent.com/ceph/ceph/master/monitoring/grafana/dashboards/${i}" ; done
for i in *.json ; do jq . "${i}" >/dev/null && echo "### OK ${i}" || echo "@@@ KO ${i}" ; done

Import them through the web UI (I couldn't import them through the API).
You'll also have to set up Prometheus as a data source in Grafana and configure the Prometheus module on the Ceph side:

Instructions

Following the official documentation, on any of the Ceph admin nodes:

ceph mgr module enable prometheus
ceph config set mgr mgr/prometheus/server_port 9283
ceph config set mgr mgr/prometheus/server_addr 0.0.0.0
ceph config set mgr mgr/prometheus/scrape_interval 15
ceph dashboard set-grafana-api-url http://avvmld-graf-001.ciberterminal.net:3000/
ceph dashboard set-grafana-api-ssl-verify False

Change the Grafana URL to match your setup.
Check:

bvmlm-osm-001 /home/bofher # ceph config dump | egrep -v "KEY"
WHO   MASK LEVEL    OPTION                           VALUE                                    RO 
  mgr      advanced mgr/dashboard/GRAFANA_API_URL    https://grafana-bavel.ciberterminal.net/    *  
  mgr      advanced mgr/prometheus/scrape_interval   15                                       *  
  mgr      advanced mgr/prometheus/server_addr       0.0.0.0                                  *  
  mgr      advanced mgr/prometheus/server_port       9283                                     *  
 
bvmlm-osm-001 /home/bofher # ceph mgr services
{
    "dashboard": "https://bvmlm-osm-002.ciberterminal.net:8443/",
    "prometheus": "http://bvmlm-osm-002.ciberterminal.net:9283/"
}
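The prometheus URL reported by ceph mgr services points at whichever manager is active right now. As a small sketch, the URL can be pulled out of that JSON and probed directly. The helper name mgr_service_url is hypothetical; with jq installed, jq -r .prometheus would do the same job:

```shell
# Hypothetical helper: read the JSON printed by `ceph mgr services` on
# stdin and print the URL of the named service (e.g. "prometheus").
# Prints nothing if the service is not listed.
mgr_service_url() {
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Possible usage on a Ceph admin node; the URL in the output above ends
# with a slash, so "metrics" is appended directly:
#   url="$(ceph mgr services | mgr_service_url prometheus)"
#   curl -s "${url}metrics" | grep -c '^ceph_health_status'
```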


HAProxy configuration so that it magically balances to whichever monitor server is currently running the dashboard and prometheus modules:

# Frontend for the prometheus scraper
frontend http_web
    bind *:9283
    mode http
    default_backend ceph_prometheus
 
backend ceph_prometheus
    mode http
    option httpchk GET /
    http-check expect status 200
    server monscraper1 bvmlm-osm-001.ciberterminal.net:9283 check verify none
    server monscraper2 bvmlm-osm-002.ciberterminal.net:9283 check verify none
    server monscraper3 bvmlm-osm-003.ciberterminal.net:9283 check verify none
    server monscraper4 bvmlm-osm-004.ciberterminal.net:9283 check verify none
    server monscraper5 bvmlm-osm-005.ciberterminal.net:9283 check verify none
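To see from the load balancer which monitors currently answer on the Prometheus port (roughly what the httpchk above asks each backend), a loop like this sketch can help. The function name probe_backends and the 5-second timeout are arbitrary choices for the example:

```shell
# Hypothetical helper: print the HTTP status code each host returns for
# GET / on the given port, or ERR if curl itself fails (connection
# refused, timeout, ...).
# usage: probe_backends <port> <host>...
probe_backends() {
    local port="$1"; shift
    local host code
    for host in "$@"; do
        code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
                     "http://${host}:${port}/")" || code="ERR"
        printf '%s %s\n' "${host}" "${code}"
    done
}

# Possible usage:
#   probe_backends 9283 bvmlm-osm-00{1..5}.ciberterminal.net
```

Any backend that prints 200 here is one the health check in the config above would consider up.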


Go and restart Prometheus to begin scraping Ceph:

systemctl restart prometheus
systemctl status prometheus

Check targets on prometheus: http://avmlm-prom-001:9090/targets (change the prometheus server…)
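The same information is available without a browser through the Prometheus HTTP API. The sketch below lists only the targets that are not up; it assumes jq is installed (it is already used for the dashboards above), and the function name down_targets is made up:

```shell
# Hypothetical helper: read the JSON returned by Prometheus'
# /api/v1/targets endpoint on stdin and print one line per active
# target whose health is not "up": <job> <scrape URL> <health>.
down_targets() {
    jq -r '.data.activeTargets[]
           | select(.health != "up")
           | "\(.labels.job) \(.scrapeUrl) \(.health)"'
}

# Possible usage (change the prometheus server):
#   curl -s http://avmlm-prom-001:9090/api/v1/targets | down_targets
```

An empty output means every configured target is being scraped successfully.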

Need more instructions? RTFM!

For NX nodes (nginx)

Add firewall rules:

firewall-cmd --permanent --zone=public --add-rich-rule='rule family=ipv4 source address=10.40.3.64/32 port port=9100 protocol=tcp accept'
firewall-cmd  --zone=public --add-rich-rule='rule family=ipv4 source address=10.40.3.64/32 port port=9100 protocol=tcp accept'
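The first command writes the rule to the permanent configuration and the second applies it to the running firewall, so both are needed unless you reload. A sketch that wraps the pair so the scraper address is not hard-coded (the function name allow_scraper and its defaults are assumptions):

```shell
# Hypothetical wrapper: allow one source address to reach a TCP port,
# both in the permanent firewalld config and in the running firewall.
# usage: allow_scraper <source-cidr> [port] [zone]
allow_scraper() {
    local src="$1"
    local port="${2:-9100}"
    local zone="${3:-public}"
    local rule="rule family=ipv4 source address=${src} port port=${port} protocol=tcp accept"
    firewall-cmd --permanent --zone="${zone}" --add-rich-rule="${rule}" &&
    firewall-cmd --zone="${zone}" --add-rich-rule="${rule}"
}

# Possible usage, equivalent to the two commands above:
#   allow_scraper 10.40.3.64/32
```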

Final thoughts

linux/ceph/howtos/ceph_grafana_prometheus.txt · Last modified: 2022/02/11 11:36 by 127.0.0.1