Spark in me - Internet, data science, math, deep learning, philo

snakers4 @ telegram, 1797 members, 1726 posts since 2016

All this - lost like tears in rain.

Data science, ML, a bit of philosophy and math. No bs.

Our website
- spark-in.me
Our chat
- t.me/joinchat/Bv9tjkH9JHYvOr92hi5LxQ
DS courses review
- goo.gl/5VGU5A
- goo.gl/YzVUKf

March 04, 08:46

Tracking your hardware ... for data science

For a long time I thought that if you really wanted to track all your servers' metrics, you needed Zabbix (which is very complicated).

A friend recommended an amazing tool, the Prometheus Node Exporter:

- prometheus.io/docs/guides/node-exporter/

It installs and runs literally in minutes.
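
Once it is running, the metrics are just plain text over HTTP, so you can peek at them from Python without any extra tooling. A minimal sketch, assuming the exporter listens on its default port 9100 on localhost and does not append timestamps to samples; the function names (parse_metrics, fetch_metrics) are made up for illustration:

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines
        if not line or line.startswith("#"):
            continue
        # Assumption: no trailing timestamp, so the value is the last field
        series, _, value = line.rpartition(" ")
        try:
            samples[series.strip()] = float(value)
        except ValueError:
            continue  # ignore lines this simple parser cannot handle
    return samples


def fetch_metrics(url="http://localhost:9100/metrics", timeout=5):
    """Download and parse the metrics page (the URL is an assumed default)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

For example, fetch_metrics().get("node_load1") should give the 1-minute load average if the exporter is up. Labelled series keep their labels in the key, e.g. node_cpu_seconds_total{cpu="0",mode="idle"}.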

If you want it to auto-start properly, there are Ubuntu packages (a bit outdated) and systemd examples:

- github.com/prometheus/node_exporter/tree/master/examples/systemd

Nvidia also provides Dockerized metric exporters for GPUs:

- github.com/NVIDIA/gpu-monitoring-tools/tree/master/exporters/prometheus-dcgm

Prometheus also has extensive alerting features, but they are hard to get started with because there is no minimal example:

- prometheus.io/docs/alerting/overview/

- github.com/prometheus/docs/issues/581
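
To show the idea behind threshold alerts without the full alerting stack, here is a plain-Python stand-in. This is NOT Prometheus alerting or Alertmanager, just a sketch of the same logic; the rule format, the function name firing_alerts and the thresholds are all hypothetical:

```python
import operator

# Supported comparisons for the hypothetical rule format below
OPS = {">": operator.gt, "<": operator.lt}


def firing_alerts(samples, rules):
    """Return names of rules whose condition holds for the given samples.

    samples: {series name: float value}
    rules:   list of (alert name, series name, comparison, threshold)
    """
    firing = []
    for name, series, op, threshold in rules:
        value = samples.get(series)
        # A missing series does not fire here; real setups often alert on absence too
        if value is not None and OPS[op](value, threshold):
            firing.append(name)
    return firing


# Made-up example rules using real node exporter metric names
RULES = [
    ("HighLoad", "node_load1", ">", 8.0),
    ("LowMemory", "node_memory_MemAvailable_bytes", "<", 1e9),
]
```

Feed it a dict of current metric values and it returns the list of alerts that should fire; sending the notification (mail, Telegram, etc.) is left out.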

#linux
