Useful references
- Heartbeat principle (dead man's switch), suited to small infrastructures: https://jpweber.io/blog/taking-advantage-of-deadmans-switch-in-prometheus/
- List of common alerts: https://awesome-prometheus-alerts.grep.to/rules.html
- Long-term storage worth keeping an eye on: https://thanos.io/
- Related tools: https://openapm.io/
- Example queries:
  - https://github.com/infinityworks/prometheus-example-queries
  - https://coralogix.com/blog/promql-tutorial-5-tricks-to-become-a-prometheus-god/
  - https://timber.io/blog/promql-for-humans/
  - https://archive.fosdem.org/2017/schedule/event/alerting_with_time_series/attachments/slides/1736/export/events/attachments/alerting_with_time_series/slides/1736/FOSDEM__Alerting_with_Time_Series.pdf
  - https://towardsdatascience.com/practical-monitoring-with-prometheus-grafana-part-iii-81f019ecee19
  - https://about.gitlab.com/blog/2019/07/23/anomaly-detection-using-prometheus/
  - https://rancher.com/docs/rancher/v2.x/en/monitoring-alerting/v2.0.x-v2.4.x/cluster-monitoring/expression/