codema.in

Running a Grafana instance with Loki and Prometheus for monitoring

Manav Sethi

Recently I have seen some people complaining about downtime in our Matrix group. For a community-maintained service, IMO that doesn't look good for the community. So I propose that we set up Grafana with Loki and Prometheus for monitoring. That way we can monitor things regularly and know right away if something goes down.

I am willing to be one of the two volunteers who maintain it. We also need another maintainer.

Poll Created Wed 3 Mar 2021 4:38PM

Setup for Grafana with Loki and Prometheus Closed Wed 10 Mar 2021 5:00PM

Results

Option      % of points   Voters
Agree       100.0%        2
Abstain     0.0%          0
Disagree    0.0%          0
Block       0.0%          0
Undecided   0%            37

2 of 39 people have participated (5%)

Poll Created Wed 3 Mar 2021 4:39PM

Are you willing to help maintain it? Closed Wed 10 Mar 2021 5:00PM

Results

Option      Voters
Yes         1 (MS)
No          0
Undecided   38

1 of 39 people have participated (2%)

Dhanesh Sun 7 Mar 2021 7:15AM

I have some questions as to how this will be implemented:

  1. Where do we plan to host the Grafana instance?

  2. Who should have access to it and how will the access be managed?

  3. What is the contingency plan when the Grafana instance itself goes down?

  4. Our logs are stored on encrypted partitions which have to be specifically mounted if a server restart happens. How do we handle this situation with the Grafana instance unable to access those logs?

  5. How many resources will this setup use?

Manav Sethi Sun 7 Mar 2021 7:21AM

  1. Grafana itself isn't very resource intensive, so it can run on the same machine.

  2. Access levels can be decided by the community. We could even open it up for everyone to view but not edit, for example: https://dashboards.gitlab.com/d/general-public-splashscreen/general-gitlab-dashboards?orgId=1&from=1614774492340&to=1614786244041

  3. We would have to monitor it regularly so that it doesn't go down. I'm not sure what else we could do.

  4. In case of a server restart, we would have to mount the partitions manually. After that, Fluent Bit or Promtail (i.e. the service that sends logs to Loki) can be restarted.

  5. This setup will require very minimal resources.
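To make point 4 a bit more concrete, here is a rough sketch of a helper that restarts the log shipper only once the encrypted partition is actually mounted. It is only an illustration: the mount point `/var/log/encrypted` and the systemd unit name `promtail` are assumptions, not our real setup, and the function name is made up for this example.

```python
import os
import subprocess


def restart_shipper_if_mounted(mount_point, unit,
                               is_mounted=os.path.ismount,
                               run=subprocess.run):
    """Restart the log-shipping service only if the log partition is mounted.

    Returns True if a restart was issued, False if the partition is not
    mounted yet (in which case nothing is restarted).
    """
    if not is_mounted(mount_point):
        # Partition still locked/unmounted: restarting now would only make
        # the shipper start without any logs to read.
        return False
    # Assumes the shipper runs as a systemd unit; raises if systemctl fails.
    run(["systemctl", "restart", unit], check=True)
    return True


# Example call with hypothetical values (left commented out on purpose):
# restart_shipper_if_mounted("/var/log/encrypted", "promtail")
```

Whoever mounts the partitions after a reboot could run this as the last step, so the restart is not forgotten and never happens before the logs are readable.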

Pirate Praveen Sun 7 Mar 2021 12:09PM

I don't think running the monitoring on the same machine is very useful. How will it alert when the server itself goes down?
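One possible way around this is a small probe that runs from a different machine (for example via cron) and checks the health endpoints over HTTP. The sketch below is only an assumption about how that could look: the host name `monitored.example.org` is a placeholder, the ports are the upstream defaults, and what happens on failure (mail, Matrix message, etc.) is left open. The paths `/api/health` (Grafana), `/-/healthy` (Prometheus) and `/ready` (Loki) are the health endpoints those projects document.

```python
import http.client
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if the URL answers with an HTTP 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, DNS failures, timeouts and
        # HTTP error statuses (4xx/5xx raise HTTPError, a URLError subclass).
        return False


def down_targets(targets, timeout=5.0):
    """Return the names of all targets that failed the health check."""
    return [name for name, url in targets.items() if not is_up(url, timeout)]


# Placeholder host; replace with the real server before using this.
EXAMPLE_TARGETS = {
    "grafana": "http://monitored.example.org:3000/api/health",
    "prometheus": "http://monitored.example.org:9090/-/healthy",
    "loki": "http://monitored.example.org:3100/ready",
}
```

Calling `down_targets(EXAMPLE_TARGETS)` from the second machine returns the services that did not respond. If the whole server is down, all three names come back, which is exactly the case a monitor on the same machine cannot report.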

Manav Sethi Fri 12 Mar 2021 3:01AM

@Pirate Praveen any other ideas on where we should run Grafana?