Apply “Sensitivity” to determine the global Up status

We encountered a case where the global status showed Down because one tagged location was failing. The failure was due to a probe-side issue, with the message: “the test server queue is currently full, please try again using a different test server.”
Even though our Sensitivity (“How many locations should be down before an alert is sent”) was set to 2, the global status stayed Down and did not reflect recovery while the other locations were healthy.

We propose applying the Sensitivity concept, or a similar mechanism, to the global Up determination as well. In our case, with Sensitivity = 2 and 3 locations, if at least 2 locations report Up, the global status should be Up; only if fewer than 2 are Up should the check be considered Down.
We are not trying to precisely determine our environment’s state from a single Uptime location check; rather, we want to avoid false Down states caused by individual probe issues. This approach would also prevent incorrect alerts triggered by probe-side anomalies, such as probe queue saturation.
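
To make the proposed rule concrete, here is a minimal sketch in Python. It is only an illustration of the request, not how the product computes status today. The function name `global_status`, the location names, and the input shape are hypothetical. It also assumes the rule is expressed in terms of the Down count (global status is Down only once at least Sensitivity locations are Down), which matches the example above for 3 locations and Sensitivity = 2.

```python
from typing import Mapping


def global_status(location_up: Mapping[str, bool], sensitivity: int) -> str:
    """Return "Up" or "Down" for a check, given per-location results.

    Hypothetical helper: `location_up` maps a location name to True (Up)
    or False (Down). Assumed rule: the check is globally Down only when
    the number of Down locations reaches `sensitivity`.
    """
    if sensitivity < 1:
        raise ValueError("sensitivity must be at least 1")

    # Count the locations that currently report Down.
    down_count = sum(1 for is_up in location_up.values() if not is_up)

    return "Down" if down_count >= sensitivity else "Up"


# One location fails because of a probe-side issue; the other two are healthy.
one_probe_failing = {"location-a": True, "location-b": True, "location-c": False}
print(global_status(one_probe_failing, sensitivity=2))  # Up

# Two of three locations fail, so the threshold is reached.
two_failing = {"location-a": False, "location-b": True, "location-c": False}
print(global_status(two_failing, sensitivity=2))  # Down
```

With this rule, a single failing probe no longer flips the global status, while a failure seen from two or more locations still does.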

Status: Product Review
Board: 💡 Feature Request
Tags: Checks
Date: 6 months ago