Commit Graph

4 Commits

Author SHA1 Message Date
Moby von Briesen
c4a9a5d48b satellite/downtime: update detection and estimation downtime chores for
more trustworthy downtime tracking

Detection chore: Do not update downtime at all from the detection chore.
We only want to include downtime between two explicitly failed ping attempts
(the duration between last contact success and the first failed ping is no longer
included in downtime calculation)

Estimation chore: If the satellite started after the last failed ping for a node,
do not include offline time since the last failed ping time - only
estimate based on two failed pings with no satellite downtime in
between.
This protects us from including satellite downtime in our storagenode downtime calculations.

Change-Id: I1fddc9f7255a7023e02474255d70c64faae75b8a
2020-02-10 22:37:01 +00:00
Egon Elbre
c1c878efcf all: fix import groupings
check-imports was broken and didn't complain about things.

Change-Id: I38adafd16b4aba86f0eb4f53427b4393f9a6c710
2020-01-20 17:47:44 +00:00
Egon Elbre
f3b4bf2b7c satellite/satellitedb/satellitedbtest: pass ctx as an argument
ctx is created in most tests, instead pass in as argument
to reduce code duplication.

Change-Id: I466c51c008392001129c8b007c9d6b3619935ac4
2020-01-20 16:35:42 +02:00
Natalie Ventura Villasana
6b1829f3c3
satellite/downtime: new chore estimates downtime
Adds EstimationChore to the downtime package, which is an
independent chore that finds offline nodes given a configurable
limit, then uptime checks those nodes, and sets a last contact
success or failure given a response. For failed nodes, the chore
updates the amount of downtime the node has been offline in the
DowntimeTracking table.

Design doc section: https://github.com/storj/storj/blob/master/docs/blueprints/storage-node-downtime-tracking.md#estimating-offline-time
Jira: https://storjlabs.atlassian.net/browse/V3-2545

Change-Id: I60af95803930bf9b33232b248bb20cca6f0e0b5f
2020-01-09 15:05:13 -05:00