8 Commits

Author SHA1 Message Date
Márton Elek
20a3045e1a satellite/durability: use fixed number of pieces in integration test
The test was flaky because it asserted that we have 15 classes:

6 email (one for each used (!!!) node)
6 last_net (one for each used (!!!) node)
1 wallet
1 country ("HU")
1 empty value

But there was a very low chance that only 5 of the 6 nodes were used (RS.Success=5, RS.Total=6).

In that specific case, we had only 12 classes: we iterated over the used nodes only, so we didn't see all the emails (one node was not used).
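
The dependency on the number of used nodes can be reproduced with a minimal sketch. Everything here is hypothetical and not taken from the real test: the `node` type, the `countClasses` and `testNodes` helpers, the class string format, and the assumption that the "empty value" comes from a classifier returning an empty string.

```go
package main

import "fmt"

// node is a hypothetical stand-in for the node attributes used by the classifiers.
type node struct {
	email, lastNet, wallet, country string
}

// countClasses returns the number of distinct class strings seen across the used nodes.
func countClasses(used []node) int {
	seen := map[string]struct{}{}
	for _, n := range used {
		seen["email:"+n.email] = struct{}{}
		seen["last_net:"+n.lastNet] = struct{}{}
		seen["wallet:"+n.wallet] = struct{}{}
		seen["country:"+n.country] = struct{}{}
		seen[""] = struct{}{} // assumption: one classifier yields an empty value
	}
	return len(seen)
}

// testNodes builds count nodes with unique email and last_net, and a shared wallet and country.
func testNodes(count int) []node {
	nodes := make([]node, 0, count)
	for i := 0; i < count; i++ {
		nodes = append(nodes, node{
			email:   fmt.Sprintf("node%d@example.test", i),
			lastNet: fmt.Sprintf("10.0.%d.0", i),
			wallet:  "0xwallet",
			country: "HU",
		})
	}
	return nodes
}

func main() {
	all := testNodes(6)
	fmt.Println(countClasses(all))     // every node holds a piece
	fmt.Println(countClasses(all[:5])) // one node was not used
}
```

With all 6 nodes used the sketch sees 15 classes. With one node unused the count drops, which is why an assertion on a fixed number is only stable when the number of pieces is fixed.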

https://github.com/storj/storj/issues/6549

Change-Id: I66882d5fa9b0d5f5b2397ea856494037972d4b81
2023-11-29 11:04:20 +00:00
Márton Elek
0f4f1ddde8 satellite/durability: use single classifier per observer instance
The new bus_factor calculation doesn't make sense with different classes, as we have overlaps.

For example: it can detect a risk if we lose one country and one different subnet (with possible overlap).

It's better to calculate the stat and bus_factor per class (net, country, ...).

It also makes it easier to measure execution time per class.
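
The overlap problem can be illustrated with a small sketch. The `piece` type, the `countBy` helper and the sample data are hypothetical, not the observer's real types.

```go
package main

import "fmt"

// piece describes where one piece of a segment is stored (hypothetical fields).
type piece struct {
	country, net string
}

// countBy returns how many pieces fall into each class produced by classify.
func countBy(pieces []piece, classify func(piece) string) map[string]int {
	counts := map[string]int{}
	for _, p := range pieces {
		counts[classify(p)]++
	}
	return counts
}

func main() {
	pieces := []piece{
		{"HU", "10.0.1"}, {"HU", "10.0.1"}, {"HU", "10.0.2"}, {"DE", "10.0.3"},
	}

	// One classifier per calculation: each piece is counted exactly once.
	byCountry := countBy(pieces, func(p piece) string { return "country:" + p.country })
	byNet := countBy(pieces, func(p piece) string { return "net:" + p.net })

	// Mixing classes: pieces that are both in HU and in 10.0.1 are counted twice.
	mixed := byCountry["country:HU"] + byNet["net:10.0.1"]
	fmt.Println(mixed, "pieces counted as lost, but only", len(pieces), "exist")
}
```

Within a single classifier the classes partition the pieces, so sums over classes stay meaningful. Across classifiers they do not.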

Change-Id: I7d4d5f7cb811cd50c5831077b43e001908aab96b
2023-11-21 17:08:34 +00:00
Márton Elek
0fdacfed8f satellite/durability: observer must reset between executions
Change-Id: I8f5b951beba513b219c4bb5680658f5e8b54538d
2023-11-20 16:56:43 +00:00
Márton Elek
0ef3247d44 satellite/durability: make benchmark even quicker
To make sure that benchmark tests work, we run them with the `-short` flag, e.g.:

```
go test -short -run='^$' -bench=BenchmarkDurabilityProcess
```

The durability benchmark already supports this, but we can make it slightly faster by
using fewer segments and pieces during the `-short` run.
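
A minimal sketch of the idea, assuming the sizes are picked with `testing.Short()`. The `benchmarkSize` helper, the concrete numbers and the loop body are made up for illustration and are not the real benchmark.

```go
package main

import (
	"fmt"
	"testing"
)

// benchmarkSize returns the number of segments and the pieces per segment to generate.
// The concrete numbers are placeholders.
func benchmarkSize(short bool) (segments, pieces int) {
	if short {
		return 10, 10
	}
	return 10000, 80
}

// BenchmarkDurabilityProcess would normally live in a _test.go file.
func BenchmarkDurabilityProcess(b *testing.B) {
	segments, pieces := benchmarkSize(testing.Short())
	for i := 0; i < b.N; i++ {
		total := 0
		for s := 0; s < segments; s++ {
			for p := 0; p < pieces; p++ {
				total++ // placeholder for the real per-piece work
			}
		}
		_ = total
	}
}

func main() {
	s, p := benchmarkSize(true)
	fmt.Println("short:", s, "segments,", p, "pieces each")
	s, p = benchmarkSize(false)
	fmt.Println("full:", s, "segments,", p, "pieces each")
}
```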

Change-Id: I9547ca1e3cd0178eb395a7a388f2e7936a9862d7
2023-11-08 19:00:30 +00:00
Márton Elek
23c592adeb satellite/durability: use process level classID cache (instead of fork level)
A durability classifier is something like "net:1.3.4.1" or "country:HU".

To make the calculation faster we use arrays instead of maps, which means that we assign a unique index to each of these strings (classes).

As Egon suggested earlier, we can do this mapping only once (per process), not for each fork.

Not a big deal performance-wise, as we have a limited number of forks, which are initialized once per 5-10 hours, but the code is more readable and cleaner.
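
The string-to-index mapping could look roughly like the sketch below. `classIDCache`, its methods and `processCache` are hypothetical names, and the mutex is an assumption about how concurrent forks would share it.

```go
package main

import (
	"fmt"
	"sync"
)

// classIDCache assigns a stable, dense index to every class string.
type classIDCache struct {
	mu    sync.Mutex
	ids   map[string]int
	names []string
}

// ID returns the index for name, assigning the next free one on first use.
func (c *classIDCache) ID(name string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if id, ok := c.ids[name]; ok {
		return id
	}
	if c.ids == nil {
		c.ids = map[string]int{}
	}
	id := len(c.names)
	c.ids[name] = id
	c.names = append(c.names, name)
	return id
}

// Len returns the number of classes registered so far.
func (c *classIDCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.names)
}

// processCache is created once per process and shared by every fork.
var processCache = &classIDCache{}

func main() {
	hu := processCache.ID("country:HU")
	net := processCache.ID("net:1.3.4.1")

	// A fork counts pieces in a slice indexed by class ID instead of a map.
	counts := make([]int, processCache.Len())
	counts[hu]++
	counts[net]++
	counts[processCache.ID("country:HU")]++ // same string, same index
	fmt.Println(counts)
}
```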

Change-Id: Id081846b5d97dae8009aeeecbcc63cb713bed294
2023-11-08 15:22:48 +00:00
Márton Elek
015cb94909 satellite/durability: add exemplar and report time to the reported results
Exemplars are representative elements for the stat. For example if a stat min is `30`, we can save one example with that value.

More details about the concept are here: https://grafana.com/docs/grafana/latest/fundamentals/exemplars/

In our context, we save the segment + position whenever the min is updated, to make it easier to look up the segment in danger.
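
A minimal sketch of a min stat that keeps an exemplar. The `minStat` type and the `segment/position` string format are assumptions for illustration, not the reported format.

```go
package main

import "fmt"

// minStat tracks the minimum observed value and an exemplar describing where it was seen.
type minStat struct {
	set      bool
	min      int
	exemplar string
}

// Update records value; when it is a new minimum, the exemplar is replaced as well.
func (m *minStat) Update(value int, segment string, position uint64) {
	if m.set && value >= m.min {
		return
	}
	m.set = true
	m.min = value
	m.exemplar = fmt.Sprintf("%s/%d", segment, position)
}

func main() {
	var stat minStat
	stat.Update(42, "stream-a", 0)
	stat.Update(30, "stream-b", 7)
	stat.Update(35, "stream-c", 1)
	fmt.Println(stat.min, stat.exemplar) // prints "30 stream-b/7"
}
```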

Change-Id: I19be482f1ddc7f1711e722c7b17480366d2c8312
2023-11-06 13:22:28 +01:00
Márton Elek
5c49ba1d85 satellite/durability: ignore information from new nodes
To get better performance, we pre-load all nodealias/node information at the beginning of the segment loop.

It's possible that we receive a new node alias from the segment table that we are not fully aware of (yet).

The easiest solution is to just ignore it. New risks/threats can be detected by a new execution cycle.
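
The skip logic can be sketched as follows, assuming the pre-loaded node data is a slice indexed by alias. `nodeInfo` and `countByCountry` are hypothetical names.

```go
package main

import "fmt"

// nodeInfo is a hypothetical subset of the pre-loaded node data.
type nodeInfo struct {
	country string
}

// countByCountry counts pieces per country and skips aliases that are missing
// from the pre-loaded table. It also returns how many pieces were ignored.
func countByCountry(nodes []*nodeInfo, pieceAliases []int) (counts map[string]int, ignored int) {
	counts = map[string]int{}
	for _, alias := range pieceAliases {
		if alias < 0 || alias >= len(nodes) || nodes[alias] == nil {
			ignored++ // node appeared after the pre-load; a later cycle will see it
			continue
		}
		counts[nodes[alias].country]++
	}
	return counts, ignored
}

func main() {
	// Table loaded before the segment loop; the index is the node alias.
	nodes := []*nodeInfo{nil, {country: "HU"}, {country: "DE"}}
	counts, ignored := countByCountry(nodes, []int{1, 2, 1, 5})
	fmt.Println(counts, ignored)
}
```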

Change-Id: Ib54f7edc46eedbab6d13b4d651aaac1425994940
2023-10-19 16:38:39 +00:00
Márton Elek
db3578d9ba satellite: durability rangeloop observer for monitoring risks
Change-Id: I92805fcc6e7c1bbe0f42bbf849d22f9908fedadb
2023-10-12 16:32:30 +00:00