storj

Author	SHA1	Message	Date
Márton Elek	1ba314428f	go.mod: make bump-dependencies (uplink, common, ...) It requires more work at this time, as https://review.dev.storj.io/c/storj/private/+/11066 modified the way we configure the debug port. Change-Id: I4a7cc999e13fd3514064a515b21885bb4d39ff16	2023-11-29 16:55:41 +00:00
Márton Elek	0f4f1ddde8	satellite/durability: use single classifier per observer instance The new bus_factor calculation doesn't make sense with different classes, as we have overlaps. For example: it can detect a risk if we lose one country and one different subnet (with possible overlap). It's better to calculate the stat and bus_factor per class (net, country, ...). It also makes it easier to measure execution time per class. Change-Id: I7d4d5f7cb811cd50c5831077b43e001908aab96b	2023-11-21 17:08:34 +00:00
Márton Elek	0ef3247d44	satellite/durability: make benchmark even quicker To make sure that Benchmark tests are good, we run them with the -short flag, e.g.: ``` go test -short -run=BenchmarkDurabilityProcess ``` The durability benchmark already supports this, but we can make it slightly faster by using fewer segments and pieces during the `-short` run. Change-Id: I9547ca1e3cd0178eb395a7a388f2e7936a9862d7	2023-11-08 19:00:30 +00:00
Márton Elek	db3578d9ba	satellite: durability rangeloop observer for monitoring risks Change-Id: I92805fcc6e7c1bbe0f42bbf849d22f9908fedadb	2023-10-12 16:32:30 +00:00
paul cannon	72189330fd	satellite/gracefulexit: revamp graceful exit Currently, graceful exit is a complicated subsystem that keeps a queue of all pieces expected to be on a node, and asks the node to transfer those pieces to other nodes one by one. The complexity of the system has, unfortunately, led to numerous bugs and unexpected behaviors. We have decided to remove this entire subsystem and restructure graceful exit as follows: * Nodes will signal their intent to exit gracefully * The satellite will not send any new pieces to gracefully exiting nodes * Pieces on gracefully exiting nodes will be considered by the repair subsystem as "retrievable but unhealthy". They will be repaired off of the exiting node as needed. * After one month (with an appropriately high online score), the node will be considered exited, and held amounts for the node will be released. The repair worker will continue to fetch pieces from the node as long as the node stays online. * If, at the end of the month, a node's online score is below a certain threshold, its graceful exit will fail. Refs: https://github.com/storj/storj/issues/6042 Change-Id: I52d4e07a4198e9cb2adf5e6cee2cb64d6f9f426b	2023-09-27 08:40:01 +00:00
Márton Elek	98921f9faa	satellite/overlay: fix placement selection config parsing When we do `satellite run api --placement '...'`, the placement rules are not parsed well. The problem is based on `viper.AllSettings()`, and the main logic is something like this (from a new unit test): ``` r := ConfigurablePlacementRule{} err := r.Set(p) require.NoError(t, err) serialized := r.String() r2 := ConfigurablePlacementRule{} err = r2.Set(serialized) require.NoError(t, err) require.Equal(t, p, r2.String()) ``` `AllSettings()` evaluates the placement rules in `ConfigurablePlacementRules` and stores the string representation. The problem is that we don't have a proper `String()` implementation (it prints out the structs instead of the original definition). There are two main solutions for this problem: 1. We can fix the `String()`. When we parse a placement rule, the `String()` method should print out the original definition. 2. We can switch to using a pure string as the configuration parameter, and parse the rules only when required. I feel that 1 is error prone. We can do it (and in this patch I added a lot of `String()` implementations), but it's hard to be sure that our `String()` logic is in line with the parsing logic. Therefore I decided to make the configuration value of the placements a string (or a wrapper around a string). That's the main reason why this patch seems to be big, as I updated all the usages. But the main part is in the beginning of `placement.go` (configuration parsing is not a pflag.Value implementation any more, but a separate step) and in `filter.go` (a few more String implementations for filters). https://github.com/storj/storj/issues/6248 Change-Id: I47c762d3514342b76a2e85683b1c891502a0756a	2023-09-21 14:31:41 +00:00
Michal Niewrzal	1d62dc63f5	satellite/repair/repairer: fix NumHealthyInExcludedCountries calculation Currently, we have an issue where, while counting unhealthy pieces, we count twice a piece that is in an excluded country and is outside the segment placement. This can cause unnecessary repair. This change also takes another step towards moving RepairExcludedCountryCodes from the overlay config into the repair package. Change-Id: I3692f6e0ddb9982af925db42be23d644aec1963f	2023-07-10 12:01:19 +02:00
Márton Elek	97a89c3476	satellite: switch to use nodefilters instead of old placement.AllowedCountry placement.AllowedCountry is the old way to specify placement. With the new approach we can use a more generic (dynamic) method, which can check full node information instead of just the country code. 90% of this patch is just search and replace: * we need to use NodeFilters instead of placement.AllowedCountry * which means we need an initialized PlacementRules available everywhere * which means we need to configure the placement rules The remaining 10% is placement.go, where we introduced a new type of configuration (a lightweight expression language) to define any kind of placement without a code change. Change-Id: Ie644b0b1840871b0e6bbcf80c6b50a947503d7df	2023-07-07 16:55:45 +00:00
Michal Niewrzal	f2cd7b0928	satellite/overlay: refactor Reliable to be used with repair checker Currently we are using Reliable to get missing pieces for the repair checker. The issue is that the checker now looks at more things than just missing pieces (clumped and off-placement pieces), and using only the node ID is not enough. We have an issue where we are skipping offline nodes in the clumped and off-placement pieces check. Reliable was refactored to get data (e.g. country, lastNet) about all reliable nodes. The list is split into online and offline. This data will be cached for quick use by the repair checker. It will also be possible to check node metadata like country code or lastNet. We are also slowly moving the `RepairExcludedCountryCodes` config from overlay to repair, which makes more sense for it. This is the first part of the changes. https://github.com/storj/storj/issues/5998 Change-Id: If534342488c0e440affc2894a8fbda6507b8959d	2023-07-05 10:56:31 +02:00
Clement Sam	1166fdfbab	satellite/gc: add piece tracker ranged loop observer Resolves https://github.com/storj/storj/issues/5798 Change-Id: I6fe2c57b3a247b085026feb8bee60c2d002db71b	2023-06-22 18:17:39 +00:00
Michal Niewrzal	9b3488276d	satellite/gracefulexit: use node alias instead id with observer Using node alias helps using less cpu and memory. Fixes https://github.com/storj/storj/issues/5654 Change-Id: If3a5c7810732cbb1bff4dcb78706c81aca56b71b	2023-05-18 22:37:46 +00:00
Michal Niewrzal	2592aaef9c	satellite/gc/bloomfilter: remove segments loop parts We are switching completely to ranged loop. https://github.com/storj/storj/issues/5368 Change-Id: I1a22ac4b242998e287b2b7d8167b64e850b61a0f	2023-05-15 11:46:26 +00:00
Michal Niewrzal	36e046375c	satellite/repair/checker: remove segments loop parts We are switching completely to ranged loop. https://github.com/storj/storj/issues/5368 Change-Id: I8583549973cd36aa0e0c482c20d7a75cb7568ab3	2023-05-08 12:19:13 +00:00
Michal Niewrzal	6a55682bc6	satellite/accounting/nodetally: remove segments loop parts We are switching completely to ranged loop. https://github.com/storj/storj/issues/5368 Change-Id: I6176a129ba14cf83fb635048d09e6748276b52a1	2023-04-24 14:25:53 +00:00
Michal Niewrzal	fbfe5aaad7	satellite/metrics: remove code related to segments loop We are switching completely to ranged loop. Change-Id: I32120ef496addebec2de088fd10d0c1d02313c68	2023-04-20 13:47:22 +00:00
paul cannon	ae5947327b	satellite/accounting: Use metabase.AliasPiece with tally observer We want to eliminate usages of LoopSegmentEntry.Pieces, because it is costing a lot of cpu time to look up node IDs with every piece of every segment we read. In this change, we are eliminating use of LoopSegmentEntry.Pieces in the node tally observer (both the ranged loop and segments loop variants). It is not necessary to have a fully resolved nodeID until it is time to store totals in the database. We can use NodeAliases as the map key instead, and resolve NodeIDs just before storing totals. Refs: https://github.com/storj/storj/issues/5622 Change-Id: Iec12aa393072436d7c22cc5a4ae1b63966cbcc18	2023-03-29 12:24:05 +00:00
Márton Elek	ffaf15a3b0	satellite/overlay: remove unused mail service from overlay It was surprising that `satellite auditor` complained about SMTP mail settings, even though it's not supposed to send any mail. Looks like we can remove the mail service dependency, as it's not a hard requirement for overlay.Service. Change-Id: I29a52eeff3f967ddb2d74a09458dc0ee2f051bd7	2023-03-09 12:17:35 +00:00
Michal Niewrzal	67ad792d1a	satellite/rangedloop: migrate segments verification from segment loop The segments loop has a built-in sanity check to verify that the number of segments processed by the loop is roughly fine. We want to have the same verification for the ranged loop. https://github.com/storj/storj/issues/5544 Change-Id: Ia19edc0fb4aa8dc45993498a8e6a4eb5928485e9	2023-03-08 17:00:11 +00:00
Qweder93	d6a948f59d	satellite/repair : implemented ranged loop observer Implemented observer and partial, and created new structures so that the mon metrics remain the same as in the segment loop. Change-Id: I209c126096c84b94d4717332e56238266f6cd004	2023-01-23 14:23:03 +00:00
Yaroslav Vorobiov	5644fb1a7e	satellite/accounting/nodetally: add ranged loop Add node tally ranged loop observer and partial. Add node tally ranged observer to ranged loop peer. Add config flag to select which loop to use for node tally. Update satellite core to use segment/ranged loop based on a flag. Duplicate existing node tally test but using ranged loop. Change-Id: I6786f1a16933463fab5f79601bf438203a7a5f9e	2023-01-17 13:50:18 +01:00
Erik van Velzen	2d863759b0	satellite/metabase/rangedloop: add AsOfSystemTime Add option AsOfSystemTime to segment provider to make it equivalent to the old segment loop. There's no comment on what it does because it's pretty complex and makes no sense, but we can improve it later. closes https://github.com/storj/storj/issues/5434 Change-Id: I8f09b03803e681e2fd41008c5dba67804b0f37a1	2023-01-11 16:22:18 +00:00
Erik van Velzen	23b92da490	satellite/metabase/rangedloop: live reporting (#5366 ) Add an observer to monitor ranged segment loop progress. Tested by running the segment loop in storj-up and navigating to http://<container>:11111/mon/stats and there is the entry: rangedloop-live,scope=storj.io/storj/satellite/metabase/rangedloop numSegments=364523630000.000000 part of https://github.com/storj/storj/issues/5223 Change-Id: If3d2774d2f17f51eac86f47c6dda1fb8ad696dfe	2023-01-06 09:49:14 +01:00
Qweder93	8c69ee62fc	{cmd/storj-sim, satellite/rangedloop}: added rangedloop to storj-sim, removed identity Added a rangedloop in storj-sim for each satellite to verify that it works for the metrics observer, removed the identity from the rangedloop peer as we never use it, added logs on service run, added a loop to the service instead of an endless for loop, and added an interval value to the config. Closes: https://github.com/storj/storj/issues/5414 Change-Id: Ibc3b06071b68feda4a35b45da2bbe36e22a02fc8	2023-01-05 11:29:00 +00:00
Andrew Harding	5362dff94b	satellite/gc/bloomfilter: implement rangedloop observer https://github.com/storj/storj/issues/5235 Change-Id: Iffe8f682adfa46e48e47976bf838326e7125ff80	2023-01-03 09:46:02 -07:00
Andrew Harding	590d44301c	satellite/audit: implement rangedloop observer This change implements the ranged loop observer to replace the audit chore that builds the audit queue. The strategy employed by this change is to use a collector for each segment range to build separate per-node segment reservoirs that are then merged during the join step. In previous observer migrations, there were only a handful of tests so the strategy was to duplicate them. In this package, there are dozens of tests that utilize the chore. To reduce code churn and maintenance burden until the chore is removed, this change introduces a helper that runs tests under both the chore and observer, providing a pair of functions that can be used to pause or run the queueing function. https://github.com/storj/storj/issues/5232 Change-Id: I8bb4b4e55cf98b1aac9f26307e3a9a355cb3f506	2023-01-03 08:52:01 -07:00
Andrew Harding	4241e6bf5f	satellite/gracefulexit: implement rangedloop observer The tests are forked from the chore tests with slight adaptations for being run against the ranged loop. I also moved a benchmark for the database from chore_test.go to db_test.go. The pathcollector is reused as a rangedloop.Partial. https://github.com/storj/storj/issues/5234 Change-Id: I56182031d133812a9f4d4a433c01b9150af39f31	2022-12-22 10:47:10 -07:00
Andrew Harding	1cb2eb4c3b	satellite/rangedloop: wire up metrics observer Final touches on https://github.com/storj/storj/issues/5236. Change-Id: I2259ec4e7825d20db9efb36beb42d6309dee55ba	2022-12-12 19:06:23 +00:00
Andrew Harding	633ab8dcf6	satellite/metadabase/rangedloop: stream affinity for test provider Some observers assume that they will observe all the segments for a given stream, and that they will observe those segments in a sequential stream over one or more iterations. This change updates the range provider from rangedlooptest to provide these guarantees. The change also removes the Mock suffix from the provider/splitter types since the package name (rangedlooptest) implies that the type is a test double. Change-Id: I927c409807e305787abcde57427baac22f663eaa	2022-12-09 16:49:02 +00:00
Erik van Velzen	ff6d640fca	satellite/metabase/rangedloop: minimal loop (#5334 ) Minimal implementation of the ranged (=threaded) segment loop service, to improve performance over the existing loop. Has tests with an in-memory segment database and an example observer. Does not have yet: database link, observer duration tracking, suspicious processed ratio guard, rate limiting, minimum execution interval per observer, etc. Part of https://github.com/storj/storj/issues/5223 Change-Id: I08ffb392c3539e380f4e7b4f1afd56c4c394668d	2022-12-08 15:27:21 +01:00
Erik van Velzen	b574ee5e6d	satellite/metabase/rangedloop: service skeleton Create skeleton for multi-threaded segment loop, peer, cmd command for rangedloop. Change-Id: I52c78a313f15070d43207c52ea94e53169821654	2022-11-22 15:21:41 +02:00

30 Commits
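
Commit 98921f9faa describes keeping the placement configuration as a plain string and parsing it only when required, so that `Set` followed by `String` always round-trips. A minimal Go sketch of that idea follows. The `PlacementConfig` type and the toy `id:expr;id:expr` syntax are assumptions for illustration only and do not reflect the actual placement expression language.

```go
package main

import (
	"fmt"
	"strings"
)

// PlacementConfig is a hypothetical string-backed configuration value.
// It stores the original definition untouched, so String() returns exactly
// what Set() received.
type PlacementConfig struct {
	raw string
}

// Set stores the definition without parsing it.
func (c *PlacementConfig) Set(s string) error {
	c.raw = s
	return nil
}

// String returns the original definition.
func (c *PlacementConfig) String() string { return c.raw }

// Parse is a separate step that is run only when the rules are needed.
// The "id:expr;id:expr" syntax is a toy format assumed for this sketch.
func (c *PlacementConfig) Parse() (map[string]string, error) {
	rules := map[string]string{}
	if c.raw == "" {
		return rules, nil
	}
	for _, def := range strings.Split(c.raw, ";") {
		id, expr, ok := strings.Cut(def, ":")
		if !ok {
			return nil, fmt.Errorf("invalid placement definition %q", def)
		}
		rules[strings.TrimSpace(id)] = strings.TrimSpace(expr)
	}
	return rules, nil
}

func main() {
	p := "1:eu;2:us"

	var a PlacementConfig
	if err := a.Set(p); err != nil {
		panic(err)
	}

	// Re-setting from the serialized form must give back the same definition.
	var b PlacementConfig
	if err := b.Set(a.String()); err != nil {
		panic(err)
	}
	fmt.Println(b.String() == p) // prints true

	rules, err := b.Parse()
	if err != nil {
		panic(err)
	}
	fmt.Println(rules["2"]) // prints us
}
```

Because the stored value is never re-rendered from parsed structs, the round trip cannot drift from the parser, which is the reason the commit gives for preferring this option over maintaining matching `String()` implementations.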