// Copyright (C) 2019 Storj Labs, Inc.
// See LICENSE for copying information.
package gracefulexit
import (
	"time"

	"github.com/spacemonkeygo/monkit/v3"
	"github.com/zeebo/errs"
)
var (
	// Error is the default error class for graceful exit package.
	Error = errs.Class("gracefulexit")
	// ErrNodeNotFound is returned if a graceful exit entry for a node does not exist in database.
	ErrNodeNotFound = errs.Class("graceful exit node not found")
	// ErrAboveOptimalThreshold is returned if a segment has more pieces than required.
	ErrAboveOptimalThreshold = errs.Class("segment has more pieces than required")
	mon = monkit.Package()
)
// Config holds the graceful exit configuration for the satellite.
type Config struct {
	Enabled            bool `help:"whether or not graceful exit is enabled on the satellite side." default:"true"`
	TimeBased          bool `help:"whether graceful exit will be determined by a period of time, rather than by instructing nodes to transfer one piece at a time" default:"false"`
	NodeMinAgeInMonths int  `help:"minimum age for a node on the network in order to initiate graceful exit" default:"6" testDefault:"0"`

	// these items only apply when TimeBased=false:
	ChoreBatchSize int           `help:"size of the buffer used to batch inserts into the transfer queue." default:"500" testDefault:"10"`
	ChoreInterval  time.Duration `help:"how often to run the transfer queue chore." releaseDefault:"30s" devDefault:"10s" testDefault:"$TESTINTERVAL"`
	UseRangedLoop  bool          `help:"whether use GE observer with ranged loop." default:"true"`
	EndpointBatchSize            int `help:"size of the buffer used to batch transfer queue reads and sends to the storage node." default:"300" testDefault:"100"`
	MaxFailuresPerPiece          int `help:"maximum number of transfer failures per piece." default:"5"`
	OverallMaxFailuresPercentage int `help:"maximum percentage of transfer failures per node." default:"10"`
	MaxInactiveTimeFrame   time.Duration `help:"maximum inactive time frame of transfer activities per node." default:"168h" testDefault:"10s"`
	RecvTimeout            time.Duration `help:"the minimum duration for receiving a stream from a storage node before timing out" default:"2h" testDefault:"1m"`
	MaxOrderLimitSendCount int           `help:"maximum number of order limits a satellite sends to a node before marking piece transfer failed" default:"10" testDefault:"3"`
	AsOfSystemTimeInterval time.Duration `help:"interval for AS OF SYSTEM TIME clause (crdb specific) to read from db at a specific time in the past" default:"-10s" testDefault:"-1µs"`
	TransferQueueBatchSize int           `help:"batch size (crdb specific) for deleting and adding items to the transfer queue" default:"1000"`
	// these items only apply when TimeBased=true:
	GracefulExitDurationInDays int           `help:"number of days it takes to execute a passive graceful exit" default:"30" testDefault:"1"`
	OfflineCheckInterval       time.Duration `help:"how frequently to check uptime ratio of gracefully-exiting nodes" default:"30m" testDefault:"10s"`
	MinimumOnlineScore         float64       `help:"a gracefully exiting node will fail GE if it falls below this online score (compare AuditHistoryConfig.OfflineThreshold)" default:"0.8"`
}