Add a flag that allows us to easily switch disqualification from
suspension mode on or off. A node will only be disqualified from
suspension mode if it has been suspended for longer than the grace
period AND the SuspensionDQEnabled flag is true.
Change-Id: I9e67caa727183cd52ab2042b0a370a1bcaebe792
This test was the last place using it. Replace it with a direct call so
we can remove the method from uplink piecestore.
Change-Id: I62e13028663a7e67aa2495f90ecc02d0d8657fbd
Currently uploads can cause a lot of IOPS, reduce this by introducing a
in-memory buffer on-top of the file.
Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
it was being used in ways that implied it should be NOT NULL
even though it was possibly null. we used to get this data
from the satellite db's added_at column as seen in 30369b02,
so backfill it using that data where joined_at is NULL, and
then alter the table to constrain the column to be NOT NULL.
Fixes#3866.
Change-Id: If2d856189209740d985f71dada7b93525e625ef3
According to the docs at https://www.sqlite.org/lang_altertable.html
doing the steps
1. Rename old table
2. Create new table
3. Copy data
4. Drop old table
is incorrect and should be
1. Create new table
2. Copy data
3. Drop old table
4. Rename new into old
Additionally, each step was being run in a different transaction,
which could cause permanent failures if a problem happened during
the migration.
Avoid both of those problems by changing up some previous migrations
that ran in this way. Since they are semantically identical, it's
fine to change up these old migrations. It will help make newer
nodes coming up for the first time more robust.
Change-Id: I43fb004fa1b6cb2fe2554f9920925420da28fb4a
This catches the bug that hit prod where we missed that a migration
was necessary for a column, but only on uploads, and only for nodes
that had not checked in yet.
So, this does more uploads, and stops a node from starting so that
we trigger that specific scenario. Hopefully it will catch other
similar ones in the future.
I confirmed that this locally caught the bug when the release under
test was v0.34.10 and HEAD did not include the fix for it.
Change-Id: If7d41e8241d6a042fa524b4aff956b0264ecb128
TestVerifierSlowDownload would sometimes not have enough nodes finish in
the allotted deadline period. This increases the deadline and also does
not assert that exactly 3 have finished. Instead, in keeping with the
purpose of the test, it asserts that the slow download is never counted
as a success and is always counted as a pending audit in the final
report.
Change-Id: I180734fcc4a499420c75164bad6253ed155d87de
CreateTables hasn't been quite true for a while now, rename to
MigrateToLatest to be clearer in it's behavior.
Change-Id: Ida48e95122a5d9b7a814e922d3698e00024a2ba7
The UpdateAddress method use to be used when storage node's checked in with the Satellite, but once the contact service was created this method was no longer used. This PR finally removes it.
Change-Id: Ib3f83c8003269671d97d54f21ee69665fa663f24
time.Now() in a short amount of time can return the exact same value.
Ensure that the test uses times that are distinct.
Change-Id: Ia653ce0af4bfcf7b5da133a9cf98b823033d9592
Peers require that Run finishes before calling Close.
Cancel only signals peer to start closing however it does not wait for
it to complete.
Change-Id: If4b3778f4fc86402363ed3b555db11e1189e6200
Before the deleter would close its done channel once, so if additional
tests shared a storagenode, even if not in parallel, the later waits
would not work properly. This fixes that problem.
Change-Id: I7dcacf6699cef7c2c2948ba0f4369ef520601bf5
Currently this test was the last place that was using
piecestore.Client.DeletePieces. This way we can remove it from uplink to
reduce the code.
Change-Id: I72fda8888d05181f95eeb544d067c031ec3e36a0
CustomerRepositoryList relied on created at time for the output sorting,
however the granularity of the timer may cause multiple customers
having the same "created at" time.
Add a sleep so that every customer does get a unique time.
Change-Id: If2923174f304fa5f41260d500f6139e3fa7c3ba5
Delay of 100ms could happen due to other things happening on the test
server. Increase the time to 1s.
Change-Id: I2c7c21f966101771633d73e84cf9850d28089e71
Sometimes nodes who have gracefully exited will still be holding pieces
according to the satellite. This has some unintended side effects
currently, such as nodes getting disqualified after having successfully
exited.
* When the audit reporter attempts to update node stats, do not update
stats (alpha, beta, suspension, disqualification) if the node has
finished graceful exit (audit/reporter_test.go TestGracefullyExitedNotUpdated)
* Treat gracefully exited nodes as "not reputable" so that the repairer
and checker do not count them as healthy (overlay/statdb_test.go
TestKnownUnreliableOrOffline, repair/repair_test.go
TestRepairGracefullyExited)
Change-Id: I1920d60dd35de5b2385a9b06989397628a2f1272
When running testplanet tests, mark storagenode peer PieceDeleter as in
testing mode so that you don't have to do it on each test.
Change-Id: I2592e02c63f8bcc9152ecf436bac4e798b08bccf
when tracing is enabled, we should also set sampling rate to
a non-zero value. For now, we will set it to 1.
Uplink CLI users should be able to override it with the sample
flag.
Change-Id: I8bcf514fb14c2a1c4349b7957dd24ec23e4a85e5