Up to now, we have been implementing the DistinctIP preference with code
in two places:
1. On check-in, the last_net is determined by taking the /24 or /64
(in ResolveIPAndNetwork()) and we store it with the node record.
2. On node selection, a preference parameter defines whether to return
results that are distinct on last_net.
It can be observed that we have never yet had the need to switch from
DistinctIP to !DistinctIP, or from !DistinctIP to DistinctIP, on the
same satellite, and we will probably never need to do so in an automated
way. It can also be observed that this arrangement makes tests more
complicated, because we often have to arrange for test nodes to have IP
addresses in different /24 networks (a particular pain on macOS).
Those two considerations, plus some pending work on the repair framework
that will make repair take last_net into consideration, motivate this
change.
With this change, in the #2 place, we will _always_ return results that
are distinct on last_net. We implement the DistinctIP preference, then,
by making the #1 place (ResolveIPAndNetwork()) more flexible. When
DistinctIP is enabled, last_net will be calculated as it was before. But
when DistinctIP is _off_, last_net can be the same as address (IP and
port). That will effectively implement !DistinctIP because every
record will have a distinct last_net already.
As a side effect, this flexibility will allow us to change the rules
about last_net construction arbitrarily. We can do tests where last_net
is set to the source IP, or to a /30 prefix, or a /16 prefix, etc., and
be able to exercise the production logic without requiring a virtual
network bridge.
This change should be safe to make without any migration code, because
all known production satellite deployments use DistinctIP, and the
associated last_net values will not change for them. They will only
change for satellites with !DistinctIP, which are mostly test
deployments that can be recreated trivially. For those satellites which
are both permanent and !DistinctIP, node selection will suddenly start
acting as though DistinctIP is enabled, until the operator runs a single
SQL update "UPDATE nodes SET last_net = last_ip_port". That can be done
either before or after deploying software with this change.
I also assert that this will not hurt performance for production
deployments. It's true that adding the distinct requirement to node
selection makes things a little slower, but the distinct requirement is
already present for all production deployments, and they will see no
change.
Refs: https://github.com/storj/storj/issues/5391
Change-Id: I0e7e92498c3da768df5b4d5fb213dcd2d4862924
Testplanet tests will print into logs (WARN) if full table scan will
be detected. Test won't be failed automatically. That's because
currently we have multiple queries which are doing full table scan and it's not trivial to change.
https://github.com/storj/storj/issues/5471
Change-Id: Ia2fcbfb9102424d58f95e00071329454a8c1066e
Our DB support in storj/private was updated to enable basic context
support for executing SQL queries. This change requires some small
adjustments as not all parts were working correctly.
storj/private commit with change:
4bc77107b7acfcc2f7ad65796d5dd3d7c64801e4
Change-Id: I64d7ed92788ea0920d12cecd1aa0e414720e9b9c
Testplanet limits the execution of a single test case to 3 minutes.
This change adds a Timeout field to the Testplanet's config, so test
cases can configure their timeout. This is helpful when executing larger
3rd party test suite on top of Testplanet.
Change-Id: Ibbf7c5ffdc0a9e723e7e28b885eac084f04c6ca1
Rather than starting all servers on 127.0.0.1 start them
on a random local host to try avoid port exhaustion.
The port exhaustion is just a guess.
Change-Id: Ibf31d6a017852238d836291d703642b44ff66c0c
* Remove "enable-logging", because it ends up spawning consoles on
Windows.
* Remove "disable-gpu", if we have a GPU, then let's use it.
* Create custom client, so we can attach logging to CDP.
* Ignore potential context.Canceled.
* Fix onboarding wizard test for new objects browser.
* Return an error on a context cancellation.
* Wire all loggers to planet.
Change-Id: I67eb138ba31252f55ac5b383679d033bcf71f1b2
Testplanet does not have a multinode instance, hence,
makes it difficult to run ui tests for the multinode dashboard.
Instead of using storj-sim for multinode ui tests,
it's better to use the same approach (testplanet) fir all UI tests.
This changes adds a multinode instance to testplanet.
Change-Id: I58aa8c4597e789275f9e7ea7059703c742903492
This allows to set `STORJ_TEST_MONKIT` to either
`svg` or `json` to write individual testplanet test
traces to disk.
It also allows to specify an absolute directory:
STORJ_TEST_MONKIT=svg,dir:/abs/dir/path
It requires an absolute path, because from the context of
tests, there's no easy way to find the folder where tests
were called.
Change-Id: I6fe008a4d4237d221cf5a5bede798b46399ee197
Currently, reputation table is only populated when a node has been
audited. This is ok in production, however a lot of our tests doesn't
upload any data or trigger audits.
This PR adds an initialization step in testplanet to populate reputation
table with zero value for nodes reputation.
Change-Id: I11b381236669db346dc68a48a6d4a27334a0a8b8
Initially we duplicated the code to avoid large scale changes to
the packages. Now we are past metainfo refactor we can remove the
duplication.
Change-Id: I9d0b2756cc6e2a2f4d576afa408a15273a7e1cef
This PR removes all back-end related referral program code including the
marketing portal.
We will have a separate PR for front-end code and database migration to
drop `offers` and `usercredits` table
Change-Id: If59f952cddfe0558a7dc03a0eac7cc1081517f88
Peers require that Run finishes before calling Close.
Cancel only signals peer to start closing however it does not wait for
it to complete.
Change-Id: If4b3778f4fc86402363ed3b555db11e1189e6200
Currently Cockroach isn't performant for concurrent database setup and
tear-down. Instead of a single instance allow setting multiple potential
connection strings and let the tests pick one connection string
randomly.
This improves test duration by ~10 minutes.
While we are at significantly changing how pgtest works, introduce
helper PickPostgres and PickCockroach for selecting the database to
reduce code duplications in multiple places.
Change-Id: I8ad171d5c4c8a4fc081ec2ae9bdd0cc948a80619
Instead of providing the database from outside to testplanet create it
inside and then allow wrapping and modifying it. This is more convenient
to use.
Change-Id: I9b8f69e6e0a19ff984b4e2bfe927c9100c77bc6c
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).
Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.
This change also better differentiates some error cases from Repair()
for monitoring purposes.
Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
NonParallel running is needed for gateway tests, because minio
unfortunately relies on global state.
Change-Id: If730db2ab86d10f4d02e1ac3128f758e9c18cdff
Curently, storage nodes only report their capacity to satellites
once per hour. If a node fills up, it will fail all uploads until
the next contact cycle begins. With these changes, at the end of an
upload we check whether the MinimumDiskSpace threshold has been
passed. If so, trigger the monitor chore to update the node's
capacity, then trigger the contact chore to report the new
capacity to the satellites
Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0
Since we have caches on top of databases and they are included in the
databases list, we need to shut them down in-reverse order to avoid
issues with flushing to a closed database.
Change-Id: I3f23a527a2a5425638b1a7e2cab84741f019d493
Close a peer didn't guarantee that the peer ended its services and we
want that when a StopPeer method returns the peer service is actually
finished.
Change-Id: If97f41b7e404990555640c71e097ebc719678ae7