this is a very old tool built in the very early days
of v3, when we didn't know how the network would be
used. this tool anticipated being able to query remote
nodes for internal state. we don't do that. i don't
think anyone uses this.
Change-Id: Ie1ded3ecbedb09313f2d6fc721039e0f15e4ee85
The query for GetNodesNetworkInOrder is causing far too much load on the
database. Since it is not critical that the repair checker have
perfectly up-to-date node network information, we can use a cache
instead.
Change-Id: I07ad45bfdeb46529da093941a06c2da8a00ce878
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.
Fixes https://github.com/storj/storj/issues/5754
Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
This flag was in general one time switch to enable versions internally.
New we can remove it as it makes code more complex.
Change-Id: I740b6e8fae80d5fac51d9425793b02678357490e
Only API peer needs access to order DB (and rollups cache) because it's
only place where we are creating orders for PUT and GET operations. For
other peers like auditor and repairer we can set noop implementation to
reduce number of dependencies needed for them.
Change-Id: Ic32d1879f0b97ffc4516f401898e31e95ae892e4
Up to now, we have been implementing the DistinctIP preference with code
in two places:
1. On check-in, the last_net is determined by taking the /24 or /64
(in ResolveIPAndNetwork()) and we store it with the node record.
2. On node selection, a preference parameter defines whether to return
results that are distinct on last_net.
It can be observed that we have never yet had the need to switch from
DistinctIP to !DistinctIP, or from !DistinctIP to DistinctIP, on the
same satellite, and we will probably never need to do so in an automated
way. It can also be observed that this arrangement makes tests more
complicated, because we often have to arrange for test nodes to have IP
addresses in different /24 networks (a particular pain on macOS).
Those two considerations, plus some pending work on the repair framework
that will make repair take last_net into consideration, motivate this
change.
With this change, in the #2 place, we will _always_ return results that
are distinct on last_net. We implement the DistinctIP preference, then,
by making the #1 place (ResolveIPAndNetwork()) more flexible. When
DistinctIP is enabled, last_net will be calculated as it was before. But
when DistinctIP is _off_, last_net can be the same as address (IP and
port). That will effectively implement !DistinctIP because every
record will have a distinct last_net already.
As a side effect, this flexibility will allow us to change the rules
about last_net construction arbitrarily. We can do tests where last_net
is set to the source IP, or to a /30 prefix, or a /16 prefix, etc., and
be able to exercise the production logic without requiring a virtual
network bridge.
This change should be safe to make without any migration code, because
all known production satellite deployments use DistinctIP, and the
associated last_net values will not change for them. They will only
change for satellites with !DistinctIP, which are mostly test
deployments that can be recreated trivially. For those satellites which
are both permanent and !DistinctIP, node selection will suddenly start
acting as though DistinctIP is enabled, until the operator runs a single
SQL update "UPDATE nodes SET last_net = last_ip_port". That can be done
either before or after deploying software with this change.
I also assert that this will not hurt performance for production
deployments. It's true that adding the distinct requirement to node
selection makes things a little slower, but the distinct requirement is
already present for all production deployments, and they will see no
change.
Refs: https://github.com/storj/storj/issues/5391
Change-Id: I0e7e92498c3da768df5b4d5fb213dcd2d4862924
The tests were using global variables for keeping the mock state, which
was indexed by the satellite ID. However, the satellite ID-s are
deterministic and it's possible for two tests end up using the same
mocks.
Instead make the mock creation not depend on the satellite ID and
instead require it being configured via paymentsconfig.
This fixes TestAutoFreezeChore failure.
Change-Id: I531d3550a934fbb36cff2973be96fd43b7edc44a
Peer for generating bloom filters will be able to use ranged loop.
As an addition some cleanup were made:
* remove unused parts of GC BF peer (identity, version control)
* added missing Close method for ranged loop service
* some additional tests added
https://github.com/storj/storj/issues/5545
Change-Id: I9a3d85f5fffd2ebc7f2bf7ed024220117ab2be29
this change uses the new storj/common noise helpers, which:
* add a security fix (require an expected node id for validating
noise key attestations)
* stops doing an unnecessary order signature validation (it's
already been done inside of PutPiece)
* removes some duplicate code
Change-Id: I5e67a08ff216cd9c5b0b82e40b4d9de664b6b0fc
We will be needing an infrequent chore to check which nodes are in the
reverify queue and synchronize that set with the 'contained' field in
the nodes db, since it is easily possible for them to get out of sync.
(We can't require that the reverification queue table be in the same
database as the nodes table, so maintaining consistency with SQL
transactions is out. Plus, even if they were in the same database, using
such SQL transactions to maintain consistency would be slow and
unwieldy.)
This commit adds the actual chore.
Refs: https://github.com/storj/storj/issues/5431
Change-Id: Id78b40bf69fae1ac39010e3b553315db8a1472bd
Previously we were exposing the testing facilities via interface casting
the necessary parts, however, when things are not part of the main
satellite.DB interface they need to be manually propagated. Rather than
relying on using hidden methods lets expose things as long as they don't
create a direct dependency to the database driver.
Change-Id: I2eb7d8b60f4b64de1320c2d32581f7be267c0f57
Testplanet tests will print into logs (WARN) if full table scan will
be detected. Test won't be failed automatically. That's because
currently we have multiple queries which are doing full table scan and it's not trivial to change.
https://github.com/storj/storj/issues/5471
Change-Id: Ia2fcbfb9102424d58f95e00071329454a8c1066e
Satellite DB tests will print into logs (WARN) if full table scan will
be detected. Test won't be failed automatically. That's because currently
we have multiple queries which are doing full table scan and it's not
trivial to change.
We may change that behavior when we will figure out how to skip
specific query from detection or we will fix all problematic queries.
https://github.com/storj/storj/issues/5471
Change-Id: Icafe782257a0d353e8bcdf6fa8a19c20b1091a0b
Our DB support in storj/private was updated to enable basic context
support for executing SQL queries. This change requires some small
adjustments as not all parts were working correctly.
storj/private commit with change:
4bc77107b7acfcc2f7ad65796d5dd3d7c64801e4
Change-Id: I64d7ed92788ea0920d12cecd1aa0e414720e9b9c
added in storj-sim rangedloop for each satellite, to verify it works for metrics oveserver,
removed identity from rangedloop peer as we never use it, added logs on service run, added loop
to service instead of endless for loop, interval value to config
Closes: https://github.com/storj/storj/issues/5414
Change-Id: Ibc3b06071b68feda4a35b45da2bbe36e22a02fc8
This change stubs userinfo endpoint from storj/common/pb/userinfo.proto.
It also adds config for allowed peers, and a method for verifying peers.
Issue: https://github.com/storj/storj/issues/5358
Change-Id: I057a0e873a9e9b3b9ad0bba69305f0d708bd9b9e
Adds new method Exists which can be used to verify which
requested piece ids exists on storage node. Will verify only pieces
which belongs to the satellite that used that endpoint.
Minum WASM size was increased a bit.
https://github.com/storj/storj/issues/5415
Change-Id: Ia5f9cadeb526541b2776a8973eb7d50133ad8636
This change creates a new independent process, the 'auditor', comparable
to the repairer, gc, and api processes. This will allow auditors to be
scaled independently of the core.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I8a29eeb0a6e35753dfa0eab5c1246048065d1e91
Now that all the reverification changes have been made and the old code
is out of the way, this commit renames the new things back to the old
names. Mostly, this involves renaming "newContainment" to "containment"
or "NewContainment" to "Containment", but there are a few other renames
that have been promised and are carried out here.
Refs: https://github.com/storj/storj/issues/5230
Change-Id: I34e2b857ea338acbb8421cdac18b17f2974f233c
Now that we are doing scalable piecewise reverifications, the code for
handling the old way of doing things (containment, pending audits,
reporting, testing) can now be removed.
Refs: https://github.com/storj/storj/issues/5230
Change-Id: Ief1a75f423eff682e8f3d57804e343b3409a6631
Here we add a worker class comparable to audit.Worker, which will be
responsible for pulling items off of the reverification queue and
calling reverifier.ReverifyPiece on them.
Note that piecewise reverification audits (which this will control) are
not yet being done. That is, nothing is being added to the
reverification queue at this point.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I94e28830e27caa49f2c8bd4a2336533e187ab69c
NewContainment will replace Containment later in this commit chain, but
for now it is not yet being used.
NewContainment will allow a node to be contained for multiple pending
reverify jobs at a time. It is implemented by way of the reverify queue.
Refs: https://github.com/storj/storj/issues/5231
Change-Id: I126eda0b3dfc4710a88fe4a5f41780618ec19101
Adding a new worker comparable to Verifier, called Reverifier; as the
name suggests, it will be used for reverifications, whereas Verifier
will be used for verifications.
This allows distinct logging from the two classes, plus we can add some
configuration that is specific to the Reverifier.
There is a slight modification to GetNextJob that goes along with this.
This should have no impact on operational concerns.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: Ie60d2d833bc5db8660bb463dd93c764bb40fc49c
This change turns off fsync on the postgres container used for tests. This
reduces migration time significantly when initializing new satellite
databases.
The change also includes a new benchmark for satellite initialization in
testplanet.
$ benchstat old.txt new.txt name old time/op new time/op delta
Run_Satellite/Postgres-16 1.36s ± 0% 0.08s ± 0% ~ (p=1.000 n=1+1)
Change-Id: Ic954767133864770cf652b0dfdcd6b109a167b5f
Running all of the migrations necessary to initialize a storage node
database takes a significant amount of time during runs.
The package current supports initializing a database from manually coalesced
migration data (i.e. snapshot) which improves the situation somewhat.
This change takes things a bit further by changing the snapshot code to
instead hydrate the database directory from a pre-generated snapshot zip
file.
name old time/op new time/op delta
Run_StorageNodeCount_4/Postgres-16 2.50s ± 0% 0.16s ± 0% ~ (p=1.000 n=1+1)
Change-Id: I213bbba5f9199497fbe8ce889b627e853f8b29a0
As part of the effort of splitting out the auditor workers to their own
process, we are transitioning the communication between the auditor
chore and the verification workers to a queue implemented in the
database, rather than the sequence of in-memory queues we used to use.
This logical database is safely partitionable from the rest of
satelliteDB.
Refs: https://github.com/storj/storj/issues/5251
Change-Id: I6cd31ac5265423271fbafe6127a86172c5cb53dc
Add a new chore to periodically insert nodes who are offline and
have not gotten an offline email in a certain amount of time into node
events
Change-Id: I658b385bb777b0240c98092946a93d65bee94abc
Create NodeEvents Chore on satellite core to read nodeevents DB and
notify node operators on node events. The chore sends notifications
grouped by email and event type: it selects the oldest entry in
nodeevents.DB and also any other event with the same email and event
type no matter how old it is. The oldest entry of a group must exist for
a minimum amount of time before that group can be selected, however.
This minimum amount of time is a configurable value:
--node-events.selection-wait-period. This wait period allows us to
combine events of the same time and same email address into a singular
email.
Change-Id: I8b444aa324d2dae265cc27d9e9e85faef79195d8
Add nodeevents.DB to satellite overlay service so we can insert node
events into the nodeevents DB.
Change-Id: I642c0ccc9941ecdb08cb22d5c8cf701959a55156
Currently, its not strightforward how to benchmark with testplanet.
This change add Bench method to make it easy.
Change-Id: I212dfe71a18bb6ddd7a127e5b6d313b1b0c1f824
We will introduce new logic for creating new objects (BeginObject).
Instead of using single version internally (1) we will be selecting first
available version during object creation. Because we need to be sure
that everything is wired up correctly we need a feature flag to be
able to control if new feature is enabled.
Change-Id: If0f8496397130811f43bf9db9fdcc2b30cd2e4ca
Implement a new service to read retain filter from a bucket and
send them out to storagenodes.
This allows the retain filters to be generated by a separate command on
a backup of the database.
Paralellism (setting ConcurrentSends) and end-to-end garbage collection
tests will be restored in a subsequent commit.
Solves https://github.com/storj/team-metainfo/issues/121
Change-Id: Iaf8a33fbf6987676cc3cf74a18a8078916fe673d