We avoid putting more than one piece of a segment on the same /24
network (or /64 for ipv6). However, it is possible for multiple pieces
of the same segment to move to the same network over time. Nodes can
change addresses, or segments could be uploaded with dev settings, etc.
We will call such pieces "clumped", as they are clumped into the same
net, and are much more likely to be lost or preserved together.
This change teaches the repair checker to recognize segments which have
clumped pieces, and put them in the repair queue. It also teaches the
repair worker to repair such segments (treating clumped pieces as
"retrievable but unhealthy"; i.e., they will be replaced on new nodes if
possible).
Refs: https://github.com/storj/storj/issues/5391
Change-Id: Iaa9e339fee8f80f4ad39895438e9f18606338908
This code is essentially replacement for eestream.CalcPieceSize. To call
eestream.CalcPieceSize we need eestream.RedundancyStrategy which is not
trivial to get as it requires infectious.FEC. For example infectious.FEC
creation is visible on GE loop observer CPU profile because we were
doing this for each segment in DB.
New method was added to storj.Redundancy and here we are just wiring it
with metabase Segment.
BenchmarkSegmentPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize
BenchmarkSegmentPieceSize/eestream.CalcPieceSize-8 5822 189189 ns/op 9776 B/op 8 allocs/op
BenchmarkSegmentPieceSize/segment.PieceSize
BenchmarkSegmentPieceSize/segment.PieceSize-8 94721329 11.49 ns/op 0 B/op 0 allocs/op
Change-Id: I5a8b4237aedd1424c54ed0af448061a236b00295
This modification introduce support of the new "desired node" field of download segment/object.
This can be used to request more nodes than the suggested minimum. It can be used to achieve better performance in exchange of using more bandwidth. (more parallel downloads).
Change-Id: Ia167d6979e6d70a597c85070a4ccd1c3a573e406
Most of our (~integration) tests based on testplanet runner.
However running testplanet for each test make the testing process slow.
It seems to be better to use real unit tests (without db dependency) when it's possible.
This patch makes small modification to make it possible to test orders.Service with real unit test.
As the existing unit test of `service.go` is isolated with `_test` package name, it's moved to an `_integration_test.go` file to make place for the unit test.
Change-Id: Ia69f26a34e2c48d230d8d36c2040dd02a60455a6
we have two more fields in the database (noise_proto and
noise_public_key) that now need to go into pb.NodeAddress when
returning AddressedOrderLimits.
the only real complication is making sure type conversions between
database types and NodeURLs and so on don't lose this new
pb.NodeAddress field (NoiseInfo). otherwise this is a relatively
straightforward commit
Change-Id: I45b59d7b2d3ae21c2e6eb95497f07cd388d454b3
bucket_bandwidth_rollups table
We have performance problems with updating bucket_bandwidth_rollups. To
improve situation we can stop storing allocated bandwidth in this table.
This should reduce large number of updates which are comming from
metainfo endpoints, repair workers and audit.
Next step will be to drop `allocated` column completely from
bucket_bandwidth_rollups.
Allocated GET bandwidth is all we need and we are keeping it in
bucket_bandwidth_rollups table.
Change-Id: Ifdd26a89ba8262acbca6d794a6c02883ad0c0c9b
ReverifyPiece() is not currently hooked up to anything, but is planned
to take the place of audit.(*Verifier).Reverify().
ReverifyPiece() works by downloading one piece in its entirety, rather
than pulling an entire stripe across many nodes.
Change-Id: Ie2c680f4d3c3b65273a72466a3f9f55c115b0311
Use DownloadSelectionCache to avoid querying database for every
download.
This change only addresses downloads from users. The download selection
cache is not currently used for audit and repair.
Change-Id: I96a49e121dac0b4204f97592a63131edabd73fb5
For nodes in excluded areas, we don't necessarily want to remove them
from the pointer, but we do want to increase the number of pieces in the
segment in case those excluded area nodes go down. To do that, we
increase the number of pieces repaired by the number of pieces in
excluded areas.
Change-Id: I0424f1bcd7e93f33eb3eeeec79dbada3b3ea1f3a
inconsistency
The original design had a flaw which can potentially cause discrepancy
for nodes reputation status between reputations table and nodes table.
In the event of a failure(network issue, db failure, satellite failure, etc.)
happens between update to reputations table and update to nodes table, data
can be out of sync.
This PR tries to fix above issue by passing through node's reputation from
the beginning of an audit/repair(this data is from nodes table) to the next
update in reputation service. If the updated reputation status from the service
is different from the existing node status, the service will try to update nodes
table. In the case of a failure, the service will be able to try update nodes
table again since it can see the discrepancy of the data. This will allow
both tables to be in-sync eventually.
Change-Id: Ic22130b4503a594b7177237b18f7e68305c2f122
In the past ExpirationDate was available inside CreatePutRepairOrderLimits
but this was removed since the metabase segment was missing the ExpiresAt field.
Now ExpiresAt field is available in the metabase segment
and can be set correctly while executing NewSignerRepairPut.
Change-Id: I068c07492ab27bde2c44477bbd32c5872edd024a
Sometimes we see timeouts from DNS lookups when trying to do
repair GETs. Solution: try using node's last IP and port first.
If we can't connect, retry with DNS lookup.
Change-Id: I59e223aebb436118779fb18378f6e09d072f12be
Satellites set their configuration values to default values using
cfgstruct, however, it turns out our tests don't test these values
at all! Instead, they have a completely separate definition system
that is easy to forget about.
As is to be expected, these values have drifted, and it appears
in a few cases test planet is testing unreasonable values that we
won't see in production, or perhaps worse, features enabled in
production were missed and weren't enabled in testplanet.
This change makes it so all values are configured the same,
systematic way, so it's easy to see when test values are different
than dev values or release values, and it's less hard to forget
to enable features in testplanet.
In terms of reviewing, this change should be actually fairly
easy to review, considering private/testplanet/satellite.go keeps
the current config system and the new one and confirms that they
result in identical configurations, so you can be certain that
nothing was missed and the config is all correct.
You can also check the config lock to see what actual config
values changed.
Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e
Previously the object range was not used for calculating order limit.
This meant that even if you were downloading only a small range it would
account bandwidth based on the full segment.
This doesn't fully address the accounting since the lazy segment
downloads do not send their requested range nor requested limit.
Change-Id: Ic811e570c889be87bac4293547d6537a255078da
metabase has become a central concept and it's more suitable for it to
be directly nested under satellite rather than being part of metainfo.
metainfo is going to be the "endpoint" logic for handling requests.
Change-Id: I53770d6761ac1e9a1283b5aa68f471b21e784198
The previous default FlushBatchSize of 10000 was causing major
slow down in select and insert statements on bucket_bandwidth_rollups.
We saw on the saltlake satellite that a FlushBatchSize of 1000 helped
reduce contention and query latency.
Change-Id: Ib95e73482219bc5aedc11925b1849fa5999774ba
Delete satellite order methods and DB tables which aren't used anymore
after we have done a refactoring on the orders to stuck bucket
information in the orders' encrypted metadata.
There are also configuration parameters and a satellite chore that
aren't needed anymore after the orders refactoring.
Change-Id: Ida3682b95921df70792284b42c96d2508bf8ca9c
CreateGetOrderLimits is not used anymore because we have CreateGetOrderLimits2. We need to remove old method and fix name of second.
Change-Id: I59148b8d28fc9dbab7d452c884319125a02745d1
Avoid using project uuid string representation, because
it uses more bandwidth.
This reduces the encrypted metadata size from 118 -> 97 bytes.
Change-Id: Ic53a81b83acc065f24f28cd404f9c0b1fe592594
Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature.
Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
This PR contains the minimum changes needed to stop inserting into the serial_numbers table. This is the first step in completely deprecating that table.
The next step is to create another PR to remove the expiredSerial chore, fix more tests, and remove any other methods on the serial_number table.
Change-Id: I5f12a56ebf3fa4d1a1976141d2911f25a98d2cc3
We want to stop using the serial_numbers table in satelliteDB. One of the last places using the serial_numbers table is when storagenodes settle orders, we look up the bucket name and project ID from the serial number from the serial_numbers table.
Now that we have support to add encrypted metadata into the OrderLimit, this PR makes use of that and now attempts to read the project ID and bucket name from the encrypted orderLimit metadata instead of from the serial_numbers table. For backwards compatibility and to ensure no errors, we will still fallback to the old way of getting that info from the serial_numbers table, but this will be removed in the next release as long as there are no errors.
All processes that create orderLimits must have an orders.encryption-keys set. The services that create orderLimits (and thus need to encrypt the order metadata) are the satellite apiProcess, the repair process, audit service (core process), and graceful exit (core process). Only the satellite api process decrypts the order metadata when storagenodes settle orders. This means that the same encryption key needs to be provided in the config for the satellite api process, repair process, and the core process like so:
orders.include-encrypted-metadata=true
orders.encryption-keys="<"encryptionKeyID>=<encryptionKey>"
Change-Id: Ie2c037971713d6fbf69d697bfad7f8b672eedd66
This change is adjusting metainfo endpoint to use metabase for uploading
and downloading remote objects. Inline segments will be added later.
Change-Id: I109d45bf644cd48096c47361043ebd8dfeaea0f3
Storage nodes undergoing Graceful Exit have up to now been receiving
hostnames for all other storage nodes they need to contact when
transferring pieces. This adds up to a lot of DNS lookups, which
apparently overwhelm some home routers. There does not seem to be any
need for us to send hostnames for graceful exit as opposed to IP
addresses; we already use IP addresses (as given by the last_ip_port
column in the nodes table) for all the GET and PUT orders we send out.
This change causes IP addresses to be used instead.
I started trying to construct a test to ensure that the behavior
changed, but it was rabbit-holing, so I've begun to feel that maybe this
change doesn't require one; it is a very simple change, and very much of
the same nature as what we already do for IPs in CreateGetOrderLimits
and CreatePutOrderLimits (and others).
Change-Id: Ib2b5ffe7a9310e9cdbe7464450cc7c934fa229a1
This preserves the last_ip_and_port field from node lookups through
CreateAuditOrderLimits() and CreateAuditOrderLimit(), so that later
calls to (*Verifier).GetShare() can try to use that IP and port. If a
connection to the given IP and port cannot be made, or the connection
cannot be verified and secured with the target node identity, an
attempt is made to connect to the original node address instead.
A similar change is not necessary to the other Create*OrderLimits
functions, because they already replace node addresses with the cached
IP and port as appropriate. We might want to consider making a similar
change to CreateGetRepairOrderLimits(), though.
The audit situation is unique because the ramifications are especially
powerful when we get the address wrong. Failing a single audit can have
a heavy cost to a storage node. We need to make extra effort in order
to avoid imposing that cost unfairly.
Situation 1: If an audit fails because the repair worker failed to make
a DNS query (which might well be the fault on the satellite side), and
we have last_ip_and_port information available for the target node, it
would be unfair not to try connecting to that last_ip_and_port address.
Situation 2: If a node has changed addresses recently and the operator
correctly changed its DNS entry, but we don't bother querying DNS, it
would be unfair to penalize the node for our failure to connect to it.
So the audit worker must try both last_ip_and_port _and_ the node
address as supplied by the SNO.
We elect here to try last_ip_and_port first, on the grounds that (a) it
is expected to work in the large majority of cases, and (b) there
should not be any security concerns with connecting to an out-or-date
address, and (c) avoiding DNS queries on the satellite side helps
alleviate satellite operational load.
Change-Id: I9bf6c6c79866d879adecac6144a6c346f4f61200
We are moving an error into rejectErr since its preventing storage nodes from being able to settle other orders.
Change-Id: I3ac97c340e491b127f5e0024c5e8bd9f4df8d5c3
Doing it at the ProcessOrders level was insufficient: the endpoints
make multiple database calls. It was a misguided attempt to only
have one spot enter the semaphore. By putting it in the endpoint
we can not only be sure that the concurrency is correctly limited
but it can be configurable easily.
Change-Id: I937149dd077adf9eb87fce52a1a17dc0afe96f64