Currently slower storagenodes can slow down deletion queue.
To make piece deletion faster reduce the maximum time spent in
either dialing or piece deletion requests.
With this change:
* dial timeout is 3s
* request timeout is 15s
* fail threshold is set to 10min
Similarly, we'll mark storage node as failed when the timeout occurs.
The timeout usually indicates that the storagenode is overwhelmed.
Garbage collection will ensure that the pieces get deleted eventually.
Change-Id: Iec5de699f5917905f5807140e2c3252088c6399b
We needed Redundancy insided sat StreamID when uplink was defining RS values. Now it can be removed.
Change-Id: Id37187493eaa00cf29cb0262a050d71add3deb96
We should improve the way how we are handling metabase errors in
metainfo endpoint.
https://storjlabs.atlassian.net/browse/PG-316
Change-Id: I1da6f333546cabf34d6eb1de8e94a3ef455d75d5
Multipart upload limits added. Last part has no size limit.
Max number of parts: 10000, min part size: 5 MiB
Change-Id: Ic2262ce25f989b34d92f662bde720d4c4d0dc93d
Speedup is done by reducing number of testplanet instances
for tests without changing main test logic.
Change-Id: Ic3849485d37b8ca55c013a45b7191dce65b88b04
Uplink needs only part of columns we are reading from DB.
To improve performance we should read only those that are
realy needed.
Change-Id: Ib39259318169c46afe5fa4c6ce2184da82e960c8
This change introduce problems with server side move so
let's revert it for now. Problem was found when latest
version of storj/storj was used in uplink tests.
This reverts commit 1ef06fae99.
Change-Id: I4d4fad5d1ea04ba15ff9d7bd765f7e078e9187c2
We were using mixed types for nonce fields. Protobuf
have storj.Nonce, metabase have []byte. This change
is a refactoring to have everywere its possible only
storj.Nonce.
Change-Id: Id54bd8481f30c721cdaf3df79206d25e7cfdab55
We are not using the benchmark results for anything, they are mostly
there to ensure that we don't break the benchmarks. So we can disable
CockroachDB for them.
Similarly add short versions of other tests.
Also try to precompile test/benchmark code.
Change-Id: I60b501789f70c289af68c37a052778dc75ae2b69
It's safer to create new connection pool for piece deletion
only if dialer have no existing pool assigned.
Change-Id: I26661683ab7c0198587905478057c01c8f533a7e
We should be using object naming insted of path.
This is one place where we can easiliy change it.
To regenerate protbuf I had to remove gogo.proto.
Most probably it was confilicting with gogo.proto
from common/pb.
Change-Id: Ia5972f77994765c8f26bf1c3dc8205d2eadd70fa
We want to enable connection pool for piece deletion to avoid
doing multiple SSL hanshakes to SN while massive deletion process.
Change-Id: Ic917e4eda304ee16a286926ef046fe9e38bf38ca
This PR utilize the new burst limit column from projects table to allow
control on the limit for request per seconds and token bucket size
When no burst limit is explicitly set, rate limit is applied to both so
we don't limit how quickly request can be made in a second.
Change-Id: I883235c60c5d6416aeadd1c80ed2ebd193aa4d9f
server-side move extended with moving between buckets, for this reason
we change bucket name for object in db.
Change-Id: Ie21bcccc170e6ff14dcd8053fdb86fdf6d8438a0
Some processing inside storagenodes is async compared to uplink upload
and download, hence we need to explicitly wait for storagenodes to
finish their pending work before flushing orders to the satellite.
Hopefully this fixes TestAttributionReport flakiness.
Change-Id: I77c651ab6471ae094b5c21d1ab3860c96cb0d039
Second method needed to perform server-side move. It updates
metadata key and nonce and all segments key and nonces.
Change-Id: Ia43b26622a13048269f0ae9e1524b345db112adb
First from two methods needed to perform server-side move. It gets
metadata key and nonce and all segments key and nonces and returns
all of that to uplink.
Change-Id: Ied2c79559e77d3f63091c4d61948f2d6a2147d67
Currently, requests that were successfully passed through the metainfo
endpoints rate-limiter might still fail in the middle of the
corresponding response. The problem is that we perform rate-limiting a
second time, which means other requests would influence whether the
current (already rate-checked) request will fail. This also has other
unintended effects, like responding with rpcstatus.PermissionDenied for
requests that were successfully rate-checked and did not lack
permissions but were rate-checked again in the middle of
(*Endpoint).BeginObject. This situation has been happening on the
gateway side and might affect other uplink clients. This change, where
appropriate, swaps subsequent validateAuth with validateAuthN that
performs rate-limiting once.
Change-Id: I6fc26dedb8c442dd20acaab5942f751279020b08
At some point we moved metabase package outside Metainfo
but we didn't do that for satellite structure. This change
refactors only tests.
When uplink will be adjusted we can remove old entries in
Metainfo struct.
Change-Id: I2b66ed29f539b0ec0f490cad42c72840e0351bcb
Two small cleanups:
* merging private commitObject, commitSegment,
makeInlineSegment with its public versions. We were
using it when pb.Pointer was still used.
* removing unused CreatePath method
Change-Id: Ib18b07473d91259335dab874559ef52412ab813d
We're seeing BeginDeleteObject in metaclient returning object not found:
metabase: no rows deleted in the Gateway-MT mint tests. There's a
client check for rpcStatus.NotFound, but the metabase endpoint isn't
wrapping the db error as a DRPC error.
Here's the chain:
gateway.AbortMultipartUpload()
project.AbortUpload()
metainfoClient.BeginDeleteObject() <- understands DRPC errors
endpoint.DeletePendingObject() <- where this code is
db.DeletePendingObject() <- returns error
Change-Id: I93991de76487426df0a807b0d1e69fc975196a1a
We need a way to delete whole part. This especially
needed for uplink multipart API to do cleanup after
aborted or failed part upload.
Test will be added when uplink part will be merged.
Change-Id: I9ba69a49e1adcdce0f42dd3a76f938fcf931155a
Added includeMetadata parameter which represents if metadata should be included in response
by default true, in case of new uplink version - ObjectIncludes will be used instead.
Change-Id: I2f8d3b4cc354cd655f8093bbbebe0e3c2ae14e6f
Bucket tally calculation will be removed from metaloop and will
use metabase objects iterator directly.
At the moment only bucket tally needs objects so it make no sense
to implement separate objects loop.
Change-Id: Iee60059fc8b9a1bf64d01cafe9659b69b0e27eb1
We added expires_at column to segments table and now
we need to populate this column while committing segment.
We still need to migrate existing segments with
separate tool.
Change-Id: Ibac8c63d97201dd98cc2cb9db385f4cb73bc3f7e
Satellites set their configuration values to default values using
cfgstruct, however, it turns out our tests don't test these values
at all! Instead, they have a completely separate definition system
that is easy to forget about.
As is to be expected, these values have drifted, and it appears
in a few cases test planet is testing unreasonable values that we
won't see in production, or perhaps worse, features enabled in
production were missed and weren't enabled in testplanet.
This change makes it so all values are configured the same,
systematic way, so it's easy to see when test values are different
than dev values or release values, and it's less hard to forget
to enable features in testplanet.
In terms of reviewing, this change should be actually fairly
easy to review, considering private/testplanet/satellite.go keeps
the current config system and the new one and confirms that they
result in identical configurations, so you can be certain that
nothing was missed and the config is all correct.
You can also check the config lock to see what actual config
values changed.
Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e
We want to move some of current metainfo loop observers to
segment loop. This change adds new service, similar to metainfo
loop but which is iterating only over segments.
Change-Id: I67f7f461781723a4476e2b83377f31736d7c4870
Previously the object range was not used for calculating order limit.
This meant that even if you were downloading only a small range it would
account bandwidth based on the full segment.
This doesn't fully address the accounting since the lazy segment
downloads do not send their requested range nor requested limit.
Change-Id: Ic811e570c889be87bac4293547d6537a255078da
Currently the interface is not useful. When we need to vary the
implementation for testing purposes we can introduce a local interface
for the service/chore that needs it, rather than using the large api.
Unfortunately, this requires adding a cleanup callback for tests, there
might be a better solution to this problem.
Change-Id: I079fe4dbe297b0ae08c10081a1cea4dfbc277682
The system and database time may drift. We should use database time for
absolute "as of system time" to ensure that it's not newer than the
current database time. When the "as of system time" is in the future,
then the query will fail.
Change-Id: I5423f6aaad966ca03a76b5ff805bfba932e44a51