Remove starting up messages from peers. We expect all of them to start,
if they don't, then they should return an error why they don't start.
The only informative message is when a service is disabled.
When doing initial database setup then each migration step isn't
informative, hence print only a single line with the final version.
Also use shorter log scopes.
Change-Id: Ic8b61411df2eeae2a36d600a0c2fbc97a84a5b93
stage
The test-versions script no longer uses the `testfiles` directory, which
the final upload for the rolling-upgrade script depended on. This change
creates and populates a `testfiles` diirectory during the final upload
stage of the rolling upgrade test.
Change-Id: Iabeccbadc55a8c85a1febbd5eb4e7d889a57a8dc
When the context was being cancelled the error was being discarded within the rate limiting error handling which caused tests to fail.
Change-Id: I5c6458c16da09a11531233ea0ee80d914969cb3f
deletePointer must return an ErrObjectNotFound rather than a rpc status
error NotFound because the callers must distinguish such error if it
comes from the getPointer or from the UnsynchronizedDelete.
Change-Id: I68b4e45a2765e63b73bf85c2c39a5fc0198373f6
We don't want slowloris nodes to be able to indefinitely block
up the satellite, so add a timeout. Some monitoring inspection
showed the largest success times being on the order of 30s, so
a 1min timeout should be sufficient to kill the misbehaving nodes.
Change-Id: I5e2c3480a15f6304e37262d0a4d30d07eae99bb3
Go will, by default, set tcp keep alives on sockets. But
the kernel does not send keep alives to sockets that have
a non-empty send queue. That can cause connections that
hang forever.
So we set TCP_USER_TIMEOUT on all of the sockets as well.
That option will close any connection that has not received
an ack for any sent data (keep alive or otherwise) in the
configured time period. This places an upper bound on the
amount of time a socket can be stuck due to a client not
acknowleding data.
See https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
for more information on what these options do and how they
interact.
Additionally, make sure that we close every connection coming
from the listeners by wrapping them in a type with a finalizer
that closes the connection, much like the os package does for
file handles. It monitors if a connection was closed due to a
finalizer so that we can go and look for the bug if we ever
see a non-zero value.
Change-Id: Idc6c0564224b8dc2e4c9d769e80374ed1fe8cce0
storing live accounting in memory will not work, as the core and api each create
their own instance. Using redis will allow each to access the same store
Change-Id: I4c8250b579d7b6b6d8991bc890894573626effe6
As per discussed we decided to rate limit how fast we iterate through
the metainfo database in the metainfo loop. This puts in place a
mechanism for rate limiting and burst limiting if need be in the future.
The default for this rate limiting is still no limits so it stays the
same as our previous functionality.
Change-Id: I950f7192962b0e49f082d2c4284e2d52b0a925c7
We are missing some tests for new Metainfo API that we have for old API.
This is first change to adjust old tests to new API.
Change-Id: Ie2b16bf85de8633662f952e863dbf3d409d801d9
For improving the deletion performance we are shifting the
responsibility to delete the pieces of the object from Uplink to the
Satellite.
BeginDeleteObject was the first call to return the stream ID which was
used for after retrieving the list of segments and then get addressed
order limits for deleting the pieces (of each segment) from the storage
nodes.
Now we want the Satellite deletes the pieces of all the object segments
from the storage nodes hence we don't need anymore to have several
network round trips between the Uplink and the Satellite because the
Satellite can delete all of them in the initial BegingDeleteObject
request.
satellite/metainfo.ListSegments has been changed to return 0 items if
the pointer of the last segment of an object is not found because we
need to preserve the backward compatibility with Uplinks that won't be
updated to the last release and they rely on listing the segments after
calling BeginDeleteObject for retrieving the addressed order limits
to contact the storage nodes to delete the pieces.
Change-Id: I5f99ecf27d62d65b0a062936b9b17581ef692af0
Don't use a hardcoded year number for tests which depends on the current
year for avoiding that they don't pass when a new year starts.
Change-Id: I52b6248f6c3ddd2df89a4b04caf3a228b0d564e0
Remove direct dependency on uplink.RSConfig, this simplifies
moving the config file without introducing weird dependencies.
Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec
ensure that every integer conversion fits into the destination
type, and for any unchecked ones, annotate why they are safe.
additionally, any integers we pass into slice headers need to
check that they are not negative.
all of our allocations should check for allocation failure and
return an error if there was a problem. even though an allocation
just failed, we don't pre-allocate the failure because we don't
want the callers to free it like they have to with the other
errors we return. maybe we should just panic.
Change-Id: Id4dfad802d312d35e565041efc9872453c90d793
Adds check to see if storage nodes are eligible to initiate
graceful exit, by checking their CreatedAt date and seeing if
their "age" is greater than the new config value:
NodeMinAgeInMonths
The default for this value is 6 months for now.
https://storjlabs.atlassian.net/browse/V3-3357
Change-Id: Ib807ab8987ddb5a38a27a83886490f73fe8c5816
The endpoint listSegmentsManually method misses a check for the limit
parameter, otherwise it can return inconsistent results when it's 0 or
negative.
When 0 or negative, without the check, it returns no segments but also
that there isn't more segments and that isn't correct.
The function is only called from the Endpoint.ListSegments method and
the function cares to ensure that limit is always greater than 0, but if
the method doesn't check that a new future caller could misuse it and
provoke a bug.
Additionally:
* Documentation for the modified function has been written
* The part of the function that repeated the logic of the
Endpoint.getPointer method has been removed for using that method.
* Added logging before returning an internal error in
Endpoint.getPointer.
Change-Id: I5c4f0db2292da0162db6b7d63553895808d0925a
Do some cleanup for adding new identified TODOs (associated with ticket
https://storjlabs.atlassian.net/browse/V3-3406) and remove an old one.
Change-Id: I5d20dbe1c4dee0a8279e08b05b907f4cc9dba278
Fixes Least Authority Issue F:
https://storjlabs.atlassian.net/browse/V3-3409
If the --allowed-path-prefix flag is not set to the `share` command, any
command arguments will be used as allowed path prefixes.
This patch also improves the output of the `share` command to print the
state of all restrictions, so users can confirm they match their
intention.
Change-Id: Id1b4df20b182d3fe04cb2196feea090975fce8b4
This commit adds functionality to include the space used in the trash
directory when calculating available space on the node.
It also includes this trash value in the space used cache, with methods
to keep the cache up-to-date as files are trashed, restored, and
emptied.
As part of the commit, the RestoreTrash and EmptyTrash methods have
slightly changed signatures. RestoreTrash now also returns the keys that
were restored, while EmptyTrash also returns the total disk space
recovered. Each of these changes makes it possible to keep the cache
up-to-date and know how much space is being used/recovered.
Also changed is the signature of PieceStoreAccess.ContentSize method.
Previously this method returns only the content size of the blob,
removing the size of any header data. This method has been renamed
`Size` and returns both the full disk size and content size of the blob.
This allows us to only stat the file once, and in some instances (i.e.
cache) knowing the full file size is useful.
Note: This commit simply adds the trash size data to the piece size data
we were already collecting. The piece size data is not accurate for all
use-cases (e.g. because it does not contain piece header data); however,
this commit does not fix that problem. Now that the ContentSize (Size)
method returns the full size of the file, it should be easier to fix
this problem in a future commit.
Change-Id: I4a6cae09e262c8452a618116d1dc66b687f59f85
The test case wasn't testing all the combinations.
But, testing all 3 byte combinations would take too long,
instead test special values and +- 1 of them, with some
additional noise characters.
Change-Id: If53ff25863a1f27c534922bd399fbbbdfefda441
flate compression with default settings plays very poorly
together with race, causing test to take a significant amount
of time.
Use pass-through compression to avoid the issue.
Improves test from 2m45s to 17s.
Change-Id: Iadf1381c538736d48e018164697bdfd3356e24b8