Move rpc test that uses testplanet into private/testplanet.
This ensures that rpc doesn't have the whole system as a dependency
making it easier to separate.
This unfortunately leaves pkg/rpc without specific tests; to cover it
again we would need to write new tests that only use the core packages.
Change-Id: I402ab3c2d50282af159c2ef3371d23b0997fef0a
This changes when we write the drpcheader. Rather than making it its own
write to the connection, it now prepends the drpc header to the first
write on the connection (typically the tls handshake). This results in
one less packet being sent at the beginning of each drpc connection.
For an operation like uploading a file from uplink, this saves many
packets: one when communicating with the satellite, and one for each
communication with the storage nodes.
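A rough sketch of the idea (the type and field names are illustrative,
not the actual implementation): a net.Conn wrapper holds the header and
flushes it together with the first application write.

    package rpcheader // illustrative package name

    import "net"

    // prependConn glues the drpc header onto the first Write (typically
    // the start of the TLS handshake) instead of sending it on its own.
    type prependConn struct {
        net.Conn
        header []byte // pending header; nil once it has been flushed
    }

    func (c *prependConn) Write(p []byte) (n int, err error) {
        if c.header == nil {
            return c.Conn.Write(p)
        }
        buf := append(c.header, p...)
        c.header = nil
        written, err := c.Conn.Write(buf)
        // report only the caller's bytes, not the prepended header
        n = written - (len(buf) - len(p))
        if n < 0 {
            n = 0
        }
        return n, err
    }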
Change-Id: I7644b46e90ffa7acea73ac56831396307352ed7a
After changing how we execute the storagenode-updater process, we lost
timestamps in the log.
The fix is to start using zap logging.
The Windows Installer is changed to register the storagenode-updater
service so that the Windows Service Manager passes the --log.output
flag instead of the old --log.
The old --log flag is deprecated, but not removed. We will support it
for backward compatibility. This is required because the
storagenode-updater can auto-update itself, while the Windows Service
Manager of an old installation will continue passing the old --log flag
when starting it.
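A rough sketch of the backward-compatible flag handling, assuming plain
stdlib flags rather than the real config machinery (the flag names
follow the commit, everything else is illustrative):

    package updater // illustrative only

    import (
        "flag"

        "go.uber.org/zap"
    )

    var (
        // new flag consumed by the zap-based logger
        logOutput = flag.String("log.output", "stderr", "log output destination")
        // deprecated flag kept so old installations keep working
        logLegacy = flag.String("log", "", "deprecated, use --log.output instead")
    )

    func newLogger() (*zap.Logger, error) {
        flag.Parse()
        output := *logOutput
        if *logLegacy != "" {
            // old service registrations still pass --log; honor it
            output = *logLegacy
        }
        cfg := zap.NewProductionConfig()
        cfg.OutputPaths = []string{output}
        return cfg.Build() // zap stamps every entry with a timestamp
    }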
Change-Id: I690dff27e01335e617aa314032ecbadc4ea8cbd5
Signed-off-by: Kaloyan Raev <kaloyan@storj.io>
long-lived uplinks could just hold on to connections forever
if their client to the storagenode or satellite isn't closed.
this will prevent that from happening on the client. more
changes will be necessary to add appropriate prevention on
the servers.
Change-Id: Ib36d85e70cbafb315664ad7657bb70b936b3828c
planet.Start starts a testplanet system, whereas planet.Run starts a testplanet
and runs a test against it with each DB backend (cockroach compat).
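For example, a test using planet.Run looks roughly like this (the
testplanet.Run signature is the one used across our tests; import paths
are approximate):

    package example_test

    import (
        "testing"

        "storj.io/storj/private/testcontext"
        "storj.io/storj/private/testplanet"
    )

    func TestSomething(t *testing.T) {
        // Run starts the planet and runs the test once per DB backend.
        testplanet.Run(t, testplanet.Config{
            SatelliteCount: 1, StorageNodeCount: 4, UplinkCount: 1,
        }, func(t *testing.T, ctx *testcontext.Context, planet *testplanet.Planet) {
            // exercise planet.Satellites, planet.StorageNodes, planet.Uplinks
        })
    }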
Change-Id: I39c9da26d9619ee69a2b718d24ab00271f9e9bc2
* pkg/pg: Add new service function to storage node
Add a new service function to the storage node piece store for deleting
pieces when satellites request them.
* storagenode/piecestore: Add endpoint to delete piece
Add a new endpoint to receive requests from trusted satellites to
delete a piece (see the sketch after this list).
* private/testplanet: Fix storagenode mock
Add the new endpoint method to the storagenode mock.
* proto.lock: Update it with the latest protobuf changes
* storagenode/piecestore: Reuse test piece upload
Extract the logic for uploading a test piece, repeated across several
test functions, into a test helper function.
* uplink/piecestore: Implement client side method
Implement the client side method of the new piecestore RPC function.
* storagenode/piecestore: Add test for DeletePiece endpoint
Implement a test for the new DeletePiece endpoint method.
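A minimal sketch of the endpoint idea, using stand-in types rather than
the real piecestore dependencies (all names here are illustrative):

    package example // illustrative only; the real endpoint lives in storagenode/piecestore

    import (
        "context"
        "errors"
    )

    // minimal stand-ins for the real identifiers and dependencies
    type PieceID [32]byte
    type NodeID [32]byte

    type Trust interface {
        IsTrusted(ctx context.Context, satellite NodeID) bool
    }

    type Store interface {
        Delete(ctx context.Context, satellite NodeID, id PieceID) error
    }

    type Endpoint struct {
        trust Trust
        store Store
    }

    // DeletePiece only lets trusted satellites ask a storage node to
    // delete one of the pieces they own.
    func (e *Endpoint) DeletePiece(ctx context.Context, satellite NodeID, id PieceID) error {
        if !e.trust.IsTrusted(ctx, satellite) {
            return errors.New("piecestore: untrusted satellite")
        }
        return e.store.Delete(ctx, satellite, id)
    }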
if your server is built to accept drpc connections, clients can
still connect with grpc. thus, your responses to grpc clients
must still look the same, so all of our status wrapping has to
include codes for both, letting the drpc and grpc servers each
return the right thing.
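a rough sketch of carrying both code families on one error (the wrapper
type is hypothetical; only the GRPCStatus hook is a real grpc convention):

    package example // illustrative only

    import (
        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"
    )

    // codeErr keeps a drpc numeric code on the error while still
    // satisfying grpc's status lookup via the GRPCStatus method.
    type codeErr struct {
        err      error
        grpcCode codes.Code
        drpcCode uint64
    }

    func (c *codeErr) Error() string { return c.err.Error() }
    func (c *codeErr) Unwrap() error { return c.err }

    // GRPCStatus lets the grpc server report the right code to grpc clients.
    func (c *codeErr) GRPCStatus() *status.Status {
        return status.New(c.grpcCode, c.err.Error())
    }

    // DRPCCode would be read when building the drpc response.
    func (c *codeErr) DRPCCode() uint64 { return c.drpcCode }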
Change-Id: If99fa0e674dec2e20ddd372a827f1c01b4d305b2
these may not be optimal but they're probably better based on
our previous testing. we can tune better in the future now that
the groundwork is there.
Change-Id: Iafaee86d3181287c33eadf6b7eceb307dda566a6
We don't use reverse listing in any of our code, outside of tests, and
it is only exposed through libuplink in the
lib/uplink.(*Project).ListBuckets() API. We also don't know of any users
who might have a need for reverse listing through ListBuckets().
Since one of our prospective pointerdb backends cannot support
backwards iteration, and because of the above considerations, we are
going to remove the reverse listing feature.
Change-Id: I8d2a1f33d01ee70b79918d584b8c671f57eef2a0
drpc will call Close on any transport we pass to it, but some
transports (like tls.Conn) will attempt to notify the remote
side that the connection is closing. we don't want to do that,
so pass a new interface that just closes the underlying socket.
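roughly the shape of that wrapper (names here are illustrative):

    package example // illustrative only

    import (
        "crypto/tls"
        "net"
    )

    // rawCloser behaves like the tls.Conn for reads and writes, but Close
    // shuts the raw socket directly so no TLS close_notify goes out.
    type rawCloser struct {
        *tls.Conn
        raw net.Conn // the plain TCP connection underneath
    }

    func (c *rawCloser) Close() error { return c.raw.Close() }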
Change-Id: I53344d2747de21b3146abe4f82b8394bb8948cb5
Change signature of metainfo DeleteObject to get rid of an extra call
to the kvmetainfo GetBucket method and eliminate one round trip to the
satellite when deleting objects.
grpc doesn't exit dials right away if the context dialer
returns an error. since that's the only spot where we were
enforcing dial timeouts, dials could just leak for an
unknown amount of time.
add a timeout above the grpc dial because that's the documented
way grpc expects to be canceled.
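roughly, the fix looks like this (the function name and where the
timeout value comes from are illustrative):

    package example // illustrative only

    import (
        "context"
        "time"

        "google.golang.org/grpc"
    )

    func dialWithTimeout(ctx context.Context, addr string, timeout time.Duration, opts ...grpc.DialOption) (*grpc.ClientConn, error) {
        // enforce the dial timeout on the context itself, the way grpc
        // documents, rather than only inside the context dialer.
        ctx, cancel := context.WithTimeout(ctx, timeout)
        defer cancel()
        return grpc.DialContext(ctx, addr, opts...)
    }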
Change-Id: Ic47ac61ce8a5f721510cc2c4584f63d43fe4f2d5
we don't know if an incoming connection is from drpc or grpc during
the migration period, so check both.
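a rough sketch of how such a check can work (the helper and the header
value passed in are placeholders, not the actual detection code):

    package example // illustrative only

    import (
        "bufio"
        "bytes"
        "net"
    )

    // isDRPC peeks at the first bytes of an incoming connection and
    // reports whether they match the drpc connection header.
    func isDRPC(conn net.Conn, header []byte) (bool, *bufio.Reader, error) {
        br := bufio.NewReader(conn)
        prefix, err := br.Peek(len(header))
        if err != nil {
            return false, br, err
        }
        // the peeked bytes stay buffered, so whichever server is chosen
        // still sees the full stream through the returned reader
        return bytes.Equal(prefix, header), br, nil
    }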
Change-Id: I2418dde8b651dcc4a23726057178465224a48103
* add signatures, fix process loop bug, move delete to on success
* added tests for signatures
* PR comment updates
* fixed setting reason by default.
* updates for PR comments
* added signed failure when verification fails
* moved to sign_test
* fix panic
* removed testplanet from test
* add overall failure percentage check and inactive time frame check before sending a response to sno (see the sketch after this list)
* update comment
* delete node from transfer queue if it has been inactive for too long
* fix linting error
* add test config value
* fix nil pointer
* add config value into testplanet
* add unit test for overall failure threshold
* move timeframe threshold to chore
* update protolock
* add chore test
* add per piece failure count logic
* change config name from EndpointMaxFailures to MaxFailuresPerPiece
* address comments
* fix linting error
* add error handling for no row returned from progress table
* fix test for graceful exit chore on storagenode
* fix typo InActive -> Inactive
* improve readability for failure threshold calculation
* update config lock
* change error handling for GetProgress in graceful exit endpoint on the satellite side
* return proper rpc error in endpoint
* add check in chore test for checking finish timestamp and queue
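The overall failure check mentioned above amounts to something like the
following (the function name and threshold parameter are illustrative,
not the actual satellite code):

    package example // illustrative only

    // exceedsFailureThreshold reports whether the node's overall transfer
    // failure percentage is at or over the configured limit.
    func exceedsFailureThreshold(failed, total int64, maxFailuresPercent float64) bool {
        if total == 0 {
            return false
        }
        failurePercent := float64(failed) / float64(total) * 100
        return failurePercent >= maxFailuresPercent
    }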
keep a pool of connections open when dialing for drpc. this
makes it so that long-lived clients (like lib/uplink's Project)
don't continue to use a bad connection forever. it also allows
for concurrent rpcs.
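a toy sketch of the pool idea (the real pool also handles expiry and
bad connections; everything here is illustrative):

    package example // illustrative only

    import (
        "net"
        "sync"
    )

    // connPool keeps a bounded number of idle connections per address so
    // a long-lived client neither reuses one bad connection forever nor
    // pays a fresh dial for every concurrent rpc.
    type connPool struct {
        mu    sync.Mutex
        idle  map[string][]net.Conn
        limit int
    }

    func newConnPool(limit int) *connPool {
        return &connPool{idle: make(map[string][]net.Conn), limit: limit}
    }

    func (p *connPool) get(addr string) (net.Conn, bool) {
        p.mu.Lock()
        defer p.mu.Unlock()
        conns := p.idle[addr]
        if len(conns) == 0 {
            return nil, false // caller dials a fresh connection
        }
        conn := conns[len(conns)-1]
        p.idle[addr] = conns[:len(conns)-1]
        return conn, true
    }

    func (p *connPool) put(addr string, conn net.Conn) {
        p.mu.Lock()
        defer p.mu.Unlock()
        if len(p.idle[addr]) >= p.limit {
            _ = conn.Close() // pool full, drop the extra connection
            return
        }
        p.idle[addr] = append(p.idle[addr], conn)
    }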
Change-Id: If649b286050e4f09c413fadc3e1ce88f5fc6e600
libuplink was still incorrectly setting timeouts to 10 seconds, but
they should have been at least 10 minutes. the order sender was setting
them to 1 hour. we don't want timeouts in uplink-side logic as they
establish a minimum rate on tcp streams.
instead of all of this, just use tcp keep-alive. keep-alive packets are
sent every 15 seconds, and if the peer stops responding the connection
dies. this is enabled by default in go. this will kill tcp connections
when they stop working.
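roughly what the dialer side looks like (the function name is
illustrative; net.Dialer's KeepAlive field is the real knob):

    package example // illustrative only

    import (
        "net"
        "time"
    )

    func dialNode(addr string) (net.Conn, error) {
        d := net.Dialer{
            // go enables tcp keep-alive by default; setting 15s makes the
            // probe interval explicit, so a dead peer kills the connection
            // instead of a per-call timeout imposing a minimum rate.
            KeepAlive: 15 * time.Second,
        }
        return d.Dial("tcp", addr)
    }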
Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b