Commit Graph

117 Commits

Author SHA1 Message Date
Egon Elbre
36fead0093 satellite/metainfo: add UserAgent support to endpoints (#3548) 2019-11-26 03:12:37 -08:00
Ivan Fraixedes
8e1e4cc342
piecestore: Fix invalid comment and typos (#3604) 2019-11-19 16:30:48 +01:00
Michal Niewrzal
5964502ce0
uplink/metainfo: remove GetObject from download Batch (#3596) 2019-11-19 04:58:26 -08:00
littleskunk
8b3444e088
satellite/nodeselection: don't select nodes that haven't checked in for a while (#3567)
* satellite/nodeselection: don't select nodes that haven't checked in for a while

* change testplanet online window to one minute

* remove satellite reconfigure online window = 0 in repair tests

* pass timestamp into UpdateCheckIn

* change timestamp to timestamptz

* edit tests to set last_contact_success to 4 hours ago

* fix syntax error

* remove check for last_contact_success > last_contact_failure in IsOnline
2019-11-15 23:43:06 +01:00
Michal Niewrzal
bc16cb5d24
libuplink: remove additional GetBucket for upload/download (#3568) 2019-11-15 02:06:17 -08:00
Ivan Fraixedes
c193dee9ae
uplink/storage/streams: Fix upload error clean up (#3555)
The number of the last uploaded segment returned by upload wasn't
correct in all the return statements. The Put method calls upload, but
with the returned values it couldn't be certain of correctly cleaning up
the uploaded segments when an error happened.

This commit moves the clean-up logic inside the upload method, because
that is easier to understand than having upload clean up only when a
context cancellation happens. It also fixes the number of segments to be
deleted by cancelHandler when an error happens.

These changes should avoid trying to delete segments which haven't been
uploaded because of the wrong index value or because of being inline.
2019-11-15 10:37:20 +01:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
paul cannon
0c025fa937 storage/: remove reverse-key-listing feature
We don't use reverse listing in any of our code, outside of tests, and
it is only exposed through libuplink in the
lib/uplink.(*Project).ListBuckets() API. We also don't know of any users
who might have a need for reverse listing through ListBuckets().

Since one of our prospective pointerdb backends can not support
backwards iteration, and because of the above considerations, we are
going to remove the reverse listing feature.

Change-Id: I8d2a1f33d01ee70b79918d584b8c671f57eef2a0
2019-11-12 18:47:51 +00:00
Ivan Fraixedes
6516471cbc
uplink/storage/streams: Upload loop operations reorganization (#3429)
* uplink/storage/streams: Upload loop ops reorganization

  Reorganize the operations of the loop run by the streamsStore.upload
  method to avoid unneeded computations on each iteration.

* uplink/storage/streams: Move out return values declaration

  Move out the return value declarations that aren't strictly
  needed for defer statements or documentation purposes.
2019-11-12 08:30:18 +01:00
Ivan Fraixedes
e4a220347a
uplink: Suppress one metainfo call on delete (#3511)
Change signature of metainfo DeleteObject to get rid of an extra call to
kvmetainfo GetBucket method and eliminate one round trip to the
satellite when deleting objects.
2019-11-07 10:39:40 +01:00
littleskunk
7eb6724c92
logging: unify logging around satellite ID, node ID and piece ID (#3491)
* logging: unify logging around satellite ID, node ID and piece ID

* unify segment index
2019-11-05 22:04:07 +01:00
Maximillian von Briesen
d9bb25b4b9 satellite/metainfo: support a wider range of values for RS.Total in satellite metainfo validation (#3431)
change uplink RS default configuration from 130 to 95
2019-10-31 15:04:33 -04:00
Jeff Wendling
59f81a4a0d groupcancel/ec delete: add a timeout based on completion times
we used to do something similar for puts, but that ended up hurting
more than it helped. since deletes are best effort, we can do it
here to kill long tails or unresponsive nodes.

Change-Id: I89fd2d9dcf519d76c78ddad70bc419d1868d2df1
2019-10-30 16:18:39 -06:00
Natalie Villasana
5453886231 satellite/repair, uplink/ecclient: remove unused expiration arg from ec.Repair and ec.putPiece (#3416) 2019-10-30 11:35:00 -04:00
Michal Niewrzal
8786a37f89
uplink/storage: use Batch to optimize upload requests (#3408) 2019-10-29 08:49:16 -07:00
Ivan Fraixedes
14e661a1b0
uplink/storage/segments: non-functional improvements (#3400)
* uplink/storage/streams: Remove unused field of struct

  Remove an unused field from a struct type.

* uplink/storage/streams: Add end period in func doc

  Add the missing final period to some function documentation comments,
  following conventions.

* uplink/storage/segments: Replace switch by if
  
  Replace the two-branch switch statement, where one branch has a
  condition that returns and the other is the default, with an if
  conditional because it's easier to read.
2019-10-29 15:39:17 +01:00
Michal Niewrzal
56f8b2d626
uplink/storage: remove bucket store (#3376) 2019-10-28 09:40:46 -07:00
Ivan Fraixedes
4f281ef348
uplink: Refactor segments Store Get for metainfo Batch (#3362)
Refactor the segments Store interface Get method signature so that
implementations don't use the metainfo Client and callers are able to
batch requests.
2019-10-28 17:23:20 +01:00
Maximillian von Briesen
6df4d7bc73
storagenode/gracefulexit + satellite/gracefulexit: add storagenode-side transfer validation (#3371)
* Make the exiting node check piece hashes, piece IDs, and piece hash signatures before relaying successful transfer data to the satellite.
* Enable immediate graceful exit failure for "successful" transfers that fail satellite-side validation.
* Move transfer piece logic in storagenode worker to separate function (to make the worker easier to understand)
2019-10-25 13:16:20 -04:00
Ivan Fraixedes
d9d82b0336 uplink: Reduce satellite request using Batch when possible (#3351)
* uplink/metainfo: Return classified Not Found error

Metainfo client Batch method must return the Storj Not Found error class
when the RPC server responds with a not found status code, as any other
metainfo Client method does.

Also, if the error isn't a Not Found one, it must wrap the error.

* uplink/storage/streams: Use Batch request in Delete

Replace the 2 individual metainfo Client calls made by the streamStore
Delete method with a single Batch call.
2019-10-24 14:18:48 -07:00
Michal Niewrzal
521c39bda0
uplink/metainfo: cleanup method names (#3315) 2019-10-22 23:59:56 -07:00
JT Olio
2c6fa3c5f8
pkg/rpc: remove read/write deadlines as a mechanism for request timeouts (#3335)
libuplink was incorrectly setting timeouts to 10 seconds still, but
should have been at least 10 minutes. the order sender was setting them
to 1 hour. we don't want timeouts in uplink-side logic as it establishes
a minimum rate on tcp streams.

instead of all of this, just use tcp keep alive. tcp keep alive packets are
sent every 15 seconds and if the peer stops responding the connection
dies. this is enabled by default with go. this will kill tcp connections
when they stop working.

Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b
2019-10-22 17:57:24 -06:00
Ethan Adams
3e0d12354a
storagenode/gracefulexit: Implement storage node graceful exit worker - part 1 (#3322) 2019-10-22 16:42:21 -04:00
Ivan Fraixedes
071d1c4313
upload: Add more info to returned error response & to logs (#3218)
* uplink/storage/segments: return error no optimal threshold
  Return an error if the store gets fewer uploaded pieces than indicated
  by the optimal threshold.

* satellite/metainfo: Fix gRPC status error & add reason
  This commit fixes the CommitSegment endpoint method to return an
  "Invalid Argument" status code when uplink submits invalid data which is
  detected when filtering invalid pieces by filterInvalidPieces endpoint
  method.

  Because filterInvalidPieces is also used by CommitSegmentOld, that
  method has been changed accordingly.

  * An initial check in CommitSegment to detect earlier if uplink sends an
    invalid number of upload pieces.
  * Add more information to some log messages.
  * Return more information to uplink when it sends a number of invalid
    pieces which makes it impossible to finish the operation successfully.

* satellite/metainfo: Swap some "sugar" loggers to normal ones
  Swap "sugar" loggers to normal ones because they impact the performance
  in production systems and they should only be used under specific
  circumstances which were none of the ones changed.
2019-10-17 20:01:40 +02:00
Ivan Fraixedes
21c6737b10
uplink/ecclient: clarify defer logic in putPiece (#3247)
Refactor the 'defer' function logic so that it contains only what must not be forgotten before returning, simplifying it to make the overall function logic easier to understand.
2019-10-16 18:05:22 +02:00
littleskunk
2301a8287f Satellite/PieceHashValidation: Increase time window from 2h to 24h to avoid timezone issues (#3291) 2019-10-16 06:47:08 -06:00
Ivan Fraixedes
9caa3181d3
uplink/piecestore: Check SN piece hash timestamp (#3246)
Uplink must verify that every piece uploaded to a storage node returns a
hash whose timestamp isn't older than the maximum elapsed time allowed
by the Satellite.

We cannot leave this check only to the Satellite side, because if no
error is reported for this, the uplink cuts down the long tail. When the
uplink submits the upload results including these invalid ones, the
Satellite filters out the invalid ones, and that can leave it with fewer
than the optimal threshold of valid upload results, so it rejects the
request.

Detecting the error at this stage allows the uplink to treat these
uploads as invalid and avoid cutting down the long tail prematurely.
2019-10-15 16:07:18 +02:00
JT Olio
6ede140df1
pkg/rpc: defeat MITM attacks in most cases (#3215)
This change adds a trusted registry (via the source code) of node address to node id mappings (currently only for well known Satellites) to defeat MITM attacks to Satellites. It also extends the uplink UI such that when entering a satellite address by hand, a node id prefix can also be added to defeat MITM attacks with unknown satellites.

When running uplink setup, satellite addresses can now be of the form 12EayRS2V1k@us-central-1.tardigrade.io (not even using a full node id) to ensure that the peer contacted is the peer that was expected. When using a known satellite address, the known node ids are used if no override is provided.
2019-10-12 14:34:41 -06:00
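A minimal sketch of how an address of the form used in 6ede140df1 could be split and checked, assuming the part before "@" is a node ID prefix and the rest is the network address. Both helper names and the full node ID in main are made up for illustration. The real parsing lives in the storj code and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSatelliteAddress is a hypothetical helper. It separates an optional
// node ID prefix from the address; idPrefix is empty when there is no "@".
func splitSatelliteAddress(s string) (idPrefix, address string) {
	if i := strings.Index(s, "@"); i >= 0 {
		return s[:i], s[i+1:]
	}
	return "", s
}

// matchesPrefix is a hypothetical helper. It reports whether the ID presented
// by the peer starts with the prefix the user supplied.
func matchesPrefix(peerID, idPrefix string) bool {
	return strings.HasPrefix(peerID, idPrefix)
}

func main() {
	prefix, addr := splitSatelliteAddress("12EayRS2V1k@us-central-1.tardigrade.io")
	fmt.Println(prefix, addr)

	// "12EayRS2V1kEXAMPLE" is a made-up full ID, not a real node ID.
	fmt.Println(matchesPrefix("12EayRS2V1kEXAMPLE", prefix))
	fmt.Println(matchesPrefix("1someOtherPeer", prefix))
}
```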
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Michal Niewrzal
607da4ab4a
metainfo: move FinishDeleteSegment logic to BeginDeleteSegment (#3104) 2019-09-23 14:41:58 -07:00
JT Olio
946ec201e2
metainfo: move api keys to part of the request (#3069)
What: we move api keys out of the grpc connection-level metadata on the client side and into the request protobufs directly. the server side still supports both mechanisms for backwards compatibility.

Why: dRPC won't support connection-level metadata. the only thing we currently use connection-level metadata for is api keys. we need to move all information needed by a request into the request protobuf itself for drpc support. check out the .proto changes for the main details.

One fun side-fact: Did you know that protobuf fields 1-15 are special and only use one byte for both the field number and type? Additionally did you know we don't use field 15 anywhere yet? So the new request header will use field 15, and should use field 15 on all protobufs going forward.

Please describe the tests: all existing tests should pass

Please describe the performance impact: none
2019-09-19 10:19:29 -06:00
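The side-fact about fields 1-15 in 946ec201e2 can be checked with a short sketch. A protobuf field key is the varint encoding of (field number << 3) | wire type, so it fits in one byte only while that value is below 128. The helper name is hypothetical. Wire type 2 (length-delimited) is used here because that is how embedded messages are encoded.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// tagSize is a hypothetical helper. It returns how many bytes the protobuf
// field key occupies for the given field number and wire type.
func tagSize(fieldNumber, wireType uint64) int {
	key := fieldNumber<<3 | wireType
	buf := make([]byte, binary.MaxVarintLen64)
	// PutUvarint uses the same base-128 varint encoding as protobuf keys
	// and returns the number of bytes written.
	return binary.PutUvarint(buf, key)
}

func main() {
	const lengthDelimited = 2 // wire type for embedded messages

	for _, field := range []uint64{1, 15, 16} {
		fmt.Printf("field %d: %d byte(s)\n", field, tagSize(field, lengthDelimited))
	}
}
```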
Michal Niewrzal
1c72e80e40 uplink/satellite: fix for case when inline segment is last one (#3062)
* uplink/satellite: fix when inline seg is last one

* review comments
2019-09-19 01:18:14 +02:00
Jess G
7c203b4884
add satelliteSystem to testplanet and update tests (#3066) 2019-09-17 13:14:49 -07:00
Isaac Hess
5a50042c77
uplink/storage/streams: Add test for interrupted deletes (#3040)
* uplink/storage/streams: Add test for interrupted deletes

* uplink/storage/streams: Fix linting errors
2019-09-13 13:08:15 -06:00
Ivan Fraixedes
ccbf73ecc7
uplink/ecclient: Remove unneeded atomic operation (#3036)
Atomic operations are only needed when a variable can be accessed
concurrently, so when it isn't the case there is no need to use them.
2019-09-13 12:47:35 +02:00
Ivan Fraixedes
8a48500ba4
uplink/ecclient: Report success in debug level (#3037)
Packages shouldn't be chatty when things go as expected unless the
DEBUG log level is set.
2019-09-13 12:04:12 +02:00
Michal Niewrzal
64c467ffe7
uplink: integrate new Metainfo calls (#2640) 2019-09-10 08:39:47 -07:00
Maximillian von Briesen
fb10815229 Repair with hashes (#2925)
* add outline for ECRepairer

* add description of process in TODO comments

* begin download/getting hash for a single piece

* verify piece hash and order limit during download

* fix download piece

* begin filling out ECRepairer.Get

* wip move ecclient.Repair to ecrepairer.Repair

* pass satellite signee into repairer

* reconstruct original stripe from pieces

* move rebuildStripe()

* calculate piece size differently, increment successful count

* fix shares slices initialization

* rename stripeData to segment

* do not pad reader in Repair()

* temp debug

* create unsafeRSScheme

* use decode reader

* rename file name to be all lowercase

* make repair downloader async

* declare condition variable inside Get method

* set downloadAndVerifyPiece's in-memory buffer to be share size

* update unusedLimits var

* address comments

* remove unnecessary comments

* move initialization of segmentRepairer to be outside of repairer service

* use ReadAll during download

* remove dots and move hashing to after validating for order limit signature

* wip test

* make sure files exactly at min threshold are repaired

* remove unused code

* use corrupt data and write back to storagenode

* only create corrupted node and piece ids once

* add comment

* address nat's comment

* fix linting and checker_test

* update comment

* add comments

* remove "copied from ecclient" comments

* add clarification comments in ec.Repair
2019-09-06 15:20:36 -04:00
Michal Niewrzal
61168493dc
uplink: don't stop deleting segments on first error (#2943) 2019-09-05 14:25:30 +02:00
Michal Niewrzal
a6721ba92f
satellite/metainfo: Improve metainfo ListSegments (#2882) 2019-08-30 23:30:18 +02:00
Natalie Villasana
9a1b9f8431
uplink/ecclient: change delete logs from err to debug level (#2917) 2019-08-30 17:00:34 -04:00
Egon Elbre
c309bd3fec
lint: add linting for errs package (#2881) 2019-08-27 19:07:12 +03:00
Bill Thorp
a250551b6d storagenode/piecestore + uplink/piecestore: return PieceHash and original OrderLimit during GET_REPAIR (#2775) 2019-08-26 14:57:41 -04:00
JT Olio
12d50ebb99
streams: don't encrypt segment count (#2859)
What: this change makes sure the count of segments is not encrypted.

Why: having the segment count encrypted just makes things hard for no reason - a satellite operator can figure out how many segments an object has by looking at the other segments in the database. but if a user has access but has lost their encryption key, they now can't clean up or delete old segments because they can't know how many there are without just guessing until they get errors. :(

Backwards compatibility: clients will still understand old pointers and will still write old pointers. at some point in the future perhaps we can do a migration for remaining old pointers so we can delete the old code.

Please describe the tests: covered by existing tests

Please describe the performance impact: none
2019-08-22 15:15:58 -06:00
Jeff Wendling
057d30152c
uplink/storage/segments: seed download permutation with timestamp (#2809) 2019-08-16 11:14:02 -06:00
Maximillian von Briesen
189b268892
uplink/piecestore: Change where ignore cancel happens for closing downloads (#2786) 2019-08-15 10:32:05 -04:00
Bryan White
1915b59af3 satellite/repair: monkit improvements (#2773) 2019-08-14 15:40:26 -04:00
Maximillian von Briesen
3a82b63974
uplink/ecclient: performance - close connections faster (#2757) 2019-08-14 10:03:51 -04:00
Egon Elbre
48211daa9d
uplink/piecestore: handle Download errors better (#2771) 2019-08-14 12:02:58 +03:00
Egon Elbre
9eba5ac631
lib/uplink: remove Seek method (#2768) 2019-08-13 20:29:02 +03:00