Commit Graph

79 Commits

Author SHA1 Message Date
Yingrong Zhao
5011e78311 storagenode/piecestore: remove unused DeletePiece endpoint
With commit: 3331b443e7, satellite will
start calling `DeletePieces`. Therefore, we can remove the old endpoint
once the above commit is deployed with all satellites

Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb
2020-02-21 21:03:49 +00:00
Cameron Ayer
3e70a893dd storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold
Curently, storage nodes only report their capacity to satellites
once per hour. If a node fills up, it will fail all uploads until
the next contact cycle begins. With these changes, at the end of an
upload we check whether the MinimumDiskSpace threshold has been
passed. If so, trigger the monitor chore to update the node's
capacity, then trigger the contact chore to report the new
capacity to the satellites

Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0
2020-02-18 15:42:48 -05:00
Egon Elbre
8f20085683 storagenode/piecestore: clearer client cancellation error message
Change-Id: Ia0595f71eb3eb1c0f091e615652e2de376d5609d
2020-02-14 09:36:03 +00:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Jeff Wendling
03166d6be3 storagenode/piecestore: log available bandwidth and space on uploads
Change-Id: Ia92228cb2a178da45f4f123b48c476e5ec821fe8
2020-01-31 19:47:14 +00:00
Jeff Wendling
16bb374deb storagenode/piecestore: add large timeouts to read/write operations
this is to help protect against intentional or unintentional
slowloris style problems where a client keeps a tcp connection
alive but never sends any data. because grpc is great, we have
to spawn a separate goroutine for every read/write to the stream
so that we can return from the server handler to cancel it if
necessary. yep. really.

additionally, we update the rpcstatus package to do some stack
trace capture and add a Wrap method for the times where we want
to just use the existing error.

also fixes a number of TODOs where we attach status codes to the
returned errors in the endpoints.

Change-Id: Id8bb8ff84aa34e0f711b0cf9bce3908b36a1d3c1
2020-01-23 19:20:49 +00:00
Egon Elbre
08f63614be private/context2: add WithoutCancellation
Change-Id: I38557c16f41b8983886f256353cc6afb7634d9e6
2020-01-15 14:23:46 +02:00
Egon Elbre
5af1f9e6d1 storagenode/{piecestore,storagenodedb}: use context in queries
In endpoint.saveOrder, ensure we always try to save orders such
that they can be settled.

Change-Id: Ic9ac8f4bf684d8493282912ca97f386c1762e364
2020-01-14 20:27:26 +00:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Fadila
115b8b0fc8 storagenode/piecestore: delete several pieces in a single request
This is part of the deletion performance improvement.
See https://storjlabs.atlassian.net/browse/V3-3349

Change-Id: Idcd83a302f2bd5cc3299e1a4195a7e177f452599
2019-12-27 10:58:04 +00:00
Andrew Harding
cb89496569 storagenode/trust: wire up list into pool
- also updated ping chore to pick up trust changes
- fixed small typo in blueprint
- fixed flags for storj-sim
- wired up changes to testplanet

Change-Id: I02982f3a63a1b4150b82a009ee126b25ed51917d
2019-12-13 20:32:50 +00:00
Ivan Fraixedes
42c61138e8
storage: Improve doc comments delete methods (#3591)
Improve the documentation of several methods involved in the delete
operation to make clear their behavior without having to inspect their
logic.
2019-12-02 12:18:20 +01:00
Ivan Fraixedes
bf97ef06fc
storagenode: Add new endpoint to receive satellite requests for… (#3590)
* pkg/pg: Add new service function storage node

  Add a new service function to the storage node piece store for deleting
  pieces when satellites request them.

* storagenode/piecestore: Add endpoint to delete piece

  Add a new endpoint to receive from trusted satellites to delete a piece.

* private/testplanet: Fix storagenode mock

  Add to the storagenode mock the new endpoint method.

* proto.lock: Update it with the last protbuff changes

* storagenode/piecestore: Reuse test piece upload

  Extract the repeated logic from several tests functions for uploading a
  test piece to a test helper function.

* uplink/piecestore: Implement client side method

  Implement the client side method of the new piecestore RPC function.

* storagenode/piecestore: Add test DeletePiece endpoint

  Implement a test for the DeletePiece new endpoint method.
2019-11-26 18:47:19 +01:00
Isaac Hess
6aeddf2f53
storagenode/pieces: Add Trash and RestoreTrash to piecestore (#3575)
* storagenode/pieces: Add Trash and RestoreTrash to piecestore

* Add index for expiration trash
2019-11-20 09:28:49 -07:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
paul cannon
bd89f51c66
Keep v0pieceinfo database isolated (#3364)
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go

..instead of filestore.Store. this will allow filestore.Store to become
unexported.

* unexport filestore.Store

rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).

* kill filestore.StoreForTest

now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
2019-11-13 13:15:31 -06:00
littleskunk
7eb6724c92
logging: unify logging around satellite ID, node ID and piece ID (#3491)
* logging: unify logging around satellite ID, node ID and piece ID

* unify segment index
2019-11-05 22:04:07 +01:00
Isaac Hess
1defd4dbfe
storagenode/piecestore: Respect config.MaxConcurrentRequests for drpc (#3402) 2019-10-28 13:12:49 -06:00
Isaac Hess
75412e54e5
storagenode/piecestore: Rename liveGRPCRequests back to liveRequests (#3354) 2019-10-23 13:43:43 -06:00
Isaac Hess
14c7648530
storagenode/piecestore: Only limit grpc requests (#3342) 2019-10-23 10:14:02 -06:00
Isaac Hess
ed6b88a12d piecestore: update usage before completing upload (#3286)
The upload code currently updates the usage in a deferred call to saveOrder().
The consequence is that in the success case, the RPC is completed before
the usage has been updated.

This change repurposes the deferred call to update usage in the
failure case, while explicitly updating the usage before completing the
RPC.

This fixes some test flakiness when using dRPC. gRPC waits until the final status is written before a Recv call completes, and the final status is written by the server after the handler function has exited. In practice this means that the client is blocked until the defer call is also finished. So this change will not change performance at all.

It has two advantages:

(1) It fixes test flakiness

and, more importantly:

(2) reduces the chances that someone will accidentally write a flaky test in the future
2019-10-15 20:17:17 -06:00
Simon Guindon
abb5b6c499
storagenode/piecestore: Fix to ignore both gRPC and dRPC EOF errors. (#3274)
* Fix to ignore both gRPC and dRPC EOF errors.

* Fix to ignore both gRPC and dRPC EOF errors.
2019-10-15 12:13:53 -04:00
littleskunk
96aeedcdee
OrderLimit/GracePeriod: Increase time window from 1h to 24h (#3255)
* OrderLimit/GracePeriod: Increase time window from 1h to 24h

* update satellite config lock
2019-10-13 17:40:24 +02:00
JT Olio
6ede140df1
pkg/rpc: defeat MITM attacks in most cases (#3215)
This change adds a trusted registry (via the source code) of node address to node id mappings (currently only for well known Satellites) to defeat MITM attacks to Satellites. It also extends the uplink UI such that when entering a satellite address by hand, a node id prefix can also be added to defeat MITM attacks with unknown satellites.

When running uplink setup, satellite addresses can now be of the form 12EayRS2V1k@us-central-1.tardigrade.io (not even using a full node id) to ensure that the peer contacted is the peer that was expected. When using a known satellite address, the known node ids are used if no override is provided.
2019-10-12 14:34:41 -06:00
littleskunk
b2e328f118 storagenode/dashboard: update online status (#3168) 2019-10-03 20:31:39 +02:00
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Jeff Wendling
0dcbd3dc08 bootstrap/satellite/certificate/storagenode: register drpc services
Change-Id: Id29f14b76a8c9cb2be31001b9a7a4356a4bda183
2019-09-12 15:09:46 -06:00
Egon Elbre
e5ac95b6e9 storagenode/inspector: fix TestInspectorStats flakyiness by waiting for requests to be handled (#3018) 2019-09-12 02:26:55 -07:00
Egon Elbre
a801fab66a
all: add archview annotations (#2964) 2019-09-10 16:24:16 +03:00
Bill Thorp
a250551b6d storagenode/piecestore + uplink/piecestore: return PieceHash and original OrderLimit during GET_REPAIR (#2775) 2019-08-26 14:57:41 -04:00
Maximillian von Briesen
65e2d2e711
storagenode/piecestore: ignore canceled errors on download (#2822)
* ignore canceled errors on piecestore endpoint download
2019-08-23 11:16:43 -04:00
Cameron
3d9441999a
storagenode/orders: add archive cleanup to orders service (#2821)
This PR introduces functionality for routine deletion of archived orders.

The user may specify an interval at which to run archive cleanup and a TTL for archived items. During each cleanup, all items that have reached the TTL are deleted

This archive cleanup job is combined with the order sender into a new combined orders service
2019-08-22 10:33:14 -04:00
Maximillian von Briesen
d83a965139
storagenode/piecestore: Add retain service on storagenode (#2785)
Add retain service on storagenode. This service runs retain jobs that have been queued by the storagenodes. Rather than running retain jobs during the grpc Retain() call, the grpc call queues a retain job to the retain service and returns immediately afterwards, removing a significant bottleneck in garbage collection.
2019-08-19 14:52:47 -04:00
Jess G
022f5d2e14
storagenode: add space used cache for pieces (#2753)
* add cache, update cache w/piece create/delete

* add service w/loop to cache to recalculate space used cache

* add piecestore cache to other sn svcs to use

* add table to persist the total space used

* rm cache where not needed

* rm stuff from sn svcs

* start fixing tests, changes per comments

* update commits

* add unit tests

* fix commiting before we write header bytes

* fix cache create test

* copy cache map, add started back to recalc

* fix test

* add test, update comments
2019-08-12 14:43:05 -07:00
paul cannon
17bdb5e9e5
move piece info into files (#2629)
Deprecate the pieceinfo database, and start storing piece info as a header to
piece files. Institute a "storage format version" concept allowing us to handle
pieces stored under multiple different types of storage. Add a piece_expirations
table which will still be used to track expiration times, so we can query it, but
which should be much smaller than the pieceinfo database would be for the
same number of pieces. (Only pieces with expiration times need to be stored in piece_expirations, and we don't need to store large byte blobs like the serialized
order limit, etc.) Use specialized names for accessing any functionality related
only to dealing with V0 pieces (e.g., `store.V0PieceInfo()`). Move SpaceUsed-
type functionality under the purview of the piece store. Add some generic
interfaces for traversing all blobs or all pieces. Add lots of tests.
2019-08-07 20:47:30 -05:00
Maximillian von Briesen
bdcb40fbc8
storagenode/storagenodedb: Add cursor to pieceInfo.GetPieceIDs (#2724) 2019-08-06 13:19:16 -04:00
JT Olio
28156d3573
storagenode: more live request tracking (#2699)
* storagenode/piecestore: track live requests together

Change-Id: I9ed44e4484b97bcbe076c222450c3449fe8b1075

* show grpc status codes in monkit failures

Change-Id: I68bc3a8d24a372e8147ef2a74636fc3e40fa799a

* small nit

Change-Id: I722b09345377b079e41c5a3dc86d7fd6232c9d24
2019-08-02 16:49:39 -06:00
Ivan Fraixedes
3cd477454f storagenode/piecestore: Make method unexported (#2674) 2019-07-31 10:13:39 -04:00
Ivan Fraixedes
abef20930f
storagenode: Report gRPC error when satellite is untrusted (#2658)
* storagenode/piecestore: Unexport endpoint method
  Make an exported endpoint method to be unexported because it's only used
  by the same package and makes easy to change without thinking in
  breaking changes.
* uplink/ecclient: Use structured logger
  Swap sugared logger by the normal structured logger for having the full
  stack traces of the error in the debug message.
* storagenode/piecestore: Send gRPC error codes upload
  Refactoring in the storagenode/piecestore to send gRPC status error codes
  when some of the methods involved by upload return an error.
  
  The uplink related to uploads has also been modified to retrieve the
  gRPC status code when an error is returned by the server.
2019-07-30 18:58:08 +02:00
Egon Elbre
5d0816430f
rename all the things (#2531)
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
2019-07-28 08:55:36 +03:00
Maximillian von Briesen
906c77b55a
Add RetainStatus to storagenode config (#2633)
--storage2.retain-status = "disabled" (default), "debug", or "enabled"
2019-07-26 16:49:08 -04:00
paul cannon
b9a17913fa storagenode/pieces: remove buffering from reading/writing and fix io.EOF bug (#2554) 2019-07-25 11:22:15 +03:00
Egon Elbre
f6f65a80d7
storagenode/trust: implement fetching peer identity without kademlia and endpoint (#2584) 2019-07-17 21:14:44 +03:00
paul cannon
0d1dce508e
ensure uplink is sending correct size with PieceHash (#2555)
If we verify that the size matches reality, we can then expect to use
the filesystem to store the piece size as used in the signed PieceHash
from the uplink. Otherwise, the uplink might send a garbage size value,
leaving the storagenode with no good way to verify the uplink signature
on the piece at a later date.

Also fix the code in uplink/piecestore/ so that it sends a valid size,
because it was being rude and sending 0.
2019-07-15 11:26:18 -04:00
Egon Elbre
d52f764e54
protocol: implement new piece signing and verification (#2525) 2019-07-11 16:51:40 -04:00
Maximillian von Briesen
8b507f3d73 Address concerns with storagenode Retain endpoint (#2527) 2019-07-11 16:04:21 -04:00
Fadila
fa1f5c8d7f garbage collection endpoint on storage node (#2424) 2019-07-10 09:41:47 -04:00
Alexander Leitner
1c5db71faf
Change protobuf expirations to use time.Time (#2509)
* Change protobuf expirations to use time.Time instead of timestamp.Timestamp
2019-07-09 17:54:00 -04:00
Michal Niewrzal
bbc25a2bf7 Drop SN certifiates table from DB (#2498) 2019-07-09 17:33:45 -04:00
JT Olio
65aa8f227f piecestore: pipeline chunks with orders (#2451) 2019-07-08 17:26:19 +03:00