Commit Graph

841 Commits

Author SHA1 Message Date
Egon Elbre
25b76fe63f storagenode/storagenodedb: use tagsql
Change-Id: Iba3b34a97b982deb4f72ce55517a294f249b6b55
2020-01-19 14:39:16 +02:00
Egon Elbre
59d06644b9 private/migrate: switch to tagsql
Also added temporary types withRebind and withTagTx,
which will be later removed. Currently they help to avoid
changing the whole codebase at the same time.

Change-Id: I7f07ba8f4709a23a463bfa67464628665a05808f
2020-01-19 14:39:16 +02:00
Moby von Briesen
273eb66fae cmd/storagenode,storagenode/preflight: add config flag to disable
storagenode database preflight check.

Disable preflight database check by default, and have the option to
enable it. This will allow us to enable it once it is definitely
working.

Also change the name of the config flag for preflight time sync.

Change-Id: Ie2e20f9e25dcb38794eafa7e1505e7c6ff287c99
2020-01-17 17:53:17 +00:00
Isaac Hess
614e04d055 storagenode/pieces: Cache inits trash info from db
On pieces usage cache init we now load the trash info from the db. Also
fixes a test that was masking the failure here.

Change-Id: I9ff7da5bc6c0f74cf0942e20931b40e0c88d70fa
2020-01-17 09:33:05 -07:00
Bill Thorp
6f2f97b313 storagenode/gracefulexit: broke worker deleteOnePieceOrAll into deleteOnePiece and deleteAllPieces and deletePiece
Change-Id: Ic3bd21e89fa71e962c2bb1c4943f4696bc4f83e5
2020-01-17 15:07:34 +00:00
Moby von Briesen
e115bc1903 cmd/storagenode;storagenode/storagenodedb: add preflight database check
for storagenode

Ensure that database schema matches latest test migration schema before
allowing the node to start up.

Ensure minimal read/write functionality for each storagenode database
before allowing the node to start up.

This will eliminate many unhandled audit errors we are seeing.

Change-Id: Ic0e628b04a9c35b7a8243f6a81d4683918170ba9
2020-01-16 18:44:46 +00:00
Egon Elbre
81d53b8097 storagenode/storagenodedb: fixes to row handling
Change-Id: I3813310b48337428f13678a9fcba5c8a0e0b2b2a
2020-01-16 15:08:37 +00:00
Yingrong Zhao
db8aee0806 satellite/contact; storagenode/preflight: add clock check on startup for storagenode
add config preflight.enabled-local-time

Change-Id: I7b942c9bee063aae409ee6721ae9d079dff0144f
2020-01-15 15:35:26 +00:00
Yingrong Zhao
07c2824d94 storagenode/gracefulexit: fix exit-status command output
When exit succeeded, cli should display `Y` in Successful column and `100%` in PercentComplete.

Change-Id: I6093eca207ecd618bb332af12e5e455bc8224dde
2020-01-15 14:58:15 +00:00
Egon Elbre
08f63614be private/context2: add WithoutCancellation
Change-Id: I38557c16f41b8983886f256353cc6afb7634d9e6
2020-01-15 14:23:46 +02:00
Egon Elbre
64fb2d3d2f Revert "dbutil: statically require all databases accesses to use contexts"
This reverts commit 8e242cd012.

Revert because lib/pq has known issues with context cancellation.
These issues need to be resolved before these changes can be merged.

Change-Id: I160af51dbc2d67c5449aafa406a403e5367bb555
2020-01-15 07:28:00 +00:00
JT Olio
8e242cd012 dbutil: statically require all databases accesses to use contexts
this will allow for some nice runtime analysis down the road.
also, this allows for wrapping database handles in a way that
can interact with these contexts

requires https://review.dev.storj.io/c/storj/dbx/+/514

Change-Id: Ib087b7cd73296dd2c1e0331314da34d861f61d2b
2020-01-14 18:20:47 -05:00
Egon Elbre
5af1f9e6d1 storagenode/{piecestore,storagenodedb}: use context in queries
In endpoint.saveOrder, ensure we always try to save orders such
that they can be settled.

Change-Id: Ic9ac8f4bf684d8493282912ca97f386c1762e364
2020-01-14 20:27:26 +00:00
Egon Elbre
d80cfeb4ab storagenode: ensure we don't eat the underlying error
When error is formatted using %v it's not possible to check
whether the error was caused by a context cancellation.

Change-Id: Ia77dfb0817e49d9a7b168c12a6300d131007d0ee
2020-01-14 20:26:23 +00:00
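Note on d80cfeb4ab: the `%v` problem described in that commit message can be reproduced with the standard library alone. The sketch below is not the storagenode code; it assumes Go 1.13+ error wrapping with `%w`, and the helper names (`flattened`, `wrapped`, `isCanceled`, `canceledCause`) are made up for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// canceledCause returns the error produced by an already-canceled context.
func canceledCause() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx.Err()
}

// flattened formats the cause with %v, which keeps only its text.
func flattened(cause error) error {
	return fmt.Errorf("save order: %v", cause)
}

// wrapped formats the cause with %w, which keeps it in the error chain.
func wrapped(cause error) error {
	return fmt.Errorf("save order: %w", cause)
}

// isCanceled reports whether err was caused by a context cancellation.
func isCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}

func main() {
	cause := canceledCause()
	fmt.Println("flattened:", isCanceled(flattened(cause)))
	fmt.Println("wrapped:", isCanceled(wrapped(cause)))
}
```

Both errors print the same text, but only the wrapped one can still be matched against `context.Canceled`.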
Egon Elbre
23e2664327 storagenode/inspector: return rpcstatus
Change-Id: I7e13b6dc8c9c3f4550f77885b1ef99662f5a5727
2020-01-14 20:24:46 +00:00
Egon Elbre
ff267168c5 private/migrate: add ctx argument
Change-Id: I3d65912d89261386413c494c7ed1576fed4dcaf4
2020-01-13 15:52:26 +02:00
Egon Elbre
c7b846589e private/dbutil/sqliteutil: add ctx argument
Change-Id: If1caa9cde746817e62cae32a152eeec81959129c
2020-01-13 15:03:30 +02:00
Qweder93
cf19e141e0 storagenode/notifications: return unread count and fix json id, list-notifications method fix
Change-Id: Ic56beac1f388d91a29c9e8266161715d09364520
2020-01-09 17:56:00 +00:00
Yingrong Zhao
ebeee58001 storagenode/gracefulexit: remove satellite entry when node fails precondition
Change-Id: I3c215170f10f0053e4f8718ee31d64d93f52ec80
2020-01-08 18:11:58 +00:00
Egon Elbre
082ec81714
uplink: move to storj.io/uplink (#3746) 2020-01-08 15:40:19 +02:00
paul cannon
0c88a7b475 private/migrate: use transactional helpers and not Begin()
This code needs to work against cockroachDB, so transactions must be retried
when a retryable error is returned. This change puts migrate
transactions into the dbutil.WithTx transactional helpers to achieve
this in the easiest way.

Change-Id: Ib930e82d55cb0257357a222ce9131e6e53372c03
2020-01-07 18:25:38 +00:00
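Note on 0c88a7b475: the retry behaviour that commit asks for can be sketched as a loop around the transaction body. This is only an illustration of the idea, not the `dbutil.WithTx` implementation, whose signature is not shown here. `withRetry`, `errRetryable`, `failNTimes`, and `runDemo` are hypothetical names, and `errRetryable` stands in for whatever the database driver reports as a retryable failure.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errRetryable stands in for a retryable transaction failure reported by
// the database driver (hypothetical sentinel for this sketch).
var errRetryable = errors.New("retryable transaction error")

// withRetry calls fn until it succeeds, fails with a non-retryable error,
// the context is done, or maxAttempts (assumed >= 1) is used up.
// It returns the number of attempts made and the last error.
// In a real helper, fn would begin, run, and commit one transaction.
func withRetry(ctx context.Context, maxAttempts int, fn func(context.Context) error) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = ctx.Err(); err != nil {
			return attempt - 1, err
		}
		err = fn(ctx)
		if !errors.Is(err, errRetryable) {
			return attempt, err // success (nil) or a permanent failure
		}
	}
	return maxAttempts, err
}

// failNTimes simulates a transaction that hits n retryable errors first.
func failNTimes(n int) func(context.Context) error {
	return func(context.Context) error {
		if n > 0 {
			n--
			return errRetryable
		}
		return nil
	}
}

// runDemo runs a simulated transaction with the given failure count.
func runDemo(failures, maxAttempts int) (int, error) {
	return withRetry(context.Background(), maxAttempts, failNTimes(failures))
}

func main() {
	attempts, err := runDemo(2, 5)
	fmt.Println(attempts, err)
}
```

Keeping the loop in one helper means each migration step does not need its own retry handling.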
Egon Elbre
f41d440944 all: reduce number of log messages
Remove starting up messages from peers. We expect all of them to start,
if they don't, then they should return an error why they don't start.
The only informative message is when a service is disabled.

When doing initial database setup then each migration step isn't
informative, hence print only a single line with the final version.

Also use shorter log scopes.

Change-Id: Ic8b61411df2eeae2a36d600a0c2fbc97a84a5b93
2020-01-06 19:03:46 +00:00
Egon Elbre
2680bae88c private/testplanet: remove dependency to uplink
Remove direct dependency on uplink.RSConfig, this simplifies
moving the config file without introducing weird dependencies.

Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec
2020-01-02 09:40:46 +00:00
Stefan Benten
758fe35aba
storagenode/orders: adding jitter to sending (#3725) 2019-12-30 21:35:26 +01:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Fadila
115b8b0fc8 storagenode/piecestore: delete several pieces in a single request
This is part of the deletion performance improvement.
See https://storjlabs.atlassian.net/browse/V3-3349

Change-Id: Idcd83a302f2bd5cc3299e1a4195a7e177f452599
2019-12-27 10:58:04 +00:00
Isaac Hess
7d1e28ea30 storagenode: Include trash space when calculating space used
This commit adds functionality to include the space used in the trash
directory when calculating available space on the node.

It also includes this trash value in the space used cache, with methods
to keep the cache up-to-date as files are trashed, restored, and
emptied.

As part of the commit, the RestoreTrash and EmptyTrash methods have
slightly changed signatures. RestoreTrash now also returns the keys that
were restored, while EmptyTrash also returns the total disk space
recovered. Each of these changes makes it possible to keep the cache
up-to-date and know how much space is being used/recovered.

Also changed is the signature of PieceStoreAccess.ContentSize method.
Previously this method returned only the content size of the blob,
removing the size of any header data. This method has been renamed
`Size` and returns both the full disk size and content size of the blob.
This allows us to only stat the file once, and in some instances (i.e.
cache) knowing the full file size is useful.

Note: This commit simply adds the trash size data to the piece size data
we were already collecting. The piece size data is not accurate for all
use-cases (e.g. because it does not contain piece header data); however,
this commit does not fix that problem. Now that the ContentSize (Size)
method returns the full size of the file, it should be easier to fix
this problem in a future commit.

Change-Id: I4a6cae09e262c8452a618116d1dc66b687f59f85
2019-12-23 19:07:03 -07:00
Egon Elbre
d55288cf68 pkg/rpc: replace methods with direct calls to pb
Change-Id: I8bd015d8d316a2c12c1daceca1d9fd257f6f57bc
2019-12-22 17:12:43 +02:00
Egon Elbre
006baa9ca6 pkg/rpc: remove drpc aliases
We need to split up pb package, which means we cannot have a core package
that depends on them.

Change-Id: I7f4f6fd82f89a51a9b2ad08bf2b1207253b8a215
2019-12-22 16:58:08 +02:00
Yingrong Zhao
6e71591b9b satellitedb;storagenodedb: remove unnecessary use of DB transactions in graceful exit
Change-Id: Ief0a28c6750c130896b48bfebfbea7fb3caa810f
2019-12-20 21:24:38 +00:00
Qweder93
e47ec84dee storagenode notification service and api added
Change-Id: I36898d7c43e1768e0cae0da8d83bb20b16f0cdde
2019-12-20 18:42:23 +00:00
Egon Elbre
afe05edff2 {storagenode,satellite}/gracefulexit: ensure workers finish their work
Fixes a data race caused by not waiting for workers to finish
before shutting down. Currently this ended up failing logging
because it was closed when test tried to write to it.

Change-Id: I074045cd83bbf49e658f51353aa7901e9a5d074b
2019-12-17 17:21:52 +02:00
Egon Elbre
7a36507a0a private/testcontext: ensure we call cleanup everywhere
Change-Id: Icb921144b651611d78f3736629430d05c3b8a7d3
2019-12-17 14:16:09 +00:00
littleskunk
08947e177d storagenode/garbagecollection: enable in production
Change-Id: I627b7a37ca4a85eb19936ca2c7ca907d7cc63f5b
2019-12-16 22:44:04 +00:00
Vitalii Shpital
53d9bc4530
storagenode/notifications: db created (#3707) 2019-12-16 19:59:01 +02:00
littleskunk
c2ea75208f
storagenode/orderdb: fix db lock
Change-Id: Id1add0ba7ae1b20bd98099bd4d3aff0fcfdd90c9
2019-12-15 23:41:22 +01:00
Andrew Harding
cb89496569 storagenode/trust: wire up list into pool
- also updated ping chore to pick up trust changes
- fixed small typo in blueprint
- fixed flags for storj-sim
- wired up changes to testplanet

Change-Id: I02982f3a63a1b4150b82a009ee126b25ed51917d
2019-12-13 20:32:50 +00:00
Andrew Harding
2867b6a466 storagenode/trust: list implementation
Change-Id: Ia886e84990efaf2c783f199741552a7a8ff41d4e
2019-12-12 17:15:47 +00:00
Jeff Wendling
fb8e78132d storagenodedb: reenable utccheck in tests
Change-Id: If7d64dd4ae58e4b656ff9122ae3195b2a5173cb3
2019-12-10 23:17:14 +00:00
Andrew Harding
5ed9373dba storagenode/trust: source entry cache
Implements a cache that can persist trust entries returned by sources

Change-Id: I72579e42e9f72d34a54b7510c9b665844f187314
2019-12-10 21:45:01 +00:00
Andrew Harding
715d97e3d8 storagenode/trust: rule and excluders
Change-Id: I84ed542e1ef3cfaa5cc3d3f631cdc295393bf978
2019-12-10 21:08:12 +00:00
Cameron Ayer
6fae361c31 replace planet.Start in tests with planet.Run
planet.Start starts a testplanet system, whereas planet.Run starts a testplanet
and runs a test against it with each DB backend (cockroach compat).

Change-Id: I39c9da26d9619ee69a2b718d24ab00271f9e9bc2
2019-12-10 16:55:54 +00:00
Andrew Harding
eb52ac623b storagenode/trust: source implementations
Change-Id: Ie36e79cc15257db88051f63e5b9463fd9d7b4736
2019-12-09 20:00:02 +00:00
Andrew Harding
7d0aadfeca storagenode/trust: satellite URL implementation
Satellite URL is a stricter form of the STORJ Node URL. It requires both
the ID and port specifier.

Change-Id: I7fd302064f864c1de8240a7915bf5263b898dfd1
2019-12-09 17:05:57 +00:00
littleskunk
9d1faeee58 storagenode/garbagecollection: increase MaxTimeSkew to be higher than satellite MaxCommitInterval
Change-Id: I86f8d0b44bea3aa005ff26d52588611c59df5e9a
2019-12-09 16:03:55 +00:00
Ethan Adams
9420fa9fc5 satellite/gracefulexit: Add graceful exit completed/failed receipt verification to satellite CLI (#3679) 2019-12-03 17:09:39 -05:00
Ivan Fraixedes
42c61138e8
storage: Improve doc comments delete methods (#3591)
Improve the documentation of several methods involved in the delete
operation to make clear their behavior without having to inspect their
logic.
2019-12-02 12:18:20 +01:00
Ivan Fraixedes
bf97ef06fc
storagenode: Add new endpoint to receive satellite requests for… (#3590)
* pkg/pg: Add new service function storage node

  Add a new service function to the storage node piece store for deleting
  pieces when satellites request them.

* storagenode/piecestore: Add endpoint to delete piece

  Add a new endpoint to receive from trusted satellites to delete a piece.

* private/testplanet: Fix storagenode mock

  Add to the storagenode mock the new endpoint method.

* proto.lock: Update it with the last protobuf changes

* storagenode/piecestore: Reuse test piece upload

  Extract the repeated logic from several tests functions for uploading a
  test piece to a test helper function.

* uplink/piecestore: Implement client side method

  Implement the client side method of the new piecestore RPC function.

* storagenode/piecestore: Add test DeletePiece endpoint

  Implement a test for the DeletePiece new endpoint method.
2019-11-26 18:47:19 +01:00
Yingrong Zhao
66f1a1680f
add completion receipt to exit-status cli command on storage node (#3650) 2019-11-26 12:32:26 -05:00
Isaac Hess
56f8fd2dd7
storagenode/pieces: Add EmptyTrash functionality (#3640)
* storagenode/pieces: Add EmptyTrash functionality

* storagenode/pieces: Fix err

* storagenode/pieces: Fix lint
2019-11-26 09:25:21 -07:00
Vitalii Shpital
038ac58600
web/storagenode: minimal allowed version view implemented (#3583) 2019-11-26 18:08:24 +02:00
littleskunk
8842b0c252 storagenode/gracefulexit: improve logging (#3633) 2019-11-21 21:10:02 -05:00
Rafael Antonio Ribeiro Gomes
2739771761
storagenode: add bandwidth metrics (#3623)
* storagenode: add bandwidth metrics

* remove unnecessary metric
2019-11-21 16:51:40 -03:00
Isaac Hess
6aeddf2f53
storagenode/pieces: Add Trash and RestoreTrash to piecestore (#3575)
* storagenode/pieces: Add Trash and RestoreTrash to piecestore

* Add index for expiration trash
2019-11-20 09:28:49 -07:00
Kaloyan Raev
6d728d6ea0
storagenode/collect: delete piece 24 hours after expiration (#3613) 2019-11-20 17:02:57 +02:00
Vitalii Shpital
61c8bcc9a6
web/storagenode: egress chart implemented (#3574) 2019-11-20 16:37:57 +02:00
Rafael Antonio Ribeiro Gomes
da39c71d35
storagenode: add new metric satellite.request (#3610)
* storagenode: add new metric satellite.request

* storagenode: metrics fixed

* switch from Counter to Meter
2019-11-19 18:11:31 -03:00
Ivan Fraixedes
8e1e4cc342
piecestore: Fix invalid comment and typos (#3604) 2019-11-19 16:30:48 +01:00
Nikolai Siedov
24318d74b3
storagenode/console: show satellite url in satellite selection (#3602) 2019-11-19 14:16:56 +02:00
Nikolai Siedov
0d35505fe1
SNOboard/console: router changed for gorillaMux, caching added (#3577) 2019-11-15 14:36:43 +02:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Egon Elbre
1a54007f1c
storagenode/storagenodedb: don't log opening of each database (#3571) 2019-11-14 17:08:16 +02:00
Egon Elbre
1e64006e32 lint: add staticcheck as a separate step (#3569) 2019-11-14 10:31:30 +02:00
paul cannon
bd89f51c66
Keep v0pieceinfo database isolated (#3364)
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go

..instead of filestore.Store. this will allow filestore.Store to become
unexported.

* unexport filestore.Store

rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).

* kill filestore.StoreForTest

now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
2019-11-13 13:15:31 -06:00
Yingrong Zhao
db8294cfba
storagenode/gracefulexit: get hash and limit using original piece ID (#3557) 2019-11-13 12:45:55 -05:00
Jeff Wendling
ebcd37c572 storagenode/contact: fix connection leak with contact checkin
Change-Id: If86002557144d5d8dbff939d2b6a2dfec6537577
2019-11-06 18:00:09 +00:00
littleskunk
7eb6724c92
logging: unify logging around satellite ID, node ID and piece ID (#3491)
* logging: unify logging around satellite ID, node ID and piece ID

* unify segment index
2019-11-05 22:04:07 +01:00
Maximillian von Briesen
257d3946d5
storagenode/gracefulexit: allow storagenodes to concurrently transfer pieces for graceful exit (#3478) 2019-11-05 10:33:44 -05:00
Jennifer Li Johnson
11f0ea3258
5s (#3477) 2019-11-04 16:20:31 -05:00
Jennifer Li Johnson
aa7d15a365
storagenode/contact: exponential backoff retries for pinging Satellites (#3372) 2019-11-04 16:03:21 -05:00
Jess G
5abb91afcf
satellite: change the Peer name to Core (#3472)
* change satellite.Peer name to Core

* change to Core in testplanet

* missed a few places

* keep shared stuff in peer.go to stay consistent with storj/docs
2019-11-04 11:01:02 -08:00
Isaac Hess
4d26d0a6a6 storagenode/pieces: Add migration from v0 piece to v1 piece (#3401) 2019-11-04 17:59:45 +01:00
Egon Elbre
87687938d1 storagenode/contact: fix panic in ping satellites (#3447) 2019-11-01 16:20:53 +01:00
Ethan Adams
43103ae13f
lower storage node counts in tests (#3427) 2019-10-31 10:57:54 -04:00
Jess G
4d85b11574
satellite/contact: improve errors in contact endpoints (#3356)
* improve errors in satellite contact endpoints

* add changes per CR comments

* update pingback method so it still updates node table

* fix err and returns

* fix zap logging to be better
2019-10-30 11:57:21 -07:00
Natalie Villasana
4878135068
satellite/gracefulexit, storagenode/gracefulexit: add timeouts (#3407) 2019-10-30 13:40:57 -04:00
Natalie Villasana
5453886231 satellite/repair, uplink/ecclient: remove unused expiration arg from ec.Repair and ec.putPiece (#3416) 2019-10-30 11:35:00 -04:00
Yingrong Zhao
3ee0b89f8f
storagenode/gracefulexit: delete pieces when receive Delete or Completed message from satellite (#3406) 2019-10-30 10:46:56 -04:00
Egon Elbre
65a8e0bcbc
{satellite,storagenode}/gracefulexit: clearer log messages (#3413) 2019-10-30 10:21:27 +02:00
Isaac Hess
1defd4dbfe
storagenode/piecestore: Respect config.MaxConcurrentRequests for drpc (#3402) 2019-10-28 13:12:49 -06:00
Ethan Adams
5b0398a718 storagenode/gracefulexit: Exclude finished exits from chore/worker processing. Fix update status bug (#3399) 2019-10-28 13:59:45 -04:00
Egon Elbre
93353df4d6
internal/sync2: make Fence accept context (#3393) 2019-10-28 16:04:31 +02:00
paul cannon
1469f7f41f
storagenode/contact: wait for UpdateSelf before start (#3332)
When the contact chore starts running before the monitor service has
provided any useful capacity data, the first outgoing contact has
not-very-helpful data for the satellite. This change causes the contact
chore to wait until capacity data is available. The wait should be quite
short in all reasonable cases: even when a node starts with a lot of
stored pieces and no cached spaceUsedDB data, new data will have been
calculated and cached by the call to
`peer.Storage2.CacheService.Init(ctx)` in `storagenode.cmdRun()` before
`peer.Run(ctx)`.

Change-Id: Ibc26d5c1fc10a23006c00bc3f13ff6cf71f8bf1d
2019-10-26 12:16:25 -05:00
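Note on 1469f7f41f: waiting for "capacity data is available" before the first contact is a one-shot gate. The sketch below shows one way such a gate could look; it is not the `sync2.Fence` implementation, and the `fence` type and its methods here are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// fence is a one-shot gate: waiters block until Release is called once.
type fence struct {
	once sync.Once
	done chan struct{}
}

func newFence() *fence { return &fence{done: make(chan struct{})} }

// Release opens the gate. Calling it more than once is safe.
func (f *fence) Release() { f.once.Do(func() { close(f.done) }) }

// Wait blocks until the gate is open or ctx is done.
// It returns true only if the gate was opened.
func (f *fence) Wait(ctx context.Context) bool {
	// Prefer the released state when both channels are ready.
	select {
	case <-f.done:
		return true
	default:
	}
	select {
	case <-f.done:
		return true
	case <-ctx.Done():
		return false
	}
}

// waitAfterRelease shows the normal path: data became available.
func waitAfterRelease() bool {
	f := newFence()
	f.Release()
	return f.Wait(context.Background())
}

// waitAfterCancel shows the shutdown path: the context ended first.
func waitAfterCancel() bool {
	f := newFence()
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return f.Wait(ctx)
}

func main() {
	f := newFence()
	go f.Release() // e.g. the monitor service after its first update
	fmt.Println("released:", f.Wait(context.Background()))
}
```

Accepting a context lets the waiting chore exit cleanly if the node shuts down before any capacity data arrives.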
Jeff Wendling
ed48e74e20 gracefulexit: fix build for drpc (#3387)
Change-Id: I335e9f8991a10c9e8a0737bc7c9ea3f04cbe2546
2019-10-26 15:53:35 +02:00
Maximillian von Briesen
6df4d7bc73
storagenode/gracefulexit + satellite/gracefulexit: add storagenode-side transfer validation (#3371)
* Make the exiting node check piece hashes, piece IDs, and piece hash signatures before relaying successful transfer data to the satellite.
* Enable immediate graceful exit failure for "successful" transfers that fail satellite-side validation.
* Move transfer piece logic in storagenode worker to separate function (to make the worker easier to understand)
2019-10-25 13:16:20 -04:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per piece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
Isaac Hess
75412e54e5
storagenode/piecestore: Rename liveGRPCRequests back to liveRequests (#3354) 2019-10-23 13:43:43 -06:00
Isaac Hess
14c7648530
storagenode/piecestore: Only limit grpc requests (#3342) 2019-10-23 10:14:02 -06:00
JT Olio
2c6fa3c5f8
pkg/rpc: remove read/write deadlines as a mechanism for request timeouts (#3335)
libuplink was incorrectly setting timeouts to 10 seconds still, but
should have been at least 10 minutes. the order sender was setting them
to 1 hour. we don't want timeouts in uplink-side logic as it establishes
a minimum rate on tcp streams.

instead of all of this, just use tcp keep alive. tcp keep alive packets are
sent every 15 seconds and if the peer stops responding the connection
dies. this is enabled by default with go. this will kill tcp connections
when they stop working.

Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b
2019-10-22 17:57:24 -06:00
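Note on 2c6fa3c5f8: the approach in that commit message relies on TCP keep-alive rather than per-request read/write deadlines. A minimal sketch of a dialer configured that way follows. It is not the `pkg/rpc` code; `newDialer` and `keepAliveSeconds` are hypothetical helpers, and the explicit 15-second value simply mirrors the interval the commit message mentions (leaving `KeepAlive` at zero asks Go to use its default).

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// newDialer returns a dialer that sets no overall timeout or deadline and
// relies on TCP keep-alive probes to detect dead peers.
func newDialer() *net.Dialer {
	return &net.Dialer{
		KeepAlive: 15 * time.Second, // interval between keep-alive probes
	}
}

// keepAliveSeconds reports the configured keep-alive interval in seconds.
func keepAliveSeconds(d *net.Dialer) int {
	return int(d.KeepAlive / time.Second)
}

func main() {
	d := newDialer()
	fmt.Println("keep-alive seconds:", keepAliveSeconds(d))
	fmt.Println("dial timeout set:", d.Timeout != 0)
}
```

With no deadline on the connection, a slow but live transfer is never cut off; only an unresponsive peer causes the connection to be dropped.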
Ethan Adams
3e0d12354a
storagenode/gracefulexit: Implement storage node graceful exit worker - part 1 (#3322) 2019-10-22 16:42:21 -04:00
paul cannon
5e78f4000b storagenode/pieces: remove old comment (#3334)
the reservedSpace member it's talking about was removed quite a while
ago.

Change-Id: I28433b2a44467376a408453d875c389656347cab
2019-10-22 12:51:51 +03:00
Bryan White
f468816f13
{internal/version,versioncontrol,cmd/storagenode-updater}: add rollout to storagenode updater (#3276) 2019-10-21 12:50:59 +02:00
Bryan White
243ba1cb17
{versioncontrol,internal/version,cmd/*}: refactor version control (#3253) 2019-10-20 09:56:23 +02:00
Yingrong Zhao
e5099f31f3
add context.Clean and correct rpc error code (#3295) 2019-10-16 13:50:01 -04:00
Isaac Hess
ed6b88a12d piecestore: update usage before completing upload (#3286)
The upload code currently updates the usage in a deferred call to saveOrder().
The consequence is that in the success case, the RPC is completed before
the usage has been updated.

This change repurposes the deferred call to update usage in the
failure case, while explicitly updating the usage before completing the
RPC.

This fixes some test flakiness when using dRPC. gRPC waits until the final status is written before a Recv call completes, and the final status is written by the server after the handler function has exited. In practice this means that the client is blocked until the defer call is also finished. So this change will not change performance at all.

It has two advantages:

(1) It fixes test flakiness

and, more importantly:

(2) reduces the chances that someone will accidentally write a flaky test in the future
2019-10-15 20:17:17 -06:00
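Note on ed6b88a12d: the ordering change in that commit can be illustrated with a toy handler. This sketch only models the control flow (explicit save before the response on success, deferred save on failure); it is not the piecestore endpoint, and `recorder`, `handleUpload`, and `trace` are made-up names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// recorder collects events in the order they happen.
type recorder struct{ events []string }

func (r *recorder) add(e string) { r.events = append(r.events, e) }

// handleUpload saves usage explicitly before responding on success and
// falls back to a deferred save on any failure path.
func handleUpload(r *recorder, fail bool) error {
	saved := false
	defer func() {
		if !saved {
			r.add("usage saved in defer")
		}
	}()

	if fail {
		return errors.New("upload interrupted")
	}

	r.add("usage saved explicitly")
	saved = true
	r.add("response sent")
	return nil
}

// trace runs one upload and returns the recorded event order.
func trace(fail bool) string {
	r := &recorder{}
	_ = handleUpload(r, fail)
	return strings.Join(r.events, " -> ")
}

func main() {
	fmt.Println("success:", trace(false))
	fmt.Println("failure:", trace(true))
}
```

On the success path the usage record exists before the client can observe the response, so a test that checks usage right after the upload returns no longer races with the deferred call.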
Yingrong Zhao
87e3764390
storagenode/cmd: add exit-status command for graceful exit (#3264)
* add exit-status command

* remove todo and fix format

* fix status display

* change startExit to exit progress

* fix linting error

* add successful column in exit progress

* fix test

* remove extra new line

* fix TYPOS

* format the percentage better
2019-10-15 18:07:32 -04:00
Andrew Harding
4962c6843e
piecestore: fix test flakiness around upload/download usage tracking (#3282) 2019-10-15 11:22:15 -06:00
Simon Guindon
abb5b6c499
storagenode/piecestore: Fix to ignore both gRPC and dRPC EOF errors. (#3274)
* Fix to ignore both gRPC and dRPC EOF errors.

* Fix to ignore both gRPC and dRPC EOF errors.
2019-10-15 12:13:53 -04:00
Ethan Adams
1ad2ba7e3e
storagenode/gracefulexit: Add graceful exit chore and worker. (#3262)
Adds graceful exit chore and worker for V3-2614
2019-10-15 11:29:47 -04:00
Jennifer Li Johnson
b185dbbee2
satellite/discovery: remove discovery related code (#3175) 2019-10-14 10:57:01 -04:00
littleskunk
96aeedcdee
OrderLimit/GracePeriod: Increase time window from 1h to 24h (#3255)
* OrderLimit/GracePeriod: Increase time window from 1h to 24h

* update satellite config lock
2019-10-13 17:40:24 +02:00
JT Olio
6ede140df1
pkg/rpc: defeat MITM attacks in most cases (#3215)
This change adds a trusted registry (via the source code) of node address to node id mappings (currently only for well known Satellites) to defeat MITM attacks to Satellites. It also extends the uplink UI such that when entering a satellite address by hand, a node id prefix can also be added to defeat MITM attacks with unknown satellites.

When running uplink setup, satellite addresses can now be of the form 12EayRS2V1k@us-central-1.tardigrade.io (not even using a full node id) to ensure that the peer contacted is the peer that was expected. When using a known satellite address, the known node ids are used if no override is provided.
2019-10-12 14:34:41 -06:00
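Note on 6ede140df1: the `prefix@host` address form described in that commit message can be sketched as a split followed by a prefix check against the ID the peer presents. This is an illustration only, not the `pkg/rpc` implementation. `splitSatelliteAddress` and `matchesPeer` are hypothetical helpers, and the full peer ID used in `main` is invented apart from the prefix quoted in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSatelliteAddress splits "idprefix@host" into its two parts.
// An address without "@" yields an empty prefix.
func splitSatelliteAddress(s string) (idPrefix, host string) {
	if i := strings.Index(s, "@"); i >= 0 {
		return s[:i], s[i+1:]
	}
	return "", s
}

// matchesPeer reports whether the ID presented by the peer starts with the
// expected prefix. An empty prefix never matches in this sketch.
func matchesPeer(idPrefix, peerID string) bool {
	return idPrefix != "" && strings.HasPrefix(peerID, idPrefix)
}

func main() {
	prefix, host := splitSatelliteAddress("12EayRS2V1k@us-central-1.tardigrade.io")
	fmt.Println(prefix, host)

	// Invented full ID for illustration; only the prefix comes from the log.
	peerID := "12EayRS2V1kEXAMPLEONLY"
	fmt.Println("expected peer:", matchesPeer(prefix, peerID))
	fmt.Println("other peer:", matchesPeer(prefix, "1wrongEXAMPLEONLY"))
}
```

A short prefix is easier to type by hand than a full node ID while still making it unlikely that an unexpected peer passes the check.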
Isaac Hess
e567f27634
storagenode/piecestore: Change test to use ioutil.ReadAll to attempt to reduce test flake (#3250) 2019-10-11 15:57:59 -06:00
Cameron
d17be58237 remove random sleep in storagenode contact (#3243) 2019-10-11 16:44:18 -04:00
Vitalii Shpital
78a71ad3b6
web/storagenode: node status updated (#3220) 2019-10-11 19:28:47 +03:00
Yingrong Zhao
743a0fc38b storagenode/cmd: create start graceful exit CLI (#3202) 2019-10-11 09:58:12 -04:00
littleskunk
d5b2e1ef89
storagenode/signature: Reject uploads with a timestamp too far in the future (#3194) 2019-10-08 13:09:46 +02:00
JT Olio
37491d0d32 storagenode: embed the console into the binary and makefile (#3164)
* web/storagenode: add package-lock.json
* storagenode: compile console into binary
2019-10-08 10:52:19 +02:00
Jennifer Li Johnson
7ceaabb18e
Delete Bootstrap and Kademlia (#2974) 2019-10-04 16:48:41 -04:00
Jeff Wendling
64e43e555e pkg/rpc: return context error if ready after DialContext fails
the net package does not make it easy to know if DialContext
failed because the context was done. it's important for some
of our tests that canceled contexts are detected as such, so
we accept the small race that's arguably correct (the context
must be canceled asynchronously) to ensure we always return
the context error if available.

Change-Id: I058064d5c666e5353b74fb5bd300bf7abe537ff5
2019-10-04 20:09:00 +00:00
Yaroslav Vorobiov
a11619e7f3
storagenode/console: use bandwidth monthly summary (#3183) 2019-10-04 09:29:25 -06:00
Yaroslav Vorobiov
4824ecdb8d storagenode/console: use bytes for remaining info (#3186) 2019-10-04 18:17:28 +03:00
littleskunk
b2e328f118 storagenode/dashboard: update online status (#3168) 2019-10-03 20:31:39 +02:00
Maximillian von Briesen
08ed50bcaa
satellite/metainfo: add commit interval to prevent long delays between order limit creation and segment commit (#3149) 2019-10-01 12:55:02 -04:00
Bill Thorp
89c59d06f9
storagenode/storagenodedb: add SQL receiver logic for graceful exit (#3067)
* added graceful exit db methods
2019-10-01 10:34:03 -04:00
Jennifer Li Johnson
755cbd4dce
storagenode/main: map aliases for kademlia config values (#3118) 2019-09-30 19:33:00 -04:00
Jennifer Li Johnson
29b96a666b
internal/testplanet: fix conn leak (#3132) 2019-09-27 09:47:57 -06:00
Isaac Hess
2c5e169888
storagenode/storagenodedb: Vacuum info.db to prepare for splitting storagenodedbs (#3134) 2019-09-27 07:55:51 -06:00
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Isaac Hess
580e511b4c
storagenode/storagenodedb: Migrate to separate dbs (#3081)
* storagenode/storagenodedb: Migrate to separate dbs

* storagenode/storagenodedb: Add migration to drop versions tables

* Put drop table statements into a transaction.

* Fix CI errors.

* Fix CI errors.

* Changes requested from PR feedback.

* storagenode/storagenodedb: fix tx commit
2019-09-23 12:36:46 -07:00
Jennifer Li Johnson
d2502bb51b Adds tests for kad replacement and restores kad operator configs (#3094)
* test that all nodes can check in with all satellites

* keep kademlia config

* add untrusted satellite test

* use getversion

* remove kademlia config changes in test-sim-backwards.sh

* add kademlia flags back to storj-sim storagenode

* reset kademlia flags in storagenode entrypoint
2019-09-20 16:02:23 -04:00
Jennifer Li Johnson
724bb44723
Remove Kademlia dependencies from Satellite and Storagenode (#2966)
What:

cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:

cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplanet/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia

Please describe the tests:
• internal/testplanet/planet_test.go:

TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:

TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:

TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).
2019-09-19 15:56:34 -04:00
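The start-up jitter mentioned in the performance note above can be sketched as follows. This is a minimal illustration, not the code from the PR: `firstCheckInDelay` and its parameters are hypothetical names, and the one-hour limit is taken from the description.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// firstCheckInDelay returns a pseudo-random delay in [0, limit) for the
// given seed. limit must be positive. Hypothetical helper, not from the PR.
func firstCheckInDelay(seed int64, limit time.Duration) time.Duration {
	r := rand.New(rand.NewSource(seed))
	return time.Duration(r.Int63n(int64(limit)))
}

func main() {
	// Each node would pick its own delay before the first check-in.
	d := firstCheckInDelay(time.Now().UnixNano(), time.Hour)
	fmt.Println(d >= 0 && d < time.Hour) // true
}
```

Spreading the first connection over a window keeps many nodes that start at the same time from contacting a satellite at once.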
Jess G
93788e5218
remove kademlia: create upsert query to update uptime (#2999)
* create upsert query for check-in method

* add tests

* fix lint err

* add benchmark test for db query

* fix lint and tests

* add a unit test, fix lint

* add address to tests

* replace print w/ b.Fatal

* refactor query per CR comments

* fix disqualified, only set if null

* fix query

* add version to updatecheckin query

* fix version

* fix tests

* change version for tests

* add version to tests

* add IP, add transport, mv unit test

* use node.address as arg

* add last ip

* fix lint
2019-09-19 11:37:31 -07:00
Simon Guindon
a2b1e9fa95
storagenode/storagenodedb: refactor both data access objects and migrations to support multiple DB connections (#3057)
* Split the info.db database into multiple DBs using Backup API.

* Remove location. Prev refactor assumed we would need this but don't.

* Added VACUUM to reclaim space after splitting storage node databases.

* Added unique names to SQLite3 connection hooks to fix testplanet.

* Moving DB closing to the migration step.

* Removing the closing of the versions DB. It's already getting closed.

* Swapping the database connection references on reconnect.

* Moved sqlite closing logic away from the boltdb closing logic.

* Moved sqlite closing logic away from the boltdb closing logic.

* Remove certificate and vouchers from DB split migration.

* Removed vouchers and bumped up the migration version.

* Use same constructor in tests for storage node databases.

* Use same constructor in tests for storage node databases.

* Adding method to access underlying SQL database connections and cleanup

* Adding logging for migration diagnostics.

* Moved migration closing database logic to minimize disk usage.

* Cleaning up error handling.

* Fix missing copyright.

* Fix linting error.

* Add test for migration 21 (#3012)

* Refactoring migration code into a nicer to use object.

* Refactoring migration code into a nicer to use object.

* Fixing broken migration test.

* Removed unnecessary code that is no longer needed now that we close DBs.

* Removed unnecessary code that is no longer needed now that we close DBs.

* Fixed bug where an invalid database path was being opened.

* Fixed linting errors.

* Renamed VersionsDB to LegacyInfoDB and refactored DB lookup keys.

* Renamed VersionsDB to LegacyInfoDB and refactored DB lookup keys.

* Fix migration test. NOTE: This change does not address new tables satellites and satellite_exit_progress

* Removing v22 migration to move into its own PR.

* Removing v22 migration to move into its own PR.

* Refactored schema, rebind and configure functions to be re-useable.

* Renamed LegacyInfoDB to DeprecatedInfoDB.

* Cleaned up closeDatabase function.

* Renamed storageNodeSQLDB to migratableDB.

* Switched from using errs.Combine() to errs.Group in closeDatabases func.

* Removed constructors from storage node data access objects.

* Reformatted usage of const.

* Fixed broken test snapshots.

* Fixed linting error.
2019-09-18 12:17:28 -04:00
Simon Guindon
91d54af705
Add satellites database business objects. (#3055)
* Add satellites database business objects.

* Fixed linting error.
2019-09-16 13:54:53 -04:00
Jess G
d3ef574b20 pkg/pb: minor changes to contact.proto (#3048)
* minor fixes to contact proto

* simplify and rm nodeAddr object from client
2019-09-13 19:37:32 -05:00
Egon Elbre
ca058e606f
storagenode/orders: fix data race in settle (#3042) 2019-09-13 15:50:39 +03:00
Cameron
44bcdd222f storagenode/contact: test node info is updated (#3039) 2019-09-13 07:53:48 -04:00
Jeff Wendling
0dcbd3dc08 bootstrap/satellite/certificate/storagenode: register drpc services
Change-Id: Id29f14b76a8c9cb2be31001b9a7a4356a4bda183
2019-09-12 15:09:46 -06:00
Cameron
ab1147afb6
storagenode/pieces: fix race condition in cache service (#2972)
* add mutex lock to PersistCacheTotals, move lock around copyCacheTotals to inside function
2019-09-12 12:42:39 -04:00
Egon Elbre
e5ac95b6e9 storagenode/inspector: fix TestInspectorStats flakiness by waiting for requests to be handled (#3018) 2019-09-12 02:26:55 -07:00
Yingrong Zhao
9f2f1527c5
storagenode/storagenodedb: add new tables for graceful exit (#3008)
* add database schema

* add migration

* change table name and update blueprint
2019-09-11 18:57:53 -04:00
Natalie Villasana
aa3567187e
satellite/audit: worker now verifies and reverifies (#2965) 2019-09-11 18:37:01 -04:00
paul cannon
c139ed8ea1 storagenode/console: remove kademlia (#2942)
this is a trivial operation for storagenode/console, as it doesn't
really need or use kademlia in the first place.

What:

Removes kademlia from storagenode/console

Why:

We are in the process of getting rid of kademlia, and this is one place where it's particularly easy.

Please describe the tests:

Existing tests exercise storagenode/console behavior; if they continue to work, everything here should be tested satisfactorily.
Please describe the performance impact:

None
2019-09-11 16:41:43 -04:00
Isaac Hess
7718802f0c
storagenode/storagenodedb: prepare for multiple databases (#3005)
* Migrate test: prepare for multiple databases

* Add copyright

* Fix unused variables

* Move data to testdata, split MultiDBSnapshot from MultiDBState
2019-09-11 14:31:46 -06:00
Isaac Hess
0b32572ae6
migrate: Allow work on separate dbs (#2996) 2019-09-10 13:42:23 -06:00
Jess G
2fc4d61610
implement contact.checkin method (#2952)
* implement contact.checkin method

* add batching to update uptime checks

* rm batching

* rm other unneeded things

* fix lint

* fix unit test

* changes per CR comments

* couple more CR changes

* add identity check into grpcOpt

* fix lint

* why do you fix the test

* revert test change

* stop contact chore for repair test

* put node in cache

* comment out contact chore. See what happens

* Revert "comment out contact chore. See what happens"

This reverts commit 2e45008e36a50e0a842ae455ac83de77093d4daa.

* try stopping contact earlier

* stop contact chore in uplink_test

* replace self on chore with *RoutingTable for access to latest node info

* Revert "stop contact chore in uplink_test"

This reverts commit 302db70f4071112d1b9f7ee0279225ea12757723.

* Revert "try stopping contact earlier"

This reverts commit 806cc3b82f9d598899dafd83da9315a1cb0cb43c.

* Revert "stop contact chore for repair test"

This reverts commit dd34de1cfdfc09b972186c9ab9a4f1e822446b79.
2019-09-10 09:05:07 -07:00
Egon Elbre
a801fab66a
all: add archview annotations (#2964) 2019-09-10 16:24:16 +03:00
Jennifer Li Johnson
3387750280 storagenode/contact: create chore for nodes to ping satellites (#2877)
Creates a chore for nodes to announce themselves to their trusted satellites. Runs on startup and every hour thereafter
2019-09-06 12:14:03 -04:00
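A chore that runs on startup and then on a fixed interval can be sketched like this. It is only an illustration of the pattern described above; `runChore` and `pingN` are hypothetical names, and the real chore pings trusted satellites rather than incrementing a counter.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runChore calls ping immediately and then once per interval until ctx is
// canceled. Hypothetical helper, not the storagenode/contact chore itself.
func runChore(ctx context.Context, interval time.Duration, ping func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for ctx.Err() == nil {
		ping()
		select {
		case <-ticker.C:
		case <-ctx.Done():
		}
	}
}

// pingN runs the chore with a short interval until n pings have happened.
func pingN(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	count := 0
	runChore(ctx, time.Millisecond, func() {
		count++
		if count == n {
			cancel()
		}
	})
	return count
}

func main() {
	fmt.Println(pingN(3)) // 3
}
```

In the commit the interval is one hour; a millisecond is used here only so the example finishes quickly.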
Yaroslav Vorobiov
c35ad5cbfc
storagenode/console: update api (#2969) 2019-09-06 15:01:03 +03:00
paul cannon
9821a21e5c satellite,storagenode,bootstrap: add contact service to peer (#2951)
* satellite,storagenode,bootstrap: add contact service to peer
2019-09-04 15:04:18 -04:00
paul cannon
adfa16188b pkg/contact: bare-bones service and endpoint (#2941)
* pkg/contact: bare-bones service and endpoint

* split contact package into satellite and node

* use new contact protobuf types
2019-09-04 11:29:34 -07:00
Yaroslav Vorobiov
f7403f97b0
storagenode/storageusage: add summary, rename timestamp to interval_start (#2911) 2019-09-04 17:13:43 +03:00
Yaroslav Vorobiov
758f7cb3dd
storagenode/bandwidth: remove bandwidth concerns from console, add satellite summary (#2923) 2019-09-04 17:01:55 +03:00
Michal Niewrzal
ee614bf032
storagenode: add custom dial timeout for orders sending (#2939) 2019-09-03 17:32:28 +02:00
Michal Niewrzal
3fbe31aada
storagenode: Increase order sending request timeout (#2930) 2019-09-02 13:24:02 +02:00
littleskunk
9d1910cb2b
storagenode/orderarchive: Reduce TTL from 45 to 7 days (#2915) 2019-08-29 22:38:09 +02:00
Ivan Fraixedes
537769d7fa
storagenode/orders: Don't return error Archiving unsent (#2903)
Don't return an error when archiving orders which aren't found in the DB,
because it causes the Storage Node send-orders cycle to stop.

This was applied in commit e47b8ed131,
but the last call to the orders.Archive function was missed, so errors
for orders not found weren't returned in the first call but were still
returned in the second call.

This commit addresses the second call so that the handleBatches function
never returns an error for orders that aren't found.
2019-08-29 20:22:22 +02:00
Egon Elbre
8a5db77e04
storagenode/retain: add comment (#2910) 2019-08-29 19:42:17 +03:00
Yaroslav Vorobiov
b4d7d6778f
storagenode/reputation: add disqualified flag (#2862) 2019-08-28 23:54:12 +03:00
Egon Elbre
62e3bf5b34 storagenode/retain: fix concurrency issues (#2828)
* nicer flags

* fix concurrency

* add concurrent workers

* initialize things

* fix tests

* close retain service

* ensure we don't have workers working on the same satellite

* ensure things compile

* fix other compilation issues:

* concurrency changes

ran this with `go test -count=1000` and it passed all of them.

- we add a closed channel so that we can select on it with
  context cancellation.
- we put a once in so we only close the channel once.
- every time the queue/running state changes, we have to broadcast
  because we may want to wake up N pending Wait calls or other
  concurrent workers.
- because we broadcast, we don't need to do the polling in Wait
  anymore.
- ensure Run doesn't start multiple times so that we don't have
  to worry about concurrent Close with multiple Runs.
- hold the lock while we start workers so that a concurrent Close
  with Run can't decide that there's nothing started and exit
  and then have Run start things.
- make sure to poll the closed/context channels through loops
  or at the start of Run calls in case Close happens first.
- these polls should be under a mutex because they have a default
  case which makes it possible to schedule such that Close hasn't
  executed the channel close so it starts more work.
- cancel a local Run context when it's going to exit to make sure
  that any retainPieces calls have a canceled context.
- hopefully enough comments to both check my work and help readers
  digest what's going on.

Change-Id: Ida0e226a7e01e8ae64fa2c59dd5a84b04bccfbd7

* use the retain error class

Change-Id: I1511eaef135f98afd57b878e997e4c8a0d11cafc

* concurrency fixes again

- forgot to update the gc test to use the old Wait api.
- we need to drop the lock while we wait for the workers
  to exit, because they may be blocked on the condition
  variable
- additionally, we need to broadcast when we close the
  signal channel because the state changed: they want
  to wake up and exit.

Change-Id: I4204699792275260cd912f29aa73720f7d9b14b5

* undo my misguided rename

Change-Id: I6baffe1eb0434e260212c485bbcc01bed3250881

* remove pollInterval

* format paragraph more nicely

* move skew calculation into retain pieces
2019-08-28 16:35:25 -04:00
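Some of the concurrency notes in the storagenode/retain commit above (a closed channel guarded by a once, broadcasting on every state change, polling the closed channel under the mutex) can be sketched as follows. This is a simplified illustration with hypothetical names, not the retain service code.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical queue guarded by a mutex and condition variable.
type worker struct {
	mu     sync.Mutex
	cond   *sync.Cond
	queue  []string
	closed chan struct{}
	once   sync.Once
}

func newWorker() *worker {
	w := &worker{closed: make(chan struct{})}
	w.cond = sync.NewCond(&w.mu)
	return w
}

// add queues an item and wakes every waiter because the state changed.
func (w *worker) add(item string) {
	w.mu.Lock()
	w.queue = append(w.queue, item)
	w.mu.Unlock()
	w.cond.Broadcast()
}

// next blocks until an item is available or the worker is closed.
// The closed channel is polled while holding the mutex.
func (w *worker) next() (string, bool) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for len(w.queue) == 0 {
		select {
		case <-w.closed:
			return "", false
		default:
		}
		w.cond.Wait()
	}
	item := w.queue[0]
	w.queue = w.queue[1:]
	return item, true
}

// close may be called many times; the once guards the channel close.
func (w *worker) close() {
	w.once.Do(func() {
		w.mu.Lock()
		close(w.closed)
		w.mu.Unlock()
	})
	w.cond.Broadcast()
}

func main() {
	w := newWorker()
	w.add("a")
	fmt.Println(w.next()) // a true

	done := make(chan bool)
	go func() { _, ok := w.next(); done <- ok }()
	w.close()
	w.close() // second call is a no-op for the channel
	fmt.Println(<-done) // false
}
```

Because closing happens under the same mutex as the poll, a waiter either sees the closed channel before it waits or is already waiting when the broadcast arrives.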
Bill Thorp
a250551b6d storagenode/piecestore + uplink/piecestore: return PieceHash and original OrderLimit during GET_REPAIR (#2775) 2019-08-26 14:57:41 -04:00
Egon Elbre
977472ed32 all: use fewer storage nodes to improve test memory usage (#2875)
* storagenode/inspector: use fewer storage nodes

* lib/uplinkc: use fewer storage nodes
2019-08-26 14:40:44 -04:00
Cameron
1f3537d4a9 storagenode/vouchers: remove storagenode vouchers (#2873) 2019-08-26 19:35:19 +03:00
Maximillian von Briesen
65e2d2e711
storagenode/piecestore: ignore canceled errors on download (#2822)
* ignore canceled errors on piecestore endpoint download
2019-08-23 11:16:43 -04:00
Cameron
3d9441999a
storagenode/orders: add archive cleanup to orders service (#2821)
This PR introduces functionality for routine deletion of archived orders.

The user may specify an interval at which to run archive cleanup and a TTL for archived items. During each cleanup, all items that have reached the TTL are deleted

This archive cleanup job is combined with the order sender into a new combined orders service
2019-08-22 10:33:14 -04:00
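The TTL rule described above (every archived item that has reached the TTL is deleted on each cleanup run) might look like this in isolation. The `archived` type and both functions are hypothetical; the real service deletes rows from the orders archive in the database.

```go
package main

import (
	"fmt"
	"time"
)

// archived is a hypothetical archived order record.
type archived struct {
	serial     string
	archivedAt time.Time
}

// cleanArchive keeps items archived after now-ttl and reports how many
// items were removed.
func cleanArchive(items []archived, now time.Time, ttl time.Duration) ([]archived, int) {
	cutoff := now.Add(-ttl)
	kept := make([]archived, 0, len(items))
	for _, it := range items {
		if it.archivedAt.After(cutoff) {
			kept = append(kept, it)
		}
	}
	return kept, len(items) - len(kept)
}

// demoCleanup runs one cleanup over a 10-day-old and a 1-day-old item
// with a 7-day TTL.
func demoCleanup() ([]string, int) {
	now := time.Date(2019, 8, 22, 0, 0, 0, 0, time.UTC)
	items := []archived{
		{serial: "old", archivedAt: now.Add(-10 * 24 * time.Hour)},
		{serial: "new", archivedAt: now.Add(-24 * time.Hour)},
	}
	kept, deleted := cleanArchive(items, now, 7*24*time.Hour)
	serials := make([]string, 0, len(kept))
	for _, it := range kept {
		serials = append(serials, it.serial)
	}
	return serials, deleted
}

func main() {
	fmt.Println(demoCleanup()) // [new] 1
}
```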
Egon Elbre
00b2e1a7d7 all: enable staticcheck (#2849)
* by having megacheck in disable it also disabled staticcheck

* fix closing body

* keep interfacer disabled

* hide bodies

* don't use deprecated func

* fix dead code

* fix potential overrun

* keep stylecheck disabled

* don't pass nil as context

* fix infinite recursion

* remove extraneous return

* fix data race

* use correct func

* ignore unused var

* remove unused consts
2019-08-22 13:40:15 +02:00
Jeff Wendling
14e36e4d60
storagenode/nodestats: fix issue on 32 bit platforms (#2841)
* storagenode/nodestats: fix issue on 32 bit platforms

time.Duration is an int64, so casting it down to an int
can cause it to become negative, causing a panic.

Change-Id: I33da7c29ddd59be60d8deec944a25f4a025902c7

* storagenode/nodestats: fix lint issue in test

Change-Id: Ie68598d724d2cae0dc959d4877098a08f4eb9af7
2019-08-21 18:57:44 -06:00
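The 32-bit issue described above can be reproduced on any platform by using `int32` as a stand-in for `int`. This is a sketch of the failure mode only; `clampToInt32` is a hypothetical helper and is not the fix applied in the commit.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// clampToInt32 converts a duration to int32 nanoseconds, saturating
// instead of wrapping around. Hypothetical helper, not from the commit.
func clampToInt32(d time.Duration) int32 {
	if d > math.MaxInt32 {
		return math.MaxInt32
	}
	if d < math.MinInt32 {
		return math.MinInt32
	}
	return int32(d)
}

func main() {
	d := 3 * time.Second // 3e9 nanoseconds, larger than math.MaxInt32

	// int32 stands in for int on a 32-bit platform.
	wrapped := int32(d)
	fmt.Println(wrapped < 0)     // true: the value wrapped around
	fmt.Println(clampToInt32(d)) // 2147483647
}
```

Any duration above roughly 2.1 seconds overflows a 32-bit int when expressed in nanoseconds, so keeping the value as an int64 or time.Duration avoids the problem.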
Egon Elbre
2d69d47655
all: fix Error.New formatting (#2840) 2019-08-21 19:30:29 +03:00
Simon Guindon
476fbf919a
storagenode/storagenodedb: refactor SQLite3 database connection initialization. (#2732)
* Rebasing changes against master.

* Added back withTx().

* Fix using new error type.

* Moving back database initialization back into the struct.

* Fix failing migration tests.

* Fix linting errors.

* Renamed database object names to be consistent.

* Fixing linting error in imports.

* Rebasing changes against master.

* Added back withTx().

* Fix using new error type.

* Moving back database initialization back into the struct.

* Fix failing migration tests.

* Fix linting errors.

* Renamed database object names to be consistent.

* Fixing linting error in imports.

* Adding missing change from merge.

* Fix error name.
2019-08-21 10:32:25 -04:00
Egon Elbre
9ec0ceddf3
pkg/revocation: ensure we close revocation databases (#2825) 2019-08-20 18:04:17 +03:00
littleskunk
6615350188 initialize used space table with sum over pieceinfo (#2818) 2019-08-20 08:13:18 -04:00
Isaac Hess
25154720bd
lib/uplink: remove redis and bolt dependencies (#2812)
* identity: remove redis and bolt dependencies

* identity: move revDB creation to main files
2019-08-19 16:10:38 -06:00
Maximillian von Briesen
d83a965139
storagenode/piecestore: Add retain service on storagenode (#2785)
Add retain service on storagenode. This service runs retain jobs that have been queued by the storagenodes. Rather than running retain jobs during the grpc Retain() call, the grpc call queues a retain job to the retain service and returns immediately afterwards, removing a significant bottleneck in garbage collection.
2019-08-19 14:52:47 -04:00
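The queue-and-return pattern described above can be sketched with a buffered channel and one background worker. All names here are hypothetical, and the real retain service does considerably more (see the later storagenode/retain concurrency fixes).

```go
package main

import (
	"fmt"
	"sync"
)

// job is a hypothetical retain request for one satellite.
type job struct{ satellite string }

// service queues jobs and processes them on a background worker.
type service struct {
	queue chan job
	wg    sync.WaitGroup
	mu    sync.Mutex
	done  []string
}

func newService(size int) *service {
	s := &service{queue: make(chan job, size)}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		for j := range s.queue {
			s.mu.Lock()
			s.done = append(s.done, j.satellite) // stand-in for the slow work
			s.mu.Unlock()
		}
	}()
	return s
}

// enqueue returns immediately; false means the queue is full.
func (s *service) enqueue(j job) bool {
	select {
	case s.queue <- j:
		return true
	default:
		return false
	}
}

// close stops accepting jobs and waits for the worker to drain the queue.
func (s *service) close() []string {
	close(s.queue)
	s.wg.Wait()
	return s.done
}

func main() {
	s := newService(4)
	fmt.Println(s.enqueue(job{"sat-a"}), s.enqueue(job{"sat-b"})) // true true
	fmt.Println(s.close())                                         // [sat-a sat-b]
}
```

The caller (here standing in for the Retain() RPC handler) only pays for the channel send, while the slow work happens on the worker goroutine.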
Ivan Fraixedes
546d099cf5
storagenode/orders: An invalid one doesn't have to stop all (#2804)
When an unsent order stored in the DB cannot be unmarshalled, the
remaining unsent orders must still be processed as usual.

This change prevents a Storage Node that has unsent orders with
invalid protobuf-serialized values from being blocked, unable to send
orders until the invalid ones are removed from the DB.
2019-08-16 17:33:51 +02:00
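The behavior described above (skip an entry that fails to unmarshal and keep processing the rest) can be sketched as follows. JSON is used here only as a stand-in for the protobuf encoding, and `order` and `decodeAll` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// order is a hypothetical, heavily simplified unsent order.
type order struct {
	Serial string `json:"serial"`
}

// decodeAll decodes every entry it can and counts the ones it had to skip,
// instead of stopping at the first invalid entry.
func decodeAll(raw []string) (orders []order, skipped int) {
	for _, r := range raw {
		var o order
		if err := json.Unmarshal([]byte(r), &o); err != nil {
			skipped++ // a real service would log the error here
			continue
		}
		orders = append(orders, o)
	}
	return orders, skipped
}

func main() {
	orders, skipped := decodeAll([]string{`{"serial":"a"}`, `not json`, `{"serial":"b"}`})
	fmt.Println(len(orders), skipped) // 2 1
}
```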
Ivan Fraixedes
e47b8ed131
storagenode: No FATAL error when unsent orders aren't found (#2801)
* pkg/process: Fatal show complete error information
  Change the general process execution function to not use the sugared
  logger, so that the full error information is output.
  Delete some unreachable code because Zap logger Fatal method calls exit
  1 internally.
* storagenode/storagenodedb: Add info to error
  Add more information to an error returned due to some data
  inconsistency.
* storagenode/orders: Don't use sugared logger
  Don't use sugar logger and provide better contextualized error messages
  in settle method.
* storagenode/orders: Add some log fields to error msgs
  Add some relevant log fields to some logged errors of the sender settle
  method.
* satellite/orders: Remove always nil error from debug
  Remove an error which was logged at debug level but was always nil, which
  makes the logic that used this variable clearer.
* storagenode/orders: Don't return error Archiving unsent
  Don't stop the process which archives unsent orders if some of them
  aren't found in the DB, because it causes the Storage Node to stop with a
  fatal error.
2019-08-16 16:53:22 +02:00
Cameron
497f10d7b1
add method CleanArchive to delete archived orders (#2796) 2019-08-15 12:56:33 -04:00
Yaroslav Vorobiov
141af7e2f7
storagenode/console: refactor service and api (#2751) 2019-08-14 15:17:11 +03:00
Ivan Fraixedes
26fb992474 storagenode: Add more test assertions (#2772) 2019-08-13 15:08:05 -04:00
Egon Elbre
9eba5ac631
lib/uplink: remove Seek method (#2768) 2019-08-13 20:29:02 +03:00
Egon Elbre
43cadc65e2 skip flaky test (#2769) 2019-08-13 08:06:28 -07:00
Ivan Fraixedes
89a8d32733
storagenode/pieces: Restore lost test case (#2767)
PR https://github.com/storj/storj/pull/2596 applied a refactoring which
moved tests out of the storagenodedb package, and it lost a test by
replacing it with another one which belonged to another package.

This commit removes the duplicated test and restores the lost one.
2019-08-13 14:57:05 +02:00
Jess G
022f5d2e14
storagenode: add space used cache for pieces (#2753)
* add cache, update cache w/piece create/delete

* add service w/loop to cache to recalculate space used cache

* add piecestore cache to other sn svcs to use

* add table to persist the total space used

* rm cache where not needed

* rm stuff from sn svcs

* start fixing tests, changes per comments

* update commits

* add unit tests

* fix committing before we write header bytes

* fix cache create test

* copy cache map, add started back to recalc

* fix test

* add test, update comments
2019-08-12 14:43:05 -07:00
Yaroslav Vorobiov
4cf2b96731
storagenode/nodestats: fix test duration (#2748) 2019-08-09 14:12:32 +03:00
Kaloyan Raev
9dccf59e8e
Restrict node info only for trusted satellites (#2737) 2019-08-09 12:21:41 +03:00
Yaroslav Vorobiov
28a7778e9e
storagenode/nodestats: cache node stats (#2543) 2019-08-08 16:47:04 +03:00
paul cannon
17bdb5e9e5
move piece info into files (#2629)
Deprecate the pieceinfo database, and start storing piece info as a header to
piece files. Institute a "storage format version" concept allowing us to handle
pieces stored under multiple different types of storage. Add a piece_expirations
table which will still be used to track expiration times, so we can query it, but
which should be much smaller than the pieceinfo database would be for the
same number of pieces. (Only pieces with expiration times need to be stored in piece_expirations, and we don't need to store large byte blobs like the serialized
order limit, etc.) Use specialized names for accessing any functionality related
only to dealing with V0 pieces (e.g., `store.V0PieceInfo()`). Move SpaceUsed-
type functionality under the purview of the piece store. Add some generic
interfaces for traversing all blobs or all pieces. Add lots of tests.
2019-08-07 20:47:30 -05:00
Maximillian von Briesen
bdcb40fbc8
storagenode/storagenodedb: Add cursor to pieceInfo.GetPieceIDs (#2724) 2019-08-06 13:19:16 -04:00
JT Olio
28156d3573
storagenode: more live request tracking (#2699)
* storagenode/piecestore: track live requests together

Change-Id: I9ed44e4484b97bcbe076c222450c3449fe8b1075

* show grpc status codes in monkit failures

Change-Id: I68bc3a8d24a372e8147ef2a74636fc3e40fa799a

* small nit

Change-Id: I722b09345377b079e41c5a3dc86d7fd6232c9d24
2019-08-02 16:49:39 -06:00
ethanadams
b74d4198f0
Use UTC date in TestCachedBandwidthMonthRollover (#2684) 2019-08-01 11:30:04 -04:00
Jeff Wendling
26a2fbb719 storagenode: batch archiving unsent_orders (#2507) 2019-07-31 19:40:08 +03:00
Egon Elbre
4f0d39cc64
don't use global loggers (#2675) 2019-07-31 17:38:44 +03:00
Ivan Fraixedes
3cd477454f storagenode/piecestore: Make method unexported (#2674) 2019-07-31 10:13:39 -04:00
ethanadams
cc7f5d2f82
check nil against Bandwidth service not DB (#2673) 2019-07-31 09:30:36 -04:00
Egon Elbre
ec3d5c0bdd
don't use global loggers (#2671)
* pkg/server: don't use global logger
* satellite/overlay: use correct logger
* pkg/kademlia: use correct logger
* linksharing: use conventional way to pass in logger
* use zaptest in tests
2019-07-31 15:09:45 +03:00
Ivan Fraixedes
abef20930f
storagenode: Report gRPC error when satellite is untrusted (#2658)
* storagenode/piecestore: Unexport endpoint method
  Make an exported endpoint method unexported because it's only used
  by the same package, which makes it easy to change without worrying about
  breaking changes.
* uplink/ecclient: Use structured logger
  Swap the sugared logger for the normal structured logger to get the full
  stack traces of the error in the debug message.
* storagenode/piecestore: Send gRPC error codes upload
  Refactoring in the storagenode/piecestore to send gRPC status error codes
  when some of the methods involved in upload return an error.
  
  The uplink related to uploads has also been modified to retrieve the
  gRPC status code when an error is returned by the server.
2019-07-30 18:58:08 +02:00
ethanadams
8f8b13abb9
Re-enable SN bandwidth rollups. Fix SN bandwidth rollup unique constraint issue. Re-organize service code (#2617)
* re-organizing into bandwidth service. re-enable rollup loop
* Prevent uniqueness failure in bandwidth rollup
* Add test to make sure the rollup select date range works correctly
* add bandwidth config for rollup interval
2019-07-29 10:07:52 -04:00
Egon Elbre
5d0816430f
rename all the things (#2531)
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
2019-07-28 08:55:36 +03:00
Maximillian von Briesen
906c77b55a
Add RetainStatus to storagenode config (#2633)
--storage2.retain-status = "disabled" (default), "debug", or "enabled"
2019-07-26 16:49:08 -04:00
paul cannon
b9a17913fa storagenode/pieces: remove buffering from reading/writing and fix io.EOF bug (#2554) 2019-07-25 11:22:15 +03:00
Natalie Villasana
f11413bc8e Implement garbage collection on satellite (#2577)
* Added a gc package at satellite/gc, which contains the gc.Service, which runs garbage collection integrated with the metainfoloop, and the gc PieceTracker, which implements the metainfo loop Observer interface and stores all of the filters (about which pieces are good) for each node.
* Added a gc config located at satellite/gc/service.go (loop disabled by default in release)
* Creates bloom filters with pieces to be retained inside the metainfo loop
* Sends RetainRequests (or filters with good piece ids) to all storage nodes.
2019-07-24 13:26:43 -04:00
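For readers unfamiliar with the bloom filters mentioned above, a toy version is sketched below. It is not the filter used by the satellite: the type, the sizes, and the FNV-based double hashing are all assumptions chosen for brevity. The property that matters for garbage collection is that a piece added to the filter is never reported as absent.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy bloom filter over string piece IDs (hypothetical type).
type bloom struct {
	bits   []bool
	hashes int
}

func newBloom(size, hashes int) *bloom {
	return &bloom{bits: make([]bool, size), hashes: hashes}
}

// indexes derives bit positions by double hashing one FNV-1a sum.
func (b *bloom) indexes(id string) []int {
	h := fnv.New64a()
	h.Write([]byte(id))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	out := make([]int, b.hashes)
	for i := range out {
		out[i] = int((h1 + uint32(i)*h2) % uint32(len(b.bits)))
	}
	return out
}

// add marks a piece as one to be retained.
func (b *bloom) add(id string) {
	for _, i := range b.indexes(id) {
		b.bits[i] = true
	}
}

// mayContain reports false only when the piece was definitely never added.
func (b *bloom) mayContain(id string) bool {
	for _, i := range b.indexes(id) {
		if !b.bits[i] {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(1024, 3)
	f.add("piece-1")
	fmt.Println(f.mayContain("piece-1"))                  // true
	fmt.Println(newBloom(1024, 3).mayContain("piece-2")) // false: empty filter
}
```

A storage node receiving such a filter can safely delete pieces for which `mayContain` is false; false positives only mean some garbage survives until a later run.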
Jess G
353b089927
update testplanet with libuplink (#2618)
* update testplanet uplink upload with libuplink

* add libuplink to testplanet download

* update createbucket and delete obj with libuplink

* update downloadStream, fix tests

* fix test

* updates for CR comments
2019-07-23 07:58:45 -07:00
ethanadams
5b73180f9b Disable bandwidth rollups until duplicate data issue is resolved (#2606)
* disable bandwidth rollups until duplicate data issue is resolved
* disable rollup to make sure summaries still work correctly without the rollup
2019-07-20 13:39:14 +03:00
Egon Elbre
13dd501042
storagenode/storagenodedb: move tests near the interface rather than the implementation (#2596) 2019-07-19 20:40:27 +03:00
Cameron
5f096a3eab
Remove GetValid, add GetAll to vouchers DB (#2594)
* refactor GetValid to GetAll

* update design doc
2019-07-19 10:52:44 -04:00
Maximillian von Briesen
2e3ad0ce2a Use trusted pool to get satellite addr in storagenode orders send (#2590)
* use trusted satellite service to get address for sending orders on storagenode
* remove kad from order sender
2019-07-18 13:15:09 -04:00
Cameron
848eb8c02f
remove kademlia from vouchers package (#2589)
* remove kademlia from vouchers package
2019-07-18 10:09:25 -04:00
Egon Elbre
f6f65a80d7
storagenode/trust: implement fetching peer identity without kademlia and endpoint (#2584) 2019-07-17 21:14:44 +03:00
Stefan Benten
38a40088c7 Remove orphaned tmp data from Storagenodes (#2582) 2019-07-17 16:00:37 +03:00
Yehor Butko
5f194b4533
SNO Dashboard API (#2427)
* SNO Dashboard API
2019-07-17 14:42:00 +03:00
Jeff Wendling
89afe3ee37
remove struct to ensure 64bit alignment for atomics (#2578)
Change-Id: Id2a4b740b2486a844673f69ce2e54c8c1e3187e2
2019-07-16 14:53:58 -06:00
Stefan Benten
de300e9235 Network Wipe (Pre Beta) (#2566) 2019-07-16 18:31:29 +02:00
ethanadams
d044613679
SN DB Optimization: Add rollups to bandwidth usage (#2541)
* V3-2119: Add storagenode bandwidth usage rollup
2019-07-16 10:58:58 -04:00
Jeff Wendling
b51d3a69da enable utccheck in all tests by default (#2565)
* enable utccheck in all tests by default
2019-07-16 09:42:19 +09:00
Jeff Wendling
b9d8ddaad1
storagenode: remove datetime calls in favor of UTC (#2557)
* storagenode: remove datetime calls in favor of UTC

datetime only has second level granularity whereas string
comparisons don't. Since we're wiping everything anyway, it's
easier to just use UTC everywhere rather than migrate to
datetime calls.

* add utcdb to check that arguments are utc

* storagenodedb: add trivial tests to ensure calls work

This at least tests that all of the timestamps passed in are
in the UTC timezone.

* fix truncated comment and change migrations to be UTC
2019-07-15 13:38:08 -04:00
paul cannon
0d1dce508e
ensure uplink is sending correct size with PieceHash (#2555)
If we verify that the size matches reality, we can then expect to use
the filesystem to store the piece size as used in the signed PieceHash
from the uplink. Otherwise, the uplink might send a garbage size value,
leaving the storagenode with no good way to verify the uplink signature
on the piece at a later date.

Also fix the code in uplink/piecestore/ so that it sends a valid size,
because it was being rude and sending 0.
2019-07-15 11:26:18 -04:00
Michal Niewrzal
5bec820145
Fix monkit leaking (#2553) 2019-07-13 11:04:54 -04:00
Jeff Wendling
a2418b22af storagenodedb: optimize index usage and queries (#2545)
- Drops some unused indexes
- Applies a computed index to timestamp columns
- Applies a partial index for expired pieces
- Uses BETWEEN to avoid some datetime calls
- Filters expired piece search by those that aren't NULL
2019-07-12 15:29:09 -04:00
Alexander Leitner
3cc45a02fb
Update vouchers.go (#2544) 2019-07-12 12:55:48 -04:00
Egon Elbre
d52f764e54
protocol: implement new piece signing and verification (#2525) 2019-07-11 16:51:40 -04:00
Maximillian von Briesen
8b507f3d73 Address concerns with storagenode Retain endpoint (#2527) 2019-07-11 16:04:21 -04:00
Jeff Wendling
02565db73a storagenode: migration to drop unused index and used_serials data (#2508) 2019-07-10 15:16:13 -04:00
ethanadams
f06aec06fb
Move int64s to top of struct to resolve alignment issue on ARM (#2521)
* move int64s to top of struct to resolve alignment issue on ARM
2019-07-10 13:47:22 -04:00
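The alignment rule behind this change: on 32-bit platforms such as ARM, 64-bit atomic operations need 64-bit aligned addresses, and Go only guarantees that for the first word of an allocated struct. A sketch with a hypothetical `stats` struct:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// stats is a hypothetical counter struct. The int64 fields come first so
// they stay 64-bit aligned on 32-bit platforms such as ARM.
type stats struct {
	used      int64
	available int64
	ready     bool
}

// addUsed atomically adds n to the used counter and returns the new value.
func (s *stats) addUsed(n int64) int64 {
	return atomic.AddInt64(&s.used, n)
}

func main() {
	s := &stats{}
	s.addUsed(10)
	fmt.Println(s.addUsed(5)) // 15

	// Field offsets: both counters sit on 8-byte boundaries.
	fmt.Println(unsafe.Offsetof(stats{}.used), unsafe.Offsetof(stats{}.available)) // 0 8
}
```

If a smaller field such as `ready` were declared first, the int64 fields could land at a 4-byte offset on 32-bit builds and the atomic calls could panic.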
Jeff Wendling
7886a4d7b9 storagenodedb: use datetime functions in sqlite queries (#2512)
This way comparison happens on the actual time rather than the
string representation of the time which may change depending on
the time zone.
2019-07-10 10:47:59 -04:00
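The problem with comparing string representations of times can be shown in plain Go, without SQLite. The function name and the fixed UTC-4 zone below are illustrative assumptions; the point is that the same instant formats differently per zone unless it is normalized to UTC first.

```go
package main

import (
	"fmt"
	"time"
)

// compareAcrossZones compares one instant expressed in two time zones:
// by formatted string, by instant, and by string after converting to UTC.
func compareAcrossZones() (byString, byInstant, byStringUTC bool) {
	utc := time.Date(2019, 7, 10, 14, 47, 59, 0, time.UTC)
	local := utc.In(time.FixedZone("UTC-4", -4*60*60))

	byString = utc.Format(time.RFC3339) == local.Format(time.RFC3339)
	byInstant = utc.Equal(local)
	byStringUTC = utc.Format(time.RFC3339) == local.UTC().Format(time.RFC3339)
	return byString, byInstant, byStringUTC
}

func main() {
	fmt.Println(compareAcrossZones()) // false true true
}
```

Either comparing actual times or storing everything in UTC removes the mismatch.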
Fadila
fa1f5c8d7f garbage collection endpoint on storage node (#2424) 2019-07-10 09:41:47 -04:00
Alexander Leitner
1c5db71faf
Change protobuf expirations to use time.Time (#2509)
* Change protobuf expirations to use time.Time instead of timestamp.Timestamp
2019-07-09 17:54:00 -04:00
Michal Niewrzal
bbc25a2bf7 Drop SN certificates table from DB (#2498) 2019-07-09 17:33:45 -04:00
Jeff Wendling
d616be8ae0 storagenode: use minimum time in the order for expiration (#2504) 2019-07-09 17:16:30 -04:00
ethanadams
4faed2098d
method name changer per PR 2469 (#2494) 2019-07-09 16:05:14 -04:00
ethanadams
ff6f1d1b32
storagenode: add in-memory tracking for bandwidth and disk usage (#2469)
* Add in-memory cache for bandwidth and space usage monitoring

* moved some structs around and added error handling for get piece size query

* added to existing bandwidth test.  fixed typo

* added test, updates from PR review, added monkit for new methods

* PR review updates. renamed space used methods

* changed bw cache so that only Add updates the cache and it only overwrites when the date moves forward

* moved bandwidth usage to bw and space usage to pieceinfodb

* fixed interface comment

* removed pointer from sync.Once
2019-07-08 20:33:50 -04:00
Alexander Leitner
3587e1a579 Change pointerdb pointer to use time.Time for Creation date (#2483) 2019-07-09 00:16:50 +02:00
Alexander Leitner
dcf8e2936b
Update vouchers to use time.Time instead of timestamp (#2478)
* Update vouchers to use time.Time instead of timestamp
2019-07-08 13:07:30 -04:00
Yaroslav Vorobiov
524eb24c83 storagenode/nodestats: combine stats into single RPC call (#2455)
* change satellite nodestats endpoint
2019-07-08 17:33:43 +03:00
JT Olio
65aa8f227f piecestore: pipeline chunks with orders (#2451) 2019-07-08 17:26:19 +03:00
Alexander Leitner
88732188cb
Update inspector timestamp to time.Time (#2464)
* Update inspector timestamp
2019-07-08 10:06:12 -04:00
Fadila
3f4662598e
storagenode/piecestore: add piece_creation field (#2441) 2019-07-08 09:22:36 +02:00
Yaroslav Vorobiov
7aca0eb284 storagenode/dashboard: show console address (#2456) 2019-07-06 15:40:58 +02:00
Yaroslav Vorobiov
ce4b997623
storagenode/nodestats: connection leak (#2443) 2019-07-04 13:34:23 +03:00
Yaroslav Vorobiov
5557d557f9 storagenode/consoledb: fix daily bandwidth query (#2446)
* storagenode/consoledb fix daily bandwidth query
2019-07-03 15:08:40 -04:00
Cameron
d499d162f4
implement storj.NodeURL in trusted satellites (#2388)
* implement storj.NodeURL in trusted satellites
2019-07-03 13:29:18 -04:00
Michal Niewrzal
61dfa61e3a
Add timestamp and piece size to piece hash (#2198) 2019-07-03 18:14:37 +02:00
Egon Elbre
38f3d860a4
storagenode: decline uploads when there are too many live requests (#2397) 2019-07-03 16:47:55 +03:00
ethanadams
47e4584fbe
V3-1989: Storage node database is locked for several minutes while submitting orders (#2410)
* remove infodb locks and give a unique name for each in memory created.

* changed max idle and open to 1 for memory DBs.  fixes table locking errors

* fixed race condition

* added file based infodb test

* added busy timeout parameter to the file based infodb for testing

* fixed imports

* removed db.locked() after merge from master
2019-07-02 17:23:02 -04:00
Alexander Leitner
6d55bbdb57
OrderLimit creation date time limit (#2412)
* Limit by order creation
2019-07-02 12:06:12 -04:00
Yaroslav Vorobiov
9e8ecb6303
Storagenode nodestats at daily space usage (#2422) 2019-07-02 15:05:58 +03:00
Yaroslav Vorobiov
16cd1fde87 Storagenode add daily bandwidth usage query for SNO (#2348) 2019-07-02 11:53:39 +02:00
Jennifer Li Johnson
699ccea19f
Creates Routing Table Antechamber (#2318) 2019-07-01 17:20:19 -04:00
Egon Elbre
385c046723
pkg/pb: rename Order2 to Order, OrderLimit2 to OrderLimit (#2406) 2019-07-01 18:54:11 +03:00
Egon Elbre
2b68a72428
internal/testplanet: ensure that metainfo connections get closed (#2381) 2019-07-01 17:35:10 +03:00
Egon Elbre
8a59999537 Revert "miscommit add debug info"
This reverts commit 512f3fa93e.
2019-07-01 12:36:35 +03:00
Egon Elbre
512f3fa93e add debug info 2019-07-01 12:33:03 +03:00
nerdatwork
477ac876af Fix typo in sender.go (#2395) 2019-06-30 15:02:12 +02:00
Egon Elbre
e83ebd7cde
jenkins: avoid using goimports and distribute load better (#2359) 2019-06-27 21:52:50 +03:00
Cameron
261750252a
edit voucher denied message (#2362) 2019-06-27 14:15:41 -04:00
Yaroslav Vorobiov
a6db2dd332
Storagenode add nodestats client for SNO console (#2287) 2019-06-26 21:55:22 +03:00
Egon Elbre
615bfca135 Fix TestGetSignee flakiness (#2350)
* add IsCanceled

* fixes to error handling

* fix imports

* retrigger jenkins
2019-06-26 09:30:37 -06:00
Yehor Butko
8bf7c5c671
SNO Dashboard http status codes updated (#2333) 2019-06-26 16:36:47 +03:00
JT Olio
fbe9696e92 pkg/kademlia: clean up peer discovery (#2252) 2019-06-26 16:16:46 +03:00
Egon Elbre
b6ad3e9c9f
internal/testrand: new package for random data (#2282) 2019-06-26 13:38:51 +03:00
Egon Elbre
c7679b9b30
Fix some leaks and add notes about close handling (#2334) 2019-06-25 23:00:51 +03:00
Stefan Benten
1f58708910 Delete all Tardigrade Satellite Data from SNO's (#2324) 2019-06-25 13:10:56 +02:00
Egon Elbre
6502143e79
fix import ordering (#2322) 2019-06-25 12:46:29 +03:00
Cameron
b3da72c21c
Fix TestVoucherService (#2317)
* change parameters in update stats to fix test issue
2019-06-24 18:29:24 -04:00
Yehor Butko
96bc0ccfa4
SNO Dashboard initial api endpoint added (#2284)
* initial api endpoint added
2019-06-24 18:15:31 +03:00
Cameron
1283036e37
add storage node voucher request service (#2158)
* add voucher service on storage node

* config field tag syntax, go routines for requests

* hook up voucher service in storagenode/peer.go

* add voucher config to testplanet

* add voucher config to testplanet

* add voucher response status INVALID, ACCEPTED, REJECTED

* add a test for vouchers service

* handle no row from GetValid, test it

* add trust pool to voucher service

* use trusted list to get satellites

* verify vouchers upon receipt

* test VerifyVoucher
2019-06-21 18:48:52 -04:00
Egon Elbre
23e081f0c7 storagenode: delete piece when upload is cancelled (#2286)
* storagenode: delete piece when upload is cancelled

* don't delete when piece info has been committed
2019-06-21 18:16:39 +02:00
Yehor Butko
e5fd0287e4
V3-1819 Storage node operator server and service started (#2112)
* V3-1819 Storage node operator server and service started
2019-06-20 14:52:32 +03:00
Kaloyan Raev
964c87c476 Fix checks around repair threshold (#2246) 2019-06-19 22:13:11 +02:00
littleskunk
b1e5cf1200
add index on pieceinfo expiration for faster GetExpired calls (#2220)
* add index on pieceinfo expiration for faster GetExpired calls

* Add Migration File
2019-06-18 01:27:14 +02:00
littleskunk
b8bced690c
improve logging (#2219) 2019-06-18 00:38:52 +02:00
Egon Elbre
1a1a084477
testcontext: sanitize folder name (#2195) 2019-06-13 15:46:08 +03:00
Kaloyan Raev
252c8ac189
Add email to self node info (#2171) 2019-06-11 16:30:28 +03:00
littleskunk
5e7fed7541 improve logging (#2170)
Signed-off-by: littleskunk <jens.heimbuerge@googlemail.com>
2019-06-10 21:30:17 -06:00
Stefan Benten
74484fc57e
Enforce our Minimum Requirements for Node Operators and sanity check them (#2155) 2019-06-10 12:14:50 +02:00
Cameron
23587bba0c
Storagenode vouchers table (#2121)
* add vouchers table with methods
2019-06-07 16:20:34 -04:00
Egon Elbre
03fece56de
Ensure Storage Nodes collect expired used serial numbers (#2143) 2019-06-06 22:06:31 +03:00
JT Olio
ccb158c99b
pkg/auth: add monkit task to missing places (#2123)
What: add monkit.Task to a bunch of functions that are missing it

Why: this will significantly help our instrumentation, data collection, and tracing about what's going on in the network
2019-06-05 07:47:01 -06:00
JT Olio
679cdfda9e storage/filestore: add monkit task to missing places (#2124)
Change-Id: I61f8056617a8d0596a00c232a6d0b330909e08f9
2019-06-05 15:06:06 +02:00
JT Olio
d02427e41a db: set max open conns, conn max lifetime, add db stat monitoring (#2117) 2019-06-04 23:30:21 +02:00
JT Olio
d6c02fc657 storagenode/piecestore: add meters to upload/download rates (#2106) 2019-06-04 15:22:00 +02:00
JT Olio
c4bb84f209 storagenode: add monkit task to missing places (#2107) 2019-06-04 14:31:38 +02:00
Kaloyan Raev
2ab95b533e
Check errors for possible outcomes from audit's DownloadShares (#2072) 2019-06-03 12:17:09 +03:00
Maximillian von Briesen
04c20b0ac0
Add monkit stats to piecestore upload/download (#2078) 2019-05-30 11:44:47 -04:00
Brandon Iglesias
771271e7b8 updating help message on storage node for AllocatedBandwidth (#2074) 2019-05-29 13:40:04 -04:00
ethanadams
268dc6b7e4
Enable gocritic linter (#2051)
* first round cleanup based on go-critic

* more issues resolved for ifelsechain and unlambda checks

* updated from master and gocritic found a new ifElseChain issue

* disable appendAssign. it reports false positives

* re-enabled go-critic appendAssign and disabled lint check at code line level

* fixed go-critic lint error

* fixed // nolint add gocritic specifically
2019-05-29 09:14:25 -04:00
Egon Elbre
9c23c2d427 db: set max idle connections higher to avoid redialing all the time (#1991) 2019-05-21 17:30:06 +03:00
JT Olio
32b3f8fef0 cmd/storagenode: pull more things into releaseDefaults (#1980) 2019-05-21 13:48:47 +02:00
littleskunk
910eb5d2c7 improve logging (#1987) 2019-05-17 19:02:39 +02:00
Marc Schubert
2509a4c98a Update readwrite.go (#1950)
Correcting wrong doc string.
2019-05-13 13:53:14 +02:00
Egon Elbre
a2b61fd67c
storage node collector (#1913) 2019-05-08 14:11:59 +03:00
Stefan Benten
ac452a5819
Add Network Wipe for Storagenodes (#1909)
* Add Network Wipe Migration to InfoDB

* Remove New Data Section
2019-05-07 22:05:50 +02:00
Stefan Benten
eeb4f6541e Update Calculation to include the used space for the allocation (#1899) 2019-05-06 14:59:30 -04:00
Bill Thorp
6ece4f11ad
moved invalid/offline back into SQL (#1838)
* moved invalid/offline back into SQL, removed GetAll()
2019-05-01 09:45:52 -04:00
Egon Elbre
60c4c10c79
storagenode: delete psserver (#1837) 2019-04-26 08:17:18 +03:00
Egon Elbre
7ba1a2bc53
storagenode: getting signee cert always asked the kademlia network (#1823) 2019-04-24 11:13:48 +03:00
Egon Elbre
f7ed63a119
handle database error checks properly (#1796) 2019-04-23 14:13:57 +03:00
Kaloyan Raev
8fc5fe1d6f
Refactor pb.Node protobuf (#1785) 2019-04-22 12:07:50 +03:00
Egon Elbre
5b3c146d8a
Check context cancellation more nicely (#1752) 2019-04-17 13:09:44 +03:00
Michal Niewrzal
e922f71f61 Piecestore space and bandwidth allocation check (#1527)
* Space and bandwidth allocation check

* use proper config flag

* one more check + tests

* use monitor to check space and bandwidth

* remove unused field

* check during read/write

* fix linter

* fix pieceid

* remove unused methods

* revert unneeded change
2019-04-15 12:12:22 +02:00
Egon Elbre
a1fd7cb6b4 don't cache failed satellite certificate fetch attempt (#1745)
* don't cache failing error

* fix test

* fix linter errors
2019-04-12 16:28:27 +02:00
Natalie Villasana
a73fd1aa9b
no longer includes totalUsedSpaceOld in used space calculation (#1736) 2019-04-10 11:55:06 -04:00
Stefan Benten
bae4c820ee
Add Version Information into KAD Network and SatelliteDB & Change Selection Process (#1648)
* Initial Webserver Draft for Version Controlling

* Rename type to avoid confusion

* Move Function Calls into Version Package

* Fix Linting and Language Typos

* Fix Linting and Spelling Mistakes

* Include Copyright

* Include Copyright

* Adjust Version-Control Server to return list of Versions

* Linting

* Improve Request Handling and Readability

* Add Configuration File Option
Add Systemd Service file

* Add Logging to File

* Smaller Changes

* Add Semantic Versioning and refuses outdated Software from Startup (#1612)

* implements internal Semantic Version library

* adds version logging + reporting to process

* Advance SemVer struct for easier handling

* Add Accepted Version Store

* Fix Function

* Restructure

* Type Conversion

* Handle Version String properly

* Add Note about array index

* Set temporary Default Version

* Add Copyright

* Adding Version to Dashboard

* Adding Version Info Log

* Renaming and adding CheckerProcess

* Iteration Sync

* Iteration V2

* linting

* made LogAndReportVersion a go routine

* Refactor to Go Routine

* Add Context to Go Routine and allow Operation if Lookup to Control Server fails

* Handle Unmarshal properly

* Linting

* Relocate Version Checks

* Relocating Version Check and specified default Version for now

* Linting Error Prevention

* Refuse Startup on outdated Version

* Add Startup Check Function

* Straighten Logging

* Dont force Shutdown if --dev flag is set

* Create full Service/Peer Structure for ControlServer

* Linting

* Straighting Naming

* Finish VersionControl Service Layout

* Improve Error Handling

* Change Listening Address

* Move Checker Function

* Remove VersionControl Peer

* Linting

* Linting

* Create VersionClient Service

* Renaming

* Add Version Client to Peer Definitions

* Linting and Renaming

* Linting

* Remove Transport Checks for now

* Move to Client Side Flag

* Remove check

* Linting

* Transport Client Version Intro

* Adding Version Client to Transport Client

* Add missing parameter

* Adding Version Check, to set Allowed = true

* Set Default to true, testing

* Restructuring Code

* Uplink Changes

* Add more proper Defaults

* Renaming of Version struct

* Dont pass Service use Pointer

* Set Defaults for Versioning Checks

* Put HTTP Server in go routine

* Add Versioncontrol to Storj-Sim

* Testplanet Fixes

* Linting

* Add Error Handling and new Server Struct

* Move Lock slightly

* Reduce Race Potentials

* Remove unnecessary files

* Linting

* Add Proper Transport Handling

* small fixes

* add fence for allowed check

* Add Startup Version Check and Service Naming

* make errormessage private

* Add Comments about VersionedClient

* Linting

* Remove Checks that refuse outgoing connections

* Remove release cmd

* Add Release Script

* Linting

* Update to use correct Values

* Change Timestamp handling

* Adding Protobuf changes back in

* Adding SatelliteDB Changes and adding Storj Node Version to PB

* Add Migration Table

* Add Default Stats for Creation

* Move to BigInt

* Proper SQL Migration

* Ensure minimum Version is passed to the node selection

* Linting...

* Remove VersionedClient and adjust smaller changes from prior merge

* Linting

* Fix PB Message Handling and Query for Node Selection

* some future-proofing type changes

Change-Id: I3cb5018dcccdbc9739fe004d859065992720caaf

* fix a compiler error

Change-Id: If66bb92d8b98e31cd618ecec9c6448ab9b037fa5

* Comment on Constant for Overlay

* Remove NOT NULL and add epoch call as function

* add versions to bootstrap and satellites

Change-Id: I436944589ea5f21600cdd997742a84fe0b16e47b

* Change Update Migration

* Fix DB Migration

* Increase Timeout temporarily, to see what's going on

* Remove unnecessary const and vars
Cleanup Function calls from deprecated NodeVersion struct

* Updated Protobuf, removed deprecated Code from Inspector

* Implement NodeVersion into InfoResponse

* Regenerated locked.go

* Linting

* Fix Tests

* Remove unnecessary constant

* Update Function and Flag Description

* Remove Empty Stat Creation

* return properly with error

* Remove unnecessary struct

* simplify migration step

* Update Inspector to return Version Info

* Update local Endpoint Version Handling

* Reset Travis Timeout

* Add Default for CommitHash

* single quotes
2019-04-10 08:04:24 +02:00
littleskunk
a3caa8e00d
upload, download and delete success message on log level info (#1704) 2019-04-09 01:14:09 +02:00
Egon Elbre
1c87a53eb8 Don't cancel fetching peer identity context when request is cancelled (#1699)
* don't cancel fetching peer identity context when request is cancelled

* fix typo

* add tests

* fix typo
2019-04-08 22:31:20 +02:00
Bryan White
faf5fae3f9
Identity versioning (#1389) 2019-04-08 20:15:19 +02:00
Maximillian von Briesen
bb3b4e4816 Data repair integration test (#1582) 2019-04-08 13:33:47 -04:00
aligeti
6a1d343abd Delete expired pieces on storage nodes (#1629) 2019-04-08 18:46:38 +02:00
Bill Thorp
255b92b4b8
fixed test on windows (#1667)
* fixed test on windows
2019-04-04 12:56:42 -04:00
JT Olio
09be9964eb internal/version: do version checks much earlier in the process initialization, take 2 (#1666)
* internal/version: do version checks much earlier in the process initialization, take 2

Change-Id: Ida8c7e3757e0deea0ec7aea867d3d27ce97dc134

* linter and test failures

Change-Id: I45b02a16ec1c0f0981227dc842e68dbdf67fdbf4
2019-04-04 17:40:07 +02:00
Stefan Benten
2cf86703a3
Add Versioning Server (#1576)
* Initial Webserver Draft for Version Controlling

* Rename type to avoid confusion

* Move Function Calls into Version Package

* Fix Linting and Language Typos

* Fix Linting and Spelling Mistakes

* Include Copyright

* Include Copyright

* Adjust Version-Control Server to return list of Versions

* Linting

* Improve Request Handling and Readability

* Add Configuration File Option
Add Systemd Service file

* Add Logging to File

* Smaller Changes

* Add Semantic Versioning and refuses outdated Software from Startup (#1612)

* implements internal Semantic Version library

* adds version logging + reporting to process

* Advance SemVer struct for easier handling

* Add Accepted Version Store

* Fix Function

* Restructure

* Type Conversion

* Handle Version String properly

* Add Note about array index

* Set temporary Default Version

* Add Copyright

* Adding Version to Dashboard

* Adding Version Info Log

* Renaming and adding CheckerProcess

* Iteration Sync

* Iteration V2

* linting

* made LogAndReportVersion a go routine

* Refactor to Go Routine

* Add Context to Go Routine and allow Operation if Lookup to Control Server fails

* Handle Unmarshal properly

* Linting

* Relocate Version Checks

* Relocating Version Check and specified default Version for now

* Linting Error Prevention

* Refuse Startup on outdated Version

* Add Startup Check Function

* Straighten Logging

* Dont force Shutdown if --dev flag is set

* Create full Service/Peer Structure for ControlServer

* Linting

* Straighting Naming

* Finish VersionControl Service Layout

* Improve Error Handling

* Change Listening Address

* Move Checker Function

* Remove VersionControl Peer

* Linting

* Linting

* Create VersionClient Service

* Renaming

* Add Version Client to Peer Definitions

* Linting and Renaming

* Linting

* Remove Transport Checks for now

* Move to Client Side Flag

* Remove check

* Linting

* Transport Client Version Intro

* Adding Version Client to Transport Client

* Add missing parameter

* Adding Version Check, to set Allowed = true

* Set Default to true, testing

* Restructuring Code

* Uplink Changes

* Add more proper Defaults

* Renaming of Version struct

* Dont pass Service use Pointer

* Set Defaults for Versioning Checks

* Put HTTP Server in go routine

* Add Versioncontrol to Storj-Sim

* Testplanet Fixes

* Linting

* Add Error Handling and new Server Struct

* Move Lock slightly

* Reduce Race Potentials

* Remove unnecessary files

* Linting

* Add Proper Transport Handling

* small fixes

* add fence for allowed check

* Add Startup Version Check and Service Naming

* make errormessage private

* Add Comments about VersionedClient

* Linting

* Remove Checks that refuse outgoing connections

* Remove release cmd

* Add Release Script

* Linting

* Update to use correct Values

* Move vars private and set minimum default versions for testing builds

* Remove VersionedClient

* Better Error Handling and naked return removal

* Straighten the Regex and string conversion

* Change Check to allow testplanet and storj-sim to run without the
need to pass an LDFlag

* Cosmetic Change to Dashboard

* Cleanup Returns and remove commented code

* Remove Version Check if no build options are passed in

* Pass in Config Values instead of Pointers

* Handle missed Error

* Update Endpoint URL

* Change Type of Release Flag

* Add additional Logging

* Remove Versions Logging of other Services

* minor fixes

Change-Id: I5cc04a410ea6b2008d14dffd63eb5f36dd348a8b
2019-04-03 21:13:39 +02:00
Egon Elbre
fba9a5f945 migration tests for storagenodedb infodb (#1628) 2019-04-02 09:54:09 +02:00
Michal Niewrzal
f80750693c Store bandwidth from orders on satellite (#1586) 2019-04-01 16:14:58 -04:00
Stefan Benten
c50a21d4cf
Fix Dashboard calculation for used space and bandwidth usage (#1615)
* Fix Dashboard calculation for used space and bandwidth usage

* Copy+Paste Err
2019-03-30 14:16:08 +01:00
Egon Elbre
63737e350f
Delete psserver and unused mocks (#1605) 2019-03-29 16:40:06 +02:00
Cameron
cac55a29e4
Add used egress/ingress to storage node dashboard (#1565)
* add egress and ingress to StatSummaryResponse

* print egress and ingress to storagenode dashboard
2019-03-27 15:44:18 -04:00
Michal Niewrzal
bfdfebbde2
Satellite orders receiving (#1564)
This change adds satellite endpoint for receiving OrderLimits sent by storage node.
Change includes:
* wire up orders sender in storage node (also in testplanet)
* saving serial number for OrderLimit in serial_numbers table
* satellite endpoint for receiving, verifying and storing OrderLimit and Order serial number
* initial implementation for Orders DB
* basic test for sending orders to satellite
2019-03-27 11:24:35 +01:00
Natalie Villasana
0fa1d536e7
removes pingbackTimeout (#1556) 2019-03-22 16:06:57 -04:00
Egon Elbre
1d96d25f3f
kademlia ping tracking (#1538) 2019-03-22 15:27:59 +02:00
Egon Elbre
2c5c2c29da
storage node order sending (#1535) 2019-03-21 15:24:26 +02:00
Michal Niewrzal
d7feafe56b Move psserver tests (#1522) 2019-03-20 23:12:00 +02:00
Natalie Villasana
61ee04d363
adds pingbackTimeout to kademlia endpoint (#1518) 2019-03-19 14:30:27 -04:00
Michal Niewrzal
b05cf05649
Restrict slash in bucket name (#1524) 2019-03-19 15:37:28 +01:00
Egon Elbre
7961bcbc92 remove free disk check since it's unreliable (#1516) 2019-03-18 16:18:44 -04:00
Egon Elbre
a24c74c502 fix message formatting (#1512) 2019-03-18 18:02:37 +01:00
Egon Elbre
80916ffb53
storagenode/pieces: ensure we can call Commit or Cancel only once (#1511) 2019-03-18 16:29:54 +02:00
Egon Elbre
117edec54c
Add serial number type (#1508) 2019-03-18 15:08:24 +02:00
Egon Elbre
05d148aeb5
Storage node and upload/download protocol refactor (#1422)
refactor storage node server
refactor upload and download protocol
2019-03-18 12:55:06 +02:00
Dylan Lott
59f1e267c9
Removes concept of email from kademlia metadata (#1435)
* Removes concept of Email from Kademlia

* Removes kad email

* adds emails back to operator config for satellite

* replace operator configs in testplanet
2019-03-12 14:05:18 -06:00
Jess G
193a70f0a6
add private listener to grpc server (#1398)
* add private listener to grpc server

* add changes per init CR

* fix server.close

* add insecure grpc connection, update logs msg

* fix tests, move insecure client

* add private ports to storj-sim, add insecure client to other inspectors

* add ports to test so there aren't conflicts

* fix lint err

* fix node started log msg, close public listener

* remove commented out line
2019-03-07 13:19:37 -05:00
Jess G
3c9d83dbfe
convert psserver dashboard into an inspector (#1407)
* Convert psserver dashboard into an inspector

* remove dashboard stream, update ps.pb.mock

* fixes for lint errs
2019-03-05 15:48:37 -05:00
Egon Elbre
3f3209c8d5
fixes to piecestore and psdb (#1380)
* replace direct reference with an interface in various places
* hide piecePath
* ensure psserver tests don't use path
* ensure psserver tests don't use sql queries directly
2019-03-01 07:46:16 +02:00
Michal Niewrzal
6186b3f90a
Storage node hash calculation on upload (#1347)
Storage node calculates the hash of uploaded data and sends it back to the uplink with a signature
2019-02-23 11:46:07 +01:00
Michal Niewrzal
8d685217e4
Storagenode migrations (#1299)
* creates initial migration for psdb
* add test mechanism to validate migration to every version
* fix few small issues in versions.go and context.go
2019-02-19 10:39:04 +01:00
JT Olio
2a59679766 pkg/transport: require tls configuration for dialing (#1286)
* separate TLS options from server options (because we need them for dialing too)
* stop creating transports in multiple places
* ensure that we actually check revocation, whitelists, certificate signing, etc, for all connections.
2019-02-11 13:17:32 +02:00
Egon Elbre
e37e0c1b5f
Fix server config usage (#1282) 2019-02-08 20:57:17 +02:00
Egon Elbre
9c1e299f3c
Ensure everyone sees everyone else (#1275) 2019-02-08 11:25:13 +02:00
Egon Elbre
bb11d83ed0
Proper planet shutdown (#1249) 2019-02-06 15:19:14 +02:00
Egon Elbre
fdbe2db273
Remove node package and simplify DHT interface (#1233) 2019-02-06 14:37:17 +02:00
Egon Elbre
87d6410b50 Revert "Remove node package and simplify DHT interface."
This reverts commit 03ec1ff92d.
2019-02-05 10:38:48 +02:00
Egon Elbre
03ec1ff92d Remove node package and simplify DHT interface. 2019-02-05 10:37:24 +02:00
Egon Elbre
b91d77436f
Test merging planets (#1181) 2019-02-01 15:32:28 +02:00
Egon Elbre
1df81b1460
Separate garbage collect logic from psdb (#1167) 2019-01-29 17:41:01 +02:00
Egon Elbre
e1a8bbdcb6
Kademlia flags cleanup (#1137) 2019-01-29 08:51:07 +02:00
Egon Elbre
d50c07e56c
Implement WorkGroup (#1151) 2019-01-28 21:04:42 +02:00
Egon Elbre
cecd4b0816
Remove server aliases (#1154) 2019-01-28 17:04:53 +02:00
Egon Elbre
85b43926b4
Separate identity from server config (#1138) 2019-01-25 16:54:54 +02:00
Egon Elbre
187e9b2138
Code consistency between peers (#1126) 2019-01-24 22:28:06 +02:00
Egon Elbre
b6c61cdd55
Use storagenode.Peer for storagenode (#1107) 2019-01-23 12:39:03 +02:00
Egon Elbre
78dc02b758 Satellite Peer (#1034)
* add satellite peer

* Add overlay

* reorganize kademlia

* add RunRefresh

* add refresh to storagenode.Peer

* add discovery

* add agreements and metainfo

* rename

* add datarepair checker

* add repair

* add todo notes for audit

* add testing interface

* add into testplanet

* fixes

* fix compilation errors

* fix compilation errors

* make testplanet run

* remove audit references

* ensure that audit tests run

* dev

* checker tests compilable

* fix discovery

* fix compilation

* fix

* fix

* dev

* fix

* disable auth

* fixes

* revert go.mod/sum

* fix linter errors

* fix

* fix copyright

* Add address param for SN dashboard (#1076)

* Rename storj-sdk to storj-sim (#1078)

* Storagenode logs and config improvements (#1075)

* Add more info to SN logs

* remove config-dir from user config

* add output where config was stored

* add message for successful connection

* fix linter

* remove storage.path from user config

* resolve config path

* move success message to info

* log improvements

* Remove captplanet (#1070)

* pkg/server: include production cert (#1082)

Change-Id: Ie8e6fe78550be83c3bd797db7a1e58d37c684792

* Generate Payments Report (#1079)

* memory.Size: autoformat sizes based on value entropy (#1081)

* Jj/bytes (#1085)

* run tally and rollup

* sets dev default tally and rollup intervals

* nonessential storj-sim edits (#1086)

* Closing context doesn't stop storage node (#1084)

* Print when cancelled

* Close properly

* Don't log nil

* Don't print error when closing dashboard

* Fix panic in inspector if ping fails (#1088)

* Consolidate identity management to identity cli commands (#1083)

* Consolidate identity management:

Move identity creation/signing out of storagenode setup command.

* fixes

* linters

* Consolidate identity management:

Move identity creation/signing out of storagenode setup command.

* fixes

* save backups before saving signed certs

* add "-prebuilt-test-cmds" test flag

* linters

* prepare cli tests for travis

* linter fixes

* more fixes

* linter gods

* sp/sdk/sim

* remove ca.difficulty

* remove unused difficulty

* return setup to its rightful place

* wip travis

* Revert "wip travis"

This reverts commit 56834849dcf066d3cc0a4f139033fc3f6d7188ca.

* typo in travis.yaml

* remove tests

* remove more

* make it only create one identity at a time for consistency

* add config-dir for consistency

* add identity creation to storj-sim

* add flags

* simplify

* fix nolint and compile

* prevent overwrite and pass difficulty, concurrency, and parent creds

* goimports
2019-01-18 08:54:08 -05:00
Egon Elbre
8893884044
convert piecestorage into a struct (#1024) 2019-01-11 13:26:39 +02:00
Egon Elbre
eb69ecadec
Storage Node Peer (#1005) 2019-01-10 15:13:27 +02:00