364 Commits

Author SHA1 Message Date
Egon Elbre
d55288cf68 pkg/rpc: replace methods with direct calls to pb
Change-Id: I8bd015d8d316a2c12c1daceca1d9fd257f6f57bc
2019-12-22 17:12:43 +02:00
Egon Elbre
006baa9ca6 pkg/rpc: remove drpc aliases
We need to split up the pb package, which means we cannot have a core package
that depends on it.

Change-Id: I7f4f6fd82f89a51a9b2ad08bf2b1207253b8a215
2019-12-22 16:58:08 +02:00
Yingrong Zhao
6e71591b9b satellitedb;storagenodedb: remove unnecessary use of DB transactions in graceful exit
Change-Id: Ief0a28c6750c130896b48bfebfbea7fb3caa810f
2019-12-20 21:24:38 +00:00
Qweder93
e47ec84dee storagenode notification service and api added
Change-Id: I36898d7c43e1768e0cae0da8d83bb20b16f0cdde
2019-12-20 18:42:23 +00:00
Egon Elbre
afe05edff2 {storagenode,satellite}/gracefulexit: ensure workers finish their work
Fixes a data race caused by not waiting for workers to finish
before shutting down. The race showed up as a logging failure:
the log was already closed when the test tried to write to it.

Change-Id: I074045cd83bbf49e658f51353aa7901e9a5d074b
2019-12-17 17:21:52 +02:00
Egon Elbre
7a36507a0a private/testcontext: ensure we call cleanup everywhere
Change-Id: Icb921144b651611d78f3736629430d05c3b8a7d3
2019-12-17 14:16:09 +00:00
littleskunk
08947e177d storagenode/garbagecollection: enable in production
Change-Id: I627b7a37ca4a85eb19936ca2c7ca907d7cc63f5b
2019-12-16 22:44:04 +00:00
Vitalii Shpital
53d9bc4530
storagenode/notifications: db created (#3707) 2019-12-16 19:59:01 +02:00
littleskunk
c2ea75208f
storagenode/orderdb: fix db lock
Change-Id: Id1add0ba7ae1b20bd98099bd4d3aff0fcfdd90c9
2019-12-15 23:41:22 +01:00
Andrew Harding
cb89496569 storagenode/trust: wire up list into pool
- also updated ping chore to pick up trust changes
- fixed small typo in blueprint
- fixed flags for storj-sim
- wired up changes to testplanet

Change-Id: I02982f3a63a1b4150b82a009ee126b25ed51917d
2019-12-13 20:32:50 +00:00
Andrew Harding
2867b6a466 storagenode/trust: list implementation
Change-Id: Ia886e84990efaf2c783f199741552a7a8ff41d4e
2019-12-12 17:15:47 +00:00
Jeff Wendling
fb8e78132d storagenodedb: reenable utccheck in tests
Change-Id: If7d64dd4ae58e4b656ff9122ae3195b2a5173cb3
2019-12-10 23:17:14 +00:00
Andrew Harding
5ed9373dba storagenode/trust: source entry cache
Implements a cache that can persist trust entries returned by sources

Change-Id: I72579e42e9f72d34a54b7510c9b665844f187314
2019-12-10 21:45:01 +00:00
Andrew Harding
715d97e3d8 storagenode/trust: rule and excluders
Change-Id: I84ed542e1ef3cfaa5cc3d3f631cdc295393bf978
2019-12-10 21:08:12 +00:00
Cameron Ayer
6fae361c31 replace planet.Start in tests with planet.Run
planet.Start starts a testplanet system, whereas planet.Run starts a testplanet
and runs a test against it with each DB backend (cockroach compat).

Change-Id: I39c9da26d9619ee69a2b718d24ab00271f9e9bc2
2019-12-10 16:55:54 +00:00
Andrew Harding
eb52ac623b storagenode/trust: source implementations
Change-Id: Ie36e79cc15257db88051f63e5b9463fd9d7b4736
2019-12-09 20:00:02 +00:00
Andrew Harding
7d0aadfeca storagenode/trust: satellite URL implementation
Satellite URL is a stricter form of the STORJ Node URL. It requires both
the ID and port specifier.

Change-Id: I7fd302064f864c1de8240a7915bf5263b898dfd1
2019-12-09 17:05:57 +00:00
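Note on 7d0aadfeca: the rule "requires both the ID and port specifier" can be illustrated with a small parser. This is only a sketch, not the code from storagenode/trust. The `id@host:port` layout is assumed by analogy with the address form quoted elsewhere in this log, and `SatelliteURL`, `ParseSatelliteURL` and the port 7777 are hypothetical names and values chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// SatelliteURL is a hypothetical type for this sketch.
type SatelliteURL struct {
	ID   string
	Host string
	Port string
}

// ParseSatelliteURL accepts "id@host:port" and rejects inputs that
// lack either the ID or the port (the strictness the commit describes).
func ParseSatelliteURL(s string) (SatelliteURL, error) {
	i := strings.Index(s, "@")
	if i <= 0 {
		return SatelliteURL{}, errors.New("satellite URL must include an ID")
	}
	id, addr := s[:i], s[i+1:]

	// SplitHostPort fails when the port separator is missing entirely.
	host, port, err := net.SplitHostPort(addr)
	if err != nil || host == "" || port == "" {
		return SatelliteURL{}, errors.New("satellite URL must include a host and a port")
	}
	return SatelliteURL{ID: id, Host: host, Port: port}, nil
}

func main() {
	inputs := []string{
		"12EayRS2V1k@us-central-1.tardigrade.io:7777", // ID and port: accepted
		"us-central-1.tardigrade.io:7777",             // no ID: rejected
		"12EayRS2V1k@us-central-1.tardigrade.io",      // no port: rejected
	}
	for _, in := range inputs {
		u, err := ParseSatelliteURL(in)
		fmt.Println(in, "=>", u, err)
	}
}
```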
littleskunk
9d1faeee58 storagenode/garbagecollection: increase MaxTimeSkew to be higher than satellite MaxCommitInterval
Change-Id: I86f8d0b44bea3aa005ff26d52588611c59df5e9a
2019-12-09 16:03:55 +00:00
Ethan Adams
9420fa9fc5 satellite/gracefulexit: Add graceful exit completed/failed receipt verification to satellite CLI (#3679) 2019-12-03 17:09:39 -05:00
Ivan Fraixedes
42c61138e8
storage: Improve doc comments delete methods (#3591)
Improve the documentation of several methods involved in the delete
operation to make clear their behavior without having to inspect their
logic.
2019-12-02 12:18:20 +01:00
Ivan Fraixedes
bf97ef06fc
storagenode: Add new endpoint to receive satellite requests for… (#3590)
* pkg/pg: Add new service function storage node

  Add a new service function to the storage node piece store for deleting
  pieces when satellites request them.

* storagenode/piecestore: Add endpoint to delete piece

  Add a new endpoint that lets trusted satellites request deletion of a piece.

* private/testplanet: Fix storagenode mock

  Add to the storagenode mock the new endpoint method.

* proto.lock: Update it with the latest protobuf changes

* storagenode/piecestore: Reuse test piece upload

  Extract the repeated logic from several test functions for uploading a
  test piece to a test helper function.

* uplink/piecestore: Implement client side method

  Implement the client side method of the new piecestore RPC function.

* storagenode/piecestore: Add test DeletePiece endpoint

  Implement a test for the DeletePiece new endpoint method.
2019-11-26 18:47:19 +01:00
Yingrong Zhao
66f1a1680f
add completion receipt to exit-status cli command on storage node (#3650) 2019-11-26 12:32:26 -05:00
Isaac Hess
56f8fd2dd7
storagenode/pieces: Add EmptyTrash functionality (#3640)
* storagenode/pieces: Add EmptyTrash functionality

* storagenode/pieces: Fix err

* storagenode/pieces: Fix lint
2019-11-26 09:25:21 -07:00
Vitalii Shpital
038ac58600
web/storagenode: minimal allowed version view implemented (#3583) 2019-11-26 18:08:24 +02:00
littleskunk
8842b0c252 storagenode/gracefulexit: improve logging (#3633) 2019-11-21 21:10:02 -05:00
Rafael Antonio Ribeiro Gomes
2739771761
storagenode: add bandwidth metrics (#3623)
* storagenode: add bandwidth metrics

* remove unnecessary metric
2019-11-21 16:51:40 -03:00
Isaac Hess
6aeddf2f53
storagenode/pieces: Add Trash and RestoreTrash to piecestore (#3575)
* storagenode/pieces: Add Trash and RestoreTrash to piecestore

* Add index for expiration trash
2019-11-20 09:28:49 -07:00
Kaloyan Raev
6d728d6ea0
storagenode/collect: delete piece 24 hours after expiration (#3613) 2019-11-20 17:02:57 +02:00
Vitalii Shpital
61c8bcc9a6
web/storagenode: egress chart implemented (#3574) 2019-11-20 16:37:57 +02:00
Rafael Antonio Ribeiro Gomes
da39c71d35
storagenode: add new metric satellite.request (#3610)
* storagenode: add new metric satellite.request

* storagenode: metrics fixed

* switch from Counter to Meter
2019-11-19 18:11:31 -03:00
Ivan Fraixedes
8e1e4cc342
piecestore: Fix invalid comment and typos (#3604) 2019-11-19 16:30:48 +01:00
Nikolai Siedov
24318d74b3
storagenode/console: show satellite url in satellite selection (#3602) 2019-11-19 14:16:56 +02:00
Nikolai Siedov
0d35505fe1
SNOboard/console: router changed for gorillaMux, caching added (#3577) 2019-11-15 14:36:43 +02:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Egon Elbre
1a54007f1c
storagenode/storagenodedb: dont log opening of each database (#3571) 2019-11-14 17:08:16 +02:00
Egon Elbre
1e64006e32 lint: add staticcheck as a separate step (#3569) 2019-11-14 10:31:30 +02:00
paul cannon
bd89f51c66
Keep v0pieceinfo database isolated (#3364)
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go

..instead of filestore.Store. this will allow filestore.Store to become
unexported.

* unexport filestore.Store

rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).

* kill filestore.StoreForTest

now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
2019-11-13 13:15:31 -06:00
Yingrong Zhao
db8294cfba
storagenode/gracefulexit: get hash and limit using original piece ID (#3557) 2019-11-13 12:45:55 -05:00
Jeff Wendling
ebcd37c572 storagenode/contact: fix connection leak with contact checkin
Change-Id: If86002557144d5d8dbff939d2b6a2dfec6537577
2019-11-06 18:00:09 +00:00
littleskunk
7eb6724c92
logging: unify logging around satellite ID, node ID and piece ID (#3491)
* logging: unify logging around satellite ID, node ID and piece ID

* unify segment index
2019-11-05 22:04:07 +01:00
Maximillian von Briesen
257d3946d5
storagenode/gracefulexit: allow storagenodes to concurrently transfer pieces for graceful exit (#3478) 2019-11-05 10:33:44 -05:00
Jennifer Li Johnson
11f0ea3258
5s (#3477) 2019-11-04 16:20:31 -05:00
Jennifer Li Johnson
aa7d15a365
storagenode/contact: exponential backoff retries for pinging Satellites (#3372) 2019-11-04 16:03:21 -05:00
Jess G
5abb91afcf
satellite: change the Peer name to Core (#3472)
* change satellite.Peer name to Core

* change to Core in testplanet

* missed a few places

* keep shared stuff in peer.go to stay consistent with storj/docs
2019-11-04 11:01:02 -08:00
Isaac Hess
4d26d0a6a6 storagenode/pieces: Add migration from v0 piece to v1 piece (#3401) 2019-11-04 17:59:45 +01:00
Egon Elbre
87687938d1 storagenode/contact: fix panic in ping satellites (#3447) 2019-11-01 16:20:53 +01:00
Ethan Adams
43103ae13f
lower storage node counts in tests (#3427) 2019-10-31 10:57:54 -04:00
Jess G
4d85b11574
satellite/contact: improve errors in contact endpoints (#3356)
* improve errors in satellite contact endpoints

* add changes per CR comments

* update pingback method so it still updates node table

* fix err and returns

* fix zap logging to be better
2019-10-30 11:57:21 -07:00
Natalie Villasana
4878135068
satellite/gracefulexit, storagenode/gracefulexit: add timeouts (#3407) 2019-10-30 13:40:57 -04:00
Natalie Villasana
5453886231 satellite/repair, uplink/ecclient: remove unused expiration arg from ec.Repair and ec.putPiece (#3416) 2019-10-30 11:35:00 -04:00
Yingrong Zhao
3ee0b89f8f
storagenode/gracefulexit: delete pieces when receive Delete or Completed message from satellite (#3406) 2019-10-30 10:46:56 -04:00
Egon Elbre
65a8e0bcbc
{satellite,storagenode}/gracefulexit: clearer log messages (#3413) 2019-10-30 10:21:27 +02:00
Isaac Hess
1defd4dbfe
storagenode/piecestore: Respect config.MaxConcurrentRequests for drpc (#3402) 2019-10-28 13:12:49 -06:00
Ethan Adams
5b0398a718 storagenode/gracefulexit: Exclude finished exits from chore/worker processing. Fix update status bug (#3399) 2019-10-28 13:59:45 -04:00
Egon Elbre
93353df4d6
internal/sync2: make Fence accept context (#3393) 2019-10-28 16:04:31 +02:00
paul cannon
1469f7f41f
storagenode/contact: wait for UpdateSelf before start (#3332)
When the contact chore starts running before the monitor service has
provided any useful capacity data, the first outgoing contact has
not-very-helpful data for the satellite. This change causes the contact
chore to wait until capacity data is available. The wait should be quite
short in all reasonable cases: even when a node starts with a lot of
stored pieces and no cached spaceUsedDB data, new data will have been
calculated and cached by the call to
`peer.Storage2.CacheService.Init(ctx)` in `storagenode.cmdRun()` before
`peer.Run(ctx)`.

Change-Id: Ibc26d5c1fc10a23006c00bc3f13ff6cf71f8bf1d
2019-10-26 12:16:25 -05:00
Jeff Wendling
ed48e74e20 gracefulexit: fix build for drpc (#3387)
Change-Id: I335e9f8991a10c9e8a0737bc7c9ea3f04cbe2546
2019-10-26 15:53:35 +02:00
Maximillian von Briesen
6df4d7bc73
storagenode/gracefulexit + satellite/gracefulexit: add storagenode-side transfer validation (#3371)
* Make the exiting node check piece hashes, piece IDs, and piece hash signatures before relaying successful transfer data to the satellite.
* Enable immediate graceful exit failure for "successful" transfers that fail satellite-side validation.
* Move transfer piece logic in storagenode worker to separate function (to make the worker easier to understand)
2019-10-25 13:16:20 -04:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per piece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
Isaac Hess
75412e54e5
storagenode/piecestore: Rename liveGRPCRequests back to liveRequests (#3354) 2019-10-23 13:43:43 -06:00
Isaac Hess
14c7648530
storagenode/piecestore: Only limit grpc requests (#3342) 2019-10-23 10:14:02 -06:00
JT Olio
2c6fa3c5f8
pkg/rpc: remove read/write deadlines as a mechanism for request timeouts (#3335)
libuplink was incorrectly setting timeouts to 10 seconds still, but
should have been at least 10 minutes. the order sender was setting them
to 1 hour. we don't want timeouts in uplink-side logic as it establishes
a minimum rate on tcp streams.

instead of all of this, just use tcp keep alive. tcp keep alive packets are
sent every 15 seconds and if the peer stops responding the connection
dies. this is enabled by default with go. this will kill tcp connections
when they stop working.

Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b
2019-10-22 17:57:24 -06:00
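Note on 2c6fa3c5f8: a rough illustration of "just use tcp keep alive" with the standard library. `newKeepAliveDialer` is a hypothetical helper, not the pkg/rpc code. The 15 second period is written out only to make it visible; leaving `KeepAlive` at zero also enables keep-alive with Go's default period.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// newKeepAliveDialer returns a dialer that relies on TCP keep-alive
// probes to detect dead peers, instead of setting read/write deadlines
// on each request. No deadline means no minimum transfer rate is imposed.
func newKeepAliveDialer() *net.Dialer {
	return &net.Dialer{
		// A negative value would disable keep-alive; zero uses the default.
		KeepAlive: 15 * time.Second,
	}
}

func main() {
	d := newKeepAliveDialer()
	fmt.Println("keep-alive period:", d.KeepAlive)
}
```

Connections made through such a dialer are closed by the operating system once the peer stops answering probes, so a slow but live transfer is never cut off by a timer.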
Ethan Adams
3e0d12354a
storagenode/gracefulexit: Implement storage node graceful exit worker - part 1 (#3322) 2019-10-22 16:42:21 -04:00
paul cannon
5e78f4000b storagenode/pieces: remove old comment (#3334)
the reservedSpace member it's talking about was removed quite a while
ago.

Change-Id: I28433b2a44467376a408453d875c389656347cab
2019-10-22 12:51:51 +03:00
Bryan White
f468816f13
{internal/version,versioncontrol,cmd/storagenode-updater}: add rollout to storagenode updater (#3276) 2019-10-21 12:50:59 +02:00
Bryan White
243ba1cb17
{versioncontrol,internal/version,cmd/*}: refactor version control (#3253) 2019-10-20 09:56:23 +02:00
Yingrong Zhao
e5099f31f3
add context.Clean and correct rpc error code (#3295) 2019-10-16 13:50:01 -04:00
Isaac Hess
ed6b88a12d piecestore: update usage before completing upload (#3286)
The upload code currently updates the usage in a deferred call to saveOrder().
The consequence is that in the success case, the RPC is completed before
the usage has been updated.

This change repurposes the deferred call to update usage in the
failure case, while explicitly updating the usage before completing the
RPC.

This fixes some test flakiness when using dRPC. gRPC waits until the final status is written before a Recv call completes, and the final status is written by the server after the handler function has exited. In practice this means that the client is blocked until the defer call is also finished. So this change will not change performance at all.

It has two advantages:

(1) It fixes test flakiness

and, more importantly:

(2) reduces the chances that someone will accidentally write a flaky test in the future
2019-10-15 20:17:17 -06:00
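Note on ed6b88a12d: the ordering described above (explicit usage update before the RPC completes, deferred update kept only for the failure case) might look roughly like this. It is a sketch under assumptions: `recorder`, `handleUpload` and `errStore` are hypothetical stand-ins, and the real handler does far more than record strings.

```go
package main

import (
	"errors"
	"fmt"
)

// errStore is a stand-in failure used by the demo below.
var errStore = errors.New("store failed")

// recorder logs events in order so the sequencing is observable.
type recorder struct{ events []string }

func (r *recorder) add(e string) { r.events = append(r.events, e) }

// handleUpload updates usage explicitly on success, before the RPC is
// completed, and leaves the deferred update for the failure path only.
func handleUpload(r *recorder, store func() error) (err error) {
	defer func() {
		// Runs after the function body; acts only when an error is returned.
		if err != nil {
			r.add("usage updated (failure)")
		}
	}()

	if err = store(); err != nil {
		return err
	}

	// Success: usage is recorded first, then the response is sent.
	r.add("usage updated (success)")
	r.add("rpc completed")
	return nil
}

func main() {
	ok := &recorder{}
	_ = handleUpload(ok, func() error { return nil })
	fmt.Println(ok.events)

	failed := &recorder{}
	err := handleUpload(failed, func() error { return errStore })
	fmt.Println(failed.events, err)
}
```

With this ordering a client that sees the RPC finish can rely on the usage already being recorded, which is the property the flaky tests depended on.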
Yingrong Zhao
87e3764390
storagenode/cmd: add exit-status command for graceful exit (#3264)
* add exit-status command

* remove todo and fix format

* fix status display

* change startExit to exit progress

* fix linting error

* add successful column in exit progress

* fix test

* remove extra new line

* fix TYPOS

* format the percentage better
2019-10-15 18:07:32 -04:00
Andrew Harding
4962c6843e
piecestore: fix test flakiness around upload/download usage tracking (#3282) 2019-10-15 11:22:15 -06:00
Simon Guindon
abb5b6c499
storagenode/piecestore: Fix to ignore both gRPC and dRPC EOF errors. (#3274)
* Fix to ignore both gRPC and dRPC EOF errors.

* Fix to ignore both gRPC and dRPC EOF errors.
2019-10-15 12:13:53 -04:00
Ethan Adams
1ad2ba7e3e
storagenode/gracefulexit: Add graceful exit chore and worker. (#3262)
Adds graceful exit chore and worker for V3-2614
2019-10-15 11:29:47 -04:00
Jennifer Li Johnson
b185dbbee2
satellite/discovery: remove discovery related code (#3175) 2019-10-14 10:57:01 -04:00
littleskunk
96aeedcdee
OrderLimit/GracePeriod: Increase time window from 1h to 24h (#3255)
* OrderLimit/GracePeriod: Increase time window from 1h to 24h

* update satellite config lock
2019-10-13 17:40:24 +02:00
JT Olio
6ede140df1
pkg/rpc: defeat MITM attacks in most cases (#3215)
This change adds a trusted registry (via the source code) of node address to node id mappings (currently only for well known Satellites) to defeat MITM attacks to Satellites. It also extends the uplink UI such that when entering a satellite address by hand, a node id prefix can also be added to defeat MITM attacks with unknown satellites.

When running uplink setup, satellite addresses can now be of the form 12EayRS2V1k@us-central-1.tardigrade.io (not even using a full node id) to ensure that the peer contacted is the peer that was expected. When using a known satellite address, the known node ids are used if no override is provided.
2019-10-12 14:34:41 -06:00
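Note on 6ede140df1: one way to picture the node ID prefix check. This is a sketch only. The registry contents, the placeholder full node ID, and the names `knownSatellites`, `expectedID` and `peerMatches` are all made up for illustration, and treating an address with no known or supplied ID as unverifiable is an assumption, not something the commit message states.

```go
package main

import (
	"fmt"
	"strings"
)

// knownSatellites is a made-up registry; the ID is a placeholder,
// not a real node ID.
var knownSatellites = map[string]string{
	"us-central-1.tardigrade.io": "12EayRS2V1kPLACEHOLDERFULLNODEID",
}

// expectedID splits an optional "idprefix@" from the address. Without
// an override it falls back to the registry entry for that address.
func expectedID(input string) (idPrefix, addr string) {
	if i := strings.Index(input, "@"); i >= 0 {
		return input[:i], input[i+1:]
	}
	return knownSatellites[input], input
}

// peerMatches reports whether the ID presented by the peer starts
// with the expected prefix. An empty prefix means there is nothing
// to check against (assumption: such peers stay unverified).
func peerMatches(peerID, idPrefix string) bool {
	if idPrefix == "" {
		return true
	}
	return strings.HasPrefix(peerID, idPrefix)
}

func main() {
	prefix, addr := expectedID("12EayRS2V1k@us-central-1.tardigrade.io")
	fmt.Println(addr, peerMatches("12EayRS2V1kPLACEHOLDERFULLNODEID", prefix))
	fmt.Println(addr, peerMatches("1SomeOtherPlaceholderID", prefix))
}
```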
Isaac Hess
e567f27634
storagenode/piecestore: Change test to use ioutil.ReadAll to attempt to reduce test flake (#3250) 2019-10-11 15:57:59 -06:00
Cameron
d17be58237 remove random sleep in storagenode contact (#3243) 2019-10-11 16:44:18 -04:00
Vitalii Shpital
78a71ad3b6
web/storagenode: node status updated (#3220) 2019-10-11 19:28:47 +03:00
Yingrong Zhao
743a0fc38b storagenode/cmd: create start graceful exit CLI (#3202) 2019-10-11 09:58:12 -04:00
littleskunk
d5b2e1ef89
storagenode/signature: Reject uploads with a timestamp too far in the future (#3194) 2019-10-08 13:09:46 +02:00
JT Olio
37491d0d32 storagenode: embed the console into the binary and makefile (#3164)
* web/storagenode: add package-lock.json
* storagenode: compile console into binary
2019-10-08 10:52:19 +02:00
Jennifer Li Johnson
7ceaabb18e
Delete Bootstrap and Kademlia (#2974) 2019-10-04 16:48:41 -04:00
Jeff Wendling
64e43e555e pkg/rpc: return context error if ready after DialContext fails
the net package does not make it easy to know if DialContext
failed because the context was done. it's important for some
of our tests that canceled contexts are detected as such, so
we accept the small race that's arguably correct (the context
must be canceled asynchronously) to ensure we always return
the context error if available.

Change-Id: I058064d5c666e5353b74fb5bd300bf7abe537ff5
2019-10-04 20:09:00 +00:00
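Note on 64e43e555e: the idea of preferring the context error after a failed dial can be sketched with the standard library alone. `dialContext` and `alreadyCanceled` are hypothetical helpers written for this example, not the pkg/rpc implementation, and the address 127.0.0.1:1 is an arbitrary placeholder.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
)

// dialContext dials and, if the dial fails while the context is already
// done, returns the context's error instead of the net package's error.
// The context may have been canceled just after an unrelated failure;
// that small race is accepted, as the commit message explains.
func dialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	conn, err := new(net.Dialer).DialContext(ctx, network, addr)
	if err != nil {
		if ctxErr := ctx.Err(); ctxErr != nil {
			return nil, ctxErr
		}
		return nil, err
	}
	return conn, nil
}

// alreadyCanceled returns a context that has been canceled, standing in
// for a caller that gives up while a dial is in progress.
func alreadyCanceled() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	_, err := dialContext(alreadyCanceled(), "tcp", "127.0.0.1:1")
	fmt.Println("canceled:", errors.Is(err, context.Canceled))
}
```

Callers can then compare the result against `context.Canceled` directly rather than inspecting the wrapped error the net package returns.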
Yaroslav Vorobiov
a11619e7f3
storagenode/console: use bandwidth monthly summary (#3183) 2019-10-04 09:29:25 -06:00
Yaroslav Vorobiov
4824ecdb8d storagenode/console: use bytes for remaining info (#3186) 2019-10-04 18:17:28 +03:00
littleskunk
b2e328f118 storagenode/dashboard: update online status (#3168) 2019-10-03 20:31:39 +02:00
Maximillian von Briesen
08ed50bcaa
satellite/metainfo: add commit interval to prevent long delays between order limit creation and segment commit (#3149) 2019-10-01 12:55:02 -04:00
Bill Thorp
89c59d06f9
storagenode/storagenodedb: add SQL receiver logic for graceful exit (#3067)
* added graceful exit db methods
2019-10-01 10:34:03 -04:00
Jennifer Li Johnson
755cbd4dce
storagenode/main: map aliases for kademlia config values (#3118) 2019-09-30 19:33:00 -04:00
Jennifer Li Johnson
29b96a666b
internal/testplanet: fix conn leak (#3132) 2019-09-27 09:47:57 -06:00
Isaac Hess
2c5e169888
storagenode/storagenodedb: Vacuum info.db to prepare for splitting storagenodedbs (#3134) 2019-09-27 07:55:51 -06:00
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Isaac Hess
580e511b4c
storagenode/storagenodedb: Migrate to separate dbs (#3081)
* storagenode/storagenodedb: Migrate to separate dbs

* storagenode/storagenodedb: Add migration to drop versions tables

* Put drop table statements into a transaction.

* Fix CI errors.

* Fix CI errors.

* Changes requested from PR feedback.

* storagenode/storagenodedb: fix tx commit
2019-09-23 12:36:46 -07:00
Jennifer Li Johnson
d2502bb51b Adds tests for kad replacement and restores kad operator configs (#3094)
* test that all nodes can check in with all satellites

* keep kademlia config

* add untrusted satellite test

* use getversion

* remove kademlia config changes in test-sim-backwards.sh

* add kademlia flags back to storj-sim storagenode

* reset kademlia flags in storagenode entrypoint
2019-09-20 16:02:23 -04:00
Jennifer Li Johnson
724bb44723
Remove Kademlia dependencies from Satellite and Storagenode (#2966)
What:

cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:

cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplanet/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia

Please describe the tests:
• internal/testplanet/planet_test.go:

TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:

TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:

TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).
2019-09-19 15:56:34 -04:00
Jess G
93788e5218
remove kademlia: create upsert query to update uptime (#2999)
* create upsert query for check-in method

* add tests

* fix lint err

* add benchmark test for db query

* fix lint and tests

* add a unit test, fix lint

* add address to tests

* replace print w/ b.Fatal

* refactor query per CR comments

* fix disqualified, only set if null

* fix query

* add version to updatecheckin query

* fix version

* fix tests

* change version for tests

* add version to tests

* add IP, add transport, mv unit test

* use node.address as arg

* add last ip

* fix lint
2019-09-19 11:37:31 -07:00
Simon Guindon
a2b1e9fa95
storagenode/storagenodedb: refactor both data access objects and migrations to support multiple DB connections (#3057)
* Split the info.db database into multiple DBs using Backup API.

* Remove location. Prev refactor assumed we would need this but don't.

* Added VACUUM to reclaim space after splitting storage node databases.

* Added unique names to SQLite3 connection hooks to fix testplanet.

* Moving DB closing to the migration step.

* Removing the closing of the versions DB. It's already getting closed.

* Swapping the database connection references on reconnect.

* Moved sqlite closing logic away from the boltdb closing logic.

* Moved sqlite closing logic away from the boltdb closing logic.

* Remove certificate and vouchers from DB split migration.

* Removed vouchers and bumped up the migration version.

* Use same constructor in tests for storage node databases.

* Use same constructor in tests for storage node databases.

* Adding method to access underlying SQL database connections and cleanup

* Adding logging for migration diagnostics.

* Moved migration closing database logic to minimize disk usage.

* Cleaning up error handling.

* Fix missing copyright.

* Fix linting error.

* Add test for migration 21 (#3012)

* Refactoring migration code into a nicer to use object.

* Refactoring migration code into a nicer to use object.

* Fixing broken migration test.

* Removed unnecessary code that is no longer needed now that we close DBs.

* Removed unnecessary code that is no longer needed now that we close DBs.

* Fixed bug where an invalid database path was being opened.

* Fixed linting errors.

* Renamed VersionsDB to LegacyInfoDB and refactored DB lookup keys.

* Renamed VersionsDB to LegacyInfoDB and refactored DB lookup keys.

* Fix migration test. NOTE: This change does not address new tables satellites and satellite_exit_progress

* Removing v22 migration to move into its own PR.

* Removing v22 migration to move into its own PR.

* Refactored schema, rebind and configure functions to be re-useable.

* Renamed LegacyInfoDB to DeprecatedInfoDB.

* Cleaned up closeDatabase function.

* Renamed storageNodeSQLDB to migratableDB.

* Switched from using errs.Combine() to errs.Group in closeDatabases func.

* Removed constructors from storage node data access objects.

* Reformatted usage of const.

* Fixed broken test snapshots.

* Fixed linting error.
2019-09-18 12:17:28 -04:00
Simon Guindon
91d54af705
Add satellites database business objects. (#3055)
* Add satellites database business objects.

* Fixed linting error.
2019-09-16 13:54:53 -04:00
Jess G
d3ef574b20 pkg/pb: minor changes to contact.proto (#3048)
* minor fixes to contact proto

* simplify and rm nodeAddr object from client
2019-09-13 19:37:32 -05:00
Egon Elbre
ca058e606f
storagenode/orders: fix data race in settle (#3042) 2019-09-13 15:50:39 +03:00