storj

Author	SHA1	Message	Date
Jennifer Li Johnson	724bb44723	Remove Kademlia dependencies from Satellite and Storagenode (#2966 ) What: cmd/inspector/main.go: removes kad commands internal/testplanet/planet.go: Waits for contact chore to finish satellite/contact/nodesservice.go: creates an empty nodes service implementation satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover() satellite/peer.go: sets up contact service and endpoints storagenode/console/service.go: replaces nodeID with contact.Local() storagenode/contact/chore.go: replaces routing table with contact service storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method storagenode/contact/service.go: creates a service to return the local node and update its own capacity storagenode/monitor/monitor.go: uses contact service in place of routing table storagenode/operator.go: moves operatorconfig from kad into its own setup storagenode/peer.go: sets up contact service, chore, pingstats and endpoints satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection Removes kademlia setups in: cmd/storagenode/main.go cmd/storj-sim/network.go internal/testplane/planet.go internal/testplanet/satellite.go internal/testplanet/storagenode.go satellite/peer.go scripts/test-sim-backwards.sh scripts/testdata/satellite-config.yaml.lock storagenode/inspector/inspector.go storagenode/peer.go storagenode/storagenodedb/database.go Why: Replacing Kademlia Please describe the tests: • internal/testplanet/planet_test.go: TestBasic: assert that the storagenode can check in with the satellite without any errors TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup • satellite/contact/contact_test.go: TestFetchInfo: Tests that the FetchInfo method returns the correct info • storagenode/contact/contact_test.go: TestNodeInfoUpdated: tests that the contact chore updates the node information TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).	2019-09-19 15:56:34 -04:00
Jess G	93788e5218	remove kademlia: create upsert query to update uptime (#2999 ) * create upsert query for check-in method * add tests * fix lint err * add benchmark test for db query * fix lint and tests * add a unit test, fix lint * add address to tests * replace print w/ b.Fatal * refactor query per CR comments * fix disqualified, only set if null * fix query * add version to updatecheckin query * fix version * fix tests * change version for tests * add version to tests * add IP, add transport, mv unit test * use node.address as arg * add last ip * fix lint	2019-09-19 11:37:31 -07:00
JT Olio	946ec201e2	metainfo: move api keys to part of the request (#3069 ) What: we move api keys out of the grpc connection-level metadata on the client side and into the request protobufs directly. the server side still supports both mechanisms for backwards compatibility. Why: dRPC won't support connection-level metadata. the only thing we currently use connection-level metadata for is api keys. we need to move all information needed by a request into the request protobuf itself for drpc support. check out the .proto changes for the main details. One fun side-fact: Did you know that protobuf fields 1-15 are special and only use one byte for both the field number and type? Additionally did you know we don't use field 15 anywhere yet? So the new request header will use field 15, and should use field 15 on all protobufs going forward. Please describe the tests: all existing tests should pass Please describe the performance impact: none	2019-09-19 10:19:29 -06:00
Jess G	7c203b4884	add satelliteSystem to testplanet and update tests (#3066 )	2019-09-17 13:14:49 -07:00
littleskunk	1d8cd526e0	storj-sim: correct storagenode dashboard config (#3010 )	2019-09-12 15:20:52 +03:00
Egon Elbre	8b668ab1f8	satellite/metainfo.Loop: use a parsed path for observers (#3003 )	2019-09-12 13:38:49 +03:00
Natalie Villasana	aa3567187e	satellite/audit: worker now verifies and reverifies (#2965 )	2019-09-11 18:37:01 -04:00
Natalie Villasana	dbe90926ca	internal/testplanet: reduce coalesce duration (#3009 )	2019-09-11 18:15:14 -04:00
Jennifer Li Johnson	3387750280	storagenode/contact: create chore for nodes to ping satellites (#2877 ) Creates a chore for nodes to announce themselves to their trusted satellites. Runs on startup and every hour thereafter	2019-09-06 12:14:03 -04:00
Natalie Villasana	6d363fb756	satellite/audit: create the audit queue, chore, and worker (#2888 )	2019-09-05 11:40:52 -04:00
Michal Niewrzal	61168493dc	uplink: don't stop deleting segments on first error (#2943 )	2019-09-05 14:25:30 +02:00
Cameron	af5fb8e9c5	satellite/vouchers: deprecate voucher endpoint, return 'please upgrade' error (#2940 ) * voucher endpoint returns 'please upgrade' error, test	2019-09-04 13:21:02 -04:00
Michal Niewrzal	a6721ba92f	satellite/metainfo: Improve metainfo ListSegments (#2882 )	2019-08-30 23:30:18 +02:00
Egon Elbre	62e3bf5b34	storagenode/retain: fix concurrency issues (#2828 ) * nicer flags * fix concurrency * add concurrent workers * initialize things * fix tests * close retain service * ensure we don't have workers working on the same satellite * ensure things compile * fix other compilation issues: * concurrency changes ran this with `go test -count=1000` and it passed all of them. - we add a closed channel so that we can select on it with context cancellation. - we put a once in so we only close the channel once. - every time the queue/running state changes, we have to broadcast because we may want to wake up N pending Wait calls or other concurrent workers. - because we broadcast, we don't need to do the polling in Wait anymore. - ensure Run doesn't start multiple times so that we don't have to worry about concurrent Close with multiple Runs. - hold the lock while we start workers so that a concurrent Close with Run can't decide that there's nothing started and exit and then have Run start things. - make sure to poll the closed/context channels through loops or at the start of Run calls in case Close happens first. - these polls should be under a mutex because they have a default case which makes it possible to schedule such that Close hasn't executed the channel close so it starts more work. - cancel a local Run context when it's going to exit to make sure that any retainPieces calls have a canceled context. - hopefully enough comments to both check my work and help readers digest what's going on. Change-Id: Ida0e226a7e01e8ae64fa2c59dd5a84b04bccfbd7 * use the retain error class Change-Id: I1511eaef135f98afd57b878e997e4c8a0d11cafc * concurrency fixes again - forgot to update the gc test to use the old Wait api. - we need to drop the lock while we wait for the workers to exit, because they may be blocked on the condition variable - additionally, we need to broadcast when we close the signal channel because the state changed: they want to wake up and exit. Change-Id: I4204699792275260cd912f29aa73720f7d9b14b5 * undo my misguided rename Change-Id: I6baffe1eb0434e260212c485bbcc01bed3250881 * remove pollInterval * format paragraph more nicely * move skew calculation into retain pieces	2019-08-28 16:35:25 -04:00
Cameron	599324c364	satellite/dbcleanup: delete expired serials from satellite (#2867 ) Creates a new chore, dbcleanup, which can be used for routine deletion of items from the satellite database and adds functionality for deletion of expired serial numbers	2019-08-27 13:12:38 -04:00
Cameron	1f3537d4a9	storagenode/vouchers: remove storagenode vouchers (#2873 )	2019-08-26 19:35:19 +03:00
Cameron	3d9441999a	storagenode/orders: add archive cleanup to orders service (#2821 ) This PR introduces functionality for routine deletion of archived orders. The user may specify an interval at which to run archive cleanup and a TTL for archived items. During each cleanup, all items that have reached the TTL are deleted This archive cleanup job is combined with the order sender into a new combined orders service	2019-08-22 10:33:14 -04:00
Egon Elbre	00b2e1a7d7	all: enable staticcheck (#2849 ) * by having megacheck in disable it also disabled staticcheck * fix closing body * keep interfacer disabled * hide bodies * don't use deprecated func * fix dead code * fix potential overrun * keep stylecheck disabled * don't pass nil as context * fix infinite recursion * remove extraneous return * fix data race * use correct func * ignore unused var * remove unused consts	2019-08-22 13:40:15 +02:00
Egon Elbre	9ec0ceddf3	pkg/revocation: ensure we close revocation databases (#2825 )	2019-08-20 18:04:17 +03:00
Isaac Hess	25154720bd	lib/uplink: remove redis and bolt dependencies (#2812 ) * identity: remove redis and bolt dependencies * identity: move revDB creation to main files	2019-08-19 16:10:38 -06:00
Maximillian von Briesen	d83a965139	storagenode/piecestore: Add retain service on storagenode (#2785 ) Add retain service on storagenode. This service runs retain jobs that have been queued by the storagenodes. Rather than running retain jobs during the grpc Retain() call, the grpc call queues a retain job to the retain service and returns immediately afterwards, removing a significant bottleneck in garbage collection.	2019-08-19 14:52:47 -04:00
Egon Elbre	9eba5ac631	lib/uplink: remove Seek method (#2768 )	2019-08-13 20:29:02 +03:00
Jess G	022f5d2e14	storagenode: add space used cache for pieces (#2753 ) * add cache, update cache w/piece create/delete * add service w/loop to cache to recalculate space used cache * add piecestore cache to other sn svcs to use * add table to persist the total space used * rm cache where not needed * rm stuff from sn svcs * start fixing tests, changes per comments * update commits * add unit tests * fix commiting before we write header bytes * fix cache create test * copy cache map, add started back to recalc * fix test * add test, update comments	2019-08-12 14:43:05 -07:00
Yaroslav Vorobiov	4cf2b96731	storagenode/nodestats: fix test duration (#2748 )	2019-08-09 14:12:32 +03:00
Yaroslav Vorobiov	28a7778e9e	storagenode/nodestats: cache node stats (#2543 )	2019-08-08 16:47:04 +03:00
Simon Guindon	f236fc3d91	Remove the use of in memory SQLite3 tables for storage nodes. (#2726 )	2019-08-07 10:52:00 -04:00
Egon Elbre	c8edeb0257	satellite/overlay: rename overlay.Cache to overlay.Service (#2717 )	2019-08-06 19:35:59 +03:00
Jeff Wendling	21a3bf89ee	cmd/uplink: use scopes to open (#2501 ) What: Change cmd/uplink to use scopes It moves the fields that will be subsumed by scopes into an explicit legacy section and hides their configuration flags. Why: So that it can read scopes in from files and stuff	2019-08-05 11:01:20 -06:00
Egon Elbre	369a51ed00	lib/uplink: ensure it's silent by default (#2676 )	2019-08-01 07:14:09 -04:00
ethanadams	c9b46f2fe2	V3-1987: Optimize audits stats persistence (#2632 ) * Added batch update stats for recordAuditSuccessStatus * Added batch update stats to recordAuditFailStatus * added configurable batch size * build individual update/delete statements so the statements can be batched into 1 call to the DB * notified #config-changes channel and ran make update-satellite-config-lock * updated tests to use batch update stats	2019-07-31 13:21:06 -04:00
Egon Elbre	ec3d5c0bdd	don't use global loggers (#2671 ) * pkg/server: don't use global logger * satellite/overlay: use correct logger * pkg/kademlia: use correct logger * linksharing: use conventional way to pass in logger * use zaptest in tests	2019-07-31 15:09:45 +03:00
ethanadams	8f8b13abb9	Re-enable SN bandwidth rollups. Fix SN bandwidth rollup unique constraint issue. Re-organize service code (#2617 ) * re-organizing into bandwidth service. re-enable rollup loop * Prevent uniqueness failure in bandwidth rollup * Add test to make sure the rollup select date range works correctly * add bandwidth config for rollup interval	2019-07-29 10:07:52 -04:00
Egon Elbre	5d0816430f	rename all the things (#2531 ) * rename pkg/linksharing to linksharing * rename pkg/httpserver to linksharing/httpserver * rename pkg/eestream to uplink/eestream * rename pkg/stream to uplink/stream * rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo * rename pkg/auth/signing to pkg/signing * rename pkg/storage to uplink/storage * rename pkg/accounting to satellite/accounting * rename pkg/audit to satellite/audit * rename pkg/certdb to satellite/certdb * rename pkg/discovery to satellite/discovery * rename pkg/overlay to satellite/overlay * rename pkg/datarepair to satellite/repair	2019-07-28 08:55:36 +03:00
Maximillian von Briesen	906c77b55a	Add RetainStatus to storagenode config (#2633 ) --storage2.retain-status = "disabled" (default), "debug", or "enabled"	2019-07-26 16:49:08 -04:00
Egon Elbre	0cdeae1922	add missing error handling (#2630 )	2019-07-25 17:01:44 +02:00
Natalie Villasana	f11413bc8e	Implement garbage collection on satellite (#2577 ) * Added a gc package at satellite/gc, which contains the gc.Service, which runs garbage collection integrated with the metainfoloop, and the gc PieceTracker, which implements the metainfo loop Observer interface and stores all of the filters (about which pieces are good) for each node. * Added a gc config located at satellite/gc/service.go (loop disabled by default in release) * Creates bloom filters with pieces to be retained inside the metainfo loop * Sends RetainRequests (or filters with good piece ids) to all storage nodes.	2019-07-24 13:26:43 -04:00
Jess G	353b089927	update testplanet with libuplink (#2618 ) * update testplanet uplink upload with libuplink * add libuplink to testplanet download * update createbucket and delete obj with libuplink * update downloadStream, fix tests * fix test * updates for CR comments	2019-07-23 07:58:45 -07:00
Jennifer Li Johnson	53d96be44a	Stylistic Go Cleanup (#2524 )	2019-07-22 15:10:04 -04:00
Maximillian von Briesen	6c1c3fb4a7	Add metainfo loop service (#2563 ) Add a metainfo loop service on the satellite that can be subscribed to by various services that need to make use of metainfo information	2019-07-22 09:34:12 -04:00
Egon Elbre	13dd501042	storagenode/storagenodedb: move tests near the interface rather than the implementation (#2596 )	2019-07-19 20:40:27 +03:00
Egon Elbre	f6f65a80d7	storagenode/trust: implement fetching peer identity without kademlia and endpoint (#2584 )	2019-07-17 21:14:44 +03:00
Alexander Leitner	64b2769de3	discovery: parallelize refresh (#2535 ) * parallelize discovery refresh * add paginateQualifiedtest, address pr comments * Remove duplicate uptime update * Lower concurrency in Testplanet for discovery	2019-07-12 10:35:48 -04:00
Ivan Fraixedes	f420b29d35	[V3-1927] Repairer uploads to max threshold instead of success… (#2423 ) * pkg/datarepair: Add test to check num upload pieces Add a new test for ensuring the number of pieces that the repair process upload when a segment is injured. * satellite/orders: Don't create "put order limits" over total Repair must not create "put order limits" more than the total count. * pkg/datarepair: Update upload repair pieces test Update the test which checks the number of pieces which are uploaded during a repair for using the same excess over the success threshold value than the implementation. * satellites/orders: Limit repair put order for not being total Limit the number of put orders to be used by repair for only uploading pieces to a % excess over the successful threshold. * pkg/datarepair: Change DataRepair test to pass again Make some changes in the DataRepair test to make pass again after the repair upload repaired pieces only until a % excess over success threshold. Also update the steps description of the DataRepair test after it has been changed, to match on what's now, besides to leave it more generic for avoiding having to update it on minimal future refactorings. * satellite: Make repair excess optimal threshold configurable Add a new configuration parameter to the satellite for being able to configure the percentage excess over the optimal threshold, used for determining how many pieces should be repaired/uploaded, rather than having the value hard coded. * repairer: Add configurable param to segments/repairer Add a new parameters to the segment/repairer to calculate the maximum number of excess nodes, based on the optimal threshold, that repaired pieces can be uploaded. This new parameter has been added for not returning more nodes than the number of upload orders for data repair satellite service calculate for repairing pieces. * pkg/storage/ec: Update log message in clien.Repair * satellite: Update configuration lock file	2019-07-12 00:44:47 +02:00
Egon Elbre	d52f764e54	protocol: implement new piece signing and verification (#2525 )	2019-07-11 16:51:40 -04:00
Maximillian von Briesen	8b507f3d73	Address concerns with storagenode Retain endpoint (#2527 )	2019-07-11 16:04:21 -04:00
Bill Thorp	0e463dccfd	7 day validity window for order limits (#2520 ) * 7 day limit	2019-07-10 17:17:00 -04:00
JT Olio	a79c7d77f3	overlay cache: slight modification of node-is-online rules (#2490 )	2019-07-09 22:36:09 -04:00
Jeff Wendling	d616be8ae0	storagenode: use minimum time in the order for expiration (#2504 )	2019-07-09 17:16:30 -04:00
Stefan Benten	16156e3b3d	Ensure we force a segment size and account storage before committing them (#2473 )	2019-07-08 18:24:38 -04:00
Egon Elbre	674742d1a7	satellite/datarepair: use reliability cache (#1976 )	2019-07-09 01:04:35 +03:00
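
Note on 946ec201e2 (metainfo: move api keys to part of the request): the remark that protobuf fields 1-15 need only one byte for field number and type follows from the wire format, where the tag is the varint encoding of (field number << 3) | wire type. A minimal sketch of that arithmetic, assuming a hypothetical helper named tagSize that is not part of the storj codebase or of any protobuf library:

```go
package main

import "fmt"

// tagSize is a hypothetical helper (not from storj or a protobuf library).
// It returns how many bytes the protobuf wire format needs to encode a
// field's tag, which is the varint of (fieldNumber << 3) | wireType.
func tagSize(fieldNumber, wireType uint64) int {
	key := fieldNumber<<3 | wireType
	n := 1
	// Each varint byte carries 7 bits of payload.
	for key >= 0x80 {
		key >>= 7
		n++
	}
	return n
}

func main() {
	// Wire type 2 is "length-delimited", used for embedded messages.
	for _, f := range []uint64{1, 15, 16, 2047, 2048} {
		fmt.Printf("field %d: %d byte(s)\n", f, tagSize(f, 2))
	}
}
```

Field 15 is the highest number whose tag still fits in a single byte; field 16 already needs two.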


215 Commits