storj

Author	SHA1	Message	Date
paul cannon	79553059cb	satellite/repair: put irreparable segments in irreparableDB Previously, we were simply discarding rows from the repair queue when they couldn't be repaired (either because the overlay said too many nodes were down, or because we failed to download enough pieces). Now, such segments will be put into the irreparableDB for further and (hopefully) more focused attention. This change also better differentiates some error cases from Repair() for monitoring purposes. Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c	2020-03-09 21:45:16 +00:00
paul cannon	92d86fa044	satellite/repair: fix repair concurrency This new repair timeout (configured as TotalTimeout) will include both the time to download pieces and the time to upload pieces, as well as the time to pop the segment from the repair queue. This is a move from Github PR #3645. Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8	2020-02-24 19:57:09 +00:00
Michal Niewrzal	426c8eb31a	private/testplanet: add DeleteBucket method for uplink New method added to be able to delete easily bucket during tests. Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2	2020-02-11 15:58:13 +00:00
Michal Niewrzal	6502454947	satellite/metainfo: move RS configuration to satellite With this change RS configuration will be set on satellite. Uplink with get RS values with BeginObject request and will use it. For backward compatibility and to avoid super large change redundancy scheme stored with bucket is not touched. This can be done in future. Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff	2020-01-22 09:33:53 +00:00
Yingrong Zhao	76ee8a1b4c	satellite: remove UptimeReputation configs from codebase With the new storage node downtime tracking feature, we need remove current uptime reputation configs: UptimeReputationAlpha, UptimeReputationBeta, and UptimeReputationDQ. This is the first step of removing the uptime reputation columns from satellitedb Change-Id: Ie8fab13295dbf545e33aeda0c4306cda4ba54e36	2020-01-08 18:54:15 +00:00
Egon Elbre	2680bae88c	private/testplanet: remove dependency to uplink Remove direct dependency on uplink.RSConfig, this simplifies moving the config file without introducing weird dependencies. Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec	2020-01-02 09:40:46 +00:00
Egon Elbre	6615ecc9b6	common: separate repository Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a	2019-12-27 14:11:15 +02:00
littleskunk	8b3444e088	satellite/nodeselection: don't select nodes that haven't checked in for a while (#3567 ) * satellite/nodeselection: dont select nodes that havent checked in for a while * change testplanet online window to one minute * remove satellite reconfigure online window = 0 in repair tests * pass timestamp into UpdateCheckIn * change timestamp to timestamptz * edit tests to set last_contact_success to 4 hours ago * fix syntax error * remove check for last_contact_success > last_contact_failure in IsOnline	2019-11-15 23:43:06 +01:00
Egon Elbre	ee6c1cac8a	private: rename internal to private (#3573 )	2019-11-14 21:46:15 +02:00
paul cannon	0c025fa937	storage/: remove reverse-key-listing feature We don't use reverse listing in any of our code, outside of tests, and it is only exposed through libuplink in the lib/uplink.(*Project).ListBuckets() API. We also don't know of any users who might have a need for reverse listing through ListBuckets(). Since one of our prospective pointerdb backends can not support backwards iteration, and because of the above considerations, we are going to remove the reverse listing feature. Change-Id: I8d2a1f33d01ee70b79918d584b8c671f57eef2a0	2019-11-12 18:47:51 +00:00
littleskunk	5b20c716e6	satellite/repair: dont error on deleted segments (#3252 )	2019-10-15 05:39:28 +02:00
Jennifer Li Johnson	b185dbbee2	satellite/discovery: remove discovery related code (#3175 )	2019-10-14 10:57:01 -04:00
Stefan Benten	1db4251234	Satellite/repair: Add Repair Threshold Override to allow earlier repair (#3151 )	2019-10-02 14:58:37 +02:00
Jennifer Li Johnson	724bb44723	Remove Kademlia dependencies from Satellite and Storagenode (#2966 ) What: cmd/inspector/main.go: removes kad commands internal/testplanet/planet.go: Waits for contact chore to finish satellite/contact/nodesservice.go: creates an empty nodes service implementation satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover() satellite/peer.go: sets up contact service and endpoints storagenode/console/service.go: replaces nodeID with contact.Local() storagenode/contact/chore.go: replaces routing table with contact service storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method storagenode/contact/service.go: creates a service to return the local node and update its own capacity storagenode/monitor/monitor.go: uses contact service in place of routing table storagenode/operator.go: moves operatorconfig from kad into its own setup storagenode/peer.go: sets up contact service, chore, pingstats and endpoints satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection Removes kademlia setups in: cmd/storagenode/main.go cmd/storj-sim/network.go internal/testplane/planet.go internal/testplanet/satellite.go internal/testplanet/storagenode.go satellite/peer.go scripts/test-sim-backwards.sh scripts/testdata/satellite-config.yaml.lock storagenode/inspector/inspector.go storagenode/peer.go storagenode/storagenodedb/database.go Why: Replacing Kademlia Please describe the tests: • internal/testplanet/planet_test.go: TestBasic: assert that the storagenode can check in with the satellite without any errors TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup • satellite/contact/contact_test.go: TestFetchInfo: Tests that the FetchInfo method returns the correct info • storagenode/contact/contact_test.go: TestNodeInfoUpdated: tests that the contact chore updates the node information TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).	2019-09-19 15:56:34 -04:00
Jess G	7c203b4884	add satelliteSystem to testplanet and update tests (#3066 )	2019-09-17 13:14:49 -07:00
Yingrong Zhao	b37ea864b1	satellite/repair: delete pieces that failed piece hashes verification from pointer (#3051 ) * add test * add implementation * remove todo comments * modifies cooment * fix linting * typo oops	2019-09-16 13:13:24 -04:00
Maximillian von Briesen	7ada5d4152	satellite/metainfo: make piece hashes nil before storing pointer in metainfo.UpdatePieces() (#3050 )	2019-09-16 12:11:12 -04:00
Yingrong Zhao	95aa33c964	satellite/repair/repairer: update audit status as failed after failing piece hash verification (#2997 ) * update audit status as failed for nodes that failed piece hash verification * remove comment * fix lint error * add test * fix format * use named return value for Get * add comments * add more better comment * format	2019-09-13 12:21:20 -04:00
Natalie Villasana	a085b05ec5	satellitePeer -> satellite rename consistency in repair test (#3032 )	2019-09-12 13:16:39 -04:00
Natalie Villasana	aa3567187e	satellite/audit: worker now verifies and reverifies (#2965 )	2019-09-11 18:37:01 -04:00
Maximillian von Briesen	64602c3007	fix flaky repair tests (#2973 )	2019-09-06 15:02:01 -07:00
Maximillian von Briesen	fb10815229	Repair with hashes (#2925 ) * add outline for ECRepairer * add description of process in TODO comments * begin download/getting hash for a single piece * verify piece hash and order limit during download * fix download piece * begin filling out ESREpair. Get * wip move ecclient.Repair to ecrepairer.Repair * pass satellite signee into repairer * reconstruct original stripe from pieces * move rebuildStripe() * calculate piece size differently, increment successful count * fix shares slices initialization * rename stripeData to segment * do not pad reader in Repair() * temp debug * create unsafeRSScheme * use decode reader * rename file name to be all lowercase * make repair downloader async * declare condition variable inside Get method * set downloadAndVerifyPiece's in-memory buffer to be share size * update unusedLimits var * address comments * remove unnecessary comments * move initialization of segmentRepaire to be outside of repairer service * use ReadAll during download * remove dots and move hashing to after validating for order limit signature * wip test * make sure files exactly at min threshold are repaired * remove unused code * use corrput data and write back to storagenode * only create corrupted node and piece ids once * add comment * address nat's comment * fix linting and checker_test * update comment * add comments * remove "copied from ecclient" comments * add clarification comments in ec.Repair	2019-09-06 15:20:36 -04:00
Egon Elbre	00b2e1a7d7	all: enable staticcheck (#2849 ) * by having megacheck in disable it also disabled staticcheck * fix closing body * keep interfacer disabled * hide bodies * don't use deprecated func * fix dead code * fix potential overrun * keep stylecheck disabled * don't pass nil as context * fix infinite recursion * remove extraneous return * fix data race * use correct func * ignore unused var * remove unused consts	2019-08-22 13:40:15 +02:00
Egon Elbre	95080643b1	satellite/repair: fix data race (#2833 )	2019-08-20 17:46:39 +03:00
Egon Elbre	c8edeb0257	satellite/overlay: rename overlay.Cache to overlay.Service (#2717 )	2019-08-06 19:35:59 +03:00
Alexander Leitner	4632ab0a67	Delete irreparable segments (#2642 ) * Delete irreparable segments	2019-07-30 11:38:25 -04:00
Egon Elbre	dd7c8610bb	satellite/repair: move test files (#2649 )	2019-07-28 12:15:34 +03:00

27 Commits