storj

Author	SHA1	Message	Date
Jennifer Johnson	1c1750e6be	removes bandwidth limiting On satellite, remove all references to free_bandwidth column in nodes table. On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated. Protobuf message, NodeCapacity, is left intact for backwards compatibility. Once this is released to all satellites, we can drop the column from the DB. Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838	2020-03-04 14:04:00 +00:00
Moby von Briesen	6043d01c90	satellite/audit/verifier: add metric for number of successfully downloaded shares Change-Id: Ia4f1dc6e088db802e340aaecf80cc7ef6dc237a4	2020-02-27 14:33:59 +00:00
Egon Elbre	5342dd9fe6	go.mod: update uplink Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811	2020-02-21 16:08:12 +02:00
Cameron Ayer	b22bf16b35	satellite/overlay: add config flag for node selection free disk requirement Currently SNs report their free disk space once per hour. If a node becomes full, it has to wait until the next contact cycle begins to report; all the while receiving and failing upload requests. By increasing the minimum required disk space, we can give the storage nodes more time to report their space before they completely fill up. This change goes hand-in-hand with another change we want to implement: trigger capacity report on SN immediately upon falling below threshold. Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27	2020-02-11 18:08:25 +00:00
Michal Niewrzal	426c8eb31a	private/testplanet: add DeleteBucket method for uplink New method added to make it easy to delete a bucket during tests. Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2	2020-02-11 15:58:13 +00:00
Jeff Wendling	7999d24f81	all: use monkit v3 this commit updates our monkit dependency to the v3 version where it outputs in an influx style. this makes discovery much easier as many tools are built to look at it this way. graphite and rothko will suffer some due to no longer being a tree based on dots. hopefully time will exist to update rothko to index based on the new metric format. it adds an influx output for the statreceiver so that we can write to influxdb v1 or v2 directly. Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff	2020-02-05 23:53:17 +00:00
Egon Elbre	8dea4f52db	satellite: add control panel Change-Id: Id48246e9bcd4c6ec643277fe740937b2e42ad85b	2020-01-30 08:06:43 -05:00
Michal Niewrzal	6502454947	satellite/metainfo: move RS configuration to satellite With this change RS configuration will be set on satellite. Uplink will get RS values with the BeginObject request and will use them. For backward compatibility and to avoid a super large change, the redundancy scheme stored with the bucket is not touched. This can be done in the future. Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff	2020-01-22 09:33:53 +00:00
Yingrong Zhao	76ee8a1b4c	satellite: remove UptimeReputation configs from codebase With the new storage node downtime tracking feature, we need to remove the current uptime reputation configs: UptimeReputationAlpha, UptimeReputationBeta, and UptimeReputationDQ. This is the first step of removing the uptime reputation columns from satellitedb. Change-Id: Ie8fab13295dbf545e33aeda0c4306cda4ba54e36	2020-01-08 18:54:15 +00:00
Egon Elbre	082ec81714	uplink: move to storj.io/uplink (#3746 )	2020-01-08 15:40:19 +02:00
Egon Elbre	f41d440944	all: reduce number of log messages Remove starting up messages from peers. We expect all of them to start, if they don't, then they should return an error why they don't start. The only informative message is when a service is disabled. When doing initial database setup then each migration step isn't informative, hence print only a single line with the final version. Also use shorter log scopes. Change-Id: Ic8b61411df2eeae2a36d600a0c2fbc97a84a5b93	2020-01-06 19:03:46 +00:00
Egon Elbre	2680bae88c	private/testplanet: remove dependency to uplink Remove direct dependency on uplink.RSConfig, this simplifies moving the config file without introducing weird dependencies. Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec	2020-01-02 09:40:46 +00:00
Egon Elbre	6615ecc9b6	common: separate repository Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a	2019-12-27 14:11:15 +02:00
paul cannon	af24581ac0	satellite/audit: do not report offline to overlay (#3547 )	2019-12-18 04:51:24 -06:00
Moby von Briesen	ab777e823e	do not update pointer for failed audits Change-Id: If88dce8928db28d6f53c3dc771e14ea97aae9661	2019-12-16 10:50:54 -05:00
Egon Elbre	72d407559e	satellite/metainfo: don't leak error implementation detail (#3722 ) * satellite/metainfo: don't leak implementation detail * add missing wrap	2019-12-10 15:21:30 -05:00
Jeff Wendling	17b057b33e	satellite/audit: monitor worker function Change-Id: I94d1161deffe4ea9782abee1afbb5735f18aab44	2019-11-25 17:58:13 +00:00
Maximillian von Briesen	8653dda2b1	satellite/audit: do not contain nodes for unknown errors (#3592 ) * skip unknown errors (wip) * add tests to make sure nodes that time out are added to containment * add bad blobs store * call "Skipped" "Unknown" * add tests to ensure unknown errors do not trigger containment * add monkit stats to lockfile * typo * add periods to end of bad blobs comments	2019-11-19 17:30:28 +01:00
littleskunk	8b3444e088	satellite/nodeselection: don't select nodes that haven't checked in for a while (#3567 ) * satellite/nodeselection: dont select nodes that havent checked in for a while * change testplanet online window to one minute * remove satellite reconfigure online window = 0 in repair tests * pass timestamp into UpdateCheckIn * change timestamp to timestamptz * edit tests to set last_contact_success to 4 hours ago * fix syntax error * remove check for last_contact_success > last_contact_failure in IsOnline	2019-11-15 23:43:06 +01:00
Egon Elbre	ee6c1cac8a	private: rename internal to private (#3573 )	2019-11-14 21:46:15 +02:00
Egon Elbre	cc032d3151	satellite/metainfo: fix some uses of metainfo.Delete (#3513 ) * satellite/metainfo: rename Delete to UnsynchronizedDelete * fix deletes * make db private * fix typos * also verify on commit object	2019-11-06 18:02:14 +01:00
Maximillian von Briesen	7cdc1b351a	satellite/audit: do not audit expired segments (#3497 ) * during audit Verify, return error and delete segment if segment is expired * delete "main" reverify segment and return error if expired * delete contained nodes and pointers when pointers to audit are expired * update testplanet.Upload and testplanet.UploadWithConfig to use an expiration time of an hour from now * Revert "update testplanet.Upload and testplanet.UploadWithConfig to use an expiration time of an hour from now" This reverts commit e9066151cf84afbff0929a6007e641711a56b6e5. * do not count ExpirationDate=time.Time{} as expired	2019-11-05 20:41:48 +01:00
littleskunk	def3dcbaa9	satellite/audit: increase timeout to 5 minutes (#3480 ) * satellite/audit: increase timeout to 5 minutes * fix lint error	2019-11-05 11:21:25 +01:00
JT Olio	2c6fa3c5f8	pkg/rpc: remove read/write deadlines as a mechanism for request timeouts (#3335 ) libuplink was incorrectly setting timeouts to 10 seconds still, but should have been at least 10 minutes. the order sender was setting them to 1 hour. we don't want timeouts in uplink-side logic as it establishes a minimum rate on tcp streams. instead of all of this, just use tcp keep alive. tcp keep alive packets are sent every 15 seconds and if the peer stops responding the connection dies. this is enabled by default with go. this will kill tcp connections when they stop working. Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b	2019-10-22 17:57:24 -06:00
littleskunk	eeb38245ff	satellite/audit: improve logging (#3285 )	2019-10-16 13:48:05 +02:00
JT Olio	a5d1776539	audits: missing continue Change-Id: Ifcac8e61ebd8c59407e01c791adc60d9f88ff1b7	2019-10-11 13:55:27 -06:00
Maximillian von Briesen	784ca1582a	satellite/audit: fix audit panic (#3217 )	2019-10-09 10:06:58 -04:00
Maximillian von Briesen	3a3d576d9b	satellite/audit: add mutex to pieceHashesVerified map (#3214 ) * add mutex * remove double send to ch * lock mutex inside defer * import sync	2019-10-08 17:01:32 -04:00
littleskunk	c009543236	satellite/audit: Add piece hash verified to log messages (#3204 )	2019-10-08 12:51:57 +02:00
Maximillian von Briesen	e1b7d01160	satellite/audit: do not fail or contain nodes for audited segments that are not piece-hash-verified (#3161 )	2019-10-07 16:06:10 -04:00
Maximillian von Briesen	edadf46009	satellite/audit: delete nodes from containment when segment has changed (#3115 )	2019-09-29 04:03:15 +02:00
Jeff Wendling	098cbc9c67	all: use pkg/rpc instead of pkg/transport all of the packages and tests work with both grpc and drpc. we'll probably need to do some jenkins pipelines to run the tests with drpc as well. most of the changes are really due to a bit of cleanup of the pkg/transport.Client api into an rpc.Dialer in the spirit of a net.Dialer. now that we don't need observers, we can pass around stateless configuration to everything rather than stateful things that issue observations. it also adds a DialAddressID for the case where we don't have a pb.Node, but we do have an address and want to assert some ID. this happened pretty frequently, and now there's no more weird contortions creating custom tls options, etc. a lot of the other changes are being consistent/using the abstractions in the rpc package to do rpc style things like finding peer information, or checking status codes. Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412	2019-09-25 15:37:06 -06:00
Jennifer Li Johnson	724bb44723	Remove Kademlia dependencies from Satellite and Storagenode (#2966 ) What: cmd/inspector/main.go: removes kad commands internal/testplanet/planet.go: Waits for contact chore to finish satellite/contact/nodesservice.go: creates an empty nodes service implementation satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover() satellite/peer.go: sets up contact service and endpoints storagenode/console/service.go: replaces nodeID with contact.Local() storagenode/contact/chore.go: replaces routing table with contact service storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method storagenode/contact/service.go: creates a service to return the local node and update its own capacity storagenode/monitor/monitor.go: uses contact service in place of routing table storagenode/operator.go: moves operatorconfig from kad into its own setup storagenode/peer.go: sets up contact service, chore, pingstats and endpoints satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection Removes kademlia setups in: cmd/storagenode/main.go cmd/storj-sim/network.go internal/testplane/planet.go internal/testplanet/satellite.go internal/testplanet/storagenode.go satellite/peer.go scripts/test-sim-backwards.sh scripts/testdata/satellite-config.yaml.lock storagenode/inspector/inspector.go storagenode/peer.go storagenode/storagenodedb/database.go Why: Replacing Kademlia Please describe the tests: • internal/testplanet/planet_test.go: TestBasic: assert that the storagenode can check in with the satellite without any errors TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup • satellite/contact/contact_test.go: TestFetchInfo: Tests that the FetchInfo method returns the correct info • storagenode/contact/contact_test.go: TestNodeInfoUpdated: tests that the contact chore updates the node information TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).	2019-09-19 15:56:34 -04:00
Jess G	93788e5218	remove kademlia: create upsert query to update uptime (#2999 ) * create upsert query for check-in method * add tests * fix lint err * add benchmark test for db query * fix lint and tests * add a unit test, fix lint * add address to tests * replace print w/ b.Fatal * refactor query per CR comments * fix disqualified, only set if null * fix query * add version to updatecheckin query * fix version * fix tests * change version for tests * add version to tests * add IP, add transport, mv unit test * use node.address as arg * add last ip * fix lint	2019-09-19 11:37:31 -07:00
Maximillian von Briesen	d22987ea1d	satellite/audit: Fix flakiness in TestReverifyDifferentShare	2019-09-19 10:50:16 -04:00
Maximillian von Briesen	a4048fd529	satellite/audit: fix containment mode (#3085 ) * add test to make sure we will reverify the share in the containment db rather than in the pointer passed into reverify * use pending audit information only when running reverify	2019-09-19 01:45:15 +02:00
Jess G	7c203b4884	add satelliteSystem to testplanet and update tests (#3066 )	2019-09-17 13:14:49 -07:00
Natalie Villasana	4aaf525bd3	satellite/audit: set devDefaults for ChoreInterval and QueueInterval to 1m (#3058 )	2019-09-16 16:36:33 -04:00
Egon Elbre	7240e6cbb2	satellite: remove remote/inline file from BucketTally (#3041 )	2019-09-13 16:51:41 +03:00
Egon Elbre	8b668ab1f8	satellite/metainfo.Loop: use a parsed path for observers (#3003 )	2019-09-12 13:38:49 +03:00
Natalie Villasana	aa3567187e	satellite/audit: worker now verifies and reverifies (#2965 )	2019-09-11 18:37:01 -04:00
Egon Elbre	a801fab66a	all: add archview annotations (#2964 )	2019-09-10 16:24:16 +03:00
Natalie Villasana	6d363fb756	satellite/audit: create the audit queue, chore, and worker (#2888 )	2019-09-05 11:40:52 -04:00
Yingrong Zhao	8eda360ad3	add segment path into logs (#2898 )	2019-08-29 08:38:26 -04:00
Natalie Villasana	49303ea3ac	satellite/audit: mv ReservoirService into its own file (#2886 )	2019-08-27 13:39:51 -04:00
Ivan Fraixedes	df29699641	satellite/audit: Improve code comment in reporter (#2838 )	2019-08-22 14:13:43 +02:00
Egon Elbre	00b2e1a7d7	all: enable staticcheck (#2849 ) * by having megacheck in disable it also disabled staticcheck * fix closing body * keep interfacer disabled * hide bodies * don't use deprecated func * fix dead code * fix potential overrun * keep stylecheck disabled * don't pass nil as context * fix infinite recursion * remove extraneous return * fix data race * use correct func * ignore unused var * remove unused consts	2019-08-22 13:40:15 +02:00
Egon Elbre	2d69d47655	all: fix Error.New formatting (#2840 )	2019-08-21 19:30:29 +03:00
Natalie Villasana	243cedb628	satellite/audit: implement reservoir struct and RemoteSegment observer method (#2744 )	2019-08-21 11:49:27 -04:00
Ivan Fraixedes	87f3b6c708	satellite/audit: Improve comments in verifier (#2829 ) Improve some source code comments in the verifier.	2019-08-20 10:23:14 -04:00

56 Commits