storj

Author	SHA1	Message	Date
Jess G	8d92c288e2	satellitedb: separate migration into subcommand (#3436 ) * separate sadb migration, add version check * update checkversion to do same validation as migration * changes per CR * add sa migration to storj-sim * add different debug port in storj-sim for migration * add wait for exit for storj-sim migration * update sa docker entrypoint to support migration * storj-sim satellite parts all wait for migration * upgrade golang-migrate/migrate to v4 because bug * fix go mod tidy	2019-11-02 13:09:07 -07:00
JT Olio	41c0093e5b	drpc: enable by default (#3452 )	2019-11-01 22:43:24 +01:00
Maximillian von Briesen	590312970d	satellite/gracefulexit: add flag for enabling/disabling graceful exit on the satellite (#3437 )	2019-11-01 16:21:24 +02:00
Maximillian von Briesen	d9bb25b4b9	satellite/metainfo: support a wider range of values for RS.Total in satellite metainfo validation (#3431 ) change uplink RS default configuration from 130 to 95	2019-10-31 15:04:33 -04:00
Michal Niewrzal	015350e230	storagenode-updater: add autoupdating (#3422 )	2019-10-31 05:27:53 -07:00
Jeff Wendling	59f81a4a0d	groupcancel/ec delete: add a timeout based on completion times we used to do something similar for puts, but that ended up hurting more than it helped. since deletes are best effort, we can do it here to kill long tails or unresponsive nodes. Change-Id: I89fd2d9dcf519d76c78ddad70bc419d1868d2df1	2019-10-30 16:18:39 -06:00
Yingrong Zhao	bfa6699e2c	satellite/repair: add timeout for repair download from a single node(#3418 )	2019-10-30 16:31:08 -04:00
Natalie Villasana	4878135068	satellite/gracefulexit, storagenode/gracefulexit: add timeouts (#3407 )	2019-10-30 13:40:57 -04:00
Cameron	b2ff13f1fa	{cmd/satellite, storj/satellite}: create command to run repair process in isolation (#3341 ) * set up satellite repair run command * add separated repair process to storj-sim * add repairer peer to satellite in testplanet * move api run cmd into api.go * add satellite run repair to entrypoint	2019-10-29 10:55:57 -04:00
Ivan Fraixedes	016be4525a	internal/textcontext: Give proper name compiled bin (#3395 ) Give a proper name when a Go binary is compiled but the passed package is the current working directory which is passed as an empty string.	2019-10-28 17:54:55 +01:00
Egon Elbre	93353df4d6	internal/sync2: make Fence accept context (#3393 )	2019-10-28 16:04:31 +02:00
Bryan White	d61b3688f7	internal/version: fix `OldMinimum` to use old semver type (#3373 ) * fix version checker type assertion * more fixing	2019-10-25 13:24:23 +02:00
Bryan White	0b678c23c7	{internal/version,versioncontrol}: fix old versions (#3359 ) * fix old semver * go jenkins!	2019-10-24 22:24:28 +02:00
Yingrong Zhao	fa1ac24e19	satellite/gracefulexit: add failure threshold check (#3329 ) * add overall failure percentage check and inactive time frame check before sending a response to sno * update comment * delete node from transfer queue if it has been inactive for too long * fix linting error * add test config value * fix nil pointer * add config value into testplanet * add unit test for overall failure threshold * move timeframe threshold to chore * update protolock * add chore test * add per peiece failure count logic * change config name from EndpointMaxFailures to MaxFailuresPerPiece * address comments * fix linting error * add error handling for no row returned from progress table * fix test for graceful exit chore on storagenode * fix typo InActive -> Inactive * improve readability for failure threshold calculation * update config lock * change error handling for GetProgress in graceful exit endpoint on the satellite side * return proper rpc error in endpoint * add check in chore test for checking finish timestamp and queue	2019-10-24 12:24:42 -04:00
JT Olio	2c6fa3c5f8	pkg/rpc: remove read/write deadlines as a mechanism for request timeouts (#3335 ) libuplink was incorrectly setting timeouts to 10 seconds still, but should have been at least 10 minutes. the order sender was setting them to 1 hour. we don't want timeouts in uplink-side logic as it establishes a minimum rate on tcp streams. instead of all of this, just use tcp keep alive. tcp keep alive packets are sent every 15 seconds and if the peer stops responding the connection dies. this is enabled by default with go. this will kill tcp connections when they stop working. Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b	2019-10-22 17:57:24 -06:00
Bryan White	f468816f13	{internal/version,versioncontrol,cmd/storagenode-updater}: add rollout to storagenode updater (#3276 )	2019-10-21 12:50:59 +02:00
Bryan White	243ba1cb17	{versioncontrol,internal/version,cmd/*}: refactor version control (#3253 )	2019-10-20 09:56:23 +02:00
Egon Elbre	89ed997706	satellite/satellitedb: switch to postgres only (#3320 )	2019-10-18 22:03:10 +03:00
Natalie Villasana	855fca003d	satellite/metrics: create a metrics chore (#3263 ) * add metrics counter and chore * updates metrics observer interval release default and dev default to 15min * add more specific check for remote pointers * add Counter field to metrics chore, add counter tests * rm redundant ObjectCount suffix * make pointer check easier to read * change metrics.Config.Interval to ChoreInterval * rm unneeded var * fix comment * update satellite config lock	2019-10-16 14:08:33 -04:00
Cameron	76ad83f12c	satellite/accounting: add redis support to live accounting (#3213 ) * set up redis support in live accounting * move live.Service interface into accounting package and rename to Cache, pass into satellite * refactor Cache to store one int64 total, add IncrBy method to redis client implementation * add monkit tracing to live accounting	2019-10-16 12:50:29 -04:00
Bryan White	951c2891b9	{versioncontrol,internal/version}: add rollout to versioncontrol server (#3176 )	2019-10-16 10:16:59 +02:00
Ethan Adams	1ad2ba7e3e	storagenode/gracefulexit: Add graceful exit chore and worker. (#3262 ) Adds graceful exit chore and worker for V3-2614	2019-10-15 11:29:47 -04:00
Jess G	87a426f228	internal/testplanet: add satellite.API to testplanet (#3237 )	2019-10-14 16:01:53 -04:00
Jennifer Li Johnson	b185dbbee2	satellite/discovery: remove discovery related code (#3175 )	2019-10-14 10:57:01 -04:00
Ethan Adams	a1275746b4	satellite/gracefulexit: Implement the 'process' endpoint on the satellite (#3223 )	2019-10-11 17:18:05 -04:00
Cameron	d17be58237	remove random sleep in storagenode contact (#3243 )	2019-10-11 16:44:18 -04:00
Egon Elbre	e9c36d560f	satellite: make PointerDB an argument to satellite.New (#3233 )	2019-10-10 21:06:26 +03:00
Egon Elbre	f60a7baf17	internal/testplanet: ensure that metainfo schema gets dropped (#3229 )	2019-10-10 17:04:08 +03:00
Ethan Adams	4c4519f0be	satellite/gracefulexit: add transfer queue for pieces (#3174 ) initial impl of transfer queue updated docs represent the new design how we handle durability during exit	2019-10-07 16:38:05 -04:00
Jennifer Li Johnson	7ceaabb18e	Delete Bootstrap and Kademlia (#2974 )	2019-10-04 16:48:41 -04:00
Michal Niewrzal	b25e0154c9	internal/testplanet: use postgres for pointerDB (#3139 )	2019-10-04 07:12:21 -07:00
Maximillian von Briesen	08ed50bcaa	satellite/metainfo: add commit interval to prevent long delays between order limit creation and segment commit (#3149 )	2019-10-01 12:55:02 -04:00
Jennifer Li Johnson	755cbd4dce	storagenode/main: map aliases for kademlia config values (#3118 )	2019-09-30 19:33:00 -04:00
Jennifer Li Johnson	29b96a666b	internal/testplanet: fix conn leak (#3132 )	2019-09-27 09:47:57 -06:00
Cameron	c874dae596	internal/testplanet: ensure monitor chore is finished before contacting satellite (#3124 )	2019-09-26 16:14:39 +03:00
Jeff Wendling	098cbc9c67	all: use pkg/rpc instead of pkg/transport all of the packages and tests work with both grpc and drpc. we'll probably need to do some jenkins pipelines to run the tests with drpc as well. most of the changes are really due to a bit of cleanup of the pkg/transport.Client api into an rpc.Dialer in the spirit of a net.Dialer. now that we don't need observers, we can pass around stateless configuration to everything rather than stateful things that issue observations. it also adds a DialAddressID for the case where we don't have a pb.Node, but we do have an address and want to assert some ID. this happened pretty frequently, and now there's no more weird contortions creating custom tls options, etc. a lot of the other changes are being consistent/using the abstractions in the rpc package to do rpc style things like finding peer information, or checking status codes. Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412	2019-09-25 15:37:06 -06:00
Stefan Benten	c71f3a3f4a	internal/version: Change default endpoint to query (#3126 ) * change default domain name change default domain name to point to the new version control * Update satellite-config.yaml.lock	2019-09-25 22:55:38 +02:00
Egon Elbre	94bbb9563d	internal/testplanet: set intervals to 15s by default (#3103 )	2019-09-25 18:41:24 +03:00
Isaac Hess	580e511b4c	storagenode/storagenodedb: Migrate to separate dbs (#3081 ) * storagenode/storagenodedb: Migrate to separate dbs * storagenode/storagenodedb: Add migration to drop versions tables * Put drop table statements into a transaction. * Fix CI errors. * Fix CI errors. * Changes requested from PR feedback. * storagenode/storagenodedb: fix tx commit	2019-09-23 12:36:46 -07:00
Jennifer Li Johnson	d2502bb51b	Adds tests for kad replacement and restores kad operator configs (#3094 ) * test that all nodes can check in with all satellites * keep kademlia config * add untrusted satellite test * use getversion * remove kademlia config changes in test-sim-backwards.sh * add kademlia flags back to storj-sim storagenode * reset kademlia flags in storagenode entrypoint	2019-09-20 16:02:23 -04:00
Egon Elbre	1ed724b7a6	internal/migrate: make rebind optional (#3071 )	2019-09-20 19:26:07 +03:00
Egon Elbre	02fb891d27	internal/testplanet: reenable TestUplinksParallel (#2338 )	2019-09-20 10:45:04 +03:00
Isaac Hess	8b358ef365	internal/dbutil/sqliteutil: Fix error handling, ensure connections are closed (#3078 ) * internal/dbutil/sqliteutil: Fix error handling, ensure connections are closed * internal/dbutil/sqliteutil: Separate function to handle conn * internal/dbutil/sqliteutil: Fix names	2019-09-19 15:21:03 -06:00
Jennifer Li Johnson	724bb44723	Remove Kademlia dependencies from Satellite and Storagenode (#2966 ) What: cmd/inspector/main.go: removes kad commands internal/testplanet/planet.go: Waits for contact chore to finish satellite/contact/nodesservice.go: creates an empty nodes service implementation satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover() satellite/peer.go: sets up contact service and endpoints storagenode/console/service.go: replaces nodeID with contact.Local() storagenode/contact/chore.go: replaces routing table with contact service storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method storagenode/contact/service.go: creates a service to return the local node and update its own capacity storagenode/monitor/monitor.go: uses contact service in place of routing table storagenode/operator.go: moves operatorconfig from kad into its own setup storagenode/peer.go: sets up contact service, chore, pingstats and endpoints satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection Removes kademlia setups in: cmd/storagenode/main.go cmd/storj-sim/network.go internal/testplane/planet.go internal/testplanet/satellite.go internal/testplanet/storagenode.go satellite/peer.go scripts/test-sim-backwards.sh scripts/testdata/satellite-config.yaml.lock storagenode/inspector/inspector.go storagenode/peer.go storagenode/storagenodedb/database.go Why: Replacing Kademlia Please describe the tests: • internal/testplanet/planet_test.go: TestBasic: assert that the storagenode can check in with the satellite without any errors TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup • satellite/contact/contact_test.go: TestFetchInfo: Tests that the FetchInfo method returns the correct info • storagenode/contact/contact_test.go: TestNodeInfoUpdated: tests that the contact chore updates the node information TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).	2019-09-19 15:56:34 -04:00
Jess G	93788e5218	remove kademlia: create upsert query to update uptime (#2999 ) * create upsert query for check-in method * add tests * fix lint err * add benchmark test for db query * fix lint and tests * add a unit test, fix lint * add address to tests * replace print w/ b.Fatal * refactor query per CR comments * fix disqualified, only set if null * fix query * add version to updatecheckin query * fix version * fix tests * change version for tests * add version to tests * add IP, add transport, mv unit test * use node.address as arg * add last ip * fix lint	2019-09-19 11:37:31 -07:00
JT Olio	946ec201e2	metainfo: move api keys to part of the request (#3069 ) What: we move api keys out of the grpc connection-level metadata on the client side and into the request protobufs directly. the server side still supports both mechanisms for backwards compatibility. Why: dRPC won't support connection-level metadata. the only thing we currently use connection-level metadata for is api keys. we need to move all information needed by a request into the request protobuf itself for drpc support. check out the .proto changes for the main details. One fun side-fact: Did you know that protobuf fields 1-15 are special and only use one byte for both the field number and type? Additionally did you know we don't use field 15 anywhere yet? So the new request header will use field 15, and should use field 15 on all protobufs going forward. Please describe the tests: all existing tests should pass Please describe the performance impact: none	2019-09-19 10:19:29 -06:00
Isaac Hess	fd20fa38c6	internal/dbutil/sqliteutil: add MigrateTablesToDatabase (#3064 ) * internal/dbutil/sqliteutil: add migrator * internal/dbutil/sqliteutil: Fix errors and tablename	2019-09-17 15:42:40 -06:00
Jess G	7c203b4884	add satelliteSystem to testplanet and update tests (#3066 )	2019-09-17 13:14:49 -07:00
littleskunk	1d8cd526e0	storj-sim: correct storagenode dashboard config (#3010 )	2019-09-12 15:20:52 +03:00
Egon Elbre	8b668ab1f8	satellite/metainfo.Loop: use a parsed path for observers (#3003 )	2019-09-12 13:38:49 +03:00

407 Commits