storj

Author	SHA1	Message	Date
Jeff Wendling	17e9044c0f	pkg/rpc/rpcpeer: check both drpc and grpc for peers on a context we don't know if an incoming connection is from drpc or grpc during the migration time, so check both. Change-Id: I2418dde8b651dcc4a23726057178465224a48103	2019-11-01 17:04:53 -06:00
JT Olio	41c0093e5b	drpc: enable by default (#3452 )	2019-11-01 22:43:24 +01:00
Jennifer Li Johnson	76b64b79ba	cmd/identity: allow using redis for RevocationDB (#3259 )	2019-11-01 13:27:47 -04:00
Michal Niewrzal	8786a37f89	uplink/storage: use Batch to optimize upload requests (#3408 )	2019-10-29 08:49:16 -07:00
Ethan Adams	e54d290d2e	satellite/gracefulexit: Add signatures for success/failed exit finished messages. (#3368 ) * add signatures, fix process loop bug, move delete to on success * added tests for signatures * PR comment updates * fixed setting reason by default. * updates for PR comments * added signed failure when verificationi fails * moved to sign_test * fix panic * removed testplanet from test	2019-10-25 16:36:26 -04:00
Natalie Villasana	696c567e89	satellite/gracefulexit: add piece hash validation for successful transfer (#3313 )	2019-10-24 15:38:40 -04:00
Yingrong Zhao	fa1ac24e19	satellite/gracefulexit: add failure threshold check (#3329 ) * add overall failure percentage check and inactive time frame check before sending a response to sno * update comment * delete node from transfer queue if it has been inactive for too long * fix linting error * add test config value * fix nil pointer * add config value into testplanet * add unit test for overall failure threshold * move timeframe threshold to chore * update protolock * add chore test * add per peiece failure count logic * change config name from EndpointMaxFailures to MaxFailuresPerPiece * address comments * fix linting error * add error handling for no row returned from progress table * fix test for graceful exit chore on storagenode * fix typo InActive -> Inactive * improve readability for failure threshold calculation * update config lock * change error handling for GetProgress in graceful exit endpoint on the satellite side * return proper rpc error in endpoint * add check in chore test for checking finish timestamp and queue	2019-10-24 12:24:42 -04:00
Jeff Wendling	51d5d8656a	pkg/rpc: drpc connection pooling keep a pool of connections open when dialing for drpc. this makes it so that long lived clients (like lib/uplink's Project) don't continue to use a bad connection forever. it also allows for concurrent rpcs. Change-Id: If649b286050e4f09c413fadc3e1ce88f5fc6e600	2019-10-22 18:15:24 -06:00
JT Olio	2c6fa3c5f8	pkg/rpc: remove read/write deadlines as a mechanism for request timeouts (#3335 ) libuplink was incorrectly setting timeouts to 10 seconds still, but should have been at least 10 minutes. the order sender was setting them to 1 hour. we don't want timeouts in uplink-side logic as it establishes a minimum rate on tcp streams. instead of all of this, just use tcp keep alive. tcp keep alive packets are sent every 15 seconds and if the peer stops responding the connection dies. this is enabled by default with go. this will kill tcp connections when they stop working. Change-Id: I3d7ad49f71950b3eb43044eedf4b17993116045b	2019-10-22 17:57:24 -06:00
Ethan Adams	3e0d12354a	storagenode/gracefulexit: Implement storage node graceful exit worker - part 1 (#3322 )	2019-10-22 16:42:21 -04:00
Michal Niewrzal	04c2454c71	satellite/metainfo: pass streamID/segmentID between Batch request/response (#3311 )	2019-10-22 03:23:22 -07:00
Bryan White	f468816f13	{internal/version,versioncontrol,cmd/storagenode-updater}: add rollout to storagenode updater (#3276 )	2019-10-21 12:50:59 +02:00
Bryan White	243ba1cb17	{versioncontrol,internal/version,cmd/*}: refactor version control (#3253 )	2019-10-20 09:56:23 +02:00
Egon Elbre	f929310add	pkg/rpc/rpcstatus: fix drpc grpc compatibilty (#3306 ) When code is compiled without -tags=drpc the statuses for drpc server weren't handled, which meant an uplink using -tags=drpc didn't get the correct status code.	2019-10-17 15:21:20 -04:00
Yingrong Zhao	87e3764390	storagenode/cmd: add exit-status command for graceful exit (#3264 ) * add exit-status command * remove todo and fix format * fix status display * change startExit to exit progress * fix linting error * add successful column in exit progress * fix test * remove extra new line * fix TYPOS * format the percentage better	2019-10-15 18:07:32 -04:00
Ethan Adams	37ab84355f	satellite/gracefulexit: protobuf field name updates (#3284 ) rename piece_id to original_piece_id	2019-10-15 15:59:12 -04:00
Ethan Adams	1ad2ba7e3e	storagenode/gracefulexit: Add graceful exit chore and worker. (#3262 ) Adds graceful exit chore and worker for V3-2614	2019-10-15 11:29:47 -04:00
Marc Schubert	93d5eeda31	Update dial.go (#3261 ) What: Bring back partial nodeID to debug.trace-out Why: The information is useful for interpreting the trace file and was there up drpc. I just bring it back. https://github.com/storj/storj/blob/v0.21.3/pkg/transport/transport.go#L76 Please describe the tests: Test 1: Test 2: Please describe the performance impact: No impact.	2019-10-14 15:44:15 -06:00
JT Olio	694177e217	pkg/pb: regen gracefulexit.pb.go (#3270 )	2019-10-14 17:06:04 -04:00
Jennifer Li Johnson	b185dbbee2	satellite/discovery: remove discovery related code (#3175 )	2019-10-14 10:57:01 -04:00
JT Olio	6ede140df1	pkg/rpc: defeat MITM attacks in most cases (#3215 ) This change adds a trusted registry (via the source code) of node address to node id mappings (currently only for well known Satellites) to defeat MITM attacks to Satellites. It also extends the uplink UI such that when entering a satellite address by hand, a node id prefix can also be added to defeat MITM attacks with unknown satellites. When running uplink setup, satellite addresses can now be of the form 12EayRS2V1k@us-central-1.tardigrade.io (not even using a full node id) to ensure that the peer contacted is the peer that was expected. When using a known satellite address, the known node ids are used if no override is provided.	2019-10-12 14:34:41 -06:00
Ethan Adams	a1275746b4	satellite/gracefulexit: Implement the 'process' endpoint on the satellite (#3223 )	2019-10-11 17:18:05 -04:00
Isaac Hess	9256399872	CI: test drpc and grpc (#3163 ) * wip: test drpc * Add parallel intregration test * Add jenkinsfile.drpc * Remove unnecessary jenkinsfile items * testing: GOFLAGS=-drpc (#3236) * Use GOFLAGS * add debug * revert tags * revert changes * move goflags to the correct place * add sanity check	2019-10-11 08:30:06 -06:00
Yingrong Zhao	743a0fc38b	storagenode/cmd: create start graceful exit CLI (#3202 )	2019-10-11 09:58:12 -04:00
Ethan Adams	447c219d92	satellite/gracefulexit: Add protobuf definitions for communication between storage node and satellite (#3201 )	2019-10-08 13:42:56 -04:00
Jennifer Li Johnson	7ceaabb18e	Delete Bootstrap and Kademlia (#2974 )	2019-10-04 16:48:41 -04:00
Jeff Wendling	4fab22d691	pkg/rpc: don't leak goroutines during a drpc dial we spawned a goroutine to wait on the context's done channel sending the error afterward, but we forgot to ensure the context was eventually done, so the goroutine would be leaked until then. instead, we can just do a select on two channels to get the error rather than spawn a goroutine which makes it impossible to leak a goroutine. Change-Id: I2fdba206ae6ff7a3441b00708b86b36dfeece2b5	2019-10-04 20:09:36 +00:00
Jeff Wendling	64e43e555e	pkg/rpc: return context error if ready after DialContext fails the net package does not make it easy to know if DialContext failed because the context was done. it's important for some of our tests that canceled contexts are detected as such, so we accept the small race that's arguably correct (the context must be canceled asynchronously) to ensure we always return the context error if available. Change-Id: I058064d5c666e5353b74fb5bd300bf7abe537ff5	2019-10-04 20:09:00 +00:00
Jeff Wendling	c9e0aa7c70	pkg/kademlia: make tests run/work with drpc Change-Id: I69372fd8f0d52913e1ad2cf7d01115460ba8eeda	2019-10-03 15:33:25 -06:00
littleskunk	b2e328f118	storagenode/dashboard: update online status (#3168 )	2019-10-03 20:31:39 +02:00
Isaac Hess	94c7df0d6e	pkg/rpc/rpcstatus: Fix return type (#3162 )	2019-10-02 14:46:18 -06:00
Jennifer Li Johnson	29b96a666b	internal/testplanet: fix conn leak (#3132 )	2019-09-27 09:47:57 -06:00
Jeff Wendling	93349f247e	pkg/rpc: add WithInsecure when doing non-tls dials Change-Id: I993f223f4ac78824b75a7725342ebf2ae0f74254	2019-09-27 09:07:14 -06:00
Bryan White	c8aa821ccb	pkg/certificates: move certificate package to root (#3107 )	2019-09-26 09:11:05 -07:00
Jeff Wendling	098cbc9c67	all: use pkg/rpc instead of pkg/transport all of the packages and tests work with both grpc and drpc. we'll probably need to do some jenkins pipelines to run the tests with drpc as well. most of the changes are really due to a bit of cleanup of the pkg/transport.Client api into an rpc.Dialer in the spirit of a net.Dialer. now that we don't need observers, we can pass around stateless configuration to everything rather than stateful things that issue observations. it also adds a DialAddressID for the case where we don't have a pb.Node, but we do have an address and want to assert some ID. this happened pretty frequently, and now there's no more weird contortions creating custom tls options, etc. a lot of the other changes are being consistent/using the abstractions in the rpc package to do rpc style things like finding peer information, or checking status codes. Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412	2019-09-25 15:37:06 -06:00
Bryan White	a7040647a4	run certificate authorization endpoint (#3108 )	2019-09-23 15:19:13 -07:00
Jeff Wendling	d32d85a717	pkg/listenmux: resolve deadlock in test it was possible, because we spawned Run before we did any calls to Route, that the listenmux would send multiple connections to the default listener. Fix that by ensuring we call Route before we call Run. Change-Id: Ie8fd754997975969a99fd2a3f8d3010c24cdc73d	2019-09-20 21:16:59 +00:00
Jeff Wendling	a20a7db793	pkg/rpc: build tag based selection of rpc details It provides an abstraction around the rpc details so that one can use dprc or gprc with the same code. It subsumes using the protobuf package directly for client interfaces as well as the pkg/transport package to perform dials. Change-Id: I8f5688bd71be8b0c766f13029128a77e5d46320b	2019-09-20 21:07:33 +00:00
Jennifer Li Johnson	724bb44723	Remove Kademlia dependencies from Satellite and Storagenode (#2966 ) What: cmd/inspector/main.go: removes kad commands internal/testplanet/planet.go: Waits for contact chore to finish satellite/contact/nodesservice.go: creates an empty nodes service implementation satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover() satellite/peer.go: sets up contact service and endpoints storagenode/console/service.go: replaces nodeID with contact.Local() storagenode/contact/chore.go: replaces routing table with contact service storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method storagenode/contact/service.go: creates a service to return the local node and update its own capacity storagenode/monitor/monitor.go: uses contact service in place of routing table storagenode/operator.go: moves operatorconfig from kad into its own setup storagenode/peer.go: sets up contact service, chore, pingstats and endpoints satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection Removes kademlia setups in: cmd/storagenode/main.go cmd/storj-sim/network.go internal/testplane/planet.go internal/testplanet/satellite.go internal/testplanet/storagenode.go satellite/peer.go scripts/test-sim-backwards.sh scripts/testdata/satellite-config.yaml.lock storagenode/inspector/inspector.go storagenode/peer.go storagenode/storagenodedb/database.go Why: Replacing Kademlia Please describe the tests: • internal/testplanet/planet_test.go: TestBasic: assert that the storagenode can check in with the satellite without any errors TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup • satellite/contact/contact_test.go: TestFetchInfo: Tests that the FetchInfo method returns the correct info • storagenode/contact/contact_test.go: TestNodeInfoUpdated: tests that the contact chore updates the node information TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).	2019-09-19 15:56:34 -04:00
Jess G	93788e5218	remove kademlia: create upsert query to update uptime (#2999 ) * create upsert query for check-in method * add tests * fix lint err * add benchmark test for db query * fix lint and tests * add a unit test, fix lint * add address to tests * replace print w/ b.Fatal * refactor query per CR comments * fix disqualified, only set if null * fix query * add version to updatecheckin query * fix version * fix tests * change version for tests * add version to tests * add IP, add transport, mv unit test * use node.address as arg * add last ip * fix lint	2019-09-19 11:37:31 -07:00
Kaloyan Raev	45df0c5340	storagenode/process: respond to Windows Service events (#3025 )	2019-09-19 19:37:40 +03:00
JT Olio	946ec201e2	metainfo: move api keys to part of the request (#3069 ) What: we move api keys out of the grpc connection-level metadata on the client side and into the request protobufs directly. the server side still supports both mechanisms for backwards compatibility. Why: dRPC won't support connection-level metadata. the only thing we currently use connection-level metadata for is api keys. we need to move all information needed by a request into the request protobuf itself for drpc support. check out the .proto changes for the main details. One fun side-fact: Did you know that protobuf fields 1-15 are special and only use one byte for both the field number and type? Additionally did you know we don't use field 15 anywhere yet? So the new request header will use field 15, and should use field 15 on all protobufs going forward. Please describe the tests: all existing tests should pass Please describe the performance impact: none	2019-09-19 10:19:29 -06:00
Jess G	695de9dcd7	rm noisy debug logs that we dont need (#3083 )	2019-09-18 12:43:57 -07:00
Egon Elbre	186e67e056	pkg/transport: set default timeout to 10 minutes (#3075 )	2019-09-18 11:56:23 -04:00
Maximillian von Briesen	574c96c350	satellite/metainfo: Verify storagenode signature on satellite upload (#2985 )	2019-09-18 09:50:33 -04:00
Jess G	7c203b4884	add satelliteSystem to testplanet and update tests (#3066 )	2019-09-17 13:14:49 -07:00
Natalie Villasana	cc70cd2329	satellite/repair: add metric trackers for segment age before repair (#3056 )	2019-09-17 15:18:48 -04:00
Ivan Fraixedes	febe32bc7a	pkg/miniogw: Add a missed stack trace error (#3035 ) Add the stack trace to an error returned by one of the functions for being able to track down the origin in case the error happens and gets logged.	2019-09-16 23:00:50 +03:00
Jess G	d3ef574b20	pkg/pb: minor changes to contact.proto (#3048 ) * minor fixes to contact proto * simply and rm nodeAddr object from client	2019-09-13 19:37:32 -05:00
Andrew Harding	f550ab5d1c	Uplink "import" command (#2981 ) * uplink import cmd * pkg/process: fix import order * fix golangci-lint failures * remove "help" from the satellite config lock file	2019-09-13 12:33:30 -06:00
Jeff Wendling	0dcbd3dc08	bootstrap/satellite/certificate/storagenode: register drpc services Change-Id: Id29f14b76a8c9cb2be31001b9a7a4356a4bda183	2019-09-12 15:09:46 -06:00
Jeff Wendling	007662d49e	pkg/server: serve drpc listeners along with grpc The fundamental problem is that both drpc and grpc servers want to close the listener and they both want to ignore the error from Accept after the listener is closed. There's no way to do this in a race free way. Fortunately, the mux hands out listeners that can be independently closed. That means they can both do their own shutdown logic where they ignore the error, and then after they're closed, the code orchestrating the servers can close the listeners. The final weird bit is that the server's Close method is required to wait until the Run method has exited (or at least enough for the listeners to definitely be closed) because tests depend on that behavior, so we have to add some channels/mutexes/onces to ensure that Run has exited and that a new call can't start after Close is called. Change-Id: I7c4ef293f7963f83138815f51824fd5b8d09ce15	2019-09-12 19:18:30 +00:00
Jeff Wendling	477b47f554	pkg/server: use a listenmux with nothing registered Change-Id: I25577b4afb907f4f8b57fc0428de6e6ea4ce9ba9	2019-09-12 19:12:53 +00:00
paul cannon	c139ed8ea1	storagenode/console: remove kademlia (#2942 ) this is a trivial operation for storagenode/console, as it doesn't really need or use kademlia in the first place. What: Removes kademlia from storagenode/console Why: We are in the process of getting rid of kademlia, and this is one place where it's particularly easy. Please describe the tests: Existing tests exercise storagenode/console behavior; if they continue to work, everything here should be tested satisfactorily. Please describe the performance impact: None	2019-09-11 16:41:43 -04:00
Jeff Wendling	708c95d044	pkg/listenmux: multiplex listener based on first bytes Change-Id: If96bfd216e55f9950a42ab5be712a3cdec257a10	2019-09-11 11:16:09 -06:00
Bryan White	6c80f01bf0	pkg/certificates: add authorization endpoint and refactor (#2971 )	2019-09-11 10:36:44 +02:00
paul cannon	7cf5650560	pkg/certificate: properly close certificateclient.Client (#2986 )	2019-09-10 19:24:41 +03:00
Michal Niewrzal	64c467ffe7	uplink: integrate new Metainfo calls (#2640 )	2019-09-10 08:39:47 -07:00
Michal Niewrzal	0ccae6b061	cmd: windows log file workaround (#2979 )	2019-09-10 12:35:59 +03:00
Egon Elbre	e03e6844f8	pkg/process: reduce noise in storj-sim (#2988 ) * Don't display stack for failure to start debug endpoints * Don't display stack for disabled telemetry	2019-09-10 10:32:53 +03:00
Egon Elbre	40ca660c06	all: use min tls 1.2 for grpc (#2967 )	2019-09-09 23:09:01 +03:00
Jeff Wendling	60eba990eb	use-drpc: use protoc-gen-drpc to generate protobufs Change-Id: I5c23256068e30864022dba5137c499796ab9d6ad	2019-09-06 13:28:27 -06:00
Egon Elbre	a3e0955e16	satellite/satellitedb: ensure that we process orders in order (#2950 ) When transactions are handled in different orders there is a potential for a deadlock.	2019-09-06 17:49:30 +03:00
Bryan White	1fc0c63a1d	{cmd,pkg}/certificates: service refactor (#2938 )	2019-09-05 17:11:21 +02:00
Jess G	f7bae57e5b	pk/pb: add initial proto for the new contact endpoints (#2948 ) * add init contact proto * make 2 svcs * add geenerated proto code * fux conflict with NodeRequest name * use correct version on proto * rm extra node fields * add protolock * update field names so they are better * rm node id since we dont need it	2019-09-04 10:07:06 -07:00
Ivan Fraixedes	d7d6e23a3e	pkg/server: Don't use Sugared logger (#2935 ) Replace the usage of the Sugared logger by the normal one.	2019-09-03 11:39:26 +02:00
JT Olio	48ba45eb55	pkg/process prometheus: forgot digits (#2920 ) Change-Id: I1825d31dedcce45a80663d27261e2db0cb14c0f1	2019-08-29 17:51:03 -06:00
Jess G	b20dcfd64c	add type to prometheus metrics (#2916 )	2019-08-29 23:42:38 +02:00
JT Olio	b3f9a8813d	pkg/process: remove prometheus help (#2914 ) the current prometheus help messages have enough unexpected characters that they are breaking prometheus parsing. they may also be triggering prometheus to expect more from us (type annotations) than we have to offer. we're really not adding a lot of value with these help messages, so just take them out Change-Id: I9b723447a294bb492a6292480e9f88634346a80b	2019-08-29 12:42:11 -07:00
Egon Elbre	c309bd3fec	lint: add linting for errs package (#2881 )	2019-08-27 19:07:12 +03:00
Bill Thorp	a250551b6d	storagenode/piecestore + uplink/piecestore: return `PieceHash` and original `OrderLimit` during GET_REPAIR (#2775 )	2019-08-26 14:57:41 -04:00
Yaroslav Vorobiov	2ae4129d06	satellite/nodestats: add disqualified flag #2856	2019-08-23 13:58:20 +03:00
JT Olio	12d50ebb99	streams: don't encrypt segment count (#2859 ) What: this change makes sure the count of segments is not encrypted. Why: having the segment count encrypted just makes things hard for no reason - a satellite operator can figure out how many segments an object has by looking at the other segments in the database. but if a user has access but has lost their encryption key, they now can't clean up or delete old segments because they can't know how many there are without just guessing until they get errors. :( Backwards compatibility: clients will still understand old pointers and will still write old pointers. at some point in the future perhaps we can do a migration for remaining old pointers so we can delete the old code. Please describe the tests: covered by existing tests Please describe the performance impact: none	2019-08-22 15:15:58 -06:00
Egon Elbre	00b2e1a7d7	all: enable staticcheck (#2849 ) * by having megacheck in disable it also disabled staticcheck * fix closing body * keep interfacer disabled * hide bodies * don't use deprecated func * fix dead code * fix potential overrun * keep stylecheck disabled * don't pass nil as context * fix infinite recursion * remove extraneous return * fix data race * use correct func * ignore unused var * remove unused consts	2019-08-22 13:40:15 +02:00
Egon Elbre	9ec0ceddf3	pkg/revocation: ensure we close revocation databases (#2825 )	2019-08-20 18:04:17 +03:00
Bryan White	25f0b13980	pkg/peertls: extension handling refactor (#2831 )	2019-08-20 17:15:43 +03:00
Egon Elbre	3c41794429	pkg/kademlia: clearer error message (#2824 )	2019-08-20 12:54:33 +03:00
Isaac Hess	25154720bd	lib/uplink: remove redis and bolt dependencies (#2812 ) * identity: remove redis and bolt dependencies * identity: move revDB creation to main files	2019-08-19 16:10:38 -06:00
ethanadams	1a69ec8318	satellite/orders: document protocol and fix typos (#2813 ) * Addressing comments from PR 2762 * Rebuild of orders.pb.go after comments added to proto file * run update-satellite-config-lock for spelling fix.	2019-08-19 09:36:11 -04:00
Ivan Fraixedes	e47b8ed131	storagenode: No FATAL error when unsent orders aren't found (#2801 ) * pkg/process: Fatal show complete error information Change the general process execution function to not using the sugared logger for outputting the full error information. Delete some unreachable code because Zap logger Fatal method calls exit 1 internally. * storagenode/storagenodedb: Add info to error Add more information to an error returned due to some data inconsistency. * storagenode/orders: Don't use sugared logger Don't use sugar logger and provide better contextualized error messages in settle method. * storagenode/orders: Add some log fields to error msgs Add some relevant log fields to some logged errors of the sender settle method. * satellite/orders: Remove always nil error from debug Remove an error which as logged in debug level which was always nil and makes the logic that used this variable clear. * storagenode/orders: Don't return error Archiving unsent Don't stop the process which archive unsent orders if some of them aren't found the DB because it cause the Storage Node to stop with a fatal error.	2019-08-16 16:53:22 +02:00
Jeff Wendling	2dab0ab466	pkg/process: only propagate missing keys to flags (#2784 ) this avoids a problem where setting on a flag isn't sufficient to express complex data structures like []string. Change-Id: I06f13656996d658b4c7a957451cb253728a67eda	2019-08-14 14:41:41 -06:00
Egon Elbre	fec10ccbd5	cmd/certificates: ensure we can bind config values (#2779 )	2019-08-14 17:26:45 +03:00
aligeti	32f95a14fd	satellite/certdb: remove certdb that was used to store uplink certificates (#2760 ) * satellitedb/certDB: refactors of the node certificate storage DB table The existing implementation doesnt allow to store the complete certificate chain of uplinkIDs or storagenodeIDs, so the current table is dropped and new table will be added which addresses the storage and retrieval of certificates pkg/identity: fixes spelling mistakes that I missed on PR#2754 Fixes V3-1992/V3-2388	2019-08-12 10:41:34 -04:00
aligeti	7af05177e2	pkg/identity: support encode and decode functionality of Peer Identity information	2019-08-09 15:23:29 -04:00
Kaloyan Raev	9dccf59e8e	Restrict node info only for trusted satellites (#2737 )	2019-08-09 12:21:41 +03:00
Yaroslav Vorobiov	28a7778e9e	storagenode/nodestats: cache node stats (#2543 )	2019-08-08 16:47:04 +03:00
paul cannon	17bdb5e9e5	move piece info into files (#2629 ) Deprecate the pieceinfo database, and start storing piece info as a header to piece files. Institute a "storage format version" concept allowing us to handle pieces stored under multiple different types of storage. Add a piece_expirations table which will still be used to track expiration times, so we can query it, but which should be much smaller than the pieceinfo database would be for the same number of pieces. (Only pieces with expiration times need to be stored in piece_expirations, and we don't need to store large byte blobs like the serialized order limit, etc.) Use specialized names for accessing any functionality related only to dealing with V0 pieces (e.g., `store.V0PieceInfo()`). Move SpaceUsed- type functionality under the purview of the piece store. Add some generic interfaces for traversing all blobs or all pieces. Add lots of tests.	2019-08-07 20:47:30 -05:00
Michal Niewrzal	de7dddbe59	metainfo: Batch request (#2694 )	2019-08-06 16:56:23 +02:00
JT Olio	538e8168a5	pkg/process: support json trace output (#2713 ) Change-Id: Id5a83a8fe23d19d92f5c2622acb0642b4e8f72a2	2019-08-05 12:28:41 -06:00
Jeff Wendling	21a3bf89ee	cmd/uplink: use scopes to open (#2501 ) What: Change cmd/uplink to use scopes It moves the fields that will be subsumed by scopes into an explicit legacy section and hides their configuration flags. Why: So that it can read scopes in from files and stuff	2019-08-05 11:01:20 -06:00
Michal Niewrzal	688d932d93	Make one implementation for SetAttribution/SetBucketAttribution (#2683 )	2019-08-05 09:07:40 +02:00
JT Olio	28156d3573	storagenode: more live request tracking (#2699 ) * storagenode/piecestore: track live requests together Change-Id: I9ed44e4484b97bcbe076c222450c3449fe8b1075 * show grpc status codes in monkit failures Change-Id: I68bc3a8d24a372e8147ef2a74636fc3e40fa799a * small nit Change-Id: I722b09345377b079e41c5a3dc86d7fd6232c9d24	2019-08-02 16:49:39 -06:00
aligeti	471b46c0cd	adding more info the error logs in kademlia (#2668 )	2019-08-02 09:54:10 -04:00
Egon Elbre	369a51ed00	lib/uplink: ensure it's silent by default (#2676 )	2019-08-01 07:14:09 -04:00
Michal Niewrzal	287fdf9936	Integrate new Metainfo calls (server side) (#2682 )	2019-08-01 11:04:31 +02:00
Egon Elbre	4f0d39cc64	don't use global loggers (#2675 )	2019-07-31 17:38:44 +03:00
Egon Elbre	ec3d5c0bdd	don't use global loggers (#2671 ) * pkg/server: don't use global logger * satellite/overlay: use correct logger * pkg/kademlia: use correct logger * linksharing: use conventional way to pass in logger * use zaptest in tests	2019-07-31 15:09:45 +03:00
Egon Elbre	9ba8b53ed5	pkg/auth: use grpc.WithPerRPCCredentials (#2670 )	2019-07-31 13:57:13 +02:00
Jennifer Li Johnson	338c1b2265	adds comments (#2666 )	2019-07-30 14:08:29 -04:00
Egon Elbre	dd7c8610bb	satellite/repair: move test files (#2649 )	2019-07-28 12:15:34 +03:00
Egon Elbre	5d0816430f	rename all the things (#2531 ) * rename pkg/linksharing to linksharing * rename pkg/httpserver to linksharing/httpserver * rename pkg/eestream to uplink/eestream * rename pkg/stream to uplink/stream * rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo * rename pkg/auth/signing to pkg/signing * rename pkg/storage to uplink/storage * rename pkg/accounting to satellite/accounting * rename pkg/audit to satellite/audit * rename pkg/certdb to satellite/certdb * rename pkg/discovery to satellite/discovery * rename pkg/overlay to satellite/overlay * rename pkg/datarepair to satellite/repair	2019-07-28 08:55:36 +03:00
Kaloyan Raev	734c793deb	Add UpdatePieces method to Metainfo service (#2572 )	2019-07-25 19:59:46 +03:00
Egon Elbre	0cdeae1922	add missing error handling (#2630 )	2019-07-25 17:01:44 +02:00
paul cannon	b9a17913fa	storagenode/pieces: remove buffering from reading/writing and fix io.EOF bug (#2554 )	2019-07-25 11:22:15 +03:00
Natalie Villasana	f11413bc8e	Implement garbage collection on satellite (#2577 ) * Added a gc package at satellite/gc, which contains the gc.Service, which runs garbage collection integrated with the metainfoloop, and the gc PieceTracker, which implements the metainfo loop Observer interface and stores all of the filters (about which pieces are good) for each node. * Added a gc config located at satellite/gc/service.go (loop disabled by default in release) * Creates bloom filters with pieces to be retained inside the metainfo loop * Sends RetainRequests (or filters with good piece ids) to all storage nodes.	2019-07-24 13:26:43 -04:00
Isaac Hess	b8fe349816	Support empty path components (#2574 )	2019-07-24 08:40:22 -06:00
Michal Niewrzal	5710dc3a32	Metainfo RPC segment methods (part 2) (#2616 )	2019-07-24 13:33:23 +02:00
Ivan Fraixedes	3c8f1370d2	[v3 2137] - Add more info to find out repair failures (#2623 ) * pkg/datarepair/repairer: Track always time for repair Make a minor change in the worker function of the repairer, that when successful, always track the metric time for repair independently if the time since checker queue metric can be tracked. * storage/postgreskv: Wrap error in Get func Wrap the returned error of the Get function as it is done when the query doesn't return any row. * satellite/metainfo: Move debug msg to the right place NewStore function was writing a debug log message when the DB was connected, however it was always writing it out despite if an error happened when getting the connection. * pkg/datarepair/repairer: Wrap error before logging it Wrap the error returned by process which is executed by the Run method of the repairer service to add context to the error log message. * pkg/datarepair/repairer: Make errors more specific in worker Make the error messages of the "worker" method of the Service more specific and the logged message for such errors. * pkg/storage/repair: Improve error reporting Repair In order of improving the error reporting by the pkg/storage/repair.Repair method, several errors of this method and functions/methods which this one relies one have been updated to be wrapper into their corresponding classes. * pkg/storage/segments: Track path param of Repair method Track in monkit the path parameter passed to the Repair method. * satellite/satellitedb: Wrap Error returned by Delete Wrap the error returned by repairQueue.Delete method to enhance the error with a class and stack and the pkg/storage/segments.Repairer.Repair method get a more contextualized error from it.	2019-07-23 16:28:06 +02:00
Ivan Fraixedes	79e9b62f6d	pkg/storage/segments: Clarify logic in Repair method (#2621 ) Create a new variable rather than reusing the existing one because the name of the existing one is confusing when reading the logic and it requires more time that the logic doesn't have a bug.	2019-07-23 14:16:32 +02:00
Michal Niewrzal	cba008d7df	Add GetObject method to Metainfo (#2611 )	2019-07-23 13:09:12 +02:00
Jennifer Li Johnson	53d96be44a	Stylistic Go Cleanup (#2524 )	2019-07-22 15:10:04 -04:00
Michal Niewrzal	6f2b85603d	Metainfo RPC segment methods (part 1) (#2567 )	2019-07-22 16:45:18 +02:00
Egon Elbre	63c1a050fc	pkg/dht: remove (#2599 )	2019-07-19 20:34:00 +03:00
Egon Elbre	5fe95fe8fa	kademlia: restrict maximum find near query (#2598 )	2019-07-19 18:46:08 +02:00
aligeti	29b576961f	value attribution merge fix and more test cases (#2588 ) * value attribution merge fix and more test cases	2019-07-19 11:17:34 -04:00
Bill Thorp	a7cc940776	Nodes should not be able to fail the same audit multiple times (#2404 ) update pointer on audit failure	2019-07-18 14:08:15 -04:00
Jess G	3af9250659	update irreparableDB.GetLimited query to use where instead of offset (#2585 ) * update query to use where instead of offset, update tests * update cmd/inspector irreparable * add comment for offset	2019-07-18 09:21:21 -07:00
Simon Guindon	91f0adef10	Add the ability to set dial and request timeouts from the cmd/uplink CLI to libuplink. (#2439 ) * Added the ability to pass timeout settings from cmd/uplink to libuplink. * Removed commented out code. * Updated 2min timeouts for the uplink CLI. * Removed comment. * Made transport defaultDialTimeout and defaultRequestTimeout public * Added comments to describe where these defaults apply. * Added a new defaults to libuplink and added tests. * Added a new defaults to libuplink and added tests.	2019-07-18 11:13:59 -04:00
Andrew Harding	416fa80e85	Link Sharing Service (#2431 ) Link sharing service. See `docs/design/link-sharing-service.md` for the design and `cmd/linksharing/README.md` for operational instructions.	2019-07-18 06:26:09 -06:00
Kaloyan Raev	0931f0e71b	Log fatal errors with respective severity (#2581 )	2019-07-17 14:39:10 -04:00
Egon Elbre	f6f65a80d7	storagenode/trust: implement fetching peer identity without kademlia and endpoint (#2584 )	2019-07-17 21:14:44 +03:00
Michal Niewrzal	260d9c49a8	Metainfo RPC objects methods (#2534 )	2019-07-16 12:39:23 +02:00
Cameron	8b2d46a974	add vouchers to QueryRequest (#2559 )	2019-07-15 18:02:22 -04:00
Jeff Wendling	3a34a0df7b	repair: fix data race in reliability cache (#2561 )	2019-07-15 22:58:39 +03:00
Egon Elbre	002d9748ec	signing: ensure we don't break signatures (#2542 )	2019-07-12 16:41:19 -04:00
aligeti	daa3b32ee2	Add Attribution Columns to appropriate tables for OSPP referral link (#2516 ) * adds "partner_id" column to user, project, api_key & bucket_metainfo tables	2019-07-12 13:59:19 -04:00
Jess G	07b23b7cc3	update tally to rm bucket count (#2539 ) * update tally to rm bucket count * rm old comment	2019-07-12 12:19:45 -04:00
Alexander Leitner	64b2769de3	discovery: parallelize refresh (#2535 ) * parallelize discovery refresh * add paginateQualifiedtest, address pr comments * Remove duplicate uptime update * Lower concurrency in Testplanet for discovery	2019-07-12 10:35:48 -04:00
Jess G	f11bf46a11	Jg/1967 mv bucket metadata uplink (#2505 ) * add bucketstore, add init uplink bucket * update uplink to use bucket rpc * fix tests * wrap metainfo client errors * add allowedBucket struct, fix tests * update comment * add paging * updates per CR * add test for pagination * fix lint * fix uplink test so its easier tyo understand * fix gateway pagination bug * changes per cr * fix bug w/allowedBuckets, add test to catch	2019-07-12 08:57:02 -04:00
Ivan Fraixedes	f420b29d35	[V3-1927] Repairer uploads to max threshold instead of success… (#2423 ) * pkg/datarepair: Add test to check num upload pieces Add a new test for ensuring the number of pieces that the repair process upload when a segment is injured. * satellite/orders: Don't create "put order limits" over total Repair must not create "put order limits" more than the total count. * pkg/datarepair: Update upload repair pieces test Update the test which checks the number of pieces which are uploaded during a repair for using the same excess over the success threshold value than the implementation. * satellites/orders: Limit repair put order for not being total Limit the number of put orders to be used by repair for only uploading pieces to a % excess over the successful threshold. * pkg/datarepair: Change DataRepair test to pass again Make some changes in the DataRepair test to make pass again after the repair upload repaired pieces only until a % excess over success threshold. Also update the steps description of the DataRepair test after it has been changed, to match on what's now, besides to leave it more generic for avoiding having to update it on minimal future refactorings. * satellite: Make repair excess optimal threshold configurable Add a new configuration parameter to the satellite for being able to configure the percentage excess over the optimal threshold, used for determining how many pieces should be repaired/uploaded, rather than having the value hard coded. * repairer: Add configurable param to segments/repairer Add a new parameters to the segment/repairer to calculate the maximum number of excess nodes, based on the optimal threshold, that repaired pieces can be uploaded. This new parameter has been added for not returning more nodes than the number of upload orders for data repair satellite service calculate for repairing pieces. * pkg/storage/ec: Update log message in clien.Repair * satellite: Update configuration lock file	2019-07-12 00:44:47 +02:00
Michal Niewrzal	8a9c63809b	tests for PiecePublicKey/PiecePrivateKey (#2526 )	2019-07-11 17:11:04 -04:00
Egon Elbre	d52f764e54	protocol: implement new piece signing and verification (#2525 )	2019-07-11 16:51:40 -04:00
Maximillian von Briesen	8b507f3d73	Address concerns with storagenode Retain endpoint (#2527 )	2019-07-11 16:04:21 -04:00
Michal Niewrzal	268c629ba8	Replace base64 encoding for path segments (#2345 )	2019-07-11 13:26:07 -04:00
Kaloyan Raev	0f36b160fc	Higher dial timeout for TestDownloadSharesDownloadTimeout (#2500 )	2019-07-10 17:45:09 -04:00
Maximillian von Briesen	de85d17069	Add checker metrics (#2487 ) checker_segment_total_count - Number of total segments in pointer during checker iteration checker_segment_healthy_count - Number of healthy segments in pointer during checker iterationn time_since_checker_queue - Seconds elapsed between checker queue and beginning repair time_for_repair - Seconds elapsed between beginning repair and ending repair/dequeueing	2019-07-10 17:27:46 -04:00
Jeff Wendling	10547cc1ea	segments: send in the object path to the initial CreateSegment… (#2518 ) otherwise, api key restictions will fail because we look like we're asking to put to the bucket metadata path.	2019-07-10 11:33:55 -04:00
Cameron	c29f033e7d	Move kademlia dialer into separate package (#2466 ) * move kademlia.Dialer into kademliaclient package	2019-07-10 10:36:37 -04:00
JT Olio	a79c7d77f3	overlay cache: slight modification of node-is-online rules (#2490 )	2019-07-09 22:36:09 -04:00
Alexander Leitner	1c5db71faf	Change protobuf expirations to use time.Time (#2509 ) * Change protobuf expirations to use time.Time instead of timestamp.Timestamp	2019-07-09 17:54:00 -04:00
Jeff Wendling	671648012a	Update password based key derivation design doc (#2367 )	2019-07-09 00:07:03 -04:00
Ivan Fraixedes	50d601ab07	pkg/identity: Use identity error class (#2488 )	2019-07-08 22:52:52 -04:00
JT Olio	887b784f54	discovery: use fetch info to ping (#2491 )	2019-07-08 22:10:17 -04:00
Jess G	f9696d6c5e	satellite/metainfo: add buckets RPC and database (#2460 ) * add db interface and methods, add sa metainfo endpoints and svc * add bucket metainfo svc funcs * add sadb bucekts * bucket list gets all buckets * filter buckets list on macaroon restrictions * update pb cipher suite to be enum * add conversion funcs * updates per comments * bucket settings should say default * add direction to list buckets, add tests * fix test bucket names * lint err * only support forward direction * add comments * minor refactoring * make sure list up to limit * update test * update protolock file * fix lint * change per PR	2019-07-08 15:32:18 -07:00
Alexander Leitner	3587e1a579	Change pointerdb pointer to use time.Time for Creation date (#2483 )	2019-07-09 00:16:50 +02:00
Egon Elbre	674742d1a7	satellite/datarepair: use reliability cache (#1976 )	2019-07-09 01:04:35 +03:00
Ivan Fraixedes	a786e4249c	pkg: Align errs Class messages (#2485 ) Align 2 errs Class messages with the rest of them.	2019-07-08 17:37:12 -04:00
Alexander Leitner	19ab9852f2	Update node.proto to use time.Time instead of timestamp (#2482 )	2019-07-08 14:24:42 -04:00
Alexander Leitner	dcf8e2936b	Update vouchers to use time.Time instead of timestamp (#2478 ) * Update vouchers to use time.Time instead of timestamp	2019-07-08 13:07:30 -04:00
Simon Guindon	8d15d774b6	Removing trace of encodedPiece.Read() because it's fast. (#2477 ) * Removing trace of encodedPiece.Read() because it's fast. * Added comment about why tracing was removed.	2019-07-08 11:16:13 -04:00

1 2 3 4 5 ...

1371 Commits