storj

Author	SHA1	Message	Date
Jennifer Johnson	1c1750e6be	removes bandwidth limiting On satellite, remove all references to free_bandwidth column in nodes table. On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated. Protobuf message, NodeCapacity, is left intact for backwards compatibility. Once this is released to all satellites, we can drop the column from the DB. Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838	2020-03-04 14:04:00 +00:00
Cameron Ayer	7244a6a84e	storagenode/{contact, piecestore}: implement low disk notification with cooldown When a storagenode begins to run low on capacity, we want to notify the satellite before completely running out of space. To achieve this, at the end of an upload request, the SN checks if its available space has fallen below a certain threshold. If so, trigger a notification to the satellites. The new NotifyLowDisk method on the monitor chore is implemented using the common/sync2.Cooldown type, which allows us to execute contact only once within a given timeframe, avoiding hammering the satellites with requests. This PR contains changes to the storagenode/contact package, namely moving methods involving the actual satellite communication out of Chore and into Service. This allows us to ping satellites from the monitor chore. Change-Id: I668455748cdc6741291b61130d8ef9feece86458	2020-03-03 10:45:37 -05:00
Qweder93	484ec7463a	storagenode: notifications on outdated software version Change-Id: If19b075c78a7b2c441e11b783c3c09fed55060c7	2020-03-02 16:48:02 +00:00
Egon Elbre	64330c55b3	all: use pbgrpc common/pb moved grpc to a separate package common/pb/pbgrpc. This updates this repository to use it. Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e	2020-02-26 21:27:47 +02:00
Cameron Ayer	d578102672	storagenode/piecestore: add workgroup to endpoint to prevent stray goroutine after shutdown Change-Id: Ie8444c3c8f870745b73342de2e9a93027fcad371	2020-02-24 21:38:52 +00:00
Cameron Ayer	f22bddf122	{storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801	2020-02-24 18:24:23 +00:00
Yingrong Zhao	5011e78311	storagenode/piecestore: remove unused DeletePiece endpoint With commit: `3331b443e7`, satellite will start calling `DeletePieces`. Therefore, we can remove the old endpoint once the above commit is deployed with all satellites Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb	2020-02-21 21:03:49 +00:00
NikolaiYurchenko	2601f25c98	web/storagenode: notification logic implementation Change-Id: Iec741997312203117213674ef85125fa8a976249	2020-02-21 15:49:27 +00:00
Egon Elbre	5342dd9fe6	go.mod: update uplink Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811	2020-02-21 16:08:12 +02:00
Egon Elbre	4044b8eeea	storagenode/pieces: ensure chore is stopped before test ends Change-Id: Ibc26e156d13011bf0f91b4206980200a24d348fe	2020-02-21 10:14:44 +02:00
Cameron Ayer	3e70a893dd	storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold Currently, storage nodes only report their capacity to satellites once per hour. If a node fills up, it will fail all uploads until the next contact cycle begins. With these changes, at the end of an upload we check whether the MinimumDiskSpace threshold has been passed. If so, trigger the monitor chore to update the node's capacity, then trigger the contact chore to report the new capacity to the satellites. Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0	2020-02-18 15:42:48 -05:00
Egon Elbre	8f20085683	storagenode/piecestore: clearer client cancellation error message Change-Id: Ia0595f71eb3eb1c0f091e615652e2de376d5609d	2020-02-14 09:36:03 +00:00
Jeff Wendling	05a240050e	storagenode: monitor available space and bandwidth Change-Id: I5763597327c5b32982faab8910c136c6c8dc18c5	2020-02-13 07:07:29 +00:00
Michal Niewrzal	426c8eb31a	private/testplanet: add DeleteBucket method for uplink New method added to make it easy to delete a bucket during tests. Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2	2020-02-11 15:58:13 +00:00
Jeff Wendling	7999d24f81	all: use monkit v3 this commit updates our monkit dependency to the v3 version where it outputs in an influx style. this makes discovery much easier as many tools are built to look at it this way. graphite and rothko will suffer some due to no longer being a tree based on dots. hopefully time will exist to update rothko to index based on the new metric format. it adds an influx output for the statreceiver so that we can write to influxdb v1 or v2 directly. Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff	2020-02-05 23:53:17 +00:00
Isaac Hess	17580fdf57	storagenode/pieces: Add test to cache store This test checks that we are actually walking over the pieces when starting the cache, and that it is returning expected values. A recent outage was partially caused by the fact that this cache was accidentally reading itself (via the pieces store, which has the cache embedded). This test ensures that does not happen, and checks that when the cache's `Run` method is called, the space used values are read from disk and accurately update the cache. Change-Id: I9ec61c4299ed06c90f79b17de3ffdbbb06bc502e	2020-02-05 11:39:06 -07:00
igor gaidaienko	efa0f6d443	storagenode/monitor: set MinimumDiskSpace default to 500GB. As a workaround it was set to 0 in previous release. Now according to the TOC must be set to 500GB. Change-Id: Ia2743d49e86683396958aff51b95df743af4f872	2020-02-04 15:55:42 +00:00
Egon Elbre	9e5679fdaa	storagenode/console/consoleserver: set content-type manually http.FileServer relies on mime types defined in the operating system. These values may be misconfigured, so a javascript file might end up being served as "text/plain". Change-Id: I3c13c8a9ac484bd765a4de0f8253bfe40dde7513	2020-02-03 15:37:47 +02:00
Jeff Wendling	d20db90cff	private/dbutil/txutil: create new transactions for retries it was noticed that if you had a long lived transaction A that was blocking some other transaction B and A was being aborted due to retriable errors, then transaction B was never given priority. this was due to using savepoints to do lightweight retries. this behavior was problematic because we had some queries blocked for over 16 hours, so this commit addresses the issue with two prongs: 1. bound the amount of time we will retry a transaction 2. create new transactions when a retry is needed the first ensures that we never wait for 16 hours, and the value chosen is 10 minutes. that should be long enough for an ample amount of retries for small queries, and huge queries probably shouldn't be retried, even if possible: it's preferable to find a way to make them smaller. the second ensures that even in the case of retries, queries that are blocked on the aborted transaction gain priority to run. between those two changes, the maximum stall time due to retries should be bounded to around 10 minutes. Change-Id: Icf898501ef505a89738820a3fae2580988f9f5f4	2020-02-01 18:34:28 +00:00
Jeff Wendling	71ff044edb	storagenode/bandwidth: fix tests to not fail for 10 hours near the end of the month Change-Id: I390569a8702164c42edddd3be020e93782227c2e	2020-01-31 16:25:52 -07:00
Jeff Wendling	03166d6be3	storagenode/piecestore: log available bandwidth and space on uploads Change-Id: Ia92228cb2a178da45f4f123b48c476e5ec821fe8	2020-01-31 19:47:14 +00:00
Isaac Hess	78d0868bc9	storagenode/pieces: Log error if cannot calculate piece size Change-Id: I33b49315a0f6044a801a8b118e6b61dbcd751bfe	2020-01-31 09:57:44 -05:00
Egon Elbre	d0b4272467	storagenode: fix global logger in tests https://github.com/storj/storj/wiki/Testing#logging Change-Id: Ic6a31360bcfedae3f37f6b2536a345f00e33cd78	2020-01-31 14:09:28 +00:00
Isaac Hess	2968857e21	storagenode/pieces: Prevent recalculate from having negative numbers Change-Id: Iafd2bcb9963e85508cb5e2bd69f229d89c589a6c	2020-01-30 17:47:54 -05:00
paul cannon	157b8c4d71	storagenode/pieces: accumulate errors in traversal instead of aborting on the first error, so that we can hit all satellites and get the best numbers we can Change-Id: I21d5163884940612d7d39eaf73a6fac07235cd9e	2020-01-30 19:31:29 +00:00
Isaac Hess	5a053483b7	storagenode/pieces: read trash from blobstore Change-Id: Ib134e63a13b8a5dda5d6a9ead42013ce18411227	2020-01-30 13:30:48 -05:00
Isaac Hess	4dafd03f11	storagenode: Prevent negative values in piece_space_used, migrate negatives to 0 Change-Id: Ibd663db087058c928190aa52c520f22e9338dd04	2020-01-30 13:03:18 -05:00
Isaac Hess	00fc192f6b	storagenode/pieces: Explicitly walk satellite pieces in SpaceUsedTotalAndBySatellite Change-Id: I7ff9a1120d4ced0b5cba7d7765ef8aed7a1edae0	2020-01-30 12:01:50 -06:00
Jeff Wendling	21b65ca3b0	storagenode/storagenodedb: migrate to set total to content_size Change-Id: I4906c2fe9cdb3a32c045c98039d4bde6b8b809e3	2020-01-30 08:53:12 -07:00
Egon Elbre	4e2bf81719	pkg/debug: add better title Change-Id: Icc6114f4e7523cfe6c7984ef1f6eec664ae4ee65	2020-01-30 07:49:40 -05:00
littleskunk	81eddaa2c1	storagenode/monitor: reduce space requirement to 0 We introduced a bug in v0.31.7, and deploying it would kick out all the storage nodes that are full. The easy fix is setting the requirement to 0. That will allow them to still start up even if they are full. Change-Id: Ie66f369952d929fcfd47f44f6e5e57eea8f51ff6	2020-01-30 01:44:45 +01:00
Egon Elbre	d10d6fd153	storagenode,satellite: ignore error on listening debug port Change-Id: Id3a6d153535776ce41f8edf2bd6f6dad5e2a60bf	2020-01-29 18:06:02 -05:00
Egon Elbre	10be538602	storagenode: add pkg/debug support Change-Id: If941095b886c28a0d53fff4c9bf9fa0ce7471dea	2020-01-29 16:30:31 -05:00
Egon Elbre	f237d70098	storagenode,satellite: use pkg/debug Use debug.Server in storage node and satellite for customizing debug server. Change-Id: I7979412376d028cadf29656d838ab94f18e2aa99	2020-01-29 16:30:31 -05:00
Egon Elbre	e319660f7a	private/lifecycle: implement Group lifecycle.Group implements controlling multiple items such that their startup and close works. Change-Id: Idb4f4a6c3a1f07cdcf44d3147a6c959686df0007	2020-01-29 00:37:33 +00:00
Yingrong Zhao	d8e3556a22	storagenode/preflight: wait for server to shutdown when tests are finished Change-Id: Ie3ede9f285cb61bb6bc6b0158e41d8ea10b2497e	2020-01-28 17:54:19 +00:00
Stefan Benten	3abb8c8ed7	Don't require an IP address being set By default our server address is listening on all IP addresses on the machine. This caused our preflight check to fail, as it did not have a hostname to look up. With this change, we are fine with this and go ahead. Change-Id: I9eb5c891c099eb35f679d6d7e79ec38bb43b619f	2020-01-28 15:25:17 +01:00
nerdatwork	9ea32016c2	storagenode/orders: fix typos in log messages (#3760)	2020-01-26 13:45:57 -05:00
littleskunk	5c68f4fc7c	storagenode/gracefulexit: higher concurrency and shorter timeouts 1 transfer with a minimum speed of 128 Bytes was a nice try but it is way too low. Even a pi3 was able to handle 7 grpc transfers. We have 4 satellites and with 5 concurrent transfers that should be a total of 20 concurrent transfers. Each transfer will have a minimum speed of 5KB/s. That should give us a better throughput and still be OK on a pi3. Change-Id: I650a7baf890080901ef70ea3b5636d93009b4e60	2020-01-24 23:51:39 +00:00
littleskunk	226bc4de36	storagenode/preflightcheck: enable database check by default With the v0.30.5 release we asked the storage node operators to manually enable the preflight check while they are in front of their machine. We didn't want to risk taking too many storage nodes offline at the same time because of some unknown bug. The preflight check worked. We have no negative feedback. We can now enable it by default. Change-Id: Ic670ee52becd0b35eca84af7a0841ea983d7b19d	2020-01-24 23:23:35 +00:00
Moby von Briesen	e4cff1c938	storagenode/preflight: update allowed time difference for preflight clock sync Change 24h and 1h to 30m and 10m respectively for clock sync. If a storagenode's clock is off by more than 30m for every trusted satellite, it will not start. If it is off by more than 10m for any trusted satellite, a warning is displayed. Change-Id: I05ef611a30a49c1783e3b68b513745922c2f7e28	2020-01-24 22:57:13 +00:00
Jeff Wendling	16bb374deb	storagenode/piecestore: add large timeouts to read/write operations this is to help protect against intentional or unintentional slowloris style problems where a client keeps a tcp connection alive but never sends any data. because grpc is great, we have to spawn a separate goroutine for every read/write to the stream so that we can return from the server handler to cancel it if necessary. yep. really. additionally, we update the rpcstatus package to do some stack trace capture and add a Wrap method for the times where we want to just use the existing error. also fixes a number of TODOs where we attach status codes to the returned errors in the endpoints. Change-Id: Id8bb8ff84aa34e0f711b0cf9bce3908b36a1d3c1	2020-01-23 19:20:49 +00:00
Isaac Hess	44de90ecc8	storagenode/pieces: Rename vars and update comments A few variables were not renamed to the new standard piecesTotal and piecesContentSize, so it was unclear which value was being used. These have been updated, and some comments made more thorough. Change-Id: I363bad4dec2a8e5c54d22c3c4cd85fc3d2b3096c	2020-01-23 11:00:24 -07:00
Isaac Hess	14fd6a9ef0	storagenode/pieces: Track total piece size This change updates the storagenode piecestore apis to expose access to the full piece size stored on disk. Previously we only had access to (and only kept a cache of) the content size used for all pieces. This was inaccurate when reporting the amount of disk space used by nodes. We now have access to the total content size, as well as the total disk usage, of all pieces. The pieces cache also keeps a cache of the total piece size along with the content size. Change-Id: I4fffe7e1257e04c46021a2e37c5adc6fe69bee55	2020-01-23 11:00:24 -07:00
stefanbenten	62d3783928	storagenode/peer: ensure contact.external-address and server.address is valid Change-Id: I634f0d355b0be18ba419726ace746921adda3ac0	2020-01-23 15:51:46 +00:00
Egon Elbre	5a4745eddb	all: remove usages of testplanet.New Ensure that tests use testplanet.Run, so we always require running against all database backends. Change-Id: I6b0209e6a4912cf3328bd35b2c31bb8598930acb	2020-01-22 22:42:57 +02:00
Michal Niewrzal	6502454947	satellite/metainfo: move RS configuration to satellite With this change RS configuration will be set on satellite. Uplink will get RS values with the BeginObject request and will use them. For backward compatibility and to avoid a very large change, the redundancy scheme stored with the bucket is not touched. This can be done in the future. Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff	2020-01-22 09:33:53 +00:00
Egon Elbre	c1c878efcf	all: fix import groupings check-imports was broken and didn't complain about things. Change-Id: I38adafd16b4aba86f0eb4f53427b4393f9a6c710	2020-01-20 17:47:44 +00:00
Egon Elbre	21f53e38da	storagenode/storagenodedb/storagenodedbtest: pass ctx as an argument Change-Id: I10b0a8ef3a7d5001e7d361f1873ad5987af1f9c2	2020-01-20 16:56:12 +02:00
Egon Elbre	f3b4bf2b7c	satellite/satellitedb/satellitedbtest: pass ctx as an argument ctx is created in most tests, instead pass in as argument to reduce code duplication. Change-Id: I466c51c008392001129c8b007c9d6b3619935ac4	2020-01-20 16:35:42 +02:00

395 Commits