storj

Author	SHA1	Message	Date
Stefan Benten	494bd5db81	all: golangci-lint v1.33.0 fixes (#3985)	2020-12-05 17:01:42 +01:00
Kaloyan Raev	53b7fd7b00	satellite/{audit,gracefulexit}: remove logic for PieceHashesVerified We now have the piece hashes verified for all segments on all production satellites. We can remove the code that handles the case where piece hashes are not verified. This would make easier the migration of services from PointerDB to the new metabase. For consistency, PieceHashesVerified is still set to true in PointerDB for new segments. Change-Id: Idf0ccce4c8d01ae812f11e8384a7221d90d4c183	2020-11-24 11:09:48 +02:00
Moby von Briesen	db6bc6503d	satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes. Make metainfo.RSConfig a valid pflag config value. This allows us to configure the RSConfig as a string like k/m/o/n-shareSize, which makes having multiple supported RS schemes easier in the future. RS-related config values that are no longer needed have been removed (MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify). Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1	2020-11-09 22:16:13 +00:00
Egon Elbre	76f4619a9c	{satellite,storagenode}/gracefulexit: ensure client is closed Change-Id: I576a955a5578caf7fcbee832beca28cef2b0c83e	2020-10-27 23:27:07 +02:00
Kaloyan Raev	92a2be2abd	satellite/metainfo: get away from using pb.Pointer in Metainfo Loop As part of the Metainfo Refactoring, we need to make the Metainfo Loop working with both the current PointerDB and the new Metabase. Thus, the Metainfo Loop should pass to the Observer interface more specific Object and Segment types instead of pb.Pointer. After this change, there are still a couple of use cases that require access to the pb.Pointer (hence we have it as a field in the metainfo.Segment type): 1. Expired Deletion Service 2. Repair Service It would require additional refactoring in these two services before we are able to clean this. Change-Id: Ib3eb6b7507ed89d5ba745ffbb6b37524ef10ed9f	2020-10-27 13:06:47 +00:00
Egon Elbre	9adde49e1a	satellite/gracefulexit: ensure test doesn't timeout on failure Change-Id: Id004f8a075592ffc19b12a9d666058b60cb7724d	2020-10-26 21:16:48 +02:00
paul cannon	76d4977b6a	storagenode/gracefulexit: logic moved from worker to service Change-Id: I8b12606a96b712050bf40d587664fb1b2c578fbc	2020-10-22 23:19:30 +00:00
Egon Elbre	0bdb952269	all: use keyed special comment Change-Id: I57f6af053382c638026b64c5ff77b169bd3c6c8b	2020-10-13 15:13:41 +03:00
Michal Niewrzal	8649a00557	satellite/gracefulexit: replace `Path []byte` to `Key metabaseSegmentKey` TransferQueueItem We are unifying which name (and type) we are using for value we are using to point to segment. We want to use `key` instead of `path`. Dedicated type `metabase.SegmentKey` was created for this purposes also. This change is doing refactoring around gracefulexit. Change-Id: I90d51ff087b206179e61d5f1bc95f4709d76f917	2020-09-04 11:09:48 +00:00
Michal Niewrzal	9202295348	satellite/metainfo: replace ScopedPath with metabase.SegmentLocation Change-Id: I7e89c9e8eaeae58be828a32ad47ed3028501f4c7	2020-09-04 10:06:52 +00:00
Michal Niewrzal	aa47e70f03	satellite/metainfo: use metabase.SegmentKey with metainfo.Service Instead of using string or []byte we will be using dedicated type SegmentKey. Change-Id: I6ca8039f0741f6f9837c69a6d070228ed10f2220	2020-09-03 15:11:32 +00:00
Egon Elbre	3ca405aa97	satellite/orders: use metabase types as arguments Change-Id: I7ddaad207c20572a5ea762667531770a56fd54ef	2020-08-28 15:52:37 +03:00
Qweder93	88ff8829a1	satellite/gracefulexit: RecvTimeout increased to 2h, so slow nodes stop receiving lot of fails and as a result DQ Change-Id: Id4c8a394162ba368aeb573a927f825bf7250aa52	2020-08-24 18:59:24 +03:00
Egon Elbre	94a09ce20b	all: add missing dots Change-Id: I93b86c9fb3398c5d3c9121b8859dad1c615fa23a	2020-08-11 17:50:01 +03:00
Egon Elbre	080ba47a06	all: fix dots Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2	2020-07-16 14:58:28 +00:00
Egon Elbre	5bdcd86fa7	ci: test benchmarks This runs each benchmark for one iteration to ensure that they are valid. Unfortunately, it does not give any useful metrics as output. Change-Id: I68940398c8dd849aed656bd12656f48d5df10128	2020-07-10 13:26:49 +00:00
Qweder93	e52809d53e	cmd/storagenode: add check if satellites available to gracefulexit Change-Id: I8747507593d810bbdec0d140de0600ee147011c3	2020-06-10 13:38:36 +00:00
paul cannon	7395dd1e6e	storagenode/gracefulexit: revalidate existing pieces ..before they are transferred to another node and submitted to the satellite as successful piece transfers, because if we submit an invalid signature, the node will be marked as a cheater and disqualified immediately. These signatures should have been validated when the piece was originally stored, but bitrot does happen and needn't be cause for an immediate DQ. Change-Id: I8b0ebd5812ea8a2e60766005b7251fbb74ef7857	2020-05-28 09:50:14 -05:00
paul cannon	0c9a4a5e56	satellite/gracefulexit: better validation error messages Change-Id: I14168dbeaf17302e5e853854956f8fbcb82a0900	2020-05-28 09:41:15 -05:00
Ethan	7014cf2083	satellite/gracefulexit: Change handleFailed to return nil if we can't get the pending transfer https://storjlabs.atlassian.net/browse/SM-975 Change-Id: Ia1746b5e274e5011b6cd9f5a52f9a5faf703be51	2020-05-26 13:07:38 +00:00
Egon Elbre	94b2b315f7	storagenode/trust: refactor GetAddress to GetNodeURL Most places now need the NodeURL rather than the ID and Address separately. This simplifies code in multiple places. Change-Id: I52621d8ca52296a8b5bf7afbc1001cf8bfb44239	2020-05-20 11:05:15 +00:00
Egon Elbre	ed627144ed	all: use DialNodeURL throughout the codebase Change-Id: Iaf9ae3aeef7305c937f2660c929744db2d88776c	2020-05-20 10:36:30 +00:00
Moby von Briesen	46df8c1977	satellite/gracefulexit: add log message when node fails validation for piece transfer Change-Id: Ic5a53404ceb35003793aebc63637e7f8a58ef259	2020-05-13 16:58:50 +00:00
Egon Elbre	ec589a8289	all: fix comments about grpc Change-Id: Id830fbe2d44f083c88765561b6c07c5689afe5bd	2020-05-11 13:05:34 +03:00
Egon Elbre	7d29f2e0d3	all: remove drpc wrappers Change-Id: I45016f7d2a771dc00776196c1f531f3343e93b40	2020-05-11 08:20:34 +03:00
Egon Elbre	e6d5ce6b77	all: remove grpc It seems everyone has migrated to drpc. Change-Id: Ica6b2d0bdef68c6603083f2963458843eca71e9e	2020-05-10 06:36:09 +00:00
Egon Elbre	bcd93ee375	private/testplanet: add StopNodeAndUpdate This was commonly used and code with it can be simplified. Change-Id: I2f2b91f7de54269aee6ef027f97f9e8a7d222e39	2020-05-08 13:02:19 +00:00
Egon Elbre	4e94da3fda	satellite/overlay: add feature flag for node selection cache Also distinguish the purpose for selecting nodes to avoid potential confusion, what should allow caching and what shouldn't. Change-Id: Iee2451c1f10d0f1c81feb1641507400d89918d61	2020-05-06 16:13:47 +03:00
Egon Elbre	c630cf2490	storagenode/pieces: implement buffering for writing Currently uploads can cause a lot of IOPS, reduce this by introducing a in-memory buffer on-top of the file. Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59	2020-05-04 06:01:32 +00:00
Jessica Grebenschikov	6a6427526b	satellite/overlay: remove old updateaddress method The UpdateAddress method use to be used when storage node's checked in with the Satellite, but once the contact service was created this method was no longer used. This PR finally removes it. Change-Id: Ib3f83c8003269671d97d54f21ee69665fa663f24	2020-04-30 06:41:48 +00:00
Jess G	75b9a5971e	satellite: update log levels (#3851) * satellite: update log levels Change-Id: I86bc32e042d742af6dbc469a294291a2e667e81f * log version on start up for every service Change-Id: Ic128bb9c5ac52d4dc6d6c4cb3059fbad73f5d3de * Use monkit for tracking failed ip resolutions Change-Id: Ia5aa71d315515e0c5f62c98d9d115ef984cd50c2 * fix compile errors Change-Id: Ia33c8b6e34e780bd1115120dc347a439d99e83bf * add request limit value to storage node rpc err Change-Id: I1ad6706a60237928e29da300d96a1bafa94156e5 * we cant track storage node ids in monkit metrics so lets use logging to track that for expired orders Change-Id: I1cc1d240b29019ae2f8c774792765df3cbeac887 * fix build errs Change-Id: I6d0ffe058e9a38b7ed031c85a29440f3d68e8d47	2020-04-15 12:32:22 -07:00
Egon Elbre	11a44cdd88	all: don't depend on gogo/proto directly Change-Id: I8822dea0d1b7b99e0b828e0373a0308a42dde2be	2020-04-08 17:32:15 +00:00
Egon Elbre	cb781d66c7	satellite/overlay: optimize FindStorageNodes Reduce the number of fields returned from the query. Benchmark results in `satellite/overlay`: benchstat before.txt after2.txt name old time/op new time/op delta SelectStorageNodes-32 7.85ms ± 1% 6.27ms ± 1% -20.18% (p=0.002 n=10+4) SelectNewStorageNodes-32 8.21ms ± 1% 6.61ms ± 0% -19.53% (p=0.002 n=10+4) SelectStorageNodesExclusion-32 17.2ms ± 1% 15.9ms ± 1% -7.55% (p=0.002 n=10+4) SelectNewStorageNodesExclusion-32 17.8ms ± 2% 16.1ms ± 0% -9.38% (p=0.002 n=10+4) FindStorageNodes-32 48.4ms ± 1% 45.1ms ± 0% -6.69% (p=0.002 n=10+4) FindStorageNodesExclusion-32 79.2ms ± 1% 76.1ms ± 1% -3.89% (p=0.002 n=10+4) Benchmark results from `satellite/overlay` after making them parallel: benchstat before-parallel.txt after2-parallel.txt name old time/op new time/op delta SelectStorageNodes-32 548µs ± 1% 353µs ± 1% -35.60% (p=0.029 n=4+4) SelectNewStorageNodes-32 562µs ± 0% 368µs ± 0% -34.51% (p=0.029 n=4+4) SelectStorageNodesExclusion-32 1.02ms ± 1% 0.84ms ± 0% -18.08% (p=0.029 n=4+4) SelectNewStorageNodesExclusion-32 1.03ms ± 1% 0.86ms ± 2% -16.22% (p=0.029 n=4+4) FindStorageNodes-32 3.11ms ± 0% 2.79ms ± 1% -10.27% (p=0.029 n=4+4) FindStorageNodesExclusion-32 4.75ms ± 0% 4.43ms ± 1% -6.56% (p=0.029 n=4+4) Change-Id: I1d85e2764eb270f4c2b1998303ccfc1179d65b26	2020-03-30 18:36:23 +03:00
Egon Elbre	e1a443b04a	private/testplanet: allow modifying created database Instead of providing the database from outside to testplanet create it inside and then allow wrapping and modifying it. This is more convenient to use. Change-Id: I9b8f69e6e0a19ff984b4e2bfe927c9100c77bc6c	2020-03-27 19:14:48 +00:00
Egon Elbre	e8f18a2cfe	private/testplanet: expose storagenode and satellite Config Change-Id: I80fe7ed8ef7356948879afcc6ecb984c5d1a6b9d	2020-03-27 17:01:25 +02:00
Yingrong Zhao	b7b19289d1	bump storj.io/common to latest Change-Id: I16e337660ce8e1ef332cc842dbf4cfa067b9b98b	2020-03-25 09:08:40 -04:00
Bill Thorp	94c11c5212	satellite: remove some unnecessary UTC() calls Fixes some easy cases of extraneous UTC() calls Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a	2020-03-13 13:49:44 +00:00
Jess G	39cb821196	satellite/overlay: rm combinedcache, fix IP naming to be network (#3798) * rn combinedcache, rm dns node lookup Change-Id: I239f07211764b097d851230d8c81900a47756e9e * excludeIPs -> excludedNetworks Change-Id: Ifa6f44ab17457cdd5aff4cd5694296867c18b179 * use lowercase var name Change-Id: I825aad2b718c71f455e747be18f8cabd02aabe55 * update Getnetwork name Change-Id: I002a1b7bc6b4ef40159c0cd2b0ef209f80a9c503 * fix comments Change-Id: Ibddf5b9ffa9d685af6c392d893db063ef18e45fa * update comments with ipv6 Change-Id: I31758b7d4979e7c27d014668f4fb532ad838cda2 Co-authored-by: Stefan Benten <mail@stefan-benten.de>	2020-03-12 11:37:57 -07:00
Jessica Grebenschikov	803e2930f4	satellite: use IP for all uplink operations, use hostname for audit and repairs My understanding is that the nodes table has the following fields: - `address` field which can be a hostname or an IP - `last_net` field that is the /24 subnet of the IP resolved from the address This PR does the following: 1) add back the `last_ip` field to the nodes table 2) for uplink operations remove the calls that the satellite makes to `lookupNodeAddress` (which makes the DNS calls to resolve the IP from the hostname) and instead use the data stored in the nodes table `last_ip` field. This means that the IP that the satellite sends to the uplink for the storage nodes could be approx 1 hr stale. In the short term this is fine, next we will be adding changes so that the storage node pushes any IP changes to the satellite in real time. 3) use the address field for repair and audit since we want them to still make DNS calls to confirm the IP is up to date 4) try to reduce confusion about hostname, ip, subnet, and address in the code base Change-Id: I96ce0d8bb78303f82483d0701bc79544b74057ac	2020-03-11 09:11:40 -07:00
Jennifer Johnson	1c1750e6be	removes bandwidth limiting On satellite, remove all references to free_bandwidth column in nodes table. On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated. Protobuf message, NodeCapacity, is left intact for backwards compatibility. Once this is released to all satellites, we can drop the column from the DB. Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838	2020-03-04 14:04:00 +00:00
Egon Elbre	64330c55b3	all: use pbgrpc common/pb moved grpc to a separate package common/pb/pbgrpc. This updates this repository to use it. Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e	2020-02-26 21:27:47 +02:00
Egon Elbre	5342dd9fe6	go.mod: update uplink Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811	2020-02-21 16:08:12 +02:00
littleskunk	76849558cb	satellite/gracefulexit: increase performance and tolerate higher error rate Graceful exit is very slow at the moment. Over the last couple days we increase the batch size on Stefans satellite to 1000 but as a side effect the error rate was increased. With a batch size of 500 the error rate looks stable. This PR will increase the default to batch size to 300. Graceful exit will still be painful slow but at least it will be a bit faster. At the same time this PR also increases the number of errors we tolerate. We don't want to DQ slow storage nodes just because they didn't finish all 300 transfers in time. We want to give them more retries. Change-Id: I92e3f99e116d4988457d8b902a88e85ed1bcc1a7	2020-02-12 11:40:15 +00:00
Cameron Ayer	b22bf16b35	satellite/overlay: add config flag for node selection free disk requirement Currently SNs report their free disk space once per hour. If a node becomes full, it has to wait until the next contact cycle begins to report; all the while receiving and failing upload requests. By increasing the minimum required disk space, we can give the storage nodes more time to report their space before the completely fill up. This change goes hand-in-hand with another change we want to implement: trigger capacity report on SN immediately upon falling below threshold. Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27	2020-02-11 18:08:25 +00:00
Michal Niewrzal	426c8eb31a	private/testplanet: add DeleteBucket method for uplink New method added to be able to delete easily bucket during tests. Change-Id: Iaae89618cc676ddbbbd4b0df2eeacd143ea6f3c2	2020-02-11 15:58:13 +00:00
Jeff Wendling	7999d24f81	all: use monkit v3 this commit updates our monkit dependency to the v3 version where it outputs in an influx style. this makes discovery much easier as many tools are built to look at it this way. graphite and rothko will suffer some due to no longer being a tree based on dots. hopefully time will exist to update rothko to index based on the new metric format. it adds an influx output for the statreceiver so that we can write to influxdb v1 or v2 directly. Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff	2020-02-05 23:53:17 +00:00
Egon Elbre	8dea4f52db	satellite: add control panel Change-Id: Id48246e9bcd4c6ec643277fe740937b2e42ad85b	2020-01-30 08:06:43 -05:00
paul cannon	8ce9ce7f0f	satellite/gracefulexit: wait for errgroup to return credit to Yingrong Change-Id: I538371040d4dcdf6e943c61e8454320fd57b7526	2020-01-28 19:26:43 +00:00
Jeff Wendling	26e33e7e07	satellite/gracefulexit: make orders with right bucket id and action paths are organized as follows: project_id/segment_index/bucket_name/encrypted_key so by picking parts[0] and parts[1], we were using the segment index instead of the bucket name, causing bandwidth to be accounted for incorrectly. additionally, we were using the PUT action instead of the PUT_GRACEFUL_EXIT action, causing the data to be charged incorrectly. we use PUT_REPAIR for now because nodes won't accept uploads with PUT_GRACEFUL_EXIT and our tables need migrations to handle rollups with it. Change-Id: Ife2aff541222bac930c35df8fcf76e8bac5d60b2	2020-01-24 19:27:38 +00:00
Michal Niewrzal	6502454947	satellite/metainfo: move RS configuration to satellite With this change RS configuration will be set on satellite. Uplink with get RS values with BeginObject request and will use it. For backward compatibility and to avoid super large change redundancy scheme stored with bucket is not touched. This can be done in future. Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff	2020-01-22 09:33:53 +00:00


103 Commits