storj

Author	SHA1	Message	Date
Márton Elek	d38b8fa2c4	satellite/nodeselection: use the same Node object from overlay and nodeselection We use two different Node types in `overlay` and `uploadnodeselection` and converting back and forth. Using the same object would allow us to use a unified node selection interface everywhere. Change-Id: Ie71e29d60184ee0e5b4547eb54325f09c418f73c	2023-07-03 16:59:33 +00:00
Jeff Wendling	32f683fe9d	satellite/orders: filter nodes based on segment placement this change adds code to CreateGetOrderLimits to filter out any nodes that are not in the placement specified by the segment. notably, it does not change the audit or repair order limits. the list segments code had to be changed to include getting the placement field from the database. Change-Id: Ice3e42a327811bb20928c619a72ed94e0c1464ac	2023-06-05 13:56:22 -04:00
Michal Niewrzal	eabd9dd994	satellite/orders: remove unused argument Change-Id: I6c5221fc19f97ae6db5627d7239795ff663289e0	2023-05-22 14:35:08 +00:00
paul cannon	915f3952af	satellite/repair: repair pieces on the same last_net We avoid putting more than one piece of a segment on the same /24 network (or /64 for ipv6). However, it is possible for multiple pieces of the same segment to move to the same network over time. Nodes can change addresses, or segments could be uploaded with dev settings, etc. We will call such pieces "clumped", as they are clumped into the same net, and are much more likely to be lost or preserved together. This change teaches the repair checker to recognize segments which have clumped pieces, and put them in the repair queue. It also teaches the repair worker to repair such segments (treating clumped pieces as "retrievable but unhealthy"; i.e., they will be replaced on new nodes if possible). Refs: https://github.com/storj/storj/issues/5391 Change-Id: Iaa9e339fee8f80f4ad39895438e9f18606338908	2023-04-06 17:34:25 +00:00
Michal Niewrzal	bc8f8f62b5	satellite/orders: cleanup after altering primary key We changed primary key for bucket_bandwidth_rollups table. Now we need to do some cleanup in places like structs, sorting methods or SQL queries. Change-Id: Ida4f874f161356df193379a53507602e04db1668	2023-03-06 16:03:11 +00:00
Michal Niewrzal	b46c0fb78f	satellite/orders: don't cancel flushing bandwidth orders Earlier we made a change to not cancel flushing orders when flushing was triggered by orders endpoint method but we missed a case where it can be also triggered (and canceled) by metainfo endpoints method. This change moves ignoring context cancellation deeper. Change-Id: Id43176f552efc3167345783f73aab885411ac247	2023-03-01 17:10:05 +00:00
Michal Niewrzal	16b7901fde	satellite/metabase: add piece size calculation to segment This code is essentially replacement for eestream.CalcPieceSize. To call eestream.CalcPieceSize we need eestream.RedundancyStrategy which is not trivial to get as it requires infectious.FEC. For example infectious.FEC creation is visible on GE loop observer CPU profile because we were doing this for each segment in DB. New method was added to storj.Redundancy and here we are just wiring it with metabase Segment. BenchmarkSegmentPieceSize BenchmarkSegmentPieceSize/eestream.CalcPieceSize BenchmarkSegmentPieceSize/eestream.CalcPieceSize-8 5822 189189 ns/op 9776 B/op 8 allocs/op BenchmarkSegmentPieceSize/segment.PieceSize BenchmarkSegmentPieceSize/segment.PieceSize-8 94721329 11.49 ns/op 0 B/op 0 allocs/op Change-Id: I5a8b4237aedd1424c54ed0af448061a236b00295	2023-02-22 11:04:02 +00:00
Márton Elek	8f8e97de23	satellite/metainfo: support desired node number for download object/segment This modification introduces support for the new "desired node" field of download segment/object. This can be used to request more nodes than the suggested minimum. It can be used to achieve better performance in exchange for using more bandwidth (more parallel downloads). Change-Id: Ia167d6979e6d70a597c85070a4ccd1c3a573e406	2023-02-13 13:57:48 +00:00
Márton Elek	ca6e3a9e88	satellite/orders: create mock based unit test Most of our (~integration) tests are based on the testplanet runner. However, running testplanet for each test makes the testing process slow. It seems better to use real unit tests (without db dependency) when possible. This patch makes a small modification to make it possible to test orders.Service with a real unit test. As the existing unit test of `service.go` is isolated with the `_test` package name, it's moved to an `_integration_test.go` file to make room for the unit test. Change-Id: Ia69f26a34e2c48d230d8d36c2040dd02a60455a6	2023-02-13 13:24:30 +00:00
Michal Niewrzal	252c437b0e	satellite/orders: ignore context canceled when updating bucket bandwidth Orders from storage nodes are received by the SettlementWithWindowFinal method. There is a stream which receives all orders, and after getting all orders we insert storagenode and bucket bandwidth into the DB. The problem is with bucket bandwidth, which is stored through a cache that often uses the context from the SettlementWithWindowFinal stream to perform DB inserts, and does so in a separate goroutine. Because of that, it is possible that SettlementWithWindowFinal finishes before flushing has finished, and the context is canceled while inserting into the DB. Change-Id: I3a72c86390e9aedc060f6b082bb059f1406231ee	2023-02-08 13:21:42 +00:00
JT Olio	686faeedbd	satellite/overlay: return noise info with selected nodes we have two more fields in the database (noise_proto and noise_public_key) that now need to go into pb.NodeAddress when returning AddressedOrderLimits. the only real complication is making sure type conversions between database types and NodeURLs and so on don't lose this new pb.NodeAddress field (NoiseInfo). otherwise this is a relatively straightforward commit Change-Id: I45b59d7b2d3ae21c2e6eb95497f07cd388d454b3	2023-02-02 15:46:27 +00:00
Michal Niewrzal	15508d270c	satellite/orders: don't store non user bandwidth actions for bucket For bucket_bandwidth_rollups we are trying to insert lots of entries with empty bucket name and project id. Those are inserts from orders created by repair, audit and GE. High load on the same primary key (the same range) is causing many retries, and that affects all inserts as we are putting 1000 entries into the DB at once. This change solves this problem by not storing actions other than GET and PUT in bucket_bandwidth_rollups. Those are the only actions that matter from a bucket bandwidth usage perspective because they are performed by users. Other actions (repair, audit or GE) are also stored in storagenode_bandwidth_rollups, so we will still have access to them, e.g. for statistics purposes. https://github.com/storj/storj/issues/5332 Change-Id: Ibb5bf0a4c869b0439dc65da1c9342a38ca2890ba	2023-02-01 15:38:48 +00:00
Michal Niewrzal	3b6e1123b8	satellite/orders: fix sorting rollups before inserting Sorting by primary key before inserting data into the DB is fixed. Earlier we were sorting the input slice of BucketBandwidthRollup, but then we were putting all entries into a map to roll up the input data. Iterating over a map with a range loop doesn't guarantee any specific order, so we were losing the sorted order when we built the slices for the DB insert from this map. The new code also uses a map, but when the map is full it sorts the map keys separately and iterates over them to get data from the map. https://github.com/storj/storj/issues/5332 Change-Id: I5bf09489b0eecb6858bf854ab387b660124bf53f	2023-02-01 12:17:25 +00:00
Andrew Harding	abd0ad92dc	satellite/metainfo: RetryBeginSegmentPieces RPC implementation Part of: https://github.com/storj/uplink/issues/120 Change-Id: I2a2873455f7498ffd31f50ade16c173fe1d18157	2023-01-27 15:04:59 +00:00
Michal Niewrzal	4cbb1ed296	satellite/orders: log bandwidth values we are dropping When we have problem with inserting bandwidth amounts into cache or DB we are logging information about it but log entries are not very detailed. This change adds bandwidth amounts to the log entry. https://github.com/storj/storj/issues/5470 Change-Id: I55ccad837d17b141501d3def1dec7ad5f3acdb0b	2023-01-20 09:28:25 +00:00
Michal Niewrzal	0185bba90a	cmd: cleanup segment verify/repair tools * use the same DB application name for satellite and metabase * use noop orders DB implementation to avoid storing allocated bandwidth in DB Change-Id: I20e88c694d38240fe1a20c45719e210cfb76402c	2023-01-12 15:27:07 +00:00
Michal Niewrzal	a2a9dafa33	satellite/orders: don't store allocated bandwidth in bucket_bandwidth_rollups table We have performance problems with updating bucket_bandwidth_rollups. To improve the situation we can stop storing allocated bandwidth in this table. This should reduce the large number of updates which are coming from metainfo endpoints, repair workers and audit. The next step will be to drop the `allocated` column completely from bucket_bandwidth_rollups. Allocated GET bandwidth is all we need and we are keeping it in bucket_bandwidth_rollups table. Change-Id: Ifdd26a89ba8262acbca6d794a6c02883ad0c0c9b	2023-01-12 13:21:02 +00:00
Clement Sam	3378215adf	satellite/orders: decrease order expiration time to 24hours Closes https://github.com/storj/storj/issues/5202 Change-Id: I55d1a84c46dd610eeb00dd79df8f4f7e699499a0	2022-11-21 14:52:32 +00:00
Ivan Fraixedes	567557abc3	satellite/orders: Remove period logs messages Remove the final period of two log messages to be consistent with the other logs messages. Change-Id: I9253a4d5fb293c95d3baf8e093dc5744387c1516	2022-11-21 13:19:13 +00:00
Stefan Benten	70d42ead1c	satellite/orders/endpoint.go: order rollups by bucket name first Currently the underlying rollup table has the bucket name as its primary key, but we used to sort by projectID. This caused deadlocks due to contention during updates/inserts. We should reevaluate whether bucket name being the primary key is the right choice for this table; this should stop the long-running and failing attempts, though. Change-Id: Ie7d0f86944da48ad9cbd92eb162226882a2fb954	2022-11-16 19:48:43 +01:00
paul cannon	c54c45c9c7	satellite/audit: new ReverifyPiece implementation ReverifyPiece() is not currently hooked up to anything, but is planned to take the place of audit.(*Verifier).Reverify(). ReverifyPiece() works by downloading one piece in its entirety, rather than pulling an entire stripe across many nodes. Change-Id: Ie2c680f4d3c3b65273a72466a3f9f55c115b0311	2022-10-27 16:06:21 +00:00
Michal Niewrzal	a97cd97789	satellite/orders: remove unused service dependency Orders service doesn't need buckets service anymore. Change-Id: I27853cda87e82b528f53667e4b4866801f7bfb62	2022-09-28 08:56:36 +00:00
Egon Elbre	51d4e5c275	satellite/{orders,overlay}: use cache for downloads Use DownloadSelectionCache to avoid querying database for every download. This change only addresses downloads from users. The download selection cache is not currently used for audit and repair. Change-Id: I96a49e121dac0b4204f97592a63131edabd73fb5	2022-07-12 11:04:34 +00:00
Michał Niewrzał	456aea727e	satellite: use PieceIDDeriver for derivation We can use PieceIDDeriver in all places where we are deriving id from the same id multiple times. We have several such places: gc, segment deletion, segment validation, order limit creation. Using it should save some resources. Change-Id: I24668d516c0f7cea4aec6470614067734149501d	2022-05-19 06:31:42 +00:00
Fadila Khadar	29fd36a20e	satellite/repairer: handle excluded countries For nodes in excluded areas, we don't necessarily want to remove them from the pointer, but we do want to increase the number of pieces in the segment in case those excluded area nodes go down. To do that, we increase the number of pieces repaired by the number of pieces in excluded areas. Change-Id: I0424f1bcd7e93f33eb3eeeec79dbada3b3ea1f3a	2022-03-14 10:59:36 -04:00
Yingrong Zhao	1f8f7ebf06	satellite/{audit, reputation}: fix potential nodes reputation status inconsistency The original design had a flaw which can potentially cause discrepancy for nodes reputation status between reputations table and nodes table. In the event of a failure(network issue, db failure, satellite failure, etc.) happens between update to reputations table and update to nodes table, data can be out of sync. This PR tries to fix above issue by passing through node's reputation from the beginning of an audit/repair(this data is from nodes table) to the next update in reputation service. If the updated reputation status from the service is different from the existing node status, the service will try to update nodes table. In the case of a failure, the service will be able to try update nodes table again since it can see the discrepancy of the data. This will allow both tables to be in-sync eventually. Change-Id: Ic22130b4503a594b7177237b18f7e68305c2f122	2022-01-06 21:05:59 +00:00
littleskunk	b21cbc85f1	satellite/orders: log level warn for remaining "bucketName or projectID not set" (#4326) Co-authored-by: Stefan Benten <mail@stefan-benten.de>	2022-01-06 16:31:26 +01:00
dlamarmorgan	ab37b65cfc	satellite/{accounting,orders,satellitedb}: group bucket bandwidth rollups by time window Batching of the order submissions can lead to combining the allocated traffic totals for two completely different time windows, resulting in incorrect customer accounting. This change will group the batched order submissions by projectID as well as time window, leading to distinct updates of a bucket's bandwidth rollup based on the hour window in which the order was created. Change-Id: Ifb4d67923eec8a533b9758379914f17ff7abea32	2022-01-05 20:24:48 +00:00
Stefan Benten	a6139f5a6b	satellite/orders: log less decrypt order messages Currently most of the satellite log is made up by this specific log message and thus we should increase the log level. We already have a monkit event tracking this case. In case we need to look into this more we should just increase the satellite log level. Change-Id: I27ebed3e6745701c66c83e8c52ddc836ad9d5f4e	2021-12-21 00:39:29 +01:00
Michał Niewrzał	911540e76e	satellite/orders: avoid logging "bucketName or projectID not set" When we switched to segments loop we stopped adding bucket name and project ID to order limit metadata. This is causing lots of logging that we don't need. Change-Id: Id3f878a170b7a6b801e8a838ee69165715985d60	2021-12-14 10:31:48 +00:00
Fadila Khadar	fb0d055a41	satellite/orders: populate egress_dead in project_bandwidth_daily_rollups Populate the egress_dead column for taking into account allocated bandwidth that can be removed because orders have been sent by the storage nodes. The bandwidth not used in these orders can be allocated again. Change-Id: I78c333a03945cd7330aec052edd3562ec671118e	2021-10-06 16:54:49 +00:00
Michał Niewrzał	1ed5db1467	satellite/metainfo: simplifying limits code It's a very simple change to reduce code duplication. Change-Id: Ia135232e3aefd094f76c6988e82e297be028e174	2021-09-28 06:22:13 +00:00
Jeff Wendling	b160ec4c1b	satellite/orders: bound RollupsWriteCache flushes In the situation where the flushes take longer than the incoming rate of writes, the RollupsWriteCache will take every connection in the database pool and use them forever. Instead of doing that and taking down satellite availability, bound the number of flush operations that it will perform and drop incoming writes earlier to keep memory usage constant. Adds monitoring events for if any flushes or updates are lost. Change-Id: I81b169b73501ee9b999f4b03d1e79645fc56f167	2021-09-15 19:14:39 +00:00
Michał Niewrzał	c258f4bbac	private/testplanet: move Metabase outside Metainfo for satellite At some point we moved metabase package outside Metainfo but we didn't do that for satellite structure. This change refactors only tests. When uplink will be adjusted we can remove old entries in Metainfo struct. Change-Id: I2b66ed29f539b0ec0f490cad42c72840e0351bcb	2021-09-09 07:15:51 +00:00
Cameron Ayer	26f839a445	satellite/repair/repairer: if not enough nodes for repair order limits, increment metric and log as irreparable segment Change-Id: I4bd46f28d64278c8d463e885ad221aafb6ce7cf3	2021-08-27 13:42:28 +00:00
Cameron Ayer	24e02b6352	satellite/{audit,orders}: if not enough nodes for audit order limits, increment metric and wrap error with ErrNotEnoughShares Increment a metric so we can get alerts. Wrap the error so we can search the logs for it. Change-Id: I3827aa306c431009828014d9d9afff8dfc057ee6	2021-08-26 20:14:05 +00:00
Clement Sam	d73b9fff9a	satellite/orders: set the expirationDate in CreatePutRepairOrderLimits In the past ExpirationDate was available inside CreatePutRepairOrderLimits but this was removed since the metabase segment was missing the ExpiresAt field. Now ExpiresAt field is available in the metabase segment and can be set correctly while executing NewSignerRepairPut. Change-Id: I068c07492ab27bde2c44477bbd32c5872edd024a	2021-07-27 12:44:40 +00:00
Cameron Ayer	449c873681	satellite/repair/repairer: attempt repair GETs using nodes' last IP and port first Sometimes we see timeouts from DNS lookups when trying to do repair GETs. Solution: try using node's last IP and port first. If we can't connect, retry with DNS lookup. Change-Id: I59e223aebb436118779fb18378f6e09d072f12be	2021-07-21 13:13:06 +00:00
Michał Niewrzał	70e6cdfd06	satellite/audit: move to segmentloop Change-Id: I10e63a1e4b6b62f5cd3098f5922ad3de1ec5af51	2021-06-28 11:32:00 +00:00
Jeff Wendling	8a6efa1f58	satellite/orders: query for node first before upsert/replace the very common case is that the node api version is indeed at least the requested version, so query for that first to avoid write traffic. Change-Id: Ib047d93078205bc07fee75d1f635503b792307f0	2021-06-22 15:16:12 -04:00
JT Olio	da9ca0c650	testplanet/satellite: reduce the number of places default values need to be configured Satellites set their configuration values to default values using cfgstruct, however, it turns out our tests don't test these values at all! Instead, they have a completely separate definition system that is easy to forget about. As is to be expected, these values have drifted, and it appears in a few cases test planet is testing unreasonable values that we won't see in production, or perhaps worse, features enabled in production were missed and weren't enabled in testplanet. This change makes it so all values are configured the same, systematic way, so it's easy to see when test values are different than dev values or release values, and it's harder to forget to enable features in testplanet. In terms of reviewing, this change should be actually fairly easy to review, considering private/testplanet/satellite.go keeps the current config system and the new one and confirms that they result in identical configurations, so you can be certain that nothing was missed and the config is all correct. You can also check the config lock to see what actual config values changed. Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e	2021-06-01 22:14:17 +00:00
Egon Elbre	10a0216af5	satellite/metainfo: use range for specifying download limit Previously the object range was not used for calculating order limit. This meant that even if you were downloading only a small range it would account bandwidth based on the full segment. This doesn't fully address the accounting since the lazy segment downloads do not send their requested range nor requested limit. Change-Id: Ic811e570c889be87bac4293547d6537a255078da	2021-06-01 09:36:55 +00:00
Michał Niewrzał	59eabcca24	satellite/orders: populate project_bandwidth_daily_rollups table We want to calculate used bandwidth better so we need to calculate it from allocated and settled bandwidth. To do this we first need to populate this new table. https://storjlabs.atlassian.net/browse/PG-56 Change-Id: I308b737bf08ee48ce4e46a3605697ab2095f7257	2021-05-25 18:07:22 +00:00
Egon Elbre	961e841bd7	all: fix error naming errs.Class should not contain "error" in the name, since that causes a lot of stutter in the error logs. As an example a log line could end up looking like: ERROR node stats service error: satellitedbs error: node stats database error: no rows Whereas something like: ERROR nodestats service: satellitedbs: nodestatsdb: no rows Would contain all the necessary information without the stutter. Change-Id: I7b7cb7e592ebab4bcfadc1eef11122584d2b20e0	2021-04-29 15:38:21 +03:00
Egon Elbre	267506bb20	satellite/metabase: move package one level higher metabase has become a central concept and it's more suitable for it to be directly nested under satellite rather than being part of metainfo. metainfo is going to be the "endpoint" logic for handling requests. Change-Id: I53770d6761ac1e9a1283b5aa68f471b21e784198	2021-04-21 15:54:22 +03:00
Egon Elbre	86e698f572	pb: use *UnimplementedServer to avoid breaking API changes Change-Id: I99a34eeb37ac4453411f273511710562a519f57a	2021-03-29 12:26:10 +03:00
Michał Niewrzał	67e26aafcd	Merge remote-tracking branch 'origin/main' into multipart-upload Change-Id: I9b183323cb470185be22f7c648bb76917d2e6fca	2021-03-10 08:53:38 +01:00
Natalie Villasana	c290e5ac9a	satellite/orders: decrease FlushBatchSize default to 1000 The previous default FlushBatchSize of 10000 was causing major slow down in select and insert statements on bucket_bandwidth_rollups. We saw on the saltlake satellite that a FlushBatchSize of 1000 helped reduce contention and query latency. Change-Id: Ib95e73482219bc5aedc11925b1849fa5999774ba	2021-03-02 14:00:48 +00:00
Michał Niewrzał	9a60011774	Merge remote-tracking branch 'origin/main' into multipart-upload Change-Id: Ia90f29be432e207c4125f7f955c912978eabe59a	2021-02-04 09:38:08 +01:00
Ivan Fraixedes	9c9f481469	satellite/orders: Remove deprecated endpoint Remove the orders Settlement endpoint because it isn't used and it was already always returning an error. Change-Id: I81486fbe7044a1444182173bc0693698ee7cfe7e	2021-02-03 23:47:07 +00:00
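
The ordering problem described in commit 3b6e1123b8 (rolling up entries in a map loses any earlier sort order, because Go map iteration order is unspecified) can be illustrated with a minimal sketch. This is not the code from the repository: the types `rollupKey` and `rollup`, the function `aggregateSorted`, and the key order (bucket name, then project ID, then action) are all assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// rollupKey is a hypothetical stand-in for the primary key of a rollup row.
type rollupKey struct {
	BucketName string
	ProjectID  string
	Action     int
}

// rollup is a hypothetical input record carrying an amount for one key.
type rollup struct {
	Key     rollupKey
	Settled int64
}

// less orders keys by bucket name, then project ID, then action.
// The real primary key order may differ; this order is an assumption.
func less(a, b rollupKey) bool {
	if a.BucketName != b.BucketName {
		return a.BucketName < b.BucketName
	}
	if a.ProjectID != b.ProjectID {
		return a.ProjectID < b.ProjectID
	}
	return a.Action < b.Action
}

// aggregateSorted sums records that share a key, then returns the totals
// in key order. The map is only used for aggregation; the output order
// comes from sorting the keys separately, never from ranging over the map.
func aggregateSorted(in []rollup) []rollup {
	totals := make(map[rollupKey]int64, len(in))
	for _, r := range in {
		totals[r.Key] += r.Settled
	}

	keys := make([]rollupKey, 0, len(totals))
	for k := range totals {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return less(keys[i], keys[j]) })

	out := make([]rollup, 0, len(keys))
	for _, k := range keys {
		out = append(out, rollup{Key: k, Settled: totals[k]})
	}
	return out
}

func main() {
	in := []rollup{
		{Key: rollupKey{BucketName: "b", ProjectID: "p1", Action: 2}, Settled: 10},
		{Key: rollupKey{BucketName: "a", ProjectID: "p2", Action: 1}, Settled: 5},
		{Key: rollupKey{BucketName: "b", ProjectID: "p1", Action: 2}, Settled: 7},
	}
	for _, r := range aggregateSorted(in) {
		fmt.Printf("%s %s %d %d\n", r.Key.BucketName, r.Key.ProjectID, r.Key.Action, r.Settled)
	}
}
```

Sorting the keys after aggregation, rather than sorting the input before it, keeps the batch in a deterministic order no matter how the map happens to iterate.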

226 Commits