storj

Author	SHA1	Message	Date
Moby von Briesen	178dbb4683	storagenode/storagenodedb: allow storagenodes to start test_table exists In many cases when a storagenode fails the preflight check, it is due to test_table existing, which is used to determine read/write capabilities after the initial schema verification. If preflight ends early due to a failure or stopped storagenode, it may not get the chance to drop this table. This change excludes test_table from the schema comparison to ensure that it never prevents a storagenode from starting up. It also adds Preflight DB test for storagenode. Change-Id: Ib8e71df2e42fda3b2a364fbf7a801891c5831d39	2020-03-09 14:29:46 -04:00
Yingrong Zhao	20e96d417a	satellite/metainfo: fix data race in test fix flaky test: TestDeletePiecesService_DeletePieces_Timeout Change-Id: Ia707b78adf65967f6466b034a0fbf79f7355c397	2020-03-09 14:59:44 +00:00
Michal Niewrzal	d7b5df70d3	cmd/uplink: remove unused flag New API has limited number of options to configure at the moment. We should remove unused flags from Uplink CLI and add if needed in the future. Change-Id: Icf3f3dadd43cb61a3b408b02d0762aef34425dbf	2020-03-09 13:44:46 +00:00
Michal Niewrzal	c20cf25f35	cmd: migrate uplink CLI to new API Change-Id: I8f8fcc8dd9a68aac18fd79c4071696fb54853a60	2020-03-09 13:26:29 +00:00
Brandon Iglesias	4e9cd77d54	web/satellite: update links to APIKeys documentation	2020-03-09 15:12:23 +02:00
Egon Elbre	0675413f7a	satellite/satellitedb: increase migrate test timeout Change-Id: I789ea22ad463a6c31737e959ec54941b66830188	2020-03-09 14:30:50 +02:00
littleskunk	842c8d8ed9	scripts/tests/rollingupgrade: fix installation for current commit	2020-03-06 17:19:55 -05:00
Moby von Briesen	e4da7bd9cd	satellite/repair/checker: use repair override if available in checker and irreparable In production, the satellite is overriding the default repair threshold (35) to a higher value (52). In some places in the checker and irreparable processes, the repair threshold on the redundancy scheme is used in place of the override value. This fixes those cases. Change-Id: Ie7387217d9fb3886f050b5e5b67be51f276196de	2020-03-06 15:39:53 -05:00
Bill Thorp	e99e675fb1	satellite/satellitedb: use time zones with all timestamps The migration was broken into one migration per table to reduce table locking and reduce the chances of failure due to SQL timeouts. Of the 14 fields that lacked time zones, only the 3 named 'interval_start' seemed to have non-UTC data in them. These fields are fixed in the migration by removing the +00 and adding AT TIME ZONE current_setting('TIMEZONE'). Fields with good data are migrated by adding AT TIME ZONE 'UTC'. Note that postgres's timezone() is different than cockroach's timezone() so AT TIME ZONE is used. https://storjlabs.atlassian.net/browse/SM-104 Change-Id: I410f2f1d7c11b143f17844347f37e6f4b1e70fce	2020-03-05 21:11:25 +00:00
Jennifer Johnson	0d60c1a4b2	satellite/audit: fix checkSegmentAltered to detect segments that have changed during an audit - Previously, checkSegmentAltered only checked for segments that were replaced but we want to detect all changes to a segment that occurred while an audit was being conducted. - Fixed a bug where nodes failing audits during reverify for non-piece-hash-verified segments were not being removed from containment mode. - Filled in gaps in reverify testing to ensure nodes are properly removed from containment. Change-Id: Icd96d369278987200fd28581395725438972b292	2020-03-05 19:05:39 +00:00
Ivan Fraixedes	e6d452decd	satellite/accounting: Billing tests wait for SNs The billing tests were flaky because some assertions ran before the storage nodes finished their work. A new helper function in testplanet has been added to wait for storage node endpoints to finish their work. This function is now used in the billing tests to avoid their flakiness. This commit closes the ticket: https://storjlabs.atlassian.net/browse/SM-403 A part of fixing other billing tests' flakiness. Change-Id: Iacb750af435f515c04b1e1d3510a218d184c9abc	2020-03-05 12:37:24 +01:00
Egon Elbre	f4d5d89b68	private/testplanet: add WaitForStorageNodeEndpoints After calling uplink.Upload it is not guaranteed that the storage node has yet saved all the orders since it happens asynchronously. Hence we need a separate func to wait for them to complete. Change-Id: I0c34b3ea6c98dbcf37f80493c0e10a8bdbbb2aaf	2020-03-05 10:33:56 +00:00
Michal Niewrzal	9f390f37da	satellite/metainfo: return default ciphers (path and encryption) for old uplinks New libuplink is not storing encryption values with the bucket, but old uplinks are using those values for configuration. If a bucket was created with the new libuplink, we will send back satellite defaults. Change-Id: Ie1bf3682847e07b302270b4c4bf1a7219f4bf011	2020-03-05 10:04:50 +00:00
Ivan Fraixedes	a7f927df96	satellite/accounting: Disable billing test Disable a billing test that sometimes fails in the CI. Change-Id: Ib77ff32060b2303822f36fdd1774d8a29d7d94a6	2020-03-05 10:46:29 +01:00
Jessica Grebenschikov	2af71f3460	satellite/orders: add monkit to looking up node addr Change-Id: Ia0eb0ffc343879a6ef9827d46e936e1fbc2e198a	2020-03-04 23:15:18 +00:00
Fadila Khadar	5c9becb9be	satellite/orders: billing partial download Submit an order limit with a high amount but the order has a low amount of traffic. Make sure the order amount is used for billing. Change-Id: I6b6ae26e9b8896f4a3acf530b2f48510b6df89cc	2020-03-04 17:12:50 +00:00
Jennifer Johnson	1c1750e6be	removes bandwidth limiting On satellite, remove all references to free_bandwidth column in nodes table. On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated. Protobuf message, NodeCapacity, is left intact for backwards compatibility. Once this is released to all satellites, we can drop the column from the DB. Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838	2020-03-04 14:04:00 +00:00
Egon Elbre	5f2ca0338b	satellite/satellitedb: fix err and close order Change-Id: Ied927275853c4cf4a8ccb500048d50545f6c6efe	2020-03-04 09:05:22 +00:00
Jessica Grebenschikov	bcb0453db2	upgrade dependencies for trace db debug endpoint Change-Id: I4de658b361bb39ce28dc31b982895bb4f45b580a	2020-03-04 07:35:34 +00:00
littleskunk	8fa8178f04	release/rollingupgrade: on release tags run rolling upgrade against previous release (#3792) Co-authored-by: Stefan Benten <mail@stefan-benten.de>	2020-03-03 23:56:32 +01:00
Cameron Ayer	7244a6a84e	storagenode/{contact, piecestore}: implement low disk notification with cooldown When a storagenode begins to run low on capacity, we want to notify the satellite before completely running out of space. To achieve this, at the end of an upload request, the SN checks if its available space has fallen below a certain threshold. If so, trigger a notification to the satellites. The new NotifyLowDisk method on the monitor chore is implemented using the common/sync2.Cooldown type, which allows us to execute contact only once within a given timeframe, avoiding hammering the satellites with requests. This PR contains changes to the storagenode/contact package, namely moving methods involving the actual satellite communication out of Chore and into Service. This allows us to ping satellites from the monitor chore. Change-Id: I668455748cdc6741291b61130d8ef9feece86458	2020-03-03 10:45:37 -05:00
Michal Niewrzal	d384e48ad7	private/testplanet: set rollout seed to avoid warnings in logs Each test log starts with warnings like this: "rollout config error: empty seed {"binary": "Identity"}". It makes no sense to print them and pollute the output. Change-Id: Ib50e28d09d8b259106d3b79d8f1262954a7aed63	2020-03-03 12:58:54 +00:00
Egon Elbre	decb2ec69a	private/processgroup: moved to storj.io/common/processgroup Change-Id: I1ec0bb440dda757d8f9a6f564a0084dde2f9cc84	2020-03-03 10:50:33 +00:00
Jeff Wendling	a02424a220	pkg/server: use common implementation for user timeouts Change-Id: Id6d7f1179df9a90819708d101a94939b7df70039	2020-03-03 10:06:45 +00:00
Jeff Wendling	443aa08a06	private/dbutil/txutil: remove the individual retry events Change-Id: I63d06e57d7e6723b4d00d51f77c46345a11c4671	2020-03-03 08:38:19 +00:00
Moby von Briesen	f495544c56	satellite/satellitedb/dbx: add fields to node table for placing nodes into suspended mode for too many unknown-error audits Change-Id: Iac9a619e5c08377de87ffdf4acdd0155027f5eb3	2020-03-03 03:30:59 +00:00
Qweder93	484ec7463a	storagenode: notifications on outdated software version Change-Id: If19b075c78a7b2c441e11b783c3c09fed55060c7	2020-03-02 16:48:02 +00:00
paul cannon	4d3db68283	cmd/gateway: fix go.mod formatting Go is continually rewriting this file this way, making it Git-dirty, and it makes me sad Change-Id: I71cd630259a8bbeeffaa3dc9435562ecfc4e6487	2020-02-28 18:00:55 -06:00
igor gaidaienko	df88f416c9	satellite/accounting: Add test billing download traffic post deletion Test checking that download traffic gets billed even if the file and bucket were deleted Change-Id: Ifd67a8cd4b46d75ed48c86698e18c99f60bc39dc	2020-02-28 11:52:04 +00:00
Ivan Fraixedes	d64ef3d898	satellite/accounting: Test billing download/upload traffic Add a test for checking that the billing: * doesn't include upload traffic * includes download traffic Change-Id: I1655c15c1fad642f77dd210f2014b2586ae10104	2020-02-28 09:36:51 +00:00
Michal Niewrzal	4deab5ac6c	satellite/metainfo: combine CommitSegment and CommitObject in batch v2 This change is a special case for batch processing. If in batch request CommitSegment and CommitObject are one after another we can execute these requests as one. This will avoid current logic where we are saving pointer for CommitSegment and later we are deleting this pointer and saving it once again as under last segment path for CommitObject. This change should handle issue we have in older uplinks with incorrect order of storing pointers. Change-Id: I86514c95df169e6fbc91b52e5117472cae70cb8b	2020-02-28 07:40:36 +00:00
Jeff Wendling	1db087cfba	satellite/satellitedb: migration to create tables for compensation these tables are used in future commits with respect to the new storagenode payments code. if we create them now, it will make backfilling them with historical data easier. Change-Id: I3c08c9770ec5b2baa38b4f2fd18c2f07746a61c2	2020-02-27 17:34:50 +00:00
Moby von Briesen	6043d01c90	satellite/audit/verifier: add metric for number of successfully downloaded shares Change-Id: Ia4f1dc6e088db802e340aaecf80cc7ef6dc237a4	2020-02-27 14:33:59 +00:00
Egon Elbre	1f7c3be8f9	private/testplanet: add option to run testplanet databases non-parallel NonParallel running is needed for gateway tests, because minio unfortunately relies on global state. Change-Id: If730db2ab86d10f4d02e1ac3128f758e9c18cdff	2020-02-27 15:49:22 +02:00
Michal Niewrzal	fb2711d05e	scripts: update postgres helper script to set password Latest postgres docker image requires non empty password. Change-Id: I03017e1b7ff4803fefc24c39087d9ccd4042373b	2020-02-27 10:33:37 +00:00
NickolaiYurchenko	b0d2cf0e4d	web/storagenode: on logo click action added Change-Id: Iea8cd906a7220d5cd9dd96cd041bf8e7e378e455	2020-02-27 10:02:01 +00:00
Jeff Wendling	2b9f28b029	satellite/accounting/reportedrollup: remove expiration check Remove the check around consuming an expired serial so that we have more time to run the migration. It does open a small race of double spends for entries already counted and then added to the pending queue right around when they're going to expire and the consumed serials have already been removed, but that should be rare if we keep the pending queue empty. Change-Id: I000b15979b09c67751281ff675ea6c81fc9d22dc	2020-02-26 15:35:10 -07:00
Egon Elbre	f85606b5a7	private/grpctlsopts: grpc related tlsopts This moves grpc related tlsopts methods to private/grpctlsopts. This allows to remove grpc dependency from tlsopts. Change-Id: I25090b82b1e7a0633417ad600f8587b0c30ace73	2020-02-26 22:46:06 +02:00
Moby von Briesen	d5540c89a1	satellite/repair/checker: add monkit metrics for segments immediately above repair threshold Record counts for segments at health=rt+1 through health=rt+5 for every checker iteration. Change-Id: I2a00c0bc34d17beb21cacdeab4dac77f755faefe	2020-02-26 20:27:15 +00:00
Egon Elbre	46228fee92	cmd/gateway: use proper module name By using a require for storj.io/storj it will make the import unambiguous. This means it is possible to have a module name storj.io/storj/cmd/gateway. Change-Id: I98439cbbaf433ae31309b7f80a19ced896018f65	2020-02-26 21:44:40 +02:00
Egon Elbre	64330c55b3	all: use pbgrpc common/pb moved grpc to a separate package common/pb/pbgrpc. This updates this repository to use it. Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e	2020-02-26 21:27:47 +02:00
Egon Elbre	8822e98c1f	cmd/gateway: simplify module handling Change-Id: If6ed158a6c9568fa33f69ca2d52e231ee4fcb0cb	2020-02-26 17:59:45 +00:00
Egon Elbre	89e5c77d83	satellite/metainfo: track observer timing Measure total time spent in each observer and distribution of handling pointers by pointer type. Change-Id: I2d125dfce8dbbb17225029fa35557bc106491151	2020-02-26 17:42:56 +00:00
Moby von Briesen	4e5a7f13c7	satellite/repair/queue: Prioritize selection of items off repair queue by segment health Add a column to the repair queue table in the satellite db for healthy piece count. When an item is selected from the repair queue, the least durable segment that has not been attempted in the past hour should be selected first. This prevents our repairer from getting stuck doing work on segments that are close to the repair threshold while allowing segments that are more unhealthy to degrade further. The migration also clears the repair queue so that the migration runs quickly and we can properly account for segment health in future repair work. We do not select items off the repair queue that have been attempted in the past six hours. This was changed from one hour to allow us time to try a wider variety of segments when the repair queue is very large. Change-Id: Iaf183f1e5fd45cd792a52e3563a3e43a2b9f410b	2020-02-26 09:54:16 -05:00
Yingrong Zhao	ac34485f5d	scripts/tests: install correct version of gateway 1. only run release tags that don't contain 'rc' 2. install gateway version that's the same as satellite 3. update gateway access to contain satellite id Change-Id: I8ca1418302c3aafdf0c4eaaf8361422a1eec2bd4	2020-02-26 13:12:31 +00:00
VitaliiShpital	9a8db05836	web/satellite: updating billing history after render added Change-Id: Ic7f3d4734d010759ed31bbae330c84f56057f370	2020-02-26 12:18:57 +00:00
NikolaiYurchenko	fc105af0e5	web/satellite: user select text restricted Change-Id: If3692d55e48255c95b7722c5a574060c84fdf502	2020-02-26 11:13:56 +00:00
Simon Guindon	594d6e03aa	docs/blueprints: Add design doc for distributed tracing. Change-Id: I98f76f857d1a6ccd384adc6287137b46e37b9904	2020-02-25 20:29:05 +00:00
Isaac Hess	e486a073cb	docs: Add uplink telemetry doc Change-Id: I6f47ef4af80d0c76a32dc360f8809a526a4e948f	2020-02-25 17:52:34 +00:00
Jessica Grebenschikov	e19e3c1101	pkg/process: Now that we are trying to identify the root cause of the satellite load limitations (i.e. currently the satellite has a max ability of 400 rps for uploads and we need this to be higher), we are using the golang diagnostic tools to collect insight into what the bottlenecks are. We currently have a debug endpoint to gather some cpu and mem data, but it could be useful to have continuous profiling. GCP stackdriver has support for continuous profiling so let's set that up and see if it is helpful to gather more data. This PR adds support for [GCP continuous profiler](https://cloud.google.com/profiler) which allows enabling continuous cpu/mem profiling and the stats are sent to stackdriver in google cloud console. To enable the continuous profiling for a storj component, do the following: - prereq: the workload must be running in GKE and have Stackdriver Profiling IAM role permissions - provide the config flag `debug.profilename` in the config.yaml file for the workload (i.e. satellite api process, etc). The profilename should be the workload name, for example "satellite-api". - once the above config flag is provided, the profiler will be initialized and profiling stats will automatically be sent to the GCP project where the workload is running and viewable in the Stackdriver Profile page in the console The current implementation assumes the workload is running in GKE, however if we find it useful we can add support to enable this from anywhere. But for simplicity, it's configured this way assuming the main goal is to enable in production systems. Change-Id: Ibf8ebe2df7bf06fdd4951ee6a1e48854dd36ad47	2020-02-25 09:04:23 -08:00

3572 Commits