3602 Commits

Author SHA1 Message Date
Cameron Ayer
7244a6a84e storagenode/{contact, piecestore}: implement low disk notification with cooldown
When a storagenode begins to run low on capacity, we want to notify
the satellite before completely running out of space. To achieve this,
at the end of an upload request, the SN checks if its available space has
fallen below a certain threshold. If so, trigger a notification to the
satellites.

The new NotifyLowDisk method on the monitor chore is implemented using the
common/sync2.Cooldown type, which allows us to execute contact only once
within a given timeframe, avoiding hammering the satellites with requests.
This PR contains changes to the storagenode/contact package, namely moving
methods involving the actual satellite communication out of Chore and into
Service. This allows us to ping satellites from the monitor chore.

Change-Id: I668455748cdc6741291b61130d8ef9feece86458
2020-03-03 10:45:37 -05:00
Michal Niewrzal
d384e48ad7 private/testplanet: set rollout seed to avoid warnings in logs
Each test log starts with warnings like this: "rollout config
error: empty seed {"binary": "Identity"}". It makes no sense to print them
and pollute the output.

Change-Id: Ib50e28d09d8b259106d3b79d8f1262954a7aed63
2020-03-03 12:58:54 +00:00
Egon Elbre
decb2ec69a private/processgroup: moved to storj.io/common/processgroup
Change-Id: I1ec0bb440dda757d8f9a6f564a0084dde2f9cc84
2020-03-03 10:50:33 +00:00
Jeff Wendling
a02424a220 pkg/server: use common implementation for user timeouts
Change-Id: Id6d7f1179df9a90819708d101a94939b7df70039
2020-03-03 10:06:45 +00:00
Jeff Wendling
443aa08a06 private/dbutil/txutil: remove the individual retry events
Change-Id: I63d06e57d7e6723b4d00d51f77c46345a11c4671
2020-03-03 08:38:19 +00:00
Moby von Briesen
f495544c56 satellite/satellitedb/dbx: add fields to node table for placing nodes into suspended mode for too many unknown-error audits
Change-Id: Iac9a619e5c08377de87ffdf4acdd0155027f5eb3
2020-03-03 03:30:59 +00:00
Qweder93
484ec7463a storagenode: notifications on outdated software version
Change-Id: If19b075c78a7b2c441e11b783c3c09fed55060c7
2020-03-02 16:48:02 +00:00
paul cannon
4d3db68283 cmd/gateway: fix go.mod formatting
Go is continually rewriting this file this way, making it Git-dirty,
and it makes me sad

Change-Id: I71cd630259a8bbeeffaa3dc9435562ecfc4e6487
2020-02-28 18:00:55 -06:00
igor gaidaienko
df88f416c9 satellite/accounting: Add test billing download traffic post deletion
Test checking that download traffic gets billed even if the file and bucket were deleted

Change-Id: Ifd67a8cd4b46d75ed48c86698e18c99f60bc39dc
2020-02-28 11:52:04 +00:00
Ivan Fraixedes
d64ef3d898 satellite/accounting: Test billing download/upload traffic
Add a test checking that billing:

* doesn't include upload traffic
* includes download traffic

Change-Id: I1655c15c1fad642f77dd210f2014b2586ae10104
2020-02-28 09:36:51 +00:00
Michal Niewrzal
4deab5ac6c satellite/metainfo: combine CommitSegment and CommitObject in batch v2
This change is a special case for batch processing. If CommitSegment and
CommitObject come one after another in a batch request, we can execute
them as one. This avoids the current logic, where we save the pointer
for CommitSegment, then delete it and save it once again under the last
segment path for CommitObject.

This change should handle the issue we have in older uplinks with an
incorrect order of storing pointers.

Change-Id: I86514c95df169e6fbc91b52e5117472cae70cb8b
2020-02-28 07:40:36 +00:00
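
The pairing rule described in 4deab5ac6c (a CommitSegment immediately followed by a CommitObject is handled as one step) might look like the sketch below. It is an assumption-laden illustration: `requestKind`, the constants, and `combineCommits` are hypothetical, and the real batch items are protobuf request messages rather than strings.

```go
package main

import "fmt"

// requestKind is a hypothetical label for one entry in a batch request.
type requestKind string

const (
	beginObject   requestKind = "BeginObject"
	commitSegment requestKind = "CommitSegment"
	commitObject  requestKind = "CommitObject"
	combined      requestKind = "CommitSegment+CommitObject"
)

// combineCommits walks the batch in order and merges every CommitSegment
// that is immediately followed by a CommitObject into one combined step.
// All other entries are kept unchanged and in their original order.
func combineCommits(batch []requestKind) []requestKind {
	out := make([]requestKind, 0, len(batch))
	for i := 0; i < len(batch); i++ {
		if batch[i] == commitSegment && i+1 < len(batch) && batch[i+1] == commitObject {
			out = append(out, combined)
			i++ // skip the CommitObject that was merged into this step
			continue
		}
		out = append(out, batch[i])
	}
	return out
}

func main() {
	batch := []requestKind{beginObject, commitSegment, commitSegment, commitObject}
	// Only the last CommitSegment is adjacent to CommitObject, so only
	// that pair is merged.
	fmt.Println(combineCommits(batch))
}
```
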
Jeff Wendling
1db087cfba satellite/satellitedb: migration to create tables for compensation
These tables are used in future commits with respect to the new
storagenode payments code. If we create them now, it will make
backfilling them with historical data easier.

Change-Id: I3c08c9770ec5b2baa38b4f2fd18c2f07746a61c2
2020-02-27 17:34:50 +00:00
Moby von Briesen
6043d01c90 satellite/audit/verifier: add metric for number of successfully downloaded shares
Change-Id: Ia4f1dc6e088db802e340aaecf80cc7ef6dc237a4
2020-02-27 14:33:59 +00:00
Egon Elbre
1f7c3be8f9 private/testplanet: add option to run testplanet databases non-parallel
NonParallel running is needed for gateway tests, because minio
unfortunately relies on global state.

Change-Id: If730db2ab86d10f4d02e1ac3128f758e9c18cdff
2020-02-27 15:49:22 +02:00
Michal Niewrzal
fb2711d05e scripts: update postgres helper script to set password
The latest postgres Docker image requires a non-empty password.

Change-Id: I03017e1b7ff4803fefc24c39087d9ccd4042373b
2020-02-27 10:33:37 +00:00
NickolaiYurchenko
b0d2cf0e4d web/storagenode: on logo click action added
Change-Id: Iea8cd906a7220d5cd9dd96cd041bf8e7e378e455
2020-02-27 10:02:01 +00:00
Jeff Wendling
2b9f28b029 satellite/accounting/reportedrollup: remove expiration check
Remove the check around consuming an expired serial so that we
have more time to run the migration. It does open a small race
of double spends for entries already counted and then added to
the pending queue right around when they're going to expire and
the consumed serials have already been removed, but that should
be rare if we keep the pending queue empty.

Change-Id: I000b15979b09c67751281ff675ea6c81fc9d22dc
2020-02-26 15:35:10 -07:00
Egon Elbre
f85606b5a7 private/grpctlsopts: grpc related tlsopts
This moves grpc related tlsopts methods to private/grpctlsopts.
This allows removing the grpc dependency from tlsopts.

Change-Id: I25090b82b1e7a0633417ad600f8587b0c30ace73
2020-02-26 22:46:06 +02:00
Moby von Briesen
d5540c89a1 satellite/repair/checker: add monkit metrics for segments immediately above repair threshold
Record counts for segments at health=rt+1 through health=rt+5 for every checker
iteration.

Change-Id: I2a00c0bc34d17beb21cacdeab4dac77f755faefe
2020-02-26 20:27:15 +00:00
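
The per-iteration counts described in d5540c89a1 (segments at health rt+1 through rt+5, where rt is the repair threshold) could be gathered as in this sketch. The function name, the input shape, and the sample values are hypothetical, and the real checker reports the counts through monkit rather than returning an array.

```go
package main

import "fmt"

// countNearThreshold returns an array where counts[k-1] is the number of
// segments whose healthy piece count equals repairThreshold+k, for k = 1..5.
// Segments at or below the threshold, or more than 5 above it, are ignored.
func countNearThreshold(healths []int, repairThreshold int) [5]int {
	var counts [5]int
	for _, h := range healths {
		over := h - repairThreshold
		if over >= 1 && over <= 5 {
			counts[over-1]++
		}
	}
	return counts
}

func main() {
	// Healthy piece counts seen in one checker iteration (made-up values).
	healths := []int{35, 36, 36, 40, 41, 52}
	fmt.Println(countNearThreshold(healths, 35))
}
```
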
Egon Elbre
46228fee92 cmd/gateway: use proper module name
By using a require for storj.io/storj it will make the import
unambiguous. This means it is possible to have a module name
storj.io/storj/cmd/gateway.

Change-Id: I98439cbbaf433ae31309b7f80a19ced896018f65
2020-02-26 21:44:40 +02:00
Egon Elbre
64330c55b3 all: use pbgrpc
common/pb moved grpc to a separate package common/pb/pbgrpc.
This updates this repository to use it.

Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e
2020-02-26 21:27:47 +02:00
Egon Elbre
8822e98c1f cmd/gateway: simplify module handling
Change-Id: If6ed158a6c9568fa33f69ca2d52e231ee4fcb0cb
2020-02-26 17:59:45 +00:00
Egon Elbre
89e5c77d83 satellite/metainfo: track observer timing
Measure total time spent in each observer and distribution of handling
pointers by pointer type.

Change-Id: I2d125dfce8dbbb17225029fa35557bc106491151
2020-02-26 17:42:56 +00:00
Moby von Briesen
4e5a7f13c7 satellite/repair/queue: Prioritize selection of items off repair queue by segment health
Add a column to the repair queue table in the satellite db for healthy
piece count. When an item is selected from the repair queue, the least
durable segment that has not been attempted in the past hour should be
selected first. This prevents our repairer from getting stuck doing work
on segments that are close to the repair threshold while allowing
segments that are more unhealthy to degrade further.

The migration also clears the repair queue so that the migration runs
quickly and we can properly account for segment health in future repair
work.

We do not select items off the repair queue that have been attempted in
the past six hours. This was changed from one hour to allow us time to
try a wider variety of segments when the repair queue is very large.

Change-Id: Iaf183f1e5fd45cd792a52e3563a3e43a2b9f410b
2020-02-26 09:54:16 -05:00
Yingrong Zhao
ac34485f5d scripts/tests: install correct version of gateway
1. only run release tags that don't contain 'rc'
2. install gateway version that's the same as satellite
3. update gateway access to contain satellite id

Change-Id: I8ca1418302c3aafdf0c4eaaf8361422a1eec2bd4
2020-02-26 13:12:31 +00:00
VitaliiShpital
9a8db05836 web/satellite: updating billing history after render added
Change-Id: Ic7f3d4734d010759ed31bbae330c84f56057f370
2020-02-26 12:18:57 +00:00
NikolaiYurchenko
fc105af0e5 web/satellite: user select text restricted
Change-Id: If3692d55e48255c95b7722c5a574060c84fdf502
2020-02-26 11:13:56 +00:00
Simon Guindon
594d6e03aa docs/blueprints: Add design doc for distributed tracing.
Change-Id: I98f76f857d1a6ccd384adc6287137b46e37b9904
2020-02-25 20:29:05 +00:00
Isaac Hess
e486a073cb docs: Add uplink telemetry doc
Change-Id: I6f47ef4af80d0c76a32dc360f8809a526a4e948f
2020-02-25 17:52:34 +00:00
Jessica Grebenschikov
e19e3c1101 pkg/process:
Now that we are trying to identify the root cause of the satellite load limitations (i.e. currently the satellite has a max ability of 400 rps for uploads and we need this to be higher), we are using the golang diagnostic tools to collect insight into what the bottlenecks are. We currently have a debug endpoint to gather some cpu and mem data, but it could be useful to have continuous profiling. GCP stackdriver has support for continuous profiling, so let's set that up and see if it is helpful to gather more data.

This PR adds support for [GCP continuous profiler](https://cloud.google.com/profiler) which allows enabling continuous cpu/mem profiling and the stats are sent to stackdriver in google cloud console.

To enable the continuous profiling for a storj component, do the following:
- prereq: the workload must be running in GKE and have Stackdriver Profiling IAM role permissions
- provide the config flag `debug.profilename` in the config.yaml file for the workload (e.g. the satellite api process). The profilename should be the workload name, for example "satellite-api".
- once the above config flag is provided, the profiler will be initialized and profiling stats will automatically be sent to the GCP project where the workload is running, where they are viewable on the Stackdriver Profile page in the console

The current implementation assumes the workload is running in GKE; however, if we find it useful, we can add support to enable this from anywhere. For simplicity, it's configured this way on the assumption that the main goal is to enable it in production systems.

Change-Id: Ibf8ebe2df7bf06fdd4951ee6a1e48854dd36ad47
2020-02-25 09:04:23 -08:00
VitaliiShpital
b387c6b90c web/satellite: overview page implemented
Change-Id: I66a8e17635040730906bd6f9c20924abd0db0744
2020-02-25 14:28:00 +00:00
Egon Elbre
29452d82a5 go.mod: unlock graphql dependency and bump to latest
Change-Id: I40026f6c8de155e024f5fbb51105546065393034
2020-02-25 13:17:49 +02:00
Egon Elbre
9752d01884 private/prompt: remove dependency to go-prompt
Change-Id: Ida8ef731ce806cec076343dc77d72a3b0d7736b4
2020-02-25 13:09:41 +02:00
JT Olio
50a21de9dc traces: fix memory leak for long running traces that aren't being collected
for real this time. i'm so ashamed

Change-Id: Ib05bb50d8e947dec2d872fd53e71eec561c2d0e8
2020-02-24 15:02:26 -07:00
Cameron Ayer
d578102672 storagenode/piecestore: add workgroup to endpoint to prevent stray goroutine after shutdown
Change-Id: Ie8444c3c8f870745b73342de2e9a93027fcad371
2020-02-24 21:38:52 +00:00
paul cannon
92d86fa044 satellite/repair: fix repair concurrency
This new repair timeout (configured as TotalTimeout) will include both
the time to download pieces and the time to upload pieces, as well as
the time to pop the segment from the repair queue.

This is a move from GitHub PR #3645.

Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8
2020-02-24 19:57:09 +00:00
Cameron Ayer
f22bddf122 {storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start
Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801
2020-02-24 18:24:23 +00:00
VitaliiShpital
8ea620b3c4 satellite/console: redirecting to login after activation implemented
Change-Id: Ibcf65f5d4664ac41c795f5ceb0a94bcd42673004
2020-02-24 19:52:28 +02:00
Jeff Wendling
f671eb2beb satellite/satellitedb: use queue for orders to get back fast billing
This change adds two new tables to process orders as fast as we used
to but in an asynchronous manner and with hopefully less storage
usage. This should help scale on cockroach, but limits us to one
worker. It lays the groundwork for the order processing pipeline to
be queue rather than database driven.

For more details, see the added fast billing changes blueprint.

It also fixes the orders db so that all the timestamps that are
passed to columns that do not contain a time zone are converted to
UTC at the last possible opportunity, making it less likely to use
the APIs incorrectly. We really should migrate to include timezones
on all of our timestamp columns.

Change-Id: Ibfda8e7a3d5972b7798fb61b31ff56419c64ea35
2020-02-24 17:07:07 +00:00
Qweder93
dca6fcbe28 satellite/payments/stripecoinpayments: credits added to invoice calculations
Change-Id: I6d3f5244a46f8945d2703af39ced333940db34e9
2020-02-24 16:48:27 +00:00
VitaliiShpital
985c3ef897 satellite/console: handling graphql errors bug fix
Change-Id: Ib20786485b0ea448e388912bb8406030d4fae1f7
2020-02-24 16:22:09 +00:00
Egon Elbre
e30f7b35b6 cmd/gateway: use a separate repository
Change-Id: Idbb0b2b6cf0e60c6d5d91218c24524d72285cf26
2020-02-24 10:03:03 +02:00
Yingrong Zhao
5011e78311 storagenode/piecestore: remove unused DeletePiece endpoint
With commit 3331b443e7, the satellite will
start calling `DeletePieces`. Therefore, we can remove the old endpoint
once the above commit is deployed to all satellites.

Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb
2020-02-21 21:03:49 +00:00
Yingrong Zhao
a645e52ed9 satellite/metainfo: remove DeletePieces_node_id metric
Change-Id: I2cb10d411aa2912b256754a24d5c150e9536b4d3
2020-02-21 20:33:33 +00:00
Rafael Gomes
5132d285db cmd/statreceiver: Add instance tag to influx metric
Change-Id: I6545915c5cb93f6349c7b9d90f39e7d67c29038c
2020-02-21 16:33:00 -03:00
Yaroslav Vorobiov
f185adcf7c satellite/payments: fix projects list pagination
Change-Id: I342e69a17be34a503c1e0cef18ee009f1921fcd4
2020-02-21 19:37:11 +02:00
NikolaiYurchenko
2601f25c98 web/storagenode: notification logic implementation
Change-Id: Iec741997312203117213674ef85125fa8a976249
2020-02-21 15:49:27 +00:00
Michal Niewrzal
54e38b8986 pkg/miniogw: gateway implementation with new libuplink
Change-Id: I170c3a68cfeea33b528eeb27e6aecb126ecb0365
2020-02-21 16:20:38 +01:00
Egon Elbre
5342dd9fe6 go.mod: update uplink
Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811
2020-02-21 16:08:12 +02:00
Ivan Fraixedes
0a8f268a7e go.mod: Update golang.org/x/crypto to fix vulnerability
Update the golang.org/x/crypto package to fix the vulnerability
CVE-2020-9283.

See https://groups.google.com/forum/#!topic/golang-nuts/XDqhhjZViNk

Change-Id: I7c841c0bae0f55dad0c7de19ac70c730d11733f0
2020-02-21 11:30:05 +01:00