Commit Graph

3965 Commits

Author SHA1 Message Date
Moby von Briesen
de366537a8 satellite/satellitedb/overlaycache: fix behavior around gracefully exited nodes
Sometimes nodes who have gracefully exited will still be holding pieces
according to the satellite. This has some unintended side effects
currently, such as nodes getting disqualified after having successfully
exited.
* When the audit reporter attempts to update node stats, do not update
stats (alpha, beta, suspension, disqualification) if the node has
finished graceful exit (audit/reporter_test.go TestGracefullyExitedNotUpdated)
* Treat gracefully exited nodes as "not reputable" so that the repairer
and checker do not count them as healthy (overlay/statdb_test.go
TestKnownUnreliableOrOffline, repair/repair_test.go
TestRepairGracefullyExited)

Change-Id: I1920d60dd35de5b2385a9b06989397628a2f1272
2020-04-28 23:58:43 +00:00
Isaac Hess
baccfd36b1 private/testplanet: Mark sn peer deleter test mode
When running testplanet tests, mark storagenode peer PieceDeleter as in
testing mode so that you don't have to do it on each test.

Change-Id: I2592e02c63f8bcc9152ecf436bac4e798b08bccf
2020-04-28 15:57:29 -06:00
Yingrong Zhao
c5309a3f91 cmd/uplink: set sample rate for tracing to be 1 when tracing is enabled
When tracing is enabled, we should also set the sampling rate to
a non-zero value. For now, we will set it to 1.
Uplink CLI users should be able to override it with the sample
flag.

Change-Id: I8bcf514fb14c2a1c4349b7957dd24ec23e4a85e5
2020-04-28 20:15:28 +00:00
Jeff Wendling
42f63c6538 satellite/compensation: add offline status tracking
Change-Id: I52e615d3db186416ee95029dc72df626f0e69ad7
2020-04-28 19:35:59 +00:00
Egon Elbre
cafa7a5f0b satellite/admin: add a readme about endpoints
Change-Id: I720fcdf6af3860ad4c8566de85c39e9f06f7cb01
2020-04-28 19:13:45 +00:00
Egon Elbre
85c45cd56f private/dbutil/pgtest: support multiple databases for testing
Currently Cockroach isn't performant for concurrent database setup and
tear-down. Instead of a single instance allow setting multiple potential
connection strings and let the tests pick one connection string
randomly.

This improves test duration by ~10 minutes.

While we are significantly changing how pgtest works, introduce the
helpers PickPostgres and PickCockroach for selecting the database, to
reduce code duplication in multiple places.

Change-Id: I8ad171d5c4c8a4fc081ec2ae9bdd0cc948a80619
2020-04-28 21:55:49 +03:00
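
The random selection described in 85c45cd56f can be sketched roughly as below. This is a minimal illustration, not the pgtest code: the function name pickRandom, the ";" separator, and the connection strings are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// pickRandom returns one entry from a semicolon-separated list of
// connection strings, or "" when the list has no usable entries.
// The name and the ";" separator are assumptions for this sketch.
func pickRandom(list string) string {
	var candidates []string
	for _, s := range strings.Split(list, ";") {
		if s = strings.TrimSpace(s); s != "" {
			candidates = append(candidates, s)
		}
	}
	if len(candidates) == 0 {
		return ""
	}
	return candidates[rand.Intn(len(candidates))]
}

func main() {
	// Placeholder connection strings, not real endpoints.
	dbs := "cockroach://root@localhost:26257/a; cockroach://root@localhost:26258/b"
	fmt.Println(pickRandom(dbs))
}
```

Spreading tests over several instances this way trades reproducibility of the exact database used for less contention on any single one.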
Natalie Villasana
6f84be133a satellite/metainfo: add MigrateToLatest to PointerDB
In cases like the segment reaper script connecting to the metainfodb,
we don't want a db migration to happen automatically when we call
metainfo.NewStore. This adds MigrateToLatest method for postgreskv
and cockroachkv, and calls MigrateToLatest in places where NewStore used
to create tables.

Change-Id: I682d0f26d609af0601dfdb32a24866cdf5d32a7e
2020-04-28 17:26:35 +00:00
Isaac Hess
13bf0c62ab satellite/pieces: Fix race in piece deleter
There was a race in the test code for piece deleter, which made it
possible to broadcast on the condition variable before anyone was
waiting. This change fixes that and has Wait take a context so it times
out with the context.

Change-Id: Ia4f77a7b7d2287d5ab1d7ba541caeb1ba036dba3
2020-04-28 10:50:20 -06:00
VitaliiShpital
a3eeab2919 web/satellite: UI unit tests for billing history
Change-Id: I60e67fd0a998737dca3a77389f40aefb56311c2c
2020-04-28 18:27:18 +03:00
Bill Thorp
849326efee satellite/console: cleanup rate limiter
Changed == to >= just in case; removed TODOs after being convinced by Isaac

Change-Id: Ibe8e5aafb3accfd3abb153bc315ebad223d55d15
2020-04-28 13:26:23 +00:00
NickolaiYurchenko
51bf2b6155 web/storagenode: dependencies for testing and payout store tests added
Change-Id: Iae18d073ba35ba8ac48e2d4c88476b38b96bbd9b
2020-04-28 15:44:13 +03:00
Yingrong Zhao
9b4a3f8fcc cmd/uplink: use tracing.enabled flag
Previously we used tracing.sampled as the switch for turning tracing on/off.
However, we would like to separate the sampling rate from the switch,
so we can set the sampling rate to 0 but still initialize tracing for
satellite and storagenodes.

Change-Id: I27e6ba25ea6f6b612b4e1a57cf1301889ded41ec
2020-04-27 17:54:57 +00:00
Egon Elbre
ef913be234 satellite/satellitedb/satellitedbtest: don't use subtest naming
A/B indicates that B is a subtest of A, however in this case they
represent a configuration of the test, not a subtest.

Change-Id: I64eed5d5bcb12759e54fe4b5373f8e88488e50f7
2020-04-27 19:32:09 +03:00
Isaac Hess
db0371703f storagenode/pieces: Return UnhandledCount to satellite
When we receive a piece deletion request, include the number of piece
IDs we couldn't add to the queue in the response.

Change-Id: Ibebbe92ac50105bb5c74b18211ed38d468eb33f3
2020-04-27 08:56:56 -06:00
Isaac Hess
edda8d73bd storagenode/pieces: Piece deleter monitor queue
Each time we process a piece deletion on the storagenode, monitor how
long the item was in the queue and the size of the queue.

Change-Id: I23f1a44f8b9cecb901bdf4739d55c005ffed4bef
2020-04-27 08:55:43 -06:00
VitaliiShpital
befe7574e1 web/satellite: onboarding tour: adding payment methods step
Change-Id: I40c6680de4778700611f2f6978a02688d50d792f
2020-04-27 12:59:43 +03:00
Ivan Fraixedes
03871d17c3 satellite/satellitedb: Update ticket ref
Update a reference to a ticket in a code comment.

Change-Id: Ib82220e94527482c5ca1a58d8614b919d1113ab5
2020-04-27 08:50:41 +00:00
Michal Niewrzal
c52fc964d5 Upgrade storj.io/uplink to v1.0.4
Change-Id: I5248fcfdb3ccd3c9b60d38f245b1eb6c0e37bf38
2020-04-27 07:35:53 +00:00
NickolaiYurchenko
533a65a299 web/storagenode: disq and suspended text color fixed in dark mode
Change-Id: I49f4a272f84a92c036f14028b17b8926cf003568
2020-04-25 16:14:29 +03:00
NickolaiYurchenko
16d9d86833 web/storagenode: added held amount in table for current period
Change-Id: I0e3018ab27b5d8c86bee7d0f95bd6ae75cc205cf
2020-04-25 15:40:21 +03:00
NickolaiYurchenko
895eac1711 web/storagenode: api calls ungrouped, removed extra current period call
Change-Id: Id3af8822b6d80c29c94976d96e0a490459358f8a
2020-04-24 23:44:57 +03:00
Stefan Benten
d73630fd4a satellite/satellitedb: Ensure we just return bucket usage for buckets that exist (#3863)
2020-04-24 22:25:16 +02:00
Bill Thorp
341aecfe0f satellite/console: add rate limiter to login, register, password recovery
Added a per IP rate limiter to the console web.
Cleaned up password check to leak less bcrypt info.

Change-Id: I3c882978bd8de3ee9428cb6434a41ab2fc405fb2
2020-04-24 17:15:49 +00:00
Jess G
825226c98e satellite/overlay: use node selection cache for uploads (#3859)
* satellite/overlay: use node selection cache for uploads

Change-Id: Ibd16cccee979d0544f2f4a01749af9f36f02a6ad

* fix config lock

Change-Id: Idd307e4dee8ab92749f1ec3f996419ea0af829fd

* start fixing tests

Change-Id: I207d373a3b2a2d9312c9e72fe9bd0b01e06ad6cf

* fix test, add some more

Change-Id: I82b99c2004fca2510965f9b389f87dd4474bc722

* change config name

Change-Id: I0c0f7fc726b2565dc3828cb723f5459a940f2a0b

* add benchmarks

Change-Id: I05fa25bff8d5b65f94d918556855b95163d002e9

* revert bench to put in different PR

Change-Id: I0f6942296895594768f19614bd7b2e3b9b106ade

* add staleness to benchmark

Change-Id: Ia80a310623d5a342afa6d835402170b531b0f870

* add cache config to testplanet

Change-Id: I39abdab8cc442694da543115a9e470b2a8a25dff

* have repair select old way

Change-Id: I25a938457d7d1bcf89fd15130cb6b0ac19585252

* lower testplanet config time

Change-Id: Ib56a2ed086c06bc6061388d15a10a2526a663af7

* fix test

Change-Id: I3868e9cacde2dfbf9c407afab04dc5fc2f286f69
2020-04-24 09:11:04 -07:00
Jess G
7a4dcd61f7 satellite/overlay: add changes to selected node benchmarks (#3862)
* add changes to selected node benchmarks

Change-Id: I0259af155f9151cc2c7830d10f8907634c5e494f

* fix lint

Change-Id: I6c7b82bbfa579b468712f90fc03b12a931874a54

* restart jenkins

Change-Id: I1d7300343e94e695cd1c93a3b59895f52bbcb11e
2020-04-23 15:30:50 -07:00
Isaac Hess
a785d37157 storagenode/pieces: Process deletes asynchronously
To improve delete performance, we want to process deletes asynchronously
once the message has been received from the satellite. This change makes
it so that storagenodes will send the delete request to a piece Deleter,
which will process a "best-effort" delete asynchronously and return a
success message to the satellite.

There is a configurable number of max delete workers and a max delete
queue size.

Change-Id: I016b68031f9065a9b09224f161b6783e18cf21e5
2020-04-23 11:51:19 -06:00
littleskunk
1336070fec storagenode/piecestore: add missing log message about audit errors (#3861)
* storagenode/piecestore: add missing log message about audit errors
* storagenode/piecestore: add monkit data for order limit verification errors
2020-04-23 13:20:47 -04:00
Moby von Briesen
720e26d235 satellite/satellitedb/overlaycache: update unknown alpha/beta values properly
Update unknown_audit_reputation_alpha and unknown_audit_reputation_beta.
Add test to verify that BatchUpdateStats properly modifies unknown audit
alpha/beta

Change-Id: I0d5f9cac96a99f64905cf575b772402db0756a9d
2020-04-23 10:40:53 -04:00
Ivan Fraixedes
a0692d0db8 private/migrate: enhance docs in some funcs
Enhance the doc comment in some migration methods.

Change-Id: I3d91f7e01f24670fe3d972bd3b022b8a47251bdc
2020-04-23 13:06:06 +02:00
Moby von Briesen
72b93f3120 satellite/satellitedb: disqualify suspended nodes when the grace period passes
If a node is suspended and receives an unknown or failing audit,
disqualify them if the grace period (default 1w in production) has
passed.

Migrate the nodes table so any node that is currently suspended gets
unsuspended when the satellite starts up.

Change-Id: I7b81c68026f823417faa0bf5e5cb5e67c7156b82
2020-04-22 15:45:00 -04:00
Egon Elbre
676f3e8516 satellite/metainfo/piecedeletion: try to make batches larger
Previously it was possible for PopAll to return 1010 items, followed by
one RPC call with 1000 items and then another RPC call with 10 items.
Meanwhile, 500 new items could have been added to the queue.

This change ensures that we pull items from the queue early and
try to make rpc batches as large as possible.

Change-Id: I1a30dde9164c2ff7b90c906a9544593c4f1cf0e9
2020-04-22 18:43:29 +00:00
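
The batching idea in 676f3e8516 can be illustrated with the numbers from the commit message: top up the leftover items from the queue before forming the next batch, rather than sending the leftover 10 on their own. nextBatch and the popAll closure below are hypothetical stand-ins, not the piecedeletion code.

```go
package main

import "fmt"

// nextBatch returns up to max items to send in one RPC, plus the
// remainder. When fewer than max items are pending it first pulls
// whatever has accumulated in the queue (hypothetical helper).
func nextBatch(pending []int, popAll func() []int, max int) (batch, rest []int) {
	if len(pending) < max {
		pending = append(pending, popAll()...)
	}
	n := len(pending)
	if n > max {
		n = max
	}
	return pending[:n], pending[n:]
}

func main() {
	queue := make([]int, 1010)
	popAll := func() []int {
		items := queue
		queue = nil
		return items
	}

	batch, rest := nextBatch(nil, popAll, 1000)
	fmt.Println(len(batch), len(rest)) // 1000 10

	queue = make([]int, 500) // arrived while the first RPC was in flight
	batch, rest = nextBatch(rest, popAll, 1000)
	fmt.Println(len(batch), len(rest)) // 510 0
}
```

With this approach the second RPC carries 510 items instead of 10, so fewer calls are needed for the same work.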
Yingrong Zhao
0bdcf123cf bump monkit, monkit-jaeger, and private to latest
Also bump storj.io/common and sync repo

Change-Id: If8e60db6bdf0af8077b7befcb1da304c3c4dcae4
2020-04-22 12:30:37 -04:00
Ethan Adams
60e07f0a8b Revert "satellite/accounting: Remove unnecessary index bucket_bandwidth_rollups_project_id_action_interval_index"
This reverts commit 105dc7acc6.

Reason for revert: Recent changes to the Postgres query plan seem to want to use this index now. Reverting until we have time to analyze what's happening.

Change-Id: I74b4b5a8f15c3850d8a958a29f51dbc80e7c282c
2020-04-22 14:49:04 +00:00
Moby von Briesen
178aa8b5e0 satellite/{metainfo,repair}: Delete expired segments from metainfo
* Delete expired segments in expired segments service using metainfo
loop
* Add test to verify expired segments service deletes expired segments
* Ignore expired segments in checker observer
* Modify checker tests to verify that expired segments are ignored
* Ignore expired segments in segment repairer and drop from repair queue
* Add repair test to verify that a segment that expires after being
added to the repair queue is ignored and dropped from the repair queue

Change-Id: Ib2b0934db525fef58325583d2a7ca859b88ea60d
2020-04-22 13:02:31 +00:00
Matt Robinson
e34937317c Call the right make target for the segment-reaper (#3860)
2020-04-22 08:34:24 +02:00
Yingrong Zhao
0b8699bcb5 cmd: add prompt for enabling tracing during uplink cli setup
We want to make tracing opt-in.
For now, we will use `tracing.sample` as the toggle config to enable or
disable tracing and default to sampling every trace from the uplink CLI.
If a user wants to change the default sampling rate, they can do so by
using the `--tracing.sample` flag to override the default value.

Change-Id: I6f25dac0f43024c50a8aaf6c549e6a514211f834
2020-04-21 20:57:10 +00:00
Qweder93
805e328c47 storagenode/heldamount payments removed
Change-Id: I87cc04f43d182a4190a571ef417be85d02db9d34
2020-04-21 17:15:31 +00:00
Ethan
105dc7acc6 satellite/accounting: Remove unnecessary index bucket_bandwidth_rollups_project_id_action_interval_index
See https://storjlabs.atlassian.net/browse/SM-738

Change-Id: I9ba3cc3fbff9f13fc0b95d25feee5a19e5a5c486
2020-04-21 16:43:09 +00:00
VitaliiShpital
7365fc434e web/satellite: available credits disabled due to coupon expiration bug
Change-Id: I8271640a7f4364e27e5ff570c7e26d5ee4fdbcd4
2020-04-21 15:48:15 +00:00
NickolaiYurchenko
89c877f461 web/storagenode: payout calculation fix
Change-Id: Ibd030d0ef91a28e2cfa94da78c211c27959bb753
2020-04-21 18:11:47 +03:00
Qweder93
3d56efc82d storagenode/console/service: Satellites EarliestJoinDate calculation ignores empty date
Change-Id: Ic528467dbf0a47a7779fd7ae054856744298a39c
2020-04-21 17:50:21 +03:00
Matt Robinson
1e295a48e7 add container image for segment-reaper (#3855)
2020-04-21 17:48:40 +03:00
Egon Elbre
e655e160dc private/testuplink: delete delete
ecclient.Delete is a deprecated func that shouldn't be used anymore.

Change-Id: Ica4d17e334220311c99cea28f1d0e2d854d72896
2020-04-21 13:56:40 +00:00
NickolaiYurchenko
0300076684 web/storagenode: audit checks based on score
Change-Id: I7e9c16ded3165a7da31117412700092de135da1d
2020-04-21 16:18:35 +03:00
NickolaiYurchenko
a237512123 web/storagenode: added division on price multiplier
Change-Id: Ie1146ae6eac1f626753e4bcfaecd3c4919d1e464
2020-04-21 15:56:00 +03:00
NickolaiYurchenko
ed701c196d web/storagenode: disk space displayed by hour
Change-Id: Id52fc9da39e0c38a05b6b343e97d18f5453ea1f5
2020-04-21 15:30:58 +03:00
Qweder93
e999f24e54 storagenode/nodestats/cache: storagenodeDB/heldamount sync with satelliteDB/storagenode_paystub
Change-Id: If894166809bee8a5e036e618005d8141c2a0c594
2020-04-20 19:12:17 +00:00
NickolaiYurchenko
b9dbd80515 web/storagenode: all fetched paystub data treated as list
Change-Id: I536d36bc0edf5c54eaa07b60e55b93f1e2a1f826
2020-04-20 18:33:30 +00:00
Yingrong Zhao
8375a09c89 cmd: remove InitTracing from satellite and storagenode main.go file
Change-Id: I4addbe7d0645f66abfb3e98d74d17035e9624e69
2020-04-20 14:06:26 -04:00
JT Olio
5f38f8f1fe satellite/gc: use hostname for metric instance ids instead of node id
Currently production uses a different application suffix for gc
services, so chronograf can distinguish between gc processes and core
processes, but it'd be nice to be a bit more consistent with repairers
and api servers.

Change-Id: Icb96fed006c59d7afd730317d35636a6e4573b58
2020-04-20 14:52:44 +00:00