Commit Graph

217 Commits

Author SHA1 Message Date
Egon Elbre
89ce1fe626 storagenode/storagenodedb: add ctx to OpenNew and OpenExisting
Database opening usually dial and hence we should pass ctx to them.

Change-Id: I9160ae95829f22f347bd525904898a47279a7427
2020-10-29 09:52:37 +02:00
Egon Elbre
d0beaa4a87 pkg/revocation: pass ctx into opening the database
Opening a databases requires ctx, this is first step to passing ctx
to the appropriate level.

Change-Id: I12700f39a320206d8a2a4e054452319f8585b44b
2020-10-29 07:15:36 +00:00
Jessica Grebenschikov
f5880f6833 satellite/orders: rollout phase3 of SettlementWithWindow endpoint
Change-Id: Id19fae4f444c83157ce58c933a18be1898430ad0
2020-10-26 14:56:28 +00:00
Stefan Benten
14a2050b8d pkg/auth: move package to consoleauth
To avoid further name collisions, the very broad named package gets moved into
the consoleauth package where its also mainly being used.

Change-Id: Ie563c9700adbf0553baca2b7b8ba4a1d9c29d144
2020-10-06 14:15:07 +02:00
Jessica Grebenschikov
4a2c66fa06 satellite/accounting: add cache for getting project storage and bw limits
This PR adds the following items:
1) an in-memory read-only cache thats stores project limit info for projectIDs

This cache is stored in-memory since this is expected to be a small amount of data. In this implementation we are only storing in the cache projects that have been accessed. Currently for the largest Satellite (eu-west) there is about 4500 total projects. So storing the storage limit (int64) and the bandwidth limit (int64), this would end up being about 200kb (including the 32 byte project ID) if all 4500 projectIDs were in the cache. So this all fits in memory for the time being. At some point it may not as usage grows, but that seems years out.

The cache is a read only cache. When requests come in to upload/download a file, we will read from the cache what the current limits are for that project. If the cache does not contain the projectID, it will get the info from the database (satellitedb project table), then add it to the cache.

The only time the values in the cache are modified is when either a) the project ID is not in the cache, or b) the item in the cache has expired (default 10mins), then the data gets refreshed out of the database. This occurs by default every 10 mins. This means that if we update the usage limits in the database, that change might not show up in the cache for 10 mins which mean it will not be reflected to limit end users uploading/downloading files for that time period..

Change-Id: I3fd7056cf963676009834fcbcf9c4a0922ca4a8f
2020-09-25 16:28:49 +00:00
Jennifer Johnson
4e2413a99d satellite/satellitedb: uses vetted_at field to select for reputable nodes
Additionally, this PR changes NewNodeFraction devDefault and testplanet config from 0.05 to 1.
This is because many tests relied on selecting nodes that were reputable based on audit and uptime
counts of 0, in effect, selecting new nodes as reputable ones.
However, since reputation is now indicated by a vetted_at db field that is explicitly set
rather than implied by audit and uptime counts, it would be more complicated to try to
update all of the nodes' reputations before selecting nodes for tests.
Now we just allow all test nodes to be new if needed.

Change-Id: Ib9531be77408662315b948fd029cee925ed2ca1d
2020-09-04 16:45:32 +00:00
Michal Niewrzal
aa47e70f03 satellite/metainfo: use metabase.SegmentKey with metainfo.Service
Instead of using string or []byte we will be using dedicated type
SegmentKey.

Change-Id: I6ca8039f0741f6f9837c69a6d070228ed10f2220
2020-09-03 15:11:32 +00:00
Cameron Ayer
ca0c1a5f0c storagenode/{monitor,pieces}, storage/filestore: add loop to check storage directory writability
periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.

Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
2020-08-31 21:20:49 +00:00
Moby von Briesen
5d21e85529 satellite/audit/queue: Separate audit queue into two separate structs.
* The audit worker wants to get items from the queue and process them.
* The audit chore wants to create new queues and swap them in when the
old queue has been processed.

This change adds a "Queues" struct which handles the concurrency
issues around the worker fetching a queue and the chore swapping a new
queue in. It simplifies the logic of the "Queue" struct to its bare
bones, so that it behaves like a normal queue with no need to understand
the details of swapping and worker/chore interactions.

Change-Id: Ic3689ede97a528e7590e98338cedddfa51794e1b
2020-08-31 20:51:25 +00:00
Bill Thorp
dbb53151f0 private/testplanet: Decrease metainfo MaxBuckets test value to speed testing.
TestMaxOutBuckets is one of our slower tests (50-90s).
This change seems to make it 2-12s.

It reduces the number of buckets that need to be created.
It also removes unnecessary storage nodes.

Change-Id: I1012fc6e9258b2f7674b16da4e8b418741c93eea
2020-08-26 17:31:31 +00:00
Jeff Wendling
91698207cf storagenode: live tracking of order window usage
This change accomplishes multiple things:

1. Instead of having a max in flight time, which means
   we effectively have a minimum bandwidth for uploads
   and downloads, we keep track of what windows have
   active requests happening in them.

2. We don't double check when we save the order to see if it
   is too old: by then, it's too late. A malicious uplink
   could just submit orders outside of the grace window and
   receive all the data, but the node would just not commit
   it, so the uplink gets free traffic. Because the endpoints
   also check for the order being too old, this would be a
   very tight race that depends on knowledge of the node system
   clock, but best to not have the race exist. Instead, we piggy
   back off of the in flight tracking and do the check when
   we start to handle the order, and commit at the end.

3. Change the functions that send orders and list unsent
   orders to accept a time at which that operation is
   happening. This way, in tests, we can pretend we're
   listing or sending far into the future after the windows
   are available to send, rather than exposing test functions
   to modify internal state about the grace period to get
   the desired effect. This brings tests closer to actual
   usage in production.

4. Change the calculation for if an order is allowed to be
   enqueued due to the grace period to just look at the
   order creation time, rather than some computation involving
   the window it will be in. In this way, you can easily
   answer the question of "will this order be accepted?" by
   asking "is it older than X?" where X is the grace period.

5. Increases the frequency we check to send up orders to once
   every 5 minutes instead of once every hour because we already
   have hour-long buffering due to the windows. This decreases
   the maximum latency that an order will be reported back to
   the satellite by 55 minutes.

Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036
2020-08-19 19:42:33 +00:00
Cameron Ayer
0155c21b44 private/testplanet, storagenode/{monitor,pieces}: write storage dir verification file on run and verify on loop
On run, write the storage directory verification file.

Every time the node runs it will write the file even if it already exists.
The reason we do this is because if the verification file is missing, the SN
doesn't know whether it is an incorrect directory, or it simply hasn't written
the file yet, and we want to keep nodes running without needing operator intervention.

Once this change has been a part of the minimum version for several releases,
we will move the file creation from the run command to the setup
command. Run will only verify its existence.

Change-Id: Ib7d20e78e711c63817db0ab3036a50af0e8f49cb
2020-08-19 19:12:21 +00:00
Yingrong Zhao
14ad7a4f1c satellite/metainfo: add limiter for objectdeletion and piecedeletion
services

This PR adds a limiter on the amount of concurrent objects deletion can be handled so
we don't run out of memory.

Change-Id: Id2ce368af6f86845fcdfd34cb2f5e460efe9b272
2020-08-19 16:08:29 +00:00
Moby von Briesen
708cb48aa6 storagenode/orders: implement orders filestore on storagenode
* Add all new orders to the orders filestore instead of the database.
* Submit orders from the filestore to the new satellite SettleWindow
endpoint.

The orders filestore will eventually replace the orders DB completely.
For now, we will still be checking the orders DB and submitting those
orders if they exist. In a later release, we will completely remove the
orders DB, but we need both the DB and filestore for the transitionary
period.

Change-Id: Iac8780fd5ab770296181bbd313e1d335f072d4dc
2020-08-19 15:00:35 +00:00
Ivan Fraixedes
7f8df74070
private/testplanet: Use config with name set when empty
In testplanet Run function we create a new configuration variable on
each t.Run for setting the value to the config name field when it's
empty, however the new copy of the configuration was not used.

Change-Id: I9da34e743f9648850c96556eab0349e742db3aac
2020-08-19 13:12:10 +02:00
Michal Niewrzal
88dcc93f3c satellite/metainfo: use user PartnerID for bucket attribution
Change-Id: I20f1bd432333f9b37ca8fb457c349eff94ffb392
2020-08-06 13:14:07 +00:00
Moby von Briesen
e02adfe5e9 satellite/overlay/config.go: Add AuditHistoryConfig to overlay
Adds AuditHistory{WindowSize, TrackingPeriod, GracePeriod,
OfflineThreshold}. These values will be used to track offline audits over
time, and to suspend/disqualify nodes for being offline for too long.

Change-Id: I05f7dbc3c034bdc53c4fbd7719c71a44f37ec6a5
2020-08-04 18:18:56 +00:00
Jeff Wendling
85a74b47e7 satellite/orders: 3-phase rollout
This adds a config flag orders.window-endpoint-rollout-phase
that can take on the values phase1, phase2 or phase3.

In phase1, the current orders endpoint continues to work as
usual, and the windowed orders endpoint uses the same backend
as the current one (but also does a bit extra).

In phase2, the current orders endpoint is disabled and the
windowed orders endpoint continues to use the same backend.

In phase3, the current orders endpoint is still disabled and
the windowed orders endpoint uses the new backend that requires
much less database traffic and state.

The intention is to deploy in phase1, roll out code to nodes
to have them use the windowed endpoint, switch to phase2, wait
a couple days for all existing orders to expire, then switch
to phase3.

Additionally, it fixes a bug where a node could submit a bunch
of orders and rack up charges for a bucket.

Change-Id: Ifdc10e09ae1645159cbec7ace687dcb2d594c76d
2020-08-03 17:01:42 +00:00
Rafael Gomes
935f44ddb7 satellite/metainfo: Add Delete Service config
Change-Id: I0a6e3ce1adfe1488eb23da9dda92877af1834599
2020-08-03 14:28:02 +00:00
Michal Niewrzal
20184d3604 satellite/metainfo: move TestAttributionReport to attribution tests
Additionally test was simplified by adding ability to set user agent for
testplanet uplink.

Change-Id: I82942c2280562b5118a42aa8e1e0f53092f8dbe1
2020-07-30 19:18:15 +00:00
Bill Thorp
b265b7f555 satellite/console: make paywall optional
Add a config so that some percent of users require credit cards /
account balances
in order to create a project or have a promotional coupon applied

UI was updated to match needed paywall status

At this point we decided not to use a field to store if a user is in an
A/B
test, and instead just use math to see if they're in a test.  We decided
to use MD5 (because its in Postgres too) and User UUID for that math.

Change-Id: I0fcd80707dc29afc668632d078e1b5a7a24f3bb3
2020-07-28 10:57:49 +00:00
Qweder93
92efffb48a storagenode/version: notification flow now based on cursor, chore_test added, versioncontrol added to reconfigure.
Change-Id: I70713def8d585228270ec5a8c586ecc5b4d510c4
2020-07-23 14:13:24 +00:00
Ethan
cfca021839 satellite/accounting: Add chore to cleanup old project bandwidth rollups data
Removes old project_bandwidth_rollups records that are no longer used.

Uses a retain months configuration to determine how many months to save.  Current month cannot be removed.
Tests retainMonths=-1, 0, 2

Change-Id: Ia4be2546cdb28802427acf41ecd85ad66df3e62c
2020-07-22 18:56:49 +00:00
Egon Elbre
b84923558b satellite: fix scoping, formatting
Change-Id: I21ef9edc2d449d75ad74891df7f966fb150d80fd
2020-07-16 19:13:14 +03:00
Egon Elbre
e70da5cd4e all: fix comments
Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a
2020-07-16 19:13:14 +03:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
stefanbenten
9ace375ee0 satellite/{console,satellitedb}: change project limiting based on new users field
This change switches the backend logic to use the new DB column on the users table to restrict project creation.
Furthermore it back fills the existing limits from registration tokens to the new column to ensure no users are reset to the new default.

UI is updated to reflect ability to create several projects

Change-Id: Ie29157430ae6b065411ca4c4557c9f1be69cdc4f
2020-07-16 10:57:47 +00:00
Jennifer Johnson
784a156eea satellite: prevents uplink from creating a bucket once it exceeds the max bucket allocation.
Change-Id: I4b3822ed723c03dbbc0df136b2201027e19ba0cd
2020-07-15 17:27:05 +00:00
stefanbenten
1149417615 satellite/admin: cleanup parameter handling
We passed in revocationDB and metainfoDB for no reason.
Lets remove it from the dependency list to further reduce the footprint.

Change-Id: Ic0317bb92670fbd305d4a8b0ed1cb82858e2f6d3
2020-07-14 13:53:09 +02:00
Egon Elbre
4869cfc9a4 satellite/vouchers: remove deprecated endpoint
Change-Id: I0a754217d9424253e448126face6594bc143f412
2020-07-10 12:38:46 +00:00
Cameron Ayer
cadb435d25 {satellite/audit, private/testplanet}: remove ErrAlreadyExists, run 2 audit workers in testplanet
Since we increased the number of concurrent audit workers to two, there are going
to be instances of a single node being audited simultaneously for different segments.
If the node times out for both, we will try to write them both to the pending audits
table, and the second will return an error since the path is not the same as what
already exists. Since with concurrent workers this is expected, we will log the
occurrence rather than return an error.

Since the release default audit concurrency is 2, update testplanet default to run with
concurrent workers as well.

Change-Id: I4e657693fa3e825713a219af3835ae287bb062cb
2020-06-30 18:00:07 +00:00
Rafael Gomes
958ea1b9df satellite/accounting: add download limit cache
Change-Id: I722930cab8bd5d240f4878dc6997e9bc7637311f
2020-06-12 16:33:46 -03:00
Egon Elbre
1ed5a1bac5 satellite/satellitedb/satellitedbtest: skip omitted database
The first implementation missed some changes.

Change-Id: I7ae696175e0a9ea46954970ba8547638a05ed5a9
2020-06-11 13:28:16 +00:00
Egon Elbre
1c30efd3a1 private/testplanet: allow setting "omit" as database to reduce output
Change-Id: I7af90fdefe2ff2df1340aa2b17f40806d889ca18
2020-06-09 12:41:58 +03:00
Moby von Briesen
b82d04e618 satellite/metainfo: limit size of uplink-provided metadata to 2KiB
Change-Id: Id44a46046ddb4a12102525531f4502fcff2b6252
2020-06-01 16:51:29 -04:00
Moby von Briesen
dc57640d9c storagenode/piecestore: switch usedserials db for in-memory usedserials store
Part 2 of moving usedserials in memory
* Drop usedserials table in storagenodedb
* Use in-memory usedserials store in place of db for order limit
verification
* Update order limit grace period to be only one hour - this means
uplinks must send their order limits to storagenodes within an hour of
receiving them

Change-Id: I37a0e1d2ca6cb80854a3ef495af2d1d1f92e9f03
2020-05-28 12:52:52 -04:00
Michal Niewrzal
84892631c8 private/testplanet: remove old libuplink from testplanet
Change-Id: Ib1553f84d0b3ae12a5b00382f0f53357b6a273e2
2020-05-28 13:50:23 +00:00
Michal Niewrzal
5c10964040 satellite/payments/stripecoinpayments: add test for listing issues while
invoice generation

https://review.dev.storj.io/c/storj/storj/+/1853
https://review.dev.storj.io/c/storj/storj/+/1882

Change-Id: Ie71363b819866dd60dbe7117b42cfa8348479310
2020-05-22 17:24:16 +00:00
Michal Niewrzal
3d332de228 private/testplanet: remove StorageNodeCount from testplanet uplink
definion

Small cleanup.

Change-Id: Icabdf1433c36cd0e9f8e10a67975e98391024e14
2020-05-21 14:51:58 +00:00
Egon Elbre
bef84a5f9d storagenode: remove dependency to overlay.NodeDossier
This is the last dependency from storage node to satellite.

Change-Id: I12f7abb91e84f823ba5af126c6e2979519838612
2020-05-21 08:37:13 +03:00
Egon Elbre
b42778c42e private/testplanet: remove some additional Local-s
Change-Id: I49701c41efb92efca27cc18d0a3f6d6b44d3cf8b
2020-05-21 08:37:13 +03:00
Michal Niewrzal
fe6a6f063f private/testplanet: cleanup predefined data generation
Use Console service to create user and project instead direct DB
modifications.

Change-Id: Ib0074b38313b3dc43b7d8d63ab2775d29028fb7b
2020-05-20 12:38:43 +00:00
Egon Elbre
941d10cbc3 private/testplanet: remove Peer.Local()
Currently storagenode depends on overlay.NodeDossier, this is the first
step in removing it.

Change-Id: I034a3f1601835f8349bd41752455022e19bcc707
2020-05-20 11:05:34 +00:00
Egon Elbre
ed627144ed all: use DialNodeURL throughout the codebase
Change-Id: Iaf9ae3aeef7305c937f2660c929744db2d88776c
2020-05-20 10:36:30 +00:00
Michal Niewrzal
705e82ea99 private/testplanet: add AddUser and AddProject to satellite
functionality

We want to start adding more complex test cases for billing/invoices and
we need more handy tooling to be able do this easily.

Change-Id: Ib22ac6b4ba9ee77cc91c88b0cfd2d2efc15657df
2020-05-19 13:02:04 +00:00
Michal Niewrzal
ac375d37bc satellite/payments: remove mockpayments and add Stripe client mock
instead

Change-Id: If3496f6abc16da90d2b43fa0c5be356847a39507
2020-05-19 09:35:37 +02:00
littleskunk
ef2671927d
storagenode/piecestore: move queue size defaults (#3881) 2020-05-15 19:10:26 +02:00
Ethan
159df8b2e4 Add logging listener for retrieving and setting log levels
See https://storjlabs.atlassian.net/browse/SM-752

These changes allow us to change the log level at runtime through a handler off of the debug endpoint.

Examples of changing the log level on storj-sim

To get the current level for the satellite api process:
curl -XGET 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain'

To change the log level:
curl -XPUT 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain' --data-raw '{"level":"error"}'

Change-Id: I05d164b290929fa06b6d78c01075ee41f8238044
2020-05-12 16:38:06 -04:00
Stefan Benten
e23bd806b4
satellite/accounting: separate usage and bandwidth limit (#3878) 2020-05-12 15:01:15 +02:00
Egon Elbre
e6d5ce6b77 all: remove grpc
It seems everyone has migrated to drpc.

Change-Id: Ica6b2d0bdef68c6603083f2963458843eca71e9e
2020-05-10 06:36:09 +00:00
Egon Elbre
bcd93ee375 private/testplanet: add StopNodeAndUpdate
This was commonly used and code with it can be simplified.

Change-Id: I2f2b91f7de54269aee6ef027f97f9e8a7d222e39
2020-05-08 13:02:19 +00:00
Egon Elbre
90d859fbb8 private/testplanet: use drpc piecestore mock for testing
Change-Id: Ia3f93f3c8b6584fb92f5d29025b7f0691120430e
2020-05-07 10:54:49 +03:00
Egon Elbre
c5452a87ec private/testplanet: use drpc referral manager server
Change-Id: I9e9e9a724c78c98859dd3e29416d766d8ffdca63
2020-05-07 07:03:11 +00:00
Egon Elbre
d98b8f6e23 satellite/metainfo,storage: use different limit for metainfo loop
Change-Id: I5ef7233930679b977b33f7b3e1dda45c907dcfad
2020-05-05 10:37:20 +00:00
Moby von Briesen
8f60cfc4fb satellite/overlay: Add flag for enabling/disabling disqualification from suspension mode
Add a flag that allows us to easily switch disqualification from
suspension mode on or off. A node will only be disqualified from
suspension mode if it has been suspended for longer than the grace
period AND the SuspensionDQEnabled flag is true.

Change-Id: I9e67caa727183cd52ab2042b0a370a1bcaebe792
2020-05-04 17:25:09 +00:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS, reduce this by introducing a
in-memory buffer on-top of the file.

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
Egon Elbre
8928399d02 all: rename CreateTables to MigrateToLatest
CreateTables hasn't been quite true for a while now, rename to
MigrateToLatest to be clearer in it's behavior.

Change-Id: Ida48e95122a5d9b7a814e922d3698e00024a2ba7
2020-04-30 07:21:17 +00:00
Egon Elbre
51f69d53dc private/testplanet: fix closablePeer
Peers require that Run finishes before calling Close.
Cancel only signals peer to start closing however it does not wait for
it to complete.

Change-Id: If4b3778f4fc86402363ed3b555db11e1189e6200
2020-04-29 17:57:39 +00:00
Isaac Hess
237d9da477 storagenode/pieces: Deleter can handle multiple tests
Before the deleter would close its done channel once, so if additional
tests shared a storagenode, even if not in parallel, the later waits
would not work properly. This fixes that problem.

Change-Id: I7dcacf6699cef7c2c2948ba0f4369ef520601bf5
2020-04-29 11:26:56 -06:00
Isaac Hess
baccfd36b1 private/testplanet: Mark sn peer deleter test mode
When running testplanet tests, mark storagenode peer PieceDeleter as in
testing mode so that you don't have to do it on each test.

Change-Id: I2592e02c63f8bcc9152ecf436bac4e798b08bccf
2020-04-28 15:57:29 -06:00
Egon Elbre
85c45cd56f private/dbutil/pgtest: support multiple databases for testing
Currently Cockroach isn't performant for concurrent database setup and
tear-down. Instead of a single instance allow setting multiple potential
connection strings and let the tests pick one connection string
randomly.

This improves test duration by ~10 minutes.

While we are at significantly changing how pgtest works, introduce
helper PickPostgres and PickCockroach for selecting the database to
reduce code duplications in multiple places.

Change-Id: I8ad171d5c4c8a4fc081ec2ae9bdd0cc948a80619
2020-04-28 21:55:49 +03:00
Egon Elbre
ef913be234 satellite/satellitedb/satellitedbtest: don't use subtest naming
A/B indicates that B is a subtest of A, however in this case they
represent a configuration of the test, not a subtest.

Change-Id: I64eed5d5bcb12759e54fe4b5373f8e88488e50f7
2020-04-27 19:32:09 +03:00
Bill Thorp
341aecfe0f satellite/console: add rate limiter to login, register, password recovery
Added a per IP rate limiter to the console web.
Cleaned up password check to leak less bcyrpt info.

Change-Id: I3c882978bd8de3ee9428cb6434a41ab2fc405fb2
2020-04-24 17:15:49 +00:00
Jess G
825226c98e
satellite/overlay: use node selection cache for uploads (#3859)
* satellite/overlay: use node selection cache for uploads

Change-Id: Ibd16cccee979d0544f2f4a01749af9f36f02a6ad

* fix config lock

Change-Id: Idd307e4dee8ab92749f1ec3f996419ea0af829fd

* start fixing tests

Change-Id: I207d373a3b2a2d9312c9e72fe9bd0b01e06ad6cf

* fix test, add some more

Change-Id: I82b99c2004fca2510965f9b389f87dd4474bc722

* change config name

Change-Id: I0c0f7fc726b2565dc3828cb723f5459a940f2a0b

* add benchmarks

Change-Id: I05fa25bff8d5b65f94d918556855b95163d002e9

* revert bench to put in different PR

Change-Id: I0f6942296895594768f19614bd7b2e3b9b106ade

* add staleness to benchmark

Change-Id: Ia80a310623d5a342afa6d835402170b531b0f870

* add cache config to testplanet

Change-Id: I39abdab8cc442694da543115a9e470b2a8a25dff

* have repair select old way

Change-Id: I25a938457d7d1bcf89fd15130cb6b0ac19585252

* lower testplante config time

Change-Id: Ib56a2ed086c06bc6061388d15a10a2526a663af7

* fix test

Change-Id: I3868e9cacde2dfbf9c407afab04dc5fc2f286f69
2020-04-24 09:11:04 -07:00
Moby von Briesen
72b93f3120 satellite/satellitedb: disqualify suspended nodes when the grace period passes
If a node is suspended and receives an unknown or failing audit,
disqualify them if the grace period (default 1w in production) has
passed.

Migrate the nodes table so any node that is currently suspended gets
unsuspended when the satellite starts up.

Change-Id: I7b81c68026f823417faa0bf5e5cb5e67c7156b82
2020-04-22 15:45:00 -04:00
Moby von Briesen
178aa8b5e0 satellite/{metainfo,repair}: Delete expired segments from metainfo
* Delete expired segments in expired segments service using metainfo
loop
* Add test to verify expired segments service deletes expired segments
* Ignore expired segments in checker observer
* Modify checker tests to verify that expired segments are ignored
* Ignore expired segments in segment repairer and drop from repair queue
* Add repair test to verify that a segment that expires after being
added to the repair queue is ignored and dropped from the repair queue

Change-Id: Ib2b0934db525fef58325583d2a7ca859b88ea60d
2020-04-22 13:02:31 +00:00
Michal Niewrzal
c021b35879 private/testplanet: migrate testplanet to new libuplink
Replace most of old libuplink usages in testplanet. 100% migration will
be possible when we will be able to implement UploadWithClientConfig
with new libuplink.

Change-Id: I432d7d4917c7b67d46a058abd0a2a6a13f565ac4
2020-04-20 12:43:34 +00:00
Egon Elbre
9052085f70 private/testplanet: simplify uplink usage
Change-Id: I3e488dc296f1094ce95e6d6597ca6d3f8da90a76
2020-04-16 16:45:55 +00:00
Moby von Briesen
d7794a4851 satellite/overlay: hardcode default values for audit alpha/beta
Alpha=1 and beta=0 are the expected first values for any alpha/beta
reputation system we are using in the codebase. So we are removing the
configurability of these values.

Change-Id: Ic61861b8ea5047fa1438ea6609b1d0048bf0abc3
2020-04-14 19:12:40 +00:00
Cameron Ayer
3ee6c14f54 satellite/downtime: add concurrency to downtime estimation
We want to increase our throughput for downtime estimation. This commit
adds the ability to reach out to multiple nodes concurrently for downtime
estimation. The number of concurrent routines is determined by a new config
flag, EstimationConcurrencyLimit. It also increases the default
EstimationBatchSize to 1000.

Change-Id: I800ce7ec1035885afa194c3c3f64eedd4f6f61eb
2020-04-14 14:39:13 +00:00
Egon Elbre
a4c554f2ed satellite/admin: support user query by email
This adds new endpoint /api/user/{user-email} which allows to get the
projects where the user is a member.

It also moves existing endpoint:

    /project/{projectid}/limit -> /api/project/{projectid}/limit

To avoid future conflicts for displaying pages.

Change-Id: I5efe3e1c8f79894c136f92ed815f635a34ba6f98
2020-04-06 18:32:25 +00:00
Cameron Ayer
42be4bdc0f satellite/contact: add timeout to PingBack method
Change-Id: I2ec2f82e2e10d8be16f82e9de13ce42358e47c98
2020-04-04 18:26:30 +00:00
Egon Elbre
6492b13d81 all: remove old uuid
Change-Id: I3a137f73456f010c37d3933dbe12cbbb840b809f
2020-04-02 19:30:36 +03:00
Michal Niewrzal
c178a08cb8 satellite/metainfo: add max segment size and max inline size to
BeginObject response

We want to control inline segment size and segment size on satellite
side. We need to return such information to uplink like with redundancy
scheme.

Change-Id: If04b0a45a2757a01c0cc046432c115f475e9323c
2020-04-02 12:41:28 +00:00
Egon Elbre
0a69da4ff1 all: switch to storj.io/common/uuid
Change-Id: I178a0a8dac691e57bce317b91411292fb3c40c9f
2020-03-31 19:16:41 +03:00
Egon Elbre
e1a443b04a private/testplanet: allow modifying created database
Instead of providing the database from outside to testplanet create it
inside and then allow wrapping and modifying it. This is more convenient
to use.

Change-Id: I9b8f69e6e0a19ff984b4e2bfe927c9100c77bc6c
2020-03-27 19:14:48 +00:00
Moby von Briesen
a933bcc99a satellite/repair/repairer/ec.go: add option for downloading pieces onto disk instead of in memory during repair
Add flag to satellite repairer, "InMemoryRepair" that allows the
satellite to decide whether to download the entire segment being
repaired into memory (this is what the satellite already does), or to
download it into temporary files on disk that will be read from in the
upload phase of repair.

This should help with handling high repair traffic on satellites that
cannot afford to spend 64mb of memory per repair worker.

Updates tests to test repair for both in memory and to disk.

Change-Id: Iddf591e165621497c98533d45bfea3c28b08a194
2020-03-27 16:41:00 +00:00
Egon Elbre
e8f18a2cfe private/testplanet: expose storagenode and satellite Config
Change-Id: I80fe7ed8ef7356948879afcc6ecb984c5d1a6b9d
2020-03-27 17:01:25 +02:00
Natalie Villasana
8e0ca0e6f5
satellite/gc: update release default for gc to run separately (#3830) 2020-03-26 14:44:18 -04:00
Yingrong Zhao
b7b19289d1 bump storj.io/common to latest
Change-Id: I16e337660ce8e1ef332cc842dbf4cfa067b9b98b
2020-03-25 09:08:40 -04:00
Yingrong Zhao
a731472496 bump storj.io/common to latest and storj.io/drpc to v0.0.11
Change-Id: I7a6e823b441eeff4621dfdf2d6577be76c9761c8
2020-03-24 15:17:10 -04:00
Michal Niewrzal
fdf40a7526 storj: remove storj/private/version package which was moved to
`storj/private` repo

Change-Id: I81c3f5b9d5e4fe7bca760999eb045ee9734e5e2e
2020-03-24 14:31:33 +00:00
Michal Niewrzal
f0aeda3091 storj: remove from storj/pkg packages moved to storj/private repo
* debug
* traces
* cfgstruct
* process

Package `storj/private/version` will be removed as a separate change.

Change-Id: Iadc40faa782e6225513b28218952f02d9c240a9f
2020-03-24 09:56:29 +01:00
Jennifer Johnson
699b635e5d satellite/overlay: rename newNodePercentage to newNodeFraction
Change-Id: Ie66de91f88183b44de0773589e83e4ade9aa997a
2020-03-19 20:09:32 +00:00
Jessica Grebenschikov
5142874144 satellite/gc: move garbage collection to its own process
Change-Id: I7235aa83f7c641e31c62ba9d42192b2232dca4a5
2020-03-18 16:44:01 +00:00
Egon Elbre
09e0f3de63 satellite/metainfo/piecedeletion: add Service
Change-Id: Id7e32ed569701fa0be66f9527c43a67052994570
2020-03-18 14:50:08 +00:00
Bill Thorp
94c11c5212 satellite: remove some unnecessary UTC() calls
Fixes some easy cases of extraneous UTC() calls

Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a
2020-03-13 13:49:44 +00:00
JT Olio
051569c69f
satellite: enable open registration (and add flag that disables it) SM-441
Change-Id: I47bfedb312089f6d2bfbab013bd74ad4b8aa5f5e
2020-03-11 03:53:34 +01:00
paul cannon
79553059cb satellite/repair: put irreparable segments in irreparableDB
Previously, we were simply discarding rows from the repair queue when
they couldn't be repaired (either because the overlay said too many
nodes were down, or because we failed to download enough pieces).

Now, such segments will be put into the irreparableDB for further
and (hopefully) more focused attention.

This change also better differentiates some error cases from Repair()
for monitoring purposes.

Change-Id: I82a52a6da50c948ddd651048e2a39cb4b1e6df5c
2020-03-09 21:45:16 +00:00
Michal Niewrzal
d7b5df70d3 cmd/uplink: remove unused flag
New API has limited number of options to configure at the moment. We
should remove unused flags from Uplink CLI and add if needed in the
future.

Change-Id: Icf3f3dadd43cb61a3b408b02d0762aef34425dbf
2020-03-09 13:44:46 +00:00
Michal Niewrzal
c20cf25f35 cmd: migrate uplink CLI to new API
Change-Id: I8f8fcc8dd9a68aac18fd79c4071696fb54853a60
2020-03-09 13:26:29 +00:00
Egon Elbre
f4d5d89b68 private/testplanet: add WaitForStorageNodeEndpoints
After calling uplink.Upload it is not guaranteed that the
storage node has yet saved all the orders since it happens
asynchronously. Hence we need a separate func to wait
for them to complete.

Change-Id: I0c34b3ea6c98dbcf37f80493c0e10a8bdbbb2aaf
2020-03-05 10:33:56 +00:00
Jennifer Johnson
1c1750e6be removes bandwidth limiting
On satellite, remove all references to free_bandwidth column in nodes table.
On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated.

Protobuf message, NodeCapacity, is left intact for backwards compatibility.
Once this is released to all satellites, we can drop the column from the DB.

Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838
2020-03-04 14:04:00 +00:00
Cameron Ayer
7244a6a84e storagenode/{contact, piecestore}: implement low disk notification with cooldown
When a storagenode begins to run low on capacity, we want to notify
the satellite before completely running out of space. To achieve this,
at the end of an upload request, the SN checks if its available space has
fallen below a certain threshold. If so, trigger a notification to the
satellites.

The new NotifyLowDisk method on the monitor chore is implemented using the
common/syn2.Cooldown type, which allows us to execute contact only once
within a given timeframe; avoiding hammering the satellites with requests.
This PR contains changes to the storagenode/contact package, namely moving
methods involving the actual satellite communication out of Chore and into
Service. This allows us to ping satellites from the monitor chore

Change-Id: I668455748cdc6741291b61130d8ef9feece86458
2020-03-03 10:45:37 -05:00
Michal Niewrzal
d384e48ad7 private/testplanet: set rollout seed to avoid warnings in logs
Each test log is starting with warnings like this: "rollout config
error: empty seed {"binary": "Identity"}". Make no sense to print them
and pollute output.

Change-Id: Ib50e28d09d8b259106d3b79d8f1262954a7aed63
2020-03-03 12:58:54 +00:00
Egon Elbre
1f7c3be8f9 private/testplanet: add option to run testplanet databases non-parallel
NonParallel running is needed for gateway tests, because minio
unfortunately relies on global state.

Change-Id: If730db2ab86d10f4d02e1ac3128f758e9c18cdff
2020-02-27 15:49:22 +02:00
Egon Elbre
64330c55b3 all: use pbgrpc
common/pb moved grpc to a separate package common/pb/pbgrpc.
This updates this repository to use it.

Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e
2020-02-26 21:27:47 +02:00
paul cannon
92d86fa044 satellite/repair: fix repair concurrency
This new repair timeout (configured as TotalTimeout) will include both
the time to download pieces and the time to upload pieces, as well as
the time to pop the segment from the repair queue.

This is a move from Github PR #3645.

Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8
2020-02-24 19:57:09 +00:00
Cameron Ayer
f22bddf122 {storagenode/contact, private/testplanet}: remove ErrFailureToStart and panic in testplanet.Start
Change-Id: I252e8c9407400af7bda95a7657c8154660c3c801
2020-02-24 18:24:23 +00:00
Yingrong Zhao
5011e78311 storagenode/piecestore: remove unused DeletePiece endpoint
With commit: 3331b443e7, satellite will
start calling `DeletePieces`. Therefore, we can remove the old endpoint
once the above commit is deployed with all satellites

Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb
2020-02-21 21:03:49 +00:00