Commit Graph

192 Commits

Author SHA1 Message Date
Ivan Fraixedes
84b844a2a7 redis-server: Move testing type to specific testing pkg
Move a specific interface & types used for testing to be a private
subpackage with a name that clearly identifies it for testing purpose.

Change-Id: I646cf3b6f0a3b518a6f9a125998dc5a02df02db6
2021-03-10 06:09:46 +00:00
Yaroslav Vorobiov
966535e9de {storagenode,satellite}/nodeoperator: add wallet features
Change-Id: Iac7eb40a52b8fddcc573aebaad2e3a30a10cded9
2021-02-08 22:09:45 +02:00
Yingrong Zhao
3b49d3cddf satellite: remove referral program related code
This PR removes all back-end related referral program code including the
marketing portal.

We will have a separate PR for front-end code and database migration to
drop `offers` and `usercredits` table

Change-Id: If59f952cddfe0558a7dc03a0eac7cc1081517f88
2021-02-08 13:52:50 +00:00
Yingrong Zhao
53d189d712 private/testplanet: add reconfiguration for disabling QUIC listener and
TCP listener

Change-Id: I067656e755c4a85ada161ac80d52778ba91f1045
2021-02-03 18:43:00 +00:00
Yingrong Zhao
7e80badaf9 pkg/server,pkg/quic: accept an existing conn to create quic listener and
allow disabling tcp/quic

In order to have more control of a server so that we can
simulate connection failures in `testplanet`, this PR changes
quic.Listener to accept an existing UDPConn instead of relying on the
quic-go library to create the UDPConn.
This PR also adds two flags on the `server.Config` struct to allow
enabling/disabling tcp/tls listener and quic listener. By default, they
are both set to true.
    - `DisableTCPTLS`: internal flag, disables tcp/tls listener.
    - `DisableQUIC`: hidden flag, disables quic listener
By making the `DisableQUIC` a hidden flag, it allows storagenode operators to
have the ability to disable quic traffic in case their set up can't work
with udp traffic.

Change-Id: I853b12435d988b9c41ad9b873fd57480d792e378
2021-02-03 12:04:29 -05:00
Ivan Fraixedes
d93944c57b satellite/orders: Delete unused methods & DB tables
Delete satellite order methods and DB tables which aren't used anymore
after we have done a refactoring on the orders to stuck bucket
information in the orders' encrypted metadata.

There are also configuration parameters and a satellite chore that
aren't needed anymore after the orders refactoring.

Change-Id: Ida3682b95921df70792284b42c96d2508bf8ca9c
2021-02-01 18:01:29 +00:00
Natalie Villasana
91bd4191dd satellite/accounting: add rollup archiver chore
The rollup archiver chore moves bucket bandwidth rollups and
storagenode rollups that are older than a given duration
to two new archive tables.

Change-Id: I1626a3742ad4271bc744fbcefa6355a29d49c6a5
2021-02-01 09:29:54 -05:00
Egon Elbre
19e3dc4ec0 satellite/overlay: rename NodeSelectionCache to UploadSelectionCache
It wasn't obvious that NodeSelectionCache was only for uploads.

Change-Id: Ifeeaa6fdb50a4b7916245b48d8634d70ac54459c
2021-01-28 14:56:53 +02:00
Yingrong Zhao
02845e7b8f pkg/server,private/testplanet: start to listen on quic
This PR introduces a new listener that can listen for quic traffic on
both storagenodes and satellites.

Change-Id: I5eb5bc82c37dde20d3be2ec8fa5f69c18fae0af0
2021-01-27 11:03:42 -05:00
Michał Niewrzał
3fc0d2a83e satellite/metainfo: add testing method from multipart-upload branch
We wanto have single uplink branch for standard and multipart-upload satellite but some tests are using helper methods from multipart. This change adds methods used by uplink test.

Change-Id: I82352ed56674ff7e8743b58061ba594018e78e3b
2021-01-26 09:13:12 +00:00
Cameron Ayer
d14607a5f7 satellite/{contact,nodestats,overlay,satellitedb}: remove references to total_uptime_count and uptime_success_count columns
Change-Id: I1f92022909bc564e9b1e31bf937fdfe7c16554de
2021-01-19 15:43:02 -05:00
Cameron Ayer
75d828200c private,satellite: add chore to dq stray nodes
Full scope:
private/testplanet,satellite/{overlay,satellitedb}

Description:
In most cases, downtime tracking with audits will eventually lead
to DQ for nodes who are unresponsive. However, if a stray node has no
pieces, it will not be audited and will thus never be disqualified.
This chore will check for nodes who have not successfully been contacted
in some set time and DQ them.

There are some new flags for toggling DQ of stray nodes and the timeframes
for running the chore and how long nodes can go without contact.

Change-Id: Ic9d41fdbf214736798925e728245180fb3c55615
2021-01-19 14:21:56 -05:00
Jessica Grebenschikov
1709117b0d satellite/console/wasm: add more unit tests
Change-Id: Ie134f8a08d690ce013039ed1a4e484f8b6a1a6d5
2021-01-08 18:50:29 +00:00
Jessica Grebenschikov
d961437889 satellite/orders: remove the config IncludeEncryptedMetadata
Since the Satellite now requires the order encryption functionality (since serial_number table is deprecated) to properly function, we can remove the config flag to turn on/off the feature.

Change-Id: Ie973f72a9a05a81cef9e53dc9c99d22c940c2488
2020-12-18 10:39:29 -08:00
Jessica Grebenschikov
da0327c9b7 satellite/dbcleanup: remove expired serial chore
Change-Id: Ib71d41eb6679d6435e5bc10b6244dac66380a74e
2020-12-18 09:36:28 -08:00
Michal Niewrzal
cdeea1c999 private/testplanet: add helper OpenProject method to testplanet uplink
This will simplify opening pure uplink.Project in tests.

Change-Id: I076875e15e21608f49dc875bb445412f34609bdb
2020-12-07 13:45:47 +00:00
Moby von Briesen
3fc76f4ffe satellite/downtime: Remove deprecated downtime tracking service.
We are no longer planning on implementing downtime penalization using
the method described in
docs/blueprints/archive/storage-node-downtime-tracking-deprecated.md.
Now, we are implementing the design described in
docs/blueprints/storage-node-downtime-tracking-with-audits.md.

This change removes the downtime estimation chores from the satellite
core as well as the package satellite/downtime. A future change will
remove the database table.

Change-Id: I1a1d3cf9dceeba36255d25243294865b89925518
2020-12-02 15:16:13 -05:00
Jessica Grebenschikov
b261110352 satellite/orders: get bucketID from encrypted metadata in order instead of serial_numbers table
We want to stop using the serial_numbers table in satelliteDB. One of the last places using the serial_numbers table is when storagenodes settle orders, we look up the bucket name and project ID from the serial number from the serial_numbers table.

Now that we have support to add encrypted metadata into the OrderLimit, this PR makes use of that and now attempts to read the project ID and bucket name from the encrypted orderLimit metadata instead of from the serial_numbers table. For backwards compatibility and to ensure no errors, we will still fallback to the old way of getting that info from the serial_numbers table, but this will be removed in the next release as long as there are no errors.

All processes that create orderLimits must have an orders.encryption-keys set. The services that create orderLimits (and thus need to encrypt the order metadata) are the satellite apiProcess, the repair process, audit service (core process), and graceful exit (core process). Only the satellite api process decrypts the order metadata when storagenodes settle orders. This means that the same encryption key needs to be provided in the config for the satellite api process, repair process, and the core process like so:
orders.include-encrypted-metadata=true
orders.encryption-keys="<"encryptionKeyID>=<encryptionKey>"

Change-Id: Ie2c037971713d6fbf69d697bfad7f8b672eedd66
2020-12-01 15:29:32 +00:00
Ethan
2b92bba563 satellite/satellitedb/orders: Handle serial_numbers deletes in smaller increments on CRDB
CRDB doesn't like large deletes. While testing in the POC environment we found that deletes on the serial_numbers table could take hours.  This change limits deletes to 1000 at a time (configurable) to avoid blocking other queries.

Change-Id: I08455e25db1574579dd4d7b7125a08e9c913dff1
2020-11-20 13:44:52 +00:00
Cameron Ayer
5a337c48ec {cmd,private,storagenode}: create storage dir verification during setup
Previously, we created a new file to use for directory verification
every time the storage node starts. This is not helpful if the storage node
points to the wrong directory when restarting. Now we will only create the file
on setup. Now the file should be created only once and will be verified at
runtime.

Change-Id: Id529f681469138d368e5ea3c63159befe62b1a5b
2020-11-11 11:01:36 -05:00
Moby von Briesen
db6bc6503d satellite/metainfo: Update metainfo RS config to more easily support multiple RS schemes.
Make metainfo.RSConfig a valid pflag config value. This allows us to
configure the RSConfig as a string like k/m/o/n-shareSize, which makes
having multiple supported RS schemes easier in the future.

RS-related config values that are no longer needed have been removed
(MinTotalThreshold, MaxTotalThreshold, MaxBufferMem, Verify).

Change-Id: I0178ae467dcf4375c504e7202f31443d627c15e1
2020-11-09 22:16:13 +00:00
Egon Elbre
c55c23f81f private/testplanet: add STORJ_TESTPLANET_ABSTIME
Allow setting STORJ_TESTPLANET_ABSTIME=1 to use absolute time in
testplanet logs.

Change-Id: I4df5dfc1fc055d9726aed65242ab71338550e671
2020-11-03 15:44:18 +02:00
Egon Elbre
0c23b12038 private/testplanet: use relative time logging
Instead of printing RFC3339 timestamp, we'll print relative time
since the creation of the testplanet.

Before:

    logger.go:130: 2020-11-02T14:54:53.864+0200 DEBUG   versioncontrol   addr= 127.0.0.1:30904

After:

    log.go:54: 00:00.002        DEBUG   versioncontrol   addr= 127.0.0.1:30945

Change-Id: Ifa423f9d54d4e7c583d9290fe36a791d28166f8f
2020-11-02 17:53:18 +00:00
Egon Elbre
7183dca6cb all: fix defers in loop
defer should not be called in a loop.

Change-Id: Ifa5a25a56402814b974bcdfb0c2fce56df8e7e59
2020-11-02 15:06:38 +02:00
Egon Elbre
e0dca4042d all: add pprof labels for debugger
By using pprof.Labels debugger is able to show service/peer names in
goroutine names.

Change-Id: I5f55253470f7cc7e556f8e8b87f746394e41675f
2020-10-29 15:10:07 +00:00
Egon Elbre
89ce1fe626 storagenode/storagenodedb: add ctx to OpenNew and OpenExisting
Database opening usually dial and hence we should pass ctx to them.

Change-Id: I9160ae95829f22f347bd525904898a47279a7427
2020-10-29 09:52:37 +02:00
Egon Elbre
d0beaa4a87 pkg/revocation: pass ctx into opening the database
Opening a databases requires ctx, this is first step to passing ctx
to the appropriate level.

Change-Id: I12700f39a320206d8a2a4e054452319f8585b44b
2020-10-29 07:15:36 +00:00
Jessica Grebenschikov
f5880f6833 satellite/orders: rollout phase3 of SettlementWithWindow endpoint
Change-Id: Id19fae4f444c83157ce58c933a18be1898430ad0
2020-10-26 14:56:28 +00:00
Stefan Benten
14a2050b8d pkg/auth: move package to consoleauth
To avoid further name collisions, the very broad named package gets moved into
the consoleauth package where its also mainly being used.

Change-Id: Ie563c9700adbf0553baca2b7b8ba4a1d9c29d144
2020-10-06 14:15:07 +02:00
Jessica Grebenschikov
4a2c66fa06 satellite/accounting: add cache for getting project storage and bw limits
This PR adds the following items:
1) an in-memory read-only cache thats stores project limit info for projectIDs

This cache is stored in-memory since this is expected to be a small amount of data. In this implementation we are only storing in the cache projects that have been accessed. Currently for the largest Satellite (eu-west) there is about 4500 total projects. So storing the storage limit (int64) and the bandwidth limit (int64), this would end up being about 200kb (including the 32 byte project ID) if all 4500 projectIDs were in the cache. So this all fits in memory for the time being. At some point it may not as usage grows, but that seems years out.

The cache is a read only cache. When requests come in to upload/download a file, we will read from the cache what the current limits are for that project. If the cache does not contain the projectID, it will get the info from the database (satellitedb project table), then add it to the cache.

The only time the values in the cache are modified is when either a) the project ID is not in the cache, or b) the item in the cache has expired (default 10mins), then the data gets refreshed out of the database. This occurs by default every 10 mins. This means that if we update the usage limits in the database, that change might not show up in the cache for 10 mins which mean it will not be reflected to limit end users uploading/downloading files for that time period..

Change-Id: I3fd7056cf963676009834fcbcf9c4a0922ca4a8f
2020-09-25 16:28:49 +00:00
Jennifer Johnson
4e2413a99d satellite/satellitedb: uses vetted_at field to select for reputable nodes
Additionally, this PR changes NewNodeFraction devDefault and testplanet config from 0.05 to 1.
This is because many tests relied on selecting nodes that were reputable based on audit and uptime
counts of 0, in effect, selecting new nodes as reputable ones.
However, since reputation is now indicated by a vetted_at db field that is explicitly set
rather than implied by audit and uptime counts, it would be more complicated to try to
update all of the nodes' reputations before selecting nodes for tests.
Now we just allow all test nodes to be new if needed.

Change-Id: Ib9531be77408662315b948fd029cee925ed2ca1d
2020-09-04 16:45:32 +00:00
Michal Niewrzal
aa47e70f03 satellite/metainfo: use metabase.SegmentKey with metainfo.Service
Instead of using string or []byte we will be using dedicated type
SegmentKey.

Change-Id: I6ca8039f0741f6f9837c69a6d070228ed10f2220
2020-09-03 15:11:32 +00:00
Cameron Ayer
ca0c1a5f0c storagenode/{monitor,pieces}, storage/filestore: add loop to check storage directory writability
periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.

Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
2020-08-31 21:20:49 +00:00
Moby von Briesen
5d21e85529 satellite/audit/queue: Separate audit queue into two separate structs.
* The audit worker wants to get items from the queue and process them.
* The audit chore wants to create new queues and swap them in when the
old queue has been processed.

This change adds a "Queues" struct which handles the concurrency
issues around the worker fetching a queue and the chore swapping a new
queue in. It simplifies the logic of the "Queue" struct to its bare
bones, so that it behaves like a normal queue with no need to understand
the details of swapping and worker/chore interactions.

Change-Id: Ic3689ede97a528e7590e98338cedddfa51794e1b
2020-08-31 20:51:25 +00:00
Bill Thorp
dbb53151f0 private/testplanet: Decrease metainfo MaxBuckets test value to speed testing.
TestMaxOutBuckets is one of our slower tests (50-90s).
This change seems to make it 2-12s.

It reduces the number of buckets that need to be created.
It also removes unnecessary storage nodes.

Change-Id: I1012fc6e9258b2f7674b16da4e8b418741c93eea
2020-08-26 17:31:31 +00:00
Jeff Wendling
91698207cf storagenode: live tracking of order window usage
This change accomplishes multiple things:

1. Instead of having a max in flight time, which means
   we effectively have a minimum bandwidth for uploads
   and downloads, we keep track of what windows have
   active requests happening in them.

2. We don't double check when we save the order to see if it
   is too old: by then, it's too late. A malicious uplink
   could just submit orders outside of the grace window and
   receive all the data, but the node would just not commit
   it, so the uplink gets free traffic. Because the endpoints
   also check for the order being too old, this would be a
   very tight race that depends on knowledge of the node system
   clock, but best to not have the race exist. Instead, we piggy
   back off of the in flight tracking and do the check when
   we start to handle the order, and commit at the end.

3. Change the functions that send orders and list unsent
   orders to accept a time at which that operation is
   happening. This way, in tests, we can pretend we're
   listing or sending far into the future after the windows
   are available to send, rather than exposing test functions
   to modify internal state about the grace period to get
   the desired effect. This brings tests closer to actual
   usage in production.

4. Change the calculation for if an order is allowed to be
   enqueued due to the grace period to just look at the
   order creation time, rather than some computation involving
   the window it will be in. In this way, you can easily
   answer the question of "will this order be accepted?" by
   asking "is it older than X?" where X is the grace period.

5. Increases the frequency we check to send up orders to once
   every 5 minutes instead of once every hour because we already
   have hour-long buffering due to the windows. This decreases
   the maximum latency that an order will be reported back to
   the satellite by 55 minutes.

Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036
2020-08-19 19:42:33 +00:00
Cameron Ayer
0155c21b44 private/testplanet, storagenode/{monitor,pieces}: write storage dir verification file on run and verify on loop
On run, write the storage directory verification file.

Every time the node runs it will write the file even if it already exists.
The reason we do this is because if the verification file is missing, the SN
doesn't know whether it is an incorrect directory, or it simply hasn't written
the file yet, and we want to keep nodes running without needing operator intervention.

Once this change has been a part of the minimum version for several releases,
we will move the file creation from the run command to the setup
command. Run will only verify its existence.

Change-Id: Ib7d20e78e711c63817db0ab3036a50af0e8f49cb
2020-08-19 19:12:21 +00:00
Yingrong Zhao
14ad7a4f1c satellite/metainfo: add limiter for objectdeletion and piecedeletion
services

This PR adds a limiter on the amount of concurrent objects deletion can be handled so
we don't run out of memory.

Change-Id: Id2ce368af6f86845fcdfd34cb2f5e460efe9b272
2020-08-19 16:08:29 +00:00
Moby von Briesen
708cb48aa6 storagenode/orders: implement orders filestore on storagenode
* Add all new orders to the orders filestore instead of the database.
* Submit orders from the filestore to the new satellite SettleWindow
endpoint.

The orders filestore will eventually replace the orders DB completely.
For now, we will still be checking the orders DB and submitting those
orders if they exist. In a later release, we will completely remove the
orders DB, but we need both the DB and filestore for the transitionary
period.

Change-Id: Iac8780fd5ab770296181bbd313e1d335f072d4dc
2020-08-19 15:00:35 +00:00
Ivan Fraixedes
7f8df74070
private/testplanet: Use config with name set when empty
In testplanet Run function we create a new configuration variable on
each t.Run for setting the value to the config name field when it's
empty, however the new copy of the configuration was not used.

Change-Id: I9da34e743f9648850c96556eab0349e742db3aac
2020-08-19 13:12:10 +02:00
Michal Niewrzal
88dcc93f3c satellite/metainfo: use user PartnerID for bucket attribution
Change-Id: I20f1bd432333f9b37ca8fb457c349eff94ffb392
2020-08-06 13:14:07 +00:00
Moby von Briesen
e02adfe5e9 satellite/overlay/config.go: Add AuditHistoryConfig to overlay
Adds AuditHistory{WindowSize, TrackingPeriod, GracePeriod,
OfflineThreshold}. These values will be used to track offline audits over
time, and to suspend/disqualify nodes for being offline for too long.

Change-Id: I05f7dbc3c034bdc53c4fbd7719c71a44f37ec6a5
2020-08-04 18:18:56 +00:00
Jeff Wendling
85a74b47e7 satellite/orders: 3-phase rollout
This adds a config flag orders.window-endpoint-rollout-phase
that can take on the values phase1, phase2 or phase3.

In phase1, the current orders endpoint continues to work as
usual, and the windowed orders endpoint uses the same backend
as the current one (but also does a bit extra).

In phase2, the current orders endpoint is disabled and the
windowed orders endpoint continues to use the same backend.

In phase3, the current orders endpoint is still disabled and
the windowed orders endpoint uses the new backend that requires
much less database traffic and state.

The intention is to deploy in phase1, roll out code to nodes
to have them use the windowed endpoint, switch to phase2, wait
a couple days for all existing orders to expire, then switch
to phase3.

Additionally, it fixes a bug where a node could submit a bunch
of orders and rack up charges for a bucket.

Change-Id: Ifdc10e09ae1645159cbec7ace687dcb2d594c76d
2020-08-03 17:01:42 +00:00
Rafael Gomes
935f44ddb7 satellite/metainfo: Add Delete Service config
Change-Id: I0a6e3ce1adfe1488eb23da9dda92877af1834599
2020-08-03 14:28:02 +00:00
Michal Niewrzal
20184d3604 satellite/metainfo: move TestAttributionReport to attribution tests
Additionally test was simplified by adding ability to set user agent for
testplanet uplink.

Change-Id: I82942c2280562b5118a42aa8e1e0f53092f8dbe1
2020-07-30 19:18:15 +00:00
Bill Thorp
b265b7f555 satellite/console: make paywall optional
Add a config so that some percent of users require credit cards /
account balances
in order to create a project or have a promotional coupon applied

UI was updated to match needed paywall status

At this point we decided not to use a field to store if a user is in an
A/B
test, and instead just use math to see if they're in a test.  We decided
to use MD5 (because its in Postgres too) and User UUID for that math.

Change-Id: I0fcd80707dc29afc668632d078e1b5a7a24f3bb3
2020-07-28 10:57:49 +00:00
Qweder93
92efffb48a storagenode/version: notification flow now based on cursor, chore_test added, versioncontrol added to reconfigure.
Change-Id: I70713def8d585228270ec5a8c586ecc5b4d510c4
2020-07-23 14:13:24 +00:00
Ethan
cfca021839 satellite/accounting: Add chore to cleanup old project bandwidth rollups data
Removes old project_bandwidth_rollups records that are no longer used.

Uses a retain months configuration to determine how many months to save.  Current month cannot be removed.
Tests retainMonths=-1, 0, 2

Change-Id: Ia4be2546cdb28802427acf41ecd85ad66df3e62c
2020-07-22 18:56:49 +00:00
Egon Elbre
b84923558b satellite: fix scoping, formatting
Change-Id: I21ef9edc2d449d75ad74891df7f966fb150d80fd
2020-07-16 19:13:14 +03:00
Egon Elbre
e70da5cd4e all: fix comments
Change-Id: I2d2307e3fab87de47a72b3595d051e2c95ff4f8a
2020-07-16 19:13:14 +03:00