Commit Graph

437 Commits

Author SHA1 Message Date
stefanbenten
0209a2095f satellite/{console,satellitedb}: add project_limit column to users table
Change-Id: I603f085f17ca5b413dd1c6837c2081f9e7e791a1
2020-07-15 17:27:31 +00:00
Jennifer Johnson
784a156eea satellite: prevents uplink from creating a bucket once it exceeds the max bucket allocation.
Change-Id: I4b3822ed723c03dbbc0df136b2201027e19ba0cd
2020-07-15 17:27:05 +00:00
Stefan Benten
9dbd511396
private/dbutil: reduce db connection defaults (#3920) 2020-07-08 19:59:42 +02:00
Ethan Adams
66f5368807
satellite/testing: Change testing to use PG 12.3 (#3913)
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2020-06-25 20:17:39 +03:00
Stefan Benten
db57e5cf7b
scripts/tests/rollingupgrade: continue for v1.6.4 (#3912) 2020-06-25 13:57:39 +02:00
VitaliiShpital
5b3c8b2f1a web/satellite: google tag manager for signup pages
WHAT:
GTM added for partnered satellites sign up pages
csp values were extended to make GTM work at all:
1. googletagmanager.com for GTM script
2. google-analytics.com for GA script
3. hash was added to avoid using 'unsafe-inline' value in 'script-src' directive

Also config flag for GTM id was added

WHY:
Marketing team needs GTM and GA for their campaigns

Change-Id: Ibb2ace737feb971dda6c191599d479fe4a7af332
2020-06-23 10:45:04 +00:00
Isaac Hess
2d727bb14e satellite: Check macaroon revocation
When a request comes in on the satellite api and we validate the
macaroon, we now also check if any of the macaroon's tails have been
revoked.

Change-Id: I80ce4312602baf431cfa1b1285f79bed88bb4497
2020-06-22 13:50:07 -06:00
Michal Niewrzal
7a0778fac4 cmd/satellite: choose correct Stripe client also for commands
Commands were using always real Strip client but for e.g. storj-sim
should use mock by default.

Change-Id: Ifd3c02028312d2e8d6d73d67b2dadadcd69077c8
2020-06-15 11:03:14 +00:00
Rafael Gomes
958ea1b9df satellite/accounting: add download limit cache
Change-Id: I722930cab8bd5d240f4878dc6997e9bc7637311f
2020-06-12 16:33:46 -03:00
stefanbenten
c6c8b923af satellite/dbcleanup: run cleanup more frequently
As the tables that get cleaned up by this job get a lot of inserts and deletes over the course of a day, the autovacuum process on PostgreSQL struggles fairly easily/quickly.
Due to its limitation, it can only delete 180,000,000 tuples in one go, before it has to rescan the entire table/index.

With the current load, the most busy satellites accumulate about 1,000,000,000 tuples per day (consumed_serials). With our current 24h interval that results in ~6-7 scans, slowing the entire database down for a quite long time.
This PR reduces the interval to 4 hours, which under a constant load, results in less than 180,000,000 entries per run.
That way, we do not scan twice for only a small gain over said amount. Reducing the interval further would also increase the DB load unnecessary, as each run scans the entire tables at least once.

For future reference, we might need to adjust the interval, if the load is significantly changing.

Change-Id: I18fdd45d93d468cff126e719c8380c29a49f43dd
2020-06-10 18:32:15 +00:00
Yingrong Zhao
c78f88beb6 scripts/tests: use the same version for storj-sim as satellites
Change-Id: If85b14a0a5af334b9547d73bc4dbe65cdd856984
2020-06-05 13:17:17 -04:00
Yingrong Zhao
175f048aa7 scripts;cmd/storj-sim: include satellite node id in satellite address
Expose satellite ID from storj-sim so we can have access to it when
changing satellite address.

Change-Id: Ife816f4d35eae2d0bc9f9ad592fe75d73d93d9ff
2020-06-05 09:44:19 -04:00
Yingrong Zhao
9a04ca0527 scripts/tests/rollingupgrade: skip v1.6.3
We removed the ability to change satellite address in for uplink cli v1.6.3. Therefore we will skip that version for now

Change-Id: I0f78cfe27b51fd6cad571636ba38266f5f672d58
2020-06-05 09:00:36 +00:00
Michal Niewrzal
056ae7ffa8 scripts/test/rollingupgrade: set correct satellite address for imported
access

Addition: use always latest gateway release

Change-Id: I5e5231e6da4b6f7900cb71bb6e227901474270ea
2020-06-04 15:04:10 +00:00
Yingrong Zhao
9d7713cdd0 script/testdata: update tracing agent default address
Change-Id: I730994f16f135c4b8643a52f4cf499487e4af326
2020-06-03 23:46:18 +00:00
Moby von Briesen
b82d04e618 satellite/metainfo: limit size of uplink-provided metadata to 2KiB
Change-Id: Id44a46046ddb4a12102525531f4502fcff2b6252
2020-06-01 16:51:29 -04:00
Jeff Wendling
44433f38be satellite/satellitedb: remove ORDER BY when reading from queue
also remove the continuation support from the queue, otherwise
we may end up sequential scanning the entire table to get
a few rows at the end.

then, in the core, instead of looping both to get a big enough
batch inside of the queue, as well as outside of it to ensure
we consume the whole queue, just get a single batch at a time.

also, make the queue size configurable because we'll need to
do some tuning in production.

Change-Id: If1a997c6012898056ace89366a847c4cb141a025
2020-06-01 18:31:14 +00:00
littleskunk
801a3ab90d
satellite/coinpayments: Reduce update interval to 2 minutes (#3897)
* satellite/coinpayments: Reduce update interval to 2 minutes

* satellite/coinpayments: Reduce balance update

Co-authored-by: paul cannon <thepaul@users.noreply.github.com>
2020-05-29 22:21:27 +02:00
VitaliiShpital
c9b9c686fc web/satellite: logic for new signup/login flow
WHAT:
1. updated verification page URL in config
2. added list of partnered satellites to config
3. added logic for satellites dropdown on new signup/login pages

WHY:
1. signup/login flow was reworked in tardigrade.io repo (iframe removed, new pages etc.)
2. new config flag was added to check if satellite name matches at least one member of partnered satellites list to redirect user to verification page
3. new pages will have dropdown with partnered satellites list. Appropriate logic was added.

Change-Id: I33399ab66ca31f07b297a433f6b1f41da4cb6e66
2020-05-29 17:11:44 +00:00
Egon Elbre
59ca5a1c52 scripts: remove lib/uplink from update-access
Change-Id: I70df3fa7ad14287e873e2f955def4231cc002aa6
2020-05-29 13:35:10 +03:00
Michal Niewrzal
2be93df566 scripts: reduce file size for backward compatibility tests
Change-Id: Ibcf460f908b6e551fe0ca68248affb7ddf49592f
2020-05-27 13:21:55 +00:00
littleskunk
2fbb34c3ea
nodeselection: Increase minimum free space to 500MB (#3898) 2020-05-25 12:13:28 +02:00
littleskunk
8ec64f3daf
satellite/overlay: enable node selection cache on all satellites (#3895) 2020-05-19 19:25:53 +02:00
Bill Thorp
5a7a4d2e98 satellite: add Go test version of satellite-config-lock tests
The current satellite config lock code relies on bash scripts and
gnu diff, it must be run as root and hence it typically requires
docker.  The old version will be removed at a later date..

I tried for several hours to run directly against cmdSetup() in
cmd/satellite/main.go, to avoid the ctx.Compile() call. I had no
luck.

Change-Id: I0a4888421e743b436d32b6af69d04759d7816751
2020-05-13 08:14:24 +00:00
igor gaidaienko
1eab5e2980 satellite/console: Increase default webUI rate limit to 5
Previous limit is annoying for normal users

Change-Id: I7cb783e0b2515f415b2a055d5e811efab3810654
2020-05-12 16:12:17 +00:00
Stefan Benten
e23bd806b4
satellite/accounting: separate usage and bandwidth limit (#3878) 2020-05-12 15:01:15 +02:00
Stefan Benten
65f3e26f80
satellite: Change Default Project Limits and minimum STORJ Payment (#3877) 2020-05-12 14:18:58 +03:00
Egon Elbre
ec589a8289 all: fix comments about grpc
Change-Id: Id830fbe2d44f083c88765561b6c07c5689afe5bd
2020-05-11 13:05:34 +03:00
Egon Elbre
4e94da3fda satellite/overlay: add feature flag for node selection cache
Also distinguish the purpose for selecting nodes to avoid potential
confusion, what should allow caching and what shouldn't.

Change-Id: Iee2451c1f10d0f1c81feb1641507400d89918d61
2020-05-06 16:13:47 +03:00
Jennifer Johnson
18078bf7ee satellite/audit: increases audit worker concurrency to 2
Change-Id: Ibe3e3801b79accffbcfe9e2e02c96fc963894a7f
2020-05-05 11:31:55 +00:00
Egon Elbre
d98b8f6e23 satellite/metainfo,storage: use different limit for metainfo loop
Change-Id: I5ef7233930679b977b33f7b3e1dda45c907dcfad
2020-05-05 10:37:20 +00:00
Moby von Briesen
8f60cfc4fb satellite/overlay: Add flag for enabling/disabling disqualification from suspension mode
Add a flag that allows us to easily switch disqualification from
suspension mode on or off. A node will only be disqualified from
suspension mode if it has been suspended for longer than the grace
period AND the SuspensionDQEnabled flag is true.

Change-Id: I9e67caa727183cd52ab2042b0a370a1bcaebe792
2020-05-04 17:25:09 +00:00
Jeff Wendling
bb28851964 test-sim-backwards: refactor, reduce node count to 9 in mixed mode
This catches the bug that hit prod where we missed that a migration
was necessary for a column, but only on uploads, and only for nodes
that had not checked in yet.

So, this does more uploads, and stops a node from starting so that
we trigger that specific scenario. Hopefully it will catch other
similar ones in the future.

I confirmed that this locally caught the bug when the release under
test was v0.34.10 and HEAD did not include the fix for it.

Change-Id: If7d41e8241d6a042fa524b4aff956b0264ecb128
2020-05-01 04:56:06 +00:00
Yingrong Zhao
9b4a3f8fcc cmd/uplink: use tracing.enabled flag
Previously we are using tracing.sampled to be the switch for turning on/off tracing.
However we would like to separate sampling rate from being the switch,
so we can set sampling rate to be 0 but still intialize tracing for
satellite and storagenodes

Change-Id: I27e6ba25ea6f6b612b4e1a57cf1301889ded41ec
2020-04-27 17:54:57 +00:00
Bill Thorp
341aecfe0f satellite/console: add rate limiter to login, register, password recovery
Added a per IP rate limiter to the console web.
Cleaned up password check to leak less bcyrpt info.

Change-Id: I3c882978bd8de3ee9428cb6434a41ab2fc405fb2
2020-04-24 17:15:49 +00:00
Jess G
825226c98e
satellite/overlay: use node selection cache for uploads (#3859)
* satellite/overlay: use node selection cache for uploads

Change-Id: Ibd16cccee979d0544f2f4a01749af9f36f02a6ad

* fix config lock

Change-Id: Idd307e4dee8ab92749f1ec3f996419ea0af829fd

* start fixing tests

Change-Id: I207d373a3b2a2d9312c9e72fe9bd0b01e06ad6cf

* fix test, add some more

Change-Id: I82b99c2004fca2510965f9b389f87dd4474bc722

* change config name

Change-Id: I0c0f7fc726b2565dc3828cb723f5459a940f2a0b

* add benchmarks

Change-Id: I05fa25bff8d5b65f94d918556855b95163d002e9

* revert bench to put in different PR

Change-Id: I0f6942296895594768f19614bd7b2e3b9b106ade

* add staleness to benchmark

Change-Id: Ia80a310623d5a342afa6d835402170b531b0f870

* add cache config to testplanet

Change-Id: I39abdab8cc442694da543115a9e470b2a8a25dff

* have repair select old way

Change-Id: I25a938457d7d1bcf89fd15130cb6b0ac19585252

* lower testplante config time

Change-Id: Ib56a2ed086c06bc6061388d15a10a2526a663af7

* fix test

Change-Id: I3868e9cacde2dfbf9c407afab04dc5fc2f286f69
2020-04-24 09:11:04 -07:00
Moby von Briesen
72b93f3120 satellite/satellitedb: disqualify suspended nodes when the grace period passes
If a node is suspended and receives an unknown or failing audit,
disqualify them if the grace period (default 1w in production) has
passed.

Migrate the nodes table so any node that is currently suspended gets
unsuspended when the satellite starts up.

Change-Id: I7b81c68026f823417faa0bf5e5cb5e67c7156b82
2020-04-22 15:45:00 -04:00
Yingrong Zhao
0bdcf123cf bump monkit, monkit-jaeger, and private to latest
Also bump storj.io/common and sync repo

Change-Id: If8e60db6bdf0af8077b7befcb1da304c3c4dcae4
2020-04-22 12:30:37 -04:00
Moby von Briesen
178aa8b5e0 satellite/{metainfo,repair}: Delete expired segments from metainfo
* Delete expired segments in expired segments service using metainfo
loop
* Add test to verify expired segments service deletes expired segments
* Ignore expired segments in checker observer
* Modify checker tests to verify that expired segments are ignored
* Ignore expired segments in segment repairer and drop from repair queue
* Add repair test to verify that a segment that expires after being
added to the repair queue is ignored and dropped from the repair queue

Change-Id: Ib2b0934db525fef58325583d2a7ca859b88ea60d
2020-04-22 13:02:31 +00:00
Yingrong Zhao
8375a09c89 cmd: remove InitTracing from satellite and storagenode main.go file
Change-Id: I4addbe7d0645f66abfb3e98d74d17035e9624e69
2020-04-20 14:06:26 -04:00
Michal Niewrzal
884638a294 scripts: reduce segment size for integration tests
This will speedup integration tests.

Change-Id: Ifb1fadd01692390457971bfe7f9ba73a0796abb2
2020-04-20 15:13:05 +02:00
littleskunk
52cd33a633
scripts/deploy-storagenode: tag docker image as latest (#3858)
Co-authored-by: Ivan Fraixedes <ivan@fraixed.es>
2020-04-18 11:59:56 +02:00
VitaliiShpital
2dce4c232c web/satellite: redirect to verification page on sign up if inside iframe
Change-Id: I606b63fd27bef46597697b491970523e8a3a0cae
2020-04-16 13:35:49 +00:00
VitaliiShpital
158013a866 satellite/console: redirect on account activation
Change-Id: I2506ce0fd3832bf46fbcdcc5a42bb83dc926e99a
2020-04-15 11:49:50 +00:00
Moby von Briesen
d7794a4851 satellite/overlay: hardcode default values for audit alpha/beta
Alpha=1 and beta=0 are the expected first values for any alpha/beta
reputation system we are using in the codebase. So we are removing the
configurability of these values.

Change-Id: Ic61861b8ea5047fa1438ea6609b1d0048bf0abc3
2020-04-14 19:12:40 +00:00
JT Olio
e2d5b403e6 cmd/uplink: support --force (like awscli) for rb
Change-Id: If835c6dd08ee95e7c66ba7e4c7451cb3f0f95442
2020-04-14 18:10:54 +00:00
Cameron Ayer
3ee6c14f54 satellite/downtime: add concurrency to downtime estimation
We want to increase our throughput for downtime estimation. This commit
adds the ability to reach out to multiple nodes concurrently for downtime
estimation. The number of concurrent routines is determined by a new config
flag, EstimationConcurrencyLimit. It also increases the default
EstimationBatchSize to 1000.

Change-Id: I800ce7ec1035885afa194c3c3f64eedd4f6f61eb
2020-04-14 14:39:13 +00:00
Qweder93
3a9422cc9a satellite/nodestats: add pricing model to endpoint
Change-Id: Iddace8e437216a343458f440b543cee61164f233
2020-04-08 14:29:51 +00:00
Yingrong Zhao
96e58d21b4 cmd;pkg/server: init tracing collector in all processes
Add tracing handler in drpc server.
Initializing tracing collector in admin, satellite api, garbage
collection, satellite core, repaier, storagenode.
Change-Id: Ie98420e35dfc6913836ebd82b517d9d12877aefc

Change-Id: I91057b6265a4ac8bde033dfde692b8a28acca99f
2020-04-07 17:20:59 -04:00
Yingrong Zhao
4ab553cf3f scripts: only latest releases have version file
Change-Id: Ice212ac39b011c18b1d3d48ea9e6f580b4fd7f0c
2020-04-04 18:47:33 +00:00
Cameron Ayer
42be4bdc0f satellite/contact: add timeout to PingBack method
Change-Id: I2ec2f82e2e10d8be16f82e9de13ce42358e47c98
2020-04-04 18:26:30 +00:00
Yingrong Zhao
b3939bc1ba scripts/tests: fix test-version and rolling-upgrade test installation
Change-Id: I2c1262aa2e4fd1f5aa34bffe9c76f272fe076d49
2020-04-02 22:07:06 +00:00
Michal Niewrzal
c178a08cb8 satellite/metainfo: add max segment size and max inline size to
BeginObject response

We want to control inline segment size and segment size on satellite
side. We need to return such information to uplink like with redundancy
scheme.

Change-Id: If04b0a45a2757a01c0cc046432c115f475e9323c
2020-04-02 12:41:28 +00:00
Egon Elbre
90319dbec1 scripts: fix test-sim-backwards
Change-Id: I9d6644d4b493f5f6f60a960cacfaac2b5b828a5f
2020-04-02 00:13:42 +02:00
Egon Elbre
644df8dcdc private/version: minimal fix for tag-release.sh
Previous split to a storj.io/private repository broke tag-release.sh
script. This is the minimal temporary fix to make things work.

This links the build information to specified variables and sets them
inline. This approach, of course, is very fragile.

Change-Id: I73db2305e6c304146e5a14b13f1d917881a7455c
2020-04-01 13:46:45 +00:00
Michal Niewrzal
d444fbadea scripts: cleanup rolling upgrade test
* add script for easy rolling upgrade test local execution
* remove unneeded binaries building for rolling upgrade and versions
tests
* unify build process for Jenkins and local execution for rolling
upgrade and versions tests

Change-Id: Ic11211b83f3f447494bbd5827d2af77ea4b20dfe
2020-04-01 12:30:08 +00:00
Michal Niewrzal
0374e30678 scripts: fix storj-sim installation function for rolling upgrade tests
Change-Id: Ifb749e0a41c57b90cc03293c2588b2ceb6a0a15a
2020-03-31 09:11:00 +00:00
Jeff Wendling
e2ff2ce672 satellite: compensation package and commands
Change-Id: I7fd6399837e45ff48e5f3d47a95192a01d58e125
2020-03-30 14:08:14 -06:00
Yingrong Zhao
831668478a scripts/tests: fix gateway installation
Change-Id: I1629644f543505c27d4edb8f7bbe97d037cdc8a8
2020-03-29 19:24:05 +00:00
JT Olio
f28100b73f bump storj.io/private
Change-Id: I4ddd5c34521602967b89bd18e2a71a6f1e29f436
2020-03-27 21:57:35 +00:00
Moby von Briesen
a933bcc99a satellite/repair/repairer/ec.go: add option for downloading pieces onto disk instead of in memory during repair
Add flag to satellite repairer, "InMemoryRepair" that allows the
satellite to decide whether to download the entire segment being
repaired into memory (this is what the satellite already does), or to
download it into temporary files on disk that will be read from in the
upload phase of repair.

This should help with handling high repair traffic on satellites that
cannot afford to spend 64mb of memory per repair worker.

Updates tests to test repair for both in memory and to disk.

Change-Id: Iddf591e165621497c98533d45bfea3c28b08a194
2020-03-27 16:41:00 +00:00
Natalie Villasana
8e0ca0e6f5
satellite/gc: update release default for gc to run separately (#3830) 2020-03-26 14:44:18 -04:00
Michal Niewrzal
3a3648c80b scripts: add script for running versions tests locally
Change-Id: Iaa2a2d085b7edd46f63a0d79b4e731ea72412953
2020-03-24 15:44:00 +00:00
Michal Niewrzal
fdf40a7526 storj: remove storj/private/version package which was moved to
`storj/private` repo

Change-Id: I81c3f5b9d5e4fe7bca760999eb045ee9734e5e2e
2020-03-24 14:31:33 +00:00
Egon Elbre
6a7571f73e cmd/s3-benchmark: move to storj.io/benchmark
Change-Id: Idca2b836bdf876ca28eb5cabc9bfae1d576e4a3e
2020-03-23 19:09:42 +02:00
JT Olio
3b66ba6f02 scripts/tests: uplink no longer respects --client.segment-size
this is going to make all the tests slower but it is what it is

test-sim-aws.sh is removed because it was moved to storj/gateway repo.

Change-Id: I10727e747a4c3740b1c9054ce7d17313b4fa310b
2020-03-22 17:48:57 +00:00
Jennifer Johnson
699b635e5d satellite/overlay: rename newNodePercentage to newNodeFraction
Change-Id: Ie66de91f88183b44de0773589e83e4ade9aa997a
2020-03-19 20:09:32 +00:00
Jessica Grebenschikov
5142874144 satellite/gc: move garbage collection to its own process
Change-Id: I7235aa83f7c641e31c62ba9d42192b2232dca4a5
2020-03-18 16:44:01 +00:00
Egon Elbre
09e0f3de63 satellite/metainfo/piecedeletion: add Service
Change-Id: Id7e32ed569701fa0be66f9527c43a67052994570
2020-03-18 14:50:08 +00:00
Matt Robinson
bd4982e249
test/backwards-compatibility: Exclude rc tags from testing (#3787)
Change-Id: Id486709306c5c75a262ea17410ae641be53df8ed

Co-authored-by: Ivan Fraixedes <ivan@fraixed.es>
Co-authored-by: littleskunk <jens.heimbuerge@googlemail.com>
2020-03-18 00:12:05 +01:00
littleskunk
b10b69d9ce
test/rollingupgrade: fix stage 1 release version (#3810) 2020-03-17 22:30:07 +01:00
littleskunk
80acf33abc
script/release: fix error in regex (#3809)
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2020-03-17 17:36:17 +01:00
Stefan Benten
49a30ce4a7
satellite/payments: Set proper defaults for the release (#3806)
* Slight adjustments to the migration

Change-Id: I68ae81c010c3414fde2845df16ab124f8d17834b

* Change Coupon Value

Change-Id: I0f241d09e5f716f1d1b3f0688643ba7f614d83c4

* Change AlphaUsage to 5GB

Change-Id: I5d25c6b5750684510cda8b14a27f38d5b2b07408

* change config lock

Change-Id: Ib7c7a54555ba2387c9aa8dd60a0501b0ee6491dd

* Use Scan properly

Change-Id: Ie39cf4644e3ddd703a254e2f5e616763dd805235

* Fix Config Lock

Change-Id: I558ecc1c1becfaaefc7aea5ad2fe83fd6bf6b561
2020-03-16 22:53:12 +01:00
Stefan Benten
52590197c2
satellite/payments: More Cleanup and Satellite command to ensure we have stripe customers (#3805) 2020-03-16 20:34:15 +01:00
Kaloyan Raev
4f0bf3fe1d
build: cleanup more gateway targets from Makefile (#3802)
Change-Id: Ia95caa2187b3e9e056a83cbea4230788ed4e8abd

Co-authored-by: Michal Niewrzal <michal@storj.io>
2020-03-16 15:07:52 +01:00
Stefan Benten
bd603c0751
satellite/payments: Improve Invoice Generation (#3800) 2020-03-13 17:07:39 +01:00
Yingrong Zhao
3cf05b24d8 scripts: add benchmark test for delete operation
Change-Id: I448cc50375c1c712d704d8cf93f5b8481372b0c8
2020-03-12 17:03:19 +00:00
JT Olio
051569c69f
satellite: enable open registration (and add flag that disables it) SM-441
Change-Id: I47bfedb312089f6d2bfbab013bd74ad4b8aa5f5e
2020-03-11 03:53:34 +01:00
Yingrong Zhao
1a875baa1d scripts/tests: fix arguments orders passed to test-versions.sh
Change-Id: I2e3b501477823413137f6a43fcdc4347af27e43c
2020-03-11 13:50:18 +00:00
Yingrong Zhao
46b04a38cc scripts/tests/: fix uplink access to contain satellite id
Change-Id: I7dfb6bc2da3bf84a81d75a286ee9488e32ea4f01
2020-03-10 21:15:39 +00:00
Michal Niewrzal
c20cf25f35 cmd: migrate uplink CLI to new API
Change-Id: I8f8fcc8dd9a68aac18fd79c4071696fb54853a60
2020-03-09 13:26:29 +00:00
littleskunk
842c8d8ed9
scripts/tests/rollingupgrade: fix installation for current commit 2020-03-06 17:19:55 -05:00
littleskunk
8fa8178f04
release/rollingupgrade: on release tags run rolling upgrade against previous release (#3792)
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
2020-03-03 23:56:32 +01:00
Michal Niewrzal
fb2711d05e scripts: update postgres helper script to set password
Latest postgres docker image requires non empty password.

Change-Id: I03017e1b7ff4803fefc24c39087d9ccd4042373b
2020-02-27 10:33:37 +00:00
Yingrong Zhao
ac34485f5d scripts/tests: install correct version of gateway
1. only run release tags that don't contain 'rc'
2. install gateway version that's the same as satellite
3. update gateway access to contain satellite id

Change-Id: I8ca1418302c3aafdf0c4eaaf8361422a1eec2bd4
2020-02-26 13:12:31 +00:00
Jessica Grebenschikov
e19e3c1101 pkg/process:
Now that we are trying to identify the root cause of the satellite load limitations (i.e. currently the satellite has a max ability of 400 rps for uploads and we need this to be higher), we are using the golang diagnostic tools to collect insight into what the bottlenecks are.  We currently have a debug endpoint to gather some cpu and mem data, but it could be useful to have continuous profiling. GCP stackdriver has support for continuous profiling so lets set that up and see if it is helpful to gather more data.

This PR adds support for [GCP continuous profiler](https://cloud.google.com/profiler) which allows enabling continuous cpu/mem profiling and the stats are sent to stackdriver in google cloud console.

To enable the continuous profiling for a storj component, do the following:
- prereq: the workload must be running in GKE and have Stackdriver Profiling IAM role permissions
- provide the config flag `debug.profilename` in the config.yaml file for the workload (i.e. satellite api process, etc). The profilename should be the workload name, for example "satellite-api".
- once the above config flag is provided, the profiler will be initialized and profiling stats will automatically be sent to GCP project where the workload is running and viewable in the Stackdriver Profile page in the console

The current implementation assumes the workload is running in GKE, however if we find if useful we can add support to enable this from anywhere. But for simplicity, its configured this way assuming the main goal is to enable in production systems.

Change-Id: Ibf8ebe2df7bf06fdd4951ee6a1e48854dd36ad47
2020-02-25 09:04:23 -08:00
paul cannon
92d86fa044 satellite/repair: fix repair concurrency
This new repair timeout (configured as TotalTimeout) will include both
the time to download pieces and the time to upload pieces, as well as
the time to pop the segment from the repair queue.

This is a move from Github PR #3645.

Change-Id: I47d618f57285845d8473fcd285f7d9be9b4318c8
2020-02-24 19:57:09 +00:00
Jeff Wendling
f671eb2beb satellite/satellitedb: use queue for orders to get back fast billing
This change adds two new tables to process orders as fast as we used
to but in an asynchronous manner and with hopefully less storage
usage. This should help scale on cockroach, but limits us to one
worker. It lays the groundwork for the order processing pipeline to
be queue rather than database driven.

For more details, see the added fast billing changes blueprint.

It also fixes the orders db so that all the timestamps that are
passed to columns that do not contain a time zone are converted to
UTC at the last possible opportunity, making it less likely to use
the APIs incorrectly. We really should migrate to include timezones
on all of our timestamp columns.

Change-Id: Ibfda8e7a3d5972b7798fb61b31ff56419c64ea35
2020-02-24 17:07:07 +00:00
Egon Elbre
e30f7b35b6 cmd/gateway: use a separate repository
Change-Id: Idbb0b2b6cf0e60c6d5d91218c24524d72285cf26
2020-02-24 10:03:03 +02:00
Michal Niewrzal
54e38b8986 pkg/miniogw: gateway implementation with new libuplink
Change-Id: I170c3a68cfeea33b528eeb27e6aecb126ecb0365
2020-02-21 16:20:38 +01:00
Yingrong Zhao
77f67a8086 satellite/metainfo: add timeout for delete request
Change-Id: I9cad6d7ea185fc2c0ed4e58b42e4e3a78178a79f
2020-02-20 09:10:16 +00:00
JT Olio
2ae9978304 satellite/gc: skip first gc run
rationale: if GC kills the satellite, it would be nice to make
it through a repair checker sweep first

Change-Id: Id56171dc8e13940cfb6481e36a910bad077a01ed
2020-02-13 13:41:15 +02:00
littleskunk
76849558cb satellite/gracefulexit: increase performance and tolerate higher error
rate

Graceful exit is very slow at the moment. Over the last couple days we
increase the batch size on Stefans satellite to 1000 but as a side
effect the error rate was increased. With a batch size of 500 the error
rate looks stable.
This PR will increase the default to batch size to 300. Graceful exit
will still be painful slow but at least it will be a bit faster. At the
same time this PR also increases the number of errors we tolerate. We
don't want to DQ slow storage nodes just because they didn't finish all
300 transfers in time. We want to give them more retries.

Change-Id: I92e3f99e116d4988457d8b902a88e85ed1bcc1a7
2020-02-12 11:40:15 +00:00
Egon Elbre
dbf46c4aa7 satellite/admin: administrative endpoint
Admin server allows creating basic REST and html API-s
for different administrative tasks.

Change-Id: I3dc1786abe1c87350eed60ec90e48130f44e63cf
2020-02-12 12:12:50 +02:00
Cameron Ayer
b22bf16b35 satellite/overlay: add config flag for node selection free disk requirement
Currently SNs report their free disk space once per hour. If a node
becomes full, it has to wait until the next contact cycle begins to
report; all the while receiving and failing upload requests. By increasing
the minimum required disk space, we can give the storage nodes more time
to report their space before the completely fill up. This change goes
hand-in-hand with another change we want to implement: trigger capacity
report on SN immediately upon falling below threshold.

Change-Id: I12f778286c6c3f582438b0e2949765ac43325e27
2020-02-11 18:08:25 +00:00
Qweder93
dc075eaa96 satellite/payments : deposit bonuses (credits) added
Change-Id: Ib151bbb9b02d655fa619c53bfbc04ed6f3bb39e0
2020-02-11 11:11:42 +00:00
Moby von Briesen
8c19855871 scripts/tests/rollingupgrade: explicitly set debug port for old
satellite api during rolling upgrade test

The old api is using the same config file as the new satellite in the
rolling upgrade test, so we need to set it to something different so
that there is no conflict when we spin up a new storj-sim instance while
the old api is running concurrently.

Change-Id: Ia4ec2db4953f36f43275495710992831ad3916a2
2020-01-29 18:32:03 -05:00
Egon Elbre
a2b2bc676b pkg/debug: implement control panel
Control Panel allows to control different chores and services.
Currently this adds controlling of cycles.

Change-Id: I734f1676b2a0d883b8f5ba937e93c45ac1a9ce21
2020-01-29 16:30:31 -05:00
littleskunk
e0cb8037c1 satellite/projectusage: reduce usage limit from 5GB to 0GB
Change-Id: Ie3d2509613e7a4336e2a8d2b136b32f5f308aafc
2020-01-29 20:38:39 +00:00
Ethan
149273c63f satellite/metainfo: add cache expiration for project level rate limiting
Allow rate limit project cache to expire so we can make project level rate limit changes without restarting the satellite process.

Change-Id: I159ea22edff5de7cbfcd13bfe70898dcef770e42
2020-01-29 16:14:10 +00:00