Commit Graph

159 Commits

Author SHA1 Message Date
Qweder93
f5ba8b8009 storagenode/suspensions: added offline-suspension notification chore + tests
Change-Id: I2521cd2e7d08a1dd379e717a554a026c7508c18f
2020-10-29 19:44:22 +02:00
Qweder93
624255e8ba storagenode/secret: db tests added, small renaming fixes added
Change-Id: I7eae1a9a64c20a39c97e81fa741cfc9b9e1e615a
2020-10-29 14:23:04 +02:00
paul cannon
76d4977b6a storagenode/gracefulexit: logic moved from worker to service
Change-Id: I8b12606a96b712050bf40d587664fb1b2c578fbc
2020-10-22 23:19:30 +00:00
Qweder93
9df74338a8 storagenode: secret db and service added
Change-Id: I91257e5adc4fc6711653f30c118e476ed1c95b6b
2020-10-16 13:24:33 +00:00
Moby von Briesen
aa86c0889c storagenode/console: Add current storage used per satellite to storagenode api
Right now, the best way for a storage node operator to get the current
space used for each satellite is to run the `storagenode exit-satellite`
command for graceful exit, and cancel at the second confirmation prompt.
This is convoluted and the data is readily available from the Blobs
Usage Cache.

This change adds the current space used by each satellite to the
endpoints `/api/sno` and `/api/sno/satellite/<Satellite ID>`

Change-Id: I2173005bb016fc76db96fd598d26b485e5b2aa0b
2020-10-14 21:30:28 +00:00
Egon Elbre
cf2dd76db7 cmd/satellite: proper log usage
log.Fatal immediately terminates the program without running any defers.
We should properly close all the services and databases.
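
A minimal sketch of the pattern this message describes, assuming a hypothetical `resource` type and `run` function (neither is from the repository): the work is done in a function that returns an error, so deferred cleanup runs before the process decides to exit.

```go
package main

import (
	"errors"
	"fmt"
)

// resource is a hypothetical stand-in for a service or database handle.
type resource struct {
	closed bool
}

// Close marks the resource as released.
func (r *resource) Close() { r.closed = true }

// run does the work and reports failure by returning an error.
// Because it returns normally, the deferred Close always executes.
func run(db *resource, fail bool) error {
	defer db.Close()
	if fail {
		return errors.New("startup failed")
	}
	return nil
}

func main() {
	db := &resource{}
	if err := run(db, true); err != nil {
		// A real command would log the error and exit non-zero here,
		// after cleanup has already happened.
		fmt.Println("error:", err, "closed:", db.closed)
	}
}
```

Calling log.Fatal inside `run` instead would end the process through os.Exit, and the deferred Close would never run.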

Change-Id: I5e959cef3eafedeacb3a2062e3da47e8d04e8e75
2020-10-13 16:56:35 +03:00
Egon Elbre
2268cc1df3 all: fix linter complaints
Change-Id: Ia01404dbb6bdd19a146fa10ff7302e08f87a8c95
2020-10-13 15:59:01 +03:00
Qweder93
8182fdad0b storagenode: heldamount renamed to payouts, renamed some methods and structs to more meaningful names. grouped estimated payout with payouts
satellite: heldamount renamed to SNOpayouts.

Change-Id: I244b4d2454e0621f4b8e22d3c0d3e602c0bbcb02
2020-09-16 14:57:35 +00:00
Moby von Briesen
789b07e226 storagenode/orders/store.go: Do not return error from ListUnsentBySatellite when order files are corrupted.
If we see an UnexpectedEOF error when attempting to read orders, return
the orders we have been able to read successfully and do not return an
error. This behavior ensures that the storagenode orders service
attempts to archive corrupted files and does not retry them repeatedly
and get stuck.
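
The behavior can be sketched roughly as follows. The record layout (a 4-byte big-endian length followed by the payload), `readRecords`, `sampleFile`, and `maxRecordSize` are assumptions made for illustration and do not reflect the real order file format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// maxRecordSize guards against absurd lengths read from a damaged header (assumed limit).
const maxRecordSize = 1 << 20

// readRecords reads records in a hypothetical format: a 4-byte big-endian
// length followed by the payload. A truncated tail ends the read without an
// error, so the caller still receives every record read so far.
func readRecords(r io.Reader) ([][]byte, error) {
	var records [][]byte
	for {
		var size uint32
		err := binary.Read(r, binary.BigEndian, &size)
		if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
			return records, nil // clean end of file or truncated header
		}
		if err != nil {
			return records, err
		}
		if size > maxRecordSize {
			return records, fmt.Errorf("record too large: %d bytes", size)
		}
		payload := make([]byte, size)
		_, err = io.ReadFull(r, payload)
		if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
			return records, nil // truncated payload: keep what we have
		}
		if err != nil {
			return records, err
		}
		records = append(records, payload)
	}
}

// sampleFile builds two complete records followed by a truncated third.
func sampleFile() *bytes.Buffer {
	var buf bytes.Buffer
	for _, p := range []string{"order-1", "order-2"} {
		_ = binary.Write(&buf, binary.BigEndian, uint32(len(p)))
		buf.WriteString(p)
	}
	_ = binary.Write(&buf, binary.BigEndian, uint32(16)) // header promises 16 bytes
	buf.WriteString("cut")                               // but only 3 arrive
	return &buf
}

func main() {
	records, err := readRecords(sampleFile())
	fmt.Println(len(records), err)
}
```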

Change-Id: I0d00d1e174f968af6e99ca861eddad190f1339e2
2020-09-10 23:36:05 +00:00
Qweder93
ac29d80495 storagenode: heldamount GetPaystub refactored, estimationPayouts logic separated from console to separate service, storagenodeapi tests fixed.
Change-Id: I902823ef40a62861ce32799e9fb7a67a1e14710d
2020-09-09 15:31:16 +00:00
Jeff Wendling
91698207cf storagenode: live tracking of order window usage
This change accomplishes multiple things:

1. Instead of having a max in flight time, which means
   we effectively have a minimum bandwidth for uploads
   and downloads, we keep track of what windows have
   active requests happening in them.

2. We don't double check when we save the order to see if it
   is too old: by then, it's too late. A malicious uplink
   could just submit orders outside of the grace window and
   receive all the data, but the node would just not commit
   it, so the uplink gets free traffic. Because the endpoints
   also check for the order being too old, this would be a
   very tight race that depends on knowledge of the node system
   clock, but best to not have the race exist. Instead, we piggy
   back off of the in flight tracking and do the check when
   we start to handle the order, and commit at the end.

3. Change the functions that send orders and list unsent
   orders to accept a time at which that operation is
   happening. This way, in tests, we can pretend we're
   listing or sending far into the future after the windows
   are available to send, rather than exposing test functions
   to modify internal state about the grace period to get
   the desired effect. This brings tests closer to actual
   usage in production.

4. Change the calculation for if an order is allowed to be
   enqueued due to the grace period to just look at the
   order creation time, rather than some computation involving
   the window it will be in. In this way, you can easily
   answer the question of "will this order be accepted?" by
   asking "is it older than X?" where X is the grace period.

5. Increases the frequency we check to send up orders to once
   every 5 minutes instead of once every hour because we already
   have hour-long buffering due to the windows. This decreases
   the maximum latency that an order will be reported back to
   the satellite by 55 minutes.
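
Points 1, 2 and 4 can be sketched as below. This is an illustration under assumptions, not the storagenode code: `activeWindows`, `orderAccepted`, `gracePeriod`, and `exampleNow` are hypothetical names, and one-hour windows are taken from the "hour-long buffering" mentioned above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// gracePeriod is an assumed value for illustration.
const gracePeriod = time.Hour

// exampleNow is a fixed clock reading so the sketch is deterministic.
var exampleNow = time.Date(2020, 8, 19, 19, 42, 0, 0, time.UTC)

// orderAccepted answers "will this order be accepted?" by asking
// "is it older than the grace period?".
func orderAccepted(created, now time.Time, grace time.Duration) bool {
	return now.Sub(created) <= grace
}

// activeWindows counts in-flight requests per one-hour window.
type activeWindows struct {
	mu     sync.Mutex
	counts map[int64]int
}

func newActiveWindows() *activeWindows {
	return &activeWindows{counts: make(map[int64]int)}
}

// windowKey maps a time to the start of its hour, as a Unix timestamp.
func windowKey(t time.Time) int64 { return t.UTC().Truncate(time.Hour).Unix() }

// begin registers a request and returns a function to call once when it finishes.
func (a *activeWindows) begin(created time.Time) (done func()) {
	key := windowKey(created)
	a.mu.Lock()
	a.counts[key]++
	a.mu.Unlock()
	return func() {
		a.mu.Lock()
		defer a.mu.Unlock()
		a.counts[key]--
		if a.counts[key] == 0 {
			delete(a.counts, key)
		}
	}
}

// active reports whether any request is in flight for t's window.
func (a *activeWindows) active(t time.Time) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.counts[windowKey(t)] > 0
}

func main() {
	created := exampleNow.Add(-30 * time.Minute)
	windows := newActiveWindows()
	// The age check happens when handling starts, not when the order is saved.
	if orderAccepted(created, exampleNow, gracePeriod) {
		done := windows.begin(created)
		fmt.Println("active while handling:", windows.active(created))
		done()
	}
	fmt.Println("active after commit:", windows.active(created))
}
```

Passing `now` as a parameter mirrors point 3: tests can pretend to be far in the future without touching internal state.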

Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036
2020-08-19 19:42:33 +00:00
Moby von Briesen
708cb48aa6 storagenode/orders: implement orders filestore on storagenode
* Add all new orders to the orders filestore instead of the database.
* Submit orders from the filestore to the new satellite SettleWindow
endpoint.

The orders filestore will eventually replace the orders DB completely.
For now, we will still be checking the orders DB and submitting those
orders if they exist. In a later release, we will completely remove the
orders DB, but we need both the DB and filestore for the transitionary
period.

Change-Id: Iac8780fd5ab770296181bbd313e1d335f072d4dc
2020-08-19 15:00:35 +00:00
Qweder93
7b4a8c4d6d storagenode/heldamount: payoutHistory added
Change-Id: I93dd3d024085d19ecff76075e52bf66796207fd6
2020-07-14 17:35:03 +03:00
Qweder93
f73e92c268 storagenode/gracefulexit: added blobs clean
On node start, checks whether any trusted satellite has the GE status "Exited successfully";
if so, tries to delete the blobs/satellite folder so that no trash is left on the SNO.

Change-Id: I566266c84f2a872df54cd01bc2f15a9934f138ed
2020-07-13 11:49:18 +00:00
Qweder93
ac716e1514 storagenode/heldamount: payment receipt added to monthly paystub, heldamount.service separated for service and endpoint
Change-Id: Id759586c6362edbef34c230d4f0d2585c11c9b47
2020-07-06 09:51:52 +00:00
Qweder93
577f72cb92 storagenode/version: notifications added
Change-Id: Ib9720d8124d8e078354a292b644e2db1f5fffe67
2020-07-01 19:35:46 +03:00
Qweder93
e52809d53e cmd/storagenode: add check if satellites available to gracefulexit
Change-Id: I8747507593d810bbdec0d140de0600ee147011c3
2020-06-10 13:38:36 +00:00
Moby von Briesen
dc57640d9c storagenode/piecestore: switch usedserials db for in-memory usedserials store
Part 2 of moving usedserials in memory
* Drop usedserials table in storagenodedb
* Use in-memory usedserials store in place of db for order limit
verification
* Update order limit grace period to be only one hour - this means
uplinks must send their order limits to storagenodes within an hour of
receiving them

Change-Id: I37a0e1d2ca6cb80854a3ef495af2d1d1f92e9f03
2020-05-28 12:52:52 -04:00
Qweder93
73214c6d1c storagenode/heldamount: heldhistory reworked to all satellites
Change-Id: I8d7707fddfbdc52d29951a8a002978c7fbb07049
2020-05-28 11:44:26 +00:00
Egon Elbre
bef84a5f9d storagenode: remove dependency to overlay.NodeDossier
This is the last dependency from storage node to satellite.

Change-Id: I12f7abb91e84f823ba5af126c6e2979519838612
2020-05-21 08:37:13 +03:00
Egon Elbre
941d10cbc3 private/testplanet: remove Peer.Local()
Currently storagenode depends on overlay.NodeDossier, this is the first
step in removing it.

Change-Id: I034a3f1601835f8349bd41752455022e19bcc707
2020-05-20 11:05:34 +00:00
Ethan
159df8b2e4 Add logging listener for retrieving and setting log levels
See https://storjlabs.atlassian.net/browse/SM-752

These changes allow us to change the log level at runtime through a handler off of the debug endpoint.

Examples of changing the log level on storj-sim

To get the current level for the satellite api process:
curl -XGET 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain'

To change the log level:
curl -XPUT 'http://127.0.0.1:10009/logging' --header 'Content-Type: text/plain' --data-raw '{"level":"error"}'

Change-Id: I05d164b290929fa06b6d78c01075ee41f8238044
2020-05-12 16:38:06 -04:00
Egon Elbre
7d29f2e0d3 all: remove drpc wrappers
Change-Id: I45016f7d2a771dc00776196c1f531f3343e93b40
2020-05-11 08:20:34 +03:00
Egon Elbre
e6d5ce6b77 all: remove grpc
It seems everyone has migrated to drpc.

Change-Id: Ica6b2d0bdef68c6603083f2963458843eca71e9e
2020-05-10 06:36:09 +00:00
Jeff Wendling
57eb8a17e2 storagenode: allow configuring database path independently
Fixes #3852

Change-Id: I021c29c4dd7c393399f6abef41d8457514032833
2020-05-04 06:04:31 +00:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS; reduce this by introducing an
in-memory buffer on top of the file.
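
A minimal sketch of the idea using the standard library's bufio.Writer, not the buffering code added by this commit. `countingWriter` and `writeChunks` are hypothetical helpers; the counter stands in for the number of write operations that reach the file.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// countingWriter stands in for a file and counts how many writes reach it.
type countingWriter struct {
	buf   bytes.Buffer
	calls int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return c.buf.Write(p)
}

// writeChunks writes `chunks` pieces of `chunkSize` bytes to dst through an
// in-memory buffer of bufSize bytes.
func writeChunks(dst io.Writer, chunks, chunkSize, bufSize int) error {
	bw := bufio.NewWriterSize(dst, bufSize)
	chunk := make([]byte, chunkSize)
	for i := 0; i < chunks; i++ {
		if _, err := bw.Write(chunk); err != nil {
			return err
		}
	}
	return bw.Flush() // push whatever is still buffered
}

func main() {
	cw := &countingWriter{}
	if err := writeChunks(cw, 100, 1024, 64*1024); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bytes:", cw.buf.Len(), "underlying writes:", cw.calls)
}
```

All the data still reaches the destination, but in far fewer underlying writes than the number of chunks handed to the buffer.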

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
Egon Elbre
8928399d02 all: rename CreateTables to MigrateToLatest
CreateTables hasn't been quite true for a while now, rename to
MigrateToLatest to be clearer in its behavior.

Change-Id: Ida48e95122a5d9b7a814e922d3698e00024a2ba7
2020-04-30 07:21:17 +00:00
Isaac Hess
a785d37157 storagenode/pieces: Process deletes asynchronously
To improve delete performance, we want to process deletes asynchronously
once the message has been received from the satellite. This change makes
it so that storagenodes will send the delete request to a piece Deleter,
which will process a "best-effort" delete asynchronously and return a
success message to the satellite.

There is a configurable number of max delete workers and a max delete
queue size.
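
One way to picture a best-effort deleter with a bounded queue and a fixed number of workers is sketched below. The `deleter` and `recorder` types and their method names are assumptions for illustration, not the piece Deleter implemented here.

```go
package main

import (
	"fmt"
	"sync"
)

// deleter processes delete requests in the background.
type deleter struct {
	queue  chan string
	wg     sync.WaitGroup
	remove func(pieceID string)
}

// newDeleter starts `workers` goroutines draining a queue of size queueSize.
func newDeleter(workers, queueSize int, remove func(string)) *deleter {
	d := &deleter{queue: make(chan string, queueSize), remove: remove}
	for i := 0; i < workers; i++ {
		d.wg.Add(1)
		go func() {
			defer d.wg.Done()
			for id := range d.queue {
				d.remove(id)
			}
		}()
	}
	return d
}

// enqueue is best effort: it returns false instead of blocking when the queue is full.
func (d *deleter) enqueue(id string) bool {
	select {
	case d.queue <- id:
		return true
	default:
		return false
	}
}

// close stops accepting work and waits for queued deletes to finish.
func (d *deleter) close() {
	close(d.queue)
	d.wg.Wait()
}

// recorder is a hypothetical stand-in for the real delete operation.
type recorder struct {
	mu  sync.Mutex
	ids []string
}

func (r *recorder) add(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ids = append(r.ids, id)
}

func (r *recorder) count() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.ids)
}

func main() {
	rec := &recorder{}
	d := newDeleter(2, 8, rec.add)
	for _, id := range []string{"piece-a", "piece-b", "piece-c"} {
		// The caller can report success as soon as the request is queued.
		fmt.Println(id, "queued:", d.enqueue(id))
	}
	d.close()
	fmt.Println("deleted:", rec.count())
}
```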

Change-Id: I016b68031f9065a9b09224f161b6783e18cf21e5
2020-04-23 11:51:19 -06:00
Egon Elbre
d3ce845f82 satellite: revert log lines used to figure out node id
Currently storj-sim relies on the log lines to be exactly the same,
when they change it cannot find the necessary information from log.

Change-Id: Ia039915ef3375a7cf60f107b2c05c958de15b6d5
2020-04-15 17:07:56 +03:00
Kaloyan Raev
a2ce836761 remove sugar logging
Change-Id: I6b6ca9704837cb3f5f5449ba7f55661487814d9f
2020-04-15 12:37:47 +00:00
Qweder93
743b3fb226 storagenode/nodestats: add pricing model, storagenode/cache: add paystub history storing
Change-Id: I9bc104a1407c8f286a964c796656d89b122bf752
2020-04-14 19:04:00 +03:00
Yingrong Zhao
a731472496 bump storj.io/common to latest and storj.io/drpc to v0.0.11
Change-Id: I7a6e823b441eeff4621dfdf2d6577be76c9761c8
2020-03-24 15:17:10 -04:00
Michal Niewrzal
fdf40a7526 storj: remove storj/private/version package which was moved to
`storj/private` repo

Change-Id: I81c3f5b9d5e4fe7bca760999eb045ee9734e5e2e
2020-03-24 14:31:33 +00:00
Michal Niewrzal
f0aeda3091 storj: remove from storj/pkg packages moved to storj/private repo
* debug
* traces
* cfgstruct
* process

Package `storj/private/version` will be removed as a separate change.

Change-Id: Iadc40faa782e6225513b28218952f02d9c240a9f
2020-03-24 09:56:29 +01:00
crawter
89374e260d storagenode/console/consoleapi: using cached data in heldamount api
Change-Id: I0efca320eaf722ade1146100bbb0e70d75a5dca3
2020-03-16 01:39:11 +02:00
Qweder93
9f84261c36 storagenode/cache heldamount added
Change-Id: I7fc807789de63e8a9b8ca2018fd73bdb9e01ad0d
2020-03-16 00:28:35 +02:00
Qweder93
7b0371e9e2 storagenode/heldamount/service added, console/heldamountapi added, console/server updated
Change-Id: I6290a6ea1b07b222908440defbbd7aec5f2a4cdf
2020-03-13 19:18:03 +02:00
Qweder93
5ccce04338 storagenode/storagenodedb: heldamount added
Change-Id: I213e3abffd7356bbfccb3f33bcbafa558674b8d9
2020-03-13 16:23:59 +00:00
Moby von Briesen
178dbb4683 storagenode/storagenodedb: allow storagenodes to start if test_table exists
In many cases when a storagenode fails the preflight check, it is due to
test_table existing, which is used to determine read/write capabilities
after the initial schema verification. If preflight ends early due to a
failure or stopped storagenode, it may not get the chance to drop this
table.

This change excludes test_table from the schema comparison to ensure
that it never prevents a storagenode from starting up.

It also adds Preflight DB test for storagenode.
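
Excluding a table from a schema comparison could look roughly like this. It compares table names only and uses a hypothetical `tablesMatch` helper; the real preflight check compares full schemas.

```go
package main

import (
	"fmt"
	"sort"
)

// tablesMatch compares two sets of table names, skipping any name in ignored.
func tablesMatch(expected, actual []string, ignored map[string]bool) bool {
	filter := func(names []string) []string {
		var out []string
		for _, n := range names {
			if !ignored[n] {
				out = append(out, n)
			}
		}
		sort.Strings(out)
		return out
	}
	e, a := filter(expected), filter(actual)
	if len(e) != len(a) {
		return false
	}
	for i := range e {
		if e[i] != a[i] {
			return false
		}
	}
	return true
}

func main() {
	expected := []string{"orders", "pieces"}
	actual := []string{"pieces", "test_table", "orders"} // leftover from an interrupted preflight
	fmt.Println(tablesMatch(expected, actual, map[string]bool{"test_table": true}))
	fmt.Println(tablesMatch(expected, actual, nil))
}
```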

Change-Id: Ib8e71df2e42fda3b2a364fbf7a801891c5831d39
2020-03-09 14:29:46 -04:00
Jennifer Johnson
1c1750e6be removes bandwidth limiting
On satellite, remove all references to free_bandwidth column in nodes table.
On storage node, remove references to AllocatedBandwidth and MinimumBandwidth and mark as deprecated.

Protobuf message, NodeCapacity, is left intact for backwards compatibility.
Once this is released to all satellites, we can drop the column from the DB.

Change-Id: I2ff6c6537fc9008a0c5588e951afea58ede85838
2020-03-04 14:04:00 +00:00
Cameron Ayer
7244a6a84e storagenode/{contact, piecestore}: implement low disk notification with cooldown
When a storagenode begins to run low on capacity, we want to notify
the satellite before completely running out of space. To achieve this,
at the end of an upload request, the SN checks if its available space has
fallen below a certain threshold. If so, trigger a notification to the
satellites.

The new NotifyLowDisk method on the monitor chore is implemented using the
common/sync2.Cooldown type, which allows us to execute contact only once
within a given timeframe, avoiding hammering the satellites with requests.
This PR contains changes to the storagenode/contact package, namely moving
methods involving the actual satellite communication out of Chore and into
Service. This allows us to ping satellites from the monitor chore.
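
The cooldown idea can be sketched as follows. This is a simplified, hypothetical `cooldown` type written for illustration and is not the real Cooldown API; `notifyLowDisk`, `exampleInterval`, and `exampleStart` are likewise assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// exampleInterval and exampleStart are assumed values that keep the sketch deterministic.
const exampleInterval = 10 * time.Minute

var exampleStart = time.Date(2020, 3, 3, 10, 0, 0, 0, time.UTC)

// cooldown lets an action through at most once per interval.
type cooldown struct {
	mu       sync.Mutex
	interval time.Duration
	last     time.Time
}

// allow reports whether the action may run at time now and, if so, records it.
func (c *cooldown) allow(now time.Time) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.last.IsZero() && now.Sub(c.last) < c.interval {
		return false
	}
	c.last = now
	return true
}

// notifyLowDisk calls notify when available space is below threshold and the
// cooldown permits it. It returns whether a notification was sent.
func notifyLowDisk(c *cooldown, now time.Time, available, threshold int64, notify func()) bool {
	if available >= threshold || !c.allow(now) {
		return false
	}
	notify()
	return true
}

func main() {
	c := &cooldown{interval: exampleInterval}
	sent := 0
	notify := func() { sent++ }
	// Three uploads finish in quick succession while disk space is low.
	for i := 0; i < 3; i++ {
		now := exampleStart.Add(time.Duration(i) * time.Minute)
		notifyLowDisk(c, now, 50, 100, notify)
	}
	fmt.Println("notifications sent:", sent)
}
```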

Change-Id: I668455748cdc6741291b61130d8ef9feece86458
2020-03-03 10:45:37 -05:00
Qweder93
484ec7463a storagenode: notifications on outdated software version
Change-Id: If19b075c78a7b2c441e11b783c3c09fed55060c7
2020-03-02 16:48:02 +00:00
Egon Elbre
64330c55b3 all: use pbgrpc
common/pb moved grpc to a separate package common/pb/pbgrpc.
This updates this repository to use it.

Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e
2020-02-26 21:27:47 +02:00
Cameron Ayer
d578102672 storagenode/piecestore: add workgroup to endpoint to prevent stray goroutine after shutdown
Change-Id: Ie8444c3c8f870745b73342de2e9a93027fcad371
2020-02-24 21:38:52 +00:00
Cameron Ayer
3e70a893dd storagenode/{piecestore, contact}: report capacity to satellites if below specific threshold
Currently, storage nodes only report their capacity to satellites
once per hour. If a node fills up, it will fail all uploads until
the next contact cycle begins. With these changes, at the end of an
upload we check whether the MinimumDiskSpace threshold has been
passed. If so, trigger the monitor chore to update the node's
capacity, then trigger the contact chore to report the new
capacity to the satellites.

Change-Id: Ie6aadaade1e2c12c87e03f8ff9059a50121380a0
2020-02-18 15:42:48 -05:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Isaac Hess
4dafd03f11 storagenode: Prevent negative values in piece_space_used, migrate negatives to 0
Change-Id: Ibd663db087058c928190aa52c520f22e9338dd04
2020-01-30 13:03:18 -05:00
Egon Elbre
4e2bf81719 pkg/debug: add better title
Change-Id: Icc6114f4e7523cfe6c7984ef1f6eec664ae4ee65
2020-01-30 07:49:40 -05:00
Egon Elbre
d10d6fd153 storagenode,satellite: ignore error on listening debug port
Change-Id: Id3a6d153535776ce41f8edf2bd6f6dad5e2a60bf
2020-01-29 18:06:02 -05:00
Egon Elbre
10be538602 storagenode: add pkg/debug support
Change-Id: If941095b886c28a0d53fff4c9bf9fa0ce7471dea
2020-01-29 16:30:31 -05:00