Commit Graph

28 Commits

Author SHA1 Message Date
Jessica Grebenschikov
89bdb20a62 storagenodedb/orders: select unsent satellite with expiration
In production we are seeing ~115 storage nodes (out of ~6,500) are not using the new SettlementWithWindow endpoint (but they are upgraded to > v1.12).

We analyzed data being reported by monkit for the nodes who were above version 1.11 but were not successfully submitting orders to the new endpoint.
The nodes fell into a few categories:
1. Always fail to list orders from the db; never get to try sending orders from the filestore
2. Successfully list/send orders from the db; never get to calling satellite endpoint for submitting filestore orders
3. Successfully list/send orders from the db; successfully list filestore orders, but satellite endpoint fails (with "unauthenticated" drpc error)

The code change here add the following to address these issues:
- modify the query for ordersDB.listUnsentBySatellite so that we no longer select expired orders from the unsent_orders table
- always process any orders that are in the ordersDB and also any orders stored in the filestore
- add monkit monitoring to filestore.ListUnsentBySatellite so that we can see the failures/successes

Change-Id: I0b473e5d75252e7ab5fa6b5c204ed260ab5094ec
2020-10-21 15:02:23 +00:00
Moby von Briesen
fbf2c0b242 storagenode/orders: Refactor orders store
Abstract details of writing and reading data to/from orders files so
that adding V1 and future maintenance are easier.

Change-Id: I85f4a91761293de1a782e197bc9e09db228933c9
2020-10-06 15:28:07 -04:00
Jeff Wendling
91698207cf storagenode: live tracking of order window usage
This change accomplishes multiple things:

1. Instead of having a max in flight time, which means
   we effectively have a minimum bandwidth for uploads
   and downloads, we keep track of what windows have
   active requests happening in them.

2. We don't double check when we save the order to see if it
   is too old: by then, it's too late. A malicious uplink
   could just submit orders outside of the grace window and
   receive all the data, but the node would just not commit
   it, so the uplink gets free traffic. Because the endpoints
   also check for the order being too old, this would be a
   very tight race that depends on knowledge of the node system
   clock, but best to not have the race exist. Instead, we piggy
   back off of the in flight tracking and do the check when
   we start to handle the order, and commit at the end.

3. Change the functions that send orders and list unsent
   orders to accept a time at which that operation is
   happening. This way, in tests, we can pretend we're
   listing or sending far into the future after the windows
   are available to send, rather than exposing test functions
   to modify internal state about the grace period to get
   the desired effect. This brings tests closer to actual
   usage in production.

4. Change the calculation for if an order is allowed to be
   enqueued due to the grace period to just look at the
   order creation time, rather than some computation involving
   the window it will be in. In this way, you can easily
   answer the question of "will this order be accepted?" by
   asking "is it older than X?" where X is the grace period.

5. Increases the frequency we check to send up orders to once
   every 5 minutes instead of once every hour because we already
   have hour-long buffering due to the windows. This decreases
   the maximum latency that an order will be reported back to
   the satellite by 55 minutes.

Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036
2020-08-19 19:42:33 +00:00
Egon Elbre
00b2dfa313 storagenode/orders: fix TestDB and TestCleanArchive
Tests were relying on too exact time values.

Change-Id: I83aa0e815b06372a65ec2d62b23df86051050d19
2020-04-30 15:00:06 +03:00
Egon Elbre
21f53e38da storagenode/storagenodedb/storagenodedbtest: pass ctx as an argument
Change-Id: I10b0a8ef3a7d5001e7d361f1873ad5987af1f9c2
2020-01-20 16:56:12 +02:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Cameron
3d9441999a
storagenode/orders: add archive cleanup to orders service (#2821)
This PR introduces functionality for routine deletion of archived orders.

The user may specify an interval at which to run archive cleanup and a TTL for archived items. During each cleanup, all items that have reached the TTL are deleted

This archive cleanup job is combined with the order sender into a new combined orders service
2019-08-22 10:33:14 -04:00
Ivan Fraixedes
e47b8ed131
storagenode: No FATAL error when unsent orders aren't found (#2801)
* pkg/process: Fatal show complete error information
  Change the general process execution function to not using the sugared
  logger for outputting the full error information.
  Delete some unreachable code because Zap logger Fatal method calls exit
  1 internally.
* storagenode/storagenodedb: Add info to error
  Add more information to an error returned due to some data
  inconsistency.
* storagenode/orders: Don't use sugared logger
  Don't use sugar logger and provide better contextualized error messages
  in settle method.
* storagenode/orders: Add some log fields to error msgs
  Add some relevant log fields to some logged errors of the sender settle
  method.
* satellite/orders: Remove always nil error from debug
  Remove an error which as logged in debug level which was always nil and
  makes the logic that used this variable clear.
* storagenode/orders: Don't return error Archiving unsent
  Don't stop the process which archive unsent orders if some of them
  aren't found the DB because it cause the Storage Node to stop with a
  fatal error.
2019-08-16 16:53:22 +02:00
Cameron
497f10d7b1
add method CleanArchive to delete archived orders (#2796) 2019-08-15 12:56:33 -04:00
Ivan Fraixedes
26fb992474 storagenode: Add more test assertions (#2772) 2019-08-13 15:08:05 -04:00
Jeff Wendling
26a2fbb719 storagenode: batch archiving unsent_orders (#2507) 2019-07-31 19:40:08 +03:00
Egon Elbre
5d0816430f
rename all the things (#2531)
* rename pkg/linksharing to linksharing
* rename pkg/httpserver to linksharing/httpserver
* rename pkg/eestream to uplink/eestream
* rename pkg/stream to uplink/stream
* rename pkg/metainfo/kvmetainfo to uplink/metainfo/kvmetainfo
* rename pkg/auth/signing to pkg/signing
* rename pkg/storage to uplink/storage
* rename pkg/accounting to satellite/accounting
* rename pkg/audit to satellite/audit
* rename pkg/certdb to satellite/certdb
* rename pkg/discovery to satellite/discovery
* rename pkg/overlay to satellite/overlay
* rename pkg/datarepair to satellite/repair
2019-07-28 08:55:36 +03:00
Egon Elbre
13dd501042
storagenode/storagenodedb: move tests near the interface rather than the implementation (#2596) 2019-07-19 20:40:27 +03:00
Egon Elbre
d52f764e54
protocol: implement new piece signing and verification (#2525) 2019-07-11 16:51:40 -04:00
Alexander Leitner
1c5db71faf
Change protobuf expirations to use time.Time (#2509)
* Change protobuf expirations to use time.Time instead of timestamp.Timestamp
2019-07-09 17:54:00 -04:00
Michal Niewrzal
bbc25a2bf7 Drop SN certifiates table from DB (#2498) 2019-07-09 17:33:45 -04:00
Alexander Leitner
6d55bbdb57
OrderLimit creation date time limit (#2412)
* Limit by order creation
2019-07-02 12:06:12 -04:00
Egon Elbre
385c046723
pkg/pb: rename Order2 to Order, OrderLimit2 to OrderLimit (#2406) 2019-07-01 18:54:11 +03:00
Egon Elbre
e83ebd7cde
jenkins: avoid using goimports and distribute load better (#2359) 2019-06-27 21:52:50 +03:00
Egon Elbre
b6ad3e9c9f
internal/testrand: new package for random data (#2282) 2019-06-26 13:38:51 +03:00
Egon Elbre
6502143e79
fix import ordering (#2322) 2019-06-25 12:46:29 +03:00
JT Olio
ccb158c99b
pkg/auth: add monkit task to missing places (#2123)
What: add monkit.Task to a bunch of functions that are missing it

Why: this will significantly help our instrumentation, data collection, and tracing about what's going on in the network
2019-06-05 07:47:01 -06:00
Bryan White
faf5fae3f9
Identity versioning (#1389) 2019-04-08 20:15:19 +02:00
Egon Elbre
2c5c2c29da
storage node order sending (#1535) 2019-03-21 15:24:26 +02:00
Natalie Villasana
61ee04d363
adds pingbackTimeout to kademlia endpoint (#1518) 2019-03-19 14:30:27 -04:00
Egon Elbre
117edec54c
Add serial number type (#1508) 2019-03-18 15:08:24 +02:00
Egon Elbre
05d148aeb5
Storage node and upload/download protocol refactor (#1422)
refactor storage node server
refactor upload and download protocol
2019-03-18 12:55:06 +02:00