The test needs to wait for the upload information to be saved to the
database.
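A minimal sketch of what that wait could look like, assuming a testify-based test; db.UploadInfoExists and pieceID are hypothetical stand-ins for whatever the test actually checks:

```go
// Poll instead of asserting immediately, so the test does not race the
// goroutine that persists the upload information.
require.Eventually(t, func() bool {
	ok, err := db.UploadInfoExists(ctx, pieceID) // hypothetical helper
	return err == nil && ok
}, 10*time.Second, 100*time.Millisecond, "upload info was not saved in time")
```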
Fixes https://github.com/storj/storj/issues/6008
Change-Id: I1f258c923a4b33cbc571f97bad046cec70642a0b
Storagenodes are currently receiving larger signed orders due to
a performance optimization in uplink. This skews the ingress graph,
because the storagenode plots the graph using the order amount
instead of the actually uploaded bytes, which this change fixes.
The egress graph might have a similar issue if the order amount
is larger than the actually downloaded bytes, but since we pay
for orders, whether fulfilled or unfulfilled, we continue using
the order amount for the egress graph.
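Roughly, the distinction looks like the sketch below; the usage.Add call and uploadedBytes are illustrative names, not the actual storagenode code:

```go
// Ingress (PUT): plot the bytes that were actually uploaded.
// Egress (GET): keep using the order amount, since orders are what gets paid.
switch action {
case pb.PieceAction_PUT, pb.PieceAction_PUT_REPAIR:
	err = usage.Add(ctx, satelliteID, action, uploadedBytes, time.Now())
default:
	err = usage.Add(ctx, satelliteID, action, order.Amount, time.Now())
}
```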
Resolves https://github.com/storj/storj/issues/5853
Change-Id: I2af7ee3ff249801ce07714bba055370ebd597c6e
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.
Fixes https://github.com/storj/storj/issues/5754
Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
storj/storj uses storj/uplink, and storj/uplink uses storj/storj (for integration tests).
Without using the real defaults (instead of hard-coded ones) in storj/storj, we couldn't modify them: a modification in uplink would fail when storj/storj is used for the integration tests with the unchanged, hard-coded defaults.
Change-Id: Ifa68567dc2d5c8d08af8041ac338870c4fc26d45
Download is served from two goroutines:
* one waits for the orders (and updates the actual limit)
* the other sends the valuable bytes back to the client (once the actual order is big enough)
These two tasks are synchronized with the help of a `sync2.NewThrottle()`.
But all of this happens in the same method, therefore we have no idea how much time is spent waiting for the next orders
(the throttle can wait until we receive a new order limit) and how much time is spent on the actual work.
This patch moves the actual work (after the sending routine is woken up) to a separate method, to get better visibility and to measure the actual work (read data + send it).
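A rough sketch of the split, assuming monkit-style instrumentation; sendChunk and its parameters are illustrative, the point is that the read+send work gets its own task, separate from time spent waiting on the throttle:

```go
// sendChunk does only the measurable work (read the data and send it), so its
// monkit task no longer includes the time spent waiting for new order limits.
func (endpoint *Endpoint) sendChunk(ctx context.Context, reader io.ReaderAt,
	stream pb.DRPCPiecestore_DownloadStream, offset, size int64) (err error) {
	defer mon.Task()(&ctx)(&err)

	chunk := make([]byte, size)
	if _, err := reader.ReadAt(chunk, offset); err != nil {
		return err
	}
	return stream.Send(&pb.PieceDownloadResponse{
		Chunk: &pb.PieceDownloadResponse_Chunk{Offset: offset, Data: chunk},
	})
}
```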
Change-Id: Ia5068c544560a53bc2fcea6cb6fce85cfacbd95b
if a drpc.ClosedError was returned, it would always take the
first (failure) branch, despite the second branch's existence.
Change-Id: Ife3b27869c4e9d37ca2914e2d1d1a2c60d326309
Storagenode download metrics are not accurate:
* the current code bumps the cancel metrics only for specific error messages, but there are cases where the error is already handled (err == nil)
* instead of the full size of the piece we need to use the size of the downloaded bytes
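In monkit terms, the second point is roughly the sketch below; the metric names and the downloadedBytes counter are illustrative:

```go
// Bump the cancel metric where the cancellation is actually detected,
// instead of matching on error messages after the fact.
mon.Meter("piece_download_cancel").Mark(1)

// Report the bytes actually sent to the client, not the full piece size.
mon.IntVal("piece_download_size").Observe(downloadedBytes)
```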
Change-Id: I6ca75770e2d40bf514f5e273785c78e02968c919
we may in the future want to accept writes and commits as part of the
initial request message, just like
https://review.dev.storj.io/c/storj/storj/+/9245
this change is forward compatible but continues to work with
existing clients.
Change-Id: Ifd3ac8606d498a43bb35d0a3751859656e1e8995
uplinks currently get the node's certificate chain over TLS. once Noise
is in use, uplinks will no longer be able to do this. we should start
having the upload request return the certificate chain in the same
release that starts supporting Noise.
Change-Id: I619b23cb8e25691bcc62d760f884403a4ccd64a0
- Adds "Remote Address" field to all INFO logs related to GET,
PUT, and DELETE requests
- Adds Offset and Size fields to all INFO logs related to GET
requests
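With zap, the added fields could look like this; the field names follow the list above, the surrounding values are placeholders:

```go
// GET: log who is downloading and which part of the piece.
log.Info("download started",
	zap.Stringer("Piece ID", pieceID),
	zap.String("Remote Address", remoteAddr),
	zap.Int64("Offset", chunk.Offset),
	zap.Int64("Size", chunk.ChunkSize),
)
```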
Resolves https://github.com/storj/storj/issues/5404
Change-Id: I5dab1867619385362e5f1e0455dfab17d295a37a
we may in the future want to accept orders as part of the initial
request message
(e.g. https://review.dev.storj.io/c/storj/uplink/+/9246).
this change is forward compatible but continues to work with
existing clients.
Change-Id: I475ad50d6cbfee8a1f843383230698e4ef9b9e54
This ensures that empty trash and restore trash cannot run at the same
time.
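The mutual exclusion can be as simple as sharing one mutex between the two operations; a sketch with illustrative receiver and method names:

```go
type TrashHandler struct {
	mu sync.Mutex
	// ...
}

// EmptyTrash and RestoreTrash take the same lock, so one call blocks until
// the other has finished.
func (t *TrashHandler) EmptyTrash(ctx context.Context) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.emptyTrash(ctx) // the actual work, elided here
}

func (t *TrashHandler) RestoreTrash(ctx context.Context) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.restoreTrash(ctx) // the actual work, elided here
}
```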
Fixes https://github.com/storj/storj/issues/5416
Change-Id: I9d2e3aa3d66e61e5c8a7427a95208bb96089792d
* normalize ExistsCheckWorkers flag to avoid setting incorrect values
* wait for limiter to finish if context was canceled
Change-Id: I688a395bd958cd09233605fc264d43f91ec45ed1
Adds a new method, Exists, which can be used to verify which of the
requested piece ids exist on the storage node. It will verify only
pieces which belong to the satellite that uses this endpoint.
Minimum WASM size was increased a bit.
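A simplified sketch of the endpoint; the request/response shapes and the store.Stat lookup are trimmed down for illustration:

```go
// Exists reports which of the requested piece ids are missing. Pieces are
// looked up only under the namespace of the satellite making the call.
func (endpoint *Endpoint) Exists(ctx context.Context, req *pb.ExistsRequest) (*pb.ExistsResponse, error) {
	peer, err := identity.PeerIdentityFromContext(ctx)
	if err != nil {
		return nil, rpcstatus.Error(rpcstatus.Unauthenticated, err.Error())
	}

	missing := make([]uint32, 0, len(req.PieceIds))
	for i, pieceID := range req.PieceIds {
		if _, err := endpoint.store.Stat(ctx, peer.ID, pieceID); err != nil {
			missing = append(missing, uint32(i))
		}
	}
	return &pb.ExistsResponse{Missing: missing}, nil
}
```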
https://github.com/storj/storj/issues/5415
Change-Id: Ia5f9cadeb526541b2776a8973eb7d50133ad8636
Starting restore trash in the background allows the satellite to
continue to the next storagenode without needing to wait until
completion.
Of course, this means the satellite doesn't get feedback on whether
the restore succeeded or not. This means that restore-trash needs to
be executed several times.
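In its simplest form, the change just moves the work off the RPC goroutine; a sketch, with endpoint.restoreTrash standing in for the actual restore logic:

```go
// RestoreTrash acknowledges immediately and does the work in the background,
// so the satellite can move on to the next node without waiting.
func (endpoint *Endpoint) RestoreTrash(ctx context.Context, _ *pb.RestoreTrashRequest) (*pb.RestoreTrashResponse, error) {
	peer, err := identity.PeerIdentityFromContext(ctx)
	if err != nil {
		return nil, rpcstatus.Error(rpcstatus.Unauthenticated, err.Error())
	}
	go func() {
		// detached context: the RPC context is gone as soon as we return
		if err := endpoint.restoreTrash(context.Background(), peer.ID); err != nil {
			endpoint.log.Error("restoring trash failed", zap.Error(err))
		}
	}()
	return &pb.RestoreTrashResponse{}, nil
}
```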
Change-Id: I62d43f6f2e4a07854f6d083a65badf897338083b
* Allow configure initial piece scan
* Update tests to include new flag
* Update default config specification
* Rename configuration flag
* Rename variable and fix formatting
* Fix format
* Fix typo
Co-authored-by: Stefan Benten <mail@stefan-benten.de>
Co-authored-by: Clement Sam <clementsam75@gmail.com>
Co-authored-by: littleskunk <jens.heimbuerge@googlemail.com>
Remote closing during upload or download is entirely expected and
it shouldn't lead to an error in the log.
Bump drpc to get the version that contains correct error code
for it. Also bump errs, which contains a fix for .Has.
Fixes https://github.com/storj/storj/issues/4609
Change-Id: I9297cabcfdc4b3a2c19d478dc729f779a2aef0c3
* storagenode/piecestore: Add upload and download metrics for Grafana alerts
* storagenode/piecestore: group download metrics by piece action
Change-Id: Ib2a42b60c56c3f581915d512f4907c8db71e4624
Co-authored-by: Clement Sam <clement@storj.io>
When a satellite operator runs the restore trash command, it's hard for
node operators to know whether their node has received the request.
Adding logs so this process is visible from the node operator's
perspective.
Change-Id: I08c670b1b0c5971e6b73e038a0935cb0caaca63d
Estimate the speed of uploads by calculating the time a client has been in a non-congested state. The storage node is considered congested when it has used up over 80% of its maximum concurrent requests capacity.
Currently this is disabled by default.
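A sketch of the threshold described above; the function and parameter names are illustrative:

```go
// congested reports whether the node should be treated as congested: more
// than 80% of the configured concurrent-request capacity is in use. Only time
// spent outside this state counts toward the client's speed estimate.
func congested(activeRequests, maxConcurrentRequests int) bool {
	if maxConcurrentRequests <= 0 {
		return false
	}
	return float64(activeRequests) > 0.8*float64(maxConcurrentRequests)
}
```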
errs.Class should not contain "error" in the name, since that causes a
lot of stutter in the error logs. As an example a log line could end up
looking like:
ERROR node stats service error: satellitedbs error: node stats database error: no rows
Whereas something like:
ERROR nodestats service: satellitedbs: nodestatsdb: no rows
Would contain all the necessary information without the stutter.
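For illustration, the difference is just in how the class is named (variable names here are illustrative):

```go
// Before: the class name repeats "error", so every wrapped error reads
// "node stats service error: ...".
var oldError = errs.Class("node stats service error")

// After: the class only names the component, so the log line becomes
// "nodestats service: satellitedbs: nodestatsdb: no rows".
var Error = errs.Class("nodestats service")
```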
Change-Id: I7b7cb7e592ebab4bcfadc1eef11122584d2b20e0
Add the upload size to the log lines of the storagenode Upload endpoint
to provide this information to storage node operators.
Change-Id: Ife661d28be72c2bf02579093e21fa811566ac8dd
We shouldn't have any EOF issues with the recent drpc fix; let's re-enable
and see whether it's still flaky.
Change-Id: I0de312bcb087c7f70ec9d3281d73d86f971845d5
Abstract details of writing and reading data to/from orders files so
that adding V1 and future maintenance are easier.
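One way to picture the abstraction; the interfaces below illustrate the idea rather than the exact types in the orders package:

```go
// readableUnsent hides the on-disk encoding (V0 today, V1 and later versions
// in the future) behind one interface, so callers just read entries.
type readableUnsent interface {
	// ReadOne returns the next unsent order stored in the file.
	ReadOne() (*ordersfile.Info, error)
	Close() error
}

// writableUnsent is the writing counterpart.
type writableUnsent interface {
	Append(*ordersfile.Info) error
	Close() error
}
```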
Change-Id: I85f4a91761293de1a782e197bc9e09db228933c9
Periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.
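A sketch of the writability check, assuming nothing beyond the standard library; the real check may differ in details such as file naming and syncing:

```go
// verifyWritability creates, writes, and removes a small temp file in the
// storage directory; any error means the directory is not writable.
func verifyWritability(storageDir string) error {
	f, err := os.CreateTemp(storageDir, "write-test-*")
	if err != nil {
		return err
	}
	name := f.Name()
	if _, err := f.Write([]byte("write test")); err != nil {
		_ = f.Close()
		_ = os.Remove(name)
		return err
	}
	if err := f.Close(); err != nil {
		_ = os.Remove(name)
		return err
	}
	return os.Remove(name)
}
```

If this periodic check returns an error, the node cancels its run context and shuts down.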
Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
When we call ordersStore.BeginEnqueue, the unsent orders file for that
satellite and hour is prevented from being sent. It is freed when the
commit callback returned by BeginEnqueue is used. This change ensures
that we always call the commit callback, even when we have an empty
order or an order with Amount <= 0.
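Sketched with a simplified view of the BeginEnqueue signature and ordersfile.Info:

```go
// commit must always be called, otherwise the unsent file for this satellite
// and hour stays blocked from being sent.
commit, err := ordersStore.BeginEnqueue(satelliteID, time.Now())
if err != nil {
	return err
}
if order == nil || order.Amount <= 0 {
	// nothing worth enqueueing, but the callback still releases the file
	return commit(nil)
}
return commit(&ordersfile.Info{Limit: limit, Order: order})
```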
Change-Id: Ic4678f7eaa1e6957dd77d4bb5a23bb35d25b1e93
This change accomplishes multiple things:
1. Instead of having a max in flight time, which means
we effectively have a minimum bandwidth for uploads
and downloads, we keep track of what windows have
active requests happening in them.
2. We don't double check when we save the order to see if it
is too old: by then, it's too late. A malicious uplink
could just submit orders outside of the grace window and
receive all the data, but the node would just not commit
it, so the uplink gets free traffic. Because the endpoints
also check for the order being too old, this would be a
very tight race that depends on knowledge of the node system
clock, but best to not have the race exist. Instead, we
piggyback off of the in-flight tracking and do the check when
we start to handle the order, and commit at the end.
3. Change the functions that send orders and list unsent
orders to accept a time at which that operation is
happening. This way, in tests, we can pretend we're
listing or sending far into the future after the windows
are available to send, rather than exposing test functions
to modify internal state about the grace period to get
the desired effect. This brings tests closer to actual
usage in production.
4. Change the calculation for if an order is allowed to be
enqueued due to the grace period to just look at the
order creation time, rather than some computation involving
the window it will be in. In this way, you can easily
answer the question of "will this order be accepted?" by
asking "is it older than X?" where X is the grace period.
5. Increases the frequency we check to send up orders to once
every 5 minutes instead of once every hour because we already
have hour-long buffering due to the windows. This decreases
the maximum latency that an order will be reported back to
the satellite by 55 minutes.
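Point 4 reduces to a one-line predicate; a sketch with illustrative names:

```go
// canEnqueue answers "will this order be accepted?" purely from the order's
// creation time: it must be younger than the grace period.
func canEnqueue(orderCreatedAt, now time.Time, gracePeriod time.Duration) bool {
	return now.Sub(orderCreatedAt) < gracePeriod
}
```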
Change-Id: Ie08b90d139d45ee89b82347e191a2f8db1b88036
On run, write the storage directory verification file.
Every time the node runs it will write the file even if it already exists.
The reason we do this is because if the verification file is missing, the SN
doesn't know whether it is an incorrect directory, or it simply hasn't written
the file yet, and we want to keep nodes running without needing operator intervention.
Once this change has been a part of the minimum version for several releases,
we will move the file creation from the run command to the setup
command. Run will only verify its existence.
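A sketch of the behavior described above; the file name and content format are illustrative:

```go
// On run: always (re)write the marker, so older nodes keep working without
// operator intervention.
func writeVerificationFile(storageDir string, nodeID storj.NodeID) error {
	return os.WriteFile(filepath.Join(storageDir, "storage-dir-verification"), nodeID.Bytes(), 0o644)
}

// Later, once the file is created during setup, run will only verify it.
func verifyStorageDir(storageDir string, nodeID storj.NodeID) error {
	content, err := os.ReadFile(filepath.Join(storageDir, "storage-dir-verification"))
	if err != nil {
		return err
	}
	if !bytes.Equal(content, nodeID.Bytes()) {
		return errs.New("storage directory verification failed")
	}
	return nil
}
```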
Change-Id: Ib7d20e78e711c63817db0ab3036a50af0e8f49cb
* Add all new orders to the orders filestore instead of the database.
* Submit orders from the filestore to the new satellite SettleWindow
endpoint.
The orders filestore will eventually replace the orders DB completely.
For now, we will still be checking the orders DB and submitting those
orders if they exist. In a later release, we will completely remove the
orders DB, but we need both the DB and filestore for the transitionary
period.
Change-Id: Iac8780fd5ab770296181bbd313e1d335f072d4dc
We only sync free disk and available space, if necessary, on startup. If the
SN's disk fills up with non-storj data, we will not know about it when reporting
available space to the satellite.
Solution: whenever we check the node's capacity, double check free disk.
If free disk is less than the expected available space, return free disk.
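The double check is essentially a min(); a sketch with illustrative names:

```go
// availableSpace never reports more than the OS says is actually free, so a
// disk filling up with non-storj data is reflected immediately.
func availableSpace(allocated, usedForPieces, freeDisk int64) int64 {
	available := allocated - usedForPieces
	if freeDisk < available {
		return freeDisk
	}
	return available
}
```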
Change-Id: I66265c16e03be45b6e1f5817c70df7eac0a76455