WHAT:
list of suspension and audit scores by each satellite
WHY:
a way to show audit and suspension scores of all satellite when no satellite selected
Change-Id: I4b3ac55c2da826e50131a2381c004493d82c5d5b
This PR updates `uplink rb --force` command to use the new libuplink API
`DeleteBucketWithObjects`.
It also updates `DeleteBucket` endpoint to return a specific error
message when a given bucket has concurrent writes while being deleted.
Change-Id: Ic9593d55b0c27b26cd8966dd1bc8cd1e02a6666e
WHAT:
dashboard and billing pages' scrolling fixed with no paywall banner
WHY:
content was slightly clipped
Change-Id: I4d663775456980887af7bc41c72bf3a8da87d301
This PR fixes a deadlock that can happen when the number of piece
deletion requests is different from the distinct node count from those
requests. The success threshold should be based on the number of nodes
instead of the amount of requests
Change-Id: I83073a22eb1e111be1e27641cebcefecdc16afcb
This change forces the test of GetObjectIPs to use multiple remote
segments (earlier versions of the test were accidentally using inline
segments). This change also revealed a small bug in the for loop code,
which is fixed.
Change-Id: Ic486b079d221952ba13553acf0ca41a8873f3f21
periodically create and delete a temp file in the storage directory
to verify writability. If this check fails, shut the node down.
Change-Id: I433e3a8d1d775fc779ae78e7cf3144a05ffd0574
* The audit worker wants to get items from the queue and process them.
* The audit chore wants to create new queues and swap them in when the
old queue has been processed.
This change adds a "Queues" struct which handles the concurrency
issues around the worker fetching a queue and the chore swapping a new
queue in. It simplifies the logic of the "Queue" struct to its bare
bones, so that it behaves like a normal queue with no need to understand
the details of swapping and worker/chore interactions.
Change-Id: Ic3689ede97a528e7590e98338cedddfa51794e1b
Add online score used for the new audit history offline tracking system
to the nodes table. This allows us easy access to the node's online
score for the storagenode dashboard as well as for data analysis.
Change-Id: Ie99be1192e5236862a5b3dbed2e5ef03b9169410
We were seeing error on the last day of the month with TestProjectAllocatedBandwidthRetainTwo.
This is due to AddDate normalizes its result in the same way that Date does, so, for example,
adding one month to October 31 yields December 1, the normalized form for November 31."
I also fixed a minor UTC issue with this test as well.
Change-Id: I0157873e7befa57810e5f264a922b188890fa46a
satellite.DB.Console().Projects().GetAll database query
can be replaced with planet.Uplinks[0].Projects[0].ID
Change-Id: I73b82b91afb2dde7b690917345b798f9d81f6831
To prevent longlived unused connections, set the maximum time to 30 minutes to
prevent proxies and loadbalancers forcefully cutting the connection.
This helps in scenarios with low load/requests to a DB.
Change-Id: I7dba15ef97f6f6541e872a6fb1d3a9bbbfe5bb50
When a node's audit history "online score" passes below a configured
threshold, the node goes into "offline suspension" mode and begins a
review period, where the operator is given an opportunity to bring their
node back online.
After the review period passes, offline suspension is turned off for the
node.
In the future, if a node still has a bad online score at the end of the
review period, it will be disqualified. This is disabled right now.
In the future, if a node is in offline suspension, it will be treated as
"unhealthy". Right now, there are no consequences for being in offline
suspension.
Minor changes:
* Moves AuditHistoryConfig out of UpdateStats/BatchUpdateStats args and
into UpdateRequest.
* Adds "now" argument to UpdateStats/BatchUpdateStats args for easy
testing.
* Changes formatting strings inside buildUpdateStatement to use specific
types.
Change-Id: I032b60298840fc16e6ef831da750f2d57619a397
Currently there is confusion between responsibilities of
metainfo.Endpoint, metainfo.Service, PointerDB.
By separating database "service" into a separate package and
its types allows to disentagle them.
This gives us responsibilities:
1. metainfo.Endpoint - translates requests and permissions
2. metainfo.Service - handles requests and coordinates with
objectdeletion, piecedeletion, metabase
3. metabase.Service - communication with the database interface and invariants
Currently metabase will contain the types necessary to coordinate
information.
Change-Id: If8c992b4b9d9e70a56bbd8a378a5af6b1a2ec34e
Jenkins has been failing a lot lately due to test timeouts with CockroachDB.
TestMigrateCockroach previously took around 5 minutes, now it takes 2.
Why 103? I couldn't get 100 to work due to an error w/ NOT NULL and PKs.
Change-Id: Iec95d4e25f9d6cd36920e7f43272c486a17fa879
TestMaxOutBuckets is one of our slower tests (50-90s).
This change seems to make it 2-12s.
It reduces the number of buckets that need to be created.
It also removes unnecessary storage nodes.
Change-Id: I1012fc6e9258b2f7674b16da4e8b418741c93eea
If a segment is deleted, is modified, or expires during an audit, this
is not problematic, so we should not return errors. Functionally,
nothing changes, but our metrics around audit success rate will be
improved after this change.
Change-Id: Ic11df056b2c73894b67a55894bd4d58c00470606
This PR changes DeleteBucket to be able to delete all objects within a
bucket if `DeleteAll` is set in `BucketDeleteRequest`.
It also changes `DeleteBucket` API to treat `ErrBucketNotFound` as a
successful delete operation instead of returning an error back to the
client.
Change-Id: I3a22c16224c7894f2d0c2a40ba1ae8717fa1005f
When we call ordersStore.BeginEnqueue, the unsent orders file for that
satellite and hour is prevented from being sent. It is freed when the
commit callback returned by BeginEnqueue is used. This change ensures
that we always call the commit callback, even when we have an empty
order or an order with Amount <= 0.
Change-Id: Ic4678f7eaa1e6957dd77d4bb5a23bb35d25b1e93