To keep multi-region satellites resilient, we cannot stop processing
client upload/download requests when Redis isn't responding properly.
These changes avoid stopping the processing of client requests when we
cannot check whether the client exceeds its storage or bandwidth limits,
or cannot update its used storage/bandwidth, because Redis is not
responding successfully or the satellite database returns an error.
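A minimal sketch of the behavior described above. The usageCache
interface, the two fake caches, and exceedsStorageLimit are hypothetical
names used only for illustration; they are not taken from the satellite
code. The point is only that a failed usage lookup is logged and the
request is allowed to continue.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// usageCache is a hypothetical stand-in for the Redis-backed usage cache.
type usageCache interface {
	StorageUsed(projectID string) (int64, error)
}

// downCache simulates a cache backend that is not responding.
type downCache struct{}

func (downCache) StorageUsed(string) (int64, error) {
	return 0, errors.New("redis: connection refused")
}

// fixedCache simulates a healthy cache that reports a fixed usage.
type fixedCache struct{ used int64 }

func (c fixedCache) StorageUsed(string) (int64, error) { return c.used, nil }

// exceedsStorageLimit reports whether the project is at or over its limit.
// If the usage cannot be read, the error is logged and false is returned,
// so the client request is not rejected because of a cache outage.
func exceedsStorageLimit(cache usageCache, projectID string, limit int64) bool {
	used, err := cache.StorageUsed(projectID)
	if err != nil {
		log.Printf("cannot check storage usage for project %s, allowing request: %v", projectID, err)
		return false
	}
	return used >= limit
}

func main() {
	fmt.Println(exceedsStorageLimit(fixedCache{used: 150}, "p1", 100)) // true
	fmt.Println(exceedsStorageLimit(downCache{}, "p1", 100))           // false
}
```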
Change-Id: Ia7f12c07fc9ffdfad0e7ff052ff3fd81eca0f0e3
Respond to HTTP clients that request the project usage limits with
different status codes depending on the error class returned by the
satellite/accounting Service.
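A minimal sketch of mapping error classes to status codes. The sentinel
errors and the chosen status codes are assumptions made for
illustration; the real accounting service defines its own error classes
and the handler may pick different codes.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical error classes, standing in for the ones the accounting
// service returns.
var (
	errInvalidArgument    = errors.New("invalid argument")
	errBackendUnavailable = errors.New("backend unavailable")
	errOther              = errors.New("unexpected failure")
)

// statusFor picks an HTTP status code based on the class of err.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errInvalidArgument):
		return http.StatusBadRequest
	case errors.Is(err, errBackendUnavailable):
		return http.StatusServiceUnavailable
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	// Wrapping keeps the class, so errors.Is still matches.
	wrapped := fmt.Errorf("get project usage: %w", errBackendUnavailable)
	fmt.Println(statusFor(wrapped))  // 503
	fmt.Println(statusFor(errOther)) // 500
}
```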
Change-Id: I6f486ea55517f616c7cec81dbbe77e997484180f
This is the first step in the removal of uptime columns on the
nodes table. These columns are no longer used:
uptime_success_count
total_uptime_count
uptime_reputation_alpha
uptime_reputation_beta
In order to avoid breaking backwards compatibility, we need to
remove all references to these columns before removing the columns
themselves from the database. However, since uptime_success_count
and total_uptime_count are NOT NULLABLE, we can't remove them from
the insert statements in the overlay. So we can't remove the columns
because of the references, and we can't remove the references because
the columns can't be null. What a pickle. To remedy this, we will set a
default on the columns. Then we should be able to remove them from the
insert statements.
Change-Id: I75f6c56fb7897835bbf29869f86f39de1d9dd345
We have to adapt the live accounting so that the packages that use it
can differentiate between errors and ignore some of them, making our
satellite resilient to Redis downtime.
To differentiate errors we would have to make changes in the live
accounting and also in storage/redis.Client; however, that may require
some dirty workarounds or break other parts of the implementation that
depend on it.
On the other hand, we want to get rid of storage/redis.Client because
it has more functionality than we are using, and the process of
removing it has already started.
Hence, we have refactored the live accounting to use the Redis client
library directly, so that a future commit can adapt the satellite to be
resilient to Redis downtime.
Last but not least, a test for expired bandwidth keys has been added,
and with it a bug was spotted and fixed.
Change-Id: Ibd191522cd20f6a9a15e5ccb7beb83a678e530ff
GetSuccessfulNodeNotCheckedInSince and GetOfflineNodesLimited are overlay methods
that were only used by the previous downtime tracking system, which has been removed.
These methods should also be removed.
Change-Id: Idb829d742e1f987e095604423fff656fe581183e
SatelliteAddress in OrderLimit is no longer used, and some
satellite addresses may consume too many bytes.
Change-Id: Ic7a0efe5b6211c2f3b91af67b293cde98b29d074
Avoid using project uuid string representation, because
it uses more bandwidth.
This reduces the encrypted metadata size from 118 to 97 bytes.
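A small sketch of why the string form is larger: a UUID is 16 raw
bytes, while its canonical hex representation with dashes is 36
characters. The uuidString helper is hypothetical and written with the
standard library only; how the remaining difference arises depends on
the metadata encoding and is not modeled here.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// uuidString renders 16 raw bytes in the canonical 8-4-4-4-12 hex form.
func uuidString(id [16]byte) string {
	h := hex.EncodeToString(id[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

func main() {
	var id [16]byte
	for i := range id {
		id[i] = byte(i)
	}
	s := uuidString(id)
	// Prints the string form, its length, and the raw length.
	fmt.Println(s, len(s), len(id))
}
```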
Change-Id: Ic53a81b83acc065f24f28cd404f9c0b1fe592594
When running make install-sim we currently create a .build directory and
init a go module there. This change allows a user to pass in a custom
location for this to occur. If nothing is passed in, it defaults to
the current behavior.
Change-Id: I432dfc7bae412d8a8454a6b8b3dffece84a41147
Jens noticed that 'uplink access register' wasn't working with named
accesses.
This was because GetNamedAccess was hardcoded to use inspectCfg, which
in the case of 'uplink access register' wasn't being bound to the
config file.
Change-Id: I49403b45af28ad33408cfc5ec6545a395f0f080d
Fail all the processes immediately when one of the processes fails. This
is to make it more obvious that one of them has failed.
To disable failfast, use `-failfast=false`.
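A minimal sketch of the failfast idea, with goroutines standing in for
the processes. runAll, the task type, and the three sample tasks are
hypothetical names for illustration only: when failfast is set, the
first failure cancels the shared context so the remaining tasks stop.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// task is a hypothetical unit of work standing in for one process.
type task func(ctx context.Context) error

var errBoom = errors.New("boom")

func failing(ctx context.Context) error { return errBoom }
func succeed(ctx context.Context) error { return nil }

// waitForCancel blocks until the shared context is cancelled, so it
// only returns when another task fails and failfast is enabled.
func waitForCancel(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// runAll runs every task concurrently and returns one error per task.
// With failfast enabled, the first failure cancels the shared context.
func runAll(failfast bool, tasks ...task) []error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, t := range tasks {
		wg.Add(1)
		go func(i int, t task) {
			defer wg.Done()
			errs[i] = t(ctx) // each goroutine writes only its own slot
			if errs[i] != nil && failfast {
				cancel()
			}
		}(i, t)
	}
	wg.Wait()
	return errs
}

func main() {
	errs := runAll(true, failing, waitForCancel)
	fmt.Println(errs[0], errs[1]) // boom context canceled
}
```

With failfast disabled, a failure is still reported but the other tasks
keep running until they finish on their own.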
Change-Id: I2bbedf12fb653e42739d00273aa9ae515d34eda6
WHAT:
If all buckets are selected, we now pass an empty array to the web worker instead of all existing bucket names.
WHY:
bug fix
Change-Id: I59b9e9ec1deb0abb3edeba7e2303e2b1d578935d
Do not insert the number of healthy pieces for segment health anymore.
Rather, insert the segment health calculated by our new priority
function.
Change-Id: Ieee7fb2deee89f4d79ae85bac7f577befa2a0c7f
Full prefix: satellite/{overlay,nodestats},storagenode/{reputation,nodestats}
Allow the storagenode to receive its audit history data from the
satellite via the satellite's GetStats endpoint.
The storagenode does not save this data for use in the API yet.
Change-Id: I9488f4d7a4ccb4ccf8336b8e4aeb3e5beee54979
Previously, we were trying to overwrite accesses, which is a nested map
in the uplink config, by calling viper.MergeWithConfig with a nested
map. While this works for keys that don't exist already, it does not
overwrite already existing keys. In order to do that, we need to call
MergeWithConfig with "accesses.<accessname> -> value" rather than using
"accesses -> <accessname> -> value".
Change-Id: I74d7a9decf2078cdf2ff440eaf24821e30474b53
The current uplink access register method has the ability to write to AWS credential files.
This has caused issues with repeated usage in recent AWS CLI code, and there was concern that
it was an unstable solution. This version instead supports two output formats, "env"
and "aws". "env" formats the text so that it can be used with 'export'. "aws" generates
"aws configure" commands to persist the credentials to the AWS credential files as the
previous version could.
Example usages:
Setting ephemeral environment variables in bash:
export $(uplink access register $(storj-sim network env GATEWAY_0_ACCESS)
--auth-service http://localhost:8000 --format env)
Setting persistent configs via AWS CLI in bash:
source <(uplink access register $(storj-sim network env GATEWAY_0_ACCESS)
--auth-service http://localhost:8000 --format aws --aws-profile storjsim)
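A rough sketch of what the two formats could produce. The credentials
struct and both helpers are hypothetical, and the exact lines printed by
uplink may differ; the sketch only assumes the standard AWS variable
names and the AWS CLI "aws configure set" command.

```go
package main

import "fmt"

// credentials holds hypothetical values returned by the auth service.
type credentials struct {
	AccessKeyID string
	SecretKey   string
}

// formatEnv emits KEY=VALUE lines suitable for `export $(...)`.
func formatEnv(c credentials) string {
	return fmt.Sprintf("AWS_ACCESS_KEY_ID=%s\nAWS_SECRET_ACCESS_KEY=%s\n",
		c.AccessKeyID, c.SecretKey)
}

// formatAWS emits "aws configure set" commands suitable for `source <(...)`.
func formatAWS(c credentials, profile string) string {
	return fmt.Sprintf(
		"aws configure set aws_access_key_id %s --profile %s\n"+
			"aws configure set aws_secret_access_key %s --profile %s\n",
		c.AccessKeyID, profile, c.SecretKey, profile)
}

func main() {
	c := credentials{AccessKeyID: "id", SecretKey: "secret"}
	fmt.Print(formatEnv(c))
	fmt.Print(formatAWS(c, "storjsim"))
}
```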
Change-Id: I5d78d6462a3537780af3717a298bb2bebf9c2799
* Separate audit history interface into its own file in the overlay
package
* Add overlay.AuditHistory struct so that internalpb.AuditHistory is
only used from within the database layer
* Add overlay.GetAuditHistory function for features that will require
access to detailed audit history information
* Do not return full audit history from UpdateAuditHistory - callers to
that function only need to know the online score and whether a full
tracking period has been completed
* Move audit history tests out of satellite/satellitedb, since they are
independent of database implementation
Change-Id: I35b0c4ac23bbaabd80624f8a9631c3cb1a1f33bd
Now that the deprecated downtime tracking service is removed
(3fc76f4ffe), we can safely remove
the nodes_offline_times table.
Change-Id: Ia7c6efe32ba104dff5a830af5f2beee3337eefe5
Nodes which are offline_suspended will no longer be considered for new
uploads. The current threshold that enters a node into offline
suspension is 0.6. Disqualification for offline suspension is still
disabled.
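A minimal sketch of the selection rule. The node struct and both helpers
are hypothetical, and the sketch assumes that a node enters offline
suspension when its online score falls below the 0.6 threshold;
suspended nodes are then skipped when choosing upload candidates.

```go
package main

import "fmt"

// offlineSuspensionThreshold is the threshold mentioned above.
const offlineSuspensionThreshold = 0.6

// node is a hypothetical, simplified view of a storage node.
type node struct {
	ID               string
	OnlineScore      float64
	OfflineSuspended bool
}

// applySuspension marks the node as offline suspended when its online
// score is below the threshold (assumption: a lower score is worse).
func applySuspension(n node) node {
	n.OfflineSuspended = n.OnlineScore < offlineSuspensionThreshold
	return n
}

// uploadCandidates returns only the nodes that are not offline suspended.
func uploadCandidates(nodes []node) []node {
	var out []node
	for _, n := range nodes {
		if !n.OfflineSuspended {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []node{
		applySuspension(node{ID: "a", OnlineScore: 0.9}),
		applySuspension(node{ID: "b", OnlineScore: 0.5}),
	}
	for _, n := range uploadCandidates(nodes) {
		fmt.Println(n.ID) // a
	}
}
```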
Change-Id: I0da9abf47167dd5bf6bb21e0bc2186e003e38d1a
Add the upload size to the log lines of the storagenode Upload endpoint
to provide this information to storage node operators.
Change-Id: Ife661d28be72c2bf02579093e21fa811566ac8dd