Switch back to the original DeleteBucket and DeleteObject methods.
Next step: remove the DeleteBucketReturnDeleted and
DeleteObjectReturnDeleted from storj.io/uplink.
Change-Id: I273a305326d411e51ce354ce72fcc6ecadf4dd5f
step 1 in https://review.dev.storj.io/c/storj/uplink/+/1236
Now the old libuplink uses the temporary DeleteBucketReturnDeleted and
DeleteObjectReturnDeleted methods. This way, in the next step, we will
be able to change the DeleteBucket and DeleteObject methods to return
the deleted bucket/object.
Change-Id: I2e638be1960bca6ce1456c92849fcdd6d93e5252
The admission/v3 protocol now supports arbitrary key/value headers to be
included in each packet of metrics. This commit creates support for
this, so the lua config file can declare a filter taking into account
the key/value headers.
Change-Id: I41de8c018d33304ccf46ec221ae689d55c5fb1ee
common/pb moved grpc to a separate package common/pb/pbgrpc.
This updates this repository to use it.
Change-Id: I2de2a190688871cf9cb61f7ea511f8a01e264e4e
Now that we are trying to identify the root cause of the satellite load limitations (i.e. currently the satellite has a max ability of 400 rps for uploads and we need this to be higher), we are using the golang diagnostic tools to collect insight into what the bottlenecks are. We currently have a debug endpoint to gather some cpu and mem data, but it could be useful to have continuous profiling. GCP stackdriver has support for continuous profiling so lets set that up and see if it is helpful to gather more data.
This PR adds support for [GCP continuous profiler](https://cloud.google.com/profiler) which allows enabling continuous cpu/mem profiling and the stats are sent to stackdriver in google cloud console.
To enable the continuous profiling for a storj component, do the following:
- prereq: the workload must be running in GKE and have Stackdriver Profiling IAM role permissions
- provide the config flag `debug.profilename` in the config.yaml file for the workload (i.e. satellite api process, etc). The profilename should be the workload name, for example "satellite-api".
- once the above config flag is provided, the profiler will be initialized and profiling stats will automatically be sent to GCP project where the workload is running and viewable in the Stackdriver Profile page in the console
The current implementation assumes the workload is running in GKE, however if we find if useful we can add support to enable this from anywhere. But for simplicity, its configured this way assuming the main goal is to enable in production systems.
Change-Id: Ibf8ebe2df7bf06fdd4951ee6a1e48854dd36ad47
With commit: 3331b443e7, satellite will
start calling `DeletePieces`. Therefore, we can remove the old endpoint
once the above commit is deployed with all satellites
Change-Id: I0124bc00a7cb808d119eb59f8fcd7fadf68158bb
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.
graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.
it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.
Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
it was noticed that if you had a long lived transaction A that
was blocking some other transaction B and A was being aborted
due to retriable errors, then transaction B was never given
priority. this was due to using savepoints to do lightweight
retries.
this behavior was problematic becaue we had some queries blocked
for over 16 hours, so this commit addresses the issue with two
prongs:
1. bound the amount of time we will retry a transaction
2. create new transactions when a retry is needed
the first ensures that we never wait for 16 hours, and the value
chosen is 10 minutes. that should be long enough for an ample
amount of retries for small queries, and huge queries probably
shouldn't be retried, even if possible: it's more preferrable to
find a way to make them smaller.
the second ensures that even in the case of retries, queries that
are blocked on the aborted transaction gain priority to run.
between those two changes, the maximum stall time due to retries
should be bounded to around 10 minutes.
Change-Id: Icf898501ef505a89738820a3fae2580988f9f5f4
We move PathCipher to encryption.Store and we need to adjust
storj/uplink for those changes. Uplink repo is also using libuplink to
run tests so we need first adjust storj/storj libuplink and later
storj/uplink.
Change-Id: I84f23e6bad18ac139f72c19939dc526f9f46d88b
This ought to make it so that all single statements (Exec- or Query-) on
a CockroachDB backend will get retried as necessary. As there is no need
for savepoints to be allocated or released in this case, there is no
round-trip overhead except when statements actually do need to be
retried.
Change-Id: Ibd7f1725ff727477c456cb309120d080f3cd7099
As per discussed we decided to rate limit how fast we iterate through
the metainfo database in the metainfo loop. This puts in place a
mechanism for rate limiting and burst limiting if need be in the future.
The default for this rate limiting is still no limits so it stays the
same as our previous functionality.
Change-Id: I950f7192962b0e49f082d2c4284e2d52b0a925c7
these may not be optimal but they're probably better based on
our previous testing. we can tune better in the future now that
the groundwork is there.
Change-Id: Iafaee86d3181287c33eadf6b7eceb307dda566a6
* separate sadb migration, add version check
* update checkversion to do same validation as migration
* changes per CR
* add sa migration to storj-sim
* add different debug port in storj-sim for migration
* add wait for exit for storj-sim migration
* update sa docker entrypoint to support migration
* storj-sim satellite parts all wait for migration
* upgrade golang-migrate/migrate to v4 because bug
* fix go mod tidy
keep a pool of connections open when dialing for drpc. this
makes it so that long lived clients (like lib/uplink's Project)
don't continue to use a bad connection forever. it also allows
for concurrent rpcs.
Change-Id: If649b286050e4f09c413fadc3e1ce88f5fc6e600
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.
most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.
a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.
Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
The fundamental problem is that both drpc and grpc servers
want to close the listener and they both want to ignore the
error from Accept after the listener is closed. There's no
way to do this in a race free way. Fortunately, the mux
hands out listeners that can be independently closed. That
means they can both do their own shutdown logic where they
ignore the error, and then after they're closed, the code
orchestrating the servers can close the listeners.
The final weird bit is that the server's Close method is
required to wait until the Run method has exited (or at
least enough for the listeners to definitely be closed)
because tests depend on that behavior, so we have to add
some channels/mutexes/onces to ensure that Run has exited
and that a new call can't start after Close is called.
Change-Id: I7c4ef293f7963f83138815f51824fd5b8d09ce15
What: Change cmd/uplink to use scopes
It moves the fields that will be subsumed by scopes into an explicit legacy section and hides their configuration flags.
Why: So that it can read scopes in from files and stuff
* update UI to reflect final mockups
* implement create handler and render offers table data to UI
* fix line-height unit and remove important from selectors
* update file names and ids for clarity
* shorten 'label' in ids
* localize global vars, fix endpoint names, remove unnecessary receiver, fix comments
* fix unnecessary initialization of pointer
* correct file-naming conventions
* register timeConverter in an init func for safety and remove unnecessary important from css
* consolidate create endpoints and add comments
* register timeConverter in init func
* add copyright to files
* introduce require pkg
* add proper http server unit test
* update linting and create offers concurrently in unit test
* fix getOffers comment
* add copy-right to unit-test
* fix data-races
* fix linting
* remove converter in NewServer
* requested changes in progress
* add require for checking status code
* renamed template file
* fix 400 handler
* fix missing copyright and remove extra line
* fix build
* run goroutine for testing parallel
* evaluate reqType with switch stmt and promp for credit amount in cents
* fix lint issue
* add default case
* remove unnecessary var
* fix range scope error
* remove empty lines and use long form for struct field
* fix merge conflicts
* fix template reference
* fix modal id
* not delete package
* add currency formatting and requested changes
* markup formatting
* lean out currency logic and move wait outside loop
* pass ToDollars func to home template
* fix lint
* update UI to reflect final mockups
* fix line-height unit and remove important from selectors
* update file names and ids for clarity
* shorten 'label' in ids
* correct file-naming conventions
* add copyright to files
* check if break in imports is causing lint error
* resolve lint
* tidy go mod
* fix shorthands
* set to only listen on 127.0.0.1, move static files to same location, better template handling
* handle error
* fix path in storj-sim
* revert template handling changes
* code shouldn't panic on invalid tempalte
* do not rewrite once writing has started
* write correct error code
* use filepath for path handling
* revert change
* fix
* fix mod tidy
* use correct error code for not found, avoid infinite loop on failure
* Set up new port 8090 for in offers
Clean up commented code
Rename offers to offersweb
Remove unused code
Add todos for adding front-end templates
Add middleware for only allow local access
Add comment
Fix linting error
Remove commented code
Update storj-sim
Check request IP against Host IP
Use net pakcage to retrieve IP address
Rename service to marketing
* Add wrapper for all errors
* fix conflicts
* update the config file
* fix linting error
* remove unused packages
* remove global runtime var and add flag to storj-sim for mar static dir
* remove debugging lines
* add new config for test data and check if static dir flag is set before passing to mux
* change 'console' to 'marketing' for test data config
* fix linting errors
* update config flag
* Trigger Jenkins
* Trigger CLA
* change BindSetup to be an option to Bind
* add process.Bind to allow composite structures
* hack fix for noprefix flags
* used tagged version of structs
Before this PR, some flags were created by calling `cfgstruct.Bind` and having their fields create a flag. Once the flags were parsed, `viper` was used to acquire all the values from them and config files, and the fields in the struct were set through the flag interface.
This doesn't work for slices of things on config structs very well, since it can only set strings, and for a string slice, it turns out that the implementation in `pflag` appends an entry rather than setting it.
This changes three things:
1. Only have a `Bind` call instead of `Bind` and `BindSetup`, and make `BindSetup` an option instead.
2. Add a `process.Bind` call that takes in a `*cobra.Cmd`, binds the struct to the command's flags, and keeps track of that struct in a global map keyed by the command.
3. Use `viper` to get the values and load them into the bound configuration structs instead of using the flags to propagate the changes.
In this way, we can support whatever rich configuration we want in the config yaml files, while still getting command like flags when important.
* define irreparable inspector protobuf
* add IrreparableDB method GetLimited
* fill out irreparable inspector API
* add IrreparableInspector server to satellite, fix small error
* refactor IrreparableDB to use pb.IrreparableSegment instead of irreparable.RemoteSegmentInfo
this change removes the cryptopasta dependency.
a couple possible sources of problem with this change:
* the encoding used for ECDSA signatures on SignedMessage has changed.
the encoding employed by cryptopasta was workable, but not the same
as the encoding used for such signatures in the rest of the world
(most particularly, on ECDSA signatures in X.509 certificates). I
think we'll be best served by using one ECDSA signature encoding from
here on, but if we need to use the old encoding for backwards
compatibility with existing nodes, that can be arranged.
* since there's already a breaking change in SignedMessage, I changed
it to send and receive public keys in raw PKIX format, instead of
PEM. PEM just adds unhelpful overhead for this case.
* cmd/statreceiver: lua-scriptable stat receiver
Change-Id: I3ce0fe3f1ef4b1f4f27eed90bac0e91cfecf22d7
* some updates
Change-Id: I7c3485adcda1278fce01ae077b4761b3ddb9fb7a
* more comments
Change-Id: I0bb22993cd934c3d40fc1da80d07e49e686b80dd
* linter fixes
Change-Id: Ied014304ecb9aadcf00a6b66ad28f856a428d150
* catch errors
Change-Id: I6e1920f1fd941e66199b30bc427285c19769fc70
* review feedback
Change-Id: I9d4051851eab18970c5f5ddcf4ff265508e541d3
* errorgroup improvements
Change-Id: I4699dda3022f0485fbb50c9dafe692d3921734ff
* too tricky
the previous thing was better for memory with lots of errors at a time
but https://play.golang.org/p/RweTMRjoSCt is too much of a foot gun
Change-Id: I23f0b3d77dd4288fcc20b3756a7110359576bf44
* preparing for use of `customtype` gogo extension with `NodeID` type
* review changes
* preparing for use of `customtype` gogo extension with `NodeID` type
* review changes
* wip
* tests passing
* wip fixing tests
* more wip test fixing
* remove NodeIDList from proto files
* linter fixes
* linter fixes
* linter/review fixes
* more freaking linter fixes
* omg just kill me - linterrrrrrrr
* travis linter, i will muder you and your family in your sleep
* goimports everything - burn in hell travis
* goimports update
* go mod tidy
..although it ought to work for other storage.KeyValueStore needs as
well. it's just optimized to work pretty well for a largish hierarchy of
paths.
This includes the addition of "long benchmarks" for KeyValueStore
testing. These will only be run when -test-bench-long is added to the
test flags. In these benchmarks, a large corpus of paths matching a
natural ("real-life") hierarchy is read from paths.data.gz (which you
can get from https://github.com/storj/path-test-corpus) and imported
into a particular KeyValueStore. Recursive and non-recursive queries are
run on it to detect performance problems that arise only at scale.
This also includes alternate implementation of the postgreskv client,
which works in a less-bizarre way for non-recursive queries, but suffers
from poor performance in tests such as the long benchmarks. Once this
alternate impl is committed to the tree, we can remove it again; I just
want it to be available for future reference.
* begin adding path encryption
* do not encrypt/decrypt first element of path (bucket)
* add path encryption for delete and list
* use encrypted paths in streamstore.Meta
* fix listing with encrypted paths
* move encrypt/decryptAfterBucket to streamstore
* fix listing with no prefix
* remove duplicate logic for listing with no prefix
* Loads cache from context for PointerDB access
* WIP adds overlay lookups to pointerdb requests
* Pointer lookup code is added for Get
* adds feature flag for pointerdb return
* refactors pointerdb code
* removes some unnecessary debug logs
* Fixes indent in config
* adds early return for non-remote pointers
* formats code, removes some comments
* Fixes tests broken by pointer proto changes
* adds error check and merges variable declaration
* removes commented out proto import
* adds error check to pdbclient
* pkg structure and repair queue implementation
* adds zeebo
* gets redis working with queue
* modifies interface
* changes re feedback
* pr changes w encoding and enqueue dequeue modifications
* test force error
* concurrent enqueue/dequeue
* refactor sequential to use only 1 slice
* added token for time conflicts
* begin adding encryption for remote pieces
* begin adding decryption
* add encryption key as arg to Put and Get
* move encryption/decryption to object store
* Add encryption key to object store constructor
* Add the erasure scheme to object store constructor
* Ensure decrypter is initialized with the stripe size used by encrypter
* Revert "Ensure decrypter is initialized with the stripe size used by encrypter"
This reverts commit 07272333f461606edfb43ad106cc152f37a3bd46.
* Revert "Add the erasure scheme to object store constructor"
This reverts commit ea5e793b536159d993b96e3db69a37c1656a193c.
* move encryption to stream store
* move decryption stuff to stream store
* revert changes in object store
* add encryptedBlockSize and close rangers on error during Get
* calculate padding sizes correctly
* encryptedBlockSize -> encryptionBlockSize
* pass encryption key and block size into stream store
* remove encryption key and block size from object store constructor
* move encrypter/decrypter initialization
* remove unnecessary cast
* Fix padding issue
* Fix linter
* add todos
* use random encryption key for data encryption. Store an encrypted copy of this key in segment metadata
* use different encryption key for each segment
* encrypt data in one step if it is small enough
* refactor and move encryption stuff
* fix errors related to nil slices passed to copy
* fix encrypter vs. decrypter bug
* put encryption stuff in eestream
* get captplanet test to pass
* fix linting errors
* add types for encryption keys/nonces and clean up
* fix tests
* more review changes
* add Cipher type for encryption stuff
* fix rs_test
* Simplify type casting of key and nonce
* Init starting nonce to the segment index
* don't copy derived key
* remove default encryption key; force user to explicitly set it
* move getSegmentPath to streams package
* dont require user to specify encryption key for captplanet
* rename GenericKey and GenericNonce to Key and Nonce
* review changes
* fix linting error
* Download uses the encryption type from metadata
* Store enc block size in metadata and use it for download
* storage node quick check and startup validation
* rearranged the startup validation and quick check logic
* travis lint warning fixes
* travis lint warning fixes
* travis lint warning fixes
* code changes per review comments
* code clean dev debug info
* travis lint wranings
* code changes per code review comments
* code changes per code review comments
* code update per review
* sqlite SUM is having issue when getting the SUM of an empty column; filepath was checking a directory that doesn't exist when starting server; Example updated to print allocated and used space
* storage node quick check and startup validation
* rearranged the startup validation and quick check logic
* travis lint warning fixes
* travis lint warning fixes
* travis lint warning fixes
* code changes per review comments
* code clean dev debug info
* travis lint wranings
* code changes per code review comments
* code changes per code review comments
* code update per review
* no file or directory error
* Updated mock PSClient