This is based on Jeff's most excellent work to identify why
non-recursive listing under postgreskv was phenomenally slow. It turns
out PostgreSQL's query planner was actually using two sequential scans
of the pathdata table to do its job. It's unclear for how long that has
been happening, but obviously it won't scale any further.
The main change is propagating bucket association with pathnames through
the CTE so that the query planner lets itself use the pathdata index on
(bucket, fullpath) for the skipping-forward part.
Jeff also had some changes to the range ends to keep NULL from being
used, I believe with the intent of making sure the query planner was
able to use the pathdata index. My tests on postgres 9.6 and 11
indicate that those changes don't make any appreciable difference in
performance or query plan, so I'm going to leave them off for now to
avoid a careful audit of the semantic differences.
There is a test included here, which only serves to check that the new
version of the function is indeed active. To actually ensure that no
sequential scans are being used in the query plan anymore, our tests
would need to be run against a test db with lots of data already loaded
in it, and that isn't feasible for now.
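As a rough sketch of what such a check could look like once a loaded test db is available, the helper below scans EXPLAIN text output for a sequential scan on a table. The function name and the sample plan text (including the index name) are made up for illustration; this is a coarse substring match, not a plan parser, and it would also match tables whose names merely start with the given name.

```go
package main

import (
	"fmt"
	"strings"
)

// hasSeqScan reports whether EXPLAIN text output mentions a sequential
// scan on the given table. It is a plain substring match per line.
func hasSeqScan(plan, table string) bool {
	needle := "Seq Scan on " + table
	for _, line := range strings.Split(plan, "\n") {
		if strings.Contains(line, needle) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical plan text, written by hand for illustration only.
	plan := "Index Scan using pathdata_idx on pathdata pd\n  Index Cond: (bucket = 'b')"
	fmt.Println(hasSeqScan(plan, "pathdata"))
}
```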
Change-Id: Iffe9a1f411c54a2f742a4abb8f2df0c64fd662cb
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go
..instead of filestore.Store. this will allow filestore.Store to become
unexported.
* unexport filestore.Store
rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).
* kill filestore.StoreForTest
now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
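The general pattern (an unexported concrete type reachable only through an exported interface) can be sketched as follows. The two-method Blobs interface here is a hypothetical stand-in and much smaller than the real storage.Blobs; only the shape is the point.

```go
package main

import (
	"errors"
	"fmt"
)

// Blobs is a stand-in for an exported storage interface; this method
// set is hypothetical and kept tiny for the example.
type Blobs interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
}

// blobStore is unexported, so other packages can only hold it as Blobs.
type blobStore struct {
	blobs map[string][]byte
}

// NewBlobs returns the concrete store behind the interface.
func NewBlobs() Blobs {
	return &blobStore{blobs: make(map[string][]byte)}
}

// Put stores a copy of data under key.
func (s *blobStore) Put(key string, data []byte) error {
	s.blobs[key] = append([]byte(nil), data...)
	return nil
}

// Get returns the data stored under key, or an error if it is missing.
func (s *blobStore) Get(key string) ([]byte, error) {
	data, ok := s.blobs[key]
	if !ok {
		return nil, errors.New("blob not found: " + key)
	}
	return data, nil
}

func main() {
	store := NewBlobs()
	_ = store.Put("a", []byte("hello"))
	data, err := store.Get("a")
	fmt.Println(string(data), err)
}
```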
We don't use reverse listing in any of our code, outside of tests, and
it is only exposed through libuplink in the
lib/uplink.(*Project).ListBuckets() API. We also don't know of any users
who might have a need for reverse listing through ListBuckets().
Since one of our prospective pointerdb backends can not support
backwards iteration, and because of the above considerations, we are
going to remove the reverse listing feature.
Change-Id: I8d2a1f33d01ee70b79918d584b8c671f57eef2a0
* separate sadb migration, add version check
* update checkversion to do same validation as migration
* changes per CR
* add sa migration to storj-sim
* add different debug port in storj-sim for migration
* add wait for exit for storj-sim migration
* update sa docker entrypoint to support migration
* storj-sim satellite parts all wait for migration
* upgrade golang-migrate/migrate to v4 because bug
* fix go mod tidy
* set up redis support in live accounting
* move live.Service interface into accounting package and rename to Cache, pass into satellite
* refactor Cache to store one int64 total, add IncrBy method to redis client implementation
* add monkit tracing to live accounting
* add cache, update cache w/piece create/delete
* add service w/loop to cache to recalculate space used cache
* add piecestore cache to other sn svcs to use
* add table to persist the total space used
* rm cache where not needed
* rm stuff from sn svcs
* start fixing tests, changes per comments
* update commits
* add unit tests
* fix committing before we write header bytes
* fix cache create test
* copy cache map, add started back to recalc
* fix test
* add test, update comments
Deprecate the pieceinfo database, and start storing piece info as a header to
piece files. Institute a "storage format version" concept allowing us to handle
pieces stored under multiple different types of storage. Add a piece_expirations
table which will still be used to track expiration times, so we can query it, but
which should be much smaller than the pieceinfo database would be for the
same number of pieces. (Only pieces with expiration times need to be stored in
piece_expirations, and we don't need to store large byte blobs like the serialized
order limit, etc.) Use specialized names for accessing any functionality related
only to dealing with V0 pieces (e.g., `store.V0PieceInfo()`). Move SpaceUsed-
type functionality under the purview of the piece store. Add some generic
interfaces for traversing all blobs or all pieces. Add lots of tests.
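To illustrate the idea of a versioned header at the front of a piece file, here is a minimal sketch. The layout (1-byte format version, 2-byte big-endian header length, header, then data) and all names are assumptions made for this example, not the actual on-disk format.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// formatV1 is a hypothetical storage format version tag.
const formatV1 = 1

// encodePiece prepends a 1-byte format version and a 2-byte big-endian
// header length to the header and piece data.
func encodePiece(header, data []byte) ([]byte, error) {
	if len(header) > 0xFFFF {
		return nil, errors.New("header too large")
	}
	var buf bytes.Buffer
	buf.WriteByte(formatV1)
	var n [2]byte
	binary.BigEndian.PutUint16(n[:], uint16(len(header)))
	buf.Write(n[:])
	buf.Write(header)
	buf.Write(data)
	return buf.Bytes(), nil
}

// decodePiece checks the version byte and splits the blob back into
// header and data.
func decodePiece(blob []byte) (header, data []byte, err error) {
	if len(blob) < 3 {
		return nil, nil, errors.New("blob too short")
	}
	if blob[0] != formatV1 {
		return nil, nil, fmt.Errorf("unsupported format version %d", blob[0])
	}
	n := int(binary.BigEndian.Uint16(blob[1:3]))
	if len(blob) < 3+n {
		return nil, nil, errors.New("truncated header")
	}
	return blob[3 : 3+n], blob[3+n:], nil
}

func main() {
	blob, _ := encodePiece([]byte("hdr"), []byte("piece data"))
	header, data, err := decodePiece(blob)
	fmt.Println(string(header), string(data), err)
}
```

Dispatching on the leading version byte is what would let a reader handle pieces written under different formats side by side.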
* pkg/datarepair/repairer: Always track time for repair
Make a minor change in the worker function of the repairer so that, on
success, the time-for-repair metric is always tracked, regardless of
whether the time-since-checker-queue metric can be tracked.
* storage/postgreskv: Wrap error in Get func
Wrap the returned error of the Get function as it is done when the
query doesn't return any row.
* satellite/metainfo: Move debug msg to the right place
NewStore function was writing a debug log message when the DB was
connected; however, it wrote the message even when an error occurred
while getting the connection.
* pkg/datarepair/repairer: Wrap error before logging it
Wrap the error returned by process which is executed by the Run method
of the repairer service to add context to the error log message.
* pkg/datarepair/repairer: Make errors more specific in worker
Make the error messages of the "worker" method of the Service, and
the messages logged for those errors, more specific.
* pkg/storage/repair: Improve error reporting Repair
To improve the error reporting of the
pkg/storage/repair.Repair method, several errors returned by this method
and by the functions/methods it relies on are now wrapped in their
corresponding error classes.
* pkg/storage/segments: Track path param of Repair method
Track in monkit the path parameter passed to the Repair method.
* satellite/satellitedb: Wrap Error returned by Delete
Wrap the error returned by the repairQueue.Delete method to enhance the
error with a class and stack, so that the
pkg/storage/segments.Repairer.Repair method gets a more contextualized
error from it.
* added scopelint and corrected issues found
* corrected scopelint issue
* made updates based on Ivan's suggestions
Most were around naming conventions
Some were false positives, but I kept them since the test.Run could eventually be changed to run in parallel, which could cause a bug
Others were false positives. Added // nolint: scopelint
* first round cleanup based on go-critic
* more issues resolved for ifelsechain and unlambda checks
* updated from master and gocritic found a new ifElseChain issue
* disable appendAssign. it reports false positives
* re-enabled go-critic appendAssign and disabled lint check at code line level
* fixed go-critic lint error
* fixed // nolint add gocritic specifically
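For context on the scopelint items above: the linter flags closures that capture a range variable. A minimal sketch of the usual fix, an explicit per-iteration copy, is shown here with a hypothetical helper. The copy is harmless on Go versions where each iteration already gets its own variable.

```go
package main

import "fmt"

// collect builds one closure per name; each closure returns its own name.
func collect(names []string) []func() string {
	var fns []func() string
	for _, name := range names {
		name := name // explicit per-iteration copy, so the closure owns its value
		fns = append(fns, func() string { return name })
	}
	return fns
}

func main() {
	for _, fn := range collect([]string{"a", "b", "c"}) {
		fmt.Println(fn())
	}
}
```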
* add flags to storj-sim for SA dbs
* add schema to postgres
* add createschema with parse to sa
* add metainfo db postgres support
* add kv default as bolt
* add debug log to see db source
* add env var for postgres to test-sim.sh
* fix lint errs
* dynamically add postgres to args
* add postgres to integration tests
* add sqlite and postgres integration jenkins
* fix db name
* merge integration tests into one step
* test integration tests w/psql
* try using different schema
* debug failure
* use correct host for running storj-sim
* rm sqlite integration
* add back integration
* add boltDB batching for Put operation, add benchmark test
* add batchPut method to kademlia routingTable
* add BatchPut method for other KeyValueStore to satisfy interface
* return err not implemented
* add noSync to boltdb client
* rm boltDB noSync
* make batch block and fix tests
* changes per CR
* rm test setting so it matches prod code behavior
* fix lint errs
* initial test
* add parenthesis
* remove pipeline
* add few todos
* use docker image for environment
* use pipeline
* fix
* add missing steps
* invoke with bash
* disable protoc
* try using golang image
* try as root
* Disable install-awscli.sh temporarily
* Debugging
* debugging part 2
* Set absolute path for debugging
* Remove absolute path
* Dont run as root
* Install unzip
* Dont forget to apt-get update
* Put into folder that is in PATH
* disable IPv6 Test
* add verbose info and check protobuf
* make integration non-parallel
* remove -v and make checkout part of build
* make a single block for linting
* fix echo
* update
* try using things directly
* try add xunit output
* fix name
* don't print empty lines
* skip testsuites without any tests
* remove coverage, because it's not showing the right thing
* try using dockerfile
* fix deb source
* fix typos
* setup postgres
* use the right flag
* try using postgresdb
* expose different port
* remove port mapping
* start postgres
* export
* use env block
* try using different host for integration tests
* eat standard ports
* try building images and binaries
* remove if statement
* add steps
* do before verification
* add go get goversioninfo
* make separate jenkinsfile
* add check
* don't add empty packages
* disable logging to reduce output size
* add timeout
* add comment about mfridman
* Revert Absolute Path
* Add aws to PATH
* PATH Changes
* Docker Env Fixes
* PATH Simplification
* Debugging the PATH
* Debug Logs
* Debugging
* Update PATH Handling
* Rename
* revert changes to Jenkinsfile
* preparing for use of `customtype` gogo extension with `NodeID` type
* review changes
* wip
* tests passing
* wip fixing tests
* more wip test fixing
* remove NodeIDList from proto files
* linter fixes
* linter fixes
* linter/review fixes
* more freaking linter fixes
* omg just kill me - linterrrrrrrr
* travis linter, i will murder you and your family in your sleep
* goimports everything - burn in hell travis
* goimports update
* go mod tidy
..although it ought to work for other storage.KeyValueStore needs as
well. it's just optimized to work pretty well for a largish hierarchy of
paths.
This includes the addition of "long benchmarks" for KeyValueStore
testing. These will only be run when -test-bench-long is added to the
test flags. In these benchmarks, a large corpus of paths matching a
natural ("real-life") hierarchy is read from paths.data.gz (which you
can get from https://github.com/storj/path-test-corpus) and imported
into a particular KeyValueStore. Recursive and non-recursive queries are
run on it to detect performance problems that arise only at scale.
This also includes an alternate implementation of the postgreskv client,
which works in a less-bizarre way for non-recursive queries, but suffers
from poor performance in tests such as the long benchmarks. Once this
alternate impl is committed to the tree, we can remove it again; I just
want it to be available for future reference.
* Let's do it right this time
* Oh travis...
* Handle redis URL
* Travis... why u gotta be like this?
* Handle when address does not use redis scheme
* Start repairer
* Match provider.Responsibility interface
* Simplify if statement
* Config doesn't need to be a pointer
* Initialize doesn't need to be exported
* Don't run checker or repairer on startup
* Fix travis complaints
1. Added KeyValueStore.Iterate for implementing the different List, ListV2 etc. implementations. This allows for more efficient use of memory depending on the situation.
2. Implemented an in-memory teststore for running tests. This should make it possible to replace MockKeyValueStore in most places.
3. Rewrote tests
4. Pulled out logger from bolt implementation so it can be used for all other storage implementations.
5. Fixed multiple things in bolt and redis implementations.
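A sketch of how a List can be layered on a callback-style Iterate is below. The signatures are hypothetical and do not mirror the real KeyValueStore.Iterate. Note that this toy store sorts all matching keys up front, so it only shows the early-stop callback shape, not the memory savings a real backend could achieve.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// memStore is a hypothetical in-memory key/value store.
type memStore struct {
	items map[string]string
}

// Iterate calls fn for each key with the given prefix, in sorted order,
// until fn returns false.
func (s *memStore) Iterate(prefix string, fn func(key, value string) bool) {
	keys := make([]string, 0, len(s.items))
	for k := range s.items {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	for _, k := range keys {
		if !fn(k, s.items[k]) {
			return
		}
	}
}

// List is built on Iterate and stops as soon as limit keys are collected.
func (s *memStore) List(prefix string, limit int) []string {
	if limit <= 0 {
		return nil
	}
	var out []string
	s.Iterate(prefix, func(key, _ string) bool {
		out = append(out, key)
		return len(out) < limit
	})
	return out
}

func main() {
	s := &memStore{items: map[string]string{"a/1": "x", "a/2": "y", "b/1": "z"}}
	fmt.Println(s.List("a/", 10))
}
```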
* Don't use url.Parse for bolt paths: filepaths may not be valid URLs.
* go.mod: update dependencies
* README.md: add Windows instructions
* pkg/overlay: check for the correct path and text in error
* pkg/overlay: fix tests for windows
* pkg/piecestore: make windows tests pass
* pkg/telemetry: skip test, as it doesn't shut down nicely
* storage/redis: ensure that redis is clean before running tests
Fixes go1.11 vet warnings.
Cancel on WithTimeout must always be called to avoid memory leak:
pkg/provider/provider.go:73: the cancel function returned by context.WithTimeout should be called, not discarded, to avoid a context leak
Range over non-copyable things:
pkg/pool/connection_pool_test.go:32: range var v copies lock: struct{pool pool.ConnectionPool; key string; expected pool.TestFoo; expectedError error} contains pool.ConnectionPool contains sync.RWMutex
pkg/pool/connection_pool_test.go:56: range var v copies lock: struct{pool pool.ConnectionPool; key string; value pool.TestFoo; expected pool.TestFoo; expectedError error} contains pool.ConnectionPool contains sync.RWMutex
pkg/pool/connection_pool_test.go:83: range var v copies lock: struct{pool pool.ConnectionPool; key string; value pool.TestFoo; expected interface{}; expectedError error} contains pool.ConnectionPool contains sync.RWMutex
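One way to avoid copying a lock in table-driven tests is to hold the lock-bearing value by pointer and iterate by index. The types below are hypothetical stand-ins, not the actual pool package.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a stand-in for a type that embeds a lock and must not be copied.
type pool struct {
	mu    sync.RWMutex
	items map[string]int
}

// get reads a value under the read lock.
func (p *pool) get(key string) int {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.items[key]
}

// testCase holds the pool by pointer, so copying a case never copies the lock.
type testCase struct {
	pool     *pool
	key      string
	expected int
}

// runCases iterates by index and returns the number of mismatches.
func runCases(cases []testCase) int {
	failures := 0
	for i := range cases {
		tc := &cases[i]
		if tc.pool.get(tc.key) != tc.expected {
			failures++
		}
	}
	return failures
}

func main() {
	p := &pool{items: map[string]int{"a": 1}}
	fmt.Println(runCases([]testCase{{pool: p, key: "a", expected: 1}}))
}
```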
zeebo/errs package always requires formatting directives:
pkg/peertls/peertls.go:50: Class.New call has arguments but no formatting directives
pkg/peertls/utils.go:47: Class.New call has arguments but no formatting directives
pkg/peertls/utils.go:87: Class.New call has arguments but no formatting directives
pkg/overlay/cache.go:94: Class.New call has arguments but no formatting directives
pkg/provider/certificate_authority.go:98: New call has arguments but no formatting directives
pkg/provider/identity.go:96: New call has arguments but no formatting directives
pkg/provider/utils.go:124: New call needs 1 arg but has 2 args
pkg/provider/utils.go:136: New call needs 1 arg but has 2 args
storage/redis/client.go:44: Class.New call has arguments but no formatting directives
storage/redis/client.go:64: Class.New call has arguments but no formatting directives
storage/redis/client.go:75: Class.New call has arguments but no formatting directives
storage/redis/client.go:80: Class.New call has arguments but no formatting directives
storage/redis/client.go:92: Class.New call has arguments but no formatting directives
storage/redis/client.go:96: Class.New call has arguments but no formatting directives
storage/redis/client.go:102: Class.New call has arguments but no formatting directives
storage/redis/client.go:126: Class.New call has arguments but no formatting directives
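The same vet check applies to the standard library's fmt.Errorf, which is used here as a stand-in for the errs calls listed above: every extra argument needs a matching directive in the format string. The connect function is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// connect always fails, to show how the cause is folded into the message.
func connect(addr string) error {
	cause := errors.New("connection refused")
	// vet would flag fmt.Errorf("failed to connect", cause): the call has
	// an argument but no formatting directive to consume it.
	return fmt.Errorf("failed to connect to %s: %v", addr, cause)
}

func main() {
	fmt.Println(connect("127.0.0.1:6379"))
}
```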
* adds foundation for bucketStore
* adds prefixedObjStore to buckets package, adjusts gateway-storj accordingly
* fixes multi value assignment problems in gateway-storj
* fixes more multi value assignment errors in gateway-storj
* starts changing miniogw tests to accommodate buckets
* creates bucket store mock
* wip - fixing test cases in object tests
* adds get, put, and list object tests, comments out two test cases
* adds happy scenario tests for bucket methods
* fixes bug in list, removes redundant parts from gateway tests
* fixes nit
* Clean up tests from #188
* Fix bug with timestamp conversion in segment store
* fixes segments.Meta test
* Fix regression in listing objects in a bucket
* adds check to see if bucket is empty before deleting
* updates DeleteBucket test to account for empty/full bucket
* adds TODOs for DeleteBucket and MakeBucket for some cases, adjusts tests, filters out minio errors in logging.go
* adds checks for if buckets already exist or not in DeleteBucket and MakeBucket functions; adjusts tests
* adds BucketNotFound error check in bucket store, removes todo
* adds make_bucket to Travis test, updates boltdb client constructor to always create a bucket (table)
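The bucket checks described above (already exists, not found, not empty) can be sketched with an in-memory store. All names and error values here are hypothetical and not the actual buckets package.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errBucketExists   = errors.New("bucket already exists")
	errBucketNotFound = errors.New("bucket not found")
	errBucketNotEmpty = errors.New("bucket not empty")
)

// bucketStore maps bucket names to their objects.
type bucketStore struct {
	buckets map[string]map[string][]byte
}

func newBucketStore() *bucketStore {
	return &bucketStore{buckets: make(map[string]map[string][]byte)}
}

// MakeBucket refuses to overwrite an existing bucket.
func (s *bucketStore) MakeBucket(name string) error {
	if _, ok := s.buckets[name]; ok {
		return errBucketExists
	}
	s.buckets[name] = make(map[string][]byte)
	return nil
}

// PutObject stores data under key in an existing bucket.
func (s *bucketStore) PutObject(bucket, key string, data []byte) error {
	objs, ok := s.buckets[bucket]
	if !ok {
		return errBucketNotFound
	}
	objs[key] = data
	return nil
}

// DeleteBucket only removes buckets that exist and are empty.
func (s *bucketStore) DeleteBucket(name string) error {
	objs, ok := s.buckets[name]
	if !ok {
		return errBucketNotFound
	}
	if len(objs) > 0 {
		return errBucketNotEmpty
	}
	delete(s.buckets, name)
	return nil
}

func main() {
	s := newBucketStore()
	_ = s.MakeBucket("photos")
	_ = s.PutObject("photos", "cat.jpg", []byte{1})
	fmt.Println(s.DeleteBucket("photos"))
}
```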
* Unit test coverage increased for kademlia pkg
go style formatting added
Removed DHT param from newTestKademlia method, added comments for Bucket methods noting that these tests will need to be updated
unnecessary comment deleted from newTestKademlia
Adjust Segment Store to the updated interface (#160)
* Adjust Segment Store to the updated interface
* Move /pkg/storage/segment to /pkg/storage/segments
* Fix overlay client tests
* Revert changes in NewOverlayClient return value
* Rename `rem` to `seg`
* Implement Meta()
captplanet (#159)
* captplanet
I kind of went overboard this weekend.
The major goal of this changeset is to provide an environment
for local development where all of the various services can
be easily run together. Developing on Storj v3 should be as
easy as running a setup command and a run command!
To do this, this changeset introduces a new tool called
captplanet, which combines the powers of the Overlay Cache,
the PointerDB, the PieceStore, Kademlia, the Minio Gateway,
etc.
Running 40 farmers and a heavy client inside the same process
forced a rethinking of the "services" that we had. To
avoid confusion by reusing prior terms, this changeset
introduces two new types: Providers and Responsibilities.
I wanted to avoid as many merge conflicts as possible, so
I left the existing Services and code for now, but if people
like this route we can clean up the duplication.
A Responsibility is a collection of gRPC methods and
corresponding state. The following systems are examples of
Responsibilities:
* Kademlia
* OverlayCache
* PointerDB
* StatDB
* PieceStore
* etc.
A Provider is a collection of Responsibilities that
share an Identity, such as:
* The heavy client
* The farmer
* The gateway
An Identity is a public/private key pair, a node id, etc.
Farmers all need different Identities, so captplanet
needs to support running multiple concurrent Providers
with different Identities.
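A rough sketch of how these three concepts relate is below. The types are simplified stand-ins written for this example (an Identity is reduced to a node ID, and a Responsibility to a name plus a Run method); they are not the interfaces in pkg/provider.

```go
package main

import "fmt"

// Identity is a hypothetical, simplified identity: just a node ID here.
type Identity struct {
	NodeID string
}

// Responsibility is a stand-in; a real one would register gRPC methods.
type Responsibility interface {
	Name() string
	Run(id Identity) error
}

// Provider groups Responsibilities under one shared Identity.
type Provider struct {
	identity Identity
	resps    []Responsibility
}

// NewProvider builds a Provider from an Identity and its Responsibilities.
func NewProvider(id Identity, resps ...Responsibility) *Provider {
	return &Provider{identity: id, resps: resps}
}

// Run starts each Responsibility in order with the shared Identity and
// stops at the first error.
func (p *Provider) Run() error {
	for _, r := range p.resps {
		if err := r.Run(p.identity); err != nil {
			return fmt.Errorf("%s: %w", r.Name(), err)
		}
	}
	return nil
}

// recorder is a toy Responsibility that records the identity it ran under.
type recorder struct {
	name string
	seen []string
}

func (r *recorder) Name() string { return r.name }

func (r *recorder) Run(id Identity) error {
	r.seen = append(r.seen, id.NodeID)
	return nil
}

func main() {
	// Two farmers with different identities in the same process.
	for _, nodeID := range []string{"farmer-1", "farmer-2"} {
		ps := &recorder{name: "piecestore"}
		_ = NewProvider(Identity{NodeID: nodeID}, ps).Run()
		fmt.Println(ps.name, ps.seen)
	}
}
```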
Each Responsibility and Provider should allow for configuration
of multiple copies on its own so creating Responsibilities and
Providers use a new workflow.
To make a Responsibility, one should create a "config"
struct, such as:
```
type Config struct {
	RepairThreshold  int `help:"If redundancy falls below this number of pieces, repair is triggered" default:"30"`
	SuccessThreshold int `help:"If redundancy is above this number then no additional uploads are needed" default:"40"`
}
```
To use "config" structs, this changeset introduces another
new library called 'cfgstruct', which allows for the configuration
of arbitrary structs through flagsets, and thus through cobra and
viper.
cfgstruct relies on Go's "struct tags" feature to document
help information and default values. Config structs can be
configured via cfgstruct.Bind for binding the struct to a flagset.
Because this configuration system makes setup and configuration
easier *in general*, additional commands are provided that allow
for easy standup of separate Providers. Please make sure to
check out:
* cmd/captplanet/farmer/main.go (a new farmer binary)
* cmd/captplanet/hc/main.go (a new heavy client binary)
* cmd/captplanet/gw/main.go (a new minio gateway binary)
Usage:
```
$ go install -v storj.io/storj/cmd/captplanet
$ captplanet setup
$ captplanet run
```
Configuration is placed by default in `~/.storj/capt/`
Other changes:
* introduces new config structs for currently existing
Responsibilities that conform to the new Responsibility
interface. Please see the `pkg/*/config.go` files for
examples.
* integrates the PointerDB API key with other global
configuration via flags, instead of through environment
variables through viper like it's been doing. (ultimately
this should also change to use the PointerDB config
struct but this is an okay short-term solution).
* changes the Overlay cache to use a URL for database
configuration instead of separate redis and bolt config
settings.
* stubs out some peer identity skeleton code (but not the
meat).
* Fixes the SegmentStore to use the overlay client and
pointerdb clients instead of gRPC client code directly
* Leaves a very clear spot where we need to tie the object to
stream to segment store together. There's sort of a "golden
spike" opportunity to connect all the train tracks together
at the bottom of pkg/miniogw/config.go, labeled with a
bunch of TODOs.
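As an illustration of the URL-style database configuration mentioned in the list above, the sketch below splits a "driver://source" string by scheme. The function name and the exact accepted forms are assumptions; it deliberately avoids url.Parse so that file paths need not be valid URLs.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDBURL splits "driver://source" into its two parts and accepts
// only the bolt and redis drivers.
func splitDBURL(s string) (driver, source string, err error) {
	parts := strings.SplitN(s, "://", 2)
	if len(parts) != 2 || parts[0] == "" {
		return "", "", fmt.Errorf("database URL %q has no scheme", s)
	}
	switch parts[0] {
	case "bolt", "redis":
		return parts[0], parts[1], nil
	default:
		return "", "", fmt.Errorf("unsupported database driver %q", parts[0])
	}
}

func main() {
	fmt.Println(splitDBURL("bolt:///tmp/overlay.db"))
	fmt.Println(splitDBURL("redis://127.0.0.1:6379"))
}
```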
Future stuff:
* I now prefer this design over the original
pkg/process.Service thing I had been pushing before (sorry!)
* The experience of trying to have multiple farmers
configurable concurrently led me to prefer config structs
over global flags (I finally came around) or using viper
directly. I think global flags are okay sometimes but in
general going forward we should try and get all relevant
config into config structs.
* If you all like this direction, I think we can go delete my
old Service interfaces and a bunch of flags and clean up a
bunch of stuff.
* If you don't like this direction, it's no sweat at all, and
despite how much code there is here I'm not very tied to any
of this! Considering a lot of this was written between midnight
and 6 am, it might not be any good!
* bind tests
Add files for testing builds in docker (#161)
* Add files for testing builds in docker
* Make tests check for redis running before trying to start redis-server, which may not exist.
* Clean redis server before any tests use it.
* Add more debugging for travis
* Explicitly requiring redis for travis
pkg/provider: with pkg/provider merged, make a single heavy client binary, gateway binary, and deprecate old services (#165)
* pkg/provider: with pkg/provider merged, make a single heavy client binary and deprecate old services
* add setup to gw binary too
* captplanet: output what addresses everything is listening on
* revert peertls/io_util changes
* define config flag across all commands
* use trimsuffix
fix docker makefile (#170)
* fix makefile
protos: update protobufs with go generate (#169)
the import for timestamp and duration should use
the path provided by a standard protocol buffer library
installation
Refactor List in PointerDB (#163)
* Refactor List in Pointer DB
* Fix pointerdb-client example
* Fix issue in Path type related to empty paths
* Test for the PointerDB service with some fixes
* Fixed debug message in example: trancated --> more
* GoDoc comments for unexported methods
* TODO comment to check if Put is overwriting
* Log warning if protobuf timestamp cannot be converted
* TODO comment to make ListPageLimit configurable
* Rename 'segment' package to 'segments' to reflect folder name
Minio integration with Object store (#156)
* initial WIP integration with Object store
* List WIP
* minio listobject function changes complete
* Code review changes and work in progress for the mock objectstore unit testing cases
* Warning fix redeclaration of err
* Warning fix redeclaration of err
* code review comments & unit testing in progress
* fix compilation bug
* Fixed code review comments & added GetObject Mock test case
* rearranged the mock test file and gateway storj test file into the proper directory
* added the missing file
* code clean up
* fix lint error on the mock generated code
* modified per code review comments
* added the PutObject mock test case
* added the GetObjectInfo mock test case
* added listobject mock test case
* fixed package from storj to miniogw
* resolved the gateway-storj.go initialization merge conflict
update readme (#174)
added assertion for unused errors (#152)
merging this PR to avoid future issues
updating github user to personal account (#171)
Test coverage ranger (#168)
* Fixed go panic for corner case
* Initial test coverage for ranger pkg
streamstore: add passthrough implementation (#176)
this doesn't implement streamstore, this just allows us to try and
get the june demo working again in the meantime
StatDB (#144)
* add statdb proto and example client
* server logic
* update readme
* remove boltdb from service.go
* sqlite3
* add statdb server executable file
* create statdb node table if it does not exist already
* get UpdateBatch working
* update based on jt review
* remove some commented lines
* fix linting issues
* reformat
* apiKey -> APIKey
* update statdb client apiKey->APIKey
Update README.md
Update README.md
overlay: correct dockerfile db (#179)
cmd/hc, cmd/gw, cmd/captplanet: simplify setup/run commands (#178)
also allows much more customization of services within captain planet,
such as reconfiguring the overlay service to use redis
pkg/process: don't require json formatting (#177)
Cleanup metadata across layers (#180)
* Cleanup metadata across layers
* Fix pointer db tests
Kademlia Routing Table (#164)
* adds comment
* runs deps
* creates boltdb kademlia routing table
* protobuf updates
* adds reverselist to mockkeyvaluestore interface
* xor wip
* xor wip
* fixes xor sort
* runs go fmt
* fixes
* goimports again
* trying to fix travis tests
* fixes mock tests
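For reference, sorting node IDs by XOR distance to a target, the ordering Kademlia relies on, can be sketched as below. The helper names are hypothetical, and all IDs are assumed to have the same length as the target.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// xor returns a XOR b for equal-length IDs.
func xor(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// sortByXOR sorts ids in place by XOR distance to target, closest first.
// Comparing the XOR results bytewise equals comparing them as
// big-endian integers.
func sortByXOR(ids [][]byte, target []byte) {
	sort.Slice(ids, func(i, j int) bool {
		return bytes.Compare(xor(ids[i], target), xor(ids[j], target)) < 0
	})
}

func main() {
	target := []byte{0x0F}
	ids := [][]byte{{0xFF}, {0x0E}, {0x00}}
	sortByXOR(ids, target)
	fmt.Printf("%x\n", ids)
}
```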
Ranger refactoring (#158)
* Fixed go panic for corner case
* Cosmetic changes, and small error fixes
miniogw: log all errors (#182)
* miniogw: log all errors
* tests added
* doc comment to satisfy linter
* fix test failure
Jennifer added to CLA list
* Temporary fix for storage/redis list method test
* adds netstate rpc server pagination, mocks pagination in test/util.go
* updates ns client example, combines ns client and server test to netstate_test, adds pagination to bolt client
* better organizes netstate test calls
* wip breaking netstate test into smaller tests
* wip modularizing netstate tests
* adds some test panics
* wip netstate test attempts
* testing bug in netstate TestDeleteAuth
* wip fixes global variable problem, still issues with list
* wip fixes get request params and args
* fixes bug in path when using MakePointers helper fn
* updates mockdb list func, adds test, changes Limit to int
* fixes merge conflicts
* fixes broken tests from merge
* remove unnecessary PointerEntry struct
* removes error when Get returns nil value from boltdb
* breaks boltdb client tests into smaller tests
* renames AssertNoErr test helper to HandleErr
* adds StartingKey and Limit parameters to redis list func, adds beginning of redis tests
* adds helper func for mockdb List function
* if no starting key provided for netstate List, the first value in storage will be used
* adds basic pagination for redis List function, adds tests
* adds list limit to call in overlay/server.go
* streamlines/fixes some nits from review
* removes use of obsolete EncryptedUnencryptedSize
* uses MockKeyValueStore instead of redis instance in redis client test
* changes test to expect nil returned for getting missing key
* remove error from `KeyValueStore#Get`
* fix bolt test
* Merge pull request #1 from bryanchriswhite/nat-pagination
remove error from `KeyValueStore#Get`
* adds Get returning error back to KeyValueStore interface and affected clients
* trying to appease travis: returns errors in Get calls in overlay/cache and cache_test
* handles redis get error when no key found
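The StartingKey/Limit pagination described in the list above can be sketched over a sorted key set. Treating the starting key as inclusive and a non-positive limit as "no limit" are assumptions of this example, as is the function name.

```go
package main

import (
	"fmt"
	"sort"
)

// list returns up to limit keys in sorted order, starting at startingKey
// (inclusive). An empty startingKey starts from the first key.
func list(keys []string, startingKey string, limit int) []string {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)

	start := 0
	if startingKey != "" {
		start = sort.SearchStrings(sorted, startingKey)
	}
	end := start + limit
	if limit <= 0 || end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end]
}

func main() {
	keys := []string{"b", "a", "d", "c"}
	fmt.Println(list(keys, "", 2))
	fmt.Println(list(keys, "c", 2))
}
```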