Commit Graph

331 Commits

Author SHA1 Message Date
littleskunk
c52c7275ad
satellite/repair: reduce upload timeout (#3597) 2019-11-18 18:52:56 +01:00
Nikolai Siedov
3fe518d547
satellite: added ability to inject stripe public key post build (#3560) 2019-11-18 13:38:43 +02:00
Yaroslav Vorobiov
53c6741ba6
satellite/payments: add API for retrieving conversion ratio, convert tokens to USD before applying to balance (#3530) 2019-11-15 16:59:39 +02:00
Yehor Butko
a8e4e9cb03
satellite/payments: project usage charges (#3512) 2019-11-15 16:27:44 +02:00
Natalie Villasana
1a9757a7f2 satellite/gracefulexit: add count for order limits sent from satellite to exiting node (#3544) 2019-11-13 09:54:50 -05:00
Yaroslav Vorobiov
0b32690d0a satellite/peer: add payments config (#3488)
* satellite/peer: add payments config

* remove stripe-key from console config

* update config lock

* fix imports

* fix config-lock
2019-11-05 21:26:19 +01:00
littleskunk
def3dcbaa9
satellite/audit: increase timeout to 5 minutes (#3480)
* satellite/audit: increase timeout to 5 minutes

* fix lint error
2019-11-05 11:21:25 +01:00
Maximillian von Briesen
590312970d satellite/gracefulexit: add flag for enabling/disabling graceful exit on the satellite (#3437) 2019-11-01 16:21:24 +02:00
Maximillian von Briesen
d9bb25b4b9 satellite/metainfo: support a wider range of values for RS.Total in satellite metainfo validation (#3431)
change uplink RS default configuration from 130 to 95
2019-10-31 15:04:33 -04:00
Yingrong Zhao
bfa6699e2c
satellite/repair: add timeout for repair download from a single node(#3418) 2019-10-30 16:31:08 -04:00
Natalie Villasana
4878135068
satellite/gracefulexit, storagenode/gracefulexit: add timeouts (#3407) 2019-10-30 13:40:57 -04:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per peiece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
littleskunk
2a5526fcc4
satellite/repair: reduce timeout (#3302) 2019-10-18 13:43:24 +02:00
Natalie Villasana
855fca003d satellite/metrics: create a metrics chore (#3263)
* add metrics counter and chore

* updates metrics observer interval release default and dev default to 15min

* add more specific check for remote pointers

* add Counter field to metrics chore, add counter tests

* rm redundant ObjectCount suffix

* make pointer check easier to read

* change metrics.Config.Interval to ChoreInterval

* rm unneeded var

* fix comment

* update satellite config lock
2019-10-16 14:08:33 -04:00
Cameron
76ad83f12c
satellite/accounting: add redis support to live accounting (#3213)
* set up redis support in live accounting

* move live.Service interface into accounting package and rename to Cache, pass into satellite

* refactor Cache to store one int64 total, add IncrBy method to redis client implementation

* add monkit tracing to live accounting
2019-10-16 12:50:29 -04:00
Jennifer Li Johnson
b185dbbee2
satellite/discovery: remove discovery related code (#3175) 2019-10-14 10:57:01 -04:00
littleskunk
96aeedcdee
OrderLimit/GracePeriod: Increase time window from 1h to 24h (#3255)
* OrderLimit/GracePeriod: Increase time window from 1h to 24h

* update satellite config lock
2019-10-13 17:40:24 +02:00
Ethan Adams
a1275746b4
satellite/gracefulexit: Implement the 'process' endpoint on the satellite (#3223) 2019-10-11 17:18:05 -04:00
Ethan Adams
4c4519f0be
satellite/gracefulexit: add transfer queue for pieces (#3174)
initial impl of transfer queue
updated docs represent the new design how we handle durability during exit
2019-10-07 16:38:05 -04:00
Stefan Benten
1db4251234 Satellite/repair: Add Repair Threshold Override to allow earlier repair (#3151) 2019-10-02 14:58:37 +02:00
Maximillian von Briesen
08ed50bcaa
satellite/metainfo: add commit interval to prevent long delays between order limit creation and segment commit (#3149) 2019-10-01 12:55:02 -04:00
Bogdan Artemenko
423d35fb3f
satellite/console: Added support URLs and other fields to config file (#3090) 2019-09-27 10:48:53 -06:00
Stefan Benten
c71f3a3f4a internal/version: Change default endpoint to query (#3126)
* change default domain name

change default domain name to point to the new version control

* Update satellite-config.yaml.lock
2019-09-25 22:55:38 +02:00
Jennifer Li Johnson
724bb44723
Remove Kademlia dependencies from Satellite and Storagenode (#2966)
What:

cmd/inspector/main.go: removes kad commands
internal/testplanet/planet.go: Waits for contact chore to finish
satellite/contact/nodesservice.go: creates an empty nodes service implementation
satellite/contact/service.go: implements Local and FetchInfo methods & adds external address config value
satellite/discovery/service.go: replaces kad.FetchInfo with contact.FetchInfo in Refresh() & removes Discover()
satellite/peer.go: sets up contact service and endpoints
storagenode/console/service.go: replaces nodeID with contact.Local()
storagenode/contact/chore.go: replaces routing table with contact service
storagenode/contact/nodesservice.go: creates empty implementation for ping and request info nodes service & implements RequestInfo method
storagenode/contact/service.go: creates a service to return the local node and update its own capacity
storagenode/monitor/monitor.go: uses contact service in place of routing table
storagenode/operator.go: moves operatorconfig from kad into its own setup
storagenode/peer.go: sets up contact service, chore, pingstats and endpoints
satellite/overlay/config.go: changes NodeSelectionConfig.OnlineWindow default to 4hr to allow for accurate repair selection
Removes kademlia setups in:

cmd/storagenode/main.go
cmd/storj-sim/network.go
internal/testplane/planet.go
internal/testplanet/satellite.go
internal/testplanet/storagenode.go
satellite/peer.go
scripts/test-sim-backwards.sh
scripts/testdata/satellite-config.yaml.lock
storagenode/inspector/inspector.go
storagenode/peer.go
storagenode/storagenodedb/database.go
Why: Replacing Kademlia

Please describe the tests:
• internal/testplanet/planet_test.go:

TestBasic: assert that the storagenode can check in with the satellite without any errors
TestContact: test that all nodes get inserted into both satellites' overlay cache during testplanet setup
• satellite/contact/contact_test.go:

TestFetchInfo: Tests that the FetchInfo method returns the correct info
• storagenode/contact/contact_test.go:

TestNodeInfoUpdated: tests that the contact chore updates the node information
TestRequestInfoEndpoint: tests that the Request info endpoint returns the correct info
Please describe the performance impact: Node discovery should be at least slightly more performant since each node connects directly to each satellite and no longer needs to wait for bootstrapping. It probably won't be faster in real time on start up since each node waits a random amount of time (less than 1 hr) to initialize its first connection (jitter).
2019-09-19 15:56:34 -04:00
Jennifer Li Johnson
ce3203e910
update NodeSelectionConfig.OnlineWindow to 4hr default (#3082) 2019-09-18 14:57:57 -04:00
Andrew Harding
f550ab5d1c
Uplink "import" command (#2981)
* uplink import cmd

* pkg/process: fix import order

* fix golangci-lint failures

* remove "help" from the satellite config lock file
2019-09-13 12:33:30 -06:00
Natalie Villasana
aa3567187e
satellite/audit: worker now verifies and reverifies (#2965) 2019-09-11 18:37:01 -04:00
Natalie Villasana
6d363fb756
satellite/audit: create the audit queue, chore, and worker (#2888) 2019-09-05 11:40:52 -04:00
Cameron
af5fb8e9c5
satellite/vouchers: deprecate voucher endpoint, return 'please upgrade' error (#2940)
* voucher endpoint returns 'please upgrade' error, test
2019-09-04 13:21:02 -04:00
Yingrong Zhao
10a896bf73
web/marketing: static asset path (#2872)
* use relative path instead of absolute path

* add template func baseURL

* add a method

* update storj-sim

* add comment
2019-08-30 18:43:53 -04:00
Cameron
599324c364
satellite/dbcleanup: delete expired serials from satellite (#2867)
Creates a new chore, dbcleanup, which can be used for routine deletion of items from the satellite database and adds functionality for deletion of expired serial numbers
2019-08-27 13:12:38 -04:00
Natalie Villasana
243cedb628
satellite/audit: implement reservoir struct and RemoteSegment observer method (#2744) 2019-08-21 11:49:27 -04:00
ethanadams
1a69ec8318
satellite/orders: document protocol and fix typos (#2813)
* Addressing comments from PR 2762
* Rebuild of orders.pb.go after comments added to proto file
* run update-satellite-config-lock for spelling fix.
2019-08-19 09:36:11 -04:00
ethanadams
8df683a265
Update satellite settlement endpoint to batch order processing into transactions. (#2762)
Update satellite settlement endpoint to batch order processing into transactions
2019-08-15 15:05:43 -04:00
littleskunk
3e41767f22
satellie/gc: enable garbage collection on the satellite (#2765) 2019-08-12 20:30:09 +02:00
Egon Elbre
c8edeb0257
satellite/overlay: rename overlay.Cache to overlay.Service (#2717) 2019-08-06 19:35:59 +03:00
Jeff Wendling
21a3bf89ee cmd/uplink: use scopes to open (#2501)
What: Change cmd/uplink to use scopes

It moves the fields that will be subsumed by scopes into an explicit legacy section and hides their configuration flags.

Why: So that it can read scopes in from files and stuff
2019-08-05 11:01:20 -06:00
ethanadams
c9b46f2fe2
V3-1987: Optimize audits stats persistence (#2632)
* Added batch update stats for recordAuditSuccessStatus
* Added batch update stats to recordAuditFailStatus
* added configurable batch size
* build individual update/delete statements so the statements can be batched into 1 call to the DB
* notified #config-changes channel and ran make update-satellite-config-lock
* updated tests to use batch update stats
2019-07-31 13:21:06 -04:00
Natalie Villasana
f11413bc8e Implement garbage collection on satellite (#2577)
* Added a gc package at satellite/gc, which contains the gc.Service, which runs garbage collection integrated with the metainfoloop, and the gc PieceTracker, which implements the metainfo loop Observer interface and stores all of the filters (about which pieces are good) for each node.
* Added a gc config located at satellite/gc/service.go (loop disabled by default in release)
* Creates bloom filters with pieces to be retained inside the metainfo loop
* Sends RetainRequests (or filters with good piece ids) to all storage nodes.
2019-07-24 13:26:43 -04:00
Maximillian von Briesen
6c1c3fb4a7
Add metainfo loop service (#2563)
Add a metainfo loop service on the satellite that can be subscribed to by various services that need to make use of metainfo information
2019-07-22 09:34:12 -04:00
Alexander Leitner
64b2769de3
discovery: parallelize refresh (#2535)
* parallelize discovery refresh

* add paginateQualifiedtest, address pr comments

* Remove duplicate uptime update

* Lower concurrency in Testplanet for discovery
2019-07-12 10:35:48 -04:00
Ivan Fraixedes
f420b29d35
[V3-1927] Repairer uploads to max threshold instead of success… (#2423)
* pkg/datarepair: Add test to check num upload pieces
  Add a new test for ensuring the number of pieces that the repair process
  upload when a segment is injured.
* satellite/orders: Don't create "put order limits" over total
  Repair must not create "put order limits" more than the total count.
* pkg/datarepair: Update upload repair pieces test
  Update the test which checks the number of pieces which are uploaded
  during a repair for using the same excess over the success threshold
  value than the implementation.
* satellites/orders: Limit repair put order for not being total
  Limit the number of put orders to be used by repair for only uploading
  pieces to a % excess over the successful threshold.
* pkg/datarepair: Change DataRepair test to pass again
  Make some changes in the DataRepair test to make pass again after the
  repair upload repaired pieces only until a % excess over success
  threshold.
  Also update the steps description of the DataRepair test after it has been
  changed, to match on what's now, besides to leave it more generic for
  avoiding having to update it on minimal future refactorings.
* satellite: Make repair excess optimal threshold configurable
  Add a new configuration parameter to the satellite for being able to
  configure the percentage excess over the optimal threshold, used for
  determining how many pieces should be repaired/uploaded, rather than
  having the value hard coded.
* repairer: Add configurable param to segments/repairer
  Add a new parameters to the segment/repairer to calculate the maximum
  number of excess nodes, based on the optimal threshold, that repaired
  pieces can be uploaded.
  This new parameter has been added for not returning more nodes than the
  number of upload orders for data repair satellite service calculate for
  repairing pieces.
* pkg/storage/ec: Update log message in clien.Repair
* satellite: Update configuration lock file
2019-07-12 00:44:47 +02:00
Bill Thorp
0e463dccfd
7 day validity window for order limits (#2520)
* 7 day limit
2019-07-10 17:17:00 -04:00
Stefan Benten
16156e3b3d
Ensure we force a segment size and account storage before committing them (#2473) 2019-07-08 18:24:38 -04:00
Egon Elbre
674742d1a7
satellite/datarepair: use reliability cache (#1976) 2019-07-09 01:04:35 +03:00
littleskunk
a2362f92dc
Rollback uptime disqualification (#2417) 2019-07-02 10:39:36 +02:00
littleskunk
9e62423f47
reduce vetting requirement (#2416) 2019-07-02 01:02:23 +02:00
JT Olio
8c57434ded
pkg/process/metrics: add an instance prefix (#2190)
* pkg/process/metrics: add an instance prefix

the distinction between which satellite is sending which
data should go in the instance field, not the suffix or application
fields. (un)fortunately, the instance id is deliberately not
configurable because we don't want it to be easy to accidentally
have multiple applications collide with the same instance id.

so we're currently stuffing the human readable instance in the
suffix. :(

perhaps a reasonable tradeoff would be an optional instance
prefix that allows operators to put their domain name in
the instance

Change-Id: I6fcc8498be908c5740439cc00f77474ad151febd

* linting

Change-Id: I9f9a44fa9a2634ef5e4f89548d42d57ce9e4450e
2019-06-24 16:45:37 -06:00
Maximillian von Briesen
fd6a4d96f2
change uptime dq threshold to 0.4 (#2313)
* change uptime dq threshold to 0.4

* update config lock
2019-06-24 12:18:32 -04:00
Cameron
1283036e37
add storage node voucher request service (#2158)
* add voucher service on storage node

* config field tag syntax, go routines for requests

* hook up voucher service in storagenode/peer.go

* add voucher config to testplanet

* add voucher config to testplanet

* add voucher response status INVALID, ACCEPTED, REJECTED

* add a test for vouchers service

* handle no row from GetValid, test it

* add trust pool to voucher service

* use trusted list to get satellites

* verify vouchers upon receipt

* test VerifyVoucher
2019-06-21 18:48:52 -04:00
aligeti
043d603cbe
satellite rs config check with validation check set to false default (#2229)
* satellite rs config check with validation check
2019-06-21 14:15:58 -04:00
Bill Thorp
8f47fca5d3
Remove audit / uptime ratio fields (#2247)
* removed ratios
2019-06-21 13:14:53 -04:00
JT Olio
568b000e9b satellite: make order expiration configurable (#2251) 2019-06-21 13:38:40 +03:00
Natalie Villasana
edb3d1cbf8
pkg/overlay: update node selection config values for reputation (#2264) 2019-06-20 15:01:50 -04:00
Natalie Villasana
9386187fe6
add disqualification and new reputation system into overlay cache (#2227) 2019-06-20 09:56:04 -04:00
Natalie Villasana
b30c35d306
change ReputationAuditOmega (et al.) to AuditReputationWeight (#2232) 2019-06-18 14:17:25 -04:00
JT Olio
e58a06bd0c config: update release values to match prod (#2192) 2019-06-15 18:19:19 +02:00
littleskunk
319cc77a34
increase audit timeout (#2208) 2019-06-14 13:53:49 +02:00
JT Olio
df2fad15d8
pkg/process/logging: different defaults for release/dev (#2191)
* pkg/process/logging: different defaults for release/dev

Change-Id: I55be80430a31668fededf479b052e106ab18d9ce

* linting

Change-Id: I4e50d4c9569b7324c4704c14df7dd3228dbb7dd5

* Trigger Jenkins

* fix lock file

* use dev=debug and prod=info
2019-06-13 10:43:39 -06:00
Egon Elbre
fdddaa2a47 Fix marketingweb code (#2177)
* set to only listen on 127.0.0.1, move static files to same location, better template handling

* handle error

* fix path in storj-sim

* revert template handling changes

* code shouldn't panic on invalid tempalte

* do not rewrite once writing has started

* write correct error code

* use filepath for path handling

* revert change

* fix

* fix mod tidy

* use correct error code for not found, avoid infinite loop on failure
2019-06-12 09:42:39 -04:00
Yingrong Zhao
af66d9c6e4
Open a new port on satellite for admin GUI (#1901)
* Set up new port 8090 for in offers

Clean up commented code

Rename offers to offersweb

Remove unused code

Add todos for adding front-end templates

Add middleware for only allow local access

Add comment

Fix linting error

Remove commented code

Update storj-sim

Check request IP against Host IP

Use net pakcage to retrieve IP address

Rename service to marketing

* Add wrapper for all errors

* fix conflicts

* update the config file

* fix linting error

* remove unused packages

* remove global runtime var and add flag to storj-sim for mar static dir

* remove debugging lines

* add new config for test data and check if static dir flag is set before passing to mux

* change 'console' to 'marketing' for test data config

* fix linting errors

* update config flag

* Trigger Jenkins

* Trigger CLA
2019-06-11 11:00:59 -04:00
Ivan Fraixedes
f624b213a3
reputation: Add configuration parameters (#2150)
* reputation: Add configuration parameters
 Add the configuration parameters which will be used by the algorithm
 which will calculate the storage node reputation.
 Because the reputation calculation is based on audit and uptime check
 results some configuration parameters are in pkg/audit, others in
 pkg/discovery and other in the satellite which will combine the both
 reputation results to obtain the storage node reputation for repair and
 uplink.
* satellite-config: Refresh lock file with new params
  Refresh the Satellite configuration yaml lock file with the new
  parameters added in this branch.
2019-06-11 12:14:01 +02:00
Egon Elbre
749846b42b
Update golangci-lint (#2159) 2019-06-10 11:52:09 +03:00
JT Olio
469d485f62 dbutil: reduce defaults so sum is < 100 (#2157) 2019-06-09 21:04:22 +02:00
JT Olio
43d4f3daf5 discovery: remove graveyard (#2145) 2019-06-07 08:40:51 +03:00
Stefan Benten
23213a7a29 Fix Config Comments in DButil Package (#2119)
* Fix Config Comments

* Updating lock file

* Update lock file

* Set Actual time.Duration

* Fix Issue
2019-06-04 17:53:38 -06:00
JT Olio
d02427e41a db: set max open conns, conn max lifetime, add db stat monitoring (#2117) 2019-06-04 23:30:21 +02:00
paul cannon
d15eaed588 add capability of logging all GRPC calls/payloads (#2067) 2019-06-04 14:55:24 +02:00
JT Olio
3fe8343b6c repairer: fix config comments (#2105) 2019-06-04 14:13:31 +02:00
Yaroslav Vorobiov
6809129e6f
Console add stripe service (#2080) 2019-06-03 16:46:57 +03:00
Kaloyan Raev
2ab95b533e
Check errors for possible outcomes from audit's DownloadShares (#2072) 2019-06-03 12:17:09 +03:00
JT Olio
e60ff9dcbb
process/metrics: have metrics suffix default to dev/release status (#2073)
What: this will make it so release binaries default to whatever-release instead of whatever-dev in metrics collection

Why: So we can monitor release binaries with default configuration without getting drowned out by dev binaries
2019-05-31 16:47:48 -06:00
Fadila
5b730e3073
Make maxReverifyCount configurable (#2071)
* make max reverify count configurable
2019-05-31 17:23:00 +02:00
Cameron
590b1a5a1d
Satellite voucher service (#2043)
* set up voucher service skeleton, basic test

* add VetNode db method

* basic test for VetNode

* encode and sign voucher functions

* fill out and sign vouchers

* test pass/fail voucher request

* match EncodeVoucher to other Encode functions
2019-05-30 15:52:33 -04:00
aligeti
934ebf9cbf
Added the irreparable repair functionality (#1955)
* Added the irreparable repair functionality
2019-05-30 11:18:20 -04:00
Stefan Benten
8912d7149c
Fixes Auth Issue (#2064) 2019-05-28 16:32:51 +02:00
Cameron
4058c29ca4
filter duplicate node IPs (#1890)
* add last_ip field to dbx model node, generate dbx

* add last_ip to node proto, generate pb

* migrate

* resolve address in transport.DialNode, update lastIp in cache.UpdateAddress

* use net.SplitHostPort to isolate host address from port

* define DistinctIPs flag

* add test for GetIP

* select last_ip when querying for nodes

* if distinctIPs flag == true, query for nodes with distinct IPs

* some basic tests

* change last_ip to field 14 in proto

* remove comments

* check err

* change distinctIPs to distinctIP

* exclude IPs from newNodes in query for reputable nodes

* add index on last_ip

* only add to excludedIPs if flag is true

* test half new nodes returns distinct IPs

* fix alignment

* add test

* rework ip filter query, add retry logic, add switch for database driver

* add retry to SelectNewNodes

* change discovery intervals so IPs don't get overwritten

* remove TestGetIP

* edit updating node stats in test

* split exclude into nodeIDs and IPs

* separate non-distinct IP query into other function

* trigger checks

* remove else block
2019-05-22 16:06:27 -04:00
JT Olio
32b3f8fef0 cmd/storagenode: pull more things into releaseDefaults (#1980) 2019-05-21 13:48:47 +02:00
paul cannon
02be91b029
real-time tracking of space used per project (#1910)
Ran into difficulties trying to find the ideal solution for sharing
these counts between multiple satellite servers, so for now this is a
dumb solution storing recent space-usage changes in a big dumb in-memory
map with a big dumb lock around it. The interface used, though, should
allow us to swap out the implementation without much difficulty
elsewhere once we know what we want it to be.
2019-05-09 20:39:21 -05:00
Bill Thorp
89c5e70003
defaults now commented out (#1878)
* defaults now commented out, unless custom / user / override
2019-05-08 08:14:00 -04:00
Ivan Fraixedes
f8df461249 v3-1660: Create a test that breaks when satellite config file has changed (#1898)
* scripts: check for satellite configuration changes

Create a simple script that checks if the satellite configuration has
differed with the last changes.

* Makefile: add target to check satellite cfg changes

Add a Makefile target which executes the script that checks if the last
changes has made a change in the satellite configuration.

* ci: add satellite cfg check on integration stage

* FIXUP: use releae defaults rather than development ones

* FIXUP: show the message when config differs & some cleanups

* scripts: add script to update the satellite cfg lock

Add a script for allowing to update the satellite configuration lock
file and add Makefile target to run it.

* scripts/testsdata: update satellite cfg lock

Update the satellite configuration lock file with the last changes that
satellite has suffered upstream.
2019-05-07 17:20:04 -04:00