Commit Graph

52 Commits

Author SHA1 Message Date
Egon Elbre
94b2b315f7 storagenode/trust: refactor GetAddress to GetNodeURL
Most places now need the NodeURL rather than the ID and Address
separately. This simplifies code in multiple places.

Change-Id: I52621d8ca52296a8b5bf7afbc1001cf8bfb44239
2020-05-20 11:05:15 +00:00
Egon Elbre
ed627144ed all: use DialNodeURL throughout the codebase
Change-Id: Iaf9ae3aeef7305c937f2660c929744db2d88776c
2020-05-20 10:36:30 +00:00
Egon Elbre
c630cf2490 storagenode/pieces: implement buffering for writing
Currently uploads can cause a lot of IOPS, reduce this by introducing a
in-memory buffer on-top of the file.

Change-Id: I5f4e3e01c0a36258271d180b922107de447bcb59
2020-05-04 06:01:32 +00:00
Egon Elbre
11a44cdd88 all: don't depend on gogo/proto directly
Change-Id: I8822dea0d1b7b99e0b828e0373a0308a42dde2be
2020-04-08 17:32:15 +00:00
Egon Elbre
e1a443b04a private/testplanet: allow modifying created database
Instead of providing the database from outside to testplanet create it
inside and then allow wrapping and modifying it. This is more convenient
to use.

Change-Id: I9b8f69e6e0a19ff984b4e2bfe927c9100c77bc6c
2020-03-27 19:14:48 +00:00
Egon Elbre
e8f18a2cfe private/testplanet: expose storagenode and satellite Config
Change-Id: I80fe7ed8ef7356948879afcc6ecb984c5d1a6b9d
2020-03-27 17:01:25 +02:00
Yingrong Zhao
b7b19289d1 bump storj.io/common to latest
Change-Id: I16e337660ce8e1ef332cc842dbf4cfa067b9b98b
2020-03-25 09:08:40 -04:00
Bill Thorp
94c11c5212 satellite: remove some unnecessary UTC() calls
Fixes some easy cases of extraneous UTC() calls

Change-Id: I3f4c287ae622a455b9a492a8892a699e0710ca9a
2020-03-13 13:49:44 +00:00
Egon Elbre
5342dd9fe6 go.mod: update uplink
Change-Id: I867a6a1eef8aa5d60bb676e5112b98c4192ce811
2020-02-21 16:08:12 +02:00
Jeff Wendling
7999d24f81 all: use monkit v3
this commit updates our monkit dependency to the v3 version where
it outputs in an influx style. this makes discovery much easier
as many tools are built to look at it this way.

graphite and rothko will suffer some due to no longer being a tree
based on dots. hopefully time will exist to update rothko to
index based on the new metric format.

it adds an influx output for the statreceiver so that we can
write to influxdb v1 or v2 directly.

Change-Id: Iae9f9494a6d29cfbd1f932a5e71a891b490415ff
2020-02-05 23:53:17 +00:00
Egon Elbre
10be538602 storagenode: add pkg/debug support
Change-Id: If941095b886c28a0d53fff4c9bf9fa0ce7471dea
2020-01-29 16:30:31 -05:00
littleskunk
5c68f4fc7c storagenode/gracefulexit: higher concurrency and shorter timeouts
1 transfer with a minimum speed of 128 Bytes was a nice try but it is
way too low. Even a pi3 was able to handle 7 grpc transfers. We have 4
satellites and with 5 concurrent transfers that should be a total of 20
concurrent transfers. Each transfer will have a minimum speed of 5KB/s.
That should give us a better througput and still be Ok on a pi3.

Change-Id: I650a7baf890080901ef70ea3b5636d93009b4e60
2020-01-24 23:51:39 +00:00
Isaac Hess
44de90ecc8 storagenode/pieces: Rename vars and update comments
A few variables were not renamed to the new standard piecesTotal and
piecesContentSize, so it was unclear which value was being used. These
have been updated, and some comments made more thorough.

Change-Id: I363bad4dec2a8e5c54d22c3c4cd85fc3d2b3096c
2020-01-23 11:00:24 -07:00
Isaac Hess
14fd6a9ef0 storagenode/pieces: Track total piece size
This change updates the storagenode piecestore apis to expose access to
the full piece size stored on disk. Previously we only had access to
(and only kept a cache of) the content size used for all pieces. This
was inaccurate when reporting the amount of disk space used by nodes.

We now have access to the total content size, as well as the total disk
usage, of all pieces. The pieces cache also keeps a cache of the total
piece size along with the content size.

Change-Id: I4fffe7e1257e04c46021a2e37c5adc6fe69bee55
2020-01-23 11:00:24 -07:00
Michal Niewrzal
6502454947 satellite/metainfo: move RS configuration to satellite
With this change RS configuration will be set on satellite. Uplink with
get RS values with BeginObject request and will use it. For backward
compatibility and to avoid super large change redundancy scheme stored
with bucket is not touched. This can be done in future.

Change-Id: Ia5f76fc10c37e2c44e4f7b8754f28eafe1f97eff
2020-01-22 09:33:53 +00:00
Egon Elbre
21f53e38da storagenode/storagenodedb/storagenodedbtest: pass ctx as an argument
Change-Id: I10b0a8ef3a7d5001e7d361f1873ad5987af1f9c2
2020-01-20 16:56:12 +02:00
Egon Elbre
d5438036b5 {satellite,storagnode}/gracefulexit: reduce logging
Change-Id: I9f274ede77a582fc43ef14a47bf9341d4e3083df
2020-01-19 22:36:13 +02:00
Egon Elbre
3cd584c007 storagenode/gracefulexit: move database test
Database tests belong to the interface, not the implementation.

Change-Id: I5d76fdc7df0b0f32391ebad1b595ef26b062a9cb
2020-01-19 18:12:01 +00:00
Bill Thorp
6f2f97b313 storagenode\gracefulexit: broke worker deleteOnePieceOrAll into deleteOnePiece and deleteAllPieces and deletePiece
Change-Id: Ic3bd21e89fa71e962c2bb1c4943f4696bc4f83e5
2020-01-17 15:07:34 +00:00
Yingrong Zhao
07c2824d94 storagenode/gracefulexit: fix exit-status command output
When exit succeeded, cli should display `Y` in Successful column and `100%` in PercentComplete.

Change-Id: I6093eca207ecd618bb332af12e5e455bc8224dde
2020-01-15 14:58:15 +00:00
Yingrong Zhao
ebeee58001 storagenode/gracefulexit: remove satellite entry when node fail precondition
Change-Id: I3c215170f10f0053e4f8718ee31d64d93f52ec80
2020-01-08 18:11:58 +00:00
Egon Elbre
082ec81714
uplink: move to storj.io/uplink (#3746) 2020-01-08 15:40:19 +02:00
Egon Elbre
2680bae88c private/testplanet: remove dependency to uplink
Remove direct dependency on uplink.RSConfig, this simplifies
moving the config file without introducing weird dependencies.

Change-Id: I7fd2a145401e0205d7047631df9d2810241efeec
2020-01-02 09:40:46 +00:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Isaac Hess
7d1e28ea30 storagenode: Include trash space when calculating space used
This commit adds functionality to include the space used in the trash
directory when calculating available space on the node.

It also includes this trash value in the space used cache, with methods
to keep the cache up-to-date as files are trashed, restored, and
emptied.

As part of the commit, the RestoreTrash and EmptyTrash methods have
slightly changed signatures. RestoreTrash now also returns the keys that
were restored, while EmptyTrash also returns the total disk space
recovered. Each of these changes makes it possible to keep the cache
up-to-date and know how much space is being used/recovered.

Also changed is the signature of PieceStoreAccess.ContentSize method.
Previously this method returns only the content size of the blob,
removing the size of any header data. This method has been renamed
`Size` and returns both the full disk size and content size of the blob.
This allows us to only stat the file once, and in some instances (i.e.
cache) knowing the full file size is useful.

Note: This commit simply adds the trash size data to the piece size data
we were already collecting. The piece size data is not accurate for all
use-cases (e.g. because it does not contain piece header data); however,
this commit does not fix that problem. Now that the ContentSize (Size)
method returns the full size of the file, it should be easier to fix
this problem in a future commit.

Change-Id: I4a6cae09e262c8452a618116d1dc66b687f59f85
2019-12-23 19:07:03 -07:00
Egon Elbre
d55288cf68 pkg/rpc: replace methods with direct calls to pb
Change-Id: I8bd015d8d316a2c12c1daceca1d9fd257f6f57bc
2019-12-22 17:12:43 +02:00
Egon Elbre
afe05edff2 {storagenode,satellite}/gracefulexit: ensure workers finish their work
Fixes a data race caused by not waiting for workers to finish
before shutting down. Currently this ended up failing logging
because it was closed when test tried to write to it.

Change-Id: I074045cd83bbf49e658f51353aa7901e9a5d074b
2019-12-17 17:21:52 +02:00
Cameron Ayer
6fae361c31 replace planet.Start in tests with planet.Run
planet.Start starts a testplanet system, whereas planet.Run starts a testplanet
and runs a test against it with each DB backend (cockroach compat).

Change-Id: I39c9da26d9619ee69a2b718d24ab00271f9e9bc2
2019-12-10 16:55:54 +00:00
Ethan Adams
9420fa9fc5 satellite/gracefulexit: Add graceful exit completed/failed receipt verification to satellite CLI (#3679) 2019-12-03 17:09:39 -05:00
Yingrong Zhao
66f1a1680f
add completion receipt to exit-status cli command on storage node (#3650) 2019-11-26 12:32:26 -05:00
littleskunk
8842b0c252 storagenode/gracefulexit: improve logging (#3633) 2019-11-21 21:10:02 -05:00
Rafael Antonio Ribeiro Gomes
da39c71d35
storagenode: add new metric satellite.request (#3610)
* storagenode: add new metric satellite.request

* storagenode: metrics fixed

* switch from Counter to Meter
2019-11-19 18:11:31 -03:00
Egon Elbre
ee6c1cac8a
private: rename internal to private (#3573) 2019-11-14 21:46:15 +02:00
Egon Elbre
1e64006e32 lint: add staticcheck as a separate step (#3569) 2019-11-14 10:31:30 +02:00
paul cannon
bd89f51c66
Keep v0pieceinfo database isolated (#3364)
* put TestCreateV0 back in StoreForTest
* avoid direct handles to V0 pieceinfo db
* type mismatch fix
* use storage.Blobs interface in store_test.go

..instead of filestore.Store. this will allow filestore.Store to become
unexported.

* unexport filestore.Store

rename it to blobStore. things should use the storage.Blobs interface
instead. changes in this commit are purely mechanical (made through the
"refactor" tool in Gocode followed by search/replace on the word "Store"
within the storage/filestore/ directory).

* kill filestore.StoreForTest

now that filestore.blobStore is unexported, there isn't a need for a
specialized wrapper type. this (not coincidentally) also makes it
possible for the WriterForFormatVersion() method on
storagenode/pieces.StoreForTest to work, without requiring everything to
wrap the store.blobs attribute in a filestore.StoreForTest, which was
impractical.
2019-11-13 13:15:31 -06:00
Yingrong Zhao
db8294cfba
storagenode/gracefulexit: get hash and limit using original piece ID (#3557) 2019-11-13 12:45:55 -05:00
littleskunk
7eb6724c92
logging: unify logging around satellite ID, node ID and piece ID (#3491)
* logging: unify logging around satellite ID, node ID and piece ID

* unify segment index
2019-11-05 22:04:07 +01:00
Maximillian von Briesen
257d3946d5
storagenode/gracefulexit: allow storagenodes to concurrently transfer pieces for graceful exit (#3478) 2019-11-05 10:33:44 -05:00
Ethan Adams
43103ae13f
lower storage node counts in tests (#3427) 2019-10-31 10:57:54 -04:00
Natalie Villasana
4878135068
satellite/gracefulexit, storagenode/gracefulexit: add timeouts (#3407) 2019-10-30 13:40:57 -04:00
Natalie Villasana
5453886231 satellite/repair, uplink/ecclient: remove unused expiration arg from ec.Repair and ec.putPiece (#3416) 2019-10-30 11:35:00 -04:00
Yingrong Zhao
3ee0b89f8f
storagenode/gracefulexit: delete pieces when receive Delete or Completed message from satellite (#3406) 2019-10-30 10:46:56 -04:00
Egon Elbre
65a8e0bcbc
{satellite,storagenode}/gracefulexit: clearer log messages (#3413) 2019-10-30 10:21:27 +02:00
Ethan Adams
5b0398a718 storagenode/gracefulexit: Exclude finished exits from chore/worker processing. Fix update status bug (#3399) 2019-10-28 13:59:45 -04:00
Jeff Wendling
ed48e74e20 gracefulexit: fix build for drpc (#3387)
Change-Id: I335e9f8991a10c9e8a0737bc7c9ea3f04cbe2546
2019-10-26 15:53:35 +02:00
Maximillian von Briesen
6df4d7bc73
storagenode/gracefulexit + satellite/gracefulexit: add storagenode-side transfer validation (#3371)
* Make the exiting node check piece hashes, piece IDs, and piece hash signatures before relaying successful transfer data to the satellite.
* Enable immediate graceful exit failure for "successful" transfers that fail satellite-side validation.
* Move transfer piece logic in storagenode worker to separate function (to make the worker easier to understand)
2019-10-25 13:16:20 -04:00
Yingrong Zhao
fa1ac24e19
satellite/gracefulexit: add failure threshold check (#3329)
* add overall failure percentage check and inactive time frame check before sending a response to sno

* update comment

* delete node from transfer queue if it has been inactive for too long

* fix linting error

* add test config value

* fix nil pointer

* add config value into testplanet

* add unit test for overall failure threshold

* move timeframe threshold to chore

* update protolock

* add chore test

* add per peiece failure count logic

* change config name from EndpointMaxFailures to MaxFailuresPerPiece

* address comments

* fix linting error

* add error handling for no row returned from progress table

* fix test for graceful exit chore on storagenode

* fix typo InActive -> Inactive

* improve readability for failure threshold calculation

* update config lock

* change error handling for GetProgress in graceful exit endpoint on the satellite side

* return proper rpc error in endpoint

* add check in chore test for checking finish timestamp and queue
2019-10-24 12:24:42 -04:00
Ethan Adams
3e0d12354a
storagenode/gracefulexit: Implement storage node graceful exit worker - part 1 (#3322) 2019-10-22 16:42:21 -04:00
Yingrong Zhao
e5099f31f3
add context.Clean and correct rpc error code (#3295) 2019-10-16 13:50:01 -04:00
Yingrong Zhao
87e3764390
storagenode/cmd: add exit-status command for graceful exit (#3264)
* add exit-status command

* remove todo and fix format

* fix status display

* change startExit to exit progress

* fix linting error

* add successful column in exit progress

* fix test

* remove extra new line

* fix TYPOS

* format the percentage better
2019-10-15 18:07:32 -04:00