Commit Graph

85 Commits

Author SHA1 Message Date
Michal Niewrzal
4bdbb25d83 satellite/metabase/rangedloop: move Segment definition
We will remove segments loop soon so we need first to move
Segment definition to rangedloop package.

https://github.com/storj/storj/issues/5237

Change-Id: Ibe6aad316ffb7073cc4de166f1f17b87aac07363
2023-05-16 12:37:17 +00:00
Michal Niewrzal
2592aaef9c satellite/gc/bloomfilter: remove segments loop parts
We are switching completely to ranged loop.

https://github.com/storj/storj/issues/5368

Change-Id: I1a22ac4b242998e287b2b7d8167b64e850b61a0f
2023-05-15 11:46:26 +00:00
Michal Niewrzal
98562d06c8 satellite/gc/bloomfilter: add sync observer
The current observer used with the ranged loop uses a massive amount of
memory because each range generates a separate set of bloom filters.
Each bloom filter can take up to 2MB of memory. That's a lot.

This is an initial change to reduce memory usage by sharing bloom
filters between ranges and just synchronizing access to them. This
implementation is rather simple and even naive, but maybe it will be
enough without doing something more complex.

https://github.com/storj/storj/issues/5803

Change-Id: Ie62d19276aa9023076b1c97f712b788bce963cbe
2023-04-28 07:40:56 +00:00
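
The idea of sharing filters between ranges and synchronizing access can be sketched roughly as below. This is only an illustration, not the code from the commit: the names `NodeID`, `sharedFilters`, `Add` and `Count` are hypothetical, and a plain set stands in for a real bloom filter to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// NodeID is a hypothetical stand-in for a storage node identifier.
type NodeID string

// sharedFilters keeps one piece set per node, shared by all ranges.
// A real bloom filter is replaced by a plain set in this sketch.
type sharedFilters struct {
	mu      sync.Mutex
	perNode map[NodeID]map[string]struct{}
}

func newSharedFilters() *sharedFilters {
	return &sharedFilters{perNode: make(map[NodeID]map[string]struct{})}
}

// Add records a piece for a node. It is safe for concurrent use
// because every access to the shared map happens under the mutex.
func (s *sharedFilters) Add(node NodeID, pieceID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	set, ok := s.perNode[node]
	if !ok {
		set = make(map[string]struct{})
		s.perNode[node] = set
	}
	set[pieceID] = struct{}{}
}

// Count returns the number of distinct pieces recorded for a node.
func (s *sharedFilters) Count(node NodeID) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.perNode[node])
}

func main() {
	filters := newSharedFilters()
	var wg sync.WaitGroup
	// Four goroutines play the role of four ranges processed in parallel.
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func(r int) {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				filters.Add("node-a", fmt.Sprintf("piece-%d-%d", r, i))
			}
		}(r)
	}
	wg.Wait()
	fmt.Println(filters.Count("node-a"))
}
```

A single mutex is the simplest possible choice here and matches the "simple and even naive" description; it trades some contention for a single set of filters instead of one set per range.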
Michal Niewrzal
3cd79d987d satellite/gc/bloomfilter: extract BF upload logic
This is a refactor/cleanup change before I start working on adding a
separate GC observer with optimized memory consumption.

https://github.com/storj/storj/issues/5803

Change-Id: I854cb3797802a32942c25f2765dbb72be88bacbd
2023-04-27 11:15:27 +02:00
Egon Elbre
f5020de57c storagenode/blobstore: move blob store logic
The blobstore implementation is entirely related to storagenode, so the
rightful place is together with the storagenode implementation.

Fixes https://github.com/storj/storj/issues/5754

Change-Id: Ie6637b0262cf37af6c3e558556c7604d9dc3613d
2023-04-05 18:06:20 +00:00
Erik van Velzen
464ceb1c0e satellite/gc: improve comments
Change-Id: I9e71c9bee3447f78365ba1593e4a4ef55b28356f
2023-03-08 13:15:13 +00:00
Qweder93
d6a948f59d satellite/repair : implemented ranged loop observer
Implemented observer and partial, and created new structures so that mon
metrics stay the same as in the segment loop.

Change-Id: I209c126096c84b94d4717332e56238266f6cd004
2023-01-23 14:23:03 +00:00
Erik van Velzen
2d863759b0 satellite/metabase/rangedloop: add AsOfSystemTime
Add option AsOfSystemTime to segment provider to make it equivalent to
the old segment loop.

There's no comment on what it does because it's pretty complex and
makes no sense, but we can improve it later.

closes https://github.com/storj/storj/issues/5434

Change-Id: I8f09b03803e681e2fd41008c5dba67804b0f37a1
2023-01-11 16:22:18 +00:00
Andrew Harding
5362dff94b satellite/gc/bloomfilter: implement rangedloop observer
https://github.com/storj/storj/issues/5235

Change-Id: Iffe8f682adfa46e48e47976bf838326e7125ff80
2023-01-03 09:46:02 -07:00
Michal Niewrzal
75b77d53ff satellite/gc/sender: avoid sending BF to disqualified and exited nodes
We don't want to waste our time on disqualified and exited nodes.

Change-Id: I11709350ad291c24f3b46670dd6a418c0ddbb44f
2022-11-29 09:56:32 +00:00
Ethan
9a09d8920e satellite/gc: Upload bloomfilters with prefix and update LATEST when complete
Change the bloomfilter generation process to prefix the objects with a date and update the LATEST object with the prefix. The sender will read the LATEST file to get the prefix to process.

Change-Id: Iae0d3c49015d57f391d87789fb799a7d774066bf
2022-11-01 21:24:46 +00:00
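
The prefix-then-LATEST handoff described above might look roughly like the following sketch. It assumes an in-memory map in place of a real bucket, and the names `bucket`, `uploadFilters` and `latestKeys` are hypothetical; only the LATEST object name comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// bucket is a hypothetical in-memory stand-in for an object store.
type bucket map[string][]byte

// uploadFilters writes every object under the given prefix and only then
// updates LATEST, so a reader never sees a prefix that is still being written.
func uploadFilters(b bucket, prefix string, objects map[string][]byte) {
	for name, data := range objects {
		b[prefix+"/"+name] = data
	}
	b["LATEST"] = []byte(prefix)
}

// latestKeys reads LATEST and returns the sorted keys under that prefix.
func latestKeys(b bucket) []string {
	prefix, ok := b["LATEST"]
	if !ok {
		return nil
	}
	var keys []string
	for key := range b {
		if strings.HasPrefix(key, string(prefix)+"/") {
			keys = append(keys, key)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	b := bucket{}
	uploadFilters(b, "2022-10-31", map[string][]byte{"batch-0.zip": {1}})
	uploadFilters(b, "2022-11-01", map[string][]byte{"batch-0.zip": {2}, "batch-1.zip": {3}})
	fmt.Println(latestKeys(b))
}
```

Writing LATEST last is the design point: the sender only ever follows a pointer to a complete set of objects.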
Ethan
4efde65c9e satellite/gc: Optionally run the GC bloomfilter process once, instead of in a loop
The current deployment strategy requires that the GC bloomfilter generation process executes only once and exits.

Change-Id: I952991f126596aa165d1f2e9fce6f8548c21bdba
2022-11-01 18:19:40 +00:00
Egon Elbre
ff22fc7ddd all: fix deprecated ioutil commands
Change-Id: I59db35116ec7215a1b8e2ae7dbd319fa099adfac
2022-10-11 15:27:29 +00:00
Erik van Velzen
d2a67fb8f7 satellite/gc/sender: concurrency
Restore functionality where retain filters can be sent out to multiple
storage nodes simultaneously.

Fixes https://github.com/storj/team-metainfo/issues/121

Change-Id: I2bf86a166b09c6a277c1cb455cdca0165ce6b8af
2022-09-27 08:10:10 +00:00
Erik van Velzen
6ee3993f6c satellite/gc: e2e gc test
Restore previously existing end-to-end garbage collection test using
the new separate services for bloom filter generation and storage node
communication.

Original tests can be found under:
https://github.com/storj/storj/blob/v1.63.1/satellite/gc/gc_test.go

Change-Id: I42d1ab0f9981dfe183140da4d08087f4a6cd9296
2022-09-26 07:56:35 +00:00
Michal Niewrzal
a22e6bdf67 satellite/gc/bloomfilter: use int64 to count pieces
Pieces count in DB are stored as int64 and we would like to align bloom
filter processing with this type.

Change-Id: Iaec767e609a40d802077ae057520541805a7c44f
2022-09-22 09:39:53 +00:00
Michal Niewrzal
c3ca98f552 satellite/gc/bloomfilter: minor cleanups
* service was running wrong RunOnce method
* after doing integration with GC sender we concluded that we don't
need special flag "gc-sender" to be uploaded as it's safe to consume
partial results by GC sender. This part was removed.
* prefix format for moving data after error was unified with GC
sender

https://github.com/storj/team-metainfo/issues/120
Change-Id: I204b696b9c2def4874ad1d17d0e84231cc98d583
2022-09-20 18:29:00 +02:00
Erik van Velzen
e6b5501f9b satellite/gc/sender: new service to send retain filters
Implement a new service to read retain filter from a bucket and
send them out to storagenodes.

This allows the retain filters to be generated by a separate command on
a backup of the database.

Parallelism (setting ConcurrentSends) and end-to-end garbage collection
tests will be restored in a subsequent commit.

Solves https://github.com/storj/team-metainfo/issues/121

Change-Id: Iaf8a33fbf6987676cc3cf74a18a8078916fe673d
2022-09-20 11:49:40 +00:00
Michal Niewrzal
90eded4d99 satellite/gc/bloomfilter: take CreationDate from latest segment
Bloom filter CreationDate is used to avoid deleting pieces that
were not processed by GC. Every piece created after that timestamp
won't be deleted. The current GC process takes CreationDate as the
beginning of bloom filter creation. This approach avoids issues
with an inconsistent view of the DB, as currently we are using the live
DB to create bloom filters.

With the approach where we use a DB snapshot with the segment loop,
we can get CreationDate from the latest created segment in the DB. Every
piece created after the latest created segment won't be touched by GC on
the storage node.

Updates https://github.com/storj/team-metainfo/issues/120

Change-Id: I6aaf64948ab7f60cfea62195689ad77c25ea772e
2022-09-15 11:59:53 +00:00
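
The rule described in this message can be illustrated with a small sketch. The types and helpers here (`segment`, `latestCreationDate`, `mustKeep`, `day`) are hypothetical and much simpler than the real metabase types; the sketch only shows taking the newest segment timestamp as the filter's CreationDate and keeping anything created after it.

```go
package main

import (
	"fmt"
	"time"
)

// segment is a hypothetical, reduced view of a segment row.
type segment struct {
	CreatedAt time.Time
}

// latestCreationDate returns the newest CreatedAt among the segments,
// or the zero time if there are none.
func latestCreationDate(segments []segment) time.Time {
	var latest time.Time
	for _, s := range segments {
		if s.CreatedAt.After(latest) {
			latest = s.CreatedAt
		}
	}
	return latest
}

// mustKeep reports whether a piece is newer than the filter's CreationDate
// and therefore must not be deleted, whatever the filter says.
func mustKeep(pieceCreated, filterCreationDate time.Time) bool {
	return pieceCreated.After(filterCreationDate)
}

// day is a small helper that builds a fixed UTC date for the example.
func day(n int) time.Time {
	return time.Date(2022, time.September, n, 0, 0, 0, 0, time.UTC)
}

func main() {
	snapshot := []segment{{day(1)}, {day(3)}, {day(2)}}
	creationDate := latestCreationDate(snapshot)
	fmt.Println(creationDate.Format("2006-01-02"))
	fmt.Println(mustKeep(day(5), creationDate), mustKeep(day(2), creationDate))
}
```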
Michal Niewrzal
158eb2381e satellite/gc/bloomfilter: uploading bloom filters
We would like to have separate process/command to collect bloom
filters from source different than production DBs. Such process will
use segment loop to build bloom filters for all storage nodes and
will send it to Storj bucket.

This change adds the main logic to the new service. After collecting all
bloom filters with segment loop and piece tracker, all filters are
marshaled and packed into zip files. Each zip contains up to
"ZipBatchSize" bloom filters and is uploaded to the bucket specified in
configuration.

All uploaded objects have an expiration time set so that they don't have
to be deleted manually.

Updates https://github.com/storj/team-metainfo/issues/120

Change-Id: I2b6bc02a7dd7c3a639e75810fd013ae4afdc80a2
2022-09-12 08:33:53 +00:00
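
Packing marshaled filters into zip files of bounded size could be sketched as follows, using only the standard library. This is an assumption-laden illustration rather than the service's code: `packFilters` and `countEntries` are hypothetical names, and the `batchSize` parameter plays the role of "ZipBatchSize" from the message.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"sort"
)

// packFilters packs the marshaled filters into zip archives holding at most
// batchSize entries each. Names are sorted so the result is deterministic.
func packFilters(filters map[string][]byte, batchSize int) ([][]byte, error) {
	if batchSize <= 0 {
		return nil, fmt.Errorf("batch size must be positive, got %d", batchSize)
	}
	names := make([]string, 0, len(filters))
	for name := range filters {
		names = append(names, name)
	}
	sort.Strings(names)

	var archives [][]byte
	for start := 0; start < len(names); start += batchSize {
		end := start + batchSize
		if end > len(names) {
			end = len(names)
		}
		var buf bytes.Buffer
		zw := zip.NewWriter(&buf)
		for _, name := range names[start:end] {
			w, err := zw.Create(name)
			if err != nil {
				return nil, err
			}
			if _, err := w.Write(filters[name]); err != nil {
				return nil, err
			}
		}
		// Close flushes the central directory; the archive is invalid without it.
		if err := zw.Close(); err != nil {
			return nil, err
		}
		archives = append(archives, buf.Bytes())
	}
	return archives, nil
}

// countEntries opens an archive and returns how many files it contains.
func countEntries(archive []byte) (int, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return 0, err
	}
	return len(zr.File), nil
}

func main() {
	filters := map[string][]byte{"node-a": {1}, "node-b": {2}, "node-c": {3}}
	archives, err := packFilters(filters, 2)
	if err != nil {
		panic(err)
	}
	for i, a := range archives {
		n, err := countEntries(a)
		if err != nil {
			panic(err)
		}
		fmt.Printf("archive %d: %d filters\n", i, n)
	}
}
```

Each returned archive would then be uploaded as one object; expiration is a property of the upload and is not modeled here.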
Michal Niewrzal
d905931ed9 private/testplanet: integrate GC bloom filter service
We would like to have separate process/command to collect bloom
filters from source different than production DBs. Such process will
use segment loop to build bloom filters for all storage nodes and
will send it to Storj bucket.
This change adds integration with testplanet which makes writing
unit tests possible.

Updates https://github.com/storj/team-metainfo/issues/120

Change-Id: I7b335c5dafa8cffe265c56b75d8c8f8567580893
2022-09-02 11:52:45 +00:00
Michal Niewrzal
68f6d93f29 satellite/gc/bloomfilter: add service to collect bloom filters
We would like to have separate process/command to collect bloom
filters from source different than production DBs. Such process will
use segment loop to build bloom filters for all storage nodes and
will send it to Storj bucket. This is the initial change to add such
service. The added service joins the segment loop and collects all
bloom filters.

Sending bloom filters to the bucket will be added as a subsequent
change.

Updates https://github.com/storj/team-metainfo/issues/120

Change-Id: I2551723605afa41bec84826b0c647cd1f61f3b14
2022-09-02 08:10:46 +00:00
Michal Niewrzal
6cc2052f47 satellite: fix segment loop observers metrics
We made an optimization for segment loop observers to avoid
heavy monkit initialization on each call. It was applied to very
frequently executed methods. Unfortunately we used the wrong monkit
method to track function times. Instead of mon.Task we used
mon.Func().

https://github.com/spacemonkeygo/monkit#how-it-works

Change-Id: I9ca454dbd828c6b43ba09ca75c341991d2fd73a8
2022-08-10 14:13:16 +00:00
Michał Niewrzał
7a2d2a36ca satellite: use more optimal monkit call for loop observers methods
Recently we applied this optimization to metrics observer and time
used by its method dropped from 12m to 3m for us1 (220m segments).
It looks like it makes sense to apply the same code to all observers.

Change-Id: I05898aaacbd9bcdf21babc7be9955da1db57bdf2
2022-05-20 11:03:41 +00:00
Michał Niewrzał
456aea727e satellite: use PieceIDDeriver for derivation
We can use PieceIDDeriver in all places where we are deriving id from
the same id multiple times. We have several such places: gc, segment
deletion, segment validation, order limit creation. Using it should
save some resources.

Change-Id: I24668d516c0f7cea4aec6470614067734149501d
2022-05-19 06:31:42 +00:00
Michał Niewrzał
99ec4c8869 satellite/gc: improve test for copies
Initial space used for pieces is calculated, not retrieved
from storage nodes, and at the end of the test we also delete
copies that become ancestors to verify that all data
was removed from storage nodes.

Change-Id: I9804adb9fa488dc0094a67a6e258c144977e7f5d
2022-04-11 11:06:01 +00:00
Michał Niewrzał
c105562479 satellite/gc: test GC with object copies
We implemented server-side copy feature and we would like to
confirm that it is not affecting GC.

Fixes https://github.com/storj/storj/issues/4696

Change-Id: Id391f0badf5fce51f9910f0df732d477b07fa7ac
2022-04-07 11:32:35 +00:00
Michał Niewrzał
0bde845a17 satellite/metabase: don't delete pieces when deleting ancestor object
Fixes https://github.com/storj/storj/issues/4613

Change-Id: I3d6217a618a2a685256471f0394a143a323ac044
2022-03-21 09:32:26 +00:00
Egon Elbre
64c8de6ea5 mod: use vendored base58
Change-Id: I5aa29515928848c862500330218cc094618638d7
2022-01-31 15:54:33 +02:00
Michał Niewrzał
c258f4bbac private/testplanet: move Metabase outside Metainfo for satellite
At some point we moved metabase package outside Metainfo
but we didn't do that for satellite structure. This change
refactors only tests.
Once uplink is adjusted we can remove old entries in
Metainfo struct.

Change-Id: I2b66ed29f539b0ec0f490cad42c72840e0351bcb
2021-09-09 07:15:51 +00:00
Egon Elbre
ca64e55281 satellite/gc: remove skip first
We used this to reduce initial load on the core to avoid OOM. However,
this is not a problem anymore with garbage collection running
separately.

Change-Id: Ifd62c822a74974bc21a5913199334469a4bc0130
2021-06-21 18:30:38 +00:00
Egon Elbre
9b2607d6ba satellite: remove garbage collection option from core
We don't run it anywhere in this configuration, so it's not worthwhile
to keep it that way.

Change-Id: I88afb8bb3eb3843801b15454408f10d1353596cb
2021-06-15 21:07:02 +03:00
JT Olio
da9ca0c650 testplanet/satellite: reduce the number of places default values need to be configured
Satellites set their configuration values to default values using
cfgstruct, however, it turns out our tests don't test these values
at all! Instead, they have a completely separate definition system
that is easy to forget about.

As is to be expected, these values have drifted, and it appears
in a few cases test planet is testing unreasonable values that we
won't see in production, or perhaps worse, features enabled in
production were missed and weren't enabled in testplanet.

This change makes it so all values are configured the same,
systematic way, so it's easy to see when test values are different
than dev values or release values, and it's harder to forget
to enable features in testplanet.

In terms of reviewing, this change should be actually fairly
easy to review, considering private/testplanet/satellite.go keeps
the current config system and the new one and confirms that they
result in identical configurations, so you can be certain that
nothing was missed and the config is all correct.
You can also check the config lock to see what actual config
values changed.

Change-Id: I6715d0794887f577e21742afcf56fd2b9d12170e
2021-06-01 22:14:17 +00:00
Michał Niewrzał
e76cbc9bd5 satellite/gc: move GC to segments loop
This change is a refactor to move GC from metainfo loop
(objects/segments) to segments loop.

Change-Id: I21f1ff7cb0b6f98c41aa8930447b8d9bea227975
2021-06-01 20:36:02 +00:00
Egon Elbre
69b149a66f mod: bump uplink
uplink stopped using zap, hence some of the private methods needed to be
changed.

Change-Id: Iac1fae45a40cd3f1649b9f672bf8c250344986d5
2021-05-06 14:48:36 +00:00
Egon Elbre
961e841bd7 all: fix error naming
errs.Class should not contain "error" in the name, since that causes a
lot of stutter in the error logs. As an example a log line could end up
looking like:

    ERROR node stats service error: satellitedbs error: node stats database error: no rows

Whereas something like:

    ERROR nodestats service: satellitedbs: nodestatsdb: no rows

Would contain all the necessary information without the stutter.

Change-Id: I7b7cb7e592ebab4bcfadc1eef11122584d2b20e0
2021-04-29 15:38:21 +03:00
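
The stutter described above is easy to reproduce. The sketch below does not use the real errs package; `class` and its `Wrap` method are a hypothetical stand-in built on the standard library that prefixes the class name in the same way, which is enough to show both log lines from the message.

```go
package main

import (
	"errors"
	"fmt"
)

// class is a hypothetical stand-in for an error class: a name that is
// prefixed to every error it wraps.
type class string

// Wrap prefixes the class name to err, or returns nil for a nil error.
func (c class) Wrap(err error) error {
	if err == nil {
		return nil
	}
	return fmt.Errorf("%s: %w", string(c), err)
}

var errNoRows = errors.New("no rows")

// stuttering wraps the error with class names that each contain "error".
func stuttering() error {
	db := class("node stats database error")
	dbs := class("satellitedbs error")
	service := class("node stats service error")
	return service.Wrap(dbs.Wrap(db.Wrap(errNoRows)))
}

// concise wraps the same error with short class names.
func concise() error {
	db := class("nodestatsdb")
	dbs := class("satellitedbs")
	service := class("nodestats service")
	return service.Wrap(dbs.Wrap(db.Wrap(errNoRows)))
}

func main() {
	fmt.Println("ERROR", stuttering())
	fmt.Println("ERROR", concise())
}
```

Both chains still unwrap to the same underlying error, so shortening the names loses no information for callers that match on it.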
Michał Niewrzał
7944df20d6 storj: use multipart API
Change-Id: I10b401434e3e77468d12ecd225b41689568fd197
2021-04-26 13:15:09 +00:00
Egon Elbre
4c9ed64f75 satellite/metabase/metaloop: move loop under metabase
Currently the loop handling is heavily related to the metabase rather
than metainfo.

metainfo over time has become related to the "public API" for accessing
the metabase data.

Currently updates monkit.lock, because monkit monitoring does not handle
ScopeNamed correctly. Needs a followup change to monitoring check.

Change-Id: Ie50519991d718dfb872ec9a0176a82e732c97584
2021-04-22 12:58:09 +03:00
Egon Elbre
267506bb20 satellite/metabase: move package one level higher
metabase has become a central concept and it's more suitable for it to
be directly nested under satellite rather than being part of metainfo.

metainfo is going to be the "endpoint" logic for handling requests.

Change-Id: I53770d6761ac1e9a1283b5aa68f471b21e784198
2021-04-21 15:54:22 +03:00
Fadila Khadar
bde367ae73 satellite/gc: check on bloom filter creation date
Check that the bloom filter creation date is earlier than the
metainfo loop system time used for db scanning.

Change-Id: Ib0f47c124f5651deae0fd7e7996abcdcaac98fb4
2021-04-14 16:40:37 +00:00
Kaloyan Raev
035c393da0 satellite: update tests to pass etag.Reader to multipart.PutObjectPart
Change-Id: Ibe99357945ae7a91f5b5d4f87b83d425c9fa84a5
2021-03-29 13:18:11 +00:00
Egon Elbre
f19ef4afe5 satellite/metainfo/metaloop: move loop to a separate package
Change-Id: I94c931a27c1af6062185ec62688624ec02050f11
2021-03-23 15:37:34 +00:00
Michał Niewrzał
9a60011774 Merge remote-tracking branch 'origin/main' into multipart-upload
Change-Id: Ia90f29be432e207c4125f7f955c912978eabe59a
2021-02-04 09:38:08 +01:00
Egon Elbre
c4578eb3ec satellite/gc: add test for pending object
Change-Id: Ifb076ab38442f88f94a3e0c2ae1b19528a55f724
2020-12-22 09:42:32 +00:00
Fadila Khadar
724b0f91eb satellite/gc: update tests to use metabase
Change-Id: I13c6c02a46254ea1d7176c0c6045fd24dd117a58
2020-12-16 10:38:24 +00:00
Kaloyan Raev
fc85179a19 satellite/metainfo: refactor SegmentLocation.Index to SegmentPosition
Change-Id: Ic9403c8126712693326dd83d6ba4f3b84be3e0c7
2020-12-14 13:35:53 +02:00
Stefan Benten
494bd5db81 all: golangci-lint v1.33.0 fixes (#3985)
2020-12-05 17:01:42 +01:00
Ivan Fraixedes
7eb3b2d6d0 satellite/gc: Init map with an aprox size
The PieceTracker receives a piece count per node. The number of entries
approximates the number of nodes that are going to be reported by the
metainfo loop, so we can use it as a good guess of the map's size and
initialize the map with it.

Change-Id: I644db40926c03e4c457457fb41d2ec1da059cea6
2020-11-27 10:44:19 +01:00
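
Pre-sizing a map from a known approximate count is a one-line change in Go. A minimal sketch, with hypothetical names (`nodeID`, `pieceTracker`, `newPieceTracker`, `add`) that only loosely mirror the tracker mentioned above:

```go
package main

import "fmt"

// nodeID is a hypothetical stand-in for a storage node identifier.
type nodeID string

// pieceTracker collects pieces per node while a loop walks the segments.
type pieceTracker struct {
	pieces map[nodeID][]string
}

// newPieceTracker sizes the map from the number of nodes in pieceCounts.
// The capacity is only a hint: the map still grows if more nodes appear.
func newPieceTracker(pieceCounts map[nodeID]int64) *pieceTracker {
	return &pieceTracker{
		pieces: make(map[nodeID][]string, len(pieceCounts)),
	}
}

// add records a piece for the given node.
func (t *pieceTracker) add(node nodeID, piece string) {
	t.pieces[node] = append(t.pieces[node], piece)
}

func main() {
	counts := map[nodeID]int64{"node-a": 10, "node-b": 20}
	tracker := newPieceTracker(counts)
	tracker.add("node-a", "piece-1")
	tracker.add("node-c", "piece-2") // a node missing from counts still works
	fmt.Println(len(tracker.pieces))
}
```

The hint avoids repeated rehashing while the map fills up; it does not change behavior when the guess is off.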
Kaloyan Raev
92a2be2abd satellite/metainfo: get away from using pb.Pointer in Metainfo Loop
As part of the Metainfo Refactoring, we need to make the Metainfo Loop
work with both the current PointerDB and the new Metabase. Thus, the
Metainfo Loop should pass to the Observer interface more specific Object
and Segment types instead of pb.Pointer.

After this change, there are still a couple of use cases that require
access to the pb.Pointer (hence we have it as a field in the
metainfo.Segment type):
1. Expired Deletion Service
2. Repair Service

It would require additional refactoring in these two services before we
are able to clean this.

Change-Id: Ib3eb6b7507ed89d5ba745ffbb6b37524ef10ed9f
2020-10-27 13:06:47 +00:00
Michal Niewrzal
9202295348 satellite/metainfo: replace ScopedPath with metabase.SegmentLocation
Change-Id: I7e89c9e8eaeae58be828a32ad47ed3028501f4c7
2020-09-04 10:06:52 +00:00