234 Commits

Author SHA1 Message Date
Michal Niewrzal
aba2f14595 satellite/metabase/rangedloop: few additions for monitoring
Additional elements added:
* monkit metrics for observer methods such as Start/Fork/Join/Finish, to
be able to check how much time those methods take
* a few more logs, e.g. entries with the processed range
* segmentsProcessed metric to be able to check loop progress

Change-Id: I65dd51f7f5c4bdbb4014fbf04e5b6b10bdb035ec
2023-02-17 08:46:00 +00:00
Michal Niewrzal
41bcc6bb62 satellite/metainfo: fix duplicates while listing committed objects
We have an issue where an object can appear on two different listing pages.
This happens because the protobuf listing cursor doesn't include the version,
and we can now have versions higher than 1 internally. On the satellite side,
version 1 was always used as the default cursor version.

As a workaround for the existing libuplink implementation, we will
always use the maximum version for the listing cursor on the satellite side.

Fixing the protobuf and libuplink implementation will happen later.

https://github.com/storj/storj/issues/5570

Change-Id: Ibd27b174556c9d8b8bd60fab8cff7862fd11e994
2023-02-14 14:47:27 +01:00
Michal Niewrzal
94d341bcf3 satellite: use ranged loop with GC-BF peer
The peer for generating bloom filters will be able to use the ranged loop.

In addition, some cleanups were made:
* removed unused parts of the GC BF peer (identity, version control)
* added the missing Close method for the ranged loop service
* added some additional tests

https://github.com/storj/storj/issues/5545

Change-Id: I9a3d85f5fffd2ebc7f2bf7ed024220117ab2be29
2023-02-13 18:32:21 +00:00
Egon Elbre
b14019b8c5 satellite/{metabase/rangedloop,metainfo/piecedeletion}: fix flaky tests
TestLoopContinuesAfterObserverError was failing because the system
timer granularity could measure the duration as 0.

TestDialer_DialTimeout was failing because the connection failure arrived
with a delay and wasn't being handled.

Change-Id: I4638c86f5d021a86c3d3529fab13cf3608f35c40
2023-02-09 16:07:00 +02:00
Egon Elbre
dcebcf770d satellite/metabase: add database comments
Change-Id: I78e19fb9301044d076e3eabf7e92fa6b715ddf03
2023-02-07 11:42:21 +02:00
Michal Niewrzal
4e0c7b2d90 satellite/metabase/metabasetest: fix race while running tests
Change-Id: I4fdb328443617bc9710ee6b9168b31870fe336d9
2023-02-02 10:35:28 +00:00
Michal Niewrzal
bd8867cd09 satellite: adjust code to handle context cancelation for SQL queries
Our DB support in storj/private was updated to enable basic context
support for executing SQL queries. This change requires some small
adjustments as not all parts were working correctly.

storj/private commit with change:
4bc77107b7acfcc2f7ad65796d5dd3d7c64801e4

Change-Id: I64d7ed92788ea0920d12cecd1aa0e414720e9b9c
2023-01-27 10:07:43 +01:00
Michal Niewrzal
b5c5c62d7b satellite/metabase: add missing error check
Change-Id: I6891f0647cb8e4a8dd6534eaa3588bbe76e2721d
2023-01-26 11:18:14 +00:00
Michal Niewrzal
bb2ac4279a satellite/metainfo: enable multiple versions fix by default
Change-Id: I6cc7ba928e59cc8b8fa50f2ab19ec5418dc76507
2023-01-26 09:35:20 +00:00
Michal Niewrzal
8850fde9f5 satellite/metabase/metabasetest: detect full scan table queries
This is an automated check around the metabase tests. It detects queries
that performed a full table scan during test execution.
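
A minimal sketch of how such a check could look, assuming the test harness captures EXPLAIN output as text. CockroachDB prints unconstrained scans as "spans: FULL SCAN"; the function name hasFullScan and both sample plans are hypothetical and are not taken from this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// hasFullScan reports whether an EXPLAIN plan contains an unconstrained scan.
// CockroachDB marks such nodes with "spans: FULL SCAN".
func hasFullScan(explainOutput string) bool {
	for _, line := range strings.Split(explainOutput, "\n") {
		if strings.Contains(strings.ToUpper(line), "FULL SCAN") {
			return true
		}
	}
	return false
}

// Hypothetical plans, for illustration only.
const constrainedPlan = "• scan\n  table: segments@segments_pkey\n  spans: [/1 - /1]"
const fullScanPlan = "• scan\n  table: segments@segments_pkey\n  spans: FULL SCAN"

func main() {
	fmt.Println(hasFullScan(constrainedPlan))
	fmt.Println(hasFullScan(fullScanPlan))
}
```

A test wrapper could run EXPLAIN for every executed query and fail the test when the helper returns true.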

After merging this and checking that it is not problematic in any way,
we will also enable it for testplanet tests.

One query was adjusted to avoid a full table scan and pass the new check.

https://github.com/storj/storj/issues/5471

Change-Id: I2d176f0c25deda801e8a731b75954f83d18cc1ce
2023-01-23 19:40:20 +00:00
Qweder93
d6a948f59d satellite/repair: implemented ranged loop observer
Implemented observer and partial, and created new structures so that mon
metrics keep working the same way as in the segment loop.

Change-Id: I209c126096c84b94d4717332e56238266f6cd004
2023-01-23 14:23:03 +00:00
Yaroslav Vorobiov
5644fb1a7e satellite/accounting/nodetally: add ranged loop
Add node tally ranged loop observer and partial.
Add node tally ranged observer to ranged loop peer.
Add config flag to select which loop to use for node tally.
Update satellite core to use segment/ranged loop based on a flag.
Duplicate existing node tally test but using ranged loop.

Change-Id: I6786f1a16933463fab5f79601bf438203a7a5f9e
2023-01-17 13:50:18 +01:00
Andrew Harding
c5b5695bca satellite/metabase/rangedloop: clean up observerstats init
Small cleanups of the observer stats init code:
1. Use sync.Once for race free addition to the monitoring chain
   (purely defensive)
2. Set the observer durations before adding to the monitoring chain on
   first use.
3. observerDurations slice does not need to be initialized to non-nil
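
The three points above can be sketched roughly as follows. This is an illustration only: the chain and observerStats types are hypothetical stand-ins, not the actual rangedloop or monkit types.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// chain is a hypothetical stand-in for a monitoring chain.
type chain struct {
	mu      sync.Mutex
	sources int
}

func (c *chain) add() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.sources++
}

func (c *chain) count() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.sources
}

// observerStats is a hypothetical stats holder that registers itself lazily.
type observerStats struct {
	once      sync.Once
	mu        sync.Mutex
	durations []time.Duration // stays nil until the first update
}

// update stores the durations first, then registers with the chain exactly
// once, no matter how many goroutines call it.
func (s *observerStats) update(c *chain, durations []time.Duration) {
	s.mu.Lock()
	s.durations = durations
	s.mu.Unlock()
	s.once.Do(c.add)
}

// registerConcurrently calls update from n goroutines and returns how many
// times the stats ended up registered with the chain.
func registerConcurrently(n int) int {
	c := &chain{}
	s := &observerStats{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.update(c, []time.Duration{time.Second})
		}()
	}
	wg.Wait()
	return c.count()
}

func main() {
	fmt.Println(registerConcurrently(16))
}
```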

Change-Id: I9ae8ec96debc7d52c4ee5d22762a89f21bb2e38c
2023-01-13 10:40:30 +00:00
Erik van Velzen
ed910b6087 satellite/metabase/rangedloop: continue after error (#5430)
When an observer errors we still want to finish the other observers.

This change stores the error and continues the loop, skipping
the observer which errored out and setting its duration metric to -1.

When the error occurs in the process stage, the loop still continues with the
other ranges of the same observer and removes the observer entirely after the
process stage. Improving this would make it more complex due to race
conditions.
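
The behavior described above could be sketched as follows. This is a simplified, sequential illustration under stated assumptions: the observer and result types and the runLoop and failingOn helpers are hypothetical, and the real loop processes ranges in parallel through the Start/Fork/Join/Finish interface.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// observer is a hypothetical, simplified observer with one callback per range.
type observer struct {
	name    string
	process func(rangeIndex int) error
}

// result records the outcome for one observer.
type result struct {
	name     string
	err      error
	duration time.Duration // -1 marks an observer that errored
}

// runLoop processes every range for every observer. An error is stored, the
// remaining ranges are still processed, and the errored observer is marked
// with a duration of -1 once the process stage is over.
func runLoop(observers []observer, ranges int) []result {
	results := make([]result, len(observers))
	for i, obs := range observers {
		results[i].name = obs.name
	}
	for r := 0; r < ranges; r++ {
		for i, obs := range observers {
			start := time.Now()
			if err := obs.process(r); err != nil && results[i].err == nil {
				results[i].err = err
			}
			results[i].duration += time.Since(start)
		}
	}
	for i := range results {
		if results[i].err != nil {
			results[i].duration = -1
		}
	}
	return results
}

// failingOn returns an observer that fails on range bad and counts its calls.
func failingOn(name string, bad int, calls *int) observer {
	return observer{name: name, process: func(r int) error {
		*calls++
		if r == bad {
			return errors.New("range failed")
		}
		return nil
	}}
}

func main() {
	var okCalls, badCalls int
	results := runLoop([]observer{
		failingOn("healthy", -1, &okCalls),
		failingOn("broken", 1, &badCalls),
	}, 3)
	for _, res := range results {
		fmt.Println(res.name, res.err, res.duration)
	}
}
```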

Closes https://github.com/storj/storj/issues/5389

Change-Id: I528432c491d4340817d6950f1200ee2b9e703309
2023-01-11 22:23:17 +01:00
Erik van Velzen
2d863759b0 satellite/metabase/rangedloop: add AsOfSystemTime
Add option AsOfSystemTime to segment provider to make it equivalent to
the old segment loop.

There's no comment on what it does because it's pretty complex and
makes no sense, but we can improve it later.

closes https://github.com/storj/storj/issues/5434

Change-Id: I8f09b03803e681e2fd41008c5dba67804b0f37a1
2023-01-11 16:22:18 +00:00
Michal Niewrzal
282aaf8945 satellite/metabase: fix GetStreamPieceCountByNodeID full table scan
Previous version of SQL query was causing full table scan.

Output of EXPLAIN:
---
distribution: local
vectorized: true

• lookup join
│ table: segments@segments_pkey
│ equality: (?column?) = (stream_id)
│ pred: remote_alias_pieces IS NOT NULL
│
└── • union
    │
    ├── • values
    │     size: 1 column, 1 row
    │
    └── • scan
          missing stats
          table: segment_copies@segment_copies_pkey
          spans: [/'\xff135285155378d980b8c49148cef3ca' - /'\xff135285155378d980b8c49148cef3ca']
---

Change-Id: I708d1df204ac2d33cefe80b23594442b193424d2
2023-01-10 23:35:22 +00:00
Erik van Velzen
23b92da490 satellite/metabase/rangedloop: live reporting (#5366)
Add an observer to monitor ranged segment loop progress.

Tested by running the segment loop in storj-up and navigating to
http://<container>:11111/mon/stats and there is the entry:

rangedloop-live,scope=storj.io/storj/satellite/metabase/rangedloop numSegments=364523630000.000000

part of https://github.com/storj/storj/issues/5223

Change-Id: If3d2774d2f17f51eac86f47c6dda1fb8ad696dfe
2023-01-06 09:49:14 +01:00
Erik van Velzen
1d4411f166 satellite/metabase/rangedloop: cancellation (#5364)
Support interruption of the ranged segment loop through context.
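
As a generic illustration of context-based interruption (not the actual rangedloop code), a loop can check the context before each batch. The names processBatches and runAndCancelAt are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// processBatches handles batches in order and stops as soon as ctx is done.
// It returns the number of batches handled and the context error, if any.
func processBatches(ctx context.Context, batches int, handle func(i int)) (int, error) {
	for i := 0; i < batches; i++ {
		if err := ctx.Err(); err != nil {
			return i, err
		}
		handle(i)
	}
	return batches, nil
}

// runAndCancelAt processes total batches and cancels the context while
// handling batch cancelAt, so that batch is the last one handled.
func runAndCancelAt(total, cancelAt int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return processBatches(ctx, total, func(i int) {
		if i == cancelAt {
			cancel()
		}
	})
}

func main() {
	handled, err := runAndCancelAt(10, 2)
	fmt.Println(handled, err)
}
```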

Part of https://github.com/storj/storj/issues/5223

Change-Id: Iae0260e250f8ea33affed95c6592a1f42df384eb
2023-01-05 16:32:30 +01:00
Qweder93
8c69ee62fc {cmd/storj-sim, satellite/rangedloop}: added rangedloop to storj-sim, removed identity
Added a rangedloop to storj-sim for each satellite to verify that it works for the metrics observer,
removed identity from the rangedloop peer as we never use it, added logs on service run, added a loop
to the service instead of an endless for loop, and added an interval value to the config.

Closes: https://github.com/storj/storj/issues/5414

Change-Id: Ibc3b06071b68feda4a35b45da2bbe36e22a02fc8
2023-01-05 11:29:00 +00:00
Erik van Velzen
1da9fd1eee satellite/metabase/rangedloop: monkit durations (#5365)
Wire up duration measurement of observers with monkit.

Tested by attaching a SleepObserver, starting the rangedloop in storj-up
and navigating to http://<container>:11111/mon/stats. It reports the
following statistic:

completed-observer-duration,observer=*rangedlooptest.SleepObserver,scope=storj.io/storj/satellite/metabase/rangedloop duration=10.000117

Change-Id: Ief131d34001dd5d3ba1d7be6f161986e1f66440d
2023-01-04 12:16:47 +00:00
Michal Niewrzal
77afdae741 satellite/metabase: handle target pending/committed objects while move
Before we introduced object versions internally, the move operation
always failed when an object existed at the target location. But then we
only ever had a single version, 1. With versions other than 1
we need to check all existing objects at the target location.

To be backward compatible with our API, the new logic looks like this:
* if there is no object at the target location, use the source object version
as the target version
* if there are only pending objects, find the first free (highest) version
that can be used to move the object there
* if there is a committed object at the target location, reject the move
operation
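
The three rules above could be sketched as follows. This is a hedged illustration: the types, the targetVersion function, and the reading of "first free (highest) version" as "highest existing version plus one" are assumptions, not the metabase implementation.

```go
package main

import (
	"errors"
	"fmt"
)

type status int

const (
	pending status = iota
	committed
)

// object is a hypothetical, minimal view of an object at the target location.
type object struct {
	version int64
	status  status
}

var errTargetExists = errors.New("committed object already exists at target location")

// targetVersion picks the version for the moved object, or rejects the move.
func targetVersion(sourceVersion int64, existing []object) (int64, error) {
	if len(existing) == 0 {
		// Nothing at the target: keep the source version.
		return sourceVersion, nil
	}
	var highest int64
	for _, o := range existing {
		if o.status == committed {
			// A committed object at the target rejects the move.
			return 0, errTargetExists
		}
		if o.version > highest {
			highest = o.version
		}
	}
	// Only pending objects: assumed to mean one above the highest version.
	return highest + 1, nil
}

func main() {
	fmt.Println(targetVersion(1, nil))
	fmt.Println(targetVersion(1, []object{{version: 2}, {version: 5}}))
	fmt.Println(targetVersion(1, []object{{version: 3, status: committed}}))
}
```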

Fixes https://github.com/storj/storj/issues/5403

Change-Id: I717f3e7c42470b406287d6ec335f6f057d3fc3b5
2023-01-04 08:50:51 +00:00
Erik van Velzen
37b4981cc0 satellite/metabase/rangedloop: measure observer duration (#5350)
Track duration of all segment loop observers. Factor out functions to
reduce size.

Still need to send the measurements out via monkit.

Part of https://github.com/storj/storj/issues/5223

Change-Id: Iae0260e250f8ea33affed95c6592a1f42df384eb
2022-12-21 21:58:08 +01:00
Egon Elbre
04f16f8768 cmd/tools/segment-verify: tool for checking duplicate net
Change-Id: Ie47c1282e580ffc418bf3b1f3c8820a48973aefc
2022-12-15 22:58:36 +00:00
Michal Niewrzal
0bbbb9c4c1 satellite/metabase: fix log for multiple committed version
Change-Id: I2556c5b523091c11937a01efff07be9e0dd964aa
2022-12-13 13:08:02 +00:00
Michal Niewrzal
0759cbdc7f satellite/metabase: handle copies with GetStreamPieceCountByNodeID
We missed proper handling of object copies in the method
GetStreamPieceCountByNodeID, which is used by metabase.GetObjectIPs.
As a result, some IPs were missing when querying the IPs of a copy, which
broke things like the pieces map on linksharing.

Fixes https://github.com/storj/storj/issues/5406

Change-Id: I9574776f34880788c2dc9ff78a6ae20d44fe628f
2022-12-13 12:32:56 +01:00
Andrew Harding
b562cbf98f satellite/metrics: provide a rangedloop observer
https://github.com/storj/storj/issues/5236

Change-Id: Ic1ed7a5533dccacd58285b64579dbdd6210de4f9
2022-12-09 12:04:39 -07:00
Andrew Harding
633ab8dcf6 satellite/metabase/rangedloop: stream affinity for test provider
Some observers assume that they will observe all the segments for a
given stream, and that they will observe those segments in a sequential
stream over one or more iterations.

This change updates the range provider from rangedlooptest to provide
these guarantees.

The change also removes the Mock suffix from the provider/splitter types
since the package name (rangedlooptest) implies that the type is a test
double.

Change-Id: I927c409807e305787abcde57427baac22f663eaa
2022-12-09 16:49:02 +00:00
Michal Niewrzal
5c2131ed0d satellite/metabase: always try to remove old version on commit
We have a bug in our behavior during API pod deployment. At that
time it's possible for the multiple versions flag to be set to true on only
some of the pods. Because of that it's possible to start a new
object without removing the existing/older version on BeginObject
(new behavior) and also not remove that existing/older object on
CommitObject. That can result in two committed objects with
different versions, and that's a state we want to avoid.

To fix it we are removing the multiple versions flag from CommitObject to
always try to delete existing objects. This way, even if we don't remove
the existing object on BeginObject, it will always be removed while
committing.

Fixes https://github.com/storj/storj/issues/5373

Change-Id: Idc334bf5cc785d2f559af96e92c3de6d82ca58ba
2022-12-09 13:45:03 +00:00
Erik van Velzen
3cf7ebfad0 satellite/metabase/rangedloop: database abstraction (#5337)
Add an abstraction rangedloop.SegmentProvider to fetch chunks of
segments from the metainfo database in parallel.

Part of https://github.com/storj/storj/issues/5223

Change-Id: Ife26467ea0c3be550bde0b05464ef1db62dd4d2a
2022-12-09 03:01:07 +01:00
Erik van Velzen
ff6d640fca satellite/metabase/rangedloop: minimal loop (#5334)
Minimal implementation of the ranged (=threaded) segment loop
service, to improve performance over the existing loop.

Has tests with an in-memory segment database
and example observer.

Does not have yet: database link, observer duration tracking,
suspicious processed ratio guard, rate limiting, minimum execution
interval per observer, etc.

Part of https://github.com/storj/storj/issues/5223

Change-Id: I08ffb392c3539e380f4e7b4f1afd56c4c394668d
2022-12-08 15:27:21 +01:00
Fadila Khadar
7fd23d6864 satellite/metabase: add logic for verifying segments in given buckets
To be able to verify segments in a list of buckets, this change:
- adds the method ListBucketsStreamIDs to list all stream IDs belonging to a
  list of buckets, provided as a ListVerifyBucketList on which
  Add(projectID, bucketName) is defined
- allows specifying a list of streamIDs to check in ListVerifySegments

Fixes https://github.com/storj/storj-private/issues/101

Change-Id: I72a48a0873a3056ac54ad56c0e9242364b2ae918
2022-12-08 09:45:15 +00:00
Michal Niewrzal
4544eee72b Revert "satellite/metainfo: enable metainfo.multiple-versions flag by default"
This reverts commit f0ce8996c3.

We need to revert it until https://github.com/storj/storj/issues/5373 is
fixed.

Change-Id: Ibb22af100014724d1910d4871d8f4e159fdea391
2022-12-07 19:43:20 +00:00
Andrew Harding
c6e48fb71d satellite/metabase/rangedloop: clarify observer docs
Change-Id: I171d39fd069186c2c275aab3a5e672427b34e38f
2022-12-07 11:27:35 +00:00
Michal Niewrzal
f0ce8996c3 satellite/metainfo: enable metainfo.multiple-versions flag by default
We tested the new upload flow (with multiple versions), which fixes an
inconsistency while uploading objects, on QA/EUN1/SLC. Now we would like to
enable it for all satellites by default. Tests required small adjustments.

Fixes https://github.com/storj/storj/issues/5283

Change-Id: I0d53c041abebc0d182ba5a88bb1dac906c29caf0
2022-11-23 17:05:22 +00:00
Erik van Velzen
b574ee5e6d satellite/metabase/rangedloop: service skeleton
Create skeleton for multi-threaded segment loop, peer, cmd command for rangedloop.

Change-Id: I52c78a313f15070d43207c52ea94e53169821654
2022-11-22 15:21:41 +02:00
Michal Niewrzal
d5eea2db61 satellite/accounting: use custom query for bucket tally by default
We added an alternative way to calculate bucket tallies for accounting. Now
that it's tested, we will enable it by default.

CollectBucketTallies was extended to support overriding the current time
to be able to test handling of expired objects.

Change-Id: I738b99a33fd2e086245f92d874c1cbb806e834c0
2022-11-22 10:23:40 +00:00
Erik van Velzen
9fb18a43d8 satellite/metabase/rangedloop: observer interface
New interface for parallel segment loop.

Closes https://github.com/storj/storj/issues/5239

Change-Id: I2bcce6f836f6625da8ceb4fc0fc030c0ea4380e7
2022-11-17 20:12:23 +00:00
Michal Niewrzal
6273ed035d satellite/metabase: make UploadID stable for different options
Multipart upload requires the same UploadID to be returned from
different requests (BeginUpload, ListUploads). Otherwise the client won't
be able to find existing uploads. The main issue was that the data needed to
construct the UploadID is in the System metadata, which can be filtered out
by a listing option.

This change fixes how we set Status for listed objects and
forces reading System metadata if we are reading pending objects.

Fixes https://github.com/storj/storj/issues/5298

Change-Id: I8dd5fbab4421a64dc3ed95556408ead4c829f276
2022-11-10 17:35:36 +00:00
Erik van Velzen
337b72f310 satellite/metabase/rangedloop: uuid range pairs
Pair UUIDs to create ranges. Will be used to parallelize the segment
loop.
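
A rough sketch of the idea, assuming evenly spaced boundaries over the first four bytes of the UUID and open-ended first and last ranges. The UUID and Range types and the boundaries and pair functions are hypothetical and do not reproduce the helpers added in these commits.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// UUID is a plain 16-byte identifier, used here instead of a UUID library.
type UUID [16]byte

// Range covers [Start, End). A nil Start or End means the range is open ended.
type Range struct {
	Start *UUID
	End   *UUID
}

// boundaries returns n-1 evenly spaced UUIDs that split the space into n parts.
func boundaries(n int) []UUID {
	if n < 1 {
		n = 1
	}
	step := (uint64(1) << 32) / uint64(n)
	bounds := make([]UUID, 0, n-1)
	for i := 1; i < n; i++ {
		var id UUID
		binary.BigEndian.PutUint32(id[:4], uint32(uint64(i)*step))
		bounds = append(bounds, id)
	}
	return bounds
}

// pair turns consecutive boundaries into ranges that cover the whole space.
func pair(bounds []UUID) []Range {
	ranges := make([]Range, 0, len(bounds)+1)
	var prev *UUID
	for i := range bounds {
		b := &bounds[i]
		ranges = append(ranges, Range{Start: prev, End: b})
		prev = b
	}
	return append(ranges, Range{Start: prev, End: nil})
}

func main() {
	bounds := boundaries(4)
	ranges := pair(bounds)
	fmt.Println(len(ranges))
	fmt.Printf("%x\n", bounds[0])
}
```

Each resulting range could then be handed to its own worker, so that no segment is visited twice and none is skipped.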

Part of https://github.com/storj/storj/issues/5223

Change-Id: I73e2fb8a2cd379b840864449b6251b48feeb7b66
2022-11-03 11:15:56 +00:00
Erik van Velzen
c25391e976 satellite/metabase/rangedloop: uuid generation
Create helper function to generate ranges of UUIDs, for parallelizing
the segment loop.

Change-Id: I17dbc1d5effe27fc1a3491aa9ca56c692bd95df0
2022-10-31 16:05:41 +01:00
paul cannon
c54c45c9c7 satellite/audit: new ReverifyPiece implementation
ReverifyPiece() is not currently hooked up to anything, but is planned
to take the place of audit.(*Verifier).Reverify().

ReverifyPiece() works by downloading one piece in its entirety, rather
than pulling an entire stripe across many nodes.

Change-Id: Ie2c680f4d3c3b65273a72466a3f9f55c115b0311
2022-10-27 16:06:21 +00:00
Qweder93
fa287b8206 satellite/metabase: added CollectBucketTallies
One of two parts to stop using the objects loop for bucket accounting.
This method collects bucket tallies from a list of bucket locations.

part1 of: https://github.com/storj/team-metainfo/issues/125

Change-Id: Id2d492582453e28463cddf1245622fb7f191050c
2022-10-15 18:31:06 +00:00
Fadila Khadar
35f74b78e0 satellite/metabase: IterateLoopSegments accepts ranges
Fixes: https://github.com/storj/storj/issues/5207

Change-Id: I7872696068320987825de2d381f57ea503736e89
2022-10-13 14:12:40 +00:00
Michal Niewrzal
e5ac8430c3 satellite/metainfo: delete pieces from nodes on object commit
We have a new flow where an existing object is deleted not on begin
object but on commit object. Deletion on commit object is still
missing deletion from storage nodes. This change adds this part
to the code.

Fixes https://github.com/storj/storj/issues/5222

Change-Id: Ibfd34665b2a055ec6c0d6e260c1a57e8a4c62b0e
2022-10-12 15:02:24 +00:00
Egon Elbre
8b70f969b6 all: fix nolint directives
Change-Id: I261c8b12e4961e6401cc4024fa5abc35b1a5efa6
2022-10-11 18:31:20 +00:00
Michal Niewrzal
11d1e623b5 satellite/metabase/segmentloop: don't do rate limiting if disabled
We have code to rate limit the segments loop in case it hits the DB too hard,
but so far we haven't used this loop feature in production. This is a
simple change to skip the logic responsible for rate limiting and its
monitoring if limiting is disabled (RateLimit = 0).

Change-Id: I43e07b407c6e65cf252303159d052eef250d1bea
2022-10-11 10:55:30 +00:00
Michal Niewrzal
db1409eea6 satellite/metabase: use SUBSTRING with objects iterator
Until this change we were stripping the prefix from the object key on the
satellite side. Because of that we were transferring unnecessary data
over the network from the DB. This change adjusts the iterator SQL queries to
use SUBSTRING to remove the prefix on the DB side and avoid sending it to the
satellite.

Benchmark against 'main':
unfortunately "time/op" is very unstable while doing a local bench in this
case: sometimes there is no difference in time and sometimes it's up to 18%.
I never saw results where the old solution is faster than the new one. Results
for "alloc/op" and "allocs/op" are rather consistent.

name                                                 old time/op    new time/op    delta
NonRecursiveListing/Cockroach/listing_no_prefix-8      1.98ms ± 6%    2.05ms ±23%     ~     (p=1.000 n=9+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8    3.97ms ± 8%    3.42ms ±20%  -13.86%  (p=0.005 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8    8.42ms ±16%    7.58ms ± 5%   -9.91%  (p=0.002 n=10+10)

name                                                 old alloc/op   new alloc/op   delta
NonRecursiveListing/Cockroach/listing_no_prefix-8      16.7kB ± 0%    16.9kB ± 0%   +1.16%  (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8    27.3kB ± 0%    28.2kB ± 0%   +3.31%  (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8    60.0kB ± 0%    62.4kB ± 0%   +3.93%  (p=0.000 n=10+8)

name                                                 old allocs/op  new allocs/op  delta
NonRecursiveListing/Cockroach/listing_no_prefix-8         312 ± 0%       315 ± 0%   +0.96%  (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_with_prefix-8       526 ± 0%       541 ± 0%   +2.85%  (p=0.000 n=10+10)
NonRecursiveListing/Cockroach/listing_only_prefix-8     1.16k ± 0%     1.23k ± 0%   +5.24%  (p=0.000 n=10+10)

Change-Id: I23e501494ededafb2dd5ea903e8e4e313b42e956
2022-10-10 14:27:26 +00:00
Michal Niewrzal
4d9c9138ce satellite/metainfo: use multiple object versions internally
With this change we are switching the method used to begin an object from
BeginObjectExactVersion to BeginObjectNextVersion. The main implication
is that from now on it will be possible to have an object with a version
different than 1. A new object will always get the first available version.

The main reason to do this is to avoid deleting the existing object while
reuploading an object. Now we can create multiple pending objects, but
only the last committed one will be available to the user. Any previously
committed object will be deleted. Because of that we moved the logic to
delete the existing object from the BeginObject to the CommitObject request.

The new logic is behind a feature flag to be able to test it well first
before enabling it on production.

Fixes https://github.com/storj/storj/issues/4871

Change-Id: I2dd9c7364fd93796a05ef607bda9c39a741e6a89
2022-10-06 15:19:02 +00:00
Egon Elbre
c8506cdda3 satellite/metabase,cmd/tools/segment-verify: simplify interface
Change-Id: Icdd445b1713bc26cee3b3a125b68b0cde0739837
2022-10-06 13:42:00 +00:00
Egon Elbre
c1817ab743 cmd/tools/segment-verify: a few fixes
The flags weren't properly loading from config.

The code assumed that every node that's online for downloading also has
data uploaded to it -- which is not true.

Change-Id: Ifd65a47b9eca5b4841231928244fab17acbde6fb
2022-10-05 15:51:38 +00:00