30 Commits

Author SHA1 Message Date
Yingrong Zhao
1f59a08e85 pkg/server: eliminate Retry packet from quic-go handshake phase
In order to prevent traffic amplification attacks, QUIC allows an
application to perform address validation during connection
establishment.
(https://tools.ietf.org/id/draft-ietf-quic-transport-34.html#name-address-validation)
However, this adds another round trip during connection establishment.
In the storj network, servers authenticate the client before they start
sending any significant amount of data to it. We believe the risk of a
traffic amplification attack isn't going to be significant when address
validation is turned off in QUIC, and turning it off will give us a
significant performance boost during connection establishment.

Change-Id: I7f9a0ca5ca770b715d08b1e8ce3022fbb2b85d42
2021-03-09 22:05:35 +00:00
Yingrong Zhao
7e80badaf9 pkg/server,pkg/quic: accept an existing conn to create quic listener and
allow disabling tcp/quic

In order to have more control of a server so that we can
simulate connection failures in `testplanet`, this PR changes
quic.Listener to accept an existing UDPConn instead of relying on the
quic-go library to create the UDPConn.
This PR also adds two flags on the `server.Config` struct to allow
enabling/disabling the tcp/tls listener and the quic listener. By default,
both listeners are enabled.
    - `DisableTCPTLS`: internal flag, disables the tcp/tls listener.
    - `DisableQUIC`: hidden flag, disables the quic listener.
Making `DisableQUIC` a hidden flag lets storagenode operators
disable quic traffic in case their setup can't work
with udp traffic.

Change-Id: I853b12435d988b9c41ad9b873fd57480d792e378
2021-02-03 12:04:29 -05:00
Yingrong Zhao
52d6852e58 pkg/server: add retry logic for random port assignment
Change-Id: I70464e344a79dce8eadb9513d2a990faf3b2cca8
2021-01-28 14:44:22 -06:00
Yingrong Zhao
02845e7b8f pkg/server,private/testplanet: start to listen on quic
This PR introduces a new listener that can listen for quic traffic on
both storagenodes and satellites.

Change-Id: I5eb5bc82c37dde20d3be2ec8fa5f69c18fae0af0
2021-01-27 11:03:42 -05:00
Egon Elbre
080ba47a06 all: fix dots
Change-Id: I6a419c62700c568254ff67ae5b73efed2fc98aa2
2020-07-16 14:58:28 +00:00
Egon Elbre
ec589a8289 all: fix comments about grpc
Change-Id: Id830fbe2d44f083c88765561b6c07c5689afe5bd
2020-05-11 13:05:34 +03:00
Egon Elbre
e6d5ce6b77 all: remove grpc
It seems everyone has migrated to drpc.

Change-Id: Ica6b2d0bdef68c6603083f2963458843eca71e9e
2020-05-10 06:36:09 +00:00
Yingrong Zhao
96e58d21b4 cmd;pkg/server: init tracing collector in all processes
Add tracing handler in drpc server.
Initialize the tracing collector in admin, satellite api, garbage
collection, satellite core, repairer, storagenode.
Change-Id: Ie98420e35dfc6913836ebd82b517d9d12877aefc

Change-Id: I91057b6265a4ac8bde033dfde692b8a28acca99f
2020-04-07 17:20:59 -04:00
Egon Elbre
c715c75fea pkg/server: add counters for grpc calls
This will help to determine how many grpc calls are made to the
satellite.

Also remove the grpc funcs that have been added to upstream.

Change-Id: I91878f4fd10f9bfe601c94222c102eaaf4d35963
2020-03-25 21:38:13 +02:00
Yingrong Zhao
a731472496 bump storj.io/common to latest and storj.io/drpc to v0.0.11
Change-Id: I7a6e823b441eeff4621dfdf2d6577be76c9761c8
2020-03-24 15:17:10 -04:00
Egon Elbre
bd9a998abd pkg/server: remove dead code
Change-Id: I4501ac7de727e2ec6908a7624808b4e596a68a23
2020-03-24 18:30:22 +02:00
Egon Elbre
f85606b5a7 private/grpctlsopts: grpc related tlsopts
This moves grpc related tlsopts methods to private/grpctlsopts.
This allows removing the grpc dependency from tlsopts.

Change-Id: I25090b82b1e7a0633417ad600f8587b0c30ace73
2020-02-26 22:46:06 +02:00
Jeff Wendling
828d0b9984 pkg/server: set TCP_USER_TIMEOUT and monitor leaked conns
Go will, by default, set tcp keep alives on sockets. But
the kernel does not send keep alives to sockets that have
a non-empty send queue. That can cause connections that
hang forever.

So we set TCP_USER_TIMEOUT on all of the sockets as well.
That option will close any connection that has not received
an ack for any sent data (keep alive or otherwise) in the
configured time period. This places an upper bound on the
amount of time a socket can be stuck due to a client not
acknowledging data.

See https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
for more information on what these options do and how they
interact.

Additionally, make sure that we close every connection coming
from the listeners by wrapping them in a type with a finalizer
that closes the connection, much like the os package does for
file handles. It monitors if a connection was closed due to a
finalizer so that we can go and look for the bug if we ever
see a non-zero value.

Change-Id: Idc6c0564224b8dc2e4c9d769e80374ed1fe8cce0
2020-01-03 21:31:09 +00:00
Egon Elbre
6615ecc9b6 common: separate repository
Change-Id: Ibb89c42060450e3839481a7e495bbe3ad940610a
2019-12-27 14:11:15 +02:00
Jeff Wendling
f3b20215b0 pkg/{rpc,server,tlsopts}: pick larger defaults for buffer sizes
these may not be optimal but they're probably better based on
our previous testing. we can tune better in the future now that
the groundwork is there.

Change-Id: Iafaee86d3181287c33eadf6b7eceb307dda566a6
2019-11-18 21:22:49 +00:00
Jennifer Li Johnson
7ceaabb18e
Delete Bootstrap and Kademlia (#2974) 2019-10-04 16:48:41 -04:00
Jeff Wendling
098cbc9c67 all: use pkg/rpc instead of pkg/transport
all of the packages and tests work with both grpc and
drpc. we'll probably need to do some jenkins pipelines
to run the tests with drpc as well.

most of the changes are really due to a bit of cleanup
of the pkg/transport.Client api into an rpc.Dialer in
the spirit of a net.Dialer. now that we don't need
observers, we can pass around stateless configuration
to everything rather than stateful things that issue
observations. it also adds a DialAddressID for the
case where we don't have a pb.Node, but we do have an
address and want to assert some ID. this happened
pretty frequently, and now there's no more weird
contortions creating custom tls options, etc.

a lot of the other changes are being consistent/using
the abstractions in the rpc package to do rpc style
things like finding peer information, or checking
status codes.

Change-Id: Ief62875e21d80a21b3c56a5a37f45887679f9412
2019-09-25 15:37:06 -06:00
Jeff Wendling
007662d49e pkg/server: serve drpc listeners along with grpc
The fundamental problem is that both drpc and grpc servers
want to close the listener and they both want to ignore the
error from Accept after the listener is closed. There's no
way to do this in a race free way. Fortunately, the mux
hands out listeners that can be independently closed. That
means they can both do their own shutdown logic where they
ignore the error, and then after they're closed, the code
orchestrating the servers can close the listeners.

The final weird bit is that the server's Close method is
required to wait until the Run method has exited (or at
least enough for the listeners to definitely be closed)
because tests depend on that behavior, so we have to add
some channels/mutexes/onces to ensure that Run has exited
and that a new call can't start after Close is called.

Change-Id: I7c4ef293f7963f83138815f51824fd5b8d09ce15
2019-09-12 19:18:30 +00:00
Jeff Wendling
477b47f554 pkg/server: use a listenmux with nothing registered
Change-Id: I25577b4afb907f4f8b57fc0428de6e6ea4ce9ba9
2019-09-12 19:12:53 +00:00
Egon Elbre
ec3d5c0bdd
don't use global loggers (#2671)
* pkg/server: don't use global logger
* satellite/overlay: use correct logger
* pkg/kademlia: use correct logger
* linksharing: use conventional way to pass in logger
* use zaptest in tests
2019-07-31 15:09:45 +03:00
paul cannon
d15eaed588 add capability of logging all GRPC calls/payloads (#2067) 2019-06-04 14:55:24 +02:00
Michal Niewrzal
fe3dfc1587
Move pointerdb.Service to satellite (#1826) 2019-04-25 10:46:32 +02:00
Kaloyan Raev
d1639c4157 Merge statdb pkg into overlay pkg (#1570) 2019-03-25 18:25:09 -04:00
Jess G
193a70f0a6
add private listener to grpc server (#1398)
* add private listener to grpc server

* add changes per init CR

* fix server.close

* add insecure grpc connection, update logs msg

* fix tests, move insecure client

* add private ports to storj-sim, add insecure client to other inspectors

* add ports to test so there aren't conflicts

* fix lint err

* fix node started log msg, close public listener

* remove commented out line
2019-03-07 13:19:37 -05:00
JT Olio
2a59679766 pkg/transport: require tls configuration for dialing (#1286)
* separate TLS options from server options (because we need them for dialing too)
* stop creating transports in multiple places
* ensure that we actually check revocation, whitelists, certificate signing, etc, for all connections.
2019-02-11 13:17:32 +02:00
Egon Elbre
e37e0c1b5f
Fix server config usage (#1282) 2019-02-08 20:57:17 +02:00
Egon Elbre
bb11d83ed0
Proper planet shutdown (#1249) 2019-02-06 15:19:14 +02:00
Michal Niewrzal
53c11dfc5d
Stop storagenode on ctrl+c (#1220)
* Stop storagenode on ctrl+c

* Cancel grpc server

* handle error

* handle error

* use errgroup

* fix check-travis-tidy

* remove pipefail
2019-02-04 15:50:55 +01:00
Egon Elbre
cecd4b0816
Remove server aliases (#1154) 2019-01-28 17:04:53 +02:00
JT Olio
2c916a04c3 pkg/provider: split into pkg/server, pkg/identity (#953) 2019-01-02 12:23:25 +02:00