From df9bbcecad8cd05e0d5b3381109f925ea701d4ef Mon Sep 17 00:00:00 2001
From: JT Olio <hello@jtolio.com>
Date: Fri, 6 Jan 2023 17:31:11 -0500
Subject: [PATCH] blueprint: noise over tcp

Change-Id: I3871e540313100a7198d62a455c745f128ac51fb
---
 docs/blueprints/noise.md | 603 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 603 insertions(+)
 create mode 100644 docs/blueprints/noise.md

diff --git a/docs/blueprints/noise.md b/docs/blueprints/noise.md
new file mode 100644
index 000000000..189e19397
--- /dev/null
+++ b/docs/blueprints/noise.md
@@ -0,0 +1,603 @@
+# Noise over TCP (uplink to storage node)
+
+## Abstract
+
+This design doc discusses how we can achieve a large connection set up
+performance improvement by exchanging our use of TLS for Noise_IK.
+
+This design doc is scoped to communication between the Uplink and storage nodes
+only.
+
+## Background/context
+
+### The problem
+
+When compared to traditional datacenter storage platforms, Storj is
+significantly more exposed to issues arising from distance. In particular,
+whenever two peers need to exchange data in a round trip, the speed of light is
+a serious concern. In a datacenter, a round trip between two nodes may take
+500 us, or .5 ms. In the Storj context, a round trip to a node across the ocean
+may take 150 ms, *300 times* slower. There is no way to fix this, this is a
+fundamental physical law.
+
+Because datacenter round trips are so fast, many protocols have not been
+designed with an extreme allergy to round trips, but we must be. Every time we
+wait for a packet round trip, we are adding 150ms, which for certain operations,
+is our entire performance budget.
+
+To do a small download, say 5KB for example, right now the Uplink will:
+ * first establish a TCP connection with the Satellite (one packet round trip,
+   TCP SYN, then TCP ACK),
+ * then establish a TLS session over that TCP stream (another packet round trip,
+   TLS Client Hello, then TLS Server Hello),
+ * then send the request and receive a satellite response (a third round trip).
+ * Now, the Uplink is finally able to start making requests to nodes! It has to
+   do this all again.
+ * first, establish a TCP connection with each node (packet round trip)
+ * then, establish a TLS session over TCP (packet round trip)
+ * then send the RPC request over DRPC (packet round trip)
+ * THEN send the order (it's not part of the same request currently!!)
+   (strictly speaking, this isn't inherently another round trip, though this
+    behavior does significantly complicate using Noise, which requires a
+    response to handshake packet 1 before sending packet 2).
+ * finally, the node will start returning data.
+
+If the Uplink is near the Satellite, the Satellite operations won't be the worst
+part (though, considering most Uplinks don't operate in the same datacenter as
+the Satellite, and the Satellite itself has inter-dc penalities due to Cockroach
+coordination, it's not cheap). But most node operations do have a large amount
+of traffic that goes over an ocean, and many node operations are not able to
+be effectively connection pooled (though we've tried). Because the overall
+request goes as fast as the slowest of 29 downloads, the likelihood that at
+least one of the required nodes has a high round trip cost is high, and so this
+is what is killing us.
+
+### QUIC
+
+Our first attempt to fix this was to roll out QUIC. QUIC is a TCP-like protocol
+built on top of UDP packets (TCP is a protocol built on top of IP packets).
+QUIC's solution is to combine the TLS and the TCP handshake into one.
+
+Marton has proven that our QUIC project did indeed pay off, most of the time.
+QUIC makes our overall performance between peers significantly faster due to
+the elimination of just one handshake.
+
+Unfortunately, having QUIC based on UDP means that slightly more requests fail
+than with TCP (perhaps bad node operator setup, middleware dropping packets,
+unclear). This means that our long tail cancelation has to wait for more nodes,
+which makes us more susceptible to slowness. So overall, QUIC has worse long
+tail variability, and is slower at higher percentiles.
+
+### Do we need the TLS handshake?
+
+QUIC eliminates one handshake by combining the TCP and TLS handshakes, but the
+TLS handshake at all is a bummer. The reason the TLS handshake exists is for
+both peers to exchange cryptographic information and confirm that they are
+talking to the right peer over the right protocol, doing a protocol negotiation.
+
+But we don't need that, certainly for nodes!
+
+Storage nodes are already checking in with Satellites, and this provides an
+opportunity for storage nodes to provide all of their cryptographic
+configuration settings (public key, etc) in advance. When the Satellite tells
+the Uplink which nodes to talk to, it can give the Uplink everything it needs
+to skip the cryptographic handshake.
+
+So, we should try and eliminate the cryptographic handshake altogether. We don't
+need it, and we can avoid it, at least between Uplinks and nodes, in all cases.
+
+### Do we need any handshakes?
+
+If we don't need the TLS handshake, do we really even need the TCP handshake?
+Also no. There's no reason the very first SYN packet for connection setup can't
+include the request data, so that the very second packet back (what would have
+been the ACK packet) can include the first bytes of the response. This would
+take our current situation of 3 round trips between the Uplink and each node to
+1, which saves 100-150ms off for each round trip eliminated.
+
+Including data in the SYN packet is precisely what TCP_FASTOPEN does, which we
+could enable, if we don't end up using something UDP-based.
+
+For the purposes of this design doc though, suffice it to say that if this is
+our ideal, we need to eliminate the TLS handshake entirely in almost all cases.
+
+### What about the Satellite?
+
+This design doc is focusing on the Uplink to storage node communication as a
+way to get a foot in the door. Solving Uplink to Satellite communication is
+more challenging, as we don't have a clean way of getting the Uplink the
+Satellite's current Noise public key in advance. That is of course interesting
+but saved for a future design doc. Satellite to storage node communications are
+not in the hotpath and are thus not as high priority.
+
+Luckily, the Uplink to Satellite communication flow benefits the most from
+connection pooling.
+
+## Design
+
+### The Noise Framework
+
+The Noise protocol framework is a relatively new entrant in the cryptographic
+communication space. Using cryptographic design primitives like the ratchet
+system Signal uses, the Noise protocol framework provides a tight,
+straightforward set of cryptographic building blocks for building your own TLS.
+In fact, the Noise protocol is what Wireguard uses.
+
+A great intro to the Noise framework is this presentation by its creator:
+https://www.youtube.com/watch?v=3gipxdJ22iM. The design doc is also fantastic:
+https://noiseprotocol.org/noise.html
+
+The Noise framework prioritizes simplicity and security above all else.
+
+One of the ways the Noise framework radically reduces complexity is by
+eliminating all of the negotiation features that TLS has. In TLS, the client and
+the server negotiate on which cryptographic primitives can and should be used
+for the session, and the client and server can choose based on what's available
+between peers.
+
+Noise does away with this. When you implement Noise, you pin down your
+algorithms. Since Noise is a framework, here are some example Noise protocols
+within the framework:
+
+ * Noise_XX_25519_AESGCM_SHA256
+ * Noise_N_25519_ChaChaPoly_BLAKE2s
+ * Noise_IK_448_ChaChaPoly_BLAKE2b
+
+ You choose one of these at connection dial time, and then there are no places
+ in the code that allow a session to negotiate or switch. Both peers must agree
+ on the protocol in advance.
+
+ The Noise framework also lets you choose between a set list of potential
+ "handshake  patterns" which describe what is exchanged in packets, when, and
+ what security  properties you get. There are many possible handshake patterns,
+ but for full duplex protocols, the Noise authors recommend XX or IK only. XX is
+ much like TLS in that there is a handshake to exchange keys first. IK is a
+ 0-RTT handshake and requires the keys exchanged in advance, which we can do.
+ Wireguard uses Noise_IK. Noise_IK is the handshake pattern that solves our
+ problems here.
+
+ The one (and only?) downside to Noise_IK is that it opens us up to replay
+ attacks. See the Open issues section.
+
+ We want to use Noise_IK. We probably want to benchmark and pick between:
+
+  * Noise_IK_25519_ChaChaPoly_BLAKE2b
+  * Noise_IK_25519_AESGCM_BLAKE2b
+
+I think we're convinced BLAKE2b is faster and better for our cases than BLAKE2s,
+SHA256, and SHA512, but I'm not convinced (given the existence of accelerated
+encryption hardware) that ChaChaPoly is better than AESGCM, which we use
+everywhere else.
+
+### TCP vs UDP
+
+We already have a project (QUIC) that seeks to eliminate the TCP handshake, and
+TCP in general, and prepare the network for using other UDP-based protocols. It
+has been a mixed bag. We should keep working on it! We have struggled to enable
+QUIC by default due to these UDP-related issues.
+
+This project is trying to take a parallel approach as a next step - what
+happens if we keep TCP? Can we sidestep our UDP issues but still eliminate a
+handshake in a robust way?
+
+So, this project is going to be based on TCP. A later project may be to swap
+TCP for another UDP based protocol (unencrypted QUIC, UDT, something else) or
+to try and improve TCP using a technique like TCP_FASTOPEN, but
+we can do that after getting all the cryptography correct for this one.
+
+### Data flow
+
+The broad picture is that when storage nodes check in, they will submit their
+Noise public key and Noise configuration to the Satellite. When the Uplink asks
+the Satellite who to talk to, the Satellite can return the Noise information
+to the Uplink, and the Uplink can establish 0-RTT Noise_IK connections.
+
+In tests for small files, this led to a consistent 1.5x overall speedup, which
+is huge. With TCP_FASTOPEN in addition, the savings went to over 2.8x.
+
+Noise_IK requests are at risk of replay attacks, so we don't want to enable
+them by default everywhere. We need to audit each request for idempotency before
+enabling it, but at least initially, Upload and Download requests from Uplinks
+to Nodes would be exceptionally high value.
+
+Uploads require that the node is able to validate cryptographically that the
+peer it is talking to is the node id in question. Since we get that with TLS
+but we won't get that with Noise DH25519 keys, the Node will need to send a
+signed attestation by its Node key that the Noise key is indeed its public key
+at the end of the upload. This can be precomputed and thus fast.
+
+## Rationale
+
+Seems good!
+
+## Implementation
+
+### Changes to the RPC server code
+
+As a function of our migration from gRPC to DRPC, we already have a server-side
+demultiplexer system built in - drpcmigrate.ListenMux. Even more luckily,
+(and, actually, a situation planned with foresight) this multiplexing happens
+outside of the TLS stream, so we can use this same demultiplexer for
+differentiating between DRPC over TLS and DRPC over Noise.
+
+The current prefix for DRPC over TLS is 8 bytes - `DRPC!!!1`. We can do a new
+prefix, `DRPC!N!1`, to indicate Noise.
+
+```
+publicNoiseDRPCListener = noiseconn.NewListener(
+    publicMux.Route("DRPC!N!1"), p.noiseConf)
+go p.public.drpc.Serve(ctx, publicNoiseDRPCListener)
+```
+
+The server will need to generate a static DH key and persist it somewhere,
+though it is okay if the DH key gets regenerated from time to time (process
+start is probably fine TBH).
+
+The server will also need to choose the encryption algorithm, hashing algorithm,
+and DH curve to use.
+
+It's worth mentioning that Noise has some guidance about cryptographic channel
+binding. If you want a node to attest that it is indeed the peer on the other
+end of a Noise channel, the best way for the node to attest that it is that
+specific Noise session is for the peer to sign the handshake hash more than
+anything else, exposed in this commit:
+https://github.com/jtolio/noiseconn/commit/d7ec1a08b0b81c40754d83980ae63d1dfcc7c58b
+See https://noiseprotocol.org/noise.html#channel-binding for more.
+
+Potentially useful commits:
+ * https://review.dev.storj.io/c/storj/drpc/+/9225
+ * https://review.dev.storj.io/c/storj/storj/+/9224
+ * https://review.dev.storj.io/c/storj/storj/+/9187
+
+### Changes to Node Address / NodeURL structures
+
+Oh man, pb.Node, pb.NodeAddress, and pb.NodeTransport have rotted significantly
+from their original intentions. pb.NodeTransport still having GRPC flags in
+there gives some sense that we got this all wrong.
+
+In fact, the original intention of pb.Node has been almost entirely superceded
+by NodeURL, which basically does the exact same thing as pb.Node.
+
+The original intention of pb.Node was to specify a Node, along with the
+information you'd need to securely dial it. That is now a NodeURL, which is
+of the form
+
+```
+base58nodeid@host:port
+```
+
+This NodeURL, by virtue of having the node id, allows the client to know whether
+they are securely talking to the right peer (the node id is the hash of the
+validated certificate authority that signed the TLS leaf cert).
+
+Unfortunately, the NodeURL as it stands is not enough to talk over Noise_IK,
+since the RSA keys referenced by the NodeID cannot be used in Noise.
+
+To be able to talk over Noise, the following things are needed:
+ * The peer public key (32 bytes) (this is only needed for Noise_IK, so this
+   could be optional for Noise_XX).
+ * The peer's cipher suite selection (hashing, symmetric encryption, and
+   Diffie-Helmann)
+ * The peer's handshake pattern (we should always use IK for Nodes, but perhaps
+   we'll want to support XX for Satellites).
+
+To prevent a malicious Satellite from replacing Noise public keys with something
+else, we'll additionally need
+
+ * A signed signature, from the node's certificate chain and public key, signing
+   that the Noise key is correct. Unfortunately, for validation, this requires
+   also carrying around the node's certificate chain, so this is likely too
+   large to include in a Node Address and will need to be something we provide
+   on demand. It's unclear how much of a threat a malicious Satellite could even
+   be, considering how much trust the Uplink puts in the Satellite.
+
+Independent of anything else, pb.Node/pb.NodeAddress/pb.NodeTransport should
+be refactored so that it is a 1-1 match with NodeURL. A NodeURL should be able
+to be represented efficiently (not base58) as a pb.Node protobuf, and
+human-readable as a NodeURL. There should be a lossless conversion routine that
+converts a pb.Node to a NodeURL and back again.
+
+I'd also suggest that we should rename NodeURL to something else since we're
+abusing URI syntax at best.
+
+This all implies some cleaned up types. Let's rename everything to be a NodeLoc.
+
+```
+message NodeLoc {
+  // the node id, not encoded
+  bytes id = 1;
+
+  // the TCP address for TCP-based communication with the node. The TCP address
+  // is required. See the note at the bottom of this design doc about IPv6.
+  // Address here implies a host/port pair joined via net.JoinHostPort style
+  // logic.
+  string tcp_address = 2;
+
+  // the UDP address for UDP-based communication with the node. this isn't
+  // necessarily the same and we're currently forcing it to be! we could decide
+  // that a missing udp_address either the TCP address should be used or means
+  // UDP is unsupported.
+  // Address here implies a host/port pair joined via net.JoinHostPort style
+  // logic.
+  // Note: this may be worth saving for later entirely. Alternatively, this
+  // could just be a port and we could force the host to be the same as
+  // tcp_address.
+  string udp_address = 3;
+
+  // noise settings. If provided, the node may support noise handshakes instead
+  // of TLS over TCP or UDP.
+  enum NoiseProtocol {
+    NOISE_UNSET = 0;
+    NOISE_IK_25519_CHACHAPOLY_BLAKE2B = 1;
+    NOISE_IK_25519_AESGCM_BLAKE2B = 2;
+  }
+  NoiseProtocol noise_proto = 4; // this is explicitly not a set.
+  bytes noise_pk = 5;
+}
+
+// This type is a note signed by a node that this public key is what they are
+// using. NoiseSessionAttestation should be used instead where possible.
+message NoiseKeyAttestation {
+    bytes node_id = 1;
+    bytes node_certchain = 2;
+    bytes noise_public_key = 3;
+    uint64 timestamp = 4;
+    bytes signature_of_public_key_and_timestamp = 5;
+}
+
+// This type is a note signed by a node that this active Noise session really
+// has them on the other end.
+message NoiseSessionAttestation {
+    bytes node_id = 1;
+    bytes node_certchain = 2;
+    bytes noise_handshake_hash = 3;
+    bytes signature_of_handshake_hash = 4;
+}
+```
+
+The intention of NodeTransport was to have a list of transports that the node
+understood, so that clients could use newer transports on nodes that supported
+them, but in practice this field has just fallen into complete disuse and we've
+managed those issues with requiring recent versions on all nodes instead. That
+said, it may still make sense to have a list of supported protocols in the
+Node structure.
+
+A `NodeLoc` can be serialized like an old NodeURL as follows:
+
+```
+base58nodeid@tcp_address?udp=udp_address&noise_proto=1&noise_pk=base58_noise_public_key
+```
+
+This should be backwards compatible with existing serialized NodeURLs. We should
+evaluate if this format is parsable by existing NodeURL parsing code (assuming
+it throws away query parameters).
+
+AddressedOrderLimits should return filled in *pb.NodeLoc.
+
+### Changes to the RPC client code
+
+Dialing to a node should take this new NodeLoc structure, along with whether the request is replay-attack safe.
+
+```
+DialNode(ctx context.Context, node *pb.NodeLoc, replay_safe bool) (Conn, error)
+Validate(node *pb.NodeLoc, attestation *pb.NoiseKeyAttestation) (error)
+ValidateSession(node *pb.NodeLoc, attestation *pb.NoiseSessionAttestation) (error)
+```
+
+(If `replay_safe` is false, Noise_IK should not be used).
+
+As an aside:
+
+The rpc.Connector situation is a mess. For example:
+ * TCPConnector's DialContextUnencrypted adds the DRPC specific header, which
+   will make changing the header based on different Encryption strategies (TLS
+   vs Noise) challenging.
+ * HybridConnector does too much (premature generalization).
+ * If someone provides their own DialContext, the Connector interface doesn't
+   allow for the net.Dialer.Control style of adding something like TCP_FASTOPEN.
+   Config.DialContext is unfortunately just one step removed from
+   Dialer.Control, which means that if someone provides a Config.DialContext,
+   then our library can no longer call Setsockopt on the socket before dialing.
+   If we want to flexibly add TCP_FASTOPEN to most requests, then we need to be
+   able to call Setsockopt before the connect() syscall happens, and that's only
+   possible if you set Dialer.Control *before* DialContext is called.
+
+We should get rid of all of the connector flexibility and have a single,
+unconfigurable dialer type that dials with common/socket's BackgroundDialer.
+
+Going forward, you should be able to ask an RPC dialer pool to:
+
+```
+DialNode(ctx context.Context, node *pb.NodeLoc, replay_safe bool) (Conn, error)
+```
+
+and get back a valid Conn. The logic inside DialNode should:
+
+1) Consider the QUIC rollout state from common/rpc/quic_rollout.go. If QUIC is
+   enabled, we should use that. QUIC should be disabled.
+2) If QUIC is disabled, but replay_safe is true and the pb.NodeLoc has Noise
+   information, the dial should happen over TCP over Noise.
+3) Otherwise the dial should happen over TCP over TLS
+
+The RPC pool needs to keep track of QUIC, TLS, and Noise connections separately.
+In particular, Noise connections should be identified by the Noise public key
+and Noise protocol from the pb.NodeLoc Noise protocol enum.
+
+Possibly useful commits:
+ * https://review.dev.storj.io/c/storj/common/+/9219
+
+### Changes to DRPC
+
+DRPC should gain a feature that allows corking outgoing sends (forcing all
+writes into a local buffer), and then uncorking, which tells the DRPC stream
+to send the buffer with the next send. This is important because our existing
+piecestore Download protocol sends two separate requests before the node can
+start returning data, and we want both requests to go into the initial Noise
+packet.
+
+drpcstream.Options.ManualFlush is close, but we only want to change it
+for a single packet, and we are possibly getting a conn from the connection
+pool, so a per-conn ManualFlush is hard to use.
+
+Possibly useful commits:
+ * https://review.dev.storj.io/c/storj/drpc/+/9236
+ * https://review.dev.storj.io/c/storj/uplink/+/9237
+
+### Changes to the storage node
+
+The storage node, using the new RPC server side code, should have Noise
+configuration generated. It should submit this Noise information as part of its
+contact checkin. It should provide a NoiseSessionAttestation.
+
+Nodes should send NoiseSessionAttestations at the end of uploads so the Uplinks
+can do node id piece validation.
+
+Nodes should be extended to check if the initial Download request has an order
+embedded and use it if so.
+
+Possibly useful commits:
+ * https://review.dev.storj.io/c/storj/uplink/+/9246
+ * https://review.dev.storj.io/c/storj/storj/+/9245
+ * https://review.dev.storj.io/c/storj/common/+/9197
+ * https://review.dev.storj.io/c/storj/storj/+/9188
+
+### Changes to the Satellite
+
+On contact checkin, the Satellite should check for Noise information and
+validate a NoiseSessionAttestation. If the NoiseSessionAttestation is valid, the
+Satellite should persist the *pb.NodeLoc information in the nodes table.
+
+The upload and download selection caches should retrieve the *pb.NodeLoc
+information and add them to the AddressedOrderLimit structs that are sent to
+Uplinks.
+
+Possibly useful commits:
+ * https://review.dev.storj.io/c/storj/common/+/9197
+ * https://review.dev.storj.io/c/storj/common/+/9214
+ * https://review.dev.storj.io/c/storj/storj/+/9200
+ * https://review.dev.storj.io/c/storj/storj/+/9215
+
+### Changes to the Uplink
+
+Uplinks should be extended to use the *pb.NodeLoc from the AddressedOrderLimits
+and use the rpc Dialing that uses those.
+
+Our existing piecestore Download protocol sends two separate requests before the
+node can start returning data, and we want both requests to go into the initial
+Noise packet. So, Uplinks should use a new DRPC feature to cork the first
+Download request RPC send, so that the Download request goes out when the Uplink
+sends the actual order request and both get written to the first Noise packet.
+
+An alternative strategy would be to update the Uplink to send the first order
+as part of the first request, but this is not backwards compatible with old
+storage nodes.
+
+Possibly useful commits:
+ * https://review.dev.storj.io/c/storj/uplink/+/9246
+ * https://review.dev.storj.io/c/storj/storj/+/9245
+ * https://review.dev.storj.io/c/storj/uplink/+/9218
+ * https://review.dev.storj.io/c/storj/uplink/+/9220
+ * https://review.dev.storj.io/c/storj/common/+/9214
+ * https://review.dev.storj.io/c/storj/storj/+/9221
+
+## Other options
+
+### TLS session resumption
+
+Instead of Noise, we could still use TLS. TLS 1.3 has a feature called
+session resumption. Session resumption negotiates a key after a first connection
+that can be reused for zero roundtrip session setup if both peers remember each
+other. The downside of this is that we wouldn't get zero roundtrip for the first
+connection.
+
+Erik asked if perhaps the Satellite could establish these keys in advance and
+simply hand off the SSL connection session resumption information to the node.
+This seems possible in theory. Open questions for me: would this work in general
+for more than one connection? Is it safe to reestablish many connections to
+a node from multiple Uplinks? Would we have to negotiation session resumption
+information in advance per Uplink? There are a lot of cryptographic unknowns
+here for me, but in principle this is essentially forcing TLS into the Noise_IK
+shape.
+
+Overall, I'm worried at how unusual this is, vs Noise IK, where what we're doing
+is what it is designed for.
+
+### QUIC session resumption
+
+QUIC session resumption is exactly TLS 1.3 session resumption.
+
+### Multiaddresses
+
+We might want to consider migrating to https://github.com/multiformats/multiaddr
+instead of NodeURLs. The challenge I see with multi-addresses is they seem
+to let the multi-address specify much more about the connection than we want
+to allow (whether or not TLS is used, what protocols are used, etc). The only
+parts of the multi-address we want are the stuff that get us to having a
+valid IP packet host, and maybe whether the peer has open TCP or UDP ports.
+Multi-addresses do much more than that, and so we would be in a position where
+we are restricting what we use in multi-addresses, though I suppose that's not
+much different than URLs in general. I don't know what to do here other than
+that I have NIH.
+
+## Wrapup
+
+## Related work
+
+### github.com/jtolio/noiseconn
+
+To get my proof of concept working, I wrote a library that is a useful
+net.Conn wrapper that uses Noise. github.com/jtolio/noiseconn has good 
+performance and has been tested as part of the proof of concept for this
+project.
+
+### Tracing
+
+ * https://review.dev.storj.io/c/storj/storj/+/9229
+
+### TCP Fast Open
+
+ * https://review.dev.storj.io/c/storj/common/+/9252 (common/socket: enable tcp_fastopen_connect)
+ * https://review.dev.storj.io/c/storj/storj/+/9251 (private/server: support tcp_fastopen)
+
+### IPv6 support
+
+Cleaning up NodeURL is an opportunity to add more clarity around IPv6 support.
+Here are some thoughts:
+
+* Because customers might move from IPv4 to IPv6 networks and back, every
+  storage node must be reachable by every network. At very least, this means
+  every storage node must be reachable over IPv4 (IPv6 only networks should be
+  able to hit IPv4 nodes over gateways). It's fine if nodes also support IPv6
+  such that IPv6-supporting clients can reach IPv6 nodes over IPv6 natively,
+  but if data is uploaded to IPv6, we don't want it to be only available if the
+  client is on an IPv6 supporting network. So, unless a client explicitly opts
+  into their data potentially only being available over IPv6, every storage node
+  must support IPv4.
+* For IPv6 support, we could either have the NodeURL list the IPv6 address in
+  addition to the IPv4 address, much like the potential additional UDP
+  information, or we could require that IPv6 node operators get a DNS entry
+  that has both A and AAAA records.
+
+## Open issues
+
+ * We need to double check that uploads and downloads are replay attack safe
+   and make them so if not. Order serial numbers should protect against this.
+ * We should evaluate what other commands are replay attack safe.
+   Exists, RestoreTrash, Retain, and DeletePieces do not have serial checking,
+   but are only made by Satellites. We may want to ensure these methods are
+   not available over Noise_IK.
+ * We should have RPC clients keep a cache of Noise public key attestations.
+   We won't have the Satellite public key initially, but perhaps if an RPC
+   client has spoken with a Satellite before, the Satellite could have provided
+   a NoiseKeyAttestation, and thus future connections could be over Noise. This
+   would be especially useful for the Gateway-MT. We would need to audit which
+   Satellite requests are replay attack safe.
+ * This may be more of an issue for TCP_FASTOPEN, but we should double check
+   that 0-RTT connection establishment doesn't open us up to amplification
+   attacks (could an attacker spoof N bytes to us and get us to send >N bytes
+   to a third party?)
+ * How do we improve Uplink to Satellite communication? Access grants have
+   Satellite node IDs, but not Noise public keys. Is it a good idea to put
+   Noise public keys in Access grants? Should we use Noise_XX and deal with
+   the handshake? That doesn't save us much, but perhaps our Noise ciphers are
+   faster than TLS?