Outgoing federation to servers using SRV record redirect does not seem to work over SOCKS5 proxy #246

Open
opened 2022-02-22 17:51:38 +00:00 by DeeUnderscore · 12 comments
DeeUnderscore commented 2022-02-22 17:51:38 +00:00 (Migrated from gitlab.com)

Description

Conduit, when configured to proxy traffic over SOCKS5, seems to fail when talking to a remote server that redirects to a different subdomain via a SRV record.

I was trying to communicate with a remote homeserver with a setup like the following:

  • https://example.com/.well-known/matrix/server returns {"m.server": "example.com"}
  • SRV for _matrix._tcp.example.com is 10 0 8448 matrix.example.com
  • example.com does not respond on 8448 (Connection Refused), but matrix.example.com does respond on 8448

As far as I understand, this is an acceptable setup per spec.

I have a proxy configured as in the following example:

[global.proxy]
global = { url = "socks5h://192.168.1.100" }

The proxy is Dante running on a different machine. The relevant DNS records are resolvable via dig on this machine. Conduit has been successfully talking to a number of remote servers over this proxy for a couple of days.

When trying to accept an invite from someone on the example.com server, Element would fail and report Connection Refused to a https://example.com:8448/… URL. Conduit also logged "error trying to connect: socks connect error: Connection refused" when trying to fetch a https://example.com:8448/_matrix/media/r0/… URL, which I assume is the user's icon, since the icon did not show up in Element. This happened using both a socks5h and a socks5 proxy URL.

Removing the proxy configuration line from the config made it possible to accept the invite, and Conduit also succeeded at fetching the remote user's icon (it showed up in Element).

System Configuration

Conduit Version: 0.3.0
Database backend (default is sqlite): rocksdb

ticho34782694 commented 2022-02-22 19:05:49 +00:00 (Migrated from gitlab.com)

.well-known/matrix/server takes precedence over the DNS SRV record (https://spec.matrix.org/v1.2/server-server-api/#resolving-server-names), so the described setup of the remote homeserver (well-known pointing to example.com, but example.com not listening on 8448) seems incorrect to me.

DeeUnderscore commented 2022-02-22 19:17:26 +00:00 (Migrated from gitlab.com)

The third bullet point of step 3 in that section says:

If <delegated_hostname> is not an IP literal and no
<delegated_port> is present, an SRV record is looked up for
_matrix._tcp.<delegated_hostname>.

By that reading, if you receive {"m.server": "example.com"}, then <delegated_hostname> becomes example.com, you have no <delegated_port>, and you should do an SRV lookup. This is actually what Conduit does as of now (https://gitlab.com/famedly/conduit/-/blob/next/src/server_server.rs#L360-379), but I suspect the overrides it adds are not applied when using a proxy.
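To make that reading concrete, here is a minimal sketch of the decision a resolver makes after reading `m.server`. It is not Conduit's code; the names `NextStep`, `split_host_port`, and `next_step` are made up for illustration, and it covers only the branch on IP literal versus hostname and on whether a port is present.

```rust
use std::net::IpAddr;

/// What to do next after reading `m.server` from .well-known (hypothetical type).
#[derive(Debug, PartialEq)]
enum NextStep {
    /// IP literal: connect to it directly; the port defaults to 8448 if absent.
    ConnectIp { ip: IpAddr, port: u16 },
    /// Hostname with an explicit port: resolve A/AAAA and skip the SRV lookup.
    ResolveHost { host: String, port: u16 },
    /// Hostname without a port: look up this SRV name first.
    LookupSrv { srv_name: String },
}

/// Splits "host", "host:port", "[v6]" or "[v6]:port" into host and optional port.
fn split_host_port(s: &str) -> Option<(&str, Option<u16>)> {
    if let Some(rest) = s.strip_prefix('[') {
        // Bracketed IPv6 literal, optionally followed by ":port".
        let (host, after) = rest.split_once(']')?;
        return match after.strip_prefix(':') {
            Some(p) => Some((host, Some(p.parse().ok()?))),
            None if after.is_empty() => Some((host, None)),
            None => None,
        };
    }
    match s.rsplit_once(':') {
        Some((host, p)) => Some((host, Some(p.parse().ok()?))),
        None => Some((s, None)),
    }
}

/// Returns None if the value cannot be parsed as a server name.
fn next_step(m_server: &str) -> Option<NextStep> {
    let (host, port) = split_host_port(m_server)?;
    if host.is_empty() {
        return None;
    }
    if let Ok(ip) = host.parse::<IpAddr>() {
        return Some(NextStep::ConnectIp { ip, port: port.unwrap_or(8448) });
    }
    Some(match port {
        Some(port) => NextStep::ResolveHost { host: host.to_owned(), port },
        None => NextStep::LookupSrv { srv_name: format!("_matrix._tcp.{}", host) },
    })
}
```

With the setup from the description, `next_step("example.com")` yields `LookupSrv` for `_matrix._tcp.example.com`, so the connection should go to whatever that SRV record names rather than to example.com:8448.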
DeeUnderscore commented 2022-02-22 19:27:14 +00:00 (Migrated from gitlab.com)

changed the description

ticho34782694 commented 2022-02-23 19:37:29 +00:00 (Migrated from gitlab.com)

I just tried setting up a homeserver instance with a delegation setup as close as possible to what you described, but I cannot reproduce the issue. For reference, my instance is at tst.cyberdi.sk. The differences are that I use port 443 instead of 8448, and my delegated_hostname has the port open, but returns 401 for everything. (I am not in a position to get another server with a different IP address just for this.)

Note, though, that this seems to be an invalid configuration at least according to https://federationtester.matrix.org/api/report?server_name=tst.cyberdi.sk, which wants to talk to the delegated_hostname (tst.cyberdi.sk). My other Synapse server also refuses to federate with this server unless I make the homeserver available on the tst.cyberdi.sk hostname as well.

Maybe I'm doing something wrong, feel free to check my setup and suggest improvements.

tulir commented 2022-02-23 19:54:31 +00:00 (Migrated from gitlab.com)

maunium.net is available for testing the full .well-known + SRV resolution spec and will return fancy error responses if you do anything wrong.

DeeUnderscore commented 2022-02-23 20:21:54 +00:00 (Migrated from gitlab.com)

Note that per the spec, "Requests should be made to the resolved IP address and port with a Host header containing the <delegated_hostname>", so that's why the tester is going to send a Host header of tst.cyberdi.sk instead of matrix.tst.cyberdi.sk (and also expect a cert that's valid for tst.cyberdi.sk), and for maunium.net it accepts a cert for federation.mau.chat.
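As a rough illustration of that rule, the sketch below separates the three names involved in one request after .well-known delegation plus an SRV lookup. `RequestPlan` and `plan_request` are hypothetical names, not anything from Conduit or the federation tester.

```rust
/// The three names that matter for one federation request (hypothetical type).
#[derive(Debug, PartialEq)]
struct RequestPlan {
    /// Host and port the TCP connection is opened to (the SRV target).
    connect_to: (String, u16),
    /// Value of the HTTP Host header.
    host_header: String,
    /// Name the server's TLS certificate must be valid for.
    cert_name: String,
}

/// Builds the plan for the case where .well-known returned a hostname
/// without a port and the SRV lookup for it succeeded.
fn plan_request(delegated_hostname: &str, srv_target: &str, srv_port: u16) -> RequestPlan {
    RequestPlan {
        // SRV targets are often written with a trailing dot; drop it.
        connect_to: (srv_target.trim_end_matches('.').to_owned(), srv_port),
        // Host header and certificate both use the delegated hostname,
        // not the SRV target.
        host_header: delegated_hostname.to_owned(),
        cert_name: delegated_hostname.to_owned(),
    }
}
```

For the setup in the description, `plan_request("example.com", "matrix.example.com.", 8448)` connects to matrix.example.com on port 8448 while sending `Host: example.com` and expecting a certificate for example.com.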

ticho34782694 commented 2022-02-23 20:34:38 +00:00 (Migrated from gitlab.com)

Thanks, apparently I need to learn better how nginx routes traffic based on the TLS hostname and the Host header.

Anyway, it seems that this bug is reproducible directly on maunium.net - trying to fetch media via conduit with proxy disabled works (M_NOT_FOUND of course), but with proxy enabled, the response is:

{
    "errcode": "M_NOT_FOUND",
    "error": "Incorrect IP address, you need to read the spec more carefully :P",
    "solution": "You need to connect to meow.host.mau.fi (135.181.208.158 / 2a01:4f9:3a:26a1::) on port 443 as specified by the SRV record at _matrix._tcp.federation.mau.chat. The A and AAAA records on federation.mau.chat point to a different IP address (95.216.50.134 / 2a01:4f9:3a:ff34::).",
    "spec": "https://spec.matrix.org/v1.2/server-server-api/#resolving-server-names",
    "🐈": true
}

Thanks @tulir 😄, now to find out what is going wrong...

ticho34782694 commented 2022-02-23 22:08:26 +00:00 (Migrated from gitlab.com)

Best I can tell, with proxy enabled, reqwest does not go through the resolve function of federation_client (https://gitlab.com/famedly/conduit/-/blob/next/src/database/globals.rs#L140), so it ends up trying to connect to the wrong hostname.

With proxy enabled, it is equivalent to a simple

curl https://delegated_hostname/...

while with proxy disabled, it does something like

curl --connect-to delegated_hostname:443:actual_destination:443 https://delegated_hostname/...

(yes, that is a horrible way to send custom SNI with curl, but I don't think a better one exists)
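The same difference can be written down as a toy model. This is only a sketch of the behaviour described above, assuming the diagnosis is right; it is not reqwest's or Conduit's actual code, and `Dial`, `dial`, and the 192.0.2.10 address are invented for illustration.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Where the TCP connection for a request ends up being opened (hypothetical type).
#[derive(Debug, PartialEq)]
enum Dial {
    /// Conduit connects itself, to an address taken from its override table.
    Direct(SocketAddr),
    /// The SOCKS5 proxy is asked to connect to this host and port.
    ViaProxy { host: String, port: u16 },
}

/// `overrides` maps a URL hostname to the address found via SRV resolution.
/// Returns None when no override exists and ordinary DNS would be used instead.
fn dial(
    url_host: &str,
    url_port: u16,
    overrides: &HashMap<String, SocketAddr>,
    use_proxy: bool,
) -> Option<Dial> {
    if use_proxy {
        // Reported behaviour: the override table is never consulted, so the
        // proxy is handed the hostname and port straight from the URL.
        return Some(Dial::ViaProxy { host: url_host.to_owned(), port: url_port });
    }
    overrides.get(url_host).copied().map(Dial::Direct)
}
```

If `overrides` maps example.com to the address of matrix.example.com, the direct path dials that address, while the proxied path still asks for example.com:8448, which matches the Connection Refused in the original report.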

timokoesters commented 2022-04-01 07:56:37 +00:00 (Migrated from gitlab.com)

Conduit uses reqwest's ClientBuilder::proxy to configure the proxy and a custom DNS resolver to make SRV redirects work. Does the proxy have to use a custom resolver too?

timokoesters commented 2022-04-01 09:59:09 +00:00 (Migrated from gitlab.com)
Maybe this is relevant: https://github.com/matrix-org/matrix-spec/issues/561
ticho34782694 commented 2022-04-01 14:38:12 +00:00 (Migrated from gitlab.com)

Yes, the outgoing proxy would have to use the same custom resolver, because right now, proxied requests can end up at the wrong destination for anything besides the simplest delegation setups.

tamara-schmitz commented 2023-10-24 15:33:47 +00:00 (Migrated from gitlab.com)

mentioned in issue #395

Reference: Matthias/conduit#246