Fetch server keys concurrently #985

Open
_ZN3val wants to merge 2 commits from concurrent-fetch-keys into next
_ZN3val commented 2023-11-24 20:12:54 +00:00 (Migrated from gitlab.com)

This speeds up handling of /_matrix/federation/v1/send/:transaction_id
when more than one event contains unknown keys.

In particular, when receiving multiple PDUs with dead servers in their
auth chain, timeouts of each server accumulate and can make handling of
incoming requests take several minutes, to the point the client closes
the connection (eg. matrix.org has a 2 minute timeout), causing new
events to be dropped eventually.


  • I agree to release my code and all other changes of this MR under the Apache-2.0 license
This speeds up handling of /_matrix/federation/v1/send/:transaction_id when more than one event contains unknown keys. In particular, when receiving multiple PDUs with dead servers in their auth chain, timeouts of each server accumulate and can make handling of incoming requests take several minutes, to the point the client closes the connection (eg. matrix.org has a 2 minute timeout), causing new events to be dropped eventually. ----------------------------------------------------------------------------- - [x] I agree to release my code and all other changes of this MR under the Apache-2.0 license
_ZN3val commented 2023-11-24 20:57:31 +00:00 (Migrated from gitlab.com)

added 1 commit

  • a258ad83 - Fetch server keys concurrently

Compare with previous version

added 1 commit <ul><li>a258ad83 - Fetch server keys concurrently</li></ul> [Compare with previous version](/famedly/conduit/-/merge_requests/560/diffs?diff_id=854324680&start_sha=55eacfdbdd1c8fd8dcafbf28e43a949f44012c90)
_ZN3val commented 2023-11-24 21:46:39 +00:00 (Migrated from gitlab.com)

added 1 commit

  • e986bf2d - Fetch server keys concurrently

Compare with previous version

added 1 commit <ul><li>e986bf2d - Fetch server keys concurrently</li></ul> [Compare with previous version](/famedly/conduit/-/merge_requests/560/diffs?diff_id=854336688&start_sha=a258ad83ea1d1860d761e0fe242523b5c0c6882e)
_ZN3val commented 2023-11-25 17:51:48 +00:00 (Migrated from gitlab.com)

added 1 commit

  • 28db5ada - Fetch server keys concurrently

Compare with previous version

added 1 commit <ul><li>28db5ada - Fetch server keys concurrently</li></ul> [Compare with previous version](/famedly/conduit/-/merge_requests/560/diffs?diff_id=854543932&start_sha=e986bf2d63089f32a9003cdd2d74b7c67aa792ac)
Aranjedeath commented 2023-11-26 18:27:58 +00:00 (Migrated from gitlab.com)

mentioned in commit Aranjedeath/conduit@e6938db25b

mentioned in commit Aranjedeath/conduit@e6938db25b246aa789a1f83866e183a3cf86c92f
_ZN3val commented 2023-12-12 18:41:14 +00:00 (Migrated from gitlab.com)

added 1 commit

  • e7f4dd47 - Add missing pub key in send_join handling

Compare with previous version

added 1 commit <ul><li>e7f4dd47 - Add missing pub key in send_join handling</li></ul> [Compare with previous version](/famedly/conduit/-/merge_requests/560/diffs?diff_id=870477460&start_sha=28db5ada33cb3cf6946fb9bb86cc443da93f3a3d)
Aranjedeath commented 2024-01-24 03:52:12 +00:00 (Migrated from gitlab.com)

I have been using this patch since written. it works well.

I have been using this patch since written. it works well.
timokoesters commented 2024-01-24 18:22:06 +00:00 (Migrated from gitlab.com)

I only skimmed this code so far, but it looks like you reimplemented some of the functionality found in event_handler::fetch_join_signing_keys, which is a bit more powerful because it tries to get all keys in one go from the trusted server. I think instead of introducing more functions, we can make fetch_join_signing_keys more generic and use it in more places, what do you think?

But I also feel like this MR is not as useful as it sounds, because it's only a single transaction that takes long. After that, either the keys are loaded, or the servers are on the backoff blacklist.

I only skimmed this code so far, but it looks like you reimplemented some of the functionality found in event_handler::fetch_join_signing_keys, which is a bit more powerful because it tries to get all keys in one go from the trusted server. I think instead of introducing more functions, we can make fetch_join_signing_keys more generic and use it in more places, what do you think? But I also feel like this MR is not as useful as it sounds, because it's only a single transaction that takes long. After that, either the keys are loaded, or the servers are on the backoff blacklist.
_ZN3val commented 2024-01-24 19:35:57 +00:00 (Migrated from gitlab.com)

instead of introducing more functions

this isn't introducing a new function, I only generalized fetch_required_signing_keys.
But indeed, they do look similar. Maybe

But I also feel like this MR is not as useful as it sounds

It is a requirement to run a server with trusted_servers = [] if you want to join a handful of non-trivial rooms. Without it, most PDUs from matrix.org can't be processed

After that, either the keys are loaded, or the servers are on the backoff blacklist.

or the client closed the connection before we finished processing, causing the request to be killed before anything is persisted.

> instead of introducing more functions this isn't introducing a new function, I only generalized `fetch_required_signing_keys`. But indeed, they do look similar. Maybe > But I also feel like this MR is not as useful as it sounds It is a requirement to run a server with `trusted_servers = []` if you want to join a handful of non-trivial rooms. Without it, most PDUs from `matrix.org` can't be processed > After that, either the keys are loaded, or the servers are on the backoff blacklist. or the client closed the connection before we finished processing, causing the request to be killed before anything is persisted.
timokoesters commented 2024-01-24 20:54:47 +00:00 (Migrated from gitlab.com)

It is a requirement to run a server with trusted_servers = [] if you want to join a handful of non-trivial rooms. Without it, most PDUs from matrix.org can't be processed

I don't think this MR changes anything about room joins, also see my chat messages here: https://matrix.to/#/!SMloEYlhCiqKwRLAgY:fachschaften.org/$GwAOU5o9gh_ULd2rjZO4PoNF4wDphZCkvz_IqgV2ENM?via=flipdot.org&via=matrix.org&via=conduit.rs

or the client closed the connection before we finished processing, causing the request to be killed before anything is persisted.

All requests are handled in a new tokio task that will not be cancelled when the client disconnects, see spawn_task in main.rs

> It is a requirement to run a server with `trusted_servers = []` if you want to join a handful of non-trivial rooms. Without it, most PDUs from `matrix.org` can't be processed I don't think this MR changes anything about room joins, also see my chat messages here: https://matrix.to/#/!SMloEYlhCiqKwRLAgY:fachschaften.org/$GwAOU5o9gh_ULd2rjZO4PoNF4wDphZCkvz_IqgV2ENM?via=flipdot.org&via=matrix.org&via=conduit.rs > or the client closed the connection before we finished processing, causing the request to be killed before anything is persisted. All requests are handled in a new tokio task that will not be cancelled when the client disconnects, see `spawn_task` in main.rs
_ZN3val commented 2024-01-24 21:10:21 +00:00 (Migrated from gitlab.com)

I don't think this MR changes anything about room joins

I meant that after joining the rooms, PDUs from matrix.org time out

All requests are handled in a new tokio task that will not be cancelled when the client disconnects, see spawn_task in main.rs

well I don't know then, but I can tell you that for two weeks my server was unable to process PDUs from matrix.org in times of high activity and logging that it was fetching the same keys over and over.

> I don't think this MR changes anything about room joins I meant that after joining the rooms, PDUs from matrix.org time out > All requests are handled in a new tokio task that will not be cancelled when the client disconnects, see `spawn_task` in main.rs well I don't know then, but I can tell you that for two weeks my server was unable to process PDUs from matrix.org in times of high activity and logging that it was fetching the same keys over and over.
timokoesters commented 2024-01-24 21:12:03 +00:00 (Migrated from gitlab.com)

Interesting, I will try to look through the code again with this bug in mind

Interesting, I will try to look through the code again with this bug in mind
Aranjedeath commented 2024-02-14 04:04:07 +00:00 (Migrated from gitlab.com)

mentioned in commit Aranjedeath/conduit@3122a7417b

mentioned in commit Aranjedeath/conduit@3122a7417b1bc661f0f9a85f0a7c1d8b5fb9fe30
This pull request is broken due to missing fork information.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin concurrent-fetch-keys:concurrent-fetch-keys
git checkout concurrent-fetch-keys

Merge

Merge the changes and update on Forgejo.
git checkout next
git merge --no-ff concurrent-fetch-keys
git checkout next
git merge --ff-only concurrent-fetch-keys
git checkout concurrent-fetch-keys
git rebase next
git checkout next
git merge --no-ff concurrent-fetch-keys
git checkout next
git merge --squash concurrent-fetch-keys
git checkout next
git merge --ff-only concurrent-fetch-keys
git checkout next
git merge concurrent-fetch-keys
git push origin next
Sign in to join this conversation.
No reviewers
No labels
Android
CS::needs customer feedback
CS::needs follow up
CS::needs on prem installation
CS::waiting
Chrome
Design:: Ready
Design:: in progress
Design::UX
E2EE
Edge
Firefox
GDPR
Iteration 13 IM
Linux
MacOS
Need::Discussion
Need::Steps to reproduce
Need::Upstream fix
Needs:: Planning
Needs::Dev-Team
Needs::More information
Needs::Priority
Needs::Product
Needs::Refinement
Needs::Severity
Priority::1-Critical
Priority::2-Max
Priority::3-Impending
Priority::4-High
Priority::5-Medium
Priority::6-Low
Priority::7-None
Progress::Backlog
Progress::Review
Progress::Started
Progress::Testing
Progress::Triage
Progress::Waiting
Reporter::Sentry
Safari
Target::Community
Target::Customer
Target::Internal
Target::PoC
Target::Security
Team:Customer-Success
Team:Design
Team:Infrastructure
Team:Instant-Messaging
Team:Product
Team:Workflows
Type::Bug
Type::Design
Type::Documentation
Type::Feature
Type::Improvement
Type::Support
Type::Tests
Windows
blocked
blocked-by-spec
cla-signed
conduit
contribution::advanced
contribution::easy
contribution::help needed
from::review
iOS
p::ti-tenant
performance
product::triage
proposal
refactor
release-blocker
s: dart_openapi_codegen
s::Famedly-Patient
s::Org-Directory
s::Passport-Generator
s::Requeuest
s:CRM
s:Famedly-App
s:Famedly-Web
s:Fhiroxide
s:Fhiroxide-cli
s:Fhiroxide-client
s:Fhirs
s:Hedwig
s:LISA
s:Matrix-Dart-SDK
s:Role-Manager
s:Synapse
s:User-Directory
s:WFS-Matrix
s:Workflow Engine
s:dtls
s:famedly-error
s:fcm-shared-isolate
s:matrix-api-lite
s:multiple-tab-detector
s:native-imaging
severity::1
severity::2
severity::3
severity::4
technical-debt
voip
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: Matthias/conduit#985
No description provided.