bug: "Lost connection to relay server" warnings when not gracefully dropping endpoint, binding a new one and re-using secret key #3798

@adzialocha

Describe the bug

When running an iroh Endpoint with a fixed secret key and the default preset, a connection to the relay is established successfully. If this endpoint instance is dropped and a new Endpoint instance is launched with the same secret key, I observe the following warnings in quick succession:

2025-12-21T10:57:33.508260Z  INFO iroh_relay_bug: ==========================
2025-12-21T10:57:33.508289Z  INFO iroh_relay_bug: spawning endpoint for the first time
2025-12-21T10:57:33.508297Z  INFO iroh_relay_bug: ==========================
2025-12-21T10:57:33.672445Z  INFO iroh_relay_bug: all good here
2025-12-21T10:57:33.672596Z  INFO iroh_relay_bug: ==========================
2025-12-21T10:57:33.672644Z  INFO iroh_relay_bug: spawning endpoint for the second time
2025-12-21T10:57:33.672678Z  INFO iroh_relay_bug: ==========================
2025-12-21T10:57:33.949755Z  WARN iroh::magicsock::transports::relay::actor: Lost connection to relay server
2025-12-21T10:57:34.049130Z  WARN iroh::magicsock::transports::relay::actor: Failed to handshake with relay server
2025-12-21T10:57:34.168594Z  WARN iroh::magicsock::transports::relay::actor: Lost connection to relay server
2025-12-21T10:57:34.271563Z  WARN iroh::magicsock::transports::relay::actor: Failed to handshake with relay server
2025-12-21T10:57:34.399962Z  WARN iroh::magicsock::transports::relay::actor: Lost connection to relay server
2025-12-21T10:57:34.514966Z  WARN iroh::magicsock::transports::relay::actor: Lost connection to relay server
2025-12-21T10:57:34.611387Z  WARN iroh::magicsock::transports::relay::actor: Failed to handshake with relay server
2025-12-21T10:57:34.718899Z  WARN iroh::magicsock::transports::relay::actor: Lost connection to relay server
2025-12-21T10:57:34.806516Z  WARN iroh::magicsock::transports::relay::actor: Failed to handshake with relay server
2025-12-21T10:57:35.113331Z  INFO iroh_relay_bug: ==========================
2025-12-21T10:57:35.113408Z  INFO iroh_relay_bug: endpoint closed

Eventually, the second endpoint also establishes a connection to the relay.

In p2panda we've also observed scenarios where this never happens: the relay connection is never established and we see the same warnings, only many more of them (something appears to hang in a loop while trying to establish the relay connection).

The warnings do not occur if the process is restarted or if the endpoint is closed gracefully.

Relevant Logs

https://github.com/user-attachments/files/24278171/logs.txt
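
To put a number on how quickly the warnings follow each other, a log in the format above can be tallied with a small helper. This is only a sketch using the Rust standard library: the names seconds_of_day, WarnSummary and summarize are made up for illustration and are not part of iroh, and the parser assumes the timestamp layout shown in the excerpt and that all lines fall on the same day.

```rust
/// Seconds since midnight parsed from a timestamp such as
/// `2025-12-21T10:57:33.949755Z`. Returns `None` if the shape is unexpected.
/// Assumption: all log lines fall on the same day (midnight is not handled).
fn seconds_of_day(timestamp: &str) -> Option<f64> {
    let time = timestamp.split('T').nth(1)?.trim_end_matches('Z');
    let mut parts = time.split(':');
    let hours: f64 = parts.next()?.parse().ok()?;
    let minutes: f64 = parts.next()?.parse().ok()?;
    let seconds: f64 = parts.next()?.parse().ok()?;
    Some(hours * 3600.0 + minutes * 60.0 + seconds)
}

/// Summary of the two relay warnings seen in the log excerpt (hypothetical type).
#[derive(Debug, PartialEq)]
struct WarnSummary {
    lost_connection: usize,
    failed_handshake: usize,
    /// Seconds between the first and the last relay warning.
    span_secs: f64,
}

/// Counts the two relay warnings and measures the time span they cover.
fn summarize(log: &str) -> WarnSummary {
    let mut lost = 0;
    let mut failed = 0;
    let mut first = None;
    let mut last = None;

    for line in log.lines() {
        // The first two whitespace-separated fields are timestamp and level.
        let mut fields = line.split_whitespace();
        let (Some(timestamp), Some(level)) = (fields.next(), fields.next()) else {
            continue;
        };
        if level != "WARN" {
            continue;
        }
        if line.contains("Lost connection to relay server") {
            lost += 1;
        } else if line.contains("Failed to handshake with relay server") {
            failed += 1;
        } else {
            continue;
        }
        if let Some(secs) = seconds_of_day(timestamp) {
            first.get_or_insert(secs);
            last = Some(secs);
        }
    }

    let span_secs = match (first, last) {
        (Some(start), Some(end)) => end - start,
        _ => 0.0,
    };
    WarnSummary {
        lost_connection: lost,
        failed_handshake: failed,
        span_secs,
    }
}
```

Applied to the excerpt in this report, this counts 5 "Lost connection" and 4 "Failed to handshake" warnings within roughly 0.86 seconds, so about one warning every 100 ms.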

Expected behavior

An endpoint may fail to shut down gracefully because of faulty implementations, bugs, supervised systems, etc. The expected behaviour is to be robust in these scenarios and to handle reconnection of the same endpoint id on the relay "as usual". If this is not possible, the user should perhaps be informed about the issue in some way.

Iroh

Version: 0.95.1
Endpoint configuration: Default

Platform(s)

Desktop:

  • Operating System: Arch Linux, Kernel: Linux 6.18.1-arch1-2, Architecture: x86-64

Additional Context / Screenshots / GIFs

A simple program to reproduce this issue (this is also where the logs above come from). If you comment out the code from the drop(endpoint) call through endpoint.close() and instead restart the process manually, no warnings are shown:

use std::str::FromStr;
use std::time::Duration;

use anyhow::Result;
use iroh::{Endpoint, SecretKey};
use tracing::info;

fn setup_logging() {
    if std::env::var("RUST_LOG").is_ok() {
        let _ = tracing_subscriber::fmt()
            .with_env_filter(tracing_subscriber::EnvFilter::from_default_env())
            .try_init();
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    setup_logging();

    let secret_key =
        SecretKey::from_str("ae58ff8833241ac82d6ff7611046ed67b5072d142c588d0063e942d9a75502b6")?;

    // First round.
    info!("==========================");
    info!("spawning endpoint for the first time");
    info!("==========================");

    let endpoint = Endpoint::builder()
        .secret_key(secret_key.clone())
        .bind()
        .await?;

    tokio::time::timeout(Duration::from_secs(10), endpoint.online()).await?;

    info!("all good here");

    // Drop the endpoint without calling close(), i.e. a non-graceful shutdown.
    drop(endpoint);

    // Second round (where we observe the warnings).
    info!("==========================");
    info!("spawning endpoint for the second time");
    info!("==========================");

    let endpoint = Endpoint::builder().secret_key(secret_key).bind().await?;

    tokio::time::timeout(Duration::from_secs(10), endpoint.online()).await?;

    endpoint.close().await;

    info!("==========================");
    info!("endpoint closed");

    Ok(())
}

Metadata

Labels: bug
Status: Ready