lb-rs: enum approach for co-operative core processes #3590

Kousay-Jebir · 2025-06-16T19:26:04Z

The first commits do the following big changes :

Renamed Lb to InnerLb. This has the consequence of aliasing imports of Lb in all the service/*.rs files
Moved InnerLb definition and its init definition to its own module.
lib.rs now only imports the enum as well as the two concrete structs InnerLb and ProxyLb
to use crate::InnerLb as Lb.
the enum itslef is now taking the name Lb. Added pub use enum_lb::Lb; so that lb-rs clients don't have to adjust their imports.
rpc module contains two structs to properly define what a request and a response looks like. for now these reside in the module file itself but i think it needs to be moved to the model folder in future commits
error.rs's LbErr now implements Ser and Deser, I did ask the AI to do these implementations just demo purposes for now since it's that relevant right now for what I've been working on will probably work on improving them in later commits
For now i went with the approach where each call opens its own tcp connection. thus the definition of ProxyLb not containing a connection field used across the whole api.
For drafting purposes, I hardcoded a TCP port that init will try to connect to, Thinking about moving this in Config

The goal of these first commits is to prepare the overall structure.
If this gets approved, future commits will focus on implementing the other apis, further organize the files, get rid of all the warnings and at some point implement robust measures such as retrial on error on connections etc...

libs/lb/lb-rs/src/enum_lb.rs

libs/lb/lb-rs/Cargo.toml

Kousay-Jebir · 2025-07-05T11:32:39Z

pub struct LbClient {
    pub addr: SocketAddrV4,
    pub events: EventSubs
}

do you think this is valid ? , upon inspecting i see that there is no reason for subscribe to be called via rpc and let the leader handle it, i think followers can just call their own without interacting with the leader here ?

Parth

good stuff, continuing to head in the right direction, you should fire off a cargo fmt soon

libs/lb/lb-rs/src/blocking.rs

Parth · 2025-07-07T22:28:02Z

libs/lb/lb-rs/src/dispatch.rs

@@ -0,0 +1,419 @@
+pub async fn dispatch(lb: Arc<LbServer>, req: RpcRequest) -> LbResult<Vec<u8>> {
+
+    let raw = req.args.unwrap_or_default();


I will have some comments in the client side of this that should render the unwrap_or_default() un-needed.

Parth · 2025-07-07T23:46:34Z

libs/lb/lb-rs/src/dispatch.rs

+        "import_account_private_key_v2" => {
+            let (pk_bytes, api_url): ([u8; 32], String) = deserialize_args(&raw)?;
+            let sk = SecretKey::parse(&pk_bytes).map_err(core_err_unexpected)?;
+            call_async(|| lb.import_account_private_key_v2(sk, &api_url)).await?
+        }
+
+        "import_account_phrase" => {
+            let (phrase_vec, api_url): (Vec<String>, String) = deserialize_args(&raw)?;
+            let slice: Vec<&str> = phrase_vec.iter().map(|s| s.as_str()).collect();
+            let phrase_arr: [&str; 24] = slice
+                .try_into()
+                .map_err(core_err_unexpected)?;
+            call_async(|| lb.import_account_phrase(phrase_arr, &api_url)).await?
+        }


not sure why these have been lifted out to here

Parth · 2025-07-07T23:48:45Z

libs/lb/lb-rs/src/dispatch.rs

+fn call_sync<R>(f: impl FnOnce() -> LbResult<R>) -> LbResult<Vec<u8>>
+where
+    R: Serialize,
+{
+    let res: R = f()?;
+    bincode::serialize(&res).map_err(|e| core_err_unexpected(e).into())
+}


is this needed? I thought all the lb-rs fns became async because of the network activity

all of lb-enum (in turn lb-client) are async due to the network activity, but i haven't turned sync methods of lb-server to async cuz no need to

Parth · 2025-07-07T23:51:01Z

libs/lb/lb-rs/src/dispatch.rs

+use std::future::Future;
+use std::path::PathBuf;
+use std::sync::Arc;
+
+use libsecp256k1::SecretKey;
+use serde::de::DeserializeOwned;
+use serde::Serialize;
+use std::sync::Mutex;
+use uuid::Uuid;
+use crate::model::api::{AccountFilter, AccountIdentifier, AdminSetUserTierInfo, ServerIndex, StripeAccountTier};
+use crate::model::errors::{LbErrKind};
+use crate::model::errors::{core_err_unexpected};
+use crate::model::file::ShareMode;
+use crate::model::file_metadata::{DocumentHmac, FileType};
+use crate::model::path_ops::Filter;
+use crate::rpc::{CallbackMessage, RpcRequest};
+use crate::service::activity::RankingWeights;
+use crate::service::import_export::{ExportFileInfo, ImportStatus};
+use crate::subscribers::search::SearchConfig;
+use crate::{LbServer,LbResult};


Personally I'm sympathetic to all use statements being at the bottom of all files, I like to order things in some sort of significance, either alpha or importance and use statements usually are just noise to me.

But it's too un-idiomatic and foreign of a thing to do. If rustfmt supported it as a config and it happened automatically I could be down to be weird. But in the absence of that I think it's better off not happening on a project we want many people to contribute to.

Parth · 2025-07-08T00:09:52Z

libs/lb/lb-rs/src/rpc.rs

+    stream.read_exact(&mut buf).await?;
+
+    let req: RpcRequest = bincode::deserialize(&buf).map_err(core_err_unexpected)?;
+    let payload = dispatch(lb, req).await?;


Suggested change

let payload = dispatch(lb, req).await?;

let resp = dispatch(lb, req).await?;

Parth · 2025-07-08T00:11:11Z

libs/lb/lb-rs/src/service/keychain.rs

Not sure why serialization stuff was added here, this file is prob only for LbServer internals.

had to do this cuz now theres get_keychain

I am pretty sure no one outside lb is using keychain tho. So there prob shouldn't be a get_keychain

Parth · 2025-07-08T00:11:39Z

libs/lb/lb-rs/src/subscribers/search.rs

 }

-#[derive(Copy, Clone, Debug)]
+impl Serialize for SearchIndex {


Not sure why serialization stuff was added here, this file is prob only for LbServer internals.

same here, for get_search

still I may be mis-understanding but I'm fairly sure that no one will be serializing / deserializing the whole search index -- which is basically every document stored in mem

Parth · 2025-07-08T00:12:15Z

libs/lb/lb-rs/src/lb_enum.rs

I think you should just call this file lb.rs

Parth · 2025-07-08T00:37:08Z

libs/lb/lb-rs/src/lb_client.rs

+#[derive(Clone)]
+pub struct LbClient {
+    pub addr: SocketAddrV4,
+    pub events: EventSubs


I don't think this makes sense:

i think followers can just call their own without interacting with the leader here ?

let me explain the point of the subscription api:
lots of people will ask questions from lb-rs (like: "what are all the files?"), and then they'll show that to the user.

The subscription api tells them when that data is stale and it's time to get fresh data.

Followers simply won't have access to the information unless it's pushed from the server to them.

Followers are interested in subscribing to a tokio broadcast channel. I would read generally about channels and then read about broadcast if you're unfamiliar with these synchronization primitives. So you're close regarding what you have, but I don't think this field belongs in LbClient, I think it's going to be something like this:

With your existing stateless protocol you'd have to do something like this:

when someone calls subscribe the client implementation do the following:

create a broadcast channel of it's own

create a new thread, this thread will call_rpc, and the dispatch version of subscribe will do the actual subscription and continuously respond with updates (the channel can't be sent over the network so this is the only sensible reply here). This helper thread on the client will recv those messages form them back into rust structs and broadcast them to whoever is listening.

the client subscribe impl will return the recv end of the broadcast channel as soon as the thread is created

the thread will populate the channel beyond that point.

When I was advising you to not really worry about callbacks it's to reserve your resources for mastery over this one method -- subscribe. I am interested in pretty soon deprecating all the callbacks we have already in favor of this new api.

Pushing data from within lb to clients is annoying, you're experiencing it's annoyance over the network, smail experiences in ffi, etc. So we just want to do it once and then just rely on our rich typesystem to carry us beyond that point once the pipe is setup.

There are three reasons I want your connection to be stateful, and you can work around all of them:

compatibility, if one client is updated before another one, then the connection may be incompatible, we'll need to explore conventions here, maybe one broadcasts it's min supported version (like our actual server does). Ideally this would just happen at connection time, but it's prob fast enough for it to happen over every connection. You can reserve room in the protocol like you did for the size of the transmission. Or it's part of the connection sequence (that's what I would do).

Performance overhead -- I assume it's small but worth addressing, but you can measure and if it's insignificant then you don't need to do it.

Security, right now there's no benefit to guarding access to localhost (to be clear it's not 0.0.0.0) but one day (optional passphrase #159) there likely will be and then the connection will probably take a passphrase or some other sort of API key to start. It's possible that you'd still need to log in to the client regardless, and that client would have a bb core with just an account or something that's used to authenticate with LbServer.

Kousay-Jebir · 2025-08-14T14:11:35Z

Should i work on fixing resulting errors in lb-c lb-java and test_utils

Parth · 2025-08-14T16:55:50Z

You can probably fix those errors last -- we can get very far with just CLI & egui as a proof of concept. Happy to see this moving again!

Parth · 2025-09-15T16:54:57Z

libs/lb/lb-rs/src/rpc.rs

+) -> LbResult<()> {
+    let method_id = method as u16;
+    let body = bincode::serialize(args).map_err(core_err_unexpected)?;
+    let len = body.len();


I think sending a len prefix is fine and pretty common

you should just know about another approach, one that fits this 1 connection per request a bit better:

after finishing your writes you can execute stream.Shutdown(write) and then on the reading side you can do read_all. This means you don't have to negotiate the types.

This lets you do a single, probably more optimal syscall, signalling to the OS, I want to put all available bytes into memory, I assume this is the fastest way to express the idea if the end goal is in-fact to get all the bytes into memory.

however if we find that we're trying to head in the direction of re-using the connection (which is likely) then you'll need to negotiate the length. So I think keeping it in this half way point is fine until we know what direction we want to head and then produce the optimal implementation for that strategy.

Parth · 2025-09-15T16:59:43Z

libs/lb/lb-rs/src/rpc.rs

+    stream
+        .read_exact(&mut len_buf)
+        .await
+        .map_err(core_err_unexpected)?;
+    let resp_len = usize::from_be_bytes(len_buf);


there is a stream.read_u64 and a stream.write_u64 which could simplify these and related calls

why do you use _be_ in some places and _le_ in others? this makes me upgrade the above to a "requested change" from a suggestion.

Parth

Okay this is actually quite close to being ready to merge:

rebase needs to happen, so people can test with the most recent version
minor comments around the protocol need to be addressed
search index for sure does not need to be serialized, the search queries will be sent to LbServer which is the only one who will have the index.
we can make LbErr::backtrace an Option<String> instead of an Option<Backtrace> which will let you derive Serialize and Deserialize via the macro on the LbErr type.

Kousay-Jebir · 2025-09-16T13:24:58Z

Regarding the SearchIndex serialization, in clients/cli/src/main.rs line 241 used to have
lb.search.tantivy_reader.reload().unwrap(); i had to add a get_search method so that this can retreive the index from the leader when lb is a follower. i think this was my only reason that made me impl ser and deser on it.

Parth · 2025-09-16T18:46:51Z

Regarding the SearchIndex serialization, in clients/cli/src/main.rs line 241 used to have lb.search.tantivy_reader.reload().unwrap(); i had to add a get_search method so that this can retreive the index from the leader when lb is a follower. i think this was my only reason that made me impl ser and deser on it.

OKay that makes sense, I think the preferred way to solve this would be to expose a new endpoint for the CLI to refresh the index.

Kousay-Jebir added 5 commits June 15, 2025 19:55

rename Lb as InnerLb and move it to a seperate module file

c72b9c7

create proxy_lb module and the Lb enum in lib.rs as the main entry point

b34b311

very minimal structure to decide if follower / leader or error on both

f4346f1

start of rpc module and firs use

542ea97

create_account impl for the enum

89b066f

Kousay-Jebir mentioned this pull request Jun 16, 2025

multiple independent co-operative core processes #2207

Closed