lb-rs: enum approach for co-operative core processes #3590
base: master
Conversation
Do you think this is valid? Upon inspecting, I see that there is no reason for
Parth left a comment
Good stuff, continuing to head in the right direction. You should fire off a `cargo fmt` soon.
libs/lb/lb-rs/src/dispatch.rs (Outdated)

```rust
pub async fn dispatch(lb: Arc<LbServer>, req: RpcRequest) -> LbResult<Vec<u8>> {
    let raw = req.args.unwrap_or_default();
```
I will have some comments on the client side of this that should render the `unwrap_or_default()` unneeded.
libs/lb/lb-rs/src/dispatch.rs (Outdated)

```rust
"import_account_private_key_v2" => {
    let (pk_bytes, api_url): ([u8; 32], String) = deserialize_args(&raw)?;
    let sk = SecretKey::parse(&pk_bytes).map_err(core_err_unexpected)?;
    call_async(|| lb.import_account_private_key_v2(sk, &api_url)).await?
}

"import_account_phrase" => {
    let (phrase_vec, api_url): (Vec<String>, String) = deserialize_args(&raw)?;
    let slice: Vec<&str> = phrase_vec.iter().map(|s| s.as_str()).collect();
    let phrase_arr: [&str; 24] = slice
        .try_into()
        .map_err(core_err_unexpected)?;
    call_async(|| lb.import_account_phrase(phrase_arr, &api_url)).await?
}
```
Not sure why these have been lifted out to here.
```rust
fn call_sync<R>(f: impl FnOnce() -> LbResult<R>) -> LbResult<Vec<u8>>
where
    R: Serialize,
{
    let res: R = f()?;
    bincode::serialize(&res).map_err(|e| core_err_unexpected(e).into())
}
```
Is this needed? I thought all the lb-rs fns became async because of the network activity.
All of lb-enum (and in turn lb-client) is async due to the network activity, but I haven't turned the sync methods of lb-server into async because there's no need to.
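The dispatch arms quoted earlier call a `call_async` helper that isn't shown in this excerpt. A hedged reconstruction, assuming it simply mirrors `call_sync` for async methods (the exact bounds and body are guesses, not the actual PR code):

```rust
// Assumed async counterpart to call_sync: await the future, then serialize the
// result with bincode. The signature here is an illustrative guess.
async fn call_async<R, F, Fut>(f: F) -> LbResult<Vec<u8>>
where
    R: Serialize,
    F: FnOnce() -> Fut,
    Fut: Future<Output = LbResult<R>>,
{
    let res: R = f().await?;
    bincode::serialize(&res).map_err(|e| core_err_unexpected(e).into())
}
```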
libs/lb/lb-rs/src/dispatch.rs (Outdated)

```rust
use std::future::Future;
use std::path::PathBuf;
use std::sync::Arc;

use libsecp256k1::SecretKey;
use serde::de::DeserializeOwned;
use serde::Serialize;
use std::sync::Mutex;
use uuid::Uuid;
use crate::model::api::{AccountFilter, AccountIdentifier, AdminSetUserTierInfo, ServerIndex, StripeAccountTier};
use crate::model::errors::{LbErrKind};
use crate::model::errors::{core_err_unexpected};
use crate::model::file::ShareMode;
use crate::model::file_metadata::{DocumentHmac, FileType};
use crate::model::path_ops::Filter;
use crate::rpc::{CallbackMessage, RpcRequest};
use crate::service::activity::RankingWeights;
use crate::service::import_export::{ExportFileInfo, ImportStatus};
use crate::subscribers::search::SearchConfig;
use crate::{LbServer, LbResult};
```
Personally I'm sympathetic to all use statements being at the bottom of all files; I like to order things by some sort of significance, either alphabetical or importance, and use statements usually are just noise to me.
But it's too un-idiomatic and foreign of a thing to do. If rustfmt supported it as a config and it happened automatically I could be down to be weird. But in the absence of that I think it's better off not happening on a project we want many people to contribute to.
libs/lb/lb-rs/src/rpc.rs (Outdated)

```rust
stream.read_exact(&mut buf).await?;

let req: RpcRequest = bincode::deserialize(&buf).map_err(core_err_unexpected)?;
let payload = dispatch(lb, req).await?;
```
Suggested change:

```diff
- let payload = dispatch(lb, req).await?;
+ let resp = dispatch(lb, req).await?;
```
Not sure why serialization stuff was added here, this file is prob only for LbServer internals.
Had to do this because now there's `get_keychain`.
I am pretty sure no one outside lb is using the keychain, though, so there probably shouldn't be a `get_keychain`.
```rust
}

#[derive(Copy, Clone, Debug)]
impl Serialize for SearchIndex
```
Not sure why serialization stuff was added here, this file is prob only for LbServer internals.
same here, for get_search
Still, I may be misunderstanding, but I'm fairly sure that no one will be serializing/deserializing the whole search index -- which is basically every document stored in memory.
libs/lb/lb-rs/src/lb_enum.rs (Outdated)
I think you should just call this file lb.rs
libs/lb/lb-rs/src/lb_client.rs (Outdated)

```rust
#[derive(Clone)]
pub struct LbClient {
    pub addr: SocketAddrV4,
    pub events: EventSubs
```
I don't think this makes sense:

> i think followers can just call their own without interacting with the leader here?

Let me explain the point of the subscription api:
Lots of people will ask questions of lb-rs (like: "what are all the files?"), and then they'll show that to the user.
The subscription api tells them when that data is stale and it's time to get fresh data.
Followers simply won't have access to that information unless it's pushed from the server to them.
Followers are interested in subscribing to a tokio broadcast channel. I would read generally about channels and then read about broadcast if you're unfamiliar with these synchronization primitives. So you're close regarding what you have, but I don't think this field belongs in `LbClient`; I think it's going to be something like this.
With your existing stateless protocol, when someone calls subscribe the client implementation does the following (a rough sketch follows this list):
- create a broadcast channel of its own
- create a new thread; this thread will `call_rpc`, and the `dispatch` version of `subscribe` will do the actual subscription and continuously respond with updates (the channel can't be sent over the network, so this is the only sensible reply here). This helper thread on the client will recv those messages, form them back into rust structs, and broadcast them to whoever is listening.
- the client subscribe impl will return the recv end of the broadcast channel as soon as the thread is created
- the thread will populate the channel beyond that point
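A minimal sketch of that flow, assuming tokio and a length-prefixed frame format. `FileUpdate`, the address handling, and the frame layout are all illustrative, not the actual lb-rs types, and the request-writing step is elided:

```rust
// Illustrative only: client-side subscribe backed by a tokio broadcast channel.
use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;
use tokio::sync::broadcast;

#[derive(Clone, Debug, serde::Deserialize)]
struct FileUpdate {
    path: String, // placeholder payload; the real update type would live in lb-rs
}

async fn subscribe(addr: std::net::SocketAddrV4) -> std::io::Result<broadcast::Receiver<FileUpdate>> {
    // 1. the client creates a broadcast channel of its own
    let (tx, rx) = broadcast::channel(64);

    // 2. a helper task holds the RPC connection open; the server-side subscribe
    //    keeps writing length-prefixed updates (the request write is elided here)
    tokio::spawn(async move {
        let result: std::io::Result<()> = async {
            let mut stream = TcpStream::connect(addr).await?;
            loop {
                let len = stream.read_u64().await? as usize;
                let mut buf = vec![0u8; len];
                stream.read_exact(&mut buf).await?;
                if let Ok(update) = bincode::deserialize::<FileUpdate>(&buf) {
                    // a send error only means every receiver has been dropped
                    let _ = tx.send(update);
                }
            }
        }
        .await;
        if let Err(e) = result {
            eprintln!("subscription stream ended: {e}");
        }
    });

    // 3. hand back the receive end immediately; the task fills it from here on
    Ok(rx)
}
```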
When I was advising you to not really worry about callbacks, it was to reserve your resources for mastery over this one method -- subscribe. I am interested in pretty soon deprecating all the callbacks we already have in favor of this new api.
Pushing data from within lb to clients is annoying; you're experiencing its annoyance over the network, smail experiences it in ffi, etc. So we just want to do it once and then rely on our rich typesystem to carry us beyond that point once the pipe is set up.
There are three reasons I want your connection to be stateful, and you can work around all of them:
- compatibility: if one client is updated before another one, then the connection may be incompatible. We'll need to explore conventions here; maybe one side broadcasts its min supported version (like our actual server does). Ideally this would just happen at connection time, but it's probably fast enough for it to happen over every connection. You can reserve room in the protocol like you did for the size of the transmission, or make it part of the connection sequence (that's what I would do) -- see the sketch below.
- performance overhead -- I assume it's small but worth addressing; you can measure, and if it's insignificant then you don't need to do it.
- security: right now there's no benefit to guarding access to localhost (to be clear, it's not 0.0.0.0), but one day (optional passphrase #159) there likely will be, and then the connection will probably take a passphrase or some other sort of API key to start. It's possible that you'd still need to log in to the client regardless, and that client would have a bb core with just an account or something that's used to authenticate with LbServer.
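A hedged sketch of what that connection-sequence version exchange could look like; the constant and function are hypothetical, not part of lb-rs:

```rust
// Hypothetical handshake: each side writes its min supported protocol version
// immediately after connecting, before any request flows.
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

const MIN_SUPPORTED_VERSION: u16 = 1; // made-up value

async fn version_handshake(stream: &mut TcpStream) -> std::io::Result<u16> {
    stream.write_u16(MIN_SUPPORTED_VERSION).await?; // announce our own minimum
    let peer_min = stream.read_u16().await?;        // learn the peer's
    Ok(peer_min) // caller decides whether the pair is compatible
}
```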
> Should I work on fixing resulting errors in

You can probably fix those errors last -- we can get very far with just CLI & egui as a proof of concept. Happy to see this moving again!
```rust
) -> LbResult<()> {
    let method_id = method as u16;
    let body = bincode::serialize(args).map_err(core_err_unexpected)?;
    let len = body.len();
```
I think sending a len prefix is fine and pretty common.
You should just know about another approach, one that fits this 1-connection-per-request model a bit better: after finishing your writes you can execute `stream.shutdown(write)`, and then on the reading side you can do `read_all`. This means you don't have to negotiate the types.
This lets you do a single, probably more optimal syscall, signalling to the OS "I want to put all available bytes into memory"; I assume this is the fastest way to express the idea if the end goal is in fact to get all the bytes into memory.
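A rough sketch of that alternative, assuming tokio; the function and address here are illustrative, not lb-rs code:

```rust
// Illustrative: write the whole request, close our write half, then read the
// reply until the peer closes theirs -- no length prefix on either side.
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

async fn send_request(addr: &str, body: &[u8]) -> std::io::Result<Vec<u8>> {
    let mut stream = TcpStream::connect(addr).await?;

    stream.write_all(body).await?;
    stream.shutdown().await?; // shuts down the write half; reading still works

    // reads until the server finishes responding and closes its write side
    let mut resp = Vec::new();
    stream.read_to_end(&mut resp).await?;
    Ok(resp)
}
```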
however if we find that we're trying to head in the direction of re-using the connection (which is likely) then you'll need to negotiate the length. So I think keeping it in this half way point is fine until we know what direction we want to head and then produce the optimal implementation for that strategy.
```rust
stream
    .read_exact(&mut len_buf)
    .await
    .map_err(core_err_unexpected)?;
let resp_len = usize::from_be_bytes(len_buf);
```
There is a `stream.read_u64` and a `stream.write_u64` which could simplify these and related calls.
Why do you use `_be_` in some places and `_le_` in others? This makes me upgrade the above to a "requested change" from a suggestion.
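For illustration, a hedged sketch of the framing with those helpers; tokio's `read_u64`/`write_u64` are big-endian, which would also settle the be/le question. The function names are made up:

```rust
// Illustrative framing helpers; not the PR's actual code.
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

async fn write_frame(stream: &mut TcpStream, body: &[u8]) -> std::io::Result<()> {
    stream.write_u64(body.len() as u64).await?; // big-endian length prefix
    stream.write_all(body).await?;
    Ok(())
}

async fn read_frame(stream: &mut TcpStream) -> std::io::Result<Vec<u8>> {
    let len = stream.read_u64().await? as usize; // same endianness as the writer
    let mut buf = vec![0u8; len];
    stream.read_exact(&mut buf).await?;
    Ok(buf)
}
```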
Parth left a comment
Okay, this is actually quite close to being ready to merge:
- rebase needs to happen, so people can test with the most recent version
- minor comments around the protocol need to be addressed
- the search index for sure does not need to be serialized; the search queries will be sent to `LbServer`, which is the only one who will have the index
- we can make `LbErr::backtrace` an `Option<String>` instead of an `Option<Backtrace>`, which will let you derive `Serialize` and `Deserialize` via the macro on the `LbErr` type
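A minimal sketch of that last point, assuming `LbErrKind` also derives the serde traits; the variant and field layout shown are placeholders, not the real types:

```rust
// Sketch only: storing the backtrace as a pre-rendered String keeps LbErr a
// plain data type, so Serialize/Deserialize can be derived.
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub enum LbErrKind {
    Unexpected(String), // placeholder; the real variants live in model/errors.rs
}

#[derive(Debug, Serialize, Deserialize)]
pub struct LbErr {
    pub kind: LbErrKind,
    // captured eagerly, e.g. Backtrace::capture().to_string(), at construction
    pub backtrace: Option<String>,
}
```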
> Regarding the

Okay, that makes sense. I think the preferred way to solve this would be to expose a new endpoint for the CLI to refresh the index.
The first commits do the following big changes:
- Renamed `Lb` to `InnerLb`. This has the consequence of aliasing imports of `Lb` in all the `service/*.rs` files to `use crate::InnerLb as Lb`.
- Moved the `InnerLb` definition and its `init` definition to its own module.
- `lib.rs` now only imports the enum as well as the two concrete structs `InnerLb` and `ProxyLb`.
- The new enum `Lb` (see the sketch after this list). Added `pub use enum_lb::Lb;` so that `lb-rs` clients don't have to adjust their imports.
- The `rpc` module contains two structs to properly define what a request and a response look like. For now these reside in the module file itself, but I think they need to be moved to the `model` folder in future commits.
- `error.rs`'s `LbErr` now implements Ser and Deser. I did ask the AI to do these implementations, just for demo purposes for now since it's not that relevant right now for what I've been working on; will probably work on improving them in later commits.
- `ProxyLb` does not contain a connection field used across the whole API; for now it just holds what `init` will try to connect to. Thinking about moving this into `Config`.

The goal of these first commits is to prepare the overall structure.
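A hedged sketch of the enum shape those commits describe. The variant names follow the description above, while `list_files`, `File`, and the exact delegation are illustrative; later snippets in this thread use the names `LbServer`/`LbClient` instead:

```rust
// Illustrative only: one enum that either runs the core in-process or proxies
// every call to the leader over the local RPC connection.
#[derive(Clone)]
pub enum Lb {
    Inner(std::sync::Arc<InnerLb>), // the leader: owns the real core
    Proxy(ProxyLb),                 // a follower: forwards calls over RPC
}

impl Lb {
    // hypothetical API method; each call just matches and delegates
    pub async fn list_files(&self) -> LbResult<Vec<File>> {
        match self {
            Lb::Inner(inner) => inner.list_files().await,
            Lb::Proxy(proxy) => proxy.list_files().await,
        }
    }
}
```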
If this gets approved, future commits will focus on implementing the other APIs, further organizing the files, getting rid of all the warnings, and at some point implementing robust measures such as retrying on connection errors, etc.