Skip to content
This repository was archived by the owner on Jun 1, 2024. It is now read-only.

Conversation

@AdamNiederer
Copy link

Hi there,

I've been trying to improve the speed of offline subbreddit caching, and I figured that serially awaiting each HTTP request was eating the bulk of the time.

While fixing that problem, I found that RedditClient.execute() has some kind of lock which doesn't let me query Reddit concurrently. Although this patch doesn't hurt the wall time of the caching process, I'd recommend not merging this until I can figure out how to get around that limitation. Does reddit have a particularly oppressive rate limiter or something?

While it isn't the immense performance increase I anticipated, it does help
speed up offline caching somewhat.
@ccrama
Copy link
Collaborator

ccrama commented Aug 5, 2017

Thank you for the PR!

Actually, this is done on purpose because Reddit has API rate limitations, and the caching mechanism was designed to not break these limits. Unfortunately don't know if your changes break these limits, but I know that JRAW has a master rate limit built into it (I figured serially downloading would ensure we don't break any limits on our side either). One way we could do this is to execute each as a separate pooled thread, but keeping track of all of those threads is pretty messy and that's why I opted to just have one master async thread. If you have any other suggestions on how to go about this, I'm open to any suggestions!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants