barsh404error commented Dec 30, 2025

Thanks a lot to @christolis for helping me put this pull request together.
Added two utility methods, isLinkBroken and replaceDeadLinks:

- isLinkBroken(String url) checks link availability using a HEAD request. I used HEAD instead of GET so that availability can be checked without downloading the response body, which reduces bandwidth and improves performance.

- replaceDeadLinks(String text, String replacement) replaces unreachable/broken links asynchronously.

This change does not alter the behavior of any existing code.

barsh404error requested a review from a team as a code owner on December 30, 2025, 10:52
CLAassistant commented Dec 30, 2025

CLA assistant check
All committers have signed the CLA.

return !(new UrlDetector(content, UrlDetectorOptions.BRACKET_MATCH).detect().isEmpty());
}
public static CompletableFuture<Boolean> isLinkBroken(String url) {
HttpClient client = HttpClient.newHttpClient();

Can you define the HttpClient as a static reference in this class please?

i.e.

private static final HttpClient CLIENT = HttpClient.newHttpClient();

public static CompletableFuture<Boolean> isLinkBroken(String url) {
    HttpRequest request = HttpRequest.newBuilder(URI.create(url))
            .method("HEAD", HttpRequest.BodyPublishers.noBody())
            .build();

    return CLIENT.sendAsync(request, HttpResponse.BodyHandlers.discarding())
            .thenApply(response -> response.statusCode() >= 400)
            .exceptionally(ignored -> true);
}

Creating a new HttpClient per link is unnecessary and inefficient. HttpClient is thread-safe and designed to be reused across concurrent requests.

String text,
String replacement
) {
Set<LinkFilter> filters = Set.of(

There's no need to re-create this Set on every call, since it never changes. We can make it a static final field.

return CompletableFuture.completedFuture(text);
}

StringBuilder result = new StringBuilder(text);

This implementation is not thread-safe. StringBuilder is mutated from multiple async callbacks, causing data races and invalid index calculations. Async HTTP completion order is nondeterministic, so index-based replacements are unsafe.

The correct approach is to separate concurrent link checks from sequential string mutation: perform all HTTP checks first, then apply replacements in a single thread once all futures complete.
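
A minimal sketch of that two-phase structure, for illustration only. The class name DeadLinkReplacer, the explicit links parameter, and the injected isLinkBroken function are assumptions made to keep the example self-contained; the PR's actual method takes only text and replacement and extracts the links itself.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, not the class touched by this PR.
public class DeadLinkReplacer {

    /**
     * Checks all links concurrently, then rewrites the text in a single
     * callback that runs only after every check has completed.
     */
    public static CompletableFuture<String> replaceDeadLinks(
            String text,
            List<String> links,
            String replacement,
            Function<String, CompletableFuture<Boolean>> isLinkBroken) {

        // Phase 1: start every check. Nothing touches the text here.
        List<CompletableFuture<Boolean>> checks = links.stream()
                .map(isLinkBroken)
                .collect(Collectors.toList());

        // Phase 2: one callback, run once, applies replacements sequentially.
        return CompletableFuture
                .allOf(checks.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    String result = text;
                    for (int i = 0; i < links.size(); i++) {
                        // join() does not block: allOf guarantees completion.
                        if (checks.get(i).join()) {
                            result = result.replace(links.get(i), replacement);
                        }
                    }
                    return result;
                });
    }
}
```

Because the text is only rewritten inside the single thenApply callback, completion order of the HTTP checks no longer matters and no shared StringBuilder is needed. Note that plain String.replace is a simplification: if one link is a prefix of another, replacing by recorded match positions (processed from the end of the text backwards) would be safer.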

HttpClient client = HttpClient.newHttpClient();

HttpRequest request = HttpRequest.newBuilder(URI.create(url))
.method("HEAD", HttpRequest.BodyPublishers.noBody())

While it's a good idea to use HEAD requests, a lot of HTTP servers do not implement HEAD properly: you will commonly get 405, 404, etc., and CDNs like Cloudflare may actually block it as suspicious. It's best to use a hybrid approach here: try HEAD first, and if that fails, try GET before treating the link as broken.
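
One way the hybrid approach could look, as a rough sketch rather than a drop-in implementation. The class name LinkChecker and the helper statusIsError are hypothetical; the sketch also assumes that any status of 400 or above counts as broken and that an unparsable or non-HTTP URL should be reported as broken instead of throwing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Hypothetical class name; in the PR these would live in the existing utility class.
public class LinkChecker {

    // Shared client: HttpClient is thread-safe and meant to be reused.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    /** Tries HEAD first and falls back to GET before reporting a link as broken. */
    public static CompletableFuture<Boolean> isLinkBroken(String url) {
        final HttpRequest head;
        final HttpRequest get;
        try {
            URI uri = URI.create(url);
            head = HttpRequest.newBuilder(uri)
                    .method("HEAD", HttpRequest.BodyPublishers.noBody())
                    .build();
            get = HttpRequest.newBuilder(uri).GET().build();
        } catch (IllegalArgumentException e) {
            // Malformed URL or unsupported scheme: assumed to count as broken.
            return CompletableFuture.completedFuture(true);
        }

        return statusIsError(head).thenCompose(headFailed -> {
            if (headFailed) {
                // HEAD was rejected or errored; GET gets the final say.
                return statusIsError(get);
            }
            CompletableFuture<Boolean> notBroken = CompletableFuture.completedFuture(false);
            return notBroken;
        });
    }

    /** Completes with true if the request fails or returns a status >= 400. */
    private static CompletableFuture<Boolean> statusIsError(HttpRequest request) {
        return CLIENT.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(response -> response.statusCode() >= 400)
                .exceptionally(ignored -> true);
    }
}
```

The GET fallback only runs for links whose HEAD check failed, so the bandwidth saving from HEAD is kept for the common case. BodyHandlers.discarding() still receives the GET body over the network before dropping it, so the fallback is more expensive than HEAD.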


3 participants