GitHub - JFCiscoHuerta/java-web-scraper: A Java web scraper using Jsoup to extract headers, links, and other elements from web pages. Includes retry logic for failed connections and file-saving functionality.

JavaWebScraper

JavaWebScraper is a Java-based tool for scraping and collecting data from web pages using Jsoup. It can extract headings (h1, h2, h3, h4), links, and any other elements matched by a CSS selector.
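As an illustration of selector-based extraction, here is a minimal sketch using Jsoup's public API (`Jsoup.parse`, `Document.select`, `Element.text`, `Element.absUrl`). The class name `SelectorSketch`, its method names, and the sample HTML are assumptions made for this example and are not taken from the project's source. It parses an in-memory string so that it runs without a network connection.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; not part of the repository.
final class SelectorSketch {

    // Collects the text of every element matched by the given CSS selector,
    // in document order.
    static List<String> texts(Document doc, String cssSelector) {
        List<String> out = new ArrayList<>();
        for (Element el : doc.select(cssSelector)) {
            out.add(el.text());
        }
        return out;
    }

    // Collects link targets, resolved against the document's base URI.
    static List<String> links(Document doc) {
        List<String> out = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            out.add(a.absUrl("href"));
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample markup and base URI are made up for the example.
        String html = "<h1>Title</h1><h2>Sub</h2><a href=\"/docs\">Docs</a>";
        Document doc = Jsoup.parse(html, "https://example.com/");

        System.out.println(texts(doc, "h1, h2, h3, h4"));
        System.out.println(links(doc));
    }
}
```

Any other selector Jsoup supports (for example `div.article p`) can be passed to `texts` in the same way.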

The tool retries connections that fail and can save the scraped data to a file. To customize the scraper, provide a user agent and a target URL for the website you want to gather data from.
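The retry and file-saving behavior could look roughly like the sketch below. It uses only the Java standard library, and the names `RetrySketch`, `withRetries`, and `saveLines`, as well as the fixed-delay policy, are assumptions for illustration. The project's actual retry count, delay, and output format are not described here. In a Jsoup-based scraper, the action passed to `withRetries` would typically wrap a call such as `Jsoup.connect(url).userAgent(userAgent).get()`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical helper class; not part of the repository.
final class RetrySketch {

    // Runs the action, retrying on IOException up to maxAttempts times
    // with a fixed pause between attempts. Rethrows the last failure.
    static <T> T withRetries(Callable<T> action, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    // Writes one scraped item per line, UTF-8 encoded, replacing any existing file.
    static void saveLines(Path file, List<String> lines) throws IOException {
        Files.write(file, lines, StandardCharsets.UTF_8);
    }
}
```

A fixed delay keeps the sketch short. A real scraper might prefer exponential backoff so that a struggling server is not hit at a constant rate.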

License

This project is licensed under the Apache 2.0 license. See the LICENSE file for details.

Repository contents (27 commits)

src
.gitignore
LICENSE
README.md
pom.xml
