A web crawler for a code challenge.
To start, run node server.
Launch a web browser and go to http://localhost:8081/crawl.
This crawls the URLs listed in crawl.json, filling out and submitting each form that is configured.
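Behind that URL, the /crawl route does roughly the following. This is a minimal sketch assuming the server uses Express, with a hypothetical crawlSite helper; it is not the literal implementation:

    const express = require('express');
    const config = require('./crawl.json');
    const app = express();

    // Hypothetical helper: fetches site.url, then fills out and submits the
    // configured forms (the config format is described below).
    async function crawlSite(site) { /* ... */ }

    app.get('/crawl', async (req, res) => {
      for (const site of config.sites) {
        await crawlSite(site);
      }
      res.send('crawl complete');
    });

    app.listen(8081);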
All configuration lives in crawl.json:
"sites": [
{
"url": "http://testing-ground.scraping.pro/login",
"origin": "http://testing-ground.scraping.pro",
"forms": [
{
"inputs": {
"usr": "admin",
"pwd": "12345"
},
"result": {
"element": "h3",
"attr": "class"
}
}
]
}
]
The top-level node is sites: a list of the sites to crawl.
Each entry has a url child whose value is the page to crawl.
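For illustration, loading the config and walking the sites list could look like this (assumes Node 18+, where fetch is built in; not necessarily the crawler's actual code):

    const { sites } = require('./crawl.json');

    (async () => {
      for (const site of sites) {
        // Fetch each configured page so its forms can be located and filled.
        const html = await (await fetch(site.url)).text();
        console.log(site.url, html.length);
      }
    })();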
Each entry also has an origin child: the base URL that is prepended to every form's action, so relative actions resolve to absolute URLs.
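For example, with the origin above, a relative form action resolves like this using Node's built-in URL class (the action value is illustrative):

    const origin = 'http://testing-ground.scraping.pro';
    const action = '/login'; // read from the form's action attribute
    console.log(new URL(action, origin).href);
    // => http://testing-ground.scraping.pro/login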
The forms child holds a list of forms to be filled out. (At least it would; the list handling still needs improvement.)
Each form has an inputs child: a map of key/value pairs naming the HTML input elements to fill in and submit with the form.
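A sketch of how those inputs might be serialized and submitted, again assuming Node 18+ fetch (the crawler itself may do this differently):

    (async () => {
      const inputs = { usr: 'admin', pwd: '12345' }; // from crawl.json
      const body = new URLSearchParams(inputs);      // usr=admin&pwd=12345

      // A URLSearchParams body is sent as application/x-www-form-urlencoded,
      // the same encoding a normal form submit uses.
      const response = await fetch('http://testing-ground.scraping.pro/login', {
        method: 'POST',
        body,
      });
      console.log(response.status);
    })();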
Each form also has a result child, which is an object.
Its element property names the element to parse from the POST response.
Its attr property names the attribute of that element to log.
Currently the log line is "{attr value} response {element value}" taken from the POST response.
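A sketch of that parse-and-log step, assuming the cheerio package for HTML parsing (this README does not say which parser is used, and the log format is approximated from the description above):

    const cheerio = require('cheerio');

    // "result" comes from crawl.json, e.g. { "element": "h3", "attr": "class" }.
    function logResult(html, result) {
      const $ = cheerio.load(html);
      const el = $(result.element).first();
      // For <h3 class="success">WELCOME</h3> this logs: success response WELCOME
      console.log(`${el.attr(result.attr)} response ${el.text()}`);
    }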