1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
test/
27 changes: 16 additions & 11 deletions README.md
@@ -1,30 +1,35 @@
# jscrap: A very easy-to-use and lightweight web scraper
# jscrap : A very easy-to-use and lightweight web scraper


`jscrap` is a very fast and easy-to-use web scraper for Node.js

# Installing
### Installing
```npm

npm install jscrap
```

# Having fun

### Example:
```javascript
var
jscrap = require('jscrap');

jscrap.scrap("https://www.kernel.org/",function(err,$){
console.log("Latest Linux Kernel: ",$("article #latest_link > a").text().trim());
console.log("Released: ",$("article #releases tr:first-child td:nth-child(3)").text());
});

# Supported selectors:
```
### Supported selectors:

`jscrap` supports all the [zcsel](https://www.npmjs.org/package/zcsel) selectors and functions.
See the [zcsel](https://www.npmjs.org/package/zcsel) documentation for details.

# Options
### Options

The `scrap()` function supports these options:
The __`scrap()`__ function supports these options:

`debug` : Activates the debug mode. Defaults to `false`.
`followRedirects` : Number of redirects to follow. Defaults to `3`.
`charsetEncoding` : Document charset. Defaults to `utf-8`.
* __`debug`__ : Activates the debug mode. Defaults to `false`.
* __`followRedirects`__ : Number of redirects to follow. Defaults to `3`.
* __`charsetEncoding`__ : Document charset. Defaults to `utf-8`.
* __`headers`__ : Headers to send with the request. Not set by default.
* __`timeout`__ : Timeout for the request. Defaults to `null`.
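
As a rough illustration of how the options listed above might be passed, here is a minimal sketch. It assumes the `scrap(url, options, callback)` call shape used in `test/Simple_Http.js`. The helpers `buildScrapOptions` and `scrapText` are hypothetical and not part of `jscrap`. The `timeout` unit (milliseconds) and the header value in the usage comment are also assumptions.

```javascript
// Sketch only: `buildScrapOptions` and `scrapText` are hypothetical helpers,
// not part of jscrap. Only the option names and the
// scrap(url, options, callback) call shape come from this pull request.

// Defaults as documented in the option list.
var DEFAULTS = {
    debug: false,
    followRedirects: 3,
    charsetEncoding: 'utf-8'
};

// Copy the documented defaults, then apply the caller's overrides on top.
function buildScrapOptions(overrides) {
    var options = {};
    Object.keys(DEFAULTS).forEach(function (key) {
        options[key] = DEFAULTS[key];
    });
    Object.keys(overrides || {}).forEach(function (key) {
        options[key] = overrides[key];
    });
    return options;
}

// `scraper` is any object exposing scrap(url, options, callback),
// for example the module returned by require('jscrap').
function scrapText(scraper, url, selector, overrides, callback) {
    scraper.scrap(url, buildScrapOptions(overrides), function (err, $) {
        if (err) {
            return callback(err);
        }
        callback(null, $(selector).text().trim());
    });
}

// Usage (needs `npm install jscrap` and network access; the timeout is
// assumed to be in milliseconds and the header value is made up):
// scrapText(require('jscrap'), 'https://www.kernel.org/',
//     'article #latest_link > a',
//     { timeout: 5000, headers: { 'User-Agent': 'jscrap-example' } },
//     function (err, text) { console.log(err || text); });
```

Passing the scraper in as an argument keeps the helper easy to exercise with a stub, so the option merging can be checked without a network request.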
38 changes: 38 additions & 0 deletions test/Simple_Http.js
@@ -0,0 +1,38 @@
var jscrap = require('jscrap');
var http = require('http');
var port = 3000;
var ip = "127.0.0.1";

// HTTP server that scrapes kernel.org on each request and echoes the result
http.createServer(function (req, res) {

    // Send the scraped strings (or an error message) as a plain-text response
    function echoData(err, text1, text2) {
        if (err) {
            res.writeHead(500, { 'Content-Type': 'text/plain' });
            res.end("Scrape failed: " + err);
            return;
        }
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end(text1 + " | " + text2);
    }

    function scrapData(callback) {
        jscrap.scrap("https://www.kernel.org/", {
            debug: true
        }, function (err, $) {
            // Check the error first: $ may not be usable if the request failed
            if (err) {
                console.log(err);
                return callback(err);
            }

            var text1 = "Latest Linux Kernel: " + $("article #latest_link > a").text();
            var text2 = "Released: " + $("article #releases tr:first-child td:nth-child(3)").text();
            callback(null, text1, text2);
        });
    }

    scrapData(echoData);
}).listen(port, ip);
console.log('Server running at http://' + ip + ':' + port + '/');