4chan text scraping still possible? #530
Hi everyone! I've just successfully installed 4CAT, only to find that 4chan is no longer an option under "Available data sources". Is there any way around this? Or are 4chan, Reddit and 8chan no longer supported as sources? If that is the case, can anyone recommend a similar data scraping programme for collecting text from 4chan? I'm looking for a way to aggregate and analyse text data from /pol/, if possible over time and using keywords. Many thanks!
Replies: 1 comment 2 replies
4chan and 8kun are still supported. Posts and threads disappear from the 4chan/8kun APIs over time, so you have to build up your own dataset by collecting it first. There are instructions on how to set that up here:
https://github.com/digitalmethodsinitiative/4cat/wiki/Enabling-local-data-sources
Once you have collected data, you can search your records via the 4CAT interface. We are exploring other collection options for locally stored data sources like these, which will hopefully be easier to set up in the future.
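To illustrate why collection has to run continuously, here is a minimal Python sketch of the general idea. It is not how 4CAT implements its local data sources. It assumes 4chan's public read-only JSON API at `a.4cdn.org` (the `threads.json` and `thread/<no>.json` endpoints, with `no`, `com` and `time` fields on each post) and a limit of roughly one request per second. The function names and the in-memory `archive` dictionary are hypothetical.

```python
import json
import re
import time
import urllib.error
import urllib.request

API_ROOT = "https://a.4cdn.org"  # assumed: 4chan's public read-only JSON API


def fetch_json(url):
    """Download and decode one JSON document."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def merge_posts(archive, posts):
    """Add posts to the archive, keyed by post number; return how many were new."""
    added = 0
    for post in posts:
        if post["no"] not in archive:
            archive[post["no"]] = post
            added += 1
    return added


def collect_once(board, archive):
    """Take one snapshot of every thread currently live on a board."""
    added = 0
    for page in fetch_json(f"{API_ROOT}/{board}/threads.json"):
        for thread in page["threads"]:
            time.sleep(1)  # assumed rate limit: about one request per second
            try:
                data = fetch_json(f"{API_ROOT}/{board}/thread/{thread['no']}.json")
            except urllib.error.HTTPError:
                continue  # thread was pruned between listing and fetching
            added += merge_posts(archive, data["posts"])
    return added


def search(archive, keyword):
    """Return archived posts whose comment contains the keyword, oldest first."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hits = [p for p in archive.values() if pattern.search(p.get("com", ""))]
    return sorted(hits, key=lambda p: p.get("time", 0))
```

Calling `collect_once("pol", archive)` on a schedule and saving `archive` to disk would gradually build up a searchable history. Anything posted and pruned between two runs is lost, which is why the collector needs to be running before the period you want to study. Note that `com` holds HTML, so a real keyword search would strip tags and entities first.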
We stopped supporting Reddit because of their API changes ($$$). You can find Reddit data dumps online, which you could then import into 4CAT if you are interested in using its analysis methods.
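If you go the dump route, you will probably need to reshape the data before importing it. The sketch below assumes the dump has already been decompressed to newline-delimited JSON with one comment per line, and that each record has `id`, `author`, `subreddit`, `created_utc` and `body` fields. The column selection and the function name are illustrative assumptions, so check what 4CAT's import actually expects before relying on them.

```python
import csv
import json

# Assumed comment fields; adjust to the dump you actually downloaded.
COLUMNS = ["id", "author", "subreddit", "created_utc", "body"]


def ndjson_to_csv(lines, out):
    """Write one CSV row per JSON line; return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Keep only the chosen columns; missing fields become empty strings.
        writer.writerow({column: record.get(column, "") for column in COLUMNS})
        count += 1
    return count
```

For example, with hypothetical file names: `with open("comments.ndjson", encoding="utf-8") as src, open("comments.csv", "w", newline="", encoding="utf-8") as dst: ndjson_to_csv(src, dst)`.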