>>11260
Download all images from a thread:
wget -nd -nc -r -l 1 -H -e robots=off -D 8kun.top,media.8kun.top -A png,gif,jpg,jpeg,webm http://8kun.top/url/to/thread.html
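-nd tells wget not to recreate the site's folder structure, so everything lands in the current folder. If you want the files in a specific folder instead, use -P foldername (capital P; lowercase -p means page requisites, which is a different thing)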
-nc tells it not to overwrite files that are already there
-r means recursive
-l NUMBER is how many levels of links it follows from the page in question
-H lets it span hosts, i.e. follow links to other domains
-e robots=off tells it to ignore robots.txt, don't know if that is even needed on 8kun
-D is the domains you're working with, obviously. 8kun has images hosted on a different subdomain so just following links to 8kun.top won't get you the images.
-A is the list of file extensions that will be downloaded
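If you do this a lot, the command above can be wrapped in a shell function. This is only a sketch: grab_thread and its two arguments are names I made up, the domains are hardcoded for 8kun, and it relies on wget's -P (capital P) option to pick the folder files are saved into.

```shell
# Sketch: the wget call from above wrapped in a function.
# grab_thread is a made-up name, not a wget feature.
grab_thread() {
    # $1 = thread URL, $2 = folder to save into (defaults to the current one)
    wget -nd -nc -r -l 1 -H -e robots=off \
        -D 8kun.top,media.8kun.top \
        -A png,gif,jpg,jpeg,webm \
        -P "${2:-.}" \
        "$1"
}

# Usage:
# grab_thread http://8kun.top/url/to/thread.html mythread
```

For another imageboard you would swap the -D list for that site's own domain and media subdomain.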
You can alter it for different imageboards too. If some site doesn't want to comply you can add arbitrary shit like some commonly used user agent and waits:
--user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" --wait=2 --random-wait --limit-rate=1M
to make it seem more natural. Of course this didn't work while vanwanet was going, but after they changed it to be less obnoxious it works well again.
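To avoid retyping those options, something like the following could work. It is a sketch: polite_wget is a made-up name, and the --wait=2 is my assumption, added because as far as I know --random-wait only varies the delay set by --wait and does nothing useful without it.

```shell
# Sketch: wget with the "look natural" options baked in.
# polite_wget is a made-up name; the --wait value is an assumption.
polite_wget() {
    wget --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" \
        --wait=2 --random-wait --limit-rate=1M \
        "$@"
}

# Usage: pass the usual options and the URL after it, e.g.
# polite_wget -nd -nc -r -l 1 -H -e robots=off -D 8kun.top,media.8kun.top -A png,gif,jpg,jpeg,webm http://8kun.top/url/to/thread.html
```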
You could probably make it ignore thumbnails somehow, but honestly, it's easier to just remove them manually afterwards. They are pretty obvious due to their small file size, and some imageboards even prefix them with something like t_ which makes it even easier.
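The cleanup can be semi-automated with find. This is a rough sketch: thumbs is a made-up name, and both the 20k size cutoff and the t_ prefix are guesses that need adjusting per imageboard. List first, delete only once the list looks right. (wget's -R option with a pattern like 't_*' should also be able to skip prefixed thumbnails at download time, but I haven't tried that here.)

```shell
# Sketch: find likely thumbnails in a download folder.
# thumbs is a made-up name; the 20k cutoff and t_ prefix are guesses.
thumbs() {
    # $1 = folder to look in; anything after it is passed on to find
    dir=$1
    shift
    find "$dir" -maxdepth 1 -type f \( -size -20k -o -name 't_*' \) "$@"
}

# Usage:
# thumbs mythread            # just list the candidates
# thumbs mythread -delete    # delete them
```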
Also, there is a really good tool called httrack for downloading whole websites and creating local mirrors; you could try that too.
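A minimal httrack call looks roughly like this. It is a sketch only: mirror_site and the ./mirror default are made up, and the only httrack option used is -O, which sets the output folder. Anything beyond that (filters, depth limits) is in its manual.

```shell
# Sketch: mirror a site into a local folder with httrack.
# mirror_site is a made-up name; ./mirror is an arbitrary default.
mirror_site() {
    # $1 = site URL, $2 = folder for the mirror
    httrack "$1" -O "${2:-./mirror}"
}

# Usage:
# mirror_site http://example.com/ ./example-mirror
```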
When it comes to archiving ...