Looking for some general guidelines on web scraping. I made a shitty web scraper in Python, but it does the job. I just put a few links into a file, and it goes through each one, grabs all the links on that page, and keeps following them while searching the pages for certain words. Are there any guidelines, though? For example, should I test the connection or the site before I request a page? I got a bunch of junk the first time, so I also had to filter out links that contain certain words.
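
For anyone who wants something concrete to comment on, here is a minimal sketch of the kind of crawler described above. It is not the actual script: every name in it (`extract_links`, `is_allowed`, `fetch`, `crawl`, the `my-little-crawler/0.1` user agent, the `max_pages` cap) is made up for illustration, and it only uses the standard library. On the "test the connection first" question, one common approach is to skip the separate check, make the request with a timeout, and treat a failure as "skip this page", which is what `fetch` assumes here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute = urldefrag(urljoin(base_url, href)).url
        if urlparse(absolute).scheme in ("http", "https"):
            links.append(absolute)
    return links


def is_allowed(url, blocked_words):
    """False if the URL contains any blocked word (case-insensitive)."""
    lowered = url.lower()
    return not any(word.lower() in lowered for word in blocked_words)


def fetch(url, timeout=10):
    """Download a page as text; return None on common network/HTTP errors."""
    # The User-Agent string is a placeholder, not a real registered bot name.
    request = Request(url, headers={"User-Agent": "my-little-crawler/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


def crawl(seeds, search_words, blocked_words=(), max_pages=50, fetch_page=fetch):
    """Breadth-first crawl from seeds; returns {url: [search words found]}."""
    queue = deque(seeds)
    seen = set(seeds)          # never queue the same URL twice
    hits = {}
    pages = 0
    while queue and pages < max_pages:   # hard cap so the crawl terminates
        url = queue.popleft()
        html = fetch_page(url)
        pages += 1
        if html is None:       # request failed: skip and move on
            continue
        text = html.lower()
        found = [word for word in search_words if word.lower() in text]
        if found:
            hits[url] = found
        for link in extract_links(html, url):
            if link not in seen and is_allowed(link, blocked_words):
                seen.add(link)
                queue.append(link)
    return hits
```

To run it against a file of starting links, something like `crawl([line.strip() for line in open("links.txt") if line.strip()], ["python"], blocked_words=["login"])` should work, where `links.txt` and the word lists are placeholders. The `seen` set and the `max_pages` cap are there because following every link on every page otherwise never stops, and `fetch_page` is a parameter so the crawl logic can be tried with a fake fetcher before hitting real sites.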