▶Anonymous 11/20/18 (Tue) 21:36:08 No.999782>>999926 >>1000120
Anyone ever scrape time-series data from twitter? I'm trying to work out a solution with Selenium, but it's going to require a lot of runtime.
▶Anonymous 11/21/18 (Wed) 02:25:20 No.999848>>999926 >>1000124
Use the Twitter API directly: https://developer.twitter.com/en/docs.html
or pick a language and use a real scraping library
https://scrapy.org/
https://jsoup.org/
http://www.nokogiri.org/
Writing a program that goes off and opens a browser is going to be the slowest possible way to do this.
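For a sense of what the parser route looks like without a browser, here is a minimal sketch. It uses Python's standard-library html.parser as a stand-in for a real scraping library like the ones linked above, and the markup in SAMPLE_HTML is made up for illustration (it is not Twitter's actual page structure). The names TimeTagParser and extract_timestamps are hypothetical.

```python
from html.parser import HTMLParser


class TimeTagParser(HTMLParser):
    """Collect the datetime attribute of every <time> tag."""

    def __init__(self):
        super().__init__()
        self.timestamps = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "time":
            value = dict(attrs).get("datetime")
            if value:
                self.timestamps.append(value)


def extract_timestamps(html):
    """Return all machine-readable timestamps found in the HTML string."""
    parser = TimeTagParser()
    parser.feed(html)
    parser.close()
    return parser.timestamps


# Made-up static markup standing in for a page fetched without a browser.
SAMPLE_HTML = """
<ul>
  <li><time datetime="2018-11-20T21:36:08Z">Nov 20</time> first post</li>
  <li><time datetime="2018-11-21T02:25:20Z">Nov 21</time> second post</li>
  <li><time>no machine-readable stamp</time></li>
</ul>
"""

timestamps = extract_timestamps(SAMPLE_HTML)
print(timestamps)
```

This only works when the timestamps are already in the HTML the server sends. If the page builds them with JavaScript, a parser like this sees nothing.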
▶Anonymous 11/21/18 (Wed) 07:31:40 No.999926>>1000124
>>999782 (OP)
Depends on which kind of data you are interested in.
>>999848
API is ideal. Scraping tools are quicker than browser-automation solutions like Selenium, but they can't execute JavaScript, so they won't work on pages that build their content client-side.
▶Anonymous 11/21/18 (Wed) 08:03:11 No.999930
>Anyone ever scrape time-series data from twitter? I'm trying to work out a solution with Selenium, but it's going to require a lot of runtime.
Kill me, Pete
▶Anonymous 11/21/18 (Wed) 14:54:25 No.1000022>>1000124
Just look at their unofficial API through your browser's network activity monitor. It's pretty easy to understand, and you don't have to put up with the bullshit rate limiting that comes with the official API.
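Once you know what the JSON responses look like, turning them into a time series is mostly a loop. A rough sketch, assuming the endpoint pages with a cursor: the response shape ("items", "next_cursor", "created_at") and the fetch_page callable are hypothetical placeholders, not the real format, so swap in whatever the network monitor actually shows. A fake fetcher is used here so the example runs offline.

```python
from collections import Counter
from datetime import datetime


def collect_items(fetch_page, max_pages=100):
    """Follow cursors until there is no next page or max_pages is reached."""
    items = []
    cursor = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        items.extend(page.get("items", []))
        cursor = page.get("next_cursor")
        if not cursor:
            break
    return items


def hourly_counts(items):
    """Bucket items by the UTC hour of their (assumed) ISO 8601 timestamp."""
    counts = Counter()
    for item in items:
        ts = datetime.strptime(item["created_at"], "%Y-%m-%dT%H:%M:%SZ")
        counts[ts.strftime("%Y-%m-%dT%H:00Z")] += 1
    return dict(sorted(counts.items()))


# Fake pages standing in for real HTTP responses (hypothetical shape).
PAGES = {
    None: {
        "items": [
            {"created_at": "2018-11-20T21:36:08Z"},
            {"created_at": "2018-11-20T21:50:00Z"},
        ],
        "next_cursor": "p2",
    },
    "p2": {
        "items": [{"created_at": "2018-11-21T02:25:20Z"}],
        "next_cursor": None,
    },
}


def fake_fetch(cursor):
    return PAGES[cursor]


series = hourly_counts(collect_items(fake_fetch))
print(series)
```

In real use fetch_page would make the HTTP request and return the decoded JSON. The max_pages cap is there so a cursor that never ends can't loop forever.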
▶Anonymous 11/21/18 (Wed) 22:23:31 No.1000120>>1000124
Just use the devtools protocol if you don't want to emulate a browser. Running Chromium in headless mode is the hardest way to detect when scraping data, and the most likely to succeed, since these websites expect an actual browser to execute JavaScript.
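A minimal sketch of the headless approach, assuming Selenium and a matching chromedriver are installed. Note this drives headless Chrome through Selenium rather than speaking the devtools protocol directly. The helper names (headless_chrome_args, fetch_rendered_html) and the fixed wait are illustrative choices, not anything prescribed.

```python
import time


def headless_chrome_args(window_size=(1280, 800)):
    """Command-line flags for running Chrome/Chromium without a window."""
    width, height = window_size
    return ["--headless", f"--window-size={width},{height}"]


def fetch_rendered_html(url, wait_seconds=2.0):
    """Load a page in headless Chrome and return the HTML after JS has run."""
    # Imported here so the module loads even where Selenium is not installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Crude fixed wait for client-side rendering; an explicit wait on a
        # known element would be more reliable.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        driver.quit()
```

The returned HTML can then be fed to an ordinary parser, so the browser is only used for the rendering step.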
▶Anonymous 11/21/18 (Wed) 22:32:05 No.1000124
>>999926
Need to execute the js, unfortunately. Trying to avoid going the API route, because I need a lot more historical data than is available at my price point.
>>1000022
Gonna look into this, thanks.
>>1000120
Makes sense, thanks.
▶Anonymous 11/22/18 (Thu) 01:14:14 No.1000172
kill me, pete
KILL ME NOW DO IT I BEG YOU