QUEST FOR SEARCHABILITY


▶Anonymous 02/25/18 (Sun) 22:11:54 85a843 (4) No.495005>>498755 >>1530470

File (hide): 5bf2c7f24f70616⋯.png (107.74 KB, 1826x973, 1826:973, Snip 1.PNG) (h) (u)

File (hide): 83b29fb7a9e7d68⋯.png (86.44 KB, 1885x1007, 1885:1007, Snip 2.PNG) (h) (u)

File (hide): 446d61fffe32244⋯.png (63.29 KB, 1119x994, 1119:994, Snip 3.PNG) (h) (u)

File (hide): 37d58ab626e65b6⋯.png (55.22 KB, 1826x687, 1826:687, Snip 4.PNG) (h) (u)

Posts from #608

>>493751

>>494228

>>494299

>>494080

>>494202

>>494015

>>493888

>>493884

>>493882

>>493881

>>493886

>>493919

>>493939

>>493854

>>494489

>>494264

>>494457

>>494503

>>494460

>>494405

>>494451

>>494528

>>494471

>>493877

>>493898

>>493929

>>494283

>>494184

>>494548

▶Anonymous 02/25/18 (Sun) 23:56:32 1eba68 (1) No.495890>>496431 >>504261 >>524511

One further comment from a heavy database user for what it's worth:

If we had a list of 'tags' that anons could enter as they post (in a specific format, e.g. preceded by **), covering topics that emerge (such as 'mkultra', 'bridge' etc., related to topics brought up by Q), it would serve as a way to link crumbs by subject when searching through the data. As an additional variable / filter, it would streamline any search.

these would have to be moderated by BV / BO / Baker; would not be any more work than creating the notable posts per bread, although would be useful to find a way to insert them after the post identifier to create the link (i.e. >>xxxxxx within/2 **xxxxxx = TRUE).

Historical posts would be an issue, but there might be some way of batch-adding tags at the data assimilation stage based on linked crumbs, with specific 'meta-moderators' adding more as we run searches etc.

However it might work - principle is an easily assignable value to identify crumb subject based on Q's topics so more information can be retrieved via regular search.
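A minimal sketch of how such tags could be pulled out of a post, assuming the `**tag` convention suggested above and the board's usual `>>NNNNNN` reply links. The function names are hypothetical, and the rule that a tag is made of letters, digits, `_` or `-` is an assumption, not something anyone has agreed on.

```python
import re

# Assumed convention: a tag is a word preceded by "**",
# and a post reference is ">>" followed by digits.
TAG_RE = re.compile(r"\*\*([A-Za-z0-9_-]+)")
REF_RE = re.compile(r">>(\d+)")


def extract_tags(comment: str) -> list[str]:
    """Return lower-cased tags in order of first appearance, without duplicates."""
    seen = []
    for tag in TAG_RE.findall(comment):
        tag = tag.lower()
        if tag not in seen:
            seen.append(tag)
    return seen


def extract_refs(comment: str) -> list[int]:
    """Return the post numbers referenced with >>NNNNNN."""
    return [int(n) for n in REF_RE.findall(comment)]


def link_tags(comment: str) -> dict[int, list[str]]:
    """Associate every referenced post with every tag found in the same comment."""
    tags = extract_tags(comment)
    return {ref: tags for ref in extract_refs(comment)}
```

A moderator tool could run `link_tags` over each new post and store the resulting pairs as an extra filter column for searches.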

▶Anonymous 02/26/18 (Mon) 00:41:12 85a843 (4) No.496386

File (hide): da133a2467e3170⋯.png (6.88 KB, 768x100, 192:25, Snip 5.PNG) (h) (u)

>>494745 (OP)

▶Anonymous 02/26/18 (Mon) 00:46:27 85a843 (4) No.496431>>499393 >>501143 >>1530470

>>495890

>One further comment

Well, you kind of lost me pretty quickly. Correct me if I'm wrong, but what you're suggesting is for posts going forward, and posts that are Q centric.

My goal is to see ALL of the board searchable because much of the digging and research that was collected was not just related to items Q had in mind, but many ancillary topics and evidence discovered would help build the "parallel construct".

That's what I see as important, your thoughts?

▶Anonymous 02/26/18 (Mon) 01:29:27 91c771 (2) No.496858>>496999

Might I suggest using SQLite as the DB for the "file format". It's a single-file db that performs well for read-heavy workloads, so it's easy to distribute, is easily usable from PHP and just about any other programming language, and could easily be used to load a regular server-based db (obv depending on how the schema is designed). It's also multi-platform, so it should keep everybody happy irrespective of what OS you use.

▶Anonymous 02/26/18 (Mon) 01:44:45 91c771 (2) No.496999

>>496858

I forgot to mention, SQLite also supports full text indices via the fts4 virtual table type
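For illustration, a small sketch of an SQLite full-text index driven from Python's built-in `sqlite3` module. It assumes the SQLite build has FTS3/FTS4 enabled (most do); the table name, column names, and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real archive would use a file path instead

# FTS4 virtual table; requires an SQLite build with FTS3/FTS4 enabled.
conn.execute("CREATE VIRTUAL TABLE posts USING fts4(post_no, body)")

# Hypothetical sample rows, not real board data.
conn.executemany(
    "INSERT INTO posts (post_no, body) VALUES (?, ?)",
    [
        ("1001", "SQLite is a single file database"),
        ("1002", "Sphinx indexes from the database very quickly"),
        ("1003", "JSON threads can be loaded into any database"),
    ],
)


def search(query: str) -> list[str]:
    """Return post numbers whose body matches the full-text query."""
    rows = conn.execute(
        "SELECT post_no FROM posts WHERE body MATCH ? ORDER BY docid", (query,)
    )
    return [row[0] for row in rows]
```

Several words in one query are combined with AND, so `search("single file")` only returns posts containing both words.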

▶Anonymous 02/26/18 (Mon) 03:39:26 6a8d0c (1) No.497972

I made a thread a few minutes ago asking if a wiki could be a good format to organize findings? Could help with navigating. What do you think?

>>497784

▶Anonymous 02/26/18 (Mon) 04:59:07 dbb4a4 (36) No.498755>>499341

>>495005

I've been working on exactly this. I'm pulling the catalog from ga & qresearch. Finding the research general threads and saving those with q posts. Only goes back to about 2/15 when I turned the machine on. Currently working on getting old posts reconstructed. 99% sure I can grab all breads from 8ch.

C# dll to scrape q posts and threads from 8ch. 8ch+ json format but could be serialized XML I guess

▶Anonymous 02/26/18 (Mon) 06:03:59 4073db (3) No.499327>>501166 >>520068 >>532931

One of the anons from the other thread.

I'm not going to jump in too much if others are doing something where we end up stepping on Pepe's toes. Couple of thoughts though…

- Full text searches/indexes can be garbage. Only good for reserved words

- Most likely want this in a relational database. Creating the schema would consist of a really simple data model. Not even sure I would worry about normalizing it.

- Messages (body) could be stored in a blob and be searched with wildcards.

- Only looking at about 10-15 different queries tops. All simple SQL statements except for a couple that would need to be hierarchical..but still easy.

- I was thinking to use MySQL or SQL Server for the DB Engine.

- Biggest challenge will be the parsing of the threads and crumbs into a loaded format for the database. Once in a useful format…loading will be easy.

I see three main parts to this:

1) Getting the data so it can be loaded into a database.

2) Creating the database structures (really should be first)

3) Spitting out the queries, views, and sprocs that will be used. And putting a front end on it.
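The points above could be sketched roughly as follows. This is only an illustration: it uses SQLite through Python as a stand-in for MySQL or SQL Server, and the table and column names are assumptions. It shows a simple un-normalized data model, a wildcard search over message bodies, and one hierarchical query over reply links.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (
        post_no   INTEGER PRIMARY KEY,
        thread_no INTEGER NOT NULL,
        posted_at INTEGER NOT NULL,   -- Unix timestamp
        body      TEXT NOT NULL
    );
    -- One row per ">>" link: child replies to parent.
    CREATE TABLE replies (
        child_no  INTEGER NOT NULL,
        parent_no INTEGER NOT NULL
    );
    """
)
# Hypothetical sample data.
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    (1, 1, 100, "Quest for searchability"),
    (2, 1, 110, "Use a relational database"),
    (3, 1, 120, "Parsing the threads is the hard part"),
    (4, 1, 130, "Unrelated comment"),
])
conn.executemany("INSERT INTO replies VALUES (?, ?)", [(2, 1), (3, 2)])


def wildcard_search(term: str) -> list[int]:
    """Simple substring search over post bodies using LIKE."""
    rows = conn.execute(
        "SELECT post_no FROM posts WHERE body LIKE ? ORDER BY post_no",
        (f"%{term}%",),
    )
    return [r[0] for r in rows]


def reply_chain(root_no: int) -> list[int]:
    """Hierarchical query: the root post plus every direct or indirect reply to it."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(post_no) AS (
            SELECT ?
            UNION
            SELECT r.child_no FROM replies r JOIN chain c ON r.parent_no = c.post_no
        )
        SELECT post_no FROM chain ORDER BY post_no
        """,
        (root_no,),
    )
    return [r[0] for r in rows]
```

The recursive `WITH` clause is the SQLite spelling; other engines have their own syntax for hierarchical queries.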

* almost doxxed myself and put a link to my web site…so close :-)

▶Anonymous 02/26/18 (Mon) 06:05:42 4073db (3) No.499341>>520068

>>498755

>C# dll to scrape q posts and threads from 8ch. 8ch+ json format but could be serialized XML I guess

Good call me thinks

▶Anonymous 02/26/18 (Mon) 06:13:19 4073db (3) No.499393

>>496431

I think that is a great thought. May be a good idea to just get one set started and loaded then look into the other boards.

We (at least I) can't see a way to search the 'board' itself, but to create a copy of the data in the threads and make those searchable.

▶Anonymous 02/26/18 (Mon) 14:34:54 7cdf2a (6) No.501143

>>496431

>Download Chan.

>Host JSON of posts.

>Build simple interface.

>Use nginx as reverse proxy.

>??????????

Profit

Why the fuck do you want a DB when it's already JSON. FFS.

▶Anonymous 02/26/18 (Mon) 14:40:14 8dbdfa (3) No.501166>>501352

>>499327

Open Source, Cross Platform search engine library - xapian.org

▶Anonymous 02/26/18 (Mon) 15:21:55 8dbdfa (3) No.501352

>>501166

github .com/mcmontero/php-xapian JSON support and web-friendly middleware

▶Anonymous 02/26/18 (Mon) 15:36:29 7cdf2a (6) No.501408>>501440

File (hide): 3e025a52fbb6b5e⋯.png (27.01 KB, 634x278, 317:139, Screenshot from 2018-02-26….png) (h) (u)

A better way to do this is to probably put everything client side. Make a cross platform application that just fetches new posts every so often. The browser is pretty perfect for this if we can set up a cross platform local server to host a local copy of qcodefag and this board.

https:// github.com/bvaughn/js-search

Pros: Fast enough once index is built.

Cons: Have to build index, or send it from a server, ipfs, blockchain, whatever.

rip it from qcodefag for q posts

Add 8ch layout to some button on qcodefag or some tab

Display the posts as normal, but add search bar for board side of new client for qcodefag and this board.

Pic related, it's easy to get .json formatted threads.

inb4 we all pwn ourselves.
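To make the "build index, then search" idea concrete, here is a rough sketch of a tiny inverted index. It is written in Python purely for illustration and is not how js-search works internally; the tokenizing rule and function names are assumptions.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(posts: dict[int, str]) -> dict[str, set[int]]:
    """Map each token to the set of post numbers that contain it."""
    index = defaultdict(set)
    for post_no, body in posts.items():
        for token in tokenize(body):
            index[token].add(post_no)
    return dict(index)


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Return posts containing every token of the query (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)
```

Building the index is the slow part, which matches the con listed above; once it exists, each lookup is a few set intersections.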

▶Anonymous 02/26/18 (Mon) 15:42:26 7cdf2a (6) No.501440

>>501408

Conveniently this also alleviates the clown issue should that garbage bill pass. I mean not really since we'll still own ourselves but fuck, we can try.

▶Anonymous 02/26/18 (Mon) 19:24:07 8143cc (9) No.502752

>>502660

>>>502453

>>Research Threads Ideas. Please claim or create yours, let us know of more subject ideas

>>Quest for Research Searchability Thread

>>>494745 (You) (You)

>Thanks for including my thread. I'm not a coder so I'm not much more than a cheerleader. I am quite sincere in my belief that we have to make it all searchable. I'm not naive enough to expect a volunteer to tackle it. Without doxxing themselves, can any anon point me to a service or company that could accomplish this Quest?

▶Anonymous 02/26/18 (Mon) 19:27:38 8143cc (9) No.502773>>502802 >>502951 >>503685 >>532910

>>502660

>>494745 (OP) (You) (You)

Thanks for including my thread. I'm not a coder so I'm not much more than a cheerleader. I am quite sincere in my belief that we have to make it all searchable. I'm not naive enough to expect a volunteer to tackle it. Without doxxing themselves, can any anon point me to a service or company that could accomplish this Quest?

▶Anonymous 02/26/18 (Mon) 19:32:23 885f7e (1) No.502802

>>502773

A pleasure anon. Here's wishing you all, all the very best in this noble quest. It would be Christmas for us all if you did it. GODSPEED.

▶Anonymous 02/26/18 (Mon) 20:04:28 8dbdfa (3) No.502951

>>502773

The omega interface for xapian could do most of the work. Use wget to grab the site data and a json->csv converter to translate it, and it should be ready to go. All free and open source software. Not quite plug and play, but a start.

Sample usage described:

xapian.org /docs/omega/overview.html

linode .com has very affordable linux shell hosting.

▶Anonymous 02/26/18 (Mon) 21:17:42 e4e2ff (1) No.503420

Hey just had a thought but couldn't /ourguys/ look at all bullets ( like they have to on a crime scene )?

Wouldn't "LIPPEL" the one who had been "grazed" be able to connect bullet to her dna with whatever DNA would be on her?

What about the other student who was walking after being shot in both legs by 4 rounds?

Where is the DNA for that match to bullet?

What about the dead coach, the HERO we seen at the funeral? DNA match to that?

All this stuff might not help us ATM but IMO,

would play a big handle in the game out there with Q and friends?

https:// www.youtube.com/watch?v=cPvYxTa1ph4

LIPPEL

Another thing, this video she talks about how "BREAKING THE GLASS WITH SHOTS" starting at @ 2:05.and then she says they arrived..

MAYBE AN HOUR AFTER

At the end of the video she then states that the "Swat team/Police" were on the ground; she said they were banging on the doors to be let in, and she "DIDN'T TRUST IT WAS THEM, BECAUSE THE POLICE WERE BANGING ON THE DOORS - NOBODY GOT UP"

==IF THE SHOOTER DRESSED IN FULL METAL GARB SHOT OUT HER WINDOW, SHE WOULD OF SEEN IT BEING POLICE, AND THEY WOULD OF SEEN HER.. AND THEN PROBABLY OPENED THE DOOR THRU THE BROKEN GLASS INSTEAD OF BANGING ON THE DOOR WOULDN'T THEY ?"

Whole story right here in the video proves it was either a False Flag or some type of fuckery

▶Anonymous 02/26/18 (Mon) 22:06:16 1ca2ba (1) No.503685>>503938 >>509646

>>502773

It's not really a company, but wouldn't the person running the 4plebs archive be a good place to look for tools/code in this quest? Maybe he'd even be willing to assist? The site uses some fairly powerful search tools for certain halfchan boards already. I'm not a codefag so I apologize if this hasn't been suggested already.

https:// archive.4plebs.org/_/articles/faq/

▶Anonymous 02/26/18 (Mon) 22:42:13 8143cc (9) No.503938

>>503685

That's a good suggestion. Do you know off the top of your head how many archive sites have been used at 4ch and 8ch? I know about archive.is and 4plebs, but I've seen a lot more. I'm pretty sure the threads are scattered about the internet.

▶Anonymous 02/26/18 (Mon) 23:19:00 912746 (1) No.504261>>524530

>>495890

What about a bulletin board type of system like vbulletin for example? built in search and different forums and sub forums for topics.

▶Anonymous 02/27/18 (Tue) 00:24:53 ae5226 (3) No.504748>>505181

File (hide): c355f8e55c85a3b⋯.png (33.62 KB, 189x244, 189:244, Screen Shot 2018-02-26 at ….png) (h) (u)

>>311157

▶Anonymous 02/27/18 (Tue) 00:53:47 ae5226 (3) No.505007>>505181

File (hide): 6a8f1ccdd28a3a6⋯.png (24.86 KB, 181x245, 181:245, Screen Shot 2018-02-26 at ….png) (h) (u)

>>4144

▶Anonymous 02/27/18 (Tue) 01:14:49 8143cc (9) No.505181>>505196

>>504748

>>505007

Thanks, I'm sure there's a lot of good stuff here, I monitor them daily. Are you implying posts have been excised from threads and posted in these subs?

▶Anonymous 02/27/18 (Tue) 01:16:19 8143cc (9) No.505196

>>505181

> excised from threads and posted in these subs?

Sorry, didn't finish my thought, and they might not be captured in a search of posts in Qresearch? Not sure why you posted these.

▶Anonymous 02/27/18 (Tue) 01:22:18 ae5226 (3) No.505256

File (hide): 6a1582dc8728f23⋯.png (41.71 KB, 176x257, 176:257, Screen Shot 2018-02-26 at ….png) (h) (u)

File (hide): 29c38887f12c348⋯.png (33.06 KB, 190x248, 95:124, Screen Shot 2018-02-26 at ….png) (h) (u)

>>4274

>>4352

▶Anonymous 02/27/18 (Tue) 02:46:27 ec7b2a (1) No.506133

File (hide): facc20f480a8350⋯.gif (11.98 KB, 333x110, 333:110, sociopol_falseflag29.gif) (h) (u)

In naval warfare, a "false flag" refers to an attack where a vessel flies a flag other than their true battle flag before engaging their enemy.

It is a trick, designed to deceive the enemy about the true nature and origin of an attack.

In the democratic era, where governments require at least a plausible pretext before sending their nation to war, it has been adapted as a psychological warfare tactic to deceive a government's own population into believing that an enemy nation has attacked them.

In the 1780s, Swedish King Gustav III was looking for a way to unite an increasingly divided nation and raise his own falling political fortunes.

Deciding that a war with Russia would be a sufficient distraction but lacking the political authority to send the nation to war unilaterally, he arranged for the head tailor of the Swedish Opera House to sew some Russian military uniforms.

Swedish troops were then dressed in the uniforms and sent to attack Sweden's own Finnish border post along the Russian border. The citizens in Stockholm, believing it to be a genuine Russian attack, were suitably outraged, and the Swedish-Russian War of 1788-1790 began.

In 1931 Japan was looking for a pretext to invade Manchuria. On September 18th of that year, a Lieutenant in the Imperial Japanese Army detonated a small amount of TNT along a Japanese-owned railway in the Manchurian city of Mukden.

The act was blamed on Chinese dissidents and used to justify the occupation of Manchuria just six months later. When the deception was later exposed, Japan was diplomatically shunned and forced to withdraw from the League of Nations.

In 1939 Heinrich Himmler masterminded a plan to convince the public that Germany was the victim of Polish aggression in order to justify the invasion of Poland.

It culminated in an attack on Sender Gleiwitz, a German radio station near the Polish border, by Polish prisoners who were dressed up in Polish military uniforms, shot dead, and left at the station.

The Germans then broadcast an anti-German message in Polish from the station, pretended that it had come from a Polish military unit that had attacked Sender Gleiwitz, and presented the dead bodies as evidence of the attack. Hitler invaded Poland immediately thereafter, starting World War II.

http:// www.bibliotecapleyades.net/sociopolitica/sociopol_falseflag29.htm

For hundreds of links to FF research/reports, use this link below. You are welcome Anons..

http:// www.bibliotecapleyades.net/sociopolitica/sociopol_falseflag.htm

▶Anonymous 02/27/18 (Tue) 13:47:23 8143cc (9) No.509646>>510581 >>519706 >>524543 >>651528

>>503685

>person running the 4plebs archive be a good place to look for tools/code in this quest?

For the archives 4plebs uses sphinx search (http:// sphinxsearch.com/). It's used to index from the database and display search results very quickly.

Easy to implement but I would say it's worth it only if you have a lot of data to search through. For smaller datasets you can use full text search included in a regular database engine.

Also you can take a look at other search engines like Solr (http:// lucene.apache.org/solr/) and elasticsearch (https:// www.elastic.co/)

▶Anonymous 02/27/18 (Tue) 16:25:53 af8c7d (2) No.510581

>>509646

been using duckduck for searches

▶Anonymous 02/27/18 (Tue) 16:27:09 af8c7d (2) No.510592

cryptocert keys moded on puter… should i reboot or undo?

▶Anonymous 02/28/18 (Wed) 21:42:26 9176e6 (1) No.519706

>>509646

I would also second the idea of using Sphinx - it can be connected to a currently live database and given clues and sample queries to index all text in the DB - https:// www.percona.com/resources/technical-presentations/how-optimally-configure-sphinx-search-mysql-percona-live-mysql and they have a video. I don't think there are any existing Docker setups to play with, although I imagine 8ch is quite custom anyway.

▶Anonymous 02/28/18 (Wed) 22:24:37 dbb4a4 (36) No.520068>>520151

>>499327

>>499341

OK So I think I've got my chanscraper console app working as designed.

AFAIK, I've got all the QPosts in a single JSON, I've got complete breads starting with Bread #364 2018-02-07. That's as far back as I've been able to reach programmatically. Each complete bread has also been filtered into another json file containing just Q's posts.

The complete breads have only come from 8ch. The chanscraper is set up to where it could scrape 4ch as well - assuming the json is still available.

I'm showing 825 QPosts - 1 more than qCodeFag because I believe I have a deleted one. All counted it's 210 threads.

I've done all the hard work of setting up the old catalog/threads/posts. It's set up where you can specify how far back to refresh (to cut down on unnecessary http gets). It reads in the existing data, finds the new threads to search for on 8ch/greatawakening and 8ch/qresearch, and then archives the threads/posts that q has made locally.

If anybody wants the full Q archive as I have it now, here it is: 6mb https:// anonfile.com/H6B7G7dcbc/QJsonArchive.zip

I'm going to integrate the DJTweets + minute Deltas in this week.

Once I get this all cleaned up I'll cut it loose on Github if there are any C#codeFags interested.

My idea is to set up a simple HTML page using some javascript that can be run locally on a single users machine or website. Since the scraper is a C# dll it could be set up to run as a timed service on a web server to keep a site up to date.

▶Anonymous 02/28/18 (Wed) 22:36:37 98bd4e (20) No.520151>>520179

>>520068

Code at github.com/anonsw/qtmerge does some similar things. Check it out, maybe there are some useful ideas to lift from there: anonsw.github.io

▶Anonymous 02/28/18 (Wed) 22:41:23 dbb4a4 (36) No.520179>>520263

>>520151

Yeah I knew about that - but I'd already been getting data from QCodeFag. The QCodeFag data was the basis for what I have now since it had already done the scraping on 4ch. I wanted my own in C# source going forward that I can use locally with my other C# code.

▶Anonymous 02/28/18 (Wed) 22:42:43 7cdf2a (6) No.520183>>520193

I don't know why nobody cares but it's trivial to download threads, posts, and boards through the 8ch api in the form of JSON. There is no reason to not have the local client make the get request every so often.
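A hedged sketch of what that periodic GET could look like. The URL layout (`/<board>/res/<thread>.json`) and the `posts`, `no`, `time`, and `com` field names are assumptions based on the vichan-style JSON that chan boards commonly expose, so check them against a real response before relying on this.

```python
import json
import urllib.request

BASE_URL = "https://8ch.net"  # assumed host; adjust to wherever the board lives


def thread_url(board: str, thread_no: int) -> str:
    """Build the JSON URL of one thread (assumed layout: /<board>/res/<thread>.json)."""
    return f"{BASE_URL}/{board}/res/{thread_no}.json"


def parse_thread(raw: str) -> list[dict]:
    """Pull the fields of interest out of a thread's JSON text."""
    data = json.loads(raw)
    return [
        {"no": p["no"], "time": p.get("time"), "com": p.get("com", "")}
        for p in data.get("posts", [])
    ]


def fetch_thread(board: str, thread_no: int) -> list[dict]:
    """Download one thread and parse it; a client could call this on a timer."""
    with urllib.request.urlopen(thread_url(board, thread_no), timeout=30) as resp:
        return parse_thread(resp.read().decode("utf-8"))
```

Keeping the parsing separate from the download means saved copies of threads that have slid off the catalog can go through the same `parse_thread` function.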

▶Anonymous 02/28/18 (Wed) 22:44:16 dbb4a4 (36) No.520193>>520201

>>520183

Yep. That's why I did it. Getting all the JSON is easy once you know where everything is - but stuff sliding off the catalog was what made me want to keep a local archive.

▶Anonymous 02/28/18 (Wed) 22:45:54 7cdf2a (6) No.520201>>520237

>>520193

I meant the hypothetical client with which people are searching this board and staying updated. That client should search for posts all on its own instead of relying on a single source of truth. (saves infrastructure money too)

▶Anonymous 02/28/18 (Wed) 22:50:35 dbb4a4 (36) No.520237>>527353

>>520201

Precisely.

Once I get it finished I'll provide a single HTML page that is like QCodeFag. View on your desktop.

Run the chanscraper then view the HTML to see new posts

▶Anonymous 02/28/18 (Wed) 22:54:26 98bd4e (20) No.520263>>520278

>>520179

Cool, check out qanonmap too for posts no longer retrievable. I think they have some that qcodefag doesn't have.

▶Anonymous 02/28/18 (Wed) 22:56:23 dbb4a4 (36) No.520278>>520298 >>520323

>>520263

>qanonmap

Whazza qanonmap url?

https:// qanonposts.com/ ok?

▶Anonymous 02/28/18 (Wed) 22:58:02 98bd4e (20) No.520298>>520323

>>520278

github.com/qanonmap

qanonmap.github.io

not sure if thestoryofq.com is related

But they are qcodefag forks.

▶Anonymous 02/28/18 (Wed) 23:01:00 dbb4a4 (36) No.520323>>520373 >>536855

>>520278

>>520298

Duh. I had it.

I noticed that qanonmap.github.io has 827 posts and qanonposts.com has 824.

That's going to cause my OCD great consternation.

▶Anonymous 02/28/18 (Wed) 23:06:33 98bd4e (20) No.520373>>520391 >>536855

>>520323

Yep, but I think new ones just haven't been added yet to qcodefag.

▶Anonymous 02/28/18 (Wed) 23:09:45 dbb4a4 (36) No.520391

>>520373

Hmm.. That doesn't help me - I've got those. I'm only showing 825

▶Anonymous 03/01/18 (Thu) 09:23:28 07564d (94) No.524371>>1159568

>>494816

Ctrl-f is only good on a single thread. What researchers really need is a way to access the entire set of Q posts. I've built that capability for myself locally by parsing ctrl-s saves of the threads into a MySQL database and running SQL searches on that.

The best bet for a public search engine might be to cooperate with CodeMonkey to build a search capability for the boards. We'd still have to search each board separately, but at least we would be able to search each board all at once.

I've got most of the Q related posts from 4chan and 8ch locally, but I'm not sure how to make that much data publicly available. I've also got a fair amount of PHP code that I use to access and organize the raw data. I'd be willing to share it if I had a place to do it.

▶Anonymous 03/01/18 (Thu) 09:27:12 07564d (94) No.524384

>>493751

Actually, I have had chan posts show up in browser search engine results, but I know this isn't what you're after. I've built the type of search capability you're after on my local machine. It still takes a lot of time to work with the posts, but it's definitely easier than anything we can do at the original sources.

▶Anonymous 03/01/18 (Thu) 09:29:47 07564d (94) No.524395

>>494228

Timeline is easily generated when one has the ability to set the post time to something other than the current time. That's how I create timeline posts in my own database.

▶Anonymous 03/01/18 (Thu) 09:36:20 07564d (94) No.524431>>530474

>>494015

I definitely appreciate that notable posts are included in the breads on each thread. It isn't necessary for them to be updated on each and every thread, but it is good to have them updated at least every day. Right now, I'm using the links in the bread posts to mark posts in my private database as being included in the bread. Given the volume of posts that I am now working with, these links make it easier to determine what is important to include.

▶Anonymous 03/01/18 (Thu) 09:41:52 07564d (94) No.524456

>>494503

I use PHP because it's free. *shrug*

▶Anonymous 03/01/18 (Thu) 09:51:39 07564d (94) No.524489

>>494471

If you're lucky, you can find your archives on archive.org. That site saves pages with nearly the same HTML elements as the original page. Archive.is converts the classes used on the original page into their style equivalents, making for a parsing nightmare. When I've had to use the archive.is version of a page, it was a painstaking process to recreate the single post that I went to the archive to get. My parser code can parse the archive.org archives the same as the original, so it's easy to get all posts from that archive.

▶Anonymous 03/01/18 (Thu) 09:54:50 07564d (94) No.524503

>>493929

I've already done this. I'm willing to share my data structures and parsers, if I have a place to do it.

▶Anonymous 03/01/18 (Thu) 09:58:26 07564d (94) No.524511

>>495890

I've got tagging fields included in my data structure. Getting them filled is an entirely different matter. I've got a tool to help do it more efficiently than phpMyAdmin, but it needs a bit of work to make it just a bit more efficient so that more than one post can be updated in one pass.

▶Anonymous 03/01/18 (Thu) 10:04:29 07564d (94) No.524530

>>504261

The challenge is classifying the posts to determine which sub forum to direct them to. Not trivial.

▶Anonymous 03/01/18 (Thu) 10:07:33 07564d (94) No.524543>>538775

>>509646

There are over 750,000 total posts from both sites and all boards containing Q related posts. It's a large data set now.

▶Anonymous 03/01/18 (Thu) 12:03:38 838074 (4) No.524965>>525531

Why not just build a 4chan archive site? That's the main thing lacking from 8ch.

▶Anonymous 03/01/18 (Thu) 14:32:01 7cdf2a (6) No.525489>>530978

Literally just build an index of tags and use fucking client side javascript. Muh databases. Jesus Christ people. You could even let users share tags.

First one with a completed project wins. Peace.

▶Anonymous 03/01/18 (Thu) 14:41:21 dbb4a4 (36) No.525531>>529626

>>524965

https:// 8ch.net/qresearch/archive/index.html

▶Anonymous 03/01/18 (Thu) 20:11:09 dbb4a4 (36) No.527353

>>520237

Here's the archive again + a handy HTML page that you can use in your browser to view the archives locally. Works fine in Chrome and IE. Readme included.

https:// anonfile.com/W3f5H6d8be/QJSONArchive.zip

▶Anonymous 03/02/18 (Fri) 02:28:14 838074 (4) No.529626>>530101

>>525531

OK, so why not do the fashionable, continuous integration FOSS thing and add searching to the archive site at the repo?

▶Anonymous 03/02/18 (Fri) 03:35:10 dbb4a4 (36) No.530101>>530155

>>529626

I expect because 8ch is not a massive corporation with a bunch of resources at their disposal. /sudo/

▶Anonymous 03/02/18 (Fri) 03:42:05 838074 (4) No.530155>>530720

>>530101

What difference does that make? Anons are gathered here. Why don't they just go there to assist in development instead of fragmenting and branching out to 1000 directions? Consolidate, integrate, then diverge.

▶Anonymous 03/02/18 (Fri) 03:59:50 ca4dab (1) No.530283>>530677

File (hide): 40d6ea4f332a28a⋯.png (377.24 KB, 1822x426, 911:213, ClipboardImage.png) (h) (u)

▶Anonymous 03/02/18 (Fri) 04:26:32 6db142 (1) No.530474>>530920

>>524431

>links

If it's server based, something like http:// arborjs.org/ for data visualization/selection would then fix the mapping problem and help a lot with the search problem.

>links

There's also the Open Visual Thesaurus project (www.chuongduong.net/thinkmap/) to maybe grab code/ideas from, for viewing a search result alongside whatever else might be related, as a way to walk through the data.

▶Anonymous 03/02/18 (Fri) 05:04:36 dbb4a4 (36) No.530677

>>530283

Here's a newer local archive that moves there.

I've put in some UI enhancements to the JSON Viewer HTML page. Seems to be working good. With a slight mod it could work with local json from any QCodeFag site or even direct from 8ch.

https:// anonfile.com/5ercH3d9ba/QJSONArchive_v1.zip

Getting the posts into 2 columns should be no problem. It's getting a reliable news source that is gonna cause you trouble.

I was planning on putting 3 columns in the viewer, QPosts, Times, DJTweets. In doing all this I've discovered a few things about 8ch/halfchan. The post id's are not guaranteed unique. The best unique key is time and I've found 2 posts that dropped at the same timestamp. Thematically I've been trying to key everything to time. [qposts, tweets, news]
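As a rough sketch of keying everything to time: the code below merges several time-sorted streams into one timeline and computes minute deltas between the streams. The `(timestamp, kind, text)` tuple layout is an assumption, and so is the reading of "minute delta" as minutes from the latest earlier post to a tweet.

```python
import heapq


def merge_timeline(*streams):
    """Merge streams of (timestamp, kind, text) tuples, each already sorted by time."""
    return list(heapq.merge(*streams, key=lambda item: item[0]))


def minute_deltas(q_posts, tweets):
    """For each tweet, minutes since the latest post at or before it (None if there is none)."""
    deltas = []
    q_times = sorted(t for t, _, _ in q_posts)
    for tweet_time, _, _ in sorted(tweets):
        earlier = [t for t in q_times if t <= tweet_time]
        deltas.append((tweet_time - earlier[-1]) // 60 if earlier else None)
    return deltas
```

Because two items can share a timestamp, the merge keeps them in stream order instead of treating time as a unique key.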

▶Anonymous 03/02/18 (Fri) 05:16:41 dbb4a4 (36) No.530720

>>530155

Jump in.

▶Anonymous 03/02/18 (Fri) 06:22:38 07564d (94) No.530920>>530994

>>530474

yEd can produce maps from spreadsheet data. That's one I know of.

https:// www.yworks.com/products/yed

Maybe when I get further along in the post tagging work, it'll be useful.

I'm toying with the idea of making my raw data available in some way, possibly in read only format. (Clowns can be destructive.)

▶Anonymous 03/02/18 (Fri) 06:42:20 07564d (94) No.530978

>>525489

I would like to be able to allow others to tag posts in my database. Any ideas on how to keep clowns from shitting everything up?

My initial thought is to allow suggesting of tags (similar to comment logic in the blog) with moderators making final decisions on them.
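One way the suggest-then-moderate idea could look, as a minimal in-memory sketch. The class and method names are hypothetical, and a real version would sit on top of the database instead of Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class TagQueue:
    """Suggested tags wait in `pending` until a moderator approves or rejects them."""
    pending: list = field(default_factory=list)    # (post_no, tag) pairs
    approved: dict = field(default_factory=dict)   # post_no -> set of tags

    def suggest(self, post_no: int, tag: str) -> None:
        """Queue a suggestion unless it is already pending or approved."""
        item = (post_no, tag.strip().lower())
        if item not in self.pending and item[1] not in self.approved.get(post_no, set()):
            self.pending.append(item)

    def approve(self, post_no: int, tag: str) -> bool:
        """Moderator accepts a pending suggestion; returns False if it was not pending."""
        item = (post_no, tag.strip().lower())
        if item not in self.pending:
            return False
        self.pending.remove(item)
        self.approved.setdefault(post_no, set()).add(item[1])
        return True

    def reject(self, post_no: int, tag: str) -> bool:
        """Moderator discards a pending suggestion; returns False if it was not pending."""
        item = (post_no, tag.strip().lower())
        if item not in self.pending:
            return False
        self.pending.remove(item)
        return True
```

Nothing a visitor submits touches the approved set directly, so junk suggestions only cost a moderator a rejection.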

▶Anonymous 03/02/18 (Fri) 06:47:17 07564d (94) No.530994>>538787

>>530920

One of the big reasons I hesitate in making the entire database available is because a few of the images uploaded into the threads are obscene. I have no desire to inadvertently publish that sort of thing. When I'm publishing a reviewed subset, the chances of that happening are low.

▶Anonymous 03/02/18 (Fri) 16:54:18 00c874 (2) No.532910>>538741

>>502773

Perhaps?? just a guess.

Half Past Human .com

Absolutely the capability!

Discretion and interests match? Dunno.

▶Anonymous 03/02/18 (Fri) 16:59:16 00c874 (2) No.532931>>534887

>>499327

Is there an interest in pre-selecting data?

For example, select only posts identified on "notable posts" lists from each general #.

Plus, of course, any to-from links on those selected, chained.

Just asking. DB size, usability, etc.

Or is the data set also for researching shill/troll themes? It is a possibility, so I ask.

▶Anonymous 03/02/18 (Fri) 23:32:52 07564d (94) No.534887>>534908

>>532931

I'm working on that right now. I got started on this a week or so ago. I wrote a bit of code to travel back through context links, too. Hopefully, in a few days, I'll be able to repost my blog with the results of this work.
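Travelling back through context links can be sketched as a breadth-first walk over the `>>NNNNNN` references in each post. This is not the anon's actual code; the function name and the `{post_no: body}` layout are assumptions for the example.

```python
import re
from collections import deque

REF_RE = re.compile(r">>(\d+)")


def context_chain(posts: dict[int, str], start: int) -> list[int]:
    """Follow >> links backwards from `start`, breadth first, visiting each post once."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for ref in REF_RE.findall(posts.get(current, "")):
            ref_no = int(ref)
            # Skip links to posts we do not have and posts already visited.
            if ref_no in posts and ref_no not in seen:
                seen.add(ref_no)
                order.append(ref_no)
                queue.append(ref_no)
    return order
```

Starting the walk from each post on a notable list would give the pre-selected subset asked about above, with the chained posts included.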

▶Anonymous 03/02/18 (Fri) 23:37:04 07564d (94) No.534908

>>534887

A bit more to say about that:

It's my plan to include items that reach back to a Q post together with that Q post when I can identify such. I may do a little pruning to keep the length of the entry associated with a Q post under control. Not everything in a context thread is important, after all. I may have to think about further arranging of things. I'll think more about that as I get closer to a point where I can implement such a strategy.

▶Anonymous 03/03/18 (Sat) 05:16:26 dbb4a4 (36) No.536855>>540555 >>541964

File (hide): 1aca03a8df398a5⋯.png (92.13 KB, 1241x968, 1241:968, ClipboardImage.png) (h) (u)

>>520323

>>520373

So I managed to find the missing drops. My archive now has 827 total. As it turns out, the scraper was working as designed, filtering out Anonymous posts. The missing 2 for me were #823 and #819 when Q's trip wasn't working.
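A sketch of one way to avoid that kind of miss: filter on the tripcode as before, but also accept a hand-maintained list of post numbers that were verified separately. The `trip` and `no` field names are assumed from the board JSON, and the values in the usage note are placeholders.

```python
def is_wanted(post: dict, wanted_trips: set, manual_post_numbers: set) -> bool:
    """Keep a post if its tripcode is wanted, or if it was verified by hand."""
    if post.get("trip") in wanted_trips:
        return True
    return post.get("no") in manual_post_numbers


def filter_posts(posts: list, wanted_trips: set, manual_post_numbers: set) -> list:
    """Return only the posts that pass is_wanted, in their original order."""
    return [p for p in posts if is_wanted(p, wanted_trips, manual_post_numbers)]
```

For example, `filter_posts(posts, {"!placeholderTrip"}, {12345})` keeps every post with that tripcode plus post 12345 even when its `trip` field is missing.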

▶Anonymous 03/03/18 (Sat) 14:16:05 8143cc (9) No.538741

>>532910

>Half Past Human .com

Wow. That's a new one to me.

▶Anonymous 03/03/18 (Sat) 14:25:36 8143cc (9) No.538775

>>524543

>There are over 750,000 total posts from both sites and all boards containing Q related posts.

Yes, and that's the challenge. Making the Q "related" posts searchable. Making Q's posts searchable is arguably not as important as making the body of related posts searchable as that's where the body of knowledge resides.

"You have more than you know" taunts us with its promise. We get pointed to Loop Capital, or Stanislav Lunev. We need to be able to search/aggregate all of the posts over weeks/months with a single search. The dedicated research threads are great as far as they go but we're missing a lot of other info posted as snippets.

▶Anonymous 03/03/18 (Sat) 14:29:31 8143cc (9) No.538787

>>530994

>few of the images uploaded into the threads are obscene.

That does complicate it, but a lot of the information in the Q "related" posts is graphic. It seems culling of obscene content would need to be done manually to avoid throwing the baby out with the bathwater.

▶Anonymous 03/03/18 (Sat) 20:18:18 98bd4e (20) No.540555>>543389

>>536855

Good catch. I found some in my db as well.

I like the post headers in the UI. Nice and clean.

▶Anonymous 03/03/18 (Sat) 22:49:51 838074 (4) No.541964

>>536855

Yeah, qanonmap has had all of those for over a week now…

▶Anonymous 03/04/18 (Sun) 02:15:49 dbb4a4 (36) No.543389>>545176

>>540555

What is everybody using as their sources for drops? 8ch? One of the QCode forks? Something else?

How do we verify that our collections are the same?

I've been adding a Guid for each post I scrape, just to give them all a unique value.

▶Anonymous 03/04/18 (Sun) 05:32:44 98bd4e (20) No.545176>>547789 >>547826

>>543389

qtmerge uses the raw JSON/HTML data where relevant from 8ch, 4plebs and trumptwitterarchive as its source data. It also merges in the JSON from qcodefag/qanonmap. It currently uses the host, board, post timestamp and post number to sync.

I like the idea of matching the GUIDs along with a post hash using some method we agree on.
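
One way the "post hash using some method we agree on" idea could look, as a minimal sketch only. The sync key reuses the four fields qtmerge is said to sync on (host, board, post number, timestamp). The function names, the choice of SHA-256 and the canonical JSON encoding are all assumptions for illustration, not anything the tools in this thread are known to do.

```python
import hashlib
import json


def sync_key(host, board, no, time):
    """Join the four identifying fields into one comparable string (format is hypothetical)."""
    return f"{host}/{board}/{no}@{time}"


def post_hash(host, board, no, time, com):
    """SHA-256 over a canonical JSON encoding of the identifying fields plus the raw body."""
    canonical = json.dumps(
        {"host": host, "board": board, "no": no, "time": time, "com": com or ""},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_collections(mine, theirs):
    """Compare two dicts of sync_key -> post_hash.

    Returns (keys only in mine, keys only in theirs, keys present in both with different hashes).
    """
    only_mine = sorted(mine.keys() - theirs.keys())
    only_theirs = sorted(theirs.keys() - mine.keys())
    mismatched = sorted(k for k in mine.keys() & theirs.keys() if mine[k] != theirs[k])
    return only_mine, only_theirs, mismatched
```

Two archives that hash the same fields the same way can then swap just the key/hash pairs to see where they disagree, without shipping the full data. A random GUID per scrape cannot do that on its own, since each scraper would generate a different one for the same post.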

▶Anonymous 03/04/18 (Sun) 15:17:46 dbb4a4 (36) No.547789>>547826 >>549377

>>545176

Oh shit. Qtmerge is scraping HTML pages? You are dedicated. I sourced stuff from qcodefag that I couldn't get json for.

Do you have the full bread sources?

▶Anonymous 03/04/18 (Sun) 15:23:58 dbb4a4 (36) No.547826>>548084

Phonefag right now.

>>545176

>>547789

There's an md5 field, as you know, in the 8ch json, but it wasn't in the data I got from Qcodefag because he'd modified the .com to strip HTML into a .text field.

My chanscraper keeps the md5 and the .com and strips HTML into .text.
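
A rough sketch of the "keep .com, strip HTML into .text" step, using only the Python standard library. The real ChanScraper is C# and its exact rules are not shown in this thread, so the class name, the function name and the choice to turn `<br>` and `<p>` into newlines are assumptions.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and turn <br>/<p> into newlines (assumed behaviour)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("br", "p"):
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def com_to_text(com):
    """Return the plain-text version of a post's HTML `com` field."""
    if not com:
        return ""
    parser = _TextExtractor()
    parser.feed(com)
    parser.close()  # flush any text buffered after the last tag
    return "".join(parser.parts).strip()
```

The original `com` and `md5` stay untouched on the record; `text` is just an extra derived field, so nothing is lost if the stripping rules change later.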

Any C#fags here?

I did set up a GitHub yesterday and push the chanscraper out. Gonna get the Twitter stuff mashed in the next few days.

▶Anonymous 03/04/18 (Sun) 16:08:03 dbb4a4 (36) No.548084>>548229

>>547826

Just ran my chanscraper again since apparently there were new posts last night as I was jacking around with Github.

I checked my posts with what's on qresearch and I think I'm good. Showing 839 total now.

New Q posts from 828 - 839.

I found a bug in the ChanScraper code too. A thing I've been working on that I forgot to remove. I'll push it out too and then link the GitHub.

▶Anonymous 03/04/18 (Sun) 16:26:57 dbb4a4 (36) No.548229>>548433 >>549586

>>548084

Here's the link to my new GitHub

https:// github.com/QCodeFagNet/SFW.ChanScraper

If you are going to run the ChanScraper and then view the posts locally, when you open the QJSONViewer.html page, don't open the [json\_allQPosts.json] file, open the newly generated [bin\json\_allQPosts.json] file.

The machine needed me to include all the existing posts/work json. It's kind of clunky the way I'm doing it because I want to keep this updated with the latest posts/work json. But for a normal user everything is kept updated automagically in the bin\json folders. The project is set up to copy new files if newer - so everything should be kept in sync.

If you are planning on running this locally you'll need the .NET framework 4.5 at least. Probably better to go with 4.5.2

https:// www.microsoft.com/net/download/dotnet-framework-runtime/net452

▶Anonymous 03/04/18 (Sun) 16:47:12 dbb4a4 (36) No.548433

>>548229

You'll need Visual Studio free (at least) to build it unless you are a commandline master.

https:// www.visualstudio.com/vs/visual-studio-express/

▶Anonymous 03/04/18 (Sun) 18:37:42 98bd4e (20) No.549377>>550148

>>547789

Only HTML of archive pages.

▶Anonymous 03/04/18 (Sun) 19:07:30 07564d (94) No.549586>>550148 >>550167

>>548229

Does your scraper work on the archive.is versions? These are the most complete most of the time since that is where so many of the pages were almost immediately saved by anons.

▶Anonymous 03/04/18 (Sun) 20:29:46 dbb4a4 (36) No.550148>>550218 >>550251 >>551411

>>549377

Tedious Dayum. Think you could convert your full bread scrape into some json?

>>549586

Gotta link to one of the JSON files?

>>548564

Here's a mini local JSON viewer as an HTML page + allQPosts.json. @225KB

Includes all QPosts up to 2018-03-04T11:29:14

https:// anonfile.com/06HeJbdeb6/Mini_Local_JSONViewer.zip

I was just thinking that what we really need to start off with is a single schema that we can all agree on. It will go a long way toward interoperability.

I'm going to run some tests on my local QCodeFag install and see if it will work off of the ChanScraper _allQPosts.json file. I think it should.

The JSONViewer could work with straight files from 8ch or 4ch with a single minor change I forgot to put in.

▶Anonymous 03/04/18 (Sun) 20:33:02 dbb4a4 (36) No.550167

>>549586

The ChanScraper includes the full JSON archive as of this morning. I haven't needed to go back to any archive.is HTML archives because I've been collecting breads locally since the beginning of Feb. All the Q Posts before that I sourced from the QCodeFag forks.

▶Anonymous 03/04/18 (Sun) 20:42:52 dbb4a4 (36) No.550218

>>550148

Here's what the JSON schema I'm working with looks like.

[

{

"source": "qresearch",

"threadId": 544266,

"link": "https:// 8ch.net/qresearch/res/544266.html#544985",

"imageLinks": [

{

"url": "https:// media.8ch.net/file_store/ffd6128f5949e4d4f6f3480236a63be002ffc5e59c0a31714360624d8ce45170.jpeg"

},

{

"url": "https:// media.8ch.net/file_store/ffd6128f5949e4d4f6f3480236a63be002ffc5e59c0a31714360624d8ce45170.jpeg/B42CA278-6C32-4618-A856-0CB9B680CC38.jpeg"

}

],

"references": [

{

"source": "qresearch",

"threadId": 0,

"link": "https:// 8ch.net/qresearch/res/0.html#548166",

"imageLinks": [],

"references": [],

"no": 548166,

"uniqueId": "19294a1b-8cae-435d-9503-8eb70c573d6b",

"_unixEpoch": "1970-01-01T00:00:00Z",

"text": "\r\r>>548157\r\rAlso not a real Q post\r\rQ",

"postDate": "2018-03-04T11:19:47",

"time": 1520180387,

"tn_h": 0,

"tn_w": 0,

"h": 0,

"w": 0,

"tim": null,

"fsize": 0,

"filename": null,

"ext": null,

"md5": null,

"last_modified": 1520180387,

"sub": null,

"com": "<a onclick=\"highlightReply('548157', event);\" href=\"/qresearch/res/547414.html#548157\">>>548157</a>Also not a real Q postQ",

"name": "Q ",

"trip": "!UW.yye1fxo",

"replies": 0

}

],

"no": 544985,

"uniqueId": "35c759aa-4998-4009-83a7-2af1b3273f28",

"_unixEpoch": "1970-01-01T00:00:00Z",

"text": "\r\r>>548166\r\rNOT A REAL Q POST\r\rQ",

"postDate": "2018-03-04T00:17:27",

"time": 1520140647,

"tn_h": 237,

"tn_w": 255,

"h": 1114,

"w": 1200,

"tim": "ffd6128f5949e4d4f6f3480236a63be002ffc5e59c0a31714360624d8ce45170",

"fsize": 271479,

"filename": "B42CA278-6C32-4618-A856-0CB9B680CC38",

"ext": ".jpeg",

"md5": "CbsCGk0pVEahunzSuV4LKw==",

"last_modified": 1520140647,

"sub": null,

"com": "<a onclick=\"highlightReply('548166', event);\" href=\"/qresearch/res/547414.html#548166\">>>548166</a>NOT A REAL Q POST.Q",

"name": "Q ",

"trip": "!UW.yye1fxo",

"replies": 0

}

]

▶Anonymous 03/04/18 (Sun) 20:48:15 98bd4e (20) No.550251>>553109

>>550148

Let me clarify, HTML for just the archive pages (to capture threads not in catalog/threads.json). JSON for everything else.

I'm working on how to share it, currently unoptimized and around 6 GiB of data uncompressed.

▶Anonymous 03/04/18 (Sun) 23:05:42 07564d (94) No.551411>>553092

>>550148

http:// archive.is/https:// 8ch.net/cbts/res/*

It doesn't look like archive.is does JSON. Your parser doesn't do HTML?

▶Anonymous 03/05/18 (Mon) 02:30:40 dbb4a4 (36) No.553092

>>551411

Yeah I've dug thru all the html looking for a reference to a json file. Can't find a reference to one either. My guess is, that once it drops off the main thread catalog, the JSON is no longer available. Too bad because that's the meat in a simple format.

No the machine is more of a scraper (grab data and save it) than a parser. It does parse the HTML out of the .com field into .text like QCodeFag does though. It's not designed to read thru html pages to look for posts.

It has a local baseline archive of everything. It reads in that entire local archive and then figures out the json breads it needs to download from the 8ch/qresearch/catalog.json. Then it downloads all those new breads and resets itself so you don't download everything every time - only the breads from the past [x] days.
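
The "only download what's new" step described above might be sketched like this. It assumes the catalog parses to a list of pages, each with a `threads` list whose entries carry `no` and `last_modified`, and that the local archive remembers the `last_modified` it last saw per thread. Those structures and the function name are assumptions; the actual ChanScraper logic may differ.

```python
import time as _time


def threads_to_download(catalog_pages, local_last_modified, max_age_days, now=None):
    """Return thread numbers that are new or changed since the local copy was saved.

    catalog_pages: parsed catalog.json (assumed: list of {"threads": [...]} pages)
    local_last_modified: dict of thread no -> last_modified value already on disk
    max_age_days: ignore threads not touched within this many days
    """
    now = _time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    wanted = []
    for page in catalog_pages:
        for thread in page.get("threads", []):
            no = thread["no"]
            modified = thread.get("last_modified", thread.get("time", 0))
            if modified < cutoff:
                continue  # older than the [x]-day window
            if local_last_modified.get(no, -1) < modified:
                wanted.append(no)  # never seen, or updated since last run
    return wanted
```

After a run, storing each downloaded thread's `last_modified` back into the local index is what "resets" things so the next run skips unchanged breads.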

▶Anonymous 03/05/18 (Mon) 02:32:20 dbb4a4 (36) No.553109>>554309 >>555095

>>550251

You've got a database? I assume that's with all the images as blobs?

▶Anonymous 03/05/18 (Mon) 04:30:33 dbb4a4 (36) No.554074

Here's an updated mini local JSON viewer as an HTML page + allQPosts.json. @225KB

I updated it so it works with the raw json from 8ch.

https:// 8ch.net/qresearch/res/553655.json

Could probably use an [ascending/descending] button but…

Includes all QPosts up to 2018-03-04T11:29:14

https:// anonfile.com/z4U1Jdd9b9/Mini_Local_JSONViewer.zip

If folks don't like a zip, it's only 2 files: they can download the HTML file (ChanScraper) and the allQPosts.json (Console\bin) file from GitHub: https:// github.com/QCodeFagNet/SFW.ChanScraper

▶Anonymous 03/05/18 (Mon) 05:10:01 07564d (94) No.554309>>554376 >>569900

>>553109

My images are kept as separate files in original form. Only the links are kept in the database. Here's the record definition for MySQL:

CREATE TABLE `chan_posts` (

`post_key` varchar(31) NOT NULL COMMENT 'site/board#post (post is set to length 9 with . fill.',

`thread_key` varchar(31) NOT NULL COMMENT 'site/board#thread (thread is set to length 9 with . fill.',

`post_site` varchar(19) NOT NULL COMMENT 'For editor post, use editor. For spreadsheet, use sheet.',

`post_board` varchar(15) NOT NULL COMMENT 'For editor post, use editor. For spreadsheet, use sheet.',

`post_thread_id` int(10) UNSIGNED NOT NULL COMMENT 'For editor post, use 1. For spreadsheet, use row.',

`post_id` int(10) UNSIGNED NOT NULL COMMENT 'For editor post, use next available. For spreadsheet, use column converted to number.',

`ghost` int(10) UNSIGNED DEFAULT NULL,

`post_url` text,

`local_thread_file` text,

`post_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,

`post_title` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,

`post_thread_title` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,

`post_text` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,

`prev_post_key` varchar(31) DEFAULT NULL,

`next_post_key` varchar(31) DEFAULT NULL,

`wp_post_id` int(11) UNSIGNED DEFAULT NULL,

`post_type` set('editor','q-post','anon','approved','high','mid','low','irrelevant','timeline') NOT NULL DEFAULT 'anon',

`flag_use_in_blog` tinyint(1) NOT NULL DEFAULT '0',

`flag_included_on_maps` tinyint(1) NOT NULL DEFAULT '0',

`flag_included_in_bread` tinyint(1) DEFAULT NULL,

`flag_bread_post` tinyint(1) DEFAULT NULL,

`flag_relevant_img` tinyint(1) DEFAULT NULL,

`flag_relevant_post` tinyint(1) DEFAULT NULL,

`author_name` text,

`author_trip` text,

`author_hash` text,

`author_type` smallint(6) DEFAULT NULL,

`img_files` json DEFAULT NULL,

`link_list` json DEFAULT NULL,

`video_list` json DEFAULT NULL,

`editor_notes` text,

`tags` text,

`people` text,

`places` text,

`organizations` text,

`signatures` text,

`event_date` datetime DEFAULT NULL,

`report_date` datetime DEFAULT NULL,

`timeline_title` tinytext

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

ALTER TABLE `chan_posts`

ADD PRIMARY KEY (`post_key`),

ADD KEY `post_id` (`post_id`),

ADD KEY `thread_key` (`thread_key`),

ADD KEY `site_board` (`post_site`,`post_board`);

I'm considering making the database publicly available. I need to figure out how much space it will take up and whether it will fit within my current hosting plan. At present, I have over 880,000 posts in the database. The size of the database file for just this table without the images is 1.1GB. There's another GB of images, but that covers only a fraction of the posts: the Q posts, bread posts, and the context posts related to these.
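
For anyone trying to match the `post_key` / `thread_key` format from the column comments above ("site/board#post", number set to length 9 with "." fill), here is a small sketch. Which side the dots go on is not stated, so left-padding is an assumption, chosen because it makes plain string sorting follow numeric post order. The function name and the example site value are hypothetical.

```python
def make_key(site, board, number, width=9, fill="."):
    """Build a 'site/board#.....number' key; left-padding with dots is assumed."""
    key = f"{site}/{board}#{str(number).rjust(width, fill)}"
    if len(key) > 31:
        # The columns are varchar(31), so longer keys would not fit as defined.
        raise ValueError(f"key too long for varchar(31): {key!r}")
    return key


# Example with a post number that appears in this thread:
example = make_key("8ch", "qresearch", 544985)
```

With left-padding, "." sorts before every digit, so the key for post 99 compares lower than the key for post 100 even as text.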

▶Anonymous 03/05/18 (Mon) 05:21:53 07564d (94) No.554376

>>554309

I guess I should start uploading. I've got the unlimited plan. Anyone want to write the search feature for it? Preferred language is PHP.

▶Anonymous 03/05/18 (Mon) 07:06:02 98bd4e (20) No.555095>>560076

>>553109

For now it just uses a dedicated file system.

With images gathered so far this mirror's total size is 193 GiB.

▶Anonymous 03/05/18 (Mon) 23:27:38 dbb4a4 (36) No.560076>>560415 >>564762

>>555095

holey phuck. 193 GB. That's for a full archive of all breads + images? My local scrape of Q breads and posts as text only comes in at 6MB. My local QCodeFag install with text + Q images is just under 100MB.

193GB is getting unmanageable.

▶Anonymous 03/06/18 (Tue) 00:12:42 98bd4e (20) No.560415

>>560076

Yes, unoptimized and incomplete.

▶Anonymous 03/06/18 (Tue) 07:12:30 07564d (94) No.564762>>568187

>>560076

Not unmanageable. Just big. Maybe every thread needs its own directory for its images. And maybe the data needs to be moved to my other drive locally.

▶Anonymous 03/06/18 (Tue) 07:23:54 07564d (94) No.564862

I'm working on the export files now. I need to change the posts just a bit before I can make them public.

I promised that no links would go to 8ch and particularly qresearch, and also that I would redact mentions of them from the content. I already do this on my blog, but I simply broke the links rather than made them go somewhere else. To get the most out of the republishing of the posts, I need to convert the >> and >>> links so that they link to posts stored on my own site. This is probably better anyway since many posts and threads are now missing from their original locations.

▶Anonymous 03/06/18 (Tue) 17:07:09 dbb4a4 (36) No.568187>>568666 >>569170

>>564762

Yeah it's not totally unmanageable. It's more like moving a full grown oak tree. You can do it, but it's a huge pain in the ass. I was thinking more in terms of moving it around the internet or hosting. That's a pretty big db.

I rejiggered the ChanScraper to archive all the breads even if there isn't a Q post in that bread. It rendered 215 NEW complete breads and brought my net JSON file size from 6MB to 200MB. Starts around "Q Research General #358".

That's with no images, just the raw JSON from 8ch. Each bread is around 700KB.

▶Anonymous 03/06/18 (Tue) 17:44:53 98bd4e (20) No.568666>>568861 >>569061

>>568187

I did some research on collecting the CBTS threads from 4chan/pol the other night and the results might be useful for others. They can be found at the bottom of the page here:

https:// anonsw.github.io/qtmerge/catalog.html

It's still a work in progress.

▶Anonymous 03/06/18 (Tue) 18:06:09 dbb4a4 (36) No.568861

>>568666

>anonsw.github.io/qtmerge/catalog.html

I may be able to give you a list of all those links from the data I have from QCodeFag

▶Anonymous 03/06/18 (Tue) 18:22:03 dbb4a4 (36) No.569061

>>568666

>anonsw.github.io/qtmerge/catalog.html

nevermind looks like you got it covered. nice!

▶Anonymous 03/06/18 (Tue) 18:31:04 07564d (94) No.569170>>569329

>>568187

Yes, the breads are essential. I've got them going back all the way through 4chan stuff. The breads are how you connect in the answers. If you connect up the contexts, most of them link back to a Q post at some point. Then the context of that post that was linked into the bread can be associated with the Q post. That is what I was working on before I started looking at making my entire database available for research.

▶Anonymous 03/06/18 (Tue) 18:44:36 98bd4e (20) No.569329>>569793

>>569170

Were you able to capture any of the original 4chan JSON/HTML data? I wasn't researching Q at that time so I've relied on 4plebs.

▶Anonymous 03/06/18 (Tue) 19:11:36 d6b0f8 (36) No.569596

>>494745 (OP)

I have created a searchable application for /qresearch/.

The database is filling right now. I kept only the image attachments in order to save hard disk space.

At present 52,000 of the most recent posts on qresearch are loaded in the table with the attachments. We'll see how the storage works out.

I'll advise when anons can attempt to use the system.

▶Anonymous 03/06/18 (Tue) 19:33:16 07564d (94) No.569793>>570074

>>569329

I've got most of it, yes.

▶Anonymous 03/06/18 (Tue) 19:48:58 07564d (94) No.569900>>570074

>>554309

I don't know if y'all noticed, but I've got several columns in my database that are not part of the original data. Some of these are tagging fields: `tags`, `people`, `places`, `organizations`, and `signatures`. It would be difficult to automate the filling of these fields, but I don't want to entirely open up editing of these fields to anons, either, due to the potential of clown interference. There's no way I can fill all of them in myself. I have an idea to allow tags to be suggested and then allow up-voting and down-voting and coming up with acceptance criteria before giving them a permanent place in the data record. Or maybe just leave them in that form with their ratings.
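
A toy sketch of the suggest-then-vote idea, to make the "acceptance criteria" part concrete. Everything here is hypothetical: the class name, the thresholds (at least 5 votes and 80% up-votes) and the notion of a `voter_id` are illustrative assumptions, not a design anyone in the thread has committed to.

```python
from collections import defaultdict


class TagSuggestions:
    """Track up/down votes per (post_key, tag); one vote per voter_id."""

    def __init__(self, min_votes=5, min_ratio=0.8):
        self.min_votes = min_votes
        self.min_ratio = min_ratio
        self._votes = defaultdict(dict)  # (post_key, tag) -> {voter_id: +1 or -1}

    @staticmethod
    def _norm(tag):
        return tag.strip().lower()

    def vote(self, post_key, tag, voter_id, up=True):
        # A repeat vote from the same voter replaces the earlier one.
        self._votes[(post_key, self._norm(tag))][voter_id] = 1 if up else -1

    def score(self, post_key, tag):
        votes = self._votes.get((post_key, self._norm(tag)), {})
        ups = sum(1 for v in votes.values() if v > 0)
        return ups, len(votes) - ups

    def accepted(self, post_key, tag):
        ups, downs = self.score(post_key, tag)
        total = ups + downs
        return total >= self.min_votes and ups / total >= self.min_ratio
```

Keeping the raw votes around also covers the "just leave them in that form with their ratings" option, since `score` can be shown next to any suggested tag whether or not it has been accepted.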

▶Anonymous 03/06/18 (Tue) 20:10:14 98bd4e (20) No.570074>>570566 >>570766 >>570809

>>569793

Excellent. Will that raw JSON data be in the DB as well?

>>569900

I did notice, those are great ideas. Can I suggest letting each user have their own copy/edits of the metadata? The user-specific data could then feed back into the system for suggestions to others, etc. But primarily it gives the user some way to control the interference/noise.

▶Anonymous 03/06/18 (Tue) 21:12:37 dbb4a4 (36) No.570566>>570634

>>570074

What JSON are you looking for anon? Bread before 2/6/2018?

▶Anonymous 03/06/18 (Tue) 21:16:48 dbb4a4 (36) No.570604>>665010

I've rejiggered the ChanScraper to produce TwitterSmashed json. It includes any DJTweets within 60 mins of a Qpost. Here's what [5], [8], [10] deltas look like.

{

"DJTtwitterPosts": [

{

"accountId": "realDonaldTrump",

"accountName": "Donald J. Trump",

"tweetId": 944665687292817415,

"text": "How can FBI Deputy Director Andrew McCabe, the man in charge, along with leakin’ James Comey, of the Phony Hillary Clinton investigation (including her 33,000 illegally deleted emails) be given $700,000 for wife’s campaign by Clinton Puppets during investigation?",

"delta": 5,

"link": "https:// twitter.com/realDonaldTrump/status/944665687292817415",

"uniqueId": "00e6951d-5f49-455b-bdd9-bda7f184d9c7",

"time": 1514060825,

"_unixEpoch": "1970-01-01T00:00:00Z",

"postDate": "2017-12-23T15:27:05"

},

{

"accountId": "realDonaldTrump",

"accountName": "Donald J. Trump",

"tweetId": 944666448185692166,

"text": "FBI Deputy Director Andrew McCabe is racing the clock to retire with full benefits. 90 days to go?!!!",

"delta": 8,

"link": "https:// twitter.com/realDonaldTrump/status/944666448185692166",

"uniqueId": "92fbb1a2-169e-412c-abba-6e441d3acbaa",

"time": 1514061006,

"_unixEpoch": "1970-01-01T00:00:00Z",

"postDate": "2017-12-23T15:30:06"

},

{

"accountId": "realDonaldTrump",

"accountName": "Donald J. Trump",

"tweetId": 944667102312566784,

"text": "Wow, “FBI lawyer James Baker reassigned,” according to @FoxNews.",

"delta": 10,

"link": "https:// twitter.com/realDonaldTrump/status/944667102312566784",

"uniqueId": "eabb202f-3b59-48c9-b282-f0110b8388a5",

"time": 1514061162,

"_unixEpoch": "1970-01-01T00:00:00Z",

"postDate": "2017-12-23T15:32:42"

}

],

"no": 158078,

"name": "Q",

"trip": "!UW.yye1fxo",

"sub": null,

"com": null,

"text": "SEARCH crumbs: [#2]\nWho is #2?\nNo deals.\nQ\n",

"tim": null,

"fsize": 0,

"filename": null,

"ext": null,

"tn_h": 0,

"tn_w": 0,

"h": 0,

"w": 0,

"replies": 0,

"md5": null,

"last_modified": 0,

"source": "8chan_cbts",

"threadId": 157461,

"link": "https:// 8ch.net/cbts/res/157461.html#158078",

"imageLinks": [],

"references": [],

"uniqueId": "e22306cc-2831-453a-ae1d-16e90aa23707",

"time": 1514060541,

"_unixEpoch": "1970-01-01T00:00:00Z",

"postDate": "2017-12-23T15:22:21"

}
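
A minimal sketch of how the deltas in the sample above could be computed from the `time` fields. Rounding the gap to the nearest minute reproduces the 5, 8 and 10 shown (284, 465 and 621 seconds after the Q post). Counting only tweets that come after the Q post is an assumption, and the function name is hypothetical; the real TwitterSmash code may handle the window differently.

```python
def smash(q_post, tweets, window_minutes=60):
    """Attach tweets posted within window_minutes after the Q post, each with a minute delta."""
    matched = []
    for tweet in tweets:
        seconds = tweet["time"] - q_post["time"]
        if 0 <= seconds <= window_minutes * 60:
            # delta is the gap rounded to the nearest whole minute
            matched.append({**tweet, "delta": round(seconds / 60)})
    matched.sort(key=lambda t: t["time"])
    return {**q_post, "DJTtwitterPosts": matched}


# Epoch times taken from the sample record above.
q = {"no": 158078, "time": 1514060541}
djt = [
    {"tweetId": 944665687292817415, "time": 1514060825},
    {"tweetId": 944666448185692166, "time": 1514061006},
    {"tweetId": 944667102312566784, "time": 1514061162},
]
deltas = [t["delta"] for t in smash(q, djt)["DJTtwitterPosts"]]
```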

▶Anonymous 03/06/18 (Tue) 21:19:06 98bd4e (20) No.570634>>570660

>>570566

4chan JSON for pol between 2017-10-30 and 2017-12-01.

▶Anonymous 03/06/18 (Tue) 21:22:15 dbb4a4 (36) No.570660>>570874

>>570634

I'll keep my eyes peeled. Finding old JSON for those days is hard. Is 12-1 when you started archiving? Got bread json < 2-6-2018?

▶Anonymous 03/06/18 (Tue) 21:34:22 07564d (94) No.570766

>>570074

I could develop an export, I suppose. But that's low on my list of priorities at the moment. The data structure is above in the list. Minor alteration needed: My host does not support JSON fields. Substitute TEXT, and you should be good. If you want to write an exporter, I can review it and include it.

But I still don't have the data up there yet. I'm working on the alterations to the data needed to keep everything on site at the host.

▶Anonymous 03/06/18 (Tue) 21:38:58 07564d (94) No.570809>>570944

>>570074

I was thinking of attaching the IP address to each suggestion to keep the up-votes and down-votes honest. Is that enough? Or maybe even too much? The other thing I could do is perhaps tie in the WordPress login system, since it's there anyway. It might take a bit of time for me to figure out how to limit permissions.

▶Anonymous 03/06/18 (Tue) 21:44:00 98bd4e (20) No.570874>>570983

>>570660

Thanks, 4plebs is good for now, but a second witness is preferable. Started archiving Feb 15, but some old data was still available at the time.

For 8ch these are the oldest breads I have:

pol: 10509790 (2017-08-28)

cbts: 10 (2017-11-21)

thestorm: 1 (2018-01-31)

I don't have all breads after though, it is incomplete.

I've since stopped archiving pol/cbts/thestorm to save time/space.

▶Anonymous 03/06/18 (Tue) 21:49:15 98bd4e (20) No.570944>>575021

>>570809

Not enough due to VPNs, DHCP, etc. The login may be the best way.

▶Anonymous 03/06/18 (Tue) 21:53:03 dbb4a4 (36) No.570983>>571020

>>570874

I think you and I started archiving those about the same time. I've got complete json breads from 2/6/2018 to now, if you want any of that.

▶Anonymous 03/06/18 (Tue) 21:55:47 98bd4e (20) No.571020>>571161

>>570983

I might already have it, is it in the QJsonArchive.zip from earlier?

▶Anonymous 03/06/18 (Tue) 22:07:21 dbb4a4 (36) No.571161>>571229

>>571020

Ya - you probably have the breads from the last few days eh?

▶Anonymous 03/06/18 (Tue) 22:13:44 98bd4e (20) No.571229>>571310

>>571161

I do, I'll call your dataset QCodeFagNet unless you want a different name. Instead of the zip I'll pull it from your github.

▶Anonymous 03/06/18 (Tue) 22:22:14 dbb4a4 (36) No.571310>>603568

>>571229

Sounds fine. I'll try to keep it updated.

▶Anonymous 03/07/18 (Wed) 04:46:14 07564d (94) No.575021

>>570944

Logins require email addresses. I guess it's always a choice whether to participate.

▶Anonymous 03/08/18 (Thu) 01:05:33 d6b0f8 (36) No.583035>>596604

>>494745 (OP)

Q Research General - searchable archive breads 716-477 presently online.

www.pavuk.com

username qanon

password qanon

updates as I find them

▶Anonymous 03/09/18 (Fri) 03:55:21 70e498 (69) No.596604

>>583035

Looking good

▶Anonymous 03/09/18 (Fri) 06:30:49 6a9543 (2) No.598094>>602595

File (hide): df457dd3420fb52⋯.jpg (101.52 KB, 500x522, 250:261, 1487336933873.jpg) (h) (u)

There's so much content being produced now that it should be compiled into a wiki in a dedicated thread. The other threads investigate and make the content; this one adds the best content into one big archive, updated in real time ofc bc they never stop, why should we. pic related.

BUT WHY

To take Q's work to the next level we have to increase the public's basic awareness of the criminality being exposed, investigated, and terminated, by an order of magnitude. That order of magnitude is pretty normal people.

>be a normal person

>want to do the right thing but get a link to this Q thing and there's too much complex and """scary""" info what with muh job and family and everything else

>the big load of content is overwhelming and i don't know where to begin and have it be easy

<make 1 entry point to begin browsing the entire body of accepted content

<terse organization keeps it brief and saves the details for a leaf page a click away, as deep as is necessary

<keep source of body of accepted content continuously up to date

<using https for minimal integrity protection

>now i can begin a review of the evidence contained in the case file archive with a single click! jeff bozos eat your heart out

>and look at short well-organized and sourced text, and pictures, and the odd video

>and easily get a run down on whatever topics i browse my way upon

>and now even though my eyes have been opened in a pretty dramatic way, it was easy to use and i know it'll be easy to share, to the topic level

▶Anonymous 03/09/18 (Fri) 18:31:20 70e498 (69) No.602595>>603402

>>598094

I hear you anon.

The key is the content. We have the ability to archive threads/qposts. Posts that Q references. Tweets. Known tripcodes/twitter accounts.

What is the source of all the evidence? The dedicated research threads? Notables? In order for it to be automagic, there needs to be a reliable single source here on 8ch. None of the codefag work I've seen reaches a level of what could be called AI - or the ability to discern which anon has posted a certifiable answer/evidence.

Non automated means anonomated, but that causes its own set of issues.

I agree a wikipedia style thing would be good because it's familiar, but populating it with data may be an issue. Some of it's going to have to be entered in manually.

If all you are looking for is a location for an anon wiki, I think that's pretty easy.

▶Anonymous 03/09/18 (Fri) 20:14:41 6a9543 (2) No.603402

>>602595

No, not automated, curated.

▶Anonymous 03/09/18 (Fri) 20:37:31 98bd4e (20) No.603568>>612945

>>571310

Should I hit _allQPosts.json?

▶Anonymous 03/10/18 (Sat) 00:00:01 07564d (94) No.605608>>605926

I'm stuck. I'm working on getting that database up for you, but I have to make some modifications to the `post_text` field so that those links don't come here to 8ch. (I promised that I wouldn't do that.) I'm trying to fix the `post_text` field so that the >> links refer back into the database, but I'm not familiar enough with the DOMDocument and related classes in PHP. Are there any good tutorials out there on how to do advanced manipulation of HTML using these classes? The reference manual stuff just isn't doing it for me.

▶Anonymous 03/10/18 (Sat) 00:26:13 07564d (94) No.605926

>>605608

I should clarify something. Not only am I going to make the existing links self-reference, but I'm also going to revive those dead >> links and point them back into the database. I've got many of the deleted threads in my database, too, and I can make those available.

▶Anonymous 03/10/18 (Sat) 17:47:01 70e498 (69) No.612945>>613236

>>603568

Ya that's fine. I'm going to update that today to cover the latest.

I've been working on a new local viewer that uses the twitter smashed data. It shows the delta + alt text of the tweet + a link to the tweet. I've noticed that a lot of the image links I have are currently broken. I was thinking I'd just update those to point to one of the other QCodeFag branch archives rather than try and archive all the images as well.

Expect an update on GitHub later

▶Anonymous 03/10/18 (Sat) 18:05:08 70e498 (69) No.613236

File (hide): f9d167645faf34b⋯.png (129.03 KB, 830x723, 830:723, ClipboardImage.png) (h) (u)

>>612945

Here's what it looks like. Just trying to finish off a sort idea and clean data.

▶Anonymous 03/10/18 (Sat) 18:26:47 07564d (94) No.613641>>613892

Good news! I've got the code working which makes the post links compliant and refer back into the database. Almost as soon as I posted the request, it came to me that I was making things more complicated than they needed to be and a better algorithm came to mind. The algorithm is so good that in cases where good posts didn't link in 8ch, they will be linked on my site. That includes links such as the one Q pasted into the middle of a word the other day or when they are consecutive with or without comma or white space. Anywhere there is a >> followed by a bunch of digits, a link should be created. The only exception is where the post number of the link is greater than the post number of the current post. This type of error was encountered in early posts after the transition from one board to another. Anyway, I'm going to run a few more quick tests, and then I should be uploading to my host within a few hours. I still don't have code ready to search it, though.
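
The rule described above (any >> followed by digits becomes a link, unless the target number is greater than the current post's number) can be sketched with a single regular expression. This is an illustration, not the poster's PHP code: it works on plain text rather than a DOM, and `relink`, `url_for` and the `/post/N` URL shape are hypothetical names.

```python
import html
import re

# Match an escaped ">>" followed by digits, but not when a third ">" precedes it,
# so cross-board style ">>>" references are left alone.
POST_REF = re.compile(r"(?<!&gt;)&gt;&gt;(\d+)")


def relink(plain_text, current_no, url_for):
    """Return an HTML fragment where every valid >>12345 points at url_for(12345)."""
    escaped = html.escape(plain_text)

    def replace(match):
        target = int(match.group(1))
        if target > current_no:
            # A post cannot refer to a later post on the same board: leave it as text.
            return match.group(0)
        href = html.escape(url_for(target))
        return f'<a href="{href}">&gt;&gt;{match.group(1)}</a>'

    return POST_REF.sub(replace, escaped)
```

Because the pattern does not care what comes before or after the reference, it also catches references pasted into the middle of a word or run together with commas, which is the behaviour described above.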

▶Anonymous 03/10/18 (Sat) 18:49:44 70e498 (69) No.613892>>618146

>>613641

When you get that worked out make sure to let us know. I've been wondering about that myself. The early halfchan no's are pretty big. I've found some bugs in my code around there being multiple references per Q post. It does happen on occasion and my scraper isn't catching them all.

I've just uploaded a bunch of json data to the GitHub repo (https:// github.com/QCodeFagNet/SFW.ChanScraper/tree/master/JSON). The json folder is what's generated when you run the ChanScraper, the smash folder when you run the TwitterSmash. Each of those folders has a Viewer.html file that can be used with just the _allQPosts.json or _allSmashPosts.json.

Like I said I need to clean up some dead image links for everything to be working right.

▶Anonymous 03/10/18 (Sat) 23:07:49 07564d (94) No.618146>>620330

>>613892

You MIGHT be able to get thumbnails from archives, but you won't get full size images there, for the most part.

▶Anonymous 03/11/18 (Sun) 01:09:19 70e498 (69) No.620330>>622768

>>618146

Ya think it's bad form to go lazy and link em to one of the qcodefag archives?

▶Anonymous 03/11/18 (Sun) 03:15:33 07564d (94) No.622768>>622903

>>620330

Part of making those offline archives is storing the items. Plus, don't assume any platform is forever. There are too many clowns out there who don't want anyone to see this stuff.

So now I've got a bunch of export files of my database ready to upload. Next challenge: Automating the import on the hose.

▶Anonymous 03/11/18 (Sun) 03:23:45 07564d (94) No.622903

>>622768

>import on the hose.

Do clowns alter typing?

▶Anonymous 03/11/18 (Sun) 06:16:03 07564d (94) No.625024>>632885

The table of posts has been added to the database. It's all up there. (All I have, anyway.) I need to get a way to make searches available to you now.

▶Anonymous 03/11/18 (Sun) 22:53:06 70e498 (69) No.632885>>648528 >>648594

>>625024

So you have all the breads searchable as well?

▶Anonymous 03/13/18 (Tue) 05:55:03 07564d (94) No.648528

>>632885

Everything is searchable. The database includes all posts I could find. I'm working on the search front end right now.

▶Anonymous 03/13/18 (Tue) 06:00:57 07564d (94) No.648594>>649300 >>649479 >>650810 >>663255

File (hide): 6b4b674ff054cee⋯.png (82.26 KB, 1231x1217, 1231:1217, Q-Research-Tool.png) (h) (u)

>>632885

This is what the front end looks like right now. I'm working now on turning that into a SQL statement that can search the database. I'm only an hour or two from putting this online.

▶Anonymous 03/13/18 (Tue) 07:41:42 07564d (94) No.649300>>650810 >>663255

>>648594

It's up there. The paging isn't working yet, so don't anyone complain about that. I'll fix it in the morning. I also discovered that a key range of posts didn't import properly. I'll fix that in the morning, too. For now, I've set the posts per page to 2000, which may cause timeouts, but it will allow people to play with things a bit.

http:// q-questions.info/research-tool.php

▶Anonymous 03/13/18 (Tue) 08:20:31 cc8139 (1) No.649479

>>648594

ANON, great work.

▶Anonymous 03/13/18 (Tue) 13:27:59 70e498 (69) No.650810>>652644

>>648594

HOLEY FUCK YES.

This crosses all breads? If so then this is exactly what we need. I can help you with the SQL if you need it.

SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15. You should also specify an ORDER BY so the rows come back in the same order on every page.
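
For the paging itself, a small runnable sketch. It uses Python's built-in sqlite3 as a stand-in for the MySQL host, so the query is written as `LIMIT count OFFSET offset`, which means the same thing as MySQL's `LIMIT offset, count`. The table and column names are borrowed from the `chan_posts` definition posted earlier in the thread; the function names are hypothetical.

```python
import sqlite3


def fetch_page(conn, page, per_page):
    """Return one page of posts; page numbers start at 1."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    offset = (page - 1) * per_page
    # ORDER BY keeps the row order stable from one page to the next.
    cur = conn.execute(
        "SELECT post_id, post_text FROM chan_posts "
        "ORDER BY post_id LIMIT ? OFFSET ?",
        (per_page, offset),
    )
    return cur.fetchall()


def demo_connection(rows=25):
    """Build a throwaway in-memory table with two of the thread's column names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chan_posts (post_id INTEGER PRIMARY KEY, post_text TEXT)")
    conn.executemany(
        "INSERT INTO chan_posts VALUES (?, ?)",
        [(n, f"post {n}") for n in range(1, rows + 1)],
    )
    return conn
```

With 25 demo rows and 10 per page, page 3 holds the last five rows and page 4 comes back empty, which is a simple signal for the front end to stop offering a "next" link.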

>>649300

How are you getting the breads? Maybe I can work out a way to get you those. Combine up somehow

▶Anonymous 03/13/18 (Tue) 14:58:54 70e498 (69) No.651528>>663255

>>509646

I've been thinking about this. Preliminary research shows that elasticsearch and lucene would probably be the best match for what we've got. There are a lot of tools that pile into elasticsearch. Any hostfags here with the ability to set up an elasticsearch node?

The data is big. Tons of images. A proper archive takes space. I'm holding @546 complete breads and with no images it's 250MB+. That's for like a month. By the end of the year the bread collection alone is going to be over 1.5GB.

The images I've got so far are around 100MB, but that's just from the Q posts - and even then I know I'm missing some.

Econ Godaddy hosting is like $45 a year. I'm thinking about just putting the chanscraper/twittersmash online, then write some simple apis. Get thread#, filteredThread, qpost# that kind of thing. Useful or no?

▶Anonymous 03/13/18 (Tue) 16:51:45 07564d (94) No.652644>>654567

>>650810

My algorithm for getting breads is this:

1. Get the author_hash for the first post in a thread.

2. Mark the first posts in the thread that match that author_hash until the author hash doesn't match.

If someone jumps in before the baker is done, oh well. But that shouldn't be much of a problem because the breads get repeated a lot. I can mark posts as bread later, if need be.
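A minimal sketch of the two steps above, assuming each post is a dict with an `author_hash` field (the dict layout and the function name `mark_bread` are assumptions, not the actual code behind the tool):

```python
def mark_bread(posts):
    """Flag the leading run of posts written by the thread's first author.

    Returns one boolean per post: True while the author_hash still matches
    the first post's hash, False from the first mismatch onward.
    """
    if not posts:
        return []
    baker = posts[0]["author_hash"]
    flags = []
    still_bread = True
    for post in posts:
        if post["author_hash"] != baker:
            still_bread = False  # someone else posted; the bread run is over
        flags.append(still_bread)
    return flags

# Hypothetical thread: baker posts twice, another anon jumps in, baker posts again.
thread = [
    {"author_hash": "07564d"},
    {"author_hash": "07564d"},
    {"author_hash": "70e498"},
    {"author_hash": "07564d"},
]
print(mark_bread(thread))
```

The baker's later post is not marked, which matches the "oh well" case described above; those can be flagged by hand afterward.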

▶Anonymous 03/13/18 (Tue) 20:07:00 70e498 (69) No.654567>>654852 >>654901

>>652644

Hmm… When I say bread I mean a full Q Research thread. Like this

https:// github.com/QCodeFagNet/SFW.ChanScraper/blob/master/JSON/json/8ch/archive/651280_archive.json

That's the straight bread/thread from 8ch. It includes all the responses whether the BV posted it or not.

I'm finding those by getting the full catalog from

https:// 8ch.net/qresearch/catalog.json, finding the breads/threads that have q research, q general etc in them, and then getting the json for that thread only from https:// 8ch.net/qresearch/res/651280.json
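Roughly, that catalog walk could look like the sketch below. It assumes the catalog is a list of pages, each with a `threads` list whose entries carry `no`, `sub` and `com` fields; the keyword list and function names are illustrative and this is not the actual ChanScraper code.

```python
import json
from urllib.request import urlopen

# Assumed keywords; matched case-insensitively against subject and comment.
KEYWORDS = ("q research", "q general")

def find_bread_threads(catalog, keywords=KEYWORDS):
    """Return thread numbers whose subject or comment mentions a keyword."""
    hits = []
    for page in catalog:
        for thread in page.get("threads", []):
            text = f"{thread.get('sub') or ''} {thread.get('com') or ''}".lower()
            if any(word in text for word in keywords):
                hits.append(thread["no"])
    return hits

def thread_url(board, thread_no):
    """Build the per-thread JSON URL in the pattern quoted above."""
    return f"https://8ch.net/{board}/res/{thread_no}.json"

def fetch_json(url):
    """Download and decode one JSON document (network access required)."""
    with urlopen(url) as resp:
        return json.load(resp)
```

Usage would be something like `find_bread_threads(fetch_json(catalog_url))`, then `fetch_json(thread_url("qresearch", no))` for each hit.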

I think I see what you are doing - going thru and trying to mark the relevant posts?

▶Anonymous 03/13/18 (Tue) 20:41:29 07564d (94) No.654852

>>654567

I haven't even looked at that.

Paging is fixed, plus I gave you a couple other search parameters.

I'm still working on the import issue, but I at least have put the posts I initially identified as missing up there.

▶Anonymous 03/13/18 (Tue) 20:46:56 07564d (94) No.654901

>>654567

>I think I see what you are doing - going thru and trying to mark the relevant posts?

Yes. Most of it is done automatically. Since I save the marks in the post records, I can go back in there and adjust it, if necessary.

▶Anonymous 03/14/18 (Wed) 16:42:48 5f9a22 (5) No.663255>>664801

>>651528

>Useful or no?

I'm not the guy to ask. The discussions here went over my head immediately. Looks like there's some serious progress being made here:

>>648594

>>649300

One question I have for contributors here is when there is a consensus that you have created a viable search tool, how will you manage promulgation? Do it like a war room announcement on qresearch?

As many have noted, the search tool has to be hardened against tampering before release. Clowns/shills are devious and destructive.

▶Anonymous 03/14/18 (Wed) 19:36:47 70e498 (69) No.664801>>664904

>>663255

I agree on shill proofing.

I've been playing around with a webAPI. I've got it working nice with all the q posts, looking for a specific post# like #929, and posts on a day. Returns json or xml. This is the Crumb Archive.

My plan is to expand that so that the archived breads can be accessed as well - each as a single json file. This is the Bread Archive.

I'm going to set it up where it's an autonomous machine. It will scrape and archive automagically moving forward from the current baseline. No delete. No put. No fuckery.

I'm pretty sure it would work with the QCodeFag scraper repos.

The bread archive is pretty big. I'm sure there's no way I can archive images for all the breads. An image archive isn't what I've been focused on. The focus of this is only making the json/xml available from the chanscraper.

Once I can get the breads all up and being served automagically my plan is to set up an elasticsearch node and suck all the breads in.

I figure a year of godaddy hosting is currently $12 with unmetered bandwidth. I'll throw in.

▶Anonymous 03/14/18 (Wed) 19:47:34 07564d (94) No.664904

>>664801

Yes, I'm concerned about that, too.

Perhaps it helps that this data does not reside only there?

In this case, it would take me about half a day to get it all up there again, if need be.

▶Searchability Anonymous 03/14/18 (Wed) 19:51:13 d6b0f8 (36) No.664928

>>494745 (OP)

Searchable Qresearch

www.pavuk.com

username: qanon

password: qanon

Updated regularly with the messages and images from Qresearch general.

▶Anonymous 03/14/18 (Wed) 19:54:08 07564d (94) No.664960>>664984

I'm beginning to wonder if I'm up against some kind of limit on my remote host. I just tried importing into it again, and I'm still missing some posts.

Remote host: 1,010,127 records

Local machine: 1,049,610 records

▶Anonymous 03/14/18 (Wed) 19:56:59 d6b0f8 (36) No.664975

I'm using the 8chan JSON API endpoints. I still need to pull from the archive.json file downloaded yesterday.

My server is on a linode so I have fast response time.

▶Anonymous 03/14/18 (Wed) 19:58:28 07564d (94) No.664984>>667648

>>664960

Maybe I can split the table into 4chan and NewChan (my name for 8ch, since we can't link back to here) and see if they all go up.

▶Anonymous 03/14/18 (Wed) 19:58:44 d6b0f8 (36) No.664990>>666471

You can search the text in the posts with wildcards. Say you want all posts with the word BOOM. Just enter *boom*.

Say you want the posts from Q with his tripcode and "boom"

Put !UW.yye1fxo in the trip code.

put *boom* in the comment

Click search button

voila.
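To show what that kind of wildcard match does, here is a small Python sketch using the standard `fnmatch` module. It is only an illustration of the matching idea, not how Pavuk runs its queries; the post dict fields `trip` and `comment` are assumed names.

```python
from fnmatch import fnmatchcase

def matches(post, trip=None, comment=None):
    """True if the post has the exact trip code and its comment fits the wildcard."""
    if trip is not None and post.get("trip") != trip:
        return False
    if comment is not None:
        text = (post.get("comment") or "").lower()
        # Lowercase both sides so *boom* also finds BOOM.
        if not fnmatchcase(text, comment.lower()):
            return False
    return True

# Hypothetical sample posts.
posts = [
    {"trip": "!UW.yye1fxo", "comment": "BOOM. BOOM."},
    {"trip": None, "comment": "boom"},
    {"trip": "!UW.yye1fxo", "comment": "quiet"},
]
hits = [p for p in posts if matches(p, trip="!UW.yye1fxo", comment="*boom*")]
print(len(hits))
```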

▶Anonymous 03/14/18 (Wed) 19:59:24 d6b0f8 (36) No.664998

Has anyone found a way to go back past the 25 pages in the catalog.json?

▶Anonymous 03/14/18 (Wed) 20:00:38 d6b0f8 (36) No.665010>>667886

>>570604

Can I access this? I'd like to add the DJT tweets into the database. Twitter is wanting more and more data before they give me an API key.

▶Anonymous 03/14/18 (Wed) 20:02:15 d6b0f8 (36) No.665018

File (hide): c73b9e4f4cc2fd3⋯.png (247.29 KB, 1944x1226, 972:613, ClipboardImage.png) (h) (u)

▶Anonymous 03/14/18 (Wed) 20:03:38 d6b0f8 (36) No.665029

File (hide): f1fe9abb9325de7⋯.png (297.77 KB, 1941x1224, 647:408, ClipboardImage.png) (h) (u)

▶Anonymous 03/14/18 (Wed) 20:06:06 d6b0f8 (36) No.665054

File (hide): 04c25cb69211184⋯.png (16.81 KB, 480x165, 32:11, ClipboardImage.png) (h) (u)

U.POSTS.NEW is the new-format table.

U.POSTS.NEW.ATT is the table of attachments for the primary table. Each one is a link to a binary

▶Anonymous 03/14/18 (Wed) 22:51:34 5f9a22 (5) No.666471>>666959

>>664990

Wow, awesome job! I knew it could be done. I'm going to need some help getting started. Could you put a qsearch for dummies tutorial together?

Did you have to create, or did this create a chronological list of all Q related threads and their titles if any? (/pol/cbts/CBTS(8ch)/The Storm/ qresearch)?

That might be a good Mnemonic to speed searches.

▶Anonymous 03/14/18 (Wed) 23:36:03 ee0f91 (1) No.666874

how about:

-archive threads as they go

-convert to text files, with links to posts

-txt files are easily searchable

▶Anonymous 03/14/18 (Wed) 23:45:14 d6b0f8 (36) No.666959

>>666471

I've not been back into this thread for a while. I'm running the qresearch import process to get up-to-date. One technique that is needed is to re-scan already imported threads for posts missed during initial scans.

Threads are imported from the catalog.json file. In this state, we know the thread number and the number of messages at that time. The only time we know a thread is closed is when the number of posts >= the number in the official "bake" count.

Therefore, my program keeps testing until the posts counter >= the bake counter and then marks the thread as complete in the thread table. This then prevents re-scanning all threads because we get only the open ones.

Multiple scans of posts are needed to get all of them and to deal with duplicate threads.
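A rough sketch of that completion check in Python. The field names `posts_stored`, `bake_count`, `complete` and `thread_no` are placeholders picked for the example, not the real thread table columns.

```python
def threads_to_rescan(threads):
    """Mark threads complete once stored posts reach the bake count.

    Returns the thread numbers that are still open and need another scan.
    """
    open_threads = []
    for t in threads:
        if t["posts_stored"] >= t["bake_count"]:
            t["complete"] = True  # never re-scanned after this point
        if not t.get("complete"):
            open_threads.append(t["thread_no"])
    return open_threads

# Hypothetical thread table rows.
threads = [
    {"thread_no": 651280, "posts_stored": 751, "bake_count": 751},
    {"thread_no": 654000, "posts_stored": 320, "bake_count": 751},
]
print(threads_to_rescan(threads))
```

Each import pass then only touches the threads this returns, so finished breads are not pulled again.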

I use the 8-chan post number as part of the primary key to the threads and posts tables.

8GA_1 is 8chan Great Awakening post 1

8QR_655000 is 8chan Qresearch post 655000
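In Python terms the key scheme could be sketched like this. Only the two prefixes shown above are included, and the board-name mapping is an assumption.

```python
# Assumed mapping from board name to key prefix (only the two examples given).
BOARD_PREFIXES = {"greatawakening": "8GA", "qresearch": "8QR"}

def make_key(board, post_no):
    """Build a key such as 8QR_655000 from a board name and post number."""
    return f"{BOARD_PREFIXES[board]}_{post_no}"

def split_key(key):
    """Split a key back into (prefix, post number)."""
    prefix, _, number = key.partition("_")
    return prefix, int(number)

print(make_key("qresearch", 655000), split_key("8GA_1"))
```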

The big problem is going back to find threads BEFORE the last 25 pages in the catalog.json. Therefore, I can't get anything earlier than when I first wrote the import.

▶Timestamps Anonymous 03/14/18 (Wed) 23:47:38 d6b0f8 (36) No.666983>>667927

The import routine uses the JSON API endpoint from the boards. In the JSON is the Unix timestamp of the message. This is a native field/object type in Pavuk. Thus all timestamps are set to UTC internally.
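For anyone doing the same thing outside Pavuk, a minimal Python sketch of the conversion. The sample value is just an illustrative timestamp, not taken from any particular post record.

```python
from datetime import datetime, timezone

def to_utc(unix_seconds):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

def to_unix(dt):
    """Convert a timezone-aware datetime back to whole Unix seconds."""
    return int(dt.timestamp())

stamp = 1521071258  # illustrative value
print(to_utc(stamp).isoformat())
```

Keeping everything as Unix time internally means chan posts and tweets can be sorted on one field without caring where they came from.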

NOW, if I could get DJT's Twitter feed in JSON, it also has UnixTime and this goes in directly.

Twitter wants me to give them all sorts of documentation before they will allow me to use their API. Frankly, I don't have the time to deal with them or the inclination.

▶Other boards Anonymous 03/14/18 (Wed) 23:48:23 d6b0f8 (36) No.666995>>667971 >>668375 >>670221 >>670322

I can get other boards provided the endpoints are similar and that the catalog.json file still has links to the threads.

BO has never responded to my requests on how to get older threads.

▶Searching with Pavuk Anonymous 03/14/18 (Wed) 23:50:44 d6b0f8 (36) No.667022

Super simple.

Entry forms are also search forms.

Enter the data that you wish to match.

Click the search button.

Pavuk creates and then executes the appropriate query and returns the items in a Kendo grid. Scroll, resort, export to excel or click on a row to return to the entry form with your data.

*searching on timestamps has issues that i need to resolve*

▶More work on the system tomorrow Anonymous 03/14/18 (Wed) 23:51:13 d6b0f8 (36) No.667026

I'm done for the day.

▶Table of boards and threads Anonymous 03/14/18 (Wed) 23:52:45 d6b0f8 (36) No.667043

File (hide): bc982c3469bafd6⋯.png (109.82 KB, 781x448, 781:448, ClipboardImage.png) (h) (u)

▶Threads Table Anonymous 03/14/18 (Wed) 23:53:59 d6b0f8 (36) No.667053

File (hide): f6665239b502443⋯.png (394.1 KB, 1393x926, 1393:926, ClipboardImage.png) (h) (u)

▶COMMENTS scrubbed with Lynx Anonymous 03/14/18 (Wed) 23:57:08 d6b0f8 (36) No.667075>>667776

The comments from the JSON API include markup and JS to go to real links. This is a problem with the storage and search. I pipe the comment string through Lynx with the -dump option and this gives me clean text in STDOUT and then a separator and then the list of actual links. I put the text in the comments and the links in a multivalue table. I'll expose the links tomorrow as a separate tab in the entry form.
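A sketch of splitting that output in Python, assuming the dump text has already been captured from Lynx and that it ends with the usual "References" section of numbered links. The function name is made up, and this is not the actual Pavuk import code.

```python
import re

def split_lynx_dump(dump):
    """Split `lynx -dump` output into (clean text, list of link URLs)."""
    head, sep, tail = dump.rpartition("\nReferences\n")
    if not sep:
        head, tail = dump, ""  # no links section present
    # Drop the [n] link markers Lynx leaves in the body text.
    clean = re.sub(r"\[\d+\]", "", head).strip()
    # Each reference line looks like "   1. https://example.com/".
    links = re.findall(r"^\s*\d+\.\s+(\S+)", tail, flags=re.MULTILINE)
    return clean, links

# Hypothetical dump of one comment.
sample = (
    "   See [1]this thread and [2]that one.\n"
    "\n"
    "References\n"
    "\n"
    "   1. https://8ch.net/qresearch/res/651280.html\n"
    "   2. https://archive.is/Pvbqq\n"
)
text, links = split_lynx_dump(sample)
print(text)
print(links)
```

The text would go in the comment field and each URL becomes one value in the links table.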

▶Anonymous 03/15/18 (Thu) 01:10:10 70e498 (69) No.667648>>670142

>>664984

What about 100k transactional batches?
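Something like this, sketched in Python against an in-memory SQLite table so it runs anywhere. The table `posts`, its columns, and `insert_in_batches` are placeholders; a MySQL driver would follow the same commit-per-batch pattern.

```python
import sqlite3
from itertools import islice

def insert_in_batches(conn, rows, batch_size=100_000):
    """Insert rows in fixed-size batches, one transaction per batch."""
    it = iter(rows)
    batches = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        with conn:  # commits on success, rolls back this batch on error
            conn.executemany(
                "INSERT INTO posts (post_key, body) VALUES (?, ?)", batch
            )
        batches += 1
    return batches

# Hypothetical demo: 250 rows in batches of 100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_key TEXT PRIMARY KEY, body TEXT)")
rows = ((f"8QR_{n}", f"post {n}") for n in range(250))
batches = insert_in_batches(conn, rows, batch_size=100)
print(batches)
```

If a batch fails, only that batch rolls back, so you know which slice of records to look at.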

▶Anonymous 03/15/18 (Thu) 01:20:38 5f9a22 (5) No.667776

>>667075

Jesus Einstein, give us a starting point to keep up.

▶Anonymous 03/15/18 (Thu) 01:30:49 70e498 (69) No.667886

>>665010

Yeah man hit it. I've got a github here you can browse around.

https:// github.com/QCodeFagNet/SFW.ChanScraper/tree/master/JSON

json/8ch has the filtered/unfiltered bread and archives in it. smash has the twittersmashed posts. I've been getting my twitter data from http:// www.trumptwitterarchive.com/data/realdonaldtrump/2017.json, 2018.json

I set up a test for the webAPI twittersmashed posts here https:// qcodefagnet.github.io/SmashViewer/index.html

I'm getting close on having the webAPI thing finished up. Just running some more tests and then I should be ready to go.

▶Anonymous 03/15/18 (Thu) 01:33:52 70e498 (69) No.667927

>>666983

Yeah you could mebbe use the smashed json from me. I've already done the unix timestamp on the trump tweets. All 8ch posts and Twitter posts derive from the same Post base object with the unix timestamp built in.

▶Anonymous 03/15/18 (Thu) 01:37:27 70e498 (69) No.667971

>>666995

I think that's because you can't really get them. There is an 8ch beta archive here, but all the Q Research threads disappeared shortly after we started archiving them. Even then, those archives are straight HTML. It's of no use to me. AFAIK, once it slides off the main catalog, it's pretty much gone. Some trial and error got me a few breads, but not many.

▶Anonymous 03/15/18 (Thu) 02:08:38 5f9a22 (5) No.668375>>668958

>>666995

>BO has never responded

I'm not the board owner, just some schmuck who started a thread he thought was being overlooked. You folks are so far out of my ballpark all I can do is try to keep it inside the curbs of what my original intent was.

I'd like to see a list/catalogue/file of all Q "related" posts.

Aaand I'd like to see a list of post Q "related" posts across all platforms/threads made searchable. Plenty of focus on Q, we need the early digging and free association.

▶Anonymous 03/15/18 (Thu) 02:54:51 70e498 (69) No.668958>>670158 >>670617

>>668375

Interesting concept you have anon. You want to be able to search across ALL 8ch? Not just Q Research? By platforms are you talking 4ch/8ch? or 4ch/8ch/twitter/reddit/facebook…?

▶Anonymous 03/15/18 (Thu) 04:56:07 07564d (94) No.670142>>670304 >>670317

>>667648

The first time I uploaded, I batched them in by 1000.

The second time, I batched them in by thread. I'm not sure how well the LIMIT clause on the SQL works.

In any case, I may have a problem on both computers. I could have sworn I had over 1.1 million records the other night. (Not to worry. I still have all of the source.) The solution may be to partition the table. I won't have to rewrite any code, but it'll chunk the table's file down into smaller sections.

This should be interesting. I've never had to partition a table before. Apparently, newer versions of MySQL do it automatically. But until then, it's gotta be done.

▶Anonymous 03/15/18 (Thu) 04:57:19 07564d (94) No.670158>>670289

>>668958

Mine has 4chan, too.

▶Anonymous 03/15/18 (Thu) 05:06:02 07564d (94) No.670221>>670279

>>666995

If threads are missing, you have to look in archive.org/web or archive.is. Of the two, archive.org/web is better for scraping because the HTML code is about as close to the original as they can make it. I can actually use the same scraper program on it.

Since the stuff that is on archive.is is so different from the original, I will need to write a new scraper for those. On several occasions, the post was important enough that I rebuilt it by hand.

With either archive, you need to know the URL, which can be tricky sometimes. Just having the post number won't do it. You must know the thread as well.

Just thought of something: When I get threads from these archive sites, what time zone do they show? I believe my stuff is saving to GMT when I save a post directly from a chan site. I'm not sure what I'm saving when I get posts from these archives.

▶Anonymous 03/15/18 (Thu) 05:13:54 70e498 (69) No.670279

>>670221

I would think the time is relative to the archive home timezone. That is, unless archive.x has done some wizardry to change the time zone it's pulling at to be the time zone of the user requesting the original archive. That would be more problematic - but you could still deal. It should be marked what time zone and then you convert into the unix timestamp.
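A minimal Python sketch of that conversion, assuming you already know the archive's UTC offset and that the displayed time looks like "03/14/18 23:47:38" (both the format string and the `utc_offset_hours` parameter are assumptions for the example):

```python
from datetime import datetime, timedelta, timezone

def archive_time_to_unix(text, utc_offset_hours):
    """Turn a displayed archive time plus its known UTC offset into Unix seconds."""
    local = datetime.strptime(text, "%m/%d/%y %H:%M:%S")
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return int(aware.timestamp())

# Same wall-clock text read as UTC and as UTC-4 gives stamps 4 hours apart.
print(archive_time_to_unix("03/14/18 23:47:38", 0))
print(archive_time_to_unix("03/14/18 23:47:38", -4))
```

A fixed offset ignores daylight saving changes, so a real importer would want proper time zone data for archives that display local time.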

▶Anonymous 03/15/18 (Thu) 05:14:33 70e498 (69) No.670289>>670388

>>670158

The 4ch breads or the 4ch Q posts?

▶Anonymous 03/15/18 (Thu) 05:15:50 70e498 (69) No.670304

>>670142

What are the chances it's hanging on a specific record? I see that all the time doing inserts. Bad data kills it off.

▶Anonymous 03/15/18 (Thu) 05:17:38 70e498 (69) No.670317>>670332

>>670142

You could look into raising the timeout. Mebbe it's just such a long job that it's taking too long and timing out? https:// support.rackspace.com/how-to/how-to-change-the-mysql-timeout-on-a-server/

▶Anonymous 03/15/18 (Thu) 05:17:49 07564d (94) No.670322

>>666995

Here's a hint for how to find the thread a dead post belongs to: Go to the earliest archive of the thread on which you found the link, which will usually be on the archive.is site. If you're lucky, the link was still live when the thread was archived. The other thing to do is search earlier posts that you already have to see if someone else linked the same post.

▶Anonymous 03/15/18 (Thu) 05:19:40 07564d (94) No.670332>>670421

>>670317

Time out isn't the problem in this case. Since I'm working with small batches at a time, they're quite quick.

▶Anonymous 03/15/18 (Thu) 05:25:58 07564d (94) No.670388>>670470

>>670289

I have the vast majority of both. Go check it out.

http:// q-questions.info/research-tool.php

After I resolve the table size problem (which is what I think the real problem is), I think it would be good to work some more on my contexting program. On my local computer, I've got it so that it can look back through the links and show all available context with the post. What I haven't done yet is copy that contexting information to a Q post's context when I find one in the backward linking. It'll be ridiculously easy once I set about doing it. Then, when a Q post is pulled up, all that stuff that linked back to it can show together with it.
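The backward-linking idea can be sketched in a few lines of Python. This assumes posts are held as a dict of post number to comment text and that links look like >>123456; the function names are made up and this is not the contexting program itself.

```python
import re

LINK = re.compile(r">>(\d+)")

def collect_context(posts, post_no, seen=None):
    """Follow >>links backward from post_no; return the set of linked post numbers."""
    seen = set() if seen is None else seen
    for ref in LINK.findall(posts.get(post_no, "")):
        ref = int(ref)
        if ref in posts and ref not in seen:
            seen.add(ref)
            collect_context(posts, ref, seen)  # keep walking back
    return seen

def replies_to(posts, target):
    """Return post numbers that link directly to the target post."""
    return sorted(
        no for no, body in posts.items() if str(target) in LINK.findall(body)
    )

# Hypothetical mini-thread; post 100 stands in for a Q post.
posts = {
    100: "Q post text",
    105: ">>100 what does this mean",
    110: ">>105 >>100 dig here",
    120: ">>999 link to a post we do not have",
}
print(sorted(collect_context(posts, 110)), replies_to(posts, 100))
```

`replies_to` is the piece that would let a Q post show everything that linked back to it.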

▶Anonymous 03/15/18 (Thu) 05:28:56 70e498 (69) No.670421>>670518 >>670571

>>670332

Hmm. Yeah just doing some easy math I can see how you would have more than 1mm records. We're at bread 815+ something here and with 751 posts each that's over 600k here on 8ch alone.

You may be onto something with that. Is there a limit? https:// stackoverflow.com/questions/2716232/maximum-number-of-records-in-a-mysql-database-table

Looks like number of rows may be determined by the size of your rows.

▶Anonymous 03/15/18 (Thu) 05:33:36 70e498 (69) No.670470

>>670388

>q-questions.info/research-tool.php

So much data. It's mind boggling.

▶Anonymous 03/15/18 (Thu) 05:40:52 07564d (94) No.670518

>>670421

Yes, there may be a 1GB limit on the file size, and I'm right about there now. If I partition, I can get around that.

▶Anonymous 03/15/18 (Thu) 05:41:43 98bd4e (20) No.670526>>672344

Below is the qtmerge modified raw dataset (text-only) as of 2018-03-14 02:07 UTC.

I'm putting this out in the hopes that it may be useful to others for ETL, mining, search tools, archiving etc.

Some notes:

* The data is a synthesis of the qtmerge datasets: https:// anonsw.github.io/qtmerge/datasets.html

* For an idea of threads that are available see: https:// anonsw.github.io/qtmerge/catalog.html

* eventcache.json file contains the posts/tweets/etc in chronological order. The type attribute currently dictates the local object structure (working to fix this to be more clean)

* refcache.json contains the detected post cross references (this is a work in progress)

* The referenceID attribute is the "primary key" between the files

* Timestamps are Unix Time and time strings are US Eastern

Extracted size: ~850 MiB

SHA-256 sum: d6ed89da05c0b714fc66b04ca66a8d701456d882d5f128ee1cef26c8d2e22eb6

http:// anonfile.com/dazfO8d4ba/qtmerge-text-2018-03-15_05.18.37.tar.bz2
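To check a download against the SHA-256 sum above, something like this Python sketch works. It assumes the sum was taken over the .tar.bz2 file itself, which the post does not state explicitly.

```python
import hashlib

# Sum quoted in the post above.
EXPECTED = "d6ed89da05c0b714fc66b04ca66a8d701456d882d5f128ee1cef26c8d2e22eb6"

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected=EXPECTED):
    """True if the file's SHA-256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Usage: `verify("qtmerge-text-2018-03-15_05.18.37.tar.bz2")` after downloading.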

▶Anonymous 03/15/18 (Thu) 05:48:48 07564d (94) No.670571

>>670421

That's just the general threads. When I started linking through the breads, I found that I needed many of the other threads, too. Most of those are smaller, though.

▶Anonymous 03/15/18 (Thu) 05:56:53 5f9a22 (5) No.670617>>670657 >>672421

>>668958

> You want to be able to search across ALL 8ch? Not just Q Research? By platforms are you talking 4ch/8ch?

Not all 8ch. Just 4 and 8ch Q related threads. Q has posted in but a small part of all of the digging (and bullshit) threads and much info is contained in those threads. /pol/ was a cluster until adopting the /cbts/ threads, but they shouldn't be too hard to round up and include in the searchable database.

In fact, I'd only include the qresearch general threads since the GA/qresearch reset. Add the digging/ancillary threads as possible. Most of the gold is in the generals IMO.

▶Anonymous 03/15/18 (Thu) 06:05:35 07564d (94) No.670657

>>670617

The reason I'm pulling in other threads is because they get cited as notable posts. I'm not bothering with them unless that happens.

▶Import procedure debugging view Anonymous 03/15/18 (Thu) 12:47:33 d6b0f8 (36) No.672305>>690213

File (hide): bba5385da9ac1ce⋯.png (76.13 KB, 1464x582, 244:97, ClipboardImage.png) (h) (u)

I can get the other boards and other threads, the issue is disk storage. Linode gives me a lot of bandwidth, but only a few gigs of disk until I change my plan with them.

▶Limits Anonymous 03/15/18 (Thu) 12:53:13 d6b0f8 (36) No.672334>>690263

The limit of an OpenQM hash file (table) is 16TB. When this becomes a problem, I can create a distributed file (table) by primary key. Say, put all 8QR in 1 portion, 8GW in another. It's simply a way to have physical storage allocated.

Pavuk session records are GUIDS. (don't worry, I'll purge anons out of the storage.) It was done because of commercial requirements for SOX and other audit compliance issues. Remember, I created Pavuk to build commercial apps.

The distributed file is built by using the first 2 hex characters (1 byte) of the GUID from the primary key. Thus, it has component files:

…

Or 256 parts.

Theoretical table size:

256 x 16TB = 4096TB
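The arithmetic and the part selection can be sketched in Python, assuming the 256 parts come from the first two hex characters of the GUID (16 x 16 combinations). The function name and part naming are illustrative, not OpenQM's actual API.

```python
def part_for(guid):
    """Pick a component file name from the first two hex characters of a GUID."""
    return guid.replace("-", "")[:2].upper()

PART_COUNT = 16 ** 2   # two hex characters give 256 possible parts
MAX_PART_TB = 16       # per-file limit quoted above

print(part_for("3f2504e0-4f89-11d3-9a0c-0305e82c3301"))
print(PART_COUNT, PART_COUNT * MAX_PART_TB)
```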

www.openqm.com

▶Anonymous 03/15/18 (Thu) 12:54:32 d6b0f8 (36) No.672344

>>670526

I'm going to look at your work.

▶Anonymous 03/15/18 (Thu) 13:06:21 d6b0f8 (36) No.672421>>672688 >>690481

>>670617

I tried 4chan/cbts/index.html and got a 404 yesterday

▶No JSON for older threads :( Anonymous 03/15/18 (Thu) 13:30:07 d6b0f8 (36) No.672572

File (hide): ba02c76b860d580⋯.png (272.49 KB, 1748x920, 19:10, ClipboardImage.png) (h) (u)

Brother Anons, I can find the IDs of the threads by using the search function on Archive.is. For example, research general #2 was post number 799. Once I know this, I can go back to 8chan and pull up the thread.

Sadly, I cannot get it with JSON. I only can get HTML. This means parsing the HTML.

This means a new string parser, but it goes into the same table as the JSON, but with more work. Here's what the posts look like in HTML

▶Crowdfunding more resources Anonymous 03/15/18 (Thu) 13:40:20 d6b0f8 (36) No.672659>>672709

I've put out a tweet thread showing the progress and asking if someone will step up to help lead a crowdfunding campaign so I can afford a bigger Linode.

▶Anonymous 03/15/18 (Thu) 13:43:52 41bee9 (9) No.672688>>672743 >>672920

>>672421

>I tried 4chan/cbts/index.html and got a 404 yesterday

I'd expect that. Threads sunset there rather quickly. I think most everything from 4ch is in http:// archive.is/search/?q=%2Fcbts%2F

I got 22,900 hits. Some people used 4plebs and maybe even other archives. Need to know all of the archive sites used so we can add them to the soup.

A search on 4 plebs from 10-28-2017 to the night of the bans, 11-26-2017 shows 714 hits.

https:// archive.4plebs.org/pol/search/subject/cbts/start/2017-10-28/end/2017-11-26/

▶Anonymous 03/15/18 (Thu) 13:46:15 41bee9 (9) No.672709>>672886

>>672659

>crowdfunding

I don't know diddly about crowdfunding, but I will certainly contribute. Are things like that generally paypal friendly?

▶Anonymous 03/15/18 (Thu) 13:50:48 41bee9 (9) No.672743

>>672688

Belay that last link. It searches only to midnight of the day specified. This one goes through the 26th.

https:// archive.4plebs.org/pol/search/subject/cbts/start/2017-10-28/end/2017-11-27/
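Since the end date only covers up to midnight, a scraper can add one day automatically. A small Python sketch that builds the same URL pattern as above (the function name is made up, and the URL layout is copied from the links in this thread rather than from any 4plebs documentation):

```python
from datetime import date, timedelta

def fourplebs_search_url(board, subject, first_day, last_day):
    """Build a subject search URL that includes all of last_day."""
    end = last_day + timedelta(days=1)  # the end date acts as an exclusive bound
    return (
        f"https://archive.4plebs.org/{board}/search/subject/{subject}"
        f"/start/{first_day.isoformat()}/end/{end.isoformat()}/"
    )

print(fourplebs_search_url("pol", "cbts", date(2017, 10, 28), date(2017, 11, 26)))
```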

▶Anonymous 03/15/18 (Thu) 14:02:41 41bee9 (9) No.672834

File (hide): 4b34a9ff28fb5df⋯.png (8.6 KB, 1061x83, 1061:83, cbts1.PNG) (h) (u)

File (hide): 640134501c1855e⋯.png (3.63 KB, 567x80, 567:80, cbts2.PNG) (h) (u)

Here's some interesting trivia that I missed after being banned. I did see Q approving the first migration in real time, but missed this. Interesting.

▶Anonymous 03/15/18 (Thu) 14:11:41 d6b0f8 (36) No.672886>>672980 >>674536

File (hide): 99bd8abf0435c23⋯.png (173.73 KB, 957x1032, 319:344, ClipboardImage.png) (h) (u)

>>672709

I was just going to have folks send to my personal paypal account since I'm funding the site anyway. You can set up a regular monthly payment. I do that with others like Stefan Molyneux where we send $10/month.

▶Block Storage Anonymous 03/15/18 (Thu) 14:12:47 d6b0f8 (36) No.672897

File (hide): b8602cfeff149e8⋯.png (111.48 KB, 1025x804, 1025:804, ClipboardImage.png) (h) (u)

▶IF I GET HELP FUNDING... Anonymous 03/15/18 (Thu) 14:13:46 d6b0f8 (36) No.672908>>674321 >>680546

We need to work together to get all of the data into the database. If someone could help with a Twatter feed from DJT - preferably raw and in JSON, that can be added to the posts table.

▶Anonymous 03/15/18 (Thu) 14:15:24 d6b0f8 (36) No.672920

>>672688

That was helpful. I would ask people in this thread to help develop the information model.

There is a "boards" table with the links to get data for each type. It can be expanded into which boards are archived where and I can automate the pulls.

▶Anonymous 03/15/18 (Thu) 14:27:57 41bee9 (9) No.672980>>672997 >>673832

>>672886

>personal paypal accoun

Set up an account specifically for this, don't dox yourself. (((They))) will be able to find you, but the malicious shills won't.

▶Anonymous 03/15/18 (Thu) 14:30:24 d6b0f8 (36) No.672997>>673070 >>674502 >>809411

>>672980

already doxxed. I own pavuk.

▶Anonymous 03/15/18 (Thu) 14:40:22 41bee9 (9) No.673070

>>672997

Ha, OK. Thought that might be the case.

▶Anonymous 03/15/18 (Thu) 16:08:51 41bee9 (9) No.673658>>677084

Here's another archive with over 1,200 threads:

https:// archive.fo/search/?q=%2Fpol%2F+-+cbts

Some good ones here missing in other archives. How many more are out there?

archive.is

archive.4plebs

archive.fo

▶Anonymous 03/15/18 (Thu) 16:25:12 41bee9 (9) No.673773

Found first CBTS thread on 8ch.

http:// archive.is/Pvbqq

▶Anonymous 03/15/18 (Thu) 16:34:19 023ac5 (1) No.673832

>>672980

Yes they will and I will add that my paypal account was subject to MUCH fuckery during the time I was posting a lot about PG on my twitter. Nov/dec 2016

▶Anonymous 03/15/18 (Thu) 17:22:42 fe09dd (1) No.674321>>690320

>>672908

I can probably get you what you need. What are you looking for specifically? All DJT tweets? Tweets with deltas?

▶Anonymous 03/15/18 (Thu) 17:39:22 adacee (2) No.674502

>>672997

That sucks. I love the system, though! More user friendly than my crap attempts.

▶Anonymous 03/15/18 (Thu) 17:42:20 adacee (2) No.674536>>791772

>>672886

They sell "Storage Blocks" expansion way cheaper than more memory. Very fast systems already. Lots of data on the 8GB plan, buy another 100GB storage for way less than the next plan. Call linode to get info on that.

▶Anonymous 03/15/18 (Thu) 21:50:54 41bee9 (9) No.677084>>690349

>>673658

>archive.is

>archive.4plebs

>archive.fo

Another one:

https:// yuki.la/

▶Anonymous 03/16/18 (Fri) 02:23:40 70e498 (69) No.680546

>>672908

>http:// www.trumptwitterarchive.com/data/realdonaldtrump/2017.json, 2018.json

▶Anonymous 03/16/18 (Fri) 23:57:25 07564d (94) No.690213

>>672305

It probably won't be long until I find out if my host really means it when they say "unlimited".

▶Anonymous 03/17/18 (Sat) 00:03:26 07564d (94) No.690263

>>672334

Limits depend on the operating system. I'm not sure how much I'll end up needing in the end. I've got some full page web captures in my system that may bump up the size needed fairly fast. So far, I haven't outgrown the 500GB on my home system. It's about half full now. But that also includes just about all of my software. I have other drives, so I'm not limited to that 500GB. (Recalling when a 60MB hard drive was a big deal…)

▶Anonymous 03/17/18 (Sat) 00:08:15 07564d (94) No.690320>>698628

>>674321

Yeah, that would be cool to add to my system, too. I wonder where I should fit that into the task list. I've got to reparse anyway, so it has to be after that. (Backslashes weren't properly handled the first time around.) It was my plan to get to it eventually. So much to do! If you've got it in JSON files, I've got to believe it would be very easy to get them into my system.

▶Anonymous 03/17/18 (Sat) 00:10:37 07564d (94) No.690349>>709043

>>677084

>https:// yuki.la/

The archive sites are only as good as whether they're actually saving our stuff. What's the hit rate finding stuff there?

I'm not sure, but I think archive.is and archive.fo may be the same system. Mirrors, perhaps?

▶Anonymous 03/17/18 (Sat) 00:22:51 07564d (94) No.690481>>709094

>>672421

I don't have 4chan/cbts. Was Q posting there, too? If I recall correctly, we went from 4chan/pol to 8ch/cbts.

▶Anonymous 03/17/18 (Sat) 18:59:08 70e498 (69) No.698628

>>690320

AllQPosts smashed with DJTwitterposts by day

https:// github.com/QCodeFagNet/SFW.ChanScraper/tree/master/JSON/smash

▶Anonymous 03/18/18 (Sun) 05:37:45 07564d (94) No.704953

I got the problem with the backslashes fixed. Also, I changed the way I process emoji characters. There actually might be a few more posts that get parsed in during the reparsing. I am in the process of reprocessing everything now. This is going to take a while. I'll let you know when the uploads are done, which will probably be tomorrow afternoon.

▶Anonymous 03/18/18 (Sun) 06:26:01 b08c93 (2) No.705344

Speaking of searchability, here is a search engine anons can use that will let you search for all those things normal search engines won't, like strings that include punctuation / symbols or exact spellings of short words and abbreviations, without the search engine being 'helpful' by excluding the results you want and returning the results it thinks you want.

▶Anonymous 03/18/18 (Sun) 06:26:15 b08c93 (2) No.705347

http:// symbolhound.com/

▶Anonymous 03/18/18 (Sun) 15:32:21 59e915 (2) No.709043

>>690349

> I think archive.is and archive.fo may be the same system

Yes, they sure look like the same system, as does archive.li. I must admit complete ignorance of how they are structured and how they work. I initially thought archive.is was for /pol/, but now I've found /pol/ and cbts all over the place. If any anons have any insight, it would sure be appreciated.

▶Anonymous 03/18/18 (Sun) 15:40:10 59e915 (2) No.709094

>>690481

>we went from 4chan/pol to 8ch/cbts.

4chan/pol/ first posts were 10/28/2017. We were flushed by a bot storm on 11/26/2017 and regrouped on 8chan as CBTS. When that blew up the campaign became The Storm. When that blew up is when we landed on our own board qresearch/greatawakening.

Archives and threads are all over the place, one of our fundamental challenges aggregating all the info to be searchable.

▶Anonymous 03/19/18 (Mon) 16:46:04 07564d (94) No.722324

All records and images that I have should now be up on the research tool.

I thought my post count was short on the site last night, but using the following statement on both, they are equal:

SELECT COUNT(`post_key`) FROM `chan_posts`

Funny thing is that when I pull up the table in phpMyAdmin, the row count does not equal the answer to that query. It's short on both. Don't trust the row count in phpMyAdmin when you view a table.
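The same check can be scripted so the two sides are compared with an exact count every time. A sketch in Python using two in-memory SQLite databases as stand-ins for the local and remote MySQL servers (the helper name `exact_count` and the demo rows are made up):

```python
import sqlite3

def exact_count(conn, table="chan_posts", key="post_key"):
    """Run an exact COUNT query. Table/key names must come from trusted code."""
    return conn.execute(f"SELECT COUNT({key}) FROM {table}").fetchone()[0]

# Hypothetical stand-ins for the local machine and the remote host.
local = sqlite3.connect(":memory:")
remote = sqlite3.connect(":memory:")
for db in (local, remote):
    db.execute("CREATE TABLE chan_posts (post_key TEXT PRIMARY KEY)")
    db.executemany(
        "INSERT INTO chan_posts VALUES (?)", [(f"8QR_{n}",) for n in range(5)]
    )

print(exact_count(local) == exact_count(remote))
```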

Total number of posts in the research tool is:

1,113,968

Next up: Getting the POTUS tweets into the database.

http:// q-questions.info/research-tool.php

▶Anonymous 03/19/18 (Mon) 19:41:02 43423a (2) No.724053>>724127

>>724027

>Has anyone thought to take full news articles and social data dumps, per person, and do sub text matching across the entire body of text to find exact matches?

▶Anonymous 03/19/18 (Mon) 19:48:56 07564d (94) No.724127>>734330

>>724053

I've thought to do it. The tagging feature can get us there. The problem is that tagging posts is a lot of work. I need to find a way to get others to help with that without compromising the database.

▶Anonymous 03/20/18 (Tue) 17:24:45 70e498 (69) No.733102

OK brother codefags. I've stood up a simple API. It serves json and XML for your consumption pleasure.

It's currently set up to:

1) Scrape the chan automagically and keep an archive of QResearch breads and GreatAwakening.

2) Filter each bread to search for Q posts and include anything in GreatAwakening into a single QPosts list

3) Serve up access to posts/bread by list, by id, and by date.
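Step 2 in the list above could be sketched roughly like this in Python. This is not the actual qanon.news code, just a guess at the shape of it: `filter_q_posts` is a hypothetical helper, the field names (`no`, `time`, `trip`) assume the 8ch JSON post format, and the trip values in the sample are made up.

```python
def filter_q_posts(posts, confirmed_trips):
    """Keep only posts whose tripcode is in the confirmed list, oldest first.

    `posts` is assumed to be the "posts" array of one bread's JSON.
    """
    trips = set(confirmed_trips)
    matches = [p for p in posts if p.get("trip") in trips]
    return sorted(matches, key=lambda p: p["time"])


# Made-up sample bread: two posts with a confirmed trip, one impostor, one anon.
bread = [
    {"no": 103, "time": 30, "trip": "!ExampleTrip"},
    {"no": 101, "time": 10, "trip": "!SomeoneElse"},
    {"no": 102, "time": 20},
    {"no": 100, "time": 5, "trip": "!ExampleTrip"},
]
q_posts = filter_q_posts(bread, ["!ExampleTrip"])
print([p["no"] for p in q_posts])
```

Posts with no `trip` key at all simply fall through, so anonymous replies never match.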

I'm going to incorporate the TwitterSmash delta output next. I figure I can do a simple search across all Q posts easily. Searching across the breads is harder.

You can check it out here: http:// qanon.news/

McAfee says secure https:// www.mcafeesecure.com/verify?host=qanon.news

There's a sample single page app that shows how to use it. http:// qanon.news/posts.html

I still gotta set up my email account so if you spam me now, it's likely to get bounced. I'll check back in later.

My reason for doing this is twofold: I figured we could use it, and I'm looking at the job market in my area and thinking about changing it up. This is partially a learning project to open opportunities by using different tech. I'm claiming ignorance. My plan is to try out an elasticsearch node once I get this working as designed.

Let me know if you can think of a query/filter that you think would be useful. It's not proven to be too difficult to work new things in other than the ugly local path issue I came across working on it this morning.

Try it out anons.

▶Anonymous 03/20/18 (Tue) 19:37:44 43423a (2) No.734330>>737563 >>763341

>>724127

I think you're misunderstanding my idea. The idea is to identify sources of narrative scripts being pumped into the public consciousness. Remember when Trump's speech at the '16 RNC was immediately phrased as "dark" in dozens of articles, tweets, etc? We need to know who's putting out the scripts ("dark") and who's repeating the scripts ("""journalists""" that articles with "dark" are attributed to, shitter users with "dark" in their tweets, etc)

The code could work in different ways but trying to automate everything at the beginning is hard. The easiest way to start would be:

>anon notices a suspicious pattern of the same language being used all of a sudden

<like "dark"

>anon enters the string that's being repeated into a text box

<bonus points if it's pure JS that can run locally rather than requiring a server, at least initially

>code ingests search results of news, shitter, faceblack, etc with that string from the recent past

<configurable in near term increments like past hour, past day, past 2 days

>anon is provided a list of results

With this simple aggregated news & social search, an anon can visually skim the results and easily see how widespread the sudden repetition of the same language is.

<next features

>let anons select search result items as suspect and enter them into a database that indexes on journalist/author, keyword, etc

>database can use search result item post date to build a timeline, to identify the earliest sources of the narrative script

At this point, with the database trained on common sources of narrative script repeating, it would be pretty doable to automate suspicious pattern detection by ingesting the full body of content from the sources and searching for sub text matches that exceed noise. Like if "the" is used in most of the article headlines and tweets, that doesn't mean shit because "the" is a common word, but if "dark", a much less common word, all of a sudden appears across article headlines and facebook posts, that would be pretty easy to pick up for human review.
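The "exceeds noise" idea can be sketched in a few lines of Python. This is only one possible reading of it, under assumptions that are mine and not the poster's: words are compared by the share of documents they appear in, a word is flagged when it shows up in at least half of the recent items and at least three times more often than in a baseline sample, and the function names and thresholds are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def doc_frequency(docs):
    """Map each word to the fraction of documents it appears in."""
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    total = len(docs)
    return {word: n / total for word, n in counts.items()} if total else {}


def suspicious_words(recent, baseline, min_share=0.5, min_lift=3.0):
    """Words common in `recent` but rare in `baseline` (hypothetical thresholds)."""
    recent_df = doc_frequency(recent)
    base_df = doc_frequency(baseline)
    floor = 1 / (len(baseline) + 1)  # smoothing for words never seen in the baseline
    flagged = []
    for word, share in recent_df.items():
        lift = share / max(base_df.get(word, 0.0), floor)
        if share >= min_share and lift >= min_lift:
            flagged.append(word)
    return sorted(flagged)


baseline = [
    "the senator gave a speech",
    "the market rose today",
    "the team won the game",
    "the weather was mild",
]
recent = [
    "the speech was dark",
    "a dark vision from the podium",
    "the dark tone of the night",
]
print(suspicious_words(recent, baseline))
```

"the" appears everywhere in both sets, so its lift is 1 and it is ignored, while "dark" is absent from the baseline and in every recent item, so it gets flagged for human review.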

▶Anonymous 03/21/18 (Wed) 01:22:17 07564d (94) No.737563>>737661 >>739099 >>761567

>>734330

>We need to know who's putting out the scripts ("dark") and who's repeating the scripts ("""journalists""" that articles with "dark" are attributed to, shitter users with "dark" in their tweets, etc)

You can search the word "dark" in my database as it is right now. If that word was used in chan discussions (and it was), you can get results for it. Is there something you think we need to add? Do you have an idea for an algorithm based on what we have?

Right now, though, I changed my mind about what to do next. I want to get the contexting code finished. When I've used my personal version of it, I learned quite a lot.

After that, I will work on getting the tweets in there. If anyone can point me to php code for that, it would be appreciated. I'm not talking about chan posts that link them, but rather the tweets themselves.

▶Anonymous 03/21/18 (Wed) 01:29:22 07564d (94) No.737661>>761567

>>737563

I've got a suggestion for the search: enter the following in the text field:

dark%http

and also in a separate search

http%dark

Those should find posts that use the word "dark" and include a link. I don't know how to do this better with what I have without doing some extensive programming.
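For anons wondering why two separate searches are needed: in a LIKE pattern, `%` matches any run of characters, so `dark%http` only finds "dark" followed later by "http", and the reverse order needs its own pattern. Here is a small self-contained sketch using Python's built-in sqlite3 (not the tool's MySQL). It assumes the search box wraps the term in `%...%`; the `body` column and sample rows are made up, and only `chan_posts` and `post_key` come from the earlier query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chan_posts (post_key INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO chan_posts (body) VALUES (?)",
    [
        ("Speech called dark again http://example.com/a",),  # dark, then link
        ("http://example.com/b says the tone was dark",),    # link, then dark
        ("dark but no link here",),                           # no link
        ("a link only http://example.com/c",),                # no keyword
    ],
)


def search(term):
    """Return post keys whose body matches %term% (assumed wrapping)."""
    rows = conn.execute(
        "SELECT post_key FROM chan_posts WHERE body LIKE ? ORDER BY post_key",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


dark_then_link = search("dark%http")
link_then_dark = search("http%dark")
either_order = sorted(set(dark_then_link) | set(link_then_dark))
print(dark_then_link, link_then_dark, either_order)
```

Taking the union of the two result sets gives "mentions dark and has a link" regardless of order, which is what a single `LIKE ... OR LIKE ...` clause would do on the server side.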

▶Anonymous 03/21/18 (Wed) 03:54:29 70e498 (69) No.739099>>742213

>>737563

> I've been getting my twitter data from http:// www.trumptwitterarchive.com/data/realdonaldtrump/2017.json, 2018.json

▶Anonymous 03/21/18 (Wed) 09:12:36 07564d (94) No.742213>>744374

>>739099

He isn't keeping it up to date.

▶Anonymous 03/21/18 (Wed) 15:14:28 70e498 (69) No.744374>>746289

>>742213

>www.trumptwitterarchive.com/data/realdonaldtrump/2018.json

There was a 9 day gap at the beginning of the year. Otherwise it's been updated. Unfortunately I think there were 2 markers in that time. Delta anon knows about it.

▶Anonymous 03/21/18 (Wed) 19:02:42 07564d (94) No.746289>>746837 >>750792

>>744374

I didn't see anything past January.

▶Anonymous 03/21/18 (Wed) 20:00:53 70e498 (69) No.746837

>>746289

Refresh yer cache? I'm seeing Jan 9 - March 21 2018

▶Anonymous 03/22/18 (Thu) 03:14:03 07564d (94) No.750792

>>746289

>www.trumptwitterarchive.com/data/realdonaldtrump/2018.json

Reverse order. OK, I see it. Thank you.

▶Anonymous 03/23/18 (Fri) 00:26:29 70e498 (69) No.760314

Feckin dates. I got it all sorted out. Discovered a bug caused by the different time zones of my dev server and the API webserver.
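For anyone hitting the same kind of date bug: the usual way around it is to convert epoch timestamps to timezone-aware UTC datetimes, so the result does not depend on which machine runs the code. A minimal Python sketch follows (the API itself is not written in Python, and the helper names are hypothetical). The sample value is the `time` field of GA post 460 quoted elsewhere in this thread.

```python
from datetime import datetime, timedelta, timezone


def to_utc(epoch_seconds):
    """Epoch seconds -> timezone-aware UTC datetime, identical on every server."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def delta_minutes(epoch_a, epoch_b):
    """Absolute difference between two epoch timestamps, in minutes."""
    return abs((to_utc(epoch_b) - to_utc(epoch_a)).total_seconds()) / 60


post_time = to_utc(1521824977)
print(post_time.isoformat())

# The same instant shown with a fixed UTC-5 offset (an illustrative offset only).
# The wall-clock text differs, but the two aware datetimes compare as equal.
minus_five = timezone(timedelta(hours=-5))
local_view = post_time.astimezone(minus_five)
print(local_view.isoformat(), local_view == post_time)
```

Doing the delta math on aware datetimes (or directly on the epoch integers) keeps the server's local clock setting out of the picture.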

I've been sorting out small bugs and am about to wire in the TwitterSmash. The automation part seems to be working good now that I sorted the date bug. I've got it set up to do hourly scrapes. Last run at 8:03pm 3-21 est. The scrapes themselves only take about 45 seconds - including the twittersmashing. There's a test smashpost page here to see the deltas in action. Not totally live Q post data online yet.

http:// qanon.news/smashposts.html

This is another test page using live data

http:// qanon.news/posts.html

I did this to test some code out. Get a random Q post.

http:// qanon.news/api/posts/random/?xml=true

I set up an elasticsearch node today to experiment. We'll see how that goes. Could be a huge pain in the ass to set up at a host. We'll see.

▶Anonymous 03/23/18 (Fri) 02:29:46 bef1f1 (1) No.761567

>>737563

>>737661

I'm trying to help but you're not getting it. Reread my posts.

▶Anonymous 03/23/18 (Fri) 05:33:37 07564d (94) No.763341

>>734330

I think that's beyond the scope of what I'm doing. Hopefully, there will be enough here that what I have can help you do that research, especially after I finish the contexting work. Right now, I've had to reparse the database yet again to correct image links. I hope I've finally gotten it right because it takes an entire day to cycle through the entire set.

▶Anonymous 03/23/18 (Fri) 21:24:26 70e498 (69) No.771168>>773397

Update your tripcodes codefags.

public readonly string[] ConfirmedTrips = new string[] { "!ITPb.qbhqo", "!UW.yye1fxo", "!xowAT4Z3VQ" };

http:// qanon.news/api/posts/943/?xml=true

▶Anonymous 03/24/18 (Sat) 01:07:31 07564d (94) No.773397>>774681

>>771168

>!xowAT4Z3VQ

Thank you for the heads up. I've made the change in my code, too.

The export/import finally looks like it's ok. Please let me know if you run into issues.

I'm going to be pulling out the post range and thread range options from the form. They unnecessarily complicate things now that I've added date range capability.

I'm moving on to contexting now. Y'all are going to love that feature.

▶Anonymous 03/24/18 (Sat) 03:02:59 70e498 (69) No.774681>>774698 >>775587

>>773397

yeah that sounds like a good one.

I've done some more work on the http:// qanon.news api. I managed to work out a coupla small bugs and get the TwitterSmashed posts integrated. Everything seems to be working as designed.

Here's the smashposts.html demo page. Shows deltas to Q posts within the hour.

http:// qanon.news/smashposts.html

I'm going to add another result to the smashposts where everything is grouped by days. I'll probably put it in the posts API as well.

It's starting to look like this may be close to going on autopilot. Any interest in changes/additions before I move onto something else?

▶Anonymous 03/24/18 (Sat) 03:04:50 70e498 (69) No.774698

>>774681

I'd love to work out a local copy of the Jan 1 2018 - Jan 9 2018 @realDonaldTrump tweets. Those are missing from the trumptwitterarchive site. Anybody got access to that?

▶Anonymous 03/24/18 (Sat) 04:59:29 07564d (94) No.775587>>781191

>>774681

>qanon.news/smashposts.html

It looks good so far. One thing, though: you need to save the images. You're linking directly to the 8ch images, and those have a tendency to go missing.

▶Anonymous 03/24/18 (Sat) 20:54:24 70e498 (69) No.781191>>782643

>>775587

Hmm. Yeah I'll look into it. I can see that archive getting really big really fast. This thing's only been running for a month and it's over 400mb of JSON alone. I'll have to check what kind of space I've got avail.

▶Anonymous 03/24/18 (Sat) 23:14:27 07564d (94) No.782643>>791554

>>781191

But you're not saving more than the Q posts, right? There aren't that many Q posts, and he hasn't posted that many images. But if you're trying to save the entire thing, yes, it's really big and grows really fast. I'm not automatically saving the full size images, and there's still quite a lot in my set.

▶Anonymous 03/25/18 (Sun) 22:06:35 70e498 (69) No.791554

>>782643

I never figured that another image archive was what we needed. Each of the QCodefag installs has its own local archive. My concern was in preserving the JSON data from QResearch before it slid off the main catalog.

I'm going to put up a simpler list to show what's been archived. I'm showing 716 total breads, but again that's only starting at 2-7-2018. Q Research General #358 is my earliest full archive - it's up to #982 now.

That's 624 breads in 47 days, about 13.3 breads per day. Est. 4846 breads in one year at ~800k/bread = about 4GB/year in JSON bread alone. Mebbe different if I moved to a DB.
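The storage estimate is easy to re-run as the numbers change. Here is the same arithmetic in Python, using only the figures from the post (breads #358 through #982, 47 days, roughly 800k per bread); the decimal GB conversion is my assumption. It works out to about 13.3 breads per day and just under 4 GB per year.

```python
first_bread, latest_bread = 358, 982   # earliest full archive and current bread
days_archived = 47
kb_per_bread = 800                     # rough average size from the post

breads = latest_bread - first_bread                        # breads archived so far
breads_per_day = breads / days_archived
breads_per_year = breads_per_day * 365
gb_per_year = breads_per_year * kb_per_bread / 1_000_000   # KB -> GB (decimal)

print(f"{breads} breads, {breads_per_day:.1f}/day, "
      f"{breads_per_year:.0f}/year, about {gb_per_year:.1f} GB/year")
```

Images are not included here; the estimate covers the JSON only.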

I may have enough storage, but it's so hard to say. Any image archive estimates anons?

▶Anonymous 03/25/18 (Sun) 22:29:15 d6b0f8 (36) No.791772

>>674536

I just saw this info. I need to convert my monthly plan to an hourly plan before they'll let me buy storage blocks.

▶Images and thumbnails at 18G Anonymous 03/25/18 (Sun) 22:29:41 d6b0f8 (36) No.791778

Pavuk Searchable.

▶Anonymous 03/26/18 (Mon) 13:42:08 663ab1 (2) No.798718>>799873

Can someone post the original json of GA post 461 which was deleted? I pulled the json data from qanon.pub, and can use pieces of it to fill in my local copy, but I'd rather have the real thing if I can get it.

As an example, below is a comparison of the original 460 from 8ch and the archived version from qanon.pub. They are close, but the 'com' field did go through a filter to get into qanon's 'text' field. Not saying there's anything wrong with it, but I have the originals for all except 461. Am playing with python code to save all the json files locally for all relevant boards on 8ch, and can parse & search for keywords or q's trips, etc. and display in a browser. Since it's all stored locally, a search doesn't have to hit the net. It's not perfect by any means, but if I can clean it up a bit, I'll share if there's interest.

8ch original 460:

{

"com": "Updated Tripcode.Q",

"name": "Q ",

"locked": 0,

"sticky": 0,

"time": 1521824977,

"cyclical": "0",

"bumplocked": "0",

"last_modified": 1521824977,

"no": 460,

"resto": 452,

"trip": "!xowAT4Z3VQ"

}

qanon.pub copy of 460:

{

"email": null,

"id": "460",

"images": [],

"link": "https:// 8ch.net/greatawakening/res/452.html#460",

"name": "Q",

"source": "8chan_greatawakening",

"subject": null,

"text": "Updated Tripcode.\nQ\n",

"threadId": "452",

"timestamp": 1521824977,

"trip": "!xowAT4Z3VQ",

"userId": null

}

Need 8ch original 461 please if someone has it.
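Going the other direction (8ch original to an archive-style record) can be sketched in Python as below. This is a guess at the kind of filter the `com` field goes through, not the real qanon.pub code: it assumes the original `com` is HTML with one `<p>` per line (the markup in the sample is an assumed shape), and `com_to_text` and `to_archive_record` are hypothetical names.

```python
import html
import re


def com_to_text(com):
    """Turn an HTML 'com' body into plain text, one line per paragraph."""
    text = re.sub(r"(?i)<br\s*/?>|</p>", "\n", com)  # line breaks and paragraph ends
    text = re.sub(r"<[^>]+>", "", text)               # drop any remaining tags
    return html.unescape(text)


def to_archive_record(post, board):
    """Map one 8ch-style post dict onto the archive field names shown above."""
    thread_id = post.get("resto") or post["no"]  # resto is 0 for the thread's first post
    return {
        "id": str(post["no"]),
        "threadId": str(thread_id),
        "name": (post.get("name") or "").strip() or None,
        "trip": post.get("trip"),
        "timestamp": post["time"],
        "text": com_to_text(post.get("com") or ""),
        "source": f"8chan_{board}",
        "link": f"https://8ch.net/{board}/res/{thread_id}.html#{post['no']}",
    }


sample = {
    "com": '<p class="body-line ltr ">Updated Tripcode.</p>'
           '<p class="body-line ltr ">Q</p>',  # assumed HTML shape
    "name": "Q ",
    "time": 1521824977,
    "no": 460,
    "resto": 452,
    "trip": "!xowAT4Z3VQ",
}
record = to_archive_record(sample, "greatawakening")
print(record)
```

Keeping the untouched original JSON on disk and deriving records like this on demand means the filter can be changed later without losing anything.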

▶Anonymous 03/26/18 (Mon) 16:52:33 70e498 (69) No.799873>>803461

>>798718

Try this

http:// qanon.news/api/posts/962/

or this

http:// qanon.news/api/bread/452/?xml=true

Add or remove ?xml=true in the query string to switch between XML and JSON.

▶Anonymous 03/26/18 (Mon) 23:55:53 663ab1 (2) No.803461>>805321

>>799873

>http:// qanon.news/api/posts/962/

Perfect - thanks! The xml flag showed me the exact pieces I was missing to rebuild my entry. Much appreciated and quite a handy api…

▶Anonymous 03/27/18 (Tue) 00:13:43 964d61 (1) No.803653

>>494745 (OP)

Forgive me, lads. Where do i go for info on Valerie Jarrett? Got lost.

▶Pavuk Searchable Anonymous 03/27/18 (Tue) 00:25:57 d6b0f8 (36) No.803777>>805300 >>809411

Linode is telling me that I can get block storage, but only by migrating my VM to the Fremont data center, getting a new IP address (SSL cert. etc.)

Crickets from followers whom I've asked to donate funds for the added expenses.

▶Anonymous 03/27/18 (Tue) 00:45:27 07564d (94) No.803966

>>803653

The search engine on the Research Tool works well. Try searching VJ, too.

http:// q-questions.info/research-tool.php

▶Anonymous 03/27/18 (Tue) 02:34:07 70e498 (69) No.805300>>809001

>>803777

Do you have to have block data storage? Any other options?

▶Anonymous 03/27/18 (Tue) 02:36:45 70e498 (69) No.805321

>>803461

Glad it was useful. The posts API numbering is a bit squirrelly till you get used to it. The post ID is the post count starting from 1 on Oct 28 2017.

So to find out it was post #962 I had to view all posts (or posts.html or any of the QCodeFag installs) to get the post#. The bread# is in the post as threadId.

▶Anonymous 03/27/18 (Tue) 13:23:43 d6b0f8 (36) No.809001>>809048

>>805300

What *other* options?!?!

"Archive EVERYTHING OFF LINE"

"MAKE IT SEARCHABLE"

If I don't have enough storage, where am I going to store the data?

If you don't know about IT, you should not be in this thread.

▶Anonymous 03/27/18 (Tue) 13:39:33 70e498 (69) No.809048>>809084

>>809001

Fuck off nigger. I'm just trying to come up with other ideas. I've been in IT for over the last 2 decades. I know exactly whats going on.

My point was, hosting can be found on the cheap if you look around. Not sure you NEED SSD. What you need is storage space. I was thinking drop the SSD for cheaper storage.

Whatever, it's your problem. You seem to be capable of figuring it out.

▶Anonymous 03/27/18 (Tue) 13:50:29 d6b0f8 (36) No.809084>>809146 >>904574

>>809048

I'm sure you're really good at building PCs for your aunt Martha. Plugging in the cards and loading and reloading Windows.

▶Anonymous 03/27/18 (Tue) 14:01:52 70e498 (69) No.809146

>>809084

Hurts me to my core!

No I write the software. Whatever. Deal with your own problem - it doesn't concern me.

▶Anonymous 03/27/18 (Tue) 14:51:56 9a029c (1) No.809411

>>672997

>www.pavuk.com

>>803777

>Crickets

Can't find contact info on your site(s). Link?

▶Anonymous 03/27/18 (Tue) 16:55:27 07564d (94) No.810060

I decided to prune. Too much garbage is in the chans.

▶Pavuk Qresearch HOWTO video Anonymous 03/27/18 (Tue) 18:14:34 d6b0f8 (36) No.810605

[Embedded YouTube video]

Raw video.

▶Wildcard searching howto Anonymous 03/27/18 (Tue) 19:29:11 d6b0f8 (36) No.811118

[Embedded YouTube video]

▶Anonymous 03/30/18 (Fri) 07:51:10 07564d (94) No.838965>>879844

File: 997560d2ea60e2f⋯.png (836.81 KB, 1228x926, 614:463, resting.png)

The research tool is undergoing extensive overhaul at the moment.

▶Anonymous 03/30/18 (Fri) 22:00:54 70e498 (69) No.843897

I think I finally managed to squash the date bug in the QPosts/DJTweets.

I took the 60min delta restriction off - and it's applying each day's tweets on each Q post to allow you to see all the deltas.

http:// qanon.news/smashposts.html

▶Anonymous 04/02/18 (Mon) 02:40:31 07564d (94) No.864973

File: 13780b00acc2ffc⋯.png (273.88 KB, 1229x896, 1229:896, GettingLucky.png)

Sometimes I get lucky.

▶Anonymous 04/03/18 (Tue) 16:14:47 07564d (94) No.879844

>>838965

The Research Tool is back up with a more concise data set. Much will be added in the next several days as I return to development of the contexting feature.

http:// q-questions.info/research-tool.php

▶Anonymous 04/04/18 (Wed) 03:50:40 70e498 (69) No.887653>>890080

File: 54e4b21d5b8fcc8⋯.png (182.93 KB, 1197x986, 1197:986, Untitled-1.png)

File: 495288dfb71a59a⋯.png (58.1 KB, 1197x986, 1197:986, Untitled-2.png)

I've been thinking about a timeline for the past few days. I looked into different solutions and found timelineJS that works pretty good.

I managed to wrangle the API data into a timeline. I'm planning on adding in the DJTwitter data and ideally news/notable events.

Once I can get the twitter data in I'll cut it loose. I was hoping to figure out an easy way to get other data into the timeline. News/notables. Any ideas? QTMergefag? You got good news/events?

Here's what it looks like:

▶Anonymous 04/04/18 (Wed) 08:22:07 07564d (94) No.890080>>891187

>>887653

If I can figure out how to import the twitter posts WITH the images, getting a timeline in Research Tool system is a no brainer. The JSON someone directed me to does not appear to have the image links, unfortunately. The images are essential to some of the tweets.

The plan is for POTUS to have his own post type. Then all one need do is select both q-post and potus posts in the same search, and they'll be displayed properly interleaved.

▶Anonymous 04/04/18 (Wed) 13:18:36 70e498 (69) No.891187>>891871

>>890080

I think the timelineJS handles that for you if you add it as media/tweet to each slide.

▶Anonymous 04/04/18 (Wed) 15:27:06 07564d (94) No.891871>>892076

>>891187

OK. I guess I'll have to take another look at it. Right now, though, my priority is to get the contexting feature working. I do wish there was a way to safely hand off some of the work on the site I'm putting together. There's so much to do! But I have no idea how to know to trust someone. Clowns will be clowns.

▶Anonymous 04/04/18 (Wed) 15:59:37 70e498 (69) No.892076>>892089 >>892772

File: 47a15bed31ad982⋯.png (455.88 KB, 1197x936, 133:104, ClipboardImage.png)

>>891871

Agree. I've been thinking about trying to work out a way of collab. I'm sure I could come up with a way to prove we're who we each say we are. Unless the clowns are here building community Q research tools…

Check it out. I got the twitter working.

What I can say about this timeline is that there's a lot of events on it. There's Q posts batched down to days across 98 days. Add in the Tweets and there's a lot going on. Each day/tweet == a slide. It's definitely more than it was probably designed to handle. It takes a minute to make sense of the somewhat sizable JSON data and then render the display.

▶Anonymous 04/04/18 (Wed) 16:01:28 70e498 (69) No.892089

>>892076

FOK delete this please

▶Anonymous 04/04/18 (Wed) 17:48:32 07564d (94) No.892772>>892975

>>892076

>It takes a minute to make sense of the somewhat sizable JSON data and then render the display.

I just have to make sense of a few of them. Then I can come up with an algorithm to parse them into the structures I already have developed. My site is quite capable of handling multiple sources (chan, tweet, other posts) if I can do that much.

▶Anonymous 04/04/18 (Wed) 18:13:11 70e498 (69) No.892975>>893062

>>892772

{
"scale": "human",
"events":
 [{
  "start_date":{"year":"2017","month":"10","day":"28","hour":"0","minute":"0","second":"0","millisecond":"0","display_date":"2017-10-28 00:00:00Z"},
  "end_date":{"year":"2017","month":"10","day":"28","hour":"0","minute":"0","second":"0","millisecond":"0","display_date":"2017-10-28 00:00:00Z"},
  "text":{
    "headline":"HRC extradition...",
    "text":"The body text...<hr/>"
  },
  "media":null,"group":"QAnon Posts", "display_date":"Saturday, October 28, 2017","background":null,"autolink":true,"unique_id":"1dba35d4-46ac-4c5f-94d7-1e6b0f53ad4d"
 },
 {
  "start_date":{"year":"2017","month":"10","day":"28","hour":"21","minute":"9","second":"0","millisecond":"0","display_date":"2017-10-28 21:09:00Z"},
  "end_date":{"year":"2017","month":"10","day":"28","hour":"21","minute":"9","second":"0","millisecond":"0", "display_date":"2017-10-28 21:09:00Z"},
  "text":{"headline":"&Delta; 25","text":"2017-10-28 21:09:00Z<br/>@realDonaldTrump<br/>After strict consultation with General Kelly..."},
  "media": {"url":"https:// twitter.com/realDonaldTrump/status/924382514613030912","caption":null,"credit":null,"thumbnail":null,"alt":null,"title":null,"link":null,"link_target":"_new"},
  "group":"realDonaldTrump","display_date":null,"background":null,"autolink":true,"unique_id":null
 }]
}
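A slide in the shape shown above can be generated from an epoch timestamp with a few lines of Python. This is only a sketch of the data shape, not the code that produced the JSON: `timeline_date`, `tweet_slide` and `build_timeline` are hypothetical names, only a subset of the fields is filled in, and the tweet URL is written without the space the board inserts into links.

```python
import json
from datetime import datetime, timezone


def timeline_date(epoch_seconds):
    """Build the start_date/end_date object, always in UTC."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "year": str(dt.year), "month": str(dt.month), "day": str(dt.day),
        "hour": str(dt.hour), "minute": str(dt.minute), "second": str(dt.second),
        "millisecond": "0",
        "display_date": dt.strftime("%Y-%m-%d %H:%M:%SZ"),
    }


def tweet_slide(epoch_seconds, headline, body, tweet_url, group="realDonaldTrump"):
    """One slide per tweet; the media url lets the timeline embed the tweet."""
    when = timeline_date(epoch_seconds)
    return {
        "start_date": when,
        "end_date": dict(when),
        "text": {"headline": headline, "text": body},
        "media": {"url": tweet_url},
        "group": group,
    }


def build_timeline(slides):
    """Wrap slides in the top-level object with the same keys as the sample."""
    return {"scale": "human", "events": list(slides)}


# 1509224940 is 2017-10-28 21:09:00 UTC, the time of the tweet in the sample.
slide = tweet_slide(
    1509224940,
    "&Delta; 25",
    "@realDonaldTrump<br/>After strict consultation with General Kelly...",
    "https://twitter.com/realDonaldTrump/status/924382514613030912",
)
timeline = build_timeline([slide])
print(json.dumps(timeline, indent=1))
```

Building the date object from one UTC conversion keeps `start_date` and `display_date` from drifting apart.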

▶Anonymous 04/04/18 (Wed) 18:22:55 07564d (94) No.893062>>893234

>>892975

What is this from?

▶Anonymous 04/04/18 (Wed) 18:35:43 07564d (94) No.893190>>893234

I decided to see if I could find some hidden Q:

SELECT * FROM `chan_posts` WHERE `post_type` != "q-post" AND `author_hash` IN (SELECT `author_hash` FROM `chan_posts` WHERE `post_type` = "q-post")

This statement found 718 of them I hadn't identified.

▶Anonymous 04/04/18 (Wed) 18:40:03 70e498 (69) No.893234>>893321

>>893062

That's the output from a new timeline api I'm working on. It plugs directly into the timeline.JS.

>>893190

Holy shit. That's notable there innit? Are you the OP in this thread?

▶Anonymous 04/04/18 (Wed) 18:50:40 07564d (94) No.893321>>893483

>>893234

Figured out quickly that I had to add a couple additional checks.

SELECT * FROM `chan_posts` WHERE `post_type` != "q-post" AND `author_hash` IS NOT NULL AND LENGTH(`author_hash`) > 0 AND `author_hash` IN (SELECT `author_hash` FROM `chan_posts` WHERE `post_type` = "q-post")

Still came up with 120. Perhaps a couple of them were misidentified as Q in the first place?
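The drop from 718 to 120 is what you would expect if some Q posts were stored with a blank hash: a blank hash then "matches" every other post that also has a blank hash. The effect can be reproduced with a toy table in Python's built-in sqlite3. The table and column names follow the queries above, but the rows, hashes and the `anon` post type are made up, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chan_posts (post_key INTEGER PRIMARY KEY, post_type TEXT, author_hash TEXT)"
)
conn.executemany(
    "INSERT INTO chan_posts VALUES (?, ?, ?)",
    [
        (1, "q-post", "aaa111"),  # marked Q post
        (2, "anon", "aaa111"),    # same hash, not marked: a real candidate
        (3, "anon", "bbb222"),    # unrelated anon
        (4, "q-post", ""),        # Q post stored with a blank hash
        (5, "anon", ""),          # blank hash: false match in the first query
        (6, "anon", None),        # NULL hash: never matches IN (...)
    ],
)

FIRST_QUERY = """
    SELECT post_key FROM chan_posts
    WHERE post_type != 'q-post'
      AND author_hash IN (SELECT author_hash FROM chan_posts WHERE post_type = 'q-post')
    ORDER BY post_key
"""
GUARDED_QUERY = """
    SELECT post_key FROM chan_posts
    WHERE post_type != 'q-post'
      AND author_hash IS NOT NULL AND LENGTH(author_hash) > 0
      AND author_hash IN (SELECT author_hash FROM chan_posts WHERE post_type = 'q-post')
    ORDER BY post_key
"""

first_hits = [row[0] for row in conn.execute(FIRST_QUERY)]
guarded_hits = [row[0] for row in conn.execute(GUARDED_QUERY)]
print(first_hits, guarded_hits)
```

The NULL check is harmless but strictly redundant for the outer row, since `NULL IN (...)` never evaluates to true; the length check is the one doing the work.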

▶Anonymous 04/04/18 (Wed) 19:08:27 70e498 (69) No.893483>>893908

>>893321

Interdasting. I'd have to see a list.

http:// qanon.news/timeline.html

http:// qanon.news/Help/Api/GET-api-timeline

▶Anonymous 04/04/18 (Wed) 19:52:18 07564d (94) No.893908>>894169 >>894303

>>893483

At least one of the ones I had identified as Q, maybe 2, had been mislabeled. Plus, a known impostor got tagged as Q. Not sure how that happened. I'll have to fix it. But a few other interesting ones popped up.

I made one of my editor features available to you so that you can have a look. On the search form, go to the bottom and check "In processing list:" box. Leave the rest blank. And you can have a look for yourself.

http:// q-questions.info/research-tool.php

▶Anonymous 04/04/18 (Wed) 20:08:29 70e498 (69) No.894169>>894338

>>893908

>q-questions.info/research-tool.php

Yeah it looks like there are some missed posts in there for sure. You may have done some good work on that one.

▶Anonymous 04/04/18 (Wed) 20:17:18 07564d (94) No.894303>>894345

>>893908

ID:RrydKbi3 in post 147683274 definitely looks misidentified to me.

▶Anonymous 04/04/18 (Wed) 20:20:19 07564d (94) No.894338>>894359

>>894169

I have to go to an appointment now. But I'll fix the known misses this afternoon, and I can tag you to have another look, if you like.

▶Anonymous 04/04/18 (Wed) 20:20:26 70e498 (69) No.894345>>898657

>>894303

>RrydKbi3

Agree. That's the only post with that ID. Nothing ties it back to Q.

Same for Anonymous ID:9o5YWnk7 2017-10-29 19:35:45 Thread.147146601 Post.147171101

▶Anonymous 04/04/18 (Wed) 20:21:48 70e498 (69) No.894359

>>894338

▶Anonymous 04/05/18 (Thu) 00:56:45 07564d (94) No.898657>>900583 >>904541

>>894345

There are more of them than you're seeing, actually. I've just discovered that I'm still having issues with the import/export process. Not everything I've set to export is getting up there. I'll have to run that to ground tonight and fix it. I thought I had that worked out already. When I was still thread-based, everything I was exporting from the home machine was importing just fine into the online machine. But I guess I changed the logic somehow when I went from thread-based to post-based. (It can sometimes actually be more difficult to change a program than to write it for the first time.) At the moment, some of what I've said below may not be visible. But sometime tonight, it should all be there.

>ID:RrydKbi3

He responded to Q. That's it.

>ID:9o5YWnk7

Yes, he was just responding to a Q post. He isn't Q. I'm not sure at this point if it's an approved post or just another response. I'll have to take another look at it when I'm working with the maps again. For now, I've demoted him to a regular anon. And I'm removing the posts that weren't marked as Q from the online database, at least for now.

I'm not sure what to think about ID:afa548. I had the impression that a hash was good for only one thread. And yet he shows up as a hidden Q in one thread and with his trip in another. Same with ID:4533cb, but there was only one unmarked post for that one.

ID:5ace4f has only one marked post. It looks like he got marked as Q because he's on a map, but I'm not sure it's really him. The other posts look interesting and possibly relevant, though. Still, it's possible the one should be marked as approved rather than as a Q post.

ID:071c71 got reused on a different board. On one, with a non-Q trip. But it's interesting who that ended up being.

ID:23de7f looks entirely legit and probably could be marked.

With ID:d5784a, you can see what I can do to imposter clowns.

ID:1beb61 and ID:26682f look like imposters, but I haven't heard one way or the other on those. Maybe I need to put date ranges on my trip test?

Some hashes are particularly colorful in their unmarked posts. Not sure what the story is there. But I do believe the one that's marked is legit. Maybe another should be marked, but I certainly wouldn't mark all of them.
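On the question of date ranges for the trip test, one way it could look is sketched below in Python. Everything here is illustrative: `trip_is_plausible` is a hypothetical helper, and the only window filled in is an assumption that !xowAT4Z3VQ starts at the timestamp of the "Updated Tripcode" post (1521824977) with no known end. Windows for the earlier trips would have to come from the data.

```python
def trip_is_plausible(trip, post_time, windows):
    """True if `trip` is known and `post_time` falls inside its validity window.

    `windows` maps a trip to (start_epoch, end_epoch); end_epoch None means open-ended.
    """
    if trip not in windows:
        return False
    start, end = windows[trip]
    return post_time >= start and (end is None or post_time <= end)


# Assumed window: first seen at the "Updated Tripcode" post, no end date yet.
TRIP_WINDOWS = {
    "!xowAT4Z3VQ": (1521824977, None),
}

print(trip_is_plausible("!xowAT4Z3VQ", 1521824977, TRIP_WINDOWS))
print(trip_is_plausible("!xowAT4Z3VQ", 1521824977 - 86400, TRIP_WINDOWS))
print(trip_is_plausible("!NotARealTrip", 1521824977, TRIP_WINDOWS))
```

A post using a trip outside its window would then get flagged for review instead of being marked automatically.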

▶Anonymous 04/05/18 (Thu) 02:48:46 07564d (94) No.900583

>>898657

They're all up there now. There was something weird about two of the records. In one case, someone did something to a file name that I didn't know could even be done! I'll just have to edit that in the database, and it should be fine if it ever needs to be exported again. And I don't know what the deal is with the other. I pasted the SQL statement for it directly, and it worked just fine. Slash issues, maybe.

▶Anonymous 04/05/18 (Thu) 09:30:57 07564d (94) No.904541>>907276

>>898657

>ID:afa548

I've been looking further at this. I don't think the one in cbts is Q. The hash just happens to be the same. But there's something like a 3 month gap in when the hash was used.

>ID:1beb61

Fairly certain he's fake, and I'm marking him as so.

A couple of the ones I'd incorrectly marked as Q had the same post number as an actual Q post on another board. So I suppose it's easy to see how that could have happened. Now that Q uses a trip, that's much less likely to happen. They're probably relics of a time when I hadn't developed my toolset so well yet. Now, it's easier because the editor mode of the research tool has drop boxes and the like for making those kinds of changes. When I had to use phpMyAdmin, I was somewhat flying blind because I couldn't see as well what was really in a post. Now I can see the posts in their final form when I'm making changes like that.

▶Anonymous 04/05/18 (Thu) 09:39:47 1bfbc2 (1) No.904574

>>809084

Not constructive newgro. You would do well to realize the calibre of techs that browse chans and do what you can to get their help rather than get salty.

▶Anonymous 04/05/18 (Thu) 17:06:23 07564d (94) No.907276

>>904541

By the way, this has not been an idle exercise. One of the things I'll be doing is keeping track (programmatically, in the data) of context chains that reach back to Q. So it's important that Q be properly identified. To that end, finding hidden Q has been valuable. Not only did I find Q gems I had not recognized (probably because they're on maps I haven't worked through yet), but I was able to recognize some misidentified posts as well as get the imposters properly marked. So it's all good.

▶Anonymous 04/09/18 (Mon) 00:26:09 70e498 (69) No.958418>>965953

Qanon.news bumped from the bread anons.

Somebody said that the site was serving malware and it was taken out of the bread. I posted in the meta thread to have BV check it out and he gave it the OK. I spent an hr or so trying to get it back in. No luck.

I'm not interested in begging - but I do want people to use what I've been working on. I'll see what happens after dinner I guess.

▶Anonymous 04/09/18 (Mon) 15:06:05 70e498 (69) No.965953

>>958418

Meh. I've been thinking about it. After reading all about codefags problems, bandwidth issues, SSL certs, all the other qcodeClones… It may be better to just stay quiet and let people use it when needed. I'm a little disappointed that it was so easy to get something removed from the bread.

What I've been working on is really more backend style anyways. I have been thinking about a few different things though.

I saw one anon post something about there needing to be an RSS feed for QPosts. I think that should be pretty easy to provide. If I get some time I may whoop something out.

I've been playing around with the timelineJS. I worked it up where you can select a specific timeline. Qposts. DJTweets. Etc. Q has mentioned timelines a few times and I've been looking around trying to find threads that were timeline based. No real luck so far. Anyways, I was thinking about working on some different timelines.

I've been starting to wonder if moving to a database solution rather than file based json is going to be worthwhile. Better speed probably? Built in caching? Do I want that for an api? What does everybody else think?

▶Anonymous 04/09/18 (Mon) 15:45:05 70e498 (69) No.966304>>967035

>>966124

Even in here.

▶Anonymous 04/09/18 (Mon) 16:48:56 07564d (94) No.967035

>>966304

We must be over the target.

▶Anonymous 04/10/18 (Tue) 12:56:40 70e498 (69) No.981495>>988865

I built a new API to get a specific post from a specific bread. Maybe I'll get it uploaded today.

Looks like ~/api/bread/981411/981444/

to get >>981444

Researching an RSS/ATOM feed. That looks to be low hanging fruit.

▶Anonymous 04/10/18 (Tue) 17:06:28 f86e40 (1) No.983853

Very afraid they are!

Goodbye trolls and shills!

▶Anonymous 04/10/18 (Tue) 17:52:21 70e498 (69) No.984329

I was contacted by a guy that says he's from this site http:// we-go-all.com

Looks to have a Qcodefag repo installed on a page. He wanted to know if he could help at all and I asked him if he had posted anything in here.

He doesn't know anything of the codefags thread. He's interested in access to the api. I don't wanna dox the guy, but this name matches a guy that works for Representative Jared Polis (D-CO 2nd)

5th-term Democrat from Colorado.

http:// www.congress.org/congressorg/mlm/congressorg/bio/staff/?id=61715

Probably nothing. The QCodeFag stuff is open, 8ch is open. Nothing to worry about anons?

▶Anonymous 04/10/18 (Tue) 18:33:37 8e73cf (1) No.984766

Hgbbbkop

▶Anonymous 04/10/18 (Tue) 23:20:43 70e498 (69) No.988865

>>981495

All updated

New Qanon ATOM feed:

I managed to throw together an ATOM feed here:

http:// qanon.news/feed

http:// qanon.news/feed?rss=true

It returns the last 50 Q posts. It's a work in progress. I can include referred posts, images etc.

New Timeline api: Timeline api that shows Qposts and DJTweets. I also set up an Obama timeline that another anon pointed out. I'm planning on adding more to it and some other timelines I'm thinking about. You can see a few at http:// qanon.news/timeline.html

▶Anonymous 04/10/18 (Tue) 23:31:18 07564d (94) No.989046

With the contexting problem I'm working on, I'm thinking I need to also write a "mea culpa" system for when a bread (or bread-like) post is not properly identified. It would go in and recalculate context when the status of a post changes. This way, I don't have to be so concerned whether bread posts are properly identified at the outset, and I can just get on with it.

▶Anonymous 04/10/18 (Tue) 23:48:51 f47016 (1) No.989332>>989845 >>990025 >>991237 >>1024355

Hey CodeStuds - I was wondering if there's a quick way to find all posts in the qresearch thread by 'U'? I've run across a couple and I've really enjoyed them. I am not trying to take anything away from 'Q' drops - I owe 'Q' a ton for waking me up. But the 'U' drops always ease my mind and make things clearer for me…not sure if they're benefiting anyone else in the same way or not. I wanted to grab them all if I can find them. Thanks Patriots. #WWG1WGA

▶Anonymous 04/11/18 (Wed) 00:24:54 07564d (94) No.989845

>>989332

There could have been before I took everything down and then uploaded only select posts. But to do what you want, I still would have had to set up a whole word search mode, and I didn't have that yet. I abridged my public database due to obnoxious content by shills. I don't want to republish that stuff. I won't put the whole thing back up unless I have a way for visitors to flag posts for review, and right now I don't.

▶Anonymous 04/11/18 (Wed) 00:36:39 07564d (94) No.990025

>>989332

If all you want to search are Q posts, you could try using my system. The way it's set up, you can't force it to look at the first or last letter of the post. But you could try doing searches with a space before and after or a period before and after and other such things to force a word search. The REGEX of the LIKE statement is not strong enough for much else than this.

http:// q-questions.info/research-tool.php
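A sketch of the space-padding trick with LIKE, using an in-memory SQLite table as a stand-in. The table and column names are made up, and the database behind the research tool may behave differently. Padding the column itself with spaces is one possible workaround for matches at the very start or end of a post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")  # hypothetical schema
conn.executemany("INSERT INTO posts (body) VALUES (?)", [
    ("Thanks for the update. U",),
    ("USB drives and UPS units",),
    ("Signed by U today",),
])


def whole_word_like(conn, word):
    """Approximate a whole-word search with LIKE by surrounding the word with spaces."""
    pattern = f"% {word} %"
    # ' ' || body || ' ' pads the text so a word at the first or last position still matches.
    rows = conn.execute(
        "SELECT body FROM posts WHERE ' ' || body || ' ' LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


print(whole_word_like(conn, "U"))
```

Note that this still misses a word followed directly by punctuation (for example "U."), which is the same weakness described above.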

▶Anonymous 04/11/18 (Wed) 01:34:30 f486e4 (1) No.990817

Thank both of you Patriots for your responses. I will do some regexing around. Be safe anons. Love you guys.

▶Anonymous 04/11/18 (Wed) 02:01:01 70e498 (69) No.991237>>1293496

>>989332

Anything is possible.

U is the username? Any other identifying info? Do you know of a post you could point us towards?

▶Anonymous 04/13/18 (Fri) 15:27:09 07564d (94) No.1024355>>1024641

>>989332

Let me clarify something. Is U a name? Is that the whole name? If I've made it public, you can search that on my site already. If not, I can take a peek and possibly make that public for you if it isn't shill stuff.

▶Anonymous 04/13/18 (Fri) 15:52:53 07564d (94) No.1024641>>1293496

>>1024355

I found 1 in qresearch and 3 in 4chan. I've added them to my public database for you. I don't see any real revelations in them, though. Enjoy!

http:// q-questions.info/research-tool.php

▶Anonymous 04/13/18 (Fri) 21:07:02 70e498 (69) No.1028050>>1028183 >>1030259

I've discovered the machine broke for a few hours on March 27-28 and I'm missing some json. Am I the only one saving off json or does some other codefag have some to send my way?

PageScraper to json?

▶Anonymous 04/13/18 (Fri) 21:21:10 70e498 (69) No.1028183

>>1028050

Nevermind. The JSON I needed had slid off the catalog but was still avail. Thanks CM!

▶Anonymous 04/14/18 (Sat) 00:13:35 07564d (94) No.1030259>>1034522

>>1028050

It probably should be part of my work eventually, but it isn't yet. It's taken some time to get to that contexting feature. I'm finalizing the algorithm now.

A context chain will begin with a post that has been listed in a bread post and go backward through the links. These are either from the top of the thread or later where the next baker is being told what to include.

Links will also be followed backward from Q posts.

Contexts will stop at bread posts and not include them. (The intent is for context chains to stick to one topic as much as possible.)

When a post that includes a map is encountered, the posts from the map will not be included in the context chain, but links from the text of the post will be included. (Same reason as above: Maps include multiple topics.)

I will keep track of context chains that include Q posts. These can be shown with the Q posts. To minimize confusion, I will be displaying the context chains in separate bordered DIVs with a display/hide button. Not sure yet which to make default. Probably the hidden state to minimize clutter. I MIGHT parse the description of the leading post of the chain from the bread post into it. In the hidden state, this would be all that would show.
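The rules above could be sketched roughly like this. The Post structure, its field names, and the context_chain function are assumptions made for illustration, not the actual code behind the Research Tool.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    post_id: int
    text_links: list = field(default_factory=list)  # >>links found in the body text
    map_links: list = field(default_factory=list)   # posts listed in an embedded map
    is_bread: bool = False


def context_chain(start_id, posts):
    """Follow links backward from start_id, stopping at bread posts."""
    chain, seen, stack = [], set(), [start_id]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in posts:
            continue
        seen.add(pid)
        post = posts[pid]
        if post.is_bread:
            continue  # chains stop at bread posts and do not include them
        chain.append(pid)
        # map_links are deliberately ignored: maps cover multiple topics
        stack.extend(reversed(post.text_links))
    return chain


posts = {
    100: Post(100, text_links=[90], is_bread=True),
    101: Post(101),
    103: Post(103, text_links=[101], map_links=[50, 60]),
    105: Post(105, text_links=[103, 100]),
    50: Post(50),
}
print(context_chain(105, posts))
```

Starting from post 105 this walks to 103 and 101, skips bread post 100, and never visits post 50 because it is only reachable through the map.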

▶Anonymous 04/14/18 (Sat) 03:47:46 70e498 (69) No.1034522>>1040483

File (hide): 4bdff4d0782661a⋯.png (55.51 KB, 355x327, 355:327, ClipboardImage.png) (h) (u)

>>1030259

Interesting that you should post that anon, I've been thinking the same thing. We need a crawler. Sounds like a great idea. A better way of visualizing the context thread would be great. Ya know I've been reading about Google. PageRank. How that was designed in the beginning. Links you come across that have a lot of responses can be either good or bad on 8ch.

With the new breadID/postID feature I rolled out you could find anything you were missing for sure.

So you think your initial targets are just the baker posts and the other posts that are deemed notable?

I've been wondering if we could use a hashtag internally for our own benefit. #notable. That kind of thing.

It sounds like an interesting project. If I can help at all let me know.

▶Anonymous 04/14/18 (Sat) 16:37:43 73cc1f (28) No.1040483>>1040716

>>1034522

Hmmm…. I wasn't thinking about doing an indented method of arranging things. Should I be?

And if I knew how to pass off some of the work to others, I'd do it. It's a LOT for one person to do. One of the reasons it's been taking so long is because I'm still adding to the database, etc. If I had left the entire database online, perhaps? But the clowns were shitting things up with some truly raunchy stuff, and I didn't want to republish that. Truth is, though, that I've done some preliminary work on this already. It shouldn't take me long to finish the coding. But it will take a while to do the following:

-> properly identify the bread and map posts on some 2300 threads (Yes, this matters.)

-> identify the posts listed on the map posts

Even so, I've identified enough bread and maps already that some interesting stuff should begin floating to the surface. That's part of what is taking so long. The code is pointless without at least some of that done.

I'm eager to get to work on this. I lost an entire evening/night due to a power outage.

▶Anonymous 04/14/18 (Sat) 16:55:59 73cc1f (28) No.1040716>>1051320

>>1040483

I think what I'm getting at is that it's difficult to share the work without putting the entire database back online again. If I do that, I may have to do the following:

-> Buy dedicated hosting. If I do that, I'll be putting a donation button on the site for sure. So far, this has all been from my own time (a LOT of it) and resources.

-> Including a "report this post" button. Like I said, I don't want to be republishing truly obnoxious unrelated stuff. But it's all on me right now, and I can only do so much by myself. I'd have to let the community help me control that content.

But you know, really, the way I'm doing things now has a good side to it. There's a lot of fluff in the complete database. The way I'm doing things now eliminates a lot of that. You're going to get the dense info rich posts this way.

▶Anonymous 04/15/18 (Sun) 09:26:17 73cc1f (28) No.1049462>>1051320 >>1059305

The program can now save data for the contexting. Tomorrow (aka, after I wake up in the morning), I will be working on display.

▶Anonymous 04/15/18 (Sun) 15:25:31 70e498 (69) No.1051320

>>1049462

Nice.

>>1040716

I bought hosting from Godaddy. Unlimited bandwidth and 100GB storage. Economy plan on sale was $12/year. I think I even got another domain with that deal for $1/year that I'm not even using.

Yet.

I hear ya on time. My shit got bumped from the bread because 1 anon got confused about a malware notification. I've got 2 pretty solid months of time in on what I've been doing and got taken out by a single post.

As we reach more and more of the masses, the information is going to appear on more sites that show ads/donations. It's a way of paying for the infrastructure needed to provide the service. I see nothing wrong with it.

▶Anonymous 04/16/18 (Mon) 02:33:13 73cc1f (28) No.1059305>>1059896

File (hide): 73e1b2cc4c7a6c8⋯.png (54.42 KB, 1143x775, 1143:775, Research-Tool-1.png) (h) (u)

File (hide): d74c80a0e2d55d9⋯.png (87.41 KB, 1137x769, 1137:769, Research-Tool-2.png) (h) (u)

>>1049462

The Research Tool can now display context the way I described above EXCEPT that I have not built in a show/hide button yet.

Right now, you have available to you SOME context that I calculated during my initial work putting together a contexting feature a couple months ago. I have more up through the date on the first image, but I have to get an export/import process built to get it into the online database. Since I have an export/import system for the posts, it shouldn't take much to make a modified version for the contexts.

My current task list is:

-- Build the export/import process for the contexts.

-- Get the contexts calculated for the 2300 or so more threads that I currently have. This could take several days.

-- Then perhaps I'll look at getting that show/hide button in there. I might do it in the middle of working on getting the contexts calculated if I get bored of that.

-- After that, including POTUS tweets is next.

http:// q-questions.info/research-tool.php

▶Anonymous 04/16/18 (Mon) 03:03:24 70e498 (69) No.1059896>>1060038

>>1059305

Wow anon. It's coming together. It will be great to see it once finished.

Interesting what you are doing with the links. I think some of my pages are linking like the qcodefag sites. The RSS I hooked up to go back into the api. Think I should change that?

▶Anonymous 04/16/18 (Mon) 03:11:29 73cc1f (28) No.1060038

>>1059896

That's up to you and how you want to display your data. It might be cool to automate at least the downloading of new threads for what I'm doing. But to get the contexting right, I have to go through what comes in anyway. As mentioned before, not properly identifying bread and maps can overload the context chains.

▶Anonymous 04/18/18 (Wed) 09:01:36 73cc1f (28) No.1087614>>1088682

Contexting functionality is complete. The export/import process to make calculated contexts is complete.

I asked Anons on the general thread whether it is more important to calculate the contexts or to include POTUS tweets. The ones who responded want the tweets, so that's next.

▶Anonymous 04/18/18 (Wed) 10:45:45 51250b (1) No.1087924

>>1080429

I think the message 'we are being set up' is in response to the SC failing to pass the IMMIGRATION BILL. Also POTUS tweeted CA will not be accepting national guard to border.

https:// www.denverpost.com/2018/04/17/neil-gorsuch-immigration-law-vote/

▶Anonymous 04/18/18 (Wed) 13:13:38 70e498 (69) No.1088682>>1090066

>>1087614

Nice.

Let me know if you want to hit the smash data. I'll set you up.

I rejiggered the links on some of my pages. It was set up like the qcodefag sites where each post contained a link back here. I changed that to a self referencing link instead. I decided to not be the cause of any more traffic back here.

Statistics show that people coming to my site are primarily interested in the presentation pages - not the API. I think what I've decided to do is remove all references to the API - but still provide it. Default to the posts page or something. I got a few ideas.

▶Anonymous 04/18/18 (Wed) 16:22:22 73cc1f (28) No.1090066>>1090922

>>1088682

That would be great! A JSON source would speed that process along greatly.

▶Anonymous 04/18/18 (Wed) 17:44:52 70e498 (69) No.1090922>>1091428

>>1090066

Look at the SmashPosts

http:// qanon.news/Help/

Tell me what ya want and I'll see what I can work out.

▶Anonymous 04/18/18 (Wed) 18:13:34 73cc1f (28) No.1091190>>1091273

I'm looking at the help page, but I don't understand how to actually make the call to your API. It looks like the call I would want to use is

GET api/timeline/{2}

but I don't see how to actually implement it.

▶Anonymous 04/18/18 (Wed) 18:21:57 73cc1f (28) No.1091273

>>1091190

I think I figured out what I need to do. I just need to add the path to the URL.

▶Anonymous 04/18/18 (Wed) 18:39:09 73cc1f (28) No.1091428>>1092764

>>1090922

There are only 32 tweets in the JSON I got with api/timeline/2. There must have been more than that since October. Maybe I need a different call?

▶Anonymous 04/18/18 (Wed) 19:15:40 4a2958 (5) No.1091802>>1091988

My search-fu is nonexistent & need help for something current:

Somewhere within the past few weeks, someone posted a manual for Mueller firing protests. Didn't see it as a notable in BoB. Think it might have been pinched from ShariaBlue or the like. Thought it was a pdf, but not sure. Couple of screengrabs posted. In any case, it was a pretty thorough treatise on how to organize the march, chants, dealing with infiltrators (:D) and other stuff.

A couple of posts appeared today where one city (Pittsburgh) police department announced they were preparing for "semi-spontaneous" Mueller firing riots. That means they have that manual (but aren't disclosing it).

If we can find that manual again and post it all over that town's (and other) social media, it will awaken many to the fact that most of these protests are always preplanned.

Anyway, sorry for the hijack, but appreciate any help.

I just can't find it.

▶Anonymous 04/18/18 (Wed) 19:38:13 73cc1f (28) No.1091988>>1092170

>>1091802

Do you recall any words that would have been in the post?

▶Anonymous 04/18/18 (Wed) 19:57:29 4a2958 (5) No.1092170>>1092387 >>1092813

>>1091988

Someone found the site where it was from in the current bread:

https:// act.moveon.org/event/mueller-firing-rapid-response-events/search/

I could have sworn it was the whole "rapid events response manual" from MoveOn or allied organizations as a standalone doc.

"Mueller" would return too many hits.

Maybe Mueller + fired + protest(s) or something. Maybe add "plans" or "manual"

This is why their Mueller firing riots plan should get out into the public domain before any protests occur:

http:// pittsburgh.cbslocal.com/2018/04/18/robert-mueller-pittsburgh-police-prepare-riots-if-trump-fires/

Normies will realize how scripted all these protest marches are.

On phone so can't grab the whole site.

TY for any help!

▶Anonymous 04/18/18 (Wed) 20:21:44 73cc1f (28) No.1092387>>1092397

>>1092170

Try these. I was not able to find any PDF files posted recently about this.

>>208025

>>209411

>>211959

>>214550

>>674819

>>725107 (Unfortunately, I was not able to find the fullsize image of this one. Put a request on the general thread as well as Lost & Found if you really want it. Ask them to put it in the Lost & Found thread so you can find it if you look later.)

>>1003999

▶Anonymous 04/18/18 (Wed) 20:22:57 73cc1f (28) No.1092397

>>1092387

Looks like those got deleted. I'll make them available on the Research Tool for you.

http:// q-questions.info/research-tool.php

Look in a few hours. I have to run to an appointment right now.

▶Anonymous 04/18/18 (Wed) 21:08:05 70e498 (69) No.1092764>>1096204

>>1091428

The Smash API will give you more of the data you want.

You probably don't want the timeline stuff just yet. Unless you want to just stick with the default q/DJT timeline. Just do a get on the timeline API. The timeline API filters out all the tweets to just show the 5,10,15… deltas.

Yeah, gotta add the full path to the URL. If you are hitting it programmatically I gotta give you access. Domain you would be calling it from?

▶Anonymous 04/18/18 (Wed) 21:13:51 7ceb42 (1) No.1092813>>1092887 >>1092973

>>1092170

I believe you are talking about this website:

https:// act.moveon.org/survey/resistance-recess-host-materials

▶Anonymous 04/18/18 (Wed) 21:22:15 4a2958 (5) No.1092887

>>1092813

Yes, that's most of the material, but it had been put into a document (pdf or doc, I think) and indexed.

Much easier to forward a doc to which notes can be added than point normies to a site which is hostile-owned. That document (in whatever format) contained all the articles on that page and more. Was well done by somebody.

▶Anonymous 04/18/18 (Wed) 21:32:00 4a2958 (5) No.1092973>>1093667

>>1092813

Somebody found it!

www.scribd.com/document/375930782/Nobody-is-Above-the-Law-Mueller-Firing-Rapid-Response-Moveonorg-Protest-Guide

This is the basic protest manual all Soros/ SEIU and associated groups use.

Great doc to hand out to redpill people. Leave the redline the Mueller title and add the protest du jour.

Found in this bread:

https:// 8ch.net/qresearch/res/1092389.html#1092719

▶Anonymous 04/18/18 (Wed) 22:40:28 73cc1f (28) No.1093667>>1093804

>>1092973

I'm glad you found it. I'm beginning to think that I need to get the entire database back up there again, even if I have to not upload the images. We've had a couple of search requests like this for which I've had the data. In this case, the original posts had been removed, which would explain why he couldn't find it.

▶Anonymous 04/18/18 (Wed) 22:51:23 4a2958 (5) No.1093804>>1093832 >>1093909

>>1093667

Since it was in a Scribd doc, not sure it would have been found anyway, unless someone commented on it using key words.

I couldn't even hazard a guess as to what percentage of information here since Day 1 is critical vs. otherwise. Throughout it all, it's painting pretty clear pictures of the players & their proclivities, even if we haven't found a smoking gun yet.

In any case, thanks again for everyone's efforts.

▶Anonymous 04/18/18 (Wed) 22:54:42 73cc1f (28) No.1093832

>>1093804

One of the posts I found would have led you there.

▶Anonymous 04/18/18 (Wed) 23:01:32 73cc1f (28) No.1093909

>>1093804

There is an awful lot of absolute garbage posts out there, to be sure. And now that there are over 1.5 million of them, there is no way one person can censor out the stuff that absolutely should not be republished. I don't like the idea of putting all of the unreviewed stuff up there without their images, either, since a lot of the intel is in those images. It's a tough call. Even though I do have a content warning on the research page, I have concerns about the legal side of just blindly posting some of those images. I most definitely couldn't do it without a reporting feature.

▶Anonymous 04/19/18 (Thu) 02:16:09 73cc1f (28) No.1096204>>1096311

>>1092764

I was hoping for the complete set of Trump tweets since Q showed up in late October. Do any of your API calls provide that?

▶Anonymous 04/19/18 (Thu) 02:22:00 70e498 (69) No.1096311>>1099789

>>1096204

Well you can get all those from the trumptwitterarchive. What I did is group them into days that Q posted, and then only calculated the ones that DJT tweeted after Q posted.

If you check the API you can see the data, or look at http:// qanon.news/smashposts.html to see it more visually.
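One possible reading of that grouping, as a sketch: bucket tweets by calendar day, keep only days on which Q posted, and keep only tweets later than the first Q post of that day. The "first Q post of the day" cutoff and the function name are assumptions; the actual smash logic is not shown in the thread.

```python
from collections import defaultdict
from datetime import datetime


def tweets_after_q_by_day(q_times, tweet_times):
    """Group tweets by day, keeping only tweets made after Q first posted that day."""
    first_q = {}
    for t in q_times:
        day = t.date()
        if day not in first_q or t < first_q[day]:
            first_q[day] = t

    grouped = defaultdict(list)
    for t in sorted(tweet_times):
        day = t.date()
        if day in first_q and t > first_q[day]:
            grouped[day].append(t)
    return dict(grouped)


q_posts = [datetime(2018, 4, 10, 12, 0), datetime(2018, 4, 10, 15, 0)]
tweets = [
    datetime(2018, 4, 10, 11, 0),   # before Q posted: dropped
    datetime(2018, 4, 10, 12, 10),  # after Q posted: kept
    datetime(2018, 4, 11, 9, 0),    # no Q post that day: dropped
]
print(tweets_after_q_by_day(q_posts, tweets))
```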

▶Anonymous 04/19/18 (Thu) 09:46:54 73cc1f (28) No.1099789>>1100463

>>1096311

> trumptwitterarchive

Thank you for the suggestion. That was what I was looking for. I've got it done now.

>>1099779

▶Anonymous 04/19/18 (Thu) 13:27:40 70e498 (69) No.1100463>>1100644 >>1100681

>>1099789

You are on it!

Pain having to get the 2017 and then the 2018 from TrumpTwitterArchive but… it's the only way.

I guess I could suck all that in and then offer it as an api… just raw twitter data.

The only thing I found with the twitter data is that there's a 9 day gap in January at the beginning of 2018. I've been fighting off a compulsion to archive those (manually) to make it complete.

>>1099789

css : You can just use the twitter magic.

https:// dev.twitter.com/web/overview

On the smash page I just make links and decorate with the bird and tweet. The timeline does it automagically.

Here's a question for you.

How hard would it be for you to remove all the inline style you have on q-questions.info/research-tool.php ?

Do you know about jqueryui themeroller?

Conjigger your jqueryUI website and then download the custom css like magic.

▶Anonymous 04/19/18 (Thu) 13:58:09 73cc1f (28) No.1100644

File (hide): 53c8eeb1f110c85⋯.png (85.12 KB, 1207x839, 1207:839, Research-Tool-1.png) (h) (u)

File (hide): 9a50c1627d356f1⋯.png (150.58 KB, 1209x841, 1209:841, Research-Tool-2.png) (h) (u)

>>1100463

I've pulled it into the same database that contains the chan posts. I don't want to make too many exceptions to how I do things. That makes it more difficult to keep track of what is for what.

▶Anonymous 04/19/18 (Thu) 14:05:32 73cc1f (28) No.1100681>>1100740

>>1100463

Eventually, the cream of the project will be going into that WP site that's at the front of the URL. That will take care of appearances nicely.

▶Anonymous 04/19/18 (Thu) 14:14:09 73cc1f (28) No.1100740

>>1100681

And all of the text is back up there now. People won't have to request searches anymore.

▶Anonymous 04/20/18 (Fri) 19:44:28 73cc1f (28) No.1117987>>1119101 >>1119154

File (hide): 3846409f0747842⋯.png (839.64 KB, 1232x881, 1232:881, offline_only.png) (h) (u)

>>1106873

>Archive OFFLINE immediately.

>Offline only.

>>1117592

I'm not sure what was meant by the recent Q post. Does that affect our work? And what is the scope of the request?

▶Anonymous 04/20/18 (Fri) 21:24:32 70e498 (69) No.1119101>>1125149

>>1117987

Kinda wondering about that myself.

IMO, he was talking specifically about the NP/NK video. Many have archived that offline.

On one hand, I'm archiving online - but that makes it easier for others to archive.

On the other hand - I'm archiving at home too.

The online stuff I'm doing has no bearing on my archives. I put it online so others could use it.

▶Anonymous 04/20/18 (Fri) 21:30:21 4202ae (1) No.1119154>>1134796

>>1117987

Hardcopies. Print out things. Copy files to USB/CD/DVD. Place inside of safe or better yet faraday cage. Use means that are hard to destroy and items that are not online and can be erased via virus or EMP. It's not just for you, but for the Country. Think that everyone is an off line version of "the cloud" but with a hard copy.

▶Anonymous 04/21/18 (Sat) 04:04:25 73cc1f (28) No.1125149

>>1119101

That was the reason I finally ended up putting it online as well. It seemed a shame to keep that functionality to myself. I reworked a few things to make it better in a multiuser environment. It ended up being better for myself as well.

▶Anonymous 04/21/18 (Sat) 20:35:14 73cc1f (28) No.1134796

>>1119154

I believe USBs are magnetic. CDs and DVDs are your best choices.

▶Anonymous 04/23/18 (Mon) 20:03:07 73cc1f (28) No.1159341

Don't mind me. I'm just trying to find some missing posts.

>>309741, >>309240, >>209205

▶Anonymous 04/23/18 (Mon) 20:16:50 adc966 (1) No.1159568

>>524371

Anyplace we can download your stash?

▶Anonymous 04/24/18 (Tue) 02:22:34 73cc1f (28) No.1164429

Well, now I feel stupid. I just realized there's an "Expand all images" link in the lower right of the page header. Had I realized this, I would not have lost so many full size images. One save could have been done in thumbnail mode, and another in full size mode, and I would have had everything on the page.

▶Anonymous 04/24/18 (Tue) 03:03:11 73cc1f (28) No.1164952

File (hide): c2a311efc179b6a⋯.png (145.3 KB, 1000x843, 1000:843, top-of-page.png) (h) (u)

The ctrl-S method of saving a page will NOT automatically pick up the full size images when in thumbnail mode. If the page is expanded when the save is done, then you'll get the full size images (but not the thumbnails, though this is a minor issue).

So here's my suggestion to get the best archiving:

You can save once or twice, but one of the saves should be in expanded mode. If you want the thumbnail mode as well, then that's a separate save.

(All of the official archives so far have been in thumbnail mode.)

▶Anonymous 04/24/18 (Tue) 19:27:45 70e498 (69) No.1171658>>1176955

Huh. Anon never showed up to drop his image link on us?

▶Anonymous 04/25/18 (Wed) 03:28:36 73cc1f (28) No.1176955>>1235051

>>1171658

Not yet, apparently. It's a lot of files. It's going to take some time to upload them all, possibly a few days. Even my thumbnail image set takes a long time to upload.

▶Anonymous 04/29/18 (Sun) 16:51:07 90e281 (20) No.1235051>>1239115

>>1176955

Still nada.

▶Anonymous 04/29/18 (Sun) 17:10:04 407540 (1) No.1235261

>>1234889

You are so wrong faggot

She only asks that her comments page is respected, that's who she deems as her people. Do some research before you fuck up your own opinions next time

▶Anonymous 04/29/18 (Sun) 22:12:23 131565 (22) No.1239115

>>1235051

Unfortunately, we're anonymous here. I have no idea how we can even check on something like this.

▶Anonymous 05/01/18 (Tue) 03:19:00 90e281 (20) No.1256952

Anon asked about the JSON for all Q posts.

The API is still there, I just removed all the links.

http://qanon.news/Help/

▶Anonymous 05/01/18 (Tue) 14:58:03 90e281 (20) No.1261359

Anon asked for a word count in all Q posts and I did it really quick. Just gonna drop this here.

Here's the results, sorted by occurrences.

https://pastebin.com/e1u1jxR2
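A quick count like that can be done in a few lines. This is only a sketch of the general approach, not the script that produced the pastebin; the tokenizing rule (lowercase letters, digits and apostrophes) is an assumption.

```python
import re
from collections import Counter


def word_counts(posts):
    """Count words across all post texts, most frequent first."""
    counts = Counter()
    for text in posts:
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts.most_common()


sample_posts = ["Alpha beta alpha.", "Beta gamma, ALPHA!"]
for word, n in word_counts(sample_posts):
    print(n, word)
```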

▶Anonymous 05/04/18 (Fri) 05:30:13 4c77f1 (1) No.1293496>>1300223

>>991237

>>1024641

U was just at the bottom of their post. Like…

I'll search for the posts you added to your db. Thanks for hunting around for them!

#WWG1WGA #TheGreatAwakening #ItsSpiritual

▶Anonymous 05/04/18 (Fri) 11:21:31 354bf8 (1) No.1294544

File (hide): 0a76d8e37c88540⋯.gif (22.25 KB, 334x379, 334:379, 54790.gif) (h) (u)

Why Did George Bush Buy Nearly 300,000 acres in Paraguay?

▶Anonymous 05/04/18 (Fri) 22:29:39 0e52bb (1) No.1300223

>>1293496

I finally found one by grepping around in the json files. I'm searching for more, but here's an example.

https:// 8ch.net/qresearch/res/932740.html#933285
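Instead of plain grep, the same hunt can be sketched in Python. This assumes the saved thread JSON has a top-level "posts" list where each post carries a number in "no" and an HTML body in "com"; check your own files, since that layout is an assumption here. The function name posts_signed is hypothetical.

```python
import json
import re


def posts_signed(thread_json, signature):
    """Return post numbers whose body ends with the signature as a standalone word."""
    thread = json.loads(thread_json)
    pattern = re.compile(r"(?:^|\s)" + re.escape(signature) + r"\s*$")
    hits = []
    for post in thread.get("posts", []):
        # Crude tag strip: replace every HTML tag with a space.
        body = re.sub(r"<[^>]+>", " ", post.get("com", ""))
        if pattern.search(body):
            hits.append(post.get("no"))
    return hits


# Invented sample thread for illustration only.
sample = json.dumps({"posts": [
    {"no": 1, "com": "<p>Be well.</p><p>U</p>"},
    {"no": 2, "com": "<p>Copy it to USB</p>"},
    {"no": 3, "com": "<p>U said so</p>"},
]})
print(posts_signed(sample, "U"))
```

To run it over a folder of saved threads, read each file's text and pass it to posts_signed.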

▶Anonymous 05/04/18 (Fri) 22:42:18 de2dc6 (1) No.1300381

Coincidence?

https://www.bostonglobe.com/news/nation/2018/05/04/kerry-quietly-seeking-salvage-iran-deal-helped-craft/2fTkGON7xvaNbO0YbHECUL/story.html

▶Anonymous 05/05/18 (Sat) 21:09:36 19c17f (6) No.1310801>>1310819 >>1310983 >>1313635

Sorry for popping in on you re this but the Anon I was speaking with about "time stamps and markers" said to come here. They have info for me to be able to start working on it.

There was a thread dedicated to this but appears to be missing now or I keep missing it.

▶Anonymous 05/05/18 (Sat) 21:11:47 19c17f (6) No.1310819

>>1310801

BTW, great work Anons!!

I wish I could help but a bit beyond my capabilities.

▶Anonymous 05/05/18 (Sat) 21:30:50 131565 (22) No.1310983>>1311143 >>1313047

>>1310801

I'm a little behind in my archives at the moment, but that should be remedied by this evening. (I was busy working on my tools.) My site is a good one for looking at tweets vs. Q posts because I can show them on the same timeline.

http://q-questions.info/research-tool.php

▶Anonymous 05/05/18 (Sat) 21:47:57 19c17f (6) No.1311143>>1312967 >>1313371

>>1310983

That is fine and thank you. Can you tell me what is Q's marker that I should look for?

▶Anonymous 05/06/18 (Sun) 00:43:28 131565 (22) No.1312967>>1313371

>>1311143

Q's trip codes are listed at the top of the general threads on this board. On my site, known Q posts are shown in green.

▶Anonymous 05/06/18 (Sun) 00:51:39 131565 (22) No.1313047

>>1310983

For some reason, posts on Q's new board aren't saving properly (except the first post on the thread). But I'll have everything else up there shortly.

▶Anonymous 05/06/18 (Sun) 01:19:56 19c17f (6) No.1313371>>1313627

>>1311143

>>1312967

Got the - http://q-questions.info/research-tool.php

Got - https://qanon.pub/

and another that has actual screenshots of Q's posts

been doing some research on Q's marker and need to clarify then will start on "wind the clock"

Much appreciate all the help - I need to do a better job at bookmarking important info on decoding Q.

▶Anonymous 05/06/18 (Sun) 01:44:49 19c17f (6) No.1313627>>1313667

>>1313371

Found this in QMap PDF thread. Going to try to locate Anon because no sense on duplicating work.

"Anonymous 01/28/18 (Sun) 10:17:16 ID 3c320a No.190706

>>187088

Thank you for all this hard work. One thing that I think would really help. If the book could include all the Q post with Time Stamps including the early post before trip code. This needs to be searchable by time stamps (EST). The time stamps and dates could be either with each post or in the front with reference to the post. I find that the time stamps are important to first identify Markers. I’m currently have to jump from time Stamp Search to Marker Search and most data bases I use are not complete with latest posts. This would be extreamly helpful. Thank you Anon. Truly a Patriot! One other thing is some links to Q posts are 404 when link is clicked so I can’t find related time stamp."

▶Anonymous 05/06/18 (Sun) 01:45:36 90e281 (20) No.1313635>>1313775

>>1310801

You may be looking for the Delta thread.

>>410413

I think I told you to come here. I did some Delta work here

http://qanon.news/smashposts.html

That Delta is only considering the difference between a Q post and a DJT tweet. There is nothing in there to account for DJT corrections or deltas between tweets.

The deltas you see on the smashpost page are spread out across the Q posts - since there is a different delta for each.

IE: Q posts at 12:00p

DJT tweets at 12:10p [10] delta

Qposts at 12:05p <- this would also mean the DJT tweet at 12:10p is also [5] delta.

I did it like that because I wasn't sure of the meaning of all deltas. Is a [29] valid? Only on the 5's? Good luck anon! Let us know what we can do to help.
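The example above, worked in code as a sketch. The function name is made up, and dropping tweets that come before a Q post (negative deltas) is an assumption of this sketch.

```python
from datetime import datetime


def deltas_for_tweet(tweet_time, q_times):
    """Minutes from each earlier Q post to one tweet; negative deltas are dropped."""
    deltas = []
    for q in sorted(q_times):
        minutes = int((tweet_time - q).total_seconds() // 60)
        if minutes >= 0:
            deltas.append((q, minutes))
    return deltas


q_posts = [datetime(2018, 5, 5, 12, 0), datetime(2018, 5, 5, 12, 5)]
tweet = datetime(2018, 5, 5, 12, 10)
for q, minutes in deltas_for_tweet(tweet, q_posts):
    print(q.strftime("%H:%M"), f"[{minutes}]")
# -> 12:00 [10]
# -> 12:05 [5]
```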

▶Anonymous 05/06/18 (Sun) 01:48:30 90e281 (20) No.1313667>>1313946

>>1313627

I think most everything we've been doing here has all been resolved to either GMT or Zulu time. 8ch JSON comes in GMT/Zulu. The TrumpTwitterArchive comes in GMT/Zulu.

Correct me if I'm wrong codefags.
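For anons who need EST, a small conversion sketch. It assumes the JSON timestamps are Unix epoch seconds, and it uses a fixed UTC-5 offset that ignores daylight saving, so treat both as assumptions.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for illustration only; it does not switch to daylight time.
EST = timezone(timedelta(hours=-5), "EST")


def epoch_to_utc(epoch_seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def utc_to_est(dt):
    """Shift an aware datetime to the fixed EST offset."""
    return dt.astimezone(EST)


stamp = epoch_to_utc(1525571310)
print(stamp.isoformat())
print(utc_to_est(stamp).isoformat())
```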

▶Anonymous 05/06/18 (Sun) 01:58:12 19c17f (6) No.1313775>>1314163

>>1313635

So, just to confirm, the Delta's are the marker?

And I will pop over there and see what is going on. Thank you.

The work being done via this thread is very important. Thank you Anons!!

▶Anonymous 05/06/18 (Sun) 02:12:01 131565 (22) No.1313946

>>1313667

Yes, my posts are saved in GMT also.

I've about got the issue with the new Q board taken care of. I just needed to tell my database about it. I'm getting those posts ready to upload now.

As it happens, I'm currently working on setting up special search types that you may find useful. One of those search types will show just the Q posts and POTUS tweets. That way, you won't have to think about the proper way to limit your searches if that is what you are after. Look for that in the next day or two. I'm still working on finalizing that feature.

▶Anonymous 05/06/18 (Sun) 02:32:57 90e281 (20) No.1314163>>1319771 >>1321452

>>1313775

The deltas are what helps you find the marker.

IE: Q posts something about "win" 5 mins later DJT posts something about "Goodwin" That's a marker. (Just an example - I don't remember the deltas on the goodwin marker.)

The Delta thread is where the work has been done on deltas. I'd like to see definitive documentation of confirmed markers.

▶Anonymous 05/06/18 (Sun) 19:22:28 131565 (22) No.1319771>>1326496

>>1314163

It would not be difficult at all to include calculations in my displays. So let me double-check what the logic should be.

When displaying a Q post

-- show delta since last Trump tweet.

When displaying a Trump tweet

-- show delta since last Trump tweet

-- show delta since last Q post

Is there anything else?
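
The display logic listed above could be sketched roughly like this. It is only an illustration, not the site's actual code: the event tuples, the `"q"` / `"tweet"` labels and the `since_tweet` / `since_q` keys are all made up for the example.

```python
from datetime import datetime, timezone

def add_deltas(events):
    """events: iterable of (kind, utc_datetime), kind is "q" or "tweet".

    Returns time-ordered rows. Every row gets the delta since the last
    tweet (if any); tweet rows also get the delta since the last Q post.
    """
    last = {"q": None, "tweet": None}
    rows = []
    for kind, when in sorted(events, key=lambda e: e[1]):
        row = {"kind": kind, "time": when}
        if last["tweet"] is not None:
            row["since_tweet"] = when - last["tweet"]
        if kind == "tweet" and last["q"] is not None:
            row["since_q"] = when - last["q"]
        rows.append(row)
        last[kind] = when  # remember the most recent event of each kind
    return rows

def at(minute):
    """Hypothetical helper: a UTC time on an arbitrary example day."""
    return datetime(2018, 5, 6, 12, minute, tzinfo=timezone.utc)

for row in add_deltas([("tweet", at(0)), ("q", at(10)), ("tweet", at(15))]):
    print(row)
```

Because the deltas are computed after sorting, the search that produced the events does not need to change.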

▶Anonymous 05/06/18 (Sun) 22:42:06 131565 (22) No.1321452>>1326496

>>1314163

I added delta calculations. Check it out and let me know if it's what you need.

http://q-questions.info/research-tool.php

▶Anonymous 05/07/18 (Mon) 13:36:44 90e281 (20) No.1326496>>1329318 >>1329416

>>1321452

>http://q-questions.info/research-tool.php

Looking good!

Checking the Show Delta box seemed to kill off any results for me tho. I'll try again later!

>>1319771

I believe you are nearly correct.

Once you have found a [marker], then the time between DJT tweets/Corrections appears to be the indicator of another marker. I don't think it goes back to a Q post delta.

Check the logic for the [5] & [1] markers.

I disregarded all negative deltas (any tweet BEFORE a Q drop). There's information there possibly - but it just introduced too much noise into the results.
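
A rough sketch of the "throw away negative deltas" filter, for anyone who wants to reproduce it. The one-hour window is an assumption added here to keep the output small; it is not something stated in the thread.

```python
from datetime import datetime, timedelta, timezone

def positive_deltas(q_times, tweet_times, window=timedelta(hours=1)):
    """Pair each Q drop with tweets posted at or after it, within `window`.

    Returns (q_time, tweet_time, whole_minutes) tuples. Tweets posted
    BEFORE the Q drop (negative deltas) are skipped.
    """
    pairs = []
    for q in sorted(q_times):
        for tw in sorted(tweet_times):
            delta = tw - q
            if timedelta(0) <= delta <= window:
                pairs.append((q, tw, int(delta.total_seconds() // 60)))
    return pairs

drop = datetime(2018, 5, 7, 12, 0, tzinfo=timezone.utc)
tweets = [drop - timedelta(minutes=3), drop + timedelta(minutes=5)]
print(positive_deltas([drop], tweets))
```

Filtering the resulting minute values for 5 or 1 would then be one way to list candidate [5] and [1] deltas for manual review.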

▶Anonymous 05/07/18 (Mon) 19:54:28 131565 (22) No.1329318

>>1326496

I didn't even attempt to find the series. I'm simply showing the delta between the last of either. I suppose I could. So what is the pattern we are looking for?

▶Anonymous 05/07/18 (Mon) 20:07:38 131565 (22) No.1329416

>>1326496

Not sure how checking the box kills results. The logic of the check box is implemented in a way that does not affect the search logic. The deltas are calculated after the fact. The actual SQL statement that creates the results is at the top of the page. That doesn't change. Still, I've seen unusual and unexpected things before. What are you seeing that has you thinking there's a difference in the search results?

▶Anonymous 05/08/18 (Tue) 22:43:28 131565 (22) No.1341497>>1381398

Never mind. The whole darn thing broke. I had overhauled the search logic to better support the data prep steps, and I guess stuff got messed up in the process. When I get done being disgusted about that, I'll fix it.

▶Anonymous 05/09/18 (Wed) 00:34:56 8dde8e (1) No.1342663

>>1341498

#68

I wonder if what Q is referring to is the legal status of the US. Macron brought a new contract for Trump to sign in conservatorship. That the old legal status with the Rothschilds is no longer in effect due to bankruptcy.

▶Anonymous 05/12/18 (Sat) 06:08:59 131565 (22) No.1381398

>>1341497

I have no idea how defined() can return FALSE and yet the value be correctly set. Anyway, the program has been fixed, I believe.

▶Anonymous 05/17/18 (Thu) 19:01:53 90e281 (20) No.1445611>>1446207

Hows it looking you faggots? Things progressing as designed?

I got a nagging image issue sorted out. Now archiving Q images and reference images to my site. Just about ready to get back on the elasticsearch idea.

▶Anonymous 05/17/18 (Thu) 19:53:40 131565 (22) No.1446207>>1446391

>>1445611

I have no idea what elasticsearch is. Would you care to explain?

I'm still working on things. At the moment, I'm adding some editing features to the research-tool version of things that I'd had in a prior tool. If you've noticed, older posts on my site have thumbnails and screenshots of links from the posts. And I've also started some work on the flagging feature so that I can feel better about putting all of the images back online rather than just selected ones.

▶Anonymous 05/17/18 (Thu) 20:06:53 90e281 (20) No.1446391

>>1446207

Superfast multitenant full text search for JSON. Clients in Java, C#, PHP, Python, Apache Groovy, Ruby etc…

I think all I need to do is write something that will input all my json into my local elasticsearch instance and then all lights are for go.
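
A minimal sketch of what that loader might look like. Elasticsearch's bulk endpoint takes newline-delimited JSON: one action line, then the document, with a trailing newline at the end. The index name `qresearch` and the `no` / `com` field names are assumptions for illustration; older Elasticsearch releases may also expect a `_type` in the action line.

```python
import json

def to_bulk_ndjson(posts, index_name="qresearch"):
    """Build the body for a POST to /_bulk from a list of post dicts.

    Each post is expected to have a unique "no" (post number), which is
    reused as the document id so re-running the import overwrites
    rather than duplicates.
    """
    lines = []
    for post in posts:
        action = {"index": {"_index": index_name, "_id": str(post["no"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(post))
    return "\n".join(lines) + "\n"  # bulk bodies must end with a newline

sample = [{"no": 1313667, "com": "8ch JSON comes in GMT/Zulu."}]
print(to_bulk_ndjson(sample), end="")
```

The resulting string would be sent with a `Content-Type: application/x-ndjson` header; batching a few thousand posts per request is usually kinder to a local instance than one giant body.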

▶Anonymous 05/22/18 (Tue) 17:23:53 7daa5d (1) No.1506424>>1591211

File (hide): ffe64e00e6b4a69⋯.jpg (378.29 KB, 1200x900, 4:3, 1459522989864.jpg) (h) (u)

I've heard whispers of Q + Team posting at set time intervals

Worthwhile to investigate

How to visualize?

Side by side threads (yes, whole threads!) + time lines (with colours)

Helluva Job, No doubt, but who else to ask .. ?
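
Before building side-by-side timelines, a cheap first check is to count how often each gap between consecutive posts occurs. This is only a sketch under the assumption that post times are available as Unix epoch seconds; the one-minute bucket size is arbitrary.

```python
from collections import Counter

def interval_counts(epoch_times, bucket_seconds=60):
    """Count gaps between consecutive posts, grouped into buckets.

    Returns a Counter mapping bucket number (gap // bucket_seconds)
    to how many times a gap of that size occurred.
    """
    ordered = sorted(epoch_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return Counter(gap // bucket_seconds for gap in gaps)

# Hypothetical post times: two 5-minute gaps and one 1-minute gap.
print(interval_counts([0, 300, 600, 660]))
```

A spike at one bucket would be the thing to look at more closely; a flat spread suggests no fixed interval.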

▶Anonymous 05/24/18 (Thu) 18:50:03 6324a9 (2) No.1529968>>1530201 >>1548098

File (hide): 8dd1a603b9caee8⋯.png (111.39 KB, 633x318, 211:106, Search suggestion.PNG) (h) (u)

Saw this on Qresearch and didn't know if it had any merit. Leave it to the experts.

▶Anonymous 05/24/18 (Thu) 19:16:54 90e281 (20) No.1530201>>1530470 >>1548098

>>1529968

MMmmm Yes I have. I like the idea.

There are many services out there that will allow you to do this or you can create your own blockchain w/ ethereum.

Were you thinking just qposts or all qresearch?

▶Anonymous 05/24/18 (Thu) 19:50:51 6324a9 (2) No.1530470

>>495005

>>1530201

>Were you thinking just qposts or all qresearch?

>>496431

>My goal is to see ALL of the board searchable

Please see the first thumbnail in the OP, and the post referenced here. (I'm the OP)

▶Anonymous 05/26/18 (Sat) 15:37:43 2ffc90 (1) No.1548098

>>1529968

>>1530201

htt ps://ste emit DOT com SLASH wikileaks/@ausbitbank/the-great-wizard-of-leaks-a-blockchain-fantasy-action-adventure-epic

▶Anonymous 05/30/18 (Wed) 23:14:16 131565 (22) No.1591211

>>1506424

Just got back from vacation and saw this. My site can display Q posts and Trump tweets in the same search results in time order.

http://q-questions.info/research-tool.php

Because of that, my archive is over a week behind at the moment. I should be more current in a few hours.

▶Anonymous 06/13/18 (Wed) 19:43:53 131565 (22) No.1732451>>1733098

Last night, anons were discussing the fact that the chans are part of history. Concern was expressed about the shill impacts on the boards and that perhaps there needed to be a cleaner view of it all. I suppose one answer could be to get back to the original purpose intended for the private version of my database, which is to identify what should be included in the blog that is in the root directory of the site. I haven't actually updated anything there in quite a while. Maybe it's time to get back to that.

▶Anonymous 06/13/18 (Wed) 20:13:39 90e281 (20) No.1733098>>1733203 >>1733870

>>1732451

Sounds like a good idea. Probably a lot of work!

▶Anonymous 06/13/18 (Wed) 20:18:45 4ca1a6 (1) No.1733192

>>1732671 (prev.)

I have heard estimates of Roth wealth in the area of 400-500 Trillion dollars.

▶Anonymous 06/13/18 (Wed) 20:19:21 131565 (22) No.1733203

>>1733098

It HAS been a lot of work and will continue to be. I've been coasting for a bit, just making sure that the general threads have been archived and made available. But there's also a lot of processing to do with the data if the ultimate goal is to be achieved as imagined. Kinda wish there was a way to safely share the work.

▶Anonymous 06/13/18 (Wed) 21:02:15 90e281 (20) No.1733870

>>1733098

I heard that. I coasted about 2 weeks for the same reason. I've been working on tightening up the site and working on small bugs I've found.

I implemented a search for Q posts and am working on the big bread search now.

▶Anonymous 06/13/18 (Wed) 21:16:52 131565 (22) No.1734049>>1735439

One of the tricky things about making my research tool available publicly is that the platforms are different. Different operating system, different database, and (apparently) different PHP. So I may have something working perfectly on my development machine, but I find there are problems when I try to share it. If the focus is to prepare the blog, which is an abridged view, then maybe I shouldn't sweat it if what I have shared publicly doesn't always work?

▶Anonymous 06/13/18 (Wed) 23:12:46 90e281 (20) No.1735439>>1736163

>>1734049

Ahh you've entered the big new world of internet interoperability! The internet is great, but it's not always the easiest to move data from platform to platform.

It's one of the reasons I stuck with straight JSON. Platform independent. Easily shared. Do you have the capability to transform into JSON/XML? What is your end goal? Share the database? Share the data? The app itself?

▶Anonymous 06/14/18 (Thu) 00:00:37 131565 (22) No.1736163

>>1735439

I probably do. It's all databased. I'd just have to put stuff into a structure and run json_encode() on it. Not sure it would be all that easy to put the advanced features into the JSON, though. It also doesn't solve the problem of making something accessible for non-techie types, which is my goal.
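
The "put stuff into a structure and encode it" step might look something like this. It is a sketch in Python with SQLite standing in for the real database; the `posts` table and its `no`, `time` and `com` columns are assumed names, not the actual schema.

```python
import json
import sqlite3

def rows_to_json(conn):
    """Dump the (hypothetical) posts table as a JSON array of objects."""
    conn.row_factory = sqlite3.Row  # lets each row convert to a dict
    rows = conn.execute(
        "SELECT no, time, com FROM posts ORDER BY no"
    ).fetchall()
    return json.dumps([dict(row) for row in rows], indent=2)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (no INTEGER PRIMARY KEY, time INTEGER, com TEXT)")
conn.execute("INSERT INTO posts VALUES (1736163, 1528934437, 'It is all databased.')")
print(rows_to_json(conn))
```

This covers sharing the data only; search features and editing tools would still live in whatever app reads the JSON.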

▶Anonymous 06/16/18 (Sat) 18:20:34 90e281 (20) No.1774255>>1963374

>>1773748

Big bread search update.

▶Anonymous 06/22/18 (Fri) 20:58:12 965f24 (2) No.1865219>>1865264

File (hide): 27a802435b8d244⋯.png (222.17 KB, 1330x741, 70:39, ss (2018-06-22 at 12.57.57….png) (h) (u)

http://YaCy.net -- distributed search engine – has 17 hits for clean query {Q Clearance Patriot}. Kek.

But we should probably download the software and seed a lot moar…

▶Anonymous 06/22/18 (Fri) 21:03:48 965f24 (2) No.1865264>>1865819

File (hide): d84d58488a73599⋯.png (156.88 KB, 1376x862, 688:431, download.png) (h) (u)

>>1865219

Just for kicks another search

▶Anonymous 06/22/18 (Fri) 21:50:56 131565 (22) No.1865819

>>1865264

Certainly a page could be made for telling people how to search the original sources. Maybe it could include input fields as well to help people get it right. Unfortunately, original sources have been hacked from time to time, and some material is no longer available.

▶Anonymous 06/23/18 (Sat) 10:29:10 cabbaa (2) No.1873487>>1876296

Hi there anons, just stumbled on this thread in my search for a collection of notables.

Anyone thought of putting them together in a thread/breads?

What were/would be pros/cons of doing such?

Data duplication, Too big etc.

Are there easy ways to make/view/access such collection?

▶Anonymous 06/23/18 (Sat) 17:42:11 131565 (22) No.1876296>>1879714

>>1873487

My project has the capability of searching by threads.

As for breads, I'd been working toward that, and I'll probably get back to it soon. The challenge of breads is a bit tougher because they must be identified. So far, my own solution has been a combination of automation and inspection.
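
One way the "automation plus inspection" split could work is sketched here. It assumes general threads carry a subject like "Q Research General #608: ..." (an assumption about the title format); anything that merely looks similar gets flagged for a human to check.

```python
import re

# Assumed subject pattern for bread threads.
BREAD_TITLE = re.compile(r"Q Research General\s*#\s*(\d+)", re.IGNORECASE)

def classify_thread(subject):
    """Return ("bread", number), ("inspect", None) or ("other", None)."""
    text = subject or ""
    match = BREAD_TITLE.search(text)
    if match:
        return ("bread", int(match.group(1)))
    if "general" in text.lower():
        return ("inspect", None)  # looks bread-like, needs a manual check
    return ("other", None)

print(classify_thread("Q Research General #608: Some Edition"))
print(classify_thread("QUEST FOR SEARCHABILITY"))
```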

▶Anonymous 06/23/18 (Sat) 23:09:45 cabbaa (2) No.1879714

>>1876296

Hey TY for getting back to me about this anon.

Your solution is similar to mine I see.

It is why I'd like to have a blogroll with exclusively notables, scraped from all breads by automation, so I could inspect the works thereof.
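
A small sketch of the scraping side, assuming notables appear in the bread as lines of the form ">>1234567 short description" (that layout is an assumption). Bare reply links and lines that are only a run of links are skipped.

```python
import re

# A post link followed by a description that is not itself another link.
NOTABLE_LINE = re.compile(r"^>>(\d+)\s+(?!>>)(.+)$")

def extract_notables(bread_text):
    """Return (post_number, description) pairs found in the bread text."""
    notables = []
    for line in bread_text.splitlines():
        match = NOTABLE_LINE.match(line.strip())
        if match:
            notables.append((int(match.group(1)), match.group(2).strip()))
    return notables

sample = ">>1313667\n>>1321452 Delta calculations added to research tool\n"
print(extract_notables(sample))
```

The output list could feed a blogroll page directly, with manual inspection reserved for entries whose descriptions look off.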

▶Anonymous 06/29/18 (Fri) 23:42:40 90e281 (20) No.1963374>>1971866

>>1774255

I may be back to Solr not being a good solution.

In trying to create a prebuilt index I've discovered that either

a) javascript just doesn't have enough memory to do it

b) javascript times out before it gets done and nothing happens.

I'm going to take a closer look at this

https://xapian.org/docs/bindings/csharp/

▶Anonymous 06/30/18 (Sat) 17:44:54 90e281 (20) No.1971866>>1972297

>>1963374

Moar testing today. Solr is NEVER going to work in this instance. I was hoping that I could just create an index on my dev machine and save that off and then use a worker process to add to the index. I've got one other idea to see if I can bend it to my will - but so far no workie. From what I can tell it's not possible to add to the index - it needs to be completely regenerated when you add a new document.

I don't understand how other people can add so many docs to the index and have it work. My tests were showing it to run for 12+ minutes just to generate an index and it never finished.

I'm open to new ideas if anybody has one.

The custom Google search I've got on there now does seem to work, but again it's not ideal. What I want is a list of POSTS that match, and the Google search seems to find the matches but only returns complete breads. You still have to Ctrl+F to find what you were looking for within the bread.

I can put together a test harness for Solr if anybody wants to see if they can figure out a way to make it go.
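
For comparison, here is a bare-bones inverted index that can be added to one post at a time and returns matching post numbers rather than whole breads. To be clear, this is not Solr or any particular search library, just a plain in-memory sketch of the behavior being asked for; the class and method names are made up.

```python
import re
from collections import defaultdict

class PostIndex:
    """Tiny in-memory inverted index: token -> set of post numbers."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, post_no, text):
        """Index one post; can be called any time without a rebuild."""
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            self._postings[token].add(post_no)

    def search(self, query):
        """Return sorted post numbers containing every query word."""
        tokens = re.findall(r"[a-z0-9]+", query.lower())
        if not tokens:
            return []
        sets = [self._postings.get(token, set()) for token in tokens]
        return sorted(set.intersection(*sets))

index = PostIndex()
index.add(1971866, "Moar testing today on the index")
index.add(1972297, "Move into a database for the big bread search")
print(index.search("bread search"))
```

Holding 1.5 million posts this way would take a fair amount of memory, so treat it as a test harness idea rather than a hosting plan.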

▶Anonymous 06/30/18 (Sat) 18:16:53 90e281 (20) No.1972297>>1993030 >>1997608

>>1971866

My gut is telling me that my next best option is to move into a database in order to accomplish the bigbreadsearch. It's probably possible to do using a hosted elasticsearch solution (https://www.elastic.co/cloud @$50/mo)

On the other hand, I think that I can write an app to fill a database in a couple hours, and it would solve a few of the problems I was seeing in the other search tech. Most of the good search engines will plug into a database anyways so I think this is probably the direction I'm headed.

▶Anonymous 07/02/18 (Mon) 02:21:11 131565 (22) No.1993030>>1997626

>>1972297

$50/month seems like a lot. My cost isn't nearly that much.

▶Anonymous 07/02/18 (Mon) 13:03:34 90e281 (20) No.1997608>>2004075

>>1972297

For elastic search?

▶Anonymous 07/02/18 (Mon) 13:08:43 90e281 (20) No.1997626>>2004084

>>1993030

>$50/month seems like a lot. My cost isn't nearly that much.

Derp. I clicked the wrong post.

I agree - which is why I haven't done anything on it. My hosting costs a bit more than that - ANNUALLY.

I feel like a DB is just going to be a better solution now. I'd hoped that I'd be able to just do everything with straight JSON - but alas! You cannot.

I guess I need to find the best search engine to plug a DB into now. I'm hoping to write the code to insert my existing data into the database today, write code to insert new data into the DB tomorrow.

▶Anonymous 07/02/18 (Mon) 23:59:40 131565 (22) No.2004075>>2013368

>>1997608

That sounds like a software lease.

▶Anonymous 07/03/18 (Tue) 00:01:03 131565 (22) No.2004084>>2013368

>>1997626

MySQL and MariaDB have a natural language search capability built in. Have you checked to see if that meets your needs?
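
For reference, a sketch of what such a query tends to look like. The SQL uses MySQL's `MATCH ... AGAINST (... IN NATURAL LANGUAGE MODE)` syntax, which needs a FULLTEXT index on the searched column. The `posts` table, `com` column and index name are assumptions, and the `%s` placeholders follow the style used by common Python MySQL drivers; nothing here connects to a server.

```python
# One-time setup: FULLTEXT index on the (assumed) post text column.
CREATE_INDEX_SQL = "ALTER TABLE posts ADD FULLTEXT INDEX ft_posts_com (com)"

# Relevance-ranked search over the indexed column.
SEARCH_SQL = (
    "SELECT no, com, "
    "MATCH(com) AGAINST (%s IN NATURAL LANGUAGE MODE) AS score "
    "FROM posts "
    "WHERE MATCH(com) AGAINST (%s IN NATURAL LANGUAGE MODE) "
    "ORDER BY score DESC LIMIT %s"
)

def build_search(query, limit=50):
    """Return (sql, params) ready to hand to a driver's cursor.execute()."""
    return SEARCH_SQL, (query, query, limit)

sql, params = build_search("clearance patriot", limit=10)
print(sql)
print(params)
```

This returns individual rows (posts) ranked by relevance, which is closer to a per-post result list than a whole-bread hit.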

▶Anonymous 07/03/18 (Tue) 09:33:55 e8d48e (1) No.2009885>>2013368

File (hide): fef4a8d9dda3055⋯.jpg (35.89 KB, 300x300, 1:1, QSEARCH.jpg) (h) (u)

https://resignation.info/scripts/8chan/search.php

Anons might find this useful.

Doesn't work so well with images but is good for keyword searches.

▶Anonymous 07/03/18 (Tue) 17:37:08 90e281 (20) No.2013368

>>2004075

Yeah. It's a hosted service. It appears that deploying a custom elasticsearch is probably a large pain in the ass most folks don't want to deal with.

>>2004084

I have SQLServer currently set up and my host gives me a database so I'll probably go with that.

>>2009885

WTFERK? We have like 3 bread searches already now? Am I totally wasting my time?

Regardless….

Interesting! Tell me more about how you are doing this. Search seems to be pretty quick. Are you using a DB backend? Straight text search? Is all this in PHP?

I've managed to import all the JSON data I have on hand. 1,569,777 posts took 25mins to import. My DB design is ultra simple. Single table that virtually matches the JSON data structure. There's no telling what the performance is going to be like just yet. Even getting a count takes 16 seconds. Ugh.

I'll run some simple tests later to see what I can figure out.
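
A minimal sketch of a single-table import along those lines, using SQLite as a stand-in for the hosted database. The column names (`no`, `resto`, `time`, `name`, `com`) are assumed to mirror the board JSON; adjust to whatever the real data holds.

```python
import sqlite3

def import_posts(conn, posts):
    """Load post dicts into one table and return the row count."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               no    INTEGER PRIMARY KEY,  -- post number, unique per board
               resto INTEGER,              -- thread the post replies to
               time  INTEGER,              -- Unix epoch seconds (UTC)
               name  TEXT,
               com   TEXT                  -- post body
           )"""
    )
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT OR REPLACE INTO posts (no, resto, time, name, com) "
            "VALUES (:no, :resto, :time, :name, :com)",
            (
                {
                    "no": p["no"],
                    "resto": p.get("resto", 0),
                    "time": p.get("time"),
                    "name": p.get("name"),
                    "com": p.get("com", ""),
                }
                for p in posts
            ),
        )
    # Secondary index so date-range queries do not scan the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_posts_time ON posts (time)")
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]

conn = sqlite3.connect(":memory:")
print(import_posts(conn, [{"no": 2013368, "time": 1530639428, "com": "test"}]))
```

Wrapping the inserts in one transaction and keying on the post number are the two cheap things that tend to help import time and repeat runs; whether they fix a slow count on another database engine would need testing there.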
