▶ No.1008423>>1008427 >>1008471 >>1008537 >>1010439 >>1010494 >>1032943 >>1034096
I have a fuckhuge imageboard folder, and the other day I had an idea: make a system that exposes my collection to the internet so other anons can download stuff and help organize it by submitting tickets to suggest changes (add, remove, move, rename).
So my question for you is this: Are there any existing solutions that I could set up that would accomplish this or would I need to cobble something together?
▶ No.1008427>>1008430 >>1032945
>>1008423 (OP)
Git repository and an issue tracker. For example github/gitlab.
▶ No.1008430>>1008432
>>1008427
github will shut him down for wrongthink and racism
▶ No.1008432>>1008435 >>1008440
>>1008430
That's why I said gitlab. Otherwise you can always self-host a gitlab instance.
▶ No.1008435>>1008437 >>1008440
>>1008432
Gitlab is also a Silicon Valley company
▶ No.1008437>>1008439 >>1008715
>>1008435
Self-hosted gitlab.
▶ No.1008439>>1008446
>>1008437
Maybe you can base it on some issue-tracking software, but you'd want some kind of gallery frontend so people can just browse it without needing to clone the whole repo.
Also, git works really poorly with binary files; your .git folder will balloon once you start making changes.
▶ No.1008440
>>1008435
>>1008432
Right now, hosting a gitlab instance looks the most appealing (I'm investigating it as we speak). I'd have to get a server and set it up, but that isn't beyond my abilities.
▶ No.1008446
>>1008439
Yeah, but I'm sure you'd sometimes like to upload something, remove something, etc., and eventually revert to older versions. If you know any versioning software suitable for such a case, please gib a link.
▶ No.1008471
>>1008423 (OP)
Try Internet Archive. ISIS used it and got away with it, you should too.
▶ No.1008520>>1008523 >>1034096
▶ No.1008523>>1008837 >>1014417
▶ No.1008530
If you're going the self-hosted route, Fossil has a built-in web interface. But I'm not sure if it's flexible enough to accommodate your collection of smugs.
▶ No.1008537>>1008540
>>1008423 (OP)
>50GB imageboard folder
>Fuckhuge
▶ No.1008540>>1008541 >>1034096
>>1008537
In over 12 years I've collected less than 1k files and I've deleted three quarters of them (though sometimes I wish I hadn't). Do you just save everything you see?
▶ No.1008541>>1008606
>>1008540
I use Hydrus to make sure I don't collect duplicates, prune regularly, and have a soft spot for webm threads, so I've got around three to four hundred GB spread out over 200,000 files.
▶ No.1008606>>1008807 >>1032656 >>1034096
>>1008541
>Hydrus
That looks incredible, but I'm skeptical that security is properly implemented. I wonder if there is a non-sharing fork available.
▶ No.1008715
>>1008437
>using 8GB of RAM for a git server
▶ No.1008722>>1008729 >>1034096
>not just dumping all your files in one folder with SHA-256 filenames and then using softlinks to category directories
>plebs using bloated software like hydrus and even using VCS instead of Doing It The UNIX Way(tm)
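Roughly this, as a sketch (the store/tag paths and the example category are placeholders):

#!/usr/bin/env python3
# Flat store keyed by SHA-256 plus softlinks into category folders.
# STORE, CATEGORY_ROOT and the example category are placeholders.
import hashlib
import os
import shutil
from pathlib import Path

STORE = Path("/data/store")          # flat folder, files named by their SHA-256
CATEGORY_ROOT = Path("/data/tags")   # browsable category dirs full of symlinks

def add_file(src: Path, category: str) -> Path:
    digest = hashlib.sha256(src.read_bytes()).hexdigest()   # fine for images; stream for huge files
    dest = STORE / (digest + src.suffix.lower())
    STORE.mkdir(parents=True, exist_ok=True)
    if dest.exists():
        src.unlink()                          # already in the store, drop the copy
    else:
        shutil.move(str(src), str(dest))      # move into the content-addressed store
    catdir = CATEGORY_ROOT / category
    catdir.mkdir(parents=True, exist_ok=True)
    link = catdir / src.name
    if not link.exists():
        os.symlink(dest, link)                # softlink keeps the human-readable name
    return dest

# add_file(Path("smug_anime_girl.png"), "reaction/smug")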
▶ No.1008729
>>1008722
>using directories instead of tags
You have to have 10K+ memes collected in order to post in this thread, newfag.
▶ No.1008807
>>1008606
Then fork it, rip out all the stuff that communicates anywhere, or go harass the dev. Wasn't he doing a strawpoll recently about what to implement next?
>>>/hydrus/
▶ No.1008913>>1010421
You should go through your own trash. You'll also be a lot happier when you trim off the fat, stop hoarding shit you don't/won't use, and only keep what you actually will. Personally, I've been organizing and adding to my document collection.
For now, you can find any information you want on the internet... but it won't be that way forever. Would be a shame if you wasted this time collecting reaction images for a dying medium.
>>1008856
Pretty much what OP wants.
▶ No.1010421>>1010440 >>1010451
>>1008913
Not that anon, but how would you decide what's worth keeping? The reason you/I archive in the first place is that you don't know whether you want to keep a file around.
▶ No.1010439
>>1008423 (OP)
Train an OpenCV image classifier using the Haskell bindings.
▶ No.1010440>>1010599 >>1034137
>>1010421
I archive because I know I want to use it, or have used it. I have no use for a million imageboard pics, so I don't save any. However, I have a 10GB folder of just documents (mostly pdfs). I suppose I value information more than data.
▶ No.1010451
>>1010421
>The reason you/I archive in the first place is that you don't know whether you want to keep a file around.
If you have hundreds of thousands of hoarded and untagged files, you'll never find what you're looking for should you ever need a specific file anyway.
▶ No.1010491
What about a db with image hashes mapped to a name and/or folder structure? The db is some sort of collective with voting etc. I assume it would get abused in like 3 sec though. Then just run a program that renames images according to the db. Also run it in reverse and upload the names of your image hashes. Then run an AI bot that merges the names into one.
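Sketch of the client side, assuming the shared db boils down to a hash -> name mapping you can fetch as JSON (the voting/merging layer is the actual hard part and isn't shown; names.json is made up):

#!/usr/bin/env python3
# Rename local images to whatever the shared hash->name db says,
# and collect your own hash->name pairs to submit back.
# names.json and the folder layout are assumptions, not a real API.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def rename_from_db(folder: Path, db_file: Path) -> None:
    db = json.loads(db_file.read_text())   # {"<sha256>": "agreed_name.png", ...}
    for p in folder.rglob("*"):
        if p.is_file():
            wanted = db.get(sha256_of(p))
            if wanted and p.name != wanted:
                p.rename(p.with_name(wanted))

def names_to_upload(folder: Path) -> dict:
    # the "run it in reverse" part: your own hash->name pairs for the db
    return {sha256_of(p): p.name for p in folder.rglob("*") if p.is_file()}

# rename_from_db(Path("memes"), Path("names.json"))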
▶ No.1010494
>>1008423 (OP)
>fuckhuge
Anon I...
▶ No.1010499
Maybe some booru software could import them all.
▶ No.1010599>>1010739 >>1011139
>>1010440
I'm also an archivist (mainly pdfs). We should build a network in the future, to make everything redundant.
▶ No.1010739
>>1010599
Someday maybe. The big problems I've had with the existing archives I've been picking clean so far are:
>most of the books aren't vetted terribly well, and only exist in the archive because somebody got them from another archive. No guarantees that anybody used them, or that they're even good references
>half the good shit shilled on the gentoomen/osdev/etc wikis isn't in them
>it's rare that anything referenced in bigger projects is in an existing archive (see the GCC source for examples)
>research papers and patents are rarely archived, you have to follow the trail of citations manually
The only solution I've come up with is doing it myself.
▶ No.1011139>>1033247
>>1010599
>We should build a network in the future, to make everything redundant.
We should, now the question is... how? Maybe we shouldn't be relying on the internet as much as we do.
Just use the Internet for coordinating a sneakernet of multiple terabytes per delivery, because for most people the Internet is much too slow, and it is certainly not anonymous or private.
What kind of PDFs, by the way?
▶ No.1011160>>1034138
I've been looking to start the same kind of thing and am currently considering NextCloud. I haven't verified it will work for this purpose, but it seems like it might.
Failing that, perhaps I can build a web front-end for a NAS that lets people just directly download files. Adding requests, etc, would be lovely. My main concerns are security and bandwidth consumption.
▶ No.1011554>>1014406
▶ No.1011555
If you could upload these to archive.org for now and work out the details of a distributed system later, that might help kickstart interest once we know what's available. Pack them into zips of 1GB each, or by subfolder, and upload; they accept any size, anytime.
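Packing by subfolder is only a few lines; rough sketch (paths are placeholders, and the `ia upload` CLI from the internetarchive package is one way to do the actual upload afterwards):

#!/usr/bin/env python3
# Zip every top-level subfolder of the collection into its own archive,
# ready to push to archive.org afterwards (e.g. with the `ia upload` CLI).
import shutil
from pathlib import Path

COLLECTION = Path("imageboard_folder")   # placeholder
OUT = Path("zips")
OUT.mkdir(exist_ok=True)

for sub in sorted(COLLECTION.iterdir()):
    if sub.is_dir():
        # creates zips/<subfolder>.zip; split further if one folder is way past 1GB
        shutil.make_archive(str(OUT / sub.name), "zip", root_dir=str(sub))
        print("packed", sub.name)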
▶ No.1014406>>1032656
>>1011554
Isn't there anything better than hydrus?
A program that doesn't lock you into it and doesn't change your folder/file structure?
▶ No.1014417
▶ No.1021945
Nginx + tor: just make a bare-bones basic webserver with an HSv3 onion.
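Minimal sketch of what that looks like (paths are placeholders, double-check the current Tor/nginx docs):

# /etc/tor/torrc - v3 onion service pointing at a local nginx
HiddenServiceDir /var/lib/tor/files_onion/
HiddenServiceVersion 3
HiddenServicePort 80 127.0.0.1:8080

# nginx site config - plain directory listing, bound to localhost only
server {
    listen 127.0.0.1:8080;
    root /srv/files;
    autoindex on;    # simple browsable index of the folder
}

The onion hostname ends up in /var/lib/tor/files_onion/hostname once tor restarts.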
▶ No.1032590
Torrent. Did one around 20gigs. I think people still seed it to this day.
▶ No.1032593
I use tags in the filenames so images are easier to find, because some images fit in multiple directories and 100GB is already too much to handle duplicates in.
>>1008856
and this
▶ No.1032656
>>1008606
>but I'm skeptical that security is properly implemented.
It runs offline if needed, what are you worried about?
Also you can just not use a public tag repo.
>>1014406
You can mimic a file/folder structure with tags, if for some reason you need that.
▶ No.1032897
Hold on, I've seen this exact thread before. OP, check the latter pages and see if the thread's still up.
▶ No.1032943
>>1008423 (OP)
You're running Windows, anon. I'm not sure you're up to this.
▶ No.1032945
>>1008427
Most git* services would take him down in like 6 days. It's a better idea to set up a VPS with a gitlab/gitea instance.
▶ No.1033247>>1033530
>>1011139
Different anon but I have 20000 copies of the anarchist cookbook.
▶ No.1033530
>>1033247
I have only this one and it's too big for this site.
▶ No.1034096>>1034097
>>1008423 (OP)
PROTIP 1: do not use windows for file hoarding.
PROTIP 2: nobody will sort your shit for you. ESPECIALLY NOT WITH FUCKING TICKETS. NO, REALLY, YOU EXPECT PEOPLE TO FILE TICKETS ABOUT WHERE TO PUT AN IMAGE? But someone might produce a better-sorted pack that you can incorporate into yours later, so putting your collection online is god's work.
PROTIP 3: forget about git. It will only bring pain in your case.
>>1008520
This. Planning to publish my shit too, once I bother fixing my net
>>1008606
'ip netns exec' is your friend. It can't share anything without network interfaces.
>>1008722
Duplication is great.
>softlinks
Hardlinks give you reference-counting & garbage collection for free.
Also, don't bother with an index. Scan the directory, group files with the same size, hash them, relink duplicates to a single inode - https://0x0.st/zihz.py
It does re-hash some groups of same-sized-but-different files on every run, but that's not a problem for a weekly batch job.
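The idea in a few lines of Python, as a sketch of the approach rather than the linked script:

#!/usr/bin/env python3
# Group files by size, hash only groups with more than one member, then
# relink duplicates to a single inode. Sketch of the approach, not the linked script.
import hashlib
import os
from collections import defaultdict
from pathlib import Path

def dedupe(root: Path) -> None:
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    for paths in by_size.values():
        if len(paths) < 2:
            continue                                   # unique size, can't be a dupe
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[hashlib.sha256(p.read_bytes()).hexdigest()].append(p)
        for dups in by_hash.values():
            keep, *rest = dups
            for p in rest:
                if p.stat().st_ino != keep.stat().st_ino:
                    p.unlink()
                    os.link(keep, p)                   # hardlink back to the kept inode

# dedupe(Path("/data/store"))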
>>1008540
>Do you just save everything you see?
No, I save even shit I don't see.
66G sorted
97G unsorted
11.5T scraped
▶ No.1034097
>>1034096
>https://0x0.st/zihz.py
Don't forget to replace md5 with sha.
▶ No.1034106>>1034110 >>1034111
At some point I realized that I don't actually revisit my fuckhuge collection (it's about as large as yours), so I made it a habit to reduce the crap on my computer every day. I've halved my data collection within a couple of months, and hopefully I'll pick anything important out of the pile before lighting everything else on fire really soon.
Hoarding is bloat.
▶ No.1034110
>>1034106
That's why I don't download these massive book collections. There are good books in there, but a TB or more of random math shit is not something I care about, and HDDs aren't free. Archivists are a different thing, but they probably have an infinite supply of money and hardware and they really like it, so they do it.
▶ No.1034111>>1034117
>>1034106
Censorship is ramping up. Hoarding is the only way we can protect ourselves from that shit. Of course I mean hoarding useful, relevant shit. Throw away your seasons of Friends and most animu and mango.
▶ No.1034117>>1034126
>>1034111
They aren't going to censor the shit these stupid book torrents are full of. Most of the collections are probably full of outdated information anyway.
▶ No.1034126
>>1034117
Yes they will, just give it time.
▶ No.1034137
>>1010440
>However I have a 10GB folder of just documents (mostly pdfs).
You could... set up a torrent. I'd seed.
▶ No.1034138>>1035296
>>1011160
Nextcloud is very bloated; I wouldn't recommend it.
▶ No.1035296
>>1034138
What would you recommend instead?
I use it every day and it has a tonne of fantastic features.
▶ No.1035300
Any of the common version control systems? Host it on gitlab, jk.
▶ No.1035324
Write a script to thumbnail each image. Use something like ImageMagick and a bash script, easy money. Save all thumbs to a second folder.
Get the output from $(ls /pathtopicsfolder/).
Create a list of hyperlinks by appending HTML tags to each path.
Use the split command to break the list into chunks of 9 or 16.
Loop through the chunked files and turn each one into an HTML document.
ls the HTML documents and repeat a similar process to create a site map.
Takes basically no time once you get the code knocked out. Less than 50 lines.
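Roughly the same pipeline as a Python sketch, for anyone who'd rather not write bash (assumes ImageMagick's convert is on PATH; paths and page size are just examples):

#!/usr/bin/env python3
# Thumbnail every image with ImageMagick, then emit paged HTML galleries
# plus an index. Rough sketch of the pipeline described above.
import subprocess
from pathlib import Path

PICS = Path("pics")        # source images (placeholder)
THUMBS = Path("thumbs")
SITE = Path("site")
PER_PAGE = 16

THUMBS.mkdir(exist_ok=True)
SITE.mkdir(exist_ok=True)

images = sorted(p for p in PICS.iterdir() if p.is_file())
for img in images:
    thumb = THUMBS / img.name
    if not thumb.exists():
        subprocess.run(["convert", str(img), "-thumbnail", "200x200", str(thumb)], check=True)

pages = [images[i:i + PER_PAGE] for i in range(0, len(images), PER_PAGE)]
for n, chunk in enumerate(pages):
    links = "\n".join(
        f'<a href="../{img}"><img src="../{THUMBS / img.name}"></a>' for img in chunk
    )
    (SITE / f"page{n}.html").write_text(f"<html><body>{links}</body></html>")

# site map pointing at every page
toc = "\n".join(f'<a href="page{n}.html">page {n}</a><br>' for n in range(len(pages)))
(SITE / "index.html").write_text(f"<html><body>{toc}</body></html>")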
▶ No.1035338
Ok, 20k files.
I'm autistic and can do it. I have a lot of free time.
I'm interested in adding everything to my own collection as well.
Tell me how you want it sorted and provide a link to everything.