▶ No.1046836>>1046855 >>1046866 >>1046867 >>1046889 >>1046994 >>1047057 >>1047372 >>1047406 >>1057062
I've been seeing the altchans ramp up their shilling since the New Zealand kebab removal. I'm thinking of throwing my hat into the ring. What's the best database solution for an imageboard? MySQL, PostgreSQL, SQLite? One of the non-relational databases like MongoDB I know it gets Stephen Lynx all wet.
How do you justify your choice?
▶ No.1046837>>1046839
Your altchan won't even get 1PPH so literally anything will work.
shit thread btw
▶ No.1046839
>>1046837
Yeah I figured that. I'm asking hypothetically, for an imageboard that gets actual traffic, what would be best.
▶ No.1046855>>1046891
>>1046836 (OP)
A guy like me would just say fuck it imma gonna just use tinyIB And stick to using a flat pack
▶ No.1046856>>1046857
Nope im not gonna get a ssl cert either because fuck logic
▶ No.1046857
▶ No.1046866>>1047231
>>1046836 (OP)
when in doubt, and when not in doubt, and when it's sunny or not sunny, and on days ending in 'y', or if a flipped coin is pulled by gravity into any of ( Heads | Tails | Side | Floating_Perpetually | Annihilated ) states, then use SQLite.
▶ No.1046867>>1046868
>>1046836 (OP)
>MongoDB changed its license to a nonfree one, you should avoid it, unless you plan to make a fork.
▶ No.1046868
>>1046867
That wasn't intended to be green
▶ No.1046869>>1046870 >>1046917
Stephen Lynx here.
To be fair, db choices aren't too crucial for a database. Relational databases are easy to not fuck up but more laborious to work with. Document databases like mongo are much easier to develop because you don't have a schema to worry about, but it's much easier to fuck up.
Now, that changes if you wish to store files in the db. That was why I picked mongo, because of gridfs. If you wish to store files on the filesystem, then that doesn't matter. Most of the performance bottleneck will be on your cache and the building of the final HTML to be served.
So coming from someone with quite some experience in the field my advice is: don't bike shed around dbs. Just pick something that fits your overall design, you just have to know WHAT your overall design is.
▶ No.1046870
>>1046869
>To be fair, db choices aren't too crucial for a chan
Fucked right off the bat smh
▶ No.1046871
Oh yeah, there's the license thing with mongo. But in my opinion that only matters for services offering mongodb cloud things. Mongo as a corporation wanted to force people into using THEIR solution, so they designed a shitty license to fuck cloud services over.
IMO, if they ever fuck up their licensing so much that it affects the actual community, it will get forked and people will move over. So that doesn't really concern me, tbh.
Don't take me wrong, I am not defending their kikery, just saying that I don't see it as a threat for people running their own thing.
▶ No.1046874>>1046884
mongo is made by nsa
they by default permit admin login online and when some hackers noticed that there's a shitton of mongodbs on the web available to everyone (ISPs, phone companies, etc.) the devs just sent out shills telling everyone that this was fake news and that everyone who uses mongodb remembered to firewall it.
which was evidently wrong.
other than that i don't think it fucking matters. also you forgot to mention redis.
▶ No.1046884
>>1046874
Oh yeah, the authentication thing.
Here's what I know about it: when you compile it yourself, by default it used to be open. BUT their repositories for both rhel and ubuntu were closed off for remote connections.
But so many people fucked it up that they changed it, now you always have to open up remote connections manually.
▶ No.1046887
▶ No.1046889
>>1046836 (OP)
Use the filesystem.
▶ No.1046891
go ask on halfchan/g/
>>1046855
what fucking pack? fuck off niggers
▶ No.1046893>>1046899
PostgreSQL is the only fat SQL DB worth using. SQLite is the only slim SQL DB worth using. As for NoSQL, I don't see the point unless you're looking to cut dev time by 15% at the cost of your sanity. You often end up with a hard-to-maintain mess that you query with either some SQL analogue (like CouchDB's JS query language) or a low-functionality HTTP URI meta-language.
For most server applications, I'd go PostgreSQL. It's fast, it's hard, it's stable, it's easy to maintain, it's easy to replicate and it has good documentation. That or don't bother and use flat files, keeping mappings and indexes in RAM. It was viable for 4chan during its prime (or it would have been if it was coded correctly) so I don't see why you couldn't do it for your two posts per day IB.
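A minimal sketch of the "flat files, keeping mappings and indexes in RAM" idea, under assumptions that are not from the thread: one JSON file per post, one directory per thread, and a hypothetical `FlatFileBoard` class that rebuilds its in-memory index by scanning the directory tree at startup. It does no locking and has no crash safety.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


class FlatFileBoard:
    """Hypothetical flat-file store: one JSON file per post, index kept in RAM."""

    def __init__(self, root: Path) -> None:
        self.root = root
        # thread id -> list of post ids, rebuilt from disk at startup
        self.index: Dict[int, List[int]] = defaultdict(list)
        for path in root.glob("*/*.json"):
            self.index[int(path.parent.name)].append(int(path.stem))
        for post_ids in self.index.values():
            post_ids.sort()  # numeric order, oldest post first

    def add_post(self, thread_id: int, post_id: int, body: str) -> None:
        thread_dir = self.root / str(thread_id)
        thread_dir.mkdir(parents=True, exist_ok=True)
        post_file = thread_dir / f"{post_id}.json"
        post_file.write_text(json.dumps({"id": post_id, "body": body}))
        self.index[thread_id].append(post_id)

    def read_thread(self, thread_id: int) -> List[dict]:
        # .get avoids creating an empty index entry for unknown threads
        thread_dir = self.root / str(thread_id)
        return [
            json.loads((thread_dir / f"{post_id}.json").read_text())
            for post_id in self.index.get(thread_id, [])
        ]


with tempfile.TemporaryDirectory() as tmp:
    board = FlatFileBoard(Path(tmp))
    board.add_post(1046836, 1046836, "OP")
    board.add_post(1046836, 1046837, "first reply")
    reopened = FlatFileBoard(Path(tmp))  # simulates a restart
    bodies = [post["body"] for post in reopened.read_thread(1046836)]
```

The index is only a cache of what is already on disk, so losing it costs one directory scan. Everything a database would do beyond that (concurrent writers, partial-write recovery) is left out here.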
▶ No.1046899>>1046904
>>1046893
>SQLite is the only slim SQL DB worth using
The current state of /tech
▶ No.1046904
>>1046899
>Not knowing it's god's choice of a SQL database
▶ No.1046917>>1046923 >>1047140 >>1047224
>>1046869
>Most of the performance bottleneck will be on [...] the building of the final HTML to be served.
So just use client side rendering?
▶ No.1046923>>1046977 >>1047224
>>1046917
That's what all those sites that don't work without JS do. They just dump JSONs over and work with them with JS on the client side. Literal CIA niggers. xD
It's true that you reduce server (and probably client too, if done correctly) load by using JSON, but by doing that you drop any legacy HTML whatsoever.
▶ No.1046977>>1046983 >>1047003 >>1047004 >>1047049 >>1047224
>>1046923
>legacy
Most of the time legacy is an euphemism for garbage.
Giving the client the raw data serialized as json sounds way better than dumping it into an html document.
How do you extract the data from html?
▶ No.1046983>>1046984 >>1046988 >>1047004 >>1047049
>>1046977
these days legacy means the good thing that just worked and modern is the bloated slow and heavy thing that never works properly and is much harder to use.
▶ No.1046984>>1046988 >>1047049
>>1046983
one example could be the scales in grocery stores.. they used to be simple machines where you put your thing and press a number but they changed them to some modern touch screen thing so now you have to press two buttons to do the thing that you could do with one button on the old system. i thought that tech was supposed to make things easier but its actually making things harder.
▶ No.1046988>>1046990
>>1046983
>>1046984
ok gramps. a touchscreen really isn't that hard to grok.
▶ No.1046990
>>1046988
still harder than the old system. there the buttons were always visible and just work. on this new system you have to press multiple virtual buttons and try to find your shit in the menus that they will some day change just so it would not be too easy to use. its just another case of putting computers where they dont belong.
▶ No.1046994
>>1046836 (OP)
>non-relational databases
If you have to ask it, you don't need it.
>sqlite
Will work, but, being file lock based, can produce some unique problems, when mixed with multiple processes/threads.
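One common way to soften the locking problem, sketched with Python's standard `sqlite3` module: switch the database to write-ahead logging and give connections a busy timeout. The `open_board_db` helper and the `posts` table are made up for illustration. WAL still allows only one writer at a time and needs a local filesystem.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_board_db(path: Path) -> sqlite3.Connection:
    # timeout: seconds a connection waits on a locked database before raising
    conn = sqlite3.connect(str(path), timeout=5.0)
    # WAL lets readers keep reading while a single writer appends to the log
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "board.db"
    writer = open_board_db(db_path)
    reader = open_board_db(db_path)  # second connection, as another process would have
    writer.execute("INSERT INTO posts (body) VALUES (?)", ("first post",))
    writer.commit()
    post_count = reader.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    writer.close()
    reader.close()
```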
>mysql
Somewhat similar situation to nosql. Shit, unless you go into hardcore optimizations.
So PostgreSQL
+ good implementation of standard SQL
+ sane default behaviour
+ 'CREATE TABLE nosql (id SERIAL PRIMARY KEY, document JSONB NOT NULL)', and you have mongodb that actually stores your data.
+ nice documentation
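A rough sketch of that document-table trick. To keep it runnable without a PostgreSQL server, it uses SQLite as a stand-in: `TEXT` replaces `JSONB`, and a hypothetical `doc_get` function registered from Python plays the role of PostgreSQL's `document->>'key'` operator. Only the table shape comes from the DDL quoted above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite stand-in for the quoted PostgreSQL DDL; TEXT replaces JSONB here
conn.execute("CREATE TABLE nosql (id INTEGER PRIMARY KEY, document TEXT NOT NULL)")


def doc_get(document, key):
    """Hypothetical helper standing in for PostgreSQL's document->>'key'."""
    value = json.loads(document).get(key)
    if value is None or isinstance(value, (int, float, str)):
        return value
    return json.dumps(value)  # nested values go back out as JSON text


conn.create_function("doc_get", 2, doc_get)

posts = [
    {"board": "tech", "thread": 1046836, "body": "What's the best database?"},
    {"board": "tech", "thread": 1046836, "body": "Use SQLite.", "sage": True},
    {"board": "g", "thread": 1, "body": "unrelated"},
]
conn.executemany(
    "INSERT INTO nosql (document) VALUES (?)",
    [(json.dumps(post),) for post in posts],
)

rows = conn.execute(
    "SELECT doc_get(document, 'body') FROM nosql "
    "WHERE doc_get(document, 'thread') = ? ORDER BY id",
    (1046836,),
).fetchall()
bodies = [row[0] for row in rows]
```

Documents with different fields (the optional `sage` key) sit in the same table, which is the schemaless part. The relational engine still gives you ordering, transactions and a primary key.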
▶ No.1047003
>>1046977
>How do you extract the data from html?
You don't. Have separate json endpoint or use Accept header.
>Most of the time legacy is an euphemism for garbage.
Not on the web. All the attempts at web3.0 SPA imageboards I saw were pure shit.
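The "separate json endpoint or Accept header" suggestion could look roughly like this. `pick_view` is a hypothetical helper, the `.json` suffix rule is an assumption, and q-values in the header are ignored for brevity (a real server should honour them).

```python
def pick_view(path: str, accept: str = "") -> str:
    """Return "json" or "html" for a request; the rules here are assumptions."""
    # An explicit .json suffix wins, standing in for a separate JSON endpoint
    if path.endswith(".json"):
        return "json"
    # Otherwise take the first listed media type that we know how to serve
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "application/json":
            return "json"
        if media_type in ("text/html", "application/xhtml+xml"):
            return "html"
    # Browsers sending */* or nothing at all get plain HTML
    return "html"


views = [
    pick_view("/tech/res/1046836.json"),
    pick_view("/tech/res/1046836", "application/json"),
    pick_view("/tech/res/1046836", "text/html,application/json;q=0.9"),
    pick_view("/tech/res/1046836", "*/*"),
]
```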
▶ No.1047004>>1047048
>>1046977
JSON could be a way to go, but I'd rather see some more static structures with it, with pre-defined rules like in static HTML, if that makes sense. Contemporary Web is too JS-heavy. Even putting privacy and security issues aside, JS is slower because it needs to load and execute and not bug out LOL.
>>1046983
JSON is pretty nice, it's primitive as FUCK. Even Cat-v niggers like it.
▶ No.1047020
The best solution is to hang yourself
▶ No.1047022>>1047058
SQLite is the only respectable one. Other than that just roll your own bro.
▶ No.1047048>>1047154
>>1047004
>JSON is pretty nice
Dynamic type fags can hang themselves. You are literally cancer and make messes that are poorly designed and documented.
▶ No.1047049
>>1046977
>Most of the time legacy is an euphemism for garbage.
literally wrong seeing as everything is shit nowadays and overly complex for what it needs to do, >>1046983
>>1046984
<buttons on a scale
what are you talking about? you put your fruits onto the scale and you read the weight off the dial, there's no buttons
▶ No.1047057>>1047058
>>1046836 (OP)
are Web imageboards really the best choice anymore?
I would actually rather use a bulletin board type thing, maybe having a lot of those would be better than imageboards on the web.
▶ No.1047058>>1047066 >>1047130 >>1047141
>>1047022
SQL is for when you need to query data in ways you've never queried it before. If you know something about your data (e.g. posts are contained in threads that are contained in boards) you can do much better just using your filesystem and a caching layer.
>>1047057
Is correct. Interactive stuff should be in a thick application, or at least not over HTTP. You don't need to do shit like "Update in x seconds" or use web sockets that break without js. Use network sockets like they were meant to be used, 2-sided.
▶ No.1047066>>1047127
>>1047058
>you can do much better just using your filesystem and a caching layer.
And where can I find a guide teaching pajeets how to do this? SQL is as simple and easy as taking a dump in the street. What you're talking about sounds a little more involved.
▶ No.1047127
>>1047066
sql is not easy if you want good performance. not even the best cpu in the world is enough for postgres if you dont know how to properly optimize your shit.
▶ No.1047130
>>1047058
>filesystem and a caching layer.
Isn't that what sqlite is?
▶ No.1047140>>1047154
>>1046917
>but I'd rather see some more static structures with it, with pre-defined rules like in static HTML, if that makes sense.
No, it doesn't. Don't think schemas are the be all end all.
▶ No.1047141
>>1047058
>you can do much better just using your filesystem and a caching layer
>filesystem
goodbye ACID
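For what it's worth, the usual way to claw back a little of the A and the D on a plain filesystem is write-to-temp, fsync, then rename. A minimal sketch, with `atomic_write` as a made-up name. It covers a single file only, gives no isolation between concurrent writers, and the rename is only guaranteed atomic on POSIX when both paths are on the same filesystem.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, data: bytes) -> None:
    """Write data so readers see either the old file or the new one, never a mix."""
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, str(path))  # swap the new file into place
    except BaseException:
        # clean up the temp file if anything failed before the rename
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "1046836.json"
    atomic_write(target, b'{"body": "v1"}')
    atomic_write(target, b'{"body": "v2"}')
    final_bytes = target.read_bytes()
    leftovers = [entry.name for entry in Path(tmp).iterdir()]
```

Anything spanning several files (a post plus the thread index, say) still has no transaction around it, which is the point being made above.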
▶ No.1047147>>1047174
People in the know don't use databases they use flat files. No database laws apply ;^)
▶ No.1047154>>1047184 >>1047199
>>1047048
You could supply type information in JSON LMAO.
Anyway, I don't see how it is relevant. Is XML-ish HTTP shit better somewhat? Is HTML typed LOL?
>>1047140
What I'm talking about is static content should remain static. JSON is too primitive by itself to supply it without JS or application that KNOWS what to do with it, and possibly expect some meaningful shit on user screen.
▶ No.1047156
>ITT LARPers LARPing about databases
TOP LARP
▶ No.1047174>>1047183
>>1047147
Yeah that's sort of what I was trying to tell somebody but they called me a nigger
▶ No.1047183>>1047204
>>1047174
Pretty sure it still counts as a database in most countries. It certainly does in the UK.
▶ No.1047184
>>1047154
>JSON is too primitive by itself to supply it without JS or application that KNOWS what to do with it
It's either something that will be LITERALLY the same or HTML. Json is just the most practical for this kind of task.
▶ No.1047199>>1047200 >>1047213
>>1047154
>You could supply type information in JSON LMAO.
No you can't. It's still valid JSON to call a string an int. Additionally I want it required in the format. I have never seen someone bundle types into JSON
>Anyway, I don't see how it is relevant. Is XML-ish HTTP shit better somewhat? Is HTML typed LOL?
Both are trash too.
▶ No.1047200>>1047205 >>1047363
>>1047199
Well, a string always could be read as int, I guess. Just point at the beginning and read 4 bytes, this is a legit int32 LOL. Whatcha gonna do about it, kid?
I don't see what you're getting at. Do you expect everybody on the Internet to follow some specific standard for communicating structured generic binary data, when languages don't align on that in general? Or do you imply that you're gonna shove BLOB data streams around and your apps are going to know what's up because you wrote them? If the latter, well, that's what everybody does when doing something that doesn't concern everybody, I guess.
▶ No.1047204
>>1047183
> It certainly does in the UK
Good joke. The UK specifically doesn't.
>UK
>The country that - when it was discovered intel agencies were illegally spying, changed the law to make it legal.
▶ No.1047205>>1047211
>>1047200
the ideal data interchange format is tiny SQLite files.
▶ No.1047211
>>1047205
Imagine having to have SQLite dependency to be able to read/serve DNS requests.
▶ No.1047213>>1047220
>>1047199
>No you can't. It's still valid JSON to call a string an int.
You clearly don't know JSON. JSON differentiates numbers, strings, booleans, null, arrays and objects (dictionaries).
▶ No.1047220
>>1047213
He probably implied that if you call something that is semantically a string an integer because you are a dipshit (like "someArbitraryString": 1337), you could be in trouble. There is no error if you assign 1337 or "1337" to that shit.
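Both sides of this argument can be seen with Python's standard `json` module. The key name is the one from the post; the rest is illustrative.

```python
import json

as_number = json.loads('{"someArbitraryString": 1337}')
as_string = json.loads('{"someArbitraryString": "1337"}')

# Both documents are valid JSON, and the parser keeps the two types apart
number_type = type(as_number["someArbitraryString"]).__name__
string_type = type(as_string["someArbitraryString"]).__name__

# What JSON itself cannot say is which of the two the field is supposed to be
same_value = as_number["someArbitraryString"] == as_string["someArbitraryString"]
```

So JSON values are typed, but nothing in the format ties a given key to a given type. That has to come from a schema or from the consuming code.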
▶ No.1047224>>1047239 >>1047242
>>1046917
>>1046923
Building a JSON document or an HTML document in this case is very close in terms of performance.
>>1046977
When a real document (i.e. anything but an "app") is involved, JS should always extend HTML and should not be a hard requirement. Not only is this good for accessibility, usability and SEO (with parentheses, if you want) but it also implicitly prevents you from building a spaghetti monolith.
I'd do it this way:
http://gay.imageboard.on.nimp.org/mlp/123456/VIEW gives you a specified VIEW of thread 123456. This can be "html" or "json", most likely.
http://gay.imageboard.on.nimp.org/mlp/123456 gives you the default view of thread 123456, which is likely always going to be "html".
Adding ?limit=50 makes it so the resulting view has at most 50 posts (by default the newest ones).
Adding ?since=20190329T190256Z makes it so the resulting view has posts newer than the specified time.
I really can't think of anything else that a thread "api" would need unless you want liveposting.
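A sketch of how that URL scheme could be parsed and applied, using only the standard library. `parse_thread_url` and `select_posts` are hypothetical names, posts are modelled as `(post_id, posted_at)` tuples sorted oldest first, and the example passes a path-only URL.

```python
from datetime import datetime, timezone
from typing import Dict
from urllib.parse import parse_qs, urlsplit


def parse_thread_url(url: str) -> Dict[str, object]:
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    board, thread = segments[0], int(segments[1])
    # a missing VIEW segment falls back to the default view
    view = segments[2].lower() if len(segments) > 2 else "html"
    query = parse_qs(parts.query)
    limit = int(query["limit"][0]) if "limit" in query else None
    since = None
    if "since" in query:
        parsed = datetime.strptime(query["since"][0], "%Y%m%dT%H%M%SZ")
        since = parsed.replace(tzinfo=timezone.utc)
    return {"board": board, "thread": thread, "view": view, "limit": limit, "since": since}


def select_posts(posts, limit=None, since=None):
    # posts: list of (post_id, posted_at) tuples, oldest first
    if since is not None:
        posts = [post for post in posts if post[1] > since]
    if limit is not None:
        posts = posts[-limit:] if limit > 0 else []  # keep the newest ones
    return posts


request = parse_thread_url("/mlp/123456/json?limit=2&since=20190329T190256Z")
posts = [
    (1, datetime(2019, 3, 29, 18, 0, 0, tzinfo=timezone.utc)),
    (2, datetime(2019, 3, 29, 19, 30, 0, tzinfo=timezone.utc)),
    (3, datetime(2019, 3, 29, 20, 0, 0, tzinfo=timezone.utc)),
    (4, datetime(2019, 3, 29, 21, 0, 0, tzinfo=timezone.utc)),
]
selected = [post_id for post_id, _ in select_posts(posts, request["limit"], request["since"])]
```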
▶ No.1047229
it really depends on the data model of the information you are collecting; your database choice should be influenced by a decision you have made about how data relate to one another in the abstract, because this informs the concrete implementation. Do a data model first before you touch a keyboard.
relational databases make sense for a number of object-like things which each have various fields of data which they do not share; customer data for a point-of-sale system works well because each customer has a name, address, phone number, orders associated with that customer alone.
non-relational databases work better when you're collecting a lot of different fields and don't know how they should relate to each other, but you want to do business intelligence operations on that data. E.g.: if you are storing posts from a social media source in a db so you can draw inferences and provide feedback stats for a CRM system, you will want to use a non-relational system (possibly with Scala).
Hope this helps
▶ No.1047231
>>1046866
>>>r/iamverysmart
▶ No.1047232
▶ No.1047234>>1047236
>non-relational databases work better when you're collecting a lot of different fields and don't know how they should relate to each other, but you want to do business intelligence operations on that data. E.g.: if you are storing posts from a social media source in a db so you can draw inferences and provide feedback stats for a CRM system, you will want to use a non-relational system (possibly with Scala).
I'm not sure if I understood this particular example in contrast with a previous one.
You could relate posts to a user, to a place it was posted etc, and you could relate some stuff to the post as well, like, the amount of likes, replies etc.
▶ No.1047236
>>1047234
I understand what you are saying but if you are doing BI for a social media segment you want to do analysis with posts as the main input to the db (individual users do not matter in this context)
your model would also be useful, but the sort of overhead for it would only be worth it if you want to identify brand promoters vs brand detractors and bad actors, where user persona is a factor of your analysis
▶ No.1047239>>1047343
>>1047224
>Adding ?limit=50 makes it so the resulting view has at most 50 posts (by default the newest ones).
>Adding ?since=20190329T190256Z makes it so the resulting view has posts newer than the specified time.
In case of HTML, that would imply dynamic page generation on server-side, which is a spook. Like, that would mean unzipping HTML generator each time a user makes a GET request. Worst case scenario it knocks our shit out momentarily.
▶ No.1047242>>1047343
>>1047224
>Building a JSON document or an HTML document in this case is very close in terms of performance.
Hell no.
>Adding ?limit=50 makes it so the resulting view has at most 50 posts (by default the newest ones).
Kiss caching goodbye. That's a terrible idea.
▶ No.1047244
sqlite is technically the best but when it's not enough, postgres
▶ No.1047343>>1047371
>>1047239
>that would mean unzipping HTML generator
?
>>1047242
>Kiss caching goodbye.
Why? Most users would hit a cache.
▶ No.1047363
>>1047200
>I don't see what you're getting at.
There should be a header that specifies the entire data type the body should be. This doesn't need to be sent every request or even in the same request, but it must remain consistent with the body.
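One reading of this, sketched under heavy assumptions: the type declaration travels separately from the bodies (here a hypothetical `POST_SCHEMA` mapping field names to Python types) and every body is checked against it. This is a hand-rolled check for illustration, not any existing schema standard.

```python
import json
from typing import Any, Dict, List

# Hypothetical type declaration, sent once and separately from the bodies
POST_SCHEMA: Dict[str, type] = {"id": int, "body": str, "sage": bool}


def schema_errors(body: str, schema: Dict[str, type]) -> List[str]:
    data: Any = json.loads(body)
    if not isinstance(data, dict):
        return ["body is not a JSON object"]
    errors = []
    for key, expected in schema.items():
        if key not in data:
            errors.append(f"missing field: {key}")
            continue
        value = data[key]
        # bool is a subclass of int in Python, so reject it where int is expected
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            errors.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")
    return errors


good = schema_errors('{"id": 1047363, "body": "hi", "sage": false}', POST_SCHEMA)
bad = schema_errors('{"id": "1047363", "body": "hi", "sage": false}', POST_SCHEMA)
```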
▶ No.1047371
>>1047343
Maybe he thinks that every user would request a slightly different number of replies for shits and giggles, instead of 99% using the default.
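A toy illustration of why query parameters need not kill caching when nearly everyone uses the defaults. It assumes a hypothetical `render_thread` function memoised with `functools.lru_cache`, and the default limit is normalised so that omitted and explicit defaults share one cache entry. Invalidation when a new post arrives is not handled.

```python
from functools import lru_cache

DEFAULT_LIMIT = 50
stats = {"renders": 0}  # counts how often the expensive path actually runs


@lru_cache(maxsize=1024)
def render_thread(board: str, thread: int, view: str, limit: int) -> str:
    stats["renders"] += 1
    return f"<{view} view of /{board}/{thread}, last {limit} posts>"


def serve(board: str, thread: int, view: str = "html", limit=None) -> str:
    # normalise so ?limit=50 and no limit at all hit the same cache entry
    effective_limit = DEFAULT_LIMIT if limit is None else limit
    return render_thread(board, thread, view, effective_limit)


for _ in range(99):
    serve("mlp", 123456)            # the 99% using the default
serve("mlp", 123456, limit=50)      # explicit default, still a cache hit
serve("mlp", 123456, limit=7)       # the odd one out, rendered once

cache_hits = render_thread.cache_info().hits
```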
▶ No.1047372
>>1046836 (OP)
SQLite is embedded into your software as a library and doesn't run persistently as a server in the host OS; it's not intended for high write throughput but has great portability. Everything else is pretty much the same thing, just different flavors. By selecting a specific one you can optimize for certain tasks but if you don't have a pressing need to do this, you can just pick the default - MySQL is a database server of choice for web apps. Also it supports a bunch of different storage engines (InnoDB, MyISAM and so on) optimized for specific loads, but again, if you don't have specific pressing requirements then just stick with the default.
As far as selecting particular implementation goes, what you do is create a benchmark that simulates your load accurately (or test it on live service, provided your system won't fail if you switch the database) and test all available configurations to see which one works best for you.
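A bare-bones version of such a benchmark, as a sketch only: the `posts` table, the 20-thread workload and the `benchmark` helper are invented here, and SQLite in memory stands in for whichever engines are being compared. Other DB-API drivers could be passed in through `make_conn`, though their placeholder syntax may differ from `?`.

```python
import sqlite3
import time


def benchmark(make_conn, n_posts: int = 2000) -> dict:
    conn = make_conn()
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, thread INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX posts_by_thread ON posts (thread)")

    start = time.perf_counter()
    with conn:  # one transaction for the whole batch of inserts
        conn.executemany(
            "INSERT INTO posts (thread, body) VALUES (?, ?)",
            [(i % 20, f"post {i}") for i in range(n_posts)],
        )
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for thread in range(20):  # read every thread once, like page views would
        conn.execute(
            "SELECT id, body FROM posts WHERE thread = ? ORDER BY id", (thread,)
        ).fetchall()
    read_seconds = time.perf_counter() - start

    rows = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    conn.close()
    return {"rows": rows, "write_seconds": write_seconds, "read_seconds": read_seconds}


result = benchmark(lambda: sqlite3.connect(":memory:"))
```

The numbers only mean something if the workload resembles the real one: read/write ratio, thread sizes and concurrency all matter more than the engine's headline speed.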
▶ No.1047406>>1047571 >>1048103 >>1048207
>>1046836 (OP)
Never really was into web but what's the advantage of using a DB over “lol just files” or whatever?
▶ No.1047571
▶ No.1048103>>1048207
>>1047406
A database is "lol just files" except it's a huge, fully featured, tightly integrated "lol just files" that often comes with features like durability (see post above), sorting, indexing, lookups, automation, views, operations-as-transactions and more. As you increase the complexity of an ad-hoc "lol just files", it eventually becomes indiscernible from an ad-hoc database.
▶ No.1048132>>1048227
The only best DB is sqlite read/written directly to/from the disk block device. Skip the filesystem middle man, they just want you to believe it's necessary.
▶ No.1048207
>>1047406
>>1048103
In my experience there's no practical difference if what's being implemented is a small system with relatively little data and few users.
However, as the data expands and the number of users increases, a database becomes faster, more stable and more reliable than just using files, especially if you're using NTFS.
▶ No.1048227
▶ No.1057062
>>1046836 (OP)
Sqlite is the best db until you outgrow it and need distributed architecture / sharding. Then you're fucked and have to hire a team of 30 engineers to figure out how to make it scale.