
/tech/ - Technology



 No.1046836>>1046855 >>1046866 >>1046867 >>1046889 >>1046994 >>1047057 >>1047372 >>1047406 >>1057062

I've been seeing the altchans ramp up their shilling since the New Zealand kebab removal. I'm thinking of throwing my hat into the ring. What's the best database solution for an imageboard? MySQL, PostgreSQL, SQLite? One of the non-relational databases like MongoDB I know it gets Stephen Lynx all wet.

How do you justify your choice?

 No.1046837>>1046839

Your altchan won't even get 1PPH so literally anything will work.

shit thread btw


 No.1046839

>>1046837

Yeah I figured that. I'm asking hypothetically, for an imageboard that gets actual traffic, what would be best.


 No.1046855>>1046891

>>1046836 (OP)

A guy like me would just say fuck it imma gonna just use tinyIB And stick to using a flat pack


 No.1046856>>1046857

Nope im not gonna get a ssl cert either because fuck logic


 No.1046857

>>1046856

Yeeeeeeeeer


 No.1046866>>1047231

>>1046836 (OP)

when in doubt, and when not in doubt, and when it's sunny or not sunny, and on days ending in 'y', or if a flipped coin is pulled by gravity into any of ( Heads | Tails | Side | Floating_Perpetually | Annihilated ) states, then use SQLite.


 No.1046867>>1046868

>>1046836 (OP)

>MongoDB changed its license to a nonfree one, you should avoid it, unless you're going to make a fork.


 No.1046868

>>1046867

That wasn't intended to be green


 No.1046869>>1046870 >>1046917

Stephen Lynx here.

To be fair, db choices aren't too crucial for a database. Relational databases are easy to not fuck up but more laborious to work with. Document databases like mongo are much easier to develop because you don't have a schema to worry about, but it's much easier to fuck up.

Now, that changes if you wish to store files in the db. That was why I picked mongo, because of gridfs. If you wish to store files on the filesystem, then that doesn't matter. Most of the performance bottleneck will be on your cache and the building of the final HTML to be served.

So coming from someone with quite some experience in the field my advice is: don't bike shed around dbs. Just pick something that fits your overall design, you just have to know WHAT your overall design is.


 No.1046870

>>1046869

>To be fair, db choices aren't too crucial for a chan

Fucked right off the bat smh


 No.1046871

Oh yeah, there's the license thing with mongo. But in my opinion that only matters for services offering mongodb cloud things. Mongo as a corporation wanted to force people into using THEIR solution, so they designed a shitty license to fuck cloud services over.

IMO, if they ever fuck up their licensing so much that it affects the actual community, it will get forked and people will move over. So that doesn't really concern me, tbh.

Don't take me wrong, I am not defending their kikery, just saying that I don't see it as a threat for people running their own thing.


 No.1046874>>1046884

mongo is made by nsa

they by default permit admin login online and when some hackers noticed that there's a shitton of mongodbs on the web available to everyone (ISPs, phone companies, etc.) the devs just sent out shills telling everyone that this was fake news and that everyone who uses mongodb remembered to firewall it.

which was evidently wrong.

other than that i don't think it fucking matters. also you forgot to mention redis.


 No.1046884

>>1046874

Oh yeah, the authentication thing.

Here's what I know about it: when you compile it yourself, by default it used to be open. BUT their repositories for both rhel and ubuntu were closed off for remote connections.

But so many people fucked it up that they changed it, now you always have to open up remote connections manually.


 No.1046887


 No.1046889

>>1046836 (OP)

Use the filesystem.


 No.1046891

go ask on halfchan/g/

>>1046855

what fucking pack? fuck off niggers


 No.1046893>>1046899

PostgreSQL is the only fat SQL DB worth using. SQLite is the only slim SQL DB worth using. As for NoSQL, I don't see the point unless you're looking to cut dev time by 15% at the cost of your sanity. You often end up with a hard-to-maintain mess that you query with either some SQL analogue (like CouchDB's JS query language) or a low-functionality HTTP URI meta-language.

For most server applications, I'd go PostgreSQL. It's fast, it's hard, it's stable, it's easy to maintain, it's easy to replicate and it has good documentation. That or don't bother and use flat files, keeping mappings and indexes in RAM. It was viable for 4chan during its prime (or it would have been if it was coded correctly) so I don't see why you couldn't do it for your two posts per day IB.


 No.1046899>>1046904

>>1046893

>SQLite is the only slim SQL DB worth using

The current state of /tech


 No.1046904

>>1046899

>Not knowing it's god's choice of a SQL database


 No.1046917>>1046923 >>1047140 >>1047224

>>1046869

>Most of the performance bottleneck will be on [...] the building of the final HTML to be served.

So just use client side rendering?


 No.1046923>>1046977 >>1047224

>>1046917

That's what all those sites that don't work without JS do. They just dump JSONs over and work with them with JS on the client side. Literal CIA niggers. xD

It's true that you reduce server (and probably client too, if done correctly) load by using JSON, but by doing that you drop any legacy HTML whatsoever.


 No.1046977>>1046983 >>1047003 >>1047004 >>1047049 >>1047224

>>1046923

>legacy

Most of the time legacy is a euphemism for garbage.

Giving the client the raw data serialized as json sounds way better than dumping it into an html document.

How do you extract the data from html?


 No.1046983>>1046984 >>1046988 >>1047004 >>1047049

>>1046977

these days legacy means the good thing that just worked and modern is the bloated slow and heavy thing that never works properly and is much harder to use.


 No.1046984>>1046988 >>1047049

>>1046983

one example could be the scales in grocery stores.. they used to be simple machines where you put your thing and press a number but they changed them to some modern touch screen thing so now you have to press two buttons to do the thing that you could do with one button on the old system. i thought that tech was supposed to make things easier but its actually making things harder.


 No.1046988>>1046990

>>1046983

>>1046984

ok gramps. a touchscreen really isn't that hard to grok.


 No.1046990

>>1046988

still harder than the old system. there the buttons were always visible and just work. on this new system you have to press multiple virtual buttons and try to find your shit in the menus that they will some day change just so it would not be too easy to use. its just another case of putting computers where they dont belong.


 No.1046994


>>1046836 (OP)

>non-relational databases

If you have to ask it, you don't need it.

>sqllite

Will work, but, being file lock based, can produce some unique problems, when mixed with multiple processes/threads.

>mysql

Somewhat similar situation to nosql. Shit, unless you go into hardcore optimizations.

So PostgreSQL

+ good implementation of standard SQL

+ sane default behaviour

+ 'CREATE TABLE nosql (id SERIAL PRIMARY KEY, document JSONB NOT NULL)', and you have mongodb that actually stores your data.

+ nice documentation
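
A minimal sketch of the "one table with a document column" idea from the list above. To keep it runnable without a server it uses Python's built-in sqlite3 as a stand-in, so the column is TEXT holding serialized JSON rather than PostgreSQL's JSONB. The helper names put_document and get_document are made up for illustration.

```python
import json
import sqlite3
from typing import Optional

# In-memory database so the sketch runs anywhere.
# In PostgreSQL the column type would be JSONB instead of TEXT.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nosql (id INTEGER PRIMARY KEY, document TEXT NOT NULL)"
)


def put_document(doc: dict) -> int:
    """Serialize a dict to JSON text, store it, and return its row id."""
    cur = conn.execute(
        "INSERT INTO nosql (document) VALUES (?)", (json.dumps(doc),)
    )
    conn.commit()
    return cur.lastrowid


def get_document(doc_id: int) -> Optional[dict]:
    """Load a stored document by id, or None if there is no such row."""
    row = conn.execute(
        "SELECT document FROM nosql WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


post_id = put_document({"board": "tech", "thread": 1046836, "comment": "hello"})
print(get_document(post_id))
```

With real JSONB you could also index and query inside the document on the server side; this stand-in only stores and retrieves whole documents.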


 No.1047003

>>1046977

>How do you extract the data from html?

You don't. Have separate json endpoint or use Accept header.

>Most of the time legacy is a euphemism for garbage.

Not on the web. All the attempts at web3.0 and SPA imageboards I saw were pure shit.
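
For the Accept header option, a rough sketch of how a server might choose between the two representations. pick_view is a hypothetical helper, and it is deliberately simplified: it ignores q-values and just takes the first media type it recognizes.

```python
def pick_view(accept_header: str, default: str = "html") -> str:
    """Return "json" or "html" based on a request's Accept header.

    Simplified: ignores q-values and picks the first recognized media type.
    """
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.8" and normalize case.
        media_type = part.split(";")[0].strip().lower()
        if media_type == "application/json":
            return "json"
        if media_type in ("text/html", "application/xhtml+xml"):
            return "html"
    return default


print(pick_view("application/json"))
print(pick_view("text/html,application/xhtml+xml,*/*;q=0.8"))
print(pick_view("*/*"))
```

A browser without JS sends text/html first and gets the normal page; a script asking for application/json gets the raw data from the same URL.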


 No.1047004>>1047048

>>1046977

JSON could be a way to go, but I'd rather see some more static structures with it, with pre-defined rules like in static HTML, if that makes sense. Contemporary Web is too JS-heavy. Even putting privacy and security issues aside, JS is slower because it needs to load and execute and not bug out LOL.

>>1046983

JSON is pretty nice, it's primitive as FUCK. Even Cat-v niggers like it.


 No.1047020

The best solution is to hang yourself


 No.1047022>>1047058

SQLite is the only respectable one. Other than that just roll your own bro.


 No.1047048>>1047154

>>1047004

>JSON is pretty nice

Dynamic type fags can hang themselves. You are literally cancer and make messes that are poorly designed and documented.


 No.1047049

>>1046977

>Most of the time legacy is a euphemism for garbage.

literally wrong seeing as everything is shit nowadays and overly complex for what it needs to do, >>1046983

>>1046984

<buttons on a scale

what are you talking about? you put your fruits onto the scale and you read the weight off the dial, there's no buttons


 No.1047057>>1047058

>>1046836 (OP)

are Web imageboards really the best choice anymore?

I would actually rather use a bulletin board type thing, maybe having a lot of those would be better than imageboards on the web.


 No.1047058>>1047066 >>1047130 >>1047141

>>1047022

SQL is for when you need to query data in ways you've never queried it before. If you know something about your data (ex, posts are contained in threads that are contained in boards) you can do much better just using your filesystem and a caching layer.

>>1047057

Is correct. Interactive stuff should be in a thick application, or at least not over HTTP. You don't need to do shit like "Update in x seconds" or use web sockets that break without js. Use network sockets like they were meant to be used, 2-sided.


 No.1047066>>1047127

>>1047058

>you can do much better just using your filesystem and a caching layer.

And where can I find a guide teaching pajeets how to do this? SQL is as simple and easy as taking a dump in the street. What you're talking about sounds a little more involved.
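
Not a guide, but a minimal sketch of what >>1047058 describes, under a few assumptions: one JSON file per post, a directory layout of board/thread/post, and an in-process cache. ROOT, write_post and read_thread are hypothetical names, and the cache invalidation is as crude as it gets.

```python
import json
import os
import tempfile
from functools import lru_cache
from pathlib import Path

ROOT = Path(tempfile.mkdtemp())  # hypothetical data directory


def post_path(board: str, thread: int, post: int) -> Path:
    # Layout mirrors the hierarchy: <root>/<board>/<thread>/<post>.json
    return ROOT / board / str(thread) / f"{post}.json"


def write_post(board: str, thread: int, post: int, comment: str) -> None:
    path = post_path(board, thread, post)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file, then rename. os.replace is atomic on the same
    # filesystem, so readers never see a half-written post.
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w") as f:
        json.dump({"no": post, "comment": comment}, f)
    os.replace(tmp, path)
    read_thread.cache_clear()  # crude invalidation: drop every cached thread


@lru_cache(maxsize=1024)
def read_thread(board: str, thread: int) -> tuple:
    """Return the thread's comments in post order; cached until the next write."""
    thread_dir = ROOT / board / str(thread)
    files = sorted(thread_dir.glob("*.json"), key=lambda p: int(p.stem))
    return tuple(json.loads(p.read_text())["comment"] for p in files)


write_post("tech", 1046836, 1046836, "What's the best database?")
write_post("tech", 1046836, 1046837, "literally anything will work")
print(read_thread("tech", 1046836))
```

The rename trick only makes a single file write atomic. Anything spanning several files, and durability across a power cut, is still on you.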


 No.1047127

>>1047066

sql is not easy if you want good performance. not even the best cpu in the world is enough for postgres if you dont know how to properly optimize your shit.


 No.1047130

>>1047058

>filesystem and a caching layer.

Isn't that what sqlite is?


 No.1047140>>1047154

>>1046917

>but I'd rather see some more static structures with it, with pre-defined rules like in static HTML, if that makes sense.

No, it doesn't. Don't think schemas are the be all end all.


 No.1047141

>>1047058

>you can do much better just using your filesystem and a caching layer

>filesystem

goodbye ACID


 No.1047147>>1047174

People in the know don't use databases they use flat files. No database laws apply ;^)


 No.1047154>>1047184 >>1047199

>>1047048

You could supply type information in JSON LMAO.

Anyway, I don't see how it is relevant. Is XML-ish HTTP shit better somewhat? Is HTML typed LOL?

>>1047140

What I'm talking about is static content should remain static. JSON is too primitive by itself to supply it without JS or application that KNOWS what to do with it, and possibly expect some meaningful shit on user screen.


 No.1047156

>ITT LARPers LARPing about databases

TOP LARP


 No.1047174>>1047183

>>1047147

Yeah that's sort of what I was trying to tell somebody but they called me a nigger


 No.1047183>>1047204

>>1047174

Pretty sure it still counts as a database in most countries. It certainly does in the UK.


 No.1047184

>>1047154

>JSON is too primitive by itself to supply it without JS or application that KNOWS what to do with it

It's either something that will be LITERALLY the same or HTML. Json is just the most practical for this kind of task.


 No.1047199>>1047200 >>1047213

>>1047154

>You could supply type information in JSON LMAO.

No you can't. It's still valid JSON to call a string an int. Additionally I want it required in the format. I have never seen someone bundle types into JSON

>Anyway, I don't see how it is relevant. Is XML-ish HTTP shit better somewhat? Is HTML typed LOL?

Both are trash too.


 No.1047200>>1047205 >>1047363

>>1047199

Well, a string always could be read as int, I guess. Just point at the beginning and read 4 bytes, this is a legit int32 LOL. Whatcha gonna do about it, kid?

I don't see what you're getting at. Do you expect everybody on the Internet to follow some specific standard of communicating structured generic binary data, when languages don't align on that in general? Or do you imply that you're gonna shove BLOB data streams around and your apps are going to know what's up because you wrote them? If the second, well, that's what everybody does when doing something that doesn't concern everybody, I guess.


 No.1047204

>>1047183

> It certainly does in the UK

Good joke. The UK specifically doesn't.

>UK

>The country that - when it was discovered intel agencies were illegally spying, changed the law to make it legal.


 No.1047205>>1047211

>>1047200

the ideal data interchange format is tiny SQLite files.


 No.1047211

>>1047205

Imagine having to have SQLite dependency to be able to read/serve DNS requests.


 No.1047213>>1047220

>>1047199

>No you can't. It's still valid JSON to call a string an int.

You clearly don't know json. Json differentiates numbers, strings, booleans, nulls, arrays and dictionaries.


 No.1047220

>>1047213

He probably implied that if you call something that is semantically a string an integer because you are a dipshit (like "someArbitraryString": 1337), you could be in trouble. There is no error if you assign 1337 or "1337" to that shit.
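
To illustrate with Python's json module: both documents parse fine and JSON does tell the number apart from the string, but nothing in the format says which one the field was supposed to be. The EXPECTED mapping and type_errors helper are made up for the example; that check has to live in application code.

```python
import json

a = json.loads('{"someArbitraryString": 1337}')
b = json.loads('{"someArbitraryString": "1337"}')

# Both documents parse without error, but the value types differ.
print(type(a["someArbitraryString"]).__name__)
print(type(b["someArbitraryString"]).__name__)

# Hypothetical expected types for this object. JSON itself has no way
# to declare them, so the check happens after parsing.
EXPECTED = {"someArbitraryString": str}


def type_errors(doc: dict, expected: dict) -> list:
    """Return the keys whose values are missing or of the wrong type."""
    return [
        key for key, typ in expected.items()
        if not isinstance(doc.get(key), typ)
    ]


print(type_errors(a, EXPECTED))
print(type_errors(b, EXPECTED))
```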


 No.1047224>>1047239 >>1047242

>>1046917

>>1046923

Building a JSON document or an HTML document in this case is very close in terms of performance.

>>1046977

When a real document (i.e. anything but an "app") is involved, JS should always extend HTML and should not be a hard requirement. Not only is this good for accessibility, usability and SEO (with parentheses, if you want) but it also implicitly prevents you from building a spaghetti monolith.

I'd do it this way:

http://gay.imageboard.on.nimp.org/mlp/123456/VIEW gives you a specified VIEW of thread 123456. This can be "html" or "json", most likely.

http://gay.imageboard.on.nimp.org/mlp/123456 gives you the default view of thread 123456, which is likely always going to be "html".

Adding ?limit=50 makes it so the resulting view has at most 50 posts (by default the newest ones).

Adding ?since=20190329T190256Z makes it so the resulting view has posts newer than the specified time.

I really can't think of anything else that a thread "api" would need unless you want liveposting.
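
A rough sketch of how that URL scheme could be parsed and applied. The function names are hypothetical, the host in the usage line is a placeholder, and posts are assumed to be (timestamp, text) pairs sorted oldest first.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def parse_thread_url(url: str) -> dict:
    """Split /<board>/<thread>[/<view>]?limit=N&since=T into its parts."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    query = parse_qs(parts.query)
    since = None
    if "since" in query:
        # Compact ISO 8601 UTC timestamp, e.g. 20190329T190256Z
        since = datetime.strptime(
            query["since"][0], "%Y%m%dT%H%M%SZ"
        ).replace(tzinfo=timezone.utc)
    return {
        "board": segments[0],
        "thread": int(segments[1]),
        "view": segments[2].lower() if len(segments) > 2 else "html",
        "limit": int(query["limit"][0]) if "limit" in query else None,
        "since": since,
    }


def select_posts(posts: list, limit=None, since=None) -> list:
    """Keep posts newer than `since`, then at most `limit` of the newest."""
    if since is not None:
        posts = [p for p in posts if p[0] > since]
    if limit is not None:
        posts = posts[-limit:] if limit > 0 else []
    return posts


req = parse_thread_url(
    "http://example.org/mlp/123456/json?limit=50&since=20190329T190256Z"
)
print(req)
```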


 No.1047229

it really depends on the data model of the information you are collecting; your database choice should be influenced by a decision you have made about how data relate to one another in the abstract, because this informs the concrete implementation. Do a data model first before you touch a keyboard.

relational databases make sense for a number of object-like things which each have various fields of data which they do not share; customer data for a point-of-sale system works well because each customer has a name, address, phone number, orders associated with that customer alone.

non-relational databases work better when you're collecting a lot of different fields and don't know how they should relate to each other, but you want to do business intelligence operations to that data. IE: If you are storing posts from a social media source into a db so you can do inferences and provide feedback stats for a CRM system, you will want to use a non-relational system (possibly with Scala).

Hope this helps
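
To make "data model first" concrete for the imageboard case in this thread, here is a sketch of a relational model, assuming boards contain threads and threads contain posts. Table and column names are made up, and it uses Python's built-in sqlite3 only so it runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(
    """
    CREATE TABLE boards (
        uri   TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE threads (
        id    INTEGER PRIMARY KEY,
        board TEXT NOT NULL REFERENCES boards(uri)
    );
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,
        thread  INTEGER NOT NULL REFERENCES threads(id),
        comment TEXT NOT NULL
    );
    """
)

conn.execute("INSERT INTO boards VALUES ('tech', 'Technology')")
conn.execute("INSERT INTO threads VALUES (1046836, 'tech')")
conn.execute("INSERT INTO posts VALUES (1046836, 1046836, 'Which database?')")
conn.commit()

# A post pointing at a thread that does not exist is rejected by the schema.
try:
    conn.execute("INSERT INTO posts VALUES (1, 999, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    orphan_rejected = True

print(orphan_rejected)
```

The relationships live in the schema, so the database enforces them instead of every piece of application code having to remember them.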


 No.1047231

>>1046866

>>>r/iamverysmart


 No.1047232

Oracle XML DB


 No.1047234>>1047236

>non-relational databases work better when you're collecting a lot of different fields and don't know how they should relate to each other, but you want to do business intelligence operations to that data. IE: If you are storing posts from a social media source into a db so you can do inferences and provide feedback stats for a CRM system, you will want to use a non-relational system (possibly with Scala).

I'm not sure if I understood this particular example in contrast with a previous one.

You could relate posts to a user, to a place it was posted etc, and you could relate some stuff to the post as well, like, the amount of likes, replies etc.


 No.1047236

>>1047234

I understand what you are saying but if you are doing BI for a social media segment you want to do analysis with posts as the main input to the db (individual users do not matter in this context)

your model would also be useful, but the sort of overhead for it would only be worth it if you want to identify brand promoters vs brand detractors and bad actors, where user persona is a factor of your analysis


 No.1047239>>1047343

>>1047224

>Adding ?limit=50 makes it so the resulting view has at most 50 posts (by default the newest ones).

>Adding ?since=20190329T190256Z makes it so the resulting view has posts newer than the specified time.

In case of HTML, that would imply dynamic page generation on server-side, which is a spook. Like, that would mean unzipping HTML generator each time a user makes a GET request. Worst case scenario it knocks our shit out momentarily.


 No.1047242>>1047343

>>1047224

>Building a JSON document or an HTML document in this case is very close in terms of performance.

Hell no.

>Adding ?limit=50 makes it so the resulting view has at most 50 posts (by default the newest ones).

Kiss caching goodbye. That's a terrible idea.


 No.1047244

sqlite is technically the best but when it's not enough, postgres


 No.1047343>>1047371

>>1047239

>that would mean unzipping HTML generator

?

>>1047242

>Kiss caching goodbye.

Why? Most users would hit a cache.


 No.1047363

>>1047200

>I don't see what you're getting at.

There should be a header that specifies the entire data type the body should be. This doesn't need to be sent every request or even in the same request, but it must remain consistent with the body.


 No.1047371

>>1047343

Maybe he thinks that every user would request a slightly different number of replies for shits and giggles, instead of 99% using the default.
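
A toy illustration of that point, assuming the rendered page is cached per distinct set of parameters. render_thread is a stand-in for the expensive page build, not anyone's actual code, and the 99-to-1 split is just the number from the post above.

```python
from functools import lru_cache


@lru_cache(maxsize=4096)
def render_thread(board: str, thread: int, view: str = "html", limit=None) -> str:
    """Stand-in for the expensive page build; cached per distinct argument set."""
    return f"<{view} view of /{board}/{thread}, limit={limit}>"


# 99 requests for the default view, 1 with a custom limit.
for _ in range(99):
    render_thread("mlp", 123456)
render_thread("mlp", 123456, "html", 50)

info = render_thread.cache_info()
print(info.hits, info.misses)  # 98 2
```

Only two pages were actually built. A real setup would still need to invalidate entries when a new post arrives, and an arbitrary ?since= value would want rounding or a separate path to avoid filling the cache with one-off keys.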


 No.1047372


>>1046836 (OP)

SQLite has the engine embedded into your software as a library, so nothing runs persistently in the host OS; it's not intended for high write throughput but has great portability. Everything else is pretty much the same thing, just different flavors. By selecting a specific one you can optimize for certain tasks, but if you don't have a pressing need to do this, you can just pick the default: MySQL is the usual database server of choice for web apps. It also supports a bunch of different storage engines optimized for specific loads, but again, if you don't have specific pressing requirements then just stick with the default.

As far as selecting particular implementation goes, what you do is create a benchmark that simulates your load accurately (or test it on live service, provided your system won't fail if you switch the database) and test all available configurations to see which one works best for you.
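
A bare-bones sketch of such a benchmark, assuming the load is "one committed insert per post" and the configurations under test are two SQLite journal modes. bench_inserts and the row count are arbitrary choices; a real benchmark would replay your own read/write mix against every candidate database.

```python
import os
import sqlite3
import tempfile
import time


def bench_inserts(journal_mode: str, n: int = 200) -> float:
    """Time n single-row committed inserts under the given journal mode."""
    path = os.path.join(tempfile.mkdtemp(), "bench.db")
    conn = sqlite3.connect(path)
    conn.execute(f"PRAGMA journal_mode={journal_mode}")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, comment TEXT)")
    conn.commit()
    start = time.perf_counter()
    for i in range(n):
        conn.execute("INSERT INTO posts (comment) VALUES (?)", (f"post {i}",))
        conn.commit()  # one transaction per post, like separate requests
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


results = {mode: bench_inserts(mode) for mode in ("DELETE", "WAL")}
for mode, seconds in results.items():
    print(f"{mode}: {seconds:.3f}s")
```

The numbers depend entirely on your disk and OS, so run it on the hardware you plan to deploy on.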


 No.1047406>>1047571 >>1048103 >>1048207

>>1046836 (OP)

Never really was into web but what's the advantage of using a DB over “lol just files” or whatever?


 No.1047571


 No.1048103>>1048207

>>1047406

A database is "lol just files" except it's a huge, fully featured, tightly integrated "lol just files" that often comes with features like durability (see post above), sorting, indexing, lookups, automation, views, operations-as-transactions and more. As you increase the complexity of an ad-hoc "lol just files", it eventually becomes indiscernible from an ad-hoc database.
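
A small demonstration of two of those features, indexing and operations-as-transactions, using Python's built-in sqlite3. The schema and the add_posts helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, thread INTEGER NOT NULL, comment TEXT NOT NULL)"
)
# Index so "all posts in thread X" does not scan the whole table.
conn.execute("CREATE INDEX posts_by_thread ON posts (thread)")
conn.commit()


def add_posts(rows: list) -> None:
    """Insert all rows or none: `with conn` commits on success, rolls back on error."""
    with conn:
        conn.executemany(
            "INSERT INTO posts (id, thread, comment) VALUES (?, ?, ?)", rows
        )


add_posts([(1, 100, "first"), (2, 100, "second")])
try:
    # The second row violates NOT NULL, so the first row must not persist either.
    add_posts([(3, 100, "third"), (4, 100, None)])
except sqlite3.IntegrityError:
    pass

count = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE thread = 100"
).fetchone()[0]
print(count)  # 2
```

Getting the same all-or-nothing behaviour out of plain files means writing your own journal, which is the point where "lol just files" starts turning into a database.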


 No.1048132>>1048227

The only best DB is sqlite read/written directly to/from the disk block device. Skip the filesystem middle man, they just want you to believe it's necessary.


 No.1048207

>>1047406

>>1048103

In my experience there's no practical difference if what's being implemented is a small system with relatively little data and few users.

However, as the data expands and the number of users increases, a database becomes faster, more stable and more reliable than just using files, especially if you're using NTFS.


 No.1048227


 No.1057062

>>1046836 (OP)

Sqlite is the best db until you outgrow it and need distributed architecture / sharding. Then you're fucked and have to hire a team of 30 engineers to figure out how to make it scale.



