
/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.

New user? Start here ---> http://hydrusnetwork.github.io/hydrus/

Experienced user with a bit of cash who wants to help out? ---> Patreon

Current to-do list has: 1,478 items

Current big job: none, voting on next job


8375b0  No.5680

Namespaces should be sorted, but how?

Here is a proposed solution: a tree-like namespace for the tag categories that have grown too big.

The text below is an example of how it could look.


Definitely necessary
  Artist (A) - The people who drew it
  Character (C) - Characters involved
  Fetish (F) - Erotic tropes (NSFW)
    A way to categorize sub-genres
  Group (G) - Artistic guilds (yes, it's a thing)
  Knowledge (K) - Field of study /pol//tech/
    Field of study and grouping of topics
  Universe (U) - Alternate realities
  Parody (P) - Shows and movies it references
    Franchises of the same story/plot (higher level)
    Specific series or shows (lower level)
  Season (S) and Episode (E)
    sXXeXX (two numbers) vs eXXX (singular number)
    sXX (series number)
    XXXXXXX (episode or season name)

May not be necessary
  Descriptor/Miscellaneous (D/M)
  Body (B) - Posture, features and clothing
    Posture
    Eyes
    Hair
    Skin
    Upper body clothing
    Lower body clothing
    Footwear
    Accessories
    etc...
  Items (I) - Objects used
    Animals
    Furniture
    Tools/Stationery
    Food/Drink
    Weapons
    etc...
  Location (L) - Plot setting
  Offshoots (O) - "Donut steel" <= containment for cancer
  Request (R) - Donor contributions

Other junk tags not related to subject, used to maintain quality
  Captioning
  Infographic
  JPGcompression
  MSPaintings
  ScreenCap
  SeizureWarning
  SourceRequest
  SourceLost
  TextWall
  Watermark

Searching image dimensions and file type could also help

Can Hydrus ever get to the mainstream?

Definitely. Since (Dan|Gel)boorus are now shit, perhaps Hydrus+IPFS can kill them.

1. Querying an IPFS node that returns a list of IPFS addresses and thumbnails, which are then used to download images. Great when you are away from home with fast Internet and a home node, but only have a smartphone. Goal: Mobility

2. A web-browser-friendly client that is easy to install and does not need fiddling with manual file allocation. Greasemonkey-style customization for booru and Tumblr/deviantArt websites, with a one-click tag/artist download button. Goal: Accessibility

3. Literal dismantling of deviantArt and Tumblr using zeronet as layout and zeronet as database. Blog vs Gallery layouts, standard customizations, upvotes, chat, following, personal vs group sites, donation box (why not?), search engine with tagging. Goal: Competition

4. A wiki for keeping information about tags and related subjects, with a forum-esque conflict resolution system using zeronet to debate the accuracy of such information. Providing IPFS links to anime and manga, being better than bittorrent info sites. Goal: Resolution

1 and 2 are solved by others through the Hydrus API; 3 and 4 will need help from zeronet.

Problem for #3: Decentralized networks are hard to search.

Problem for #4: Fiddling wikis with IPFS would be a challenge.

All suggestions are welcomed. The more problems to solve, the better.

8375b0  No.5681

#2 web-browser-friendly GUI client for mass adoption

Drag-and-drop model for downloading and managing images

Drag-and-drop model for creating, grouping and managing tags

Operates either in the web browser or in a simplified Hydrus client

A sleeker GUI panel style, or an enhanced Hydrus client layout

Multi-language support for translated interface and filenames

Indicator for size per update and batch download (if possible)

Audio/Video support for mp3, m4a, aac, flac, mp4, mkv, webm

Ebook support for pdf, epub, mobi, html, txt, libreoffice

Greasemonkey for adding download buttons for convenience

Complex AND/OR/XOR of tags

( tag_A AND tag_B ) OR ( tag_C XOR tag_D XOR tag_E ) would be a proper syntax, using () or [] for nesting. Operators could be evaluated left-to-right for ease of understanding.
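A minimal sketch of how that syntax could be evaluated against one file's tags. Everything here is an assumption for illustration: `matches` is a hypothetical helper, not a Hydrus function, and tags are taken to contain no whitespace or brackets. Operators have no precedence and are applied strictly left to right, as suggested above.

```python
import re

# Hypothetical helper, not part of Hydrus.
# A token is either a single bracket or a run of non-space, non-bracket characters.
TOKEN = re.compile(r"[()\[\]]|[^\s()\[\]]+")
OPS = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
}


def matches(query, file_tags):
    """Return True if the set `file_tags` satisfies the boolean `query`."""
    tokens = TOKEN.findall(query)
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in ("(", "["):
            value = expression()
            pos += 1  # skip the closing bracket
            return value
        return tok in file_tags

    def expression():
        nonlocal pos
        value = operand()
        # No precedence: each operator is applied as soon as it is read.
        while pos < len(tokens) and tokens[pos] in OPS:
            op = OPS[tokens[pos]]
            pos += 1
            value = op(value, operand())
        return value

    return expression()
```

With left-to-right evaluation, `a OR b AND c` is read as `(a OR b) AND c`, so a file tagged only `a` does not match it. Brackets are needed whenever a different grouping is wanted.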


aed5e0  No.5682

>>5680

Are you on drugs?

Namespaces can already be sorted alphabetically, this post makes absolutely no sense.


8375b0  No.5683

>>5682

care to explain?


f024ab  No.5684

>>5682

I think they mean organized rather than sorted.


f024ab  No.5685

>>5680

>Definitely necessary

Why are those definitely necessary? I'd be surprised if more than 1% of my pictures fall under the parody or season or episode category.

>Can Hydrus ever get to the mainstream? Definitely. Since (Dan|Gel)boorus are now shit, perhaps Hydrus+IPFS can kill it.

Why are boorus shit (honest question, I think they're useful, not trying to stir anything)?

Where do you live that IPFS is mainstream?

All in all your post is not really clear to me, both in what you want to achieve and why you want it.


8375b0  No.5686

>>5685

Those tag categories are from (Dan|Gel|Derpi)Booru and Konachan.

The "parody" category is what some other places called "Copyright".

The Season-Episode category is useful for fandom-specific boorus like Derpibooru (not a brony, just getting inspiration). Similar systems have been implemented in subreddits for spoiler control.


8375b0  No.5687

>>5685

>Why are boorus shit (honest question, I think they're useful, not trying to stir anything)?

They are starting to fight adblockers, they have bad admins/mods, and they have paywalls at all. The Hydrus chat has known about it for a while.

>Where do you live that IPFS is mainstream?

No, this piece of software CAN make IPFS mainstream. Decentralized image tagging is such a useful tool that it will make IPFS look good. I do have faith that IPFS will outperform bittorrent in the near future.


93f5ad  No.5698

>>5683

>lexicographic (a-z) (grouped by namespace)

For that to work for what OP is suggesting with what we have now, we'd have to have, like, double namespaces. "Body:Eyes:", etc.
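A rough sketch of what "double namespaces" could mean in practice: split a tag on every colon and file the last piece under the nested path. `split_tag` and `build_tree` are made-up names, and the nested-dict layout is only an assumption, not how Hydrus stores tags.

```python
def split_tag(tag):
    """Split 'body:eyes:blue' into (('body', 'eyes'), 'blue')."""
    *namespaces, subtag = tag.split(":")
    return tuple(namespaces), subtag


def build_tree(tags):
    """Group tags into nested dicts; the '' key holds the subtags at that level."""
    tree = {}
    for tag in tags:
        namespaces, subtag = split_tag(tag)
        node = tree
        for namespace in namespaces:
            node = node.setdefault(namespace, {})
        node.setdefault("", []).append(subtag)
    return tree
```

Note that any tag with a colon in its text (a title, for instance) would be split too, so a real version would need a list of known namespaces.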

Also

>Searching image dimensions and file type could also help

We've got that though, "system:dimensions" and "system:mime" in the search box.

Also suddenly seeing "hair:" and "eyes:" namespaces. There's also a handful of "cloth:" tags, and "clothing:" is being used for thigh-highs, and ONLY thigh-highs, and was made a sibling of every instance of the original tag.

So #4 is something we're desperately in need of, but decisions need to be made first. Hell, I'd settle for cleaning up the countless files that have long strings of unparsed tags, or tags taken from hentai-foundry that have a single word, so character names are split up over two tags and shit like that.


24dbf9  No.5709

>>5698

What we're most in need of is some way to organize tags at all. As it stands now, if I accidentally misname a tag on one file in one database, it could go unchecked practically forever. This is especially worrying if you use a lot of unique tags on each image, like a Filename: tag. Right now, something that allows you to simply and effectively merge tags into one is key. It's why people use namespaces so much: it's easy to keep your own tags from being tainted by stuff like "Body:" or "Clothing:" tags and still keep all the advantages the public repo allows.
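A small sketch of the "merge tags into one" idea, assuming a plain dict that maps a wrong tag to the preferred one. The names `resolve` and `merge_tags` are hypothetical; this is not the Hydrus sibling implementation, only the general shape of it.

```python
def resolve(tag, siblings):
    """Follow 'wrong tag -> preferred tag' links until none is left."""
    seen = set()
    # `seen` stops the walk if two tags were accidentally mapped to each other.
    while tag in siblings and tag not in seen:
        seen.add(tag)
        tag = siblings[tag]
    return tag


def merge_tags(file_tags, siblings):
    """Rewrite every file's tag set through the sibling map."""
    return {
        file_id: {resolve(tag, siblings) for tag in tags}
        for file_id, tags in file_tags.items()
    }
```

One entry in the map then fixes a misnamed tag on every file at once, instead of hunting for each file by hand.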


0a0d55  No.5710

>>5682

He wants namespace groups, as do I.

We may disagree on the way to do it though.

I want to be able to organize it locally, and import any other user's grouping.

In 'my system' it would be possible to assign the color at the group level and define the order of the groups.

Every non-namespaced tag would belong to the 'unsorted group', which, besides not being deletable, would be moveable like any other.


0a0d55  No.5711

>>5680

I'd rather have the posture and gesture group be distinct from the body group.

Instead of the artist group, I'd prefer a creator group encompassing artist, circle, editor and translator.

Going beyond the boorus should eventually be the goal for users. I think sorting namespaces by group would be better though.

>testing codesbox 

>>5687

>I do have faith that IPFS will out perform bittorrent in the near future.

Seeing what just happened with nyaa, the faster the better.


0a0d55  No.5712

>>5680

I failed, how do I get pretty colours into the boxes?


8375b0  No.5715

>>5711

If Artists and Circles won't clash, it could be done.

Good on you for the posture-gesture-expression difference.

>>5712

Just use \

 \


8375b0  No.5716

>>5712

>>5715

I meant [cod e] [/code] (no spaces in the first bracket)


21709f  No.5717

I think Hydrus doesn't have the ability to do the full complement of set operations during searches - union / intersection / difference etc … yet, right ?

Seems to me like they would be important in case some arrangement into a hypothetical single bigger tree wasn't entirely perfect.
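For reference, this is what the full complement looks like on plain Python sets. The `index` dict (tag to set of file ids) and its contents are invented for the example and say nothing about how Hydrus stores or searches files.

```python
# Hypothetical in-memory index: tag -> set of file ids.
index = {
    "character:blue": {1, 2},
    "creator:blue": {2, 3},
    "medium:pixel art": {3, 4},
}


def files(tag):
    """Return the ids of all files carrying `tag` (empty set if unknown)."""
    return index.get(tag, set())


union = files("character:blue") | files("creator:blue")          # either tag
intersection = files("character:blue") & files("creator:blue")   # both tags
difference = files("character:blue") - files("creator:blue")     # first but not second
symmetric = files("character:blue") ^ files("creator:blue")      # exactly one of them
```

With these four, a file that sits in the wrong branch of a tree can still be reached by combining branches.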


f129b5  No.5737

for https://archiveofourown.org/

As long as you can capture these two tags, the text will be preserved.

<!-- BEGIN section where work skin applies -->
<!-- END work skin -->

The tag box starts in

<dl class="work meta group" role="complementary">
and ends in
</dl>

The photos use hyperlinks that reference other websites.

Expect the need for headers and titles, summaries and notes.
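A minimal sketch of capturing the text between two markers, such as the work-skin comments above. `between` is a made-up helper that does plain string searching; it assumes the markers appear verbatim in the page and does no HTML parsing at all.

```python
def between(html, start, end):
    """Return the text between the first `start` marker and the next `end` marker.

    Returns None if either marker is missing.
    """
    i = html.find(start)
    if i == -1:
        return None
    i += len(start)
    j = html.find(end, i)
    if j == -1:
        return None
    return html[i:j]


BEGIN = "<!-- BEGIN section where work skin applies -->"
END = "<!-- END work skin -->"
page = "<div>" + BEGIN + "<p>Chapter text</p>" + END + "</div>"
story = between(page, BEGIN, END)
```

The same helper would work for any start/end pair, as long as the site does not change its markup.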

for https://www.wattpad.com/

The text starts with

<pre><p data-p-id="
and ends with
</p></pre>

Capture the chapter image with

<img data-index="1" src="https://d.wattpad.com/${STUFF}" alt=""/>

The tags start with

<div class="tags">
and ends with
</div>

for https://fanfiction.net/

Starts with

<div class='storytext xcontrast_txt nocopy' id='storytext'><p>

Ends with

</p>

Problem: Tagging references and crossovers, full image 403 forbidden


f129b5  No.5738

>>5737

For https://deviantart.com/

The text starts with

<div class="grf-indent"><div class="text">
and ends it with
<script type="text/javascript">

Yes I know, they literally inject scripts before closing the </div>

The image can be found at

<meta property="og:image" content="${LINK}">
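A sketch of pulling that og:image link out of a page using only the standard library's `html.parser`. The class and function names are made up, and the sample URL is a placeholder, not a real deviantArt address.

```python
from html.parser import HTMLParser


class OgImageParser(HTMLParser):
    """Remember the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.image = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:image" and self.image is None:
            self.image = attrs.get("content")


def og_image(html):
    """Return the og:image URL found in `html`, or None if there is none."""
    parser = OgImageParser()
    parser.feed(html)
    return parser.image


sample = '<html><head><meta property="og:image" content="https://example.invalid/a.png"></head></html>'
link = og_image(sample)
```

A parser is a bit more forgiving than string matching here, since attribute order and quoting can vary between pages.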


8375b0  No.5796

All codes are tested, if anyone wants to, give it a whirl!

from re import match
from urllib import request

import requests
from bs4 import BeautifulSoup


def ao3_rip(ao3id):
    if isinstance(ao3id, int):
        ao3id = str(ao3id)
    else:
        assert isinstance(ao3id, str)
    r = request.urlopen('http://archiveofourown.org/works/' + ao3id + '?view_adult=true&view_full_work=true').read()
    text = BeautifulSoup(r, "lxml")
    # Each <dd> in the work meta block is one tag category.
    tag_bar = text.find("dl", class_="work meta group", role="complementary").find_all("dd")
    tag_list = {}
    for i in tag_bar:
        x = [j.get_text() for j in i.find_all("a", class_="tag")]
        if x == []:
            x = [i.get_text().lstrip('\n').lstrip(' ').rstrip(' ').rstrip('\n')]
        tag_list[i.get("class")[0]] = x
    tag_list['author'] = [i.get_text() for i in text.find_all("a", rel="author")]
    del tag_list["stats"]
    summary = text.find_all("div", class_="summary module", role="complementary")
    chapter = {}
    for i in text.find_all("div", class_="chapter", id=True):
        x = i.find("h3", class_="title").find("a").get("href").split("/")[-1]
        chapter[i.get('id')] = [x, i]
    return ao3id, tag_list, summary, chapter


def ff_rip(ffid):  # no image
    if isinstance(ffid, int):
        ffid = str(ffid)
    else:
        assert isinstance(ffid, str)
    r = request.urlopen("https://www.fanfiction.net/s/" + ffid + "/1").read()
    text = BeautifulSoup(r, "lxml")
    tag_line = text.find("span", class_="xgray xcontrast_txt").get_text().split(" - ")
    tag_list = {}
    for i in tag_line:
        if ":" in i:
            a, b = i.split(": ")[0:2]
            tag_list[a] = b.rstrip(' ')
        else:
            # Unlabelled fields are identified by their position in the line.
            tag_list[['Language', 'Genre', 'Characters'][tag_line.index(i) - 1]] = i
    tag_list['Characters'] = tag_list['Characters'].split(', ')
    tag_list['Genre'] = tag_list['Genre'].split('/')
    del tag_list['id']
    author = text.find("div", id="profile_top").find("a", class_="xcontrast_txt", title=False, target=False)
    userid, username = author.get("href").split("/")[-2:]
    summary = text.find("div", class_="xcontrast_txt", style=True).get_text()
    chapter_list = {}
    for i in range(1, int(tag_list['Chapters']) + 1):
        r = request.urlopen("https://www.fanfiction.net/s/" + ffid + "/" + str(i)).read()
        storyid = match(".*storytextid=([0-9]+)", str(r)).group(1)
        text = BeautifulSoup(r, "lxml")
        story = text.find("div", class_="storytext xcontrast_txt nocopy", id="storytext")
        chapter_list[i] = [storyid, story]
    return ffid, userid, username, summary, tag_list, chapter_list


def da_rip(user_id, item_id):  # name of literature can replace user_id
    if isinstance(user_id, int):
        user_id = str(user_id)
    else:
        assert isinstance(user_id, str)
    if isinstance(item_id, int):
        item_id = str(item_id)
    else:
        assert isinstance(item_id, str)
    r = request.urlopen("https://deviantart.com/art/" + user_id + "-" + item_id).read()
    text = BeautifulSoup(r, "lxml")
    title = text.find("title").get_text()
    # "class" is a Python keyword, so BeautifulSoup takes class_ instead.
    artist = text.find("span", class_="name").find("a", class_="u regular username").get_text()
    summary = text.find("meta", property="og:description").get("content")
    idnum = {}
    for i in ["itemid", "splitid", "ownerid"]:
        idnum[i] = text.find("input", {"type": "hidden", "name": i}).get("value")
    type_tags = text.find("meta", {"name": "keywords"}).get("content").split(", ")
    art_tags = [i.get_text() for i in text.find_all("a", class_="discoverytag")]
    img = text.find("meta", property="og:image").get("content")
    # Drop the last two nodes: the scripts injected before the closing </div>.
    body = text.find("div", class_="grf-indent").find("div", class_="text").contents[:-2]
    return title, artist, summary, idnum, type_tags, art_tags, img, body


def watt_rip(watt_id):
    if isinstance(watt_id, int):
        watt_id = str(watt_id)
    else:
        assert isinstance(watt_id, str)
    headers = {'User-Agent': 'Mozilla/5.0'}
    book = requests.get('https://www.wattpad.com/apiv2/info?id=' + watt_id, headers=headers).json()
    chapter_list = book['group']
    result = {"tags": book["tags"].split(" ")}
    for i in ["cover", "description", "groupId", "author", "completed"]:
        result[i] = book[i]
    fields = ["id", "title", "date", "modifyDate", "language", "videoid", "photolink"]
    for j in range(1, len(chapter_list) + 1):
        chapter_id = str(chapter_list[j - 1]['ID'])
        chapter_data = requests.get('https://www.wattpad.com/apiv2/info?id=' + chapter_id, headers=headers).json()
        chapter_text = requests.get('https://www.wattpad.com/apiv2/storytext?id=' + chapter_id, headers=headers).text
        result[j] = [chapter_data[k] for k in fields] + [chapter_text]
    return result

Things to do:

1. Sort out the tags to be consistent

2. Download the images and IPFS it properly

3. Proper text formatting (especially dA and Wattpad)

4. Urllib.request vs Requests, which is better?


8375b0  No.5797

>>5796

Next goal:

1. Able to download whole writers' galleries

2. Able to download all texts by a single tag or search (maybe dangerous)

3. Using tags to bind different chapters together (for deviantArt)


8375b0  No.5805

from __future__ import unicode_literals
import youtube_dl


class MyLogger(object):  # prints out warning and error messages
    def debug(self, msg):
        pass

    def warning(self, msg):
        print(msg)

    def error(self, msg):
        print(msg)


def my_hook(d):
    if d['status'] == 'finished':
        print('Done downloading, now converting ...')


def downer(code, link):  # standard spec for Hydrus
    ydl_opts = {
        'format': code,
        'logger': MyLogger(),
        'progress_hooks': [my_hook],
        'writesubtitles': True,
        'allsubtitles': True,
        'write_all_thumbnails': True,
    }
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download([link])


def set_pick(dash, container="mp4", resolution=(360, 720), hfr=False):
    assert dash in [True, False, None], "error: dash is not bool or None"
    assert container in ["mp4", "webm", "both"], "error: container variable not correct"
    assert set(resolution) <= {240, 360, 480, 720, 1080}, "error: bad input from resolution"
    assert isinstance(hfr, bool), "error: hfr is not boolean"
    # In each table, row 0 holds the mp4 format codes and row 1 the webm ones.
    non_dash = [[18, 22], [43]]
    dash_audio = [[140], [171, 249, 250, 251]]
    dash_video = [  # 240, 360, 480, 720, 1080, 720 hfr, 1080 hfr
        [133, 134, 135, 136, 137, 288, 289],
        [242, 243, 244, 247, 248, 302, 303]]
    dash_video_flip = list(zip(*dash_video))
    # Resolution -> column indexes; the hfr columns belong to 720 and 1080.
    table = {240: [0], 360: [1], 480: [2], 720: [3, 5], 1080: [4, 6]}
    unity = sum(dash_audio, []) + sum(dash_video, [])
    royale = unity + sum(non_dash, [])

    def format_war(n):
        return dash_audio[n] + dash_video[n] + non_dash[n]

    # a: filter by dash / non-dash
    if dash is True:
        a = unity
    elif dash is None:
        a = royale
    else:
        a = sum(non_dash, [])
    # b: filter by container
    if container == "mp4":
        b = format_war(0)
    elif container == "webm":
        b = format_war(1)
    else:
        b = royale
    # c: filter by resolution (audio codes always pass)
    c = [sum([dash_video_flip[j] for j in table[i]], ()) for i in resolution]
    c = [e for l in c for e in l] + sum(dash_audio, [])
    if 360 in resolution:
        c += [18, 43]
    if 720 in resolution:
        c += [22]
    # d: drop the high-frame-rate columns unless hfr is requested
    if hfr:
        d = royale
    else:
        d = sum(dash_audio, []) + [e for l in dash_video_flip[:5] for e in l] + sum(non_dash, [])
    return list(set(a) & set(b) & set(c) & set(d))


def multi_downer(codes, link):  # please pipe from set_pick for best results
    for i in codes:
        downer(str(i), link)


8375b0  No.5806

>>5805

from __future__ import unicode_literals
import youtube_dl


class MyLogger(object):  # prints out warning and error messages
    def debug(self, msg):
        pass

    def warning(self, msg):
        print(msg)

    def error(self, msg):
        print(msg)


def my_hook(d):
    if d['status'] == 'finished':
        print('Done downloading... start converting')


def downer(code, link):  # standard video and playlist specs for Hydrus
    ydl_opts = {
        'format': code,
        'logger': MyLogger(),
        'progress_hooks': [my_hook],
        'write_all_thumbnails': True,
        'writesubtitles': True,
        'allsubtitles': True,
        'outtmpl': 'yt-dl/%(title)s/%(title)s-%(format_id)s.%(ext)s',
    }
    # Accept either a single link or a list of links.
    if not isinstance(link, list):
        link = [link]
    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download(link)


def set_pick(dash, container="mp4", resolution=(360, 720), hfr=False):
    assert dash in [True, False, None], "error: dash is not bool or None"
    assert container in ["mp4", "webm", "both"], "error: container variable not correct"
    assert set(resolution) <= {240, 360, 480, 720, 1080}, "error: bad input from resolution"
    assert isinstance(hfr, bool), "error: hfr is not boolean"
    # In each table, row 0 holds the mp4 format codes and row 1 the webm ones.
    non_dash = [[18, 22], [43]]
    dash_audio = [[140], [171, 249, 250, 251]]
    dash_video = [  # 240, 360, 480, 720, 1080, 720 hfr, 1080 hfr
        [133, 134, 135, 136, 137, 288, 289],
        [242, 243, 244, 247, 248, 302, 303]]
    dash_video_flip = list(zip(*dash_video))
    # Resolution -> column indexes; the hfr columns belong to 720 and 1080.
    table = {240: [0], 360: [1], 480: [2], 720: [3, 5], 1080: [4, 6]}
    unity = sum(dash_audio, []) + sum(dash_video, [])
    royale = unity + sum(non_dash, [])

    def format_war(n):
        return dash_audio[n] + dash_video[n] + non_dash[n]

    # a: filter by dash / non-dash
    if dash is True:
        a = unity
    elif dash is None:
        a = royale
    else:
        a = sum(non_dash, [])
    # b: filter by container
    if container == "mp4":
        b = format_war(0)
    elif container == "webm":
        b = format_war(1)
    else:
        b = royale
    # c: filter by resolution (audio codes always pass)
    c = [sum([dash_video_flip[j] for j in table[i]], ()) for i in resolution]
    c = [e for l in c for e in l] + sum(dash_audio, [])
    if 360 in resolution:
        c += [18, 43]
    if 720 in resolution:
        c += [22]
    # d: drop the high-frame-rate columns unless hfr is requested
    if hfr:
        d = royale
    else:
        d = sum(dash_audio, []) + [e for l in dash_video_flip[:5] for e in l] + sum(non_dash, [])
    return list(set(a) & set(b) & set(c) & set(d))


def multi_downer(codes, link):  # please pipe from set_pick for best results
    for i in codes:
        downer(str(i), link)


ac386a  No.5821

Is it really necessary? Calibre + fanficfare handle, well, fanfic well enough.


f4ee16  No.5828

>>5821

Ebooks of the /tech/ and /pol/ kind are what I have in mind.


93f5ad  No.5844

So I've been noticing more namespaces, and I think we need to decide what we're going to allow and what we aren't. Wasn't it mentioned that there shouldn't be namespaces that then make the base tag meaningless? So 'eyes:blue' without the namespace is just 'blue' and therefore meaningless.

Are we going to do 'item/body part:color'? 'body part:size/status'? 'firearm:weapon' as something similar to species is honestly something I can get behind (there's barely 100 tagged like this and a handful are Splatoon weapons, then over 400 with 'firearm:tagme') as I'm gun illiterate and can't always make out what the tag is supposed to be referring to.

If there's a method that would work to show all the different namespaces, that would help discussion.
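One possible way to get that overview, assuming the tags can be exported as a plain list of strings (how to export them from the client is not covered here). `namespace_counts` is a made-up helper, not a Hydrus feature.

```python
from collections import Counter


def namespace_counts(tags):
    """Count tags per namespace; unnamespaced tags are counted under ''."""
    counts = Counter()
    for tag in tags:
        namespace, sep, _ = tag.partition(":")
        counts[namespace if sep else ""] += 1
    return counts


counts = namespace_counts(["eyes:blue", "hair:blue", "hair:drill hair", "dress"])
```

`counts.most_common()` then lists the namespaces from most to least used. Titles that contain a colon would show up as bogus namespaces, so the output needs a manual look.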


93f5ad  No.5852

>>5844

Not counting accidentally namespaced tags (like unnamespaced titles with colons in them), I'm up to 135 different namespaces. Wish showing everything didn't kill my computer.


e9f9ac  No.5856

>>5844

>>5852

Would a tree-based namespace solve the problem?

Originally I was thinking a directory-like namespacing system, but then I realized that it would waste disk space.

Is it possible to have a category-parent-child tag system? with a separate database interlinking different parent tag as a hierarchy?
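A sketch of that idea with the standard `sqlite3` module: tags stay flat, and a separate link table records which tag is a parent of which. The table and function names are invented for the example and do not describe the Hydrus database.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one child -> parent link table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tag_parents (child TEXT, parent TEXT, PRIMARY KEY (child, parent))"
    )
    return db


def add_parent(db, child, parent):
    db.execute("INSERT OR IGNORE INTO tag_parents VALUES (?, ?)", (child, parent))


def ancestors(db, tag):
    """Return every tag above `tag` in the hierarchy."""
    rows = db.execute(
        """
        WITH RECURSIVE up(tag) AS (
            SELECT parent FROM tag_parents WHERE child = ?
            UNION
            SELECT p.parent FROM tag_parents AS p JOIN up ON p.child = up.tag
        )
        SELECT tag FROM up
        """,
        (tag,),
    )
    return {row[0] for row in rows}


db = make_db()
add_parent(db, "eagle", "bird")
add_parent(db, "bird", "animal")
```

Since only the links are stored, the full path is never copied into each tag, which avoids the disk-space cost of a directory-like scheme. `UNION` (rather than `UNION ALL`) also keeps the query from looping if someone creates a cycle.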


4a3b79  No.5857

>>5844

An "eye" or "hair" namespace is, in my opinion, utterly useless. Namespaces are there first and foremost to remove ambiguity, such as when a character or an artist have a name that could also be used as a tag. For example, creator:blue and character:blue. Seriously, we don't need a namespace for every single goddamn thing. It's just confusing. There's a reason virtually all boorus just limit it to creator/character/copyright, plus maybe a couple special ones.


0a0d55  No.5861

>>5857

That's because they use namespace to arrange thematically close tags.

hair:blue hair and hair:drill hair can thus be together.

Being able to create groups of tags and to organize how the groups are sorted would remove this problem.

>>5857

>There's a reason virtually all boorus just limit it to creator/character/copyright

Yes, laziness and limited competence. It's a headache to think up a whole complex namespace system. Where do you start, where do you stop?

We can start with the simple rule: every tag must have a meaning without its namespace.

language:japanese works, background:blue doesn't at the moment.

I wonder if it would be possible to present tags in this way:

medium:webm, transparent background, dated.


0a0d55  No.5862

>>5861

As a second capital rule, do some research before doing anything with a tag, ask if needed.

I just found out while parenting some translation (medium:ドット絵 namely) that pixel (artwork) was turned into medium:pixel artwork, which pixel art and medium:pixel art were aliased to.

No one ever used pixel artwork to describe this. It's either sprite or pixel art.

I'm sorry if it felt like a lecture but the same happened not long ago with the pixiv tag for original work, aliased into original character.

/Autism off

gonna do some petition.


0a0d55  No.5863

File: 40d3de617df0f27⋯.jpg (93.35 KB, 418x152, 11:4, pixel artwork.jpg)

>>5862

Actually, that would be a good way to gauge how we deal with diverse tags.


Sorted by new tag in sibling:
All of these are aliased to medium:pixel artwork
>sprites
Considering 'sprite' remains a tag, it is better to alias it to 'multiple sprites', and not only that but to create the implication 'medium:pixel art'.
>sprite art
It is a synonym of pixel art, which is the usual name for this art style; to be aliased to pixel art.
>pixel art
>pixel (artwork)
Tag from e621, same situation as sprite art, same solution.
>medium:pixel art
This should be the end product.
>hydrus invalid tag: ""medium:pixel art""
It's broken, will not touch, might transmit space aids for all I know.
>game sprite
It is a terrible decision to alias it to pixel art or pixel artwork, as this would remove useful information. Not sure if medium:game sprite or leaving it unnamespaced would be better.

Tell me what you all think of this situation and what you would do.


93f5ad  No.5865

>>5862

>no one ever used pixel artwork to describe this

Yeah, same case with 'clothing:thigh-highs', every instance was turned into that, a namespace that doesn't exist for any other item of clothing, and the full tag isn't used anywhere at all outside of the tag sibling.

>>5856

Honestly right now the bigger problem that needs to be solved are the nonsensical unparsed tags. And then, I don't know, ban anyone who doesn't even look to see what they're adding to the PTR. I'm thinking it'll be easier just to clear all the tags from the mangled parses, instead of manually separating the tags for thousands upon thousands of these individual clusterfuck tags.

On the actual subject, there's the 'male/female:' namespace. Things like 'female:breasts' as if that isn't redundant, or 'female:blue hair'. Using this sort of tag would make it nearly impossible to use tag siblings, and would double the number of tags a single image has, if we were to combine it with other namespaces. You'd have 'female:blue hair' and 'hair:blue', and if there was a guy in the picture with blue hair, too, then you'd also have 'male:blue hair'.

Also made a pastebin of all the existing namespaces I could find.

https://pastebin.com/3eyVwduH


0a0d55  No.5869

clothing:tag may be inspired by this:

https://vndb.org/v12033/chars#chars

while female:tag & male:tag are surely inspired by exhentai and nhentai.

I must acknowledge that it's not per se stupid but perhaps the wrong approach.

I wonder if it could be possible to have dynamic groups for each file and to autistically arrange the filename in these groups, namespaced or not.

I'll have to think this through. I'm not even sure hydrus could support this.


93f5ad  No.5879

>>5869

Imo we should tag things based on how hydrus currently functions.

Going through the weird namespaces, I figured that the 'category:' namespace is all one guy posting photos of plastic models and shit. Literally nothing is namespaced past 'category:', 'id:' that doesn't have any meaning to anyone but him, and 'uploader:', which led me to figure out that most of the images are taken from myfigurecollection. 85k files worth of worthless tags. If you're reading this figure!anon, please stop.

Then there's the anon carefully categorizing every futa edit by this one person, and possibly someone else with all of Dmitry's futa art… with tags like "site:", "official release:yes", the release date, filename, "official art", "siterip" and "art", and no actual informational tags, more like image metadata.

tl;dr: autists spamming the ptr with photo metadata that's useless to everyone and I'm whining about it


011ab0  No.5881

>>5879

So basically separate image metadata and tags?

Okay. Here is a list of things that could be metadata:

1. Image source

2. Image file type, transparency, grayscale/color, dimensions and file size

3. Image name and ID, and maybe collection name and ID
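A rough sketch of such a split, assuming the metadata can be recognised by namespace alone. The namespace list is a guess for illustration and `split_metadata` is a made-up name; file type, dimensions and size would come from the file itself rather than from tags.

```python
# Assumed list of namespaces that describe the file rather than its content.
METADATA_NAMESPACES = {"source", "site", "filename", "id", "uploader"}


def split_metadata(tags):
    """Return (metadata_tags, descriptive_tags), keeping the input order."""
    metadata, descriptive = [], []
    for tag in tags:
        namespace, sep, _ = tag.partition(":")
        if sep and namespace in METADATA_NAMESPACES:
            metadata.append(tag)
        else:
            descriptive.append(tag)
    return metadata, descriptive


meta, desc = split_metadata(["id:123", "uploader:anon", "character:cirno", "dress"])
```

A file whose `desc` list comes back empty is the kind of upload that carries metadata only.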


011ab0  No.5890

Things that should definitely NOT exist:

1. Newest posts/comments/threads/news in search page

2. Cluttering window and thumbnails with massive tags

3. An upvote-downvote system by default

Considerations of layout design:

1. Panel thumbnail vs spaced thumbnail

2. Full image thumbnail vs Expanded-by-pointing

3. Physical size vs image size vs number of thumbnail

4. Infinite scroll vs full page view and arrow key controls

5. Custom fanciness i.e. e621 and derpibooru


93f5ad  No.5896

>>5881

Well that too, I'm just saying that if all you have is image metadata and you don't even have a series name or something, don't dump that onto the PTR.


011ab0  No.5897

>>5828

One more note, PDF, EPUB and MOBI (FB2/DJVU/PDB maybe?) should be supported


f78892  No.5899

>>5897

Calibre already exists and does it all better.


011ab0  No.5901

>>5899

Then could Hydrus launch Calibre for ebooks?

If so, why does Hydrus use ffmpeg instead of mpv?

>>5865

>>5869

>>5879

Some case studies:

https://e621.net/wiki/show/e621:tagging_checklist

https://chan.sankakucomplex.com/wiki/show?title=help:_tags

Also, Derpibooru has episode tags for spoilers


93f5ad  No.5903

>>5901

Your links don't say anything about name spaces outside of artist ('creator' being used here), character, copyright ('series' here, other usage actually fits under 'studio'), medium, meta, and studio.

Episode is something that's rarely relevant, and typically loses relevance once a series has concluded. It's fine existing in the PTR, but I feel that cases of it right now are from tags just being pulled from everything with no consideration. Possibly the same with the 'eyes:' and 'hair:' namespaces, but I don't know what site those would come from.

I was looking at the aircraft and firearm namespaces and was considering something for robot models and mecha and the like, since they're not really characters (depends on how the robots are treated in the series though) and like with aircraft and firearms, the make and model aren't always actually descriptive.


93f5ad  No.5905

Can we stab whoever is dumping the files with the "type:" namespace? They're doubling up on the series tags by making each one a "type:" tag, too.


e19a4f  No.5906

>>5903

For the episode tag, ease of searching episode-specific material should also be considered.

For e621, body/species, fetishes, posture/expression, clothing and location would be useful for some.


93f5ad  No.5907

>>5906

Yes, but those are all tags already, but without namespaces, save for species. For instance, a dress would be tagged as "dress", not "cloth:dress" or if it's a girl wearing a dress "female:dress", just "dress". If someone wants to search for or exclude guys wearing dresses, they would simply search for "crossdressing" or "-crossdressing" in addition to "dress".
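The include/exclude search described here can be sketched in a few lines. This is an illustration only: `search` is a hypothetical function, the query is split on spaces, so tags containing spaces are not handled, and none of it reflects Hydrus internals.

```python
def search(files, query):
    """Return ids of files that have every plain tag and none of the '-' tags.

    files: {file_id: set of tags}; query: space-separated tags.
    """
    terms = query.split()
    include = {t for t in terms if not t.startswith("-")}
    exclude = {t[1:] for t in terms if t.startswith("-")}
    return {
        file_id
        for file_id, tags in files.items()
        if include <= tags and not (exclude & tags)
    }


files = {
    1: {"dress"},
    2: {"dress", "crossdressing"},
    3: {"hat"},
}
```

So `search(files, "dress -crossdressing")` keeps file 1 only, without any "female:dress" style tag being needed.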


7274bb  No.5908

>>5907

My worry is that some clothing, location and fetish tags can have name clashes with character names and others (since parens and brackets are forbidden).

One such case would be Senketsu the character vs Senketsu the costume (could be resolved by stating what is and is not a character).

Another case is Panty Anarchy (if the name is not given in full).

It could also be possible that a location is named after a person.


93f5ad  No.5909

>>5908

>since parens and brackets are forbidden

No they aren't, and we've got the "character:", "series:", and "creator:" namespaces.

Then you've got the "person:" namespace for real people.

Okay so example I came across.

You've got the pokemon Dragonite, right? There's an artist called "Dragonite". There were a bunch of different tags. "Dragonite", "dragonite (pokemon)", "character:dragonite" "dragonite (artist)", "creator:dragonite", so on and so forth. There is no reason to have the "(artist)" at the end because it's supposed to be in the "creator:" namespace. The pokemon dragonite is going to be in the "species:" namespace, having been moved from the "character:" one because really it could be both, but since we're already using the species namespace from the various furry boorus, might as well.

So I don't see where the confusion would come in unless you're adding tags to the repository without actually looking at them.


8375b0  No.5910

>>5909

The dev is not fond of parentheses and brackets. Check the discord.

Quote from the chat: "The hydrus dev prefer without the ()…"

This is why I am asking for an alternative: tag trees, or around a dozen namespaces.

Also, how is the dev going to implement nested boolean operators in the future without parens/brackets?


0a0d55  No.5911

File: bd6d6eee4e19514⋯.gif (2 MB, 340x307, 340:307, bd6d6eee4e195144d02f679839….gif)

>>5909

A disaster may happen if a sibling namespaces a tag, or if someone searches for every instance of a word in order to namespace it.

I've seen a gotyui image tagged as creator:policeman because of the unnamespaced policeman tag on it.

This is more a matter of vigilance really, and the program is far from mature.

>we're already using the species namespace from the various furry boorus

And what a disaster it is…

species:mammal

>mammal

>species

class:mammalian would have been better.

'taxonomy:tag' would encompass all of this without being weird.

monster:tag, pokemon:tag, species:tag fit nicely under it, but I'm really hesitant to put the hours into redoing what already exists.

Especially when what I really want is tags sorted under the taxonomy tag group, with the unnamespaced vernacular name (cat) and the whole phylum:tag, order:tag, class:tag set as an on/off display under their parent vernacular name.

AKA, you click on this unnamespaced animal and you get all its taxonomy.

Here's a musing from my local tags

1 animal

1 person

animal

butterfly

class:insecta

common name:atlas moth

family:saturniidae

genus:attacus

hand

kingdom:animalia

medium:animated

moth

order:lepidoptera

phylum:arthropoda

species:attacus atlas

title:attacus atlas


57189d  No.5912

>>5911

That will be overdoing it for the database at this stage.

I would say having a system like bird-eagle would be good enough.

We should reduce the grand namespace to less than two dozen for best user experience.


0a0d55  No.5913

>>5912

that's why it's local at this point.

The database would quickly be even more of a mess with this.

>We should reduce the grand namespace to less than two dozen for best user experience.

I don't have a stance either way as long as the namespace is correct and useful.

series:comic market

event:comiket#89

would annoy me way less than

meta:cirno edits

or

meme:reddit

I think taking a look at the tag groups may be a good start. The medium namespace could gain from a subdivision into artstyle:tag, format:tag and medium:tag, perhaps…

I'm really undecided on this.


0a0d55  No.5914

File: 164192fe8bafa9c⋯.png (453.57 KB, 1700x700, 17:7, 164192fe8bafa9cbdbe0a000aa….png)

>>5912

>I would say having a system like bird-eagle would be good enough.

I didn't give it the attention I should have at first.

Do you mean for this image:

moth-insecta

moth-animalia

moth-lepidoptera

moth-arthropoda

I'm not sure I get exactly what you are suggesting.


93f5ad  No.5918

>>5910

Yes, so the tags are imported with (artist) at the end or whatever, then you change it to the "creator:" namespace and delete the ending tag. The only conflicts would be between, say, characters with the exact same full name, which you'd then get around by specifying the series. I'm not even sure what you're suggesting now, unless you can think of a single word that would have two different applications that aren't an artist, character, or series name.
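The move described above (imported tag ends in "(artist)", gets put into "creator:" and loses the suffix) could be sketched roughly like this. The suffix mapping and the function name are made up for illustration; this is not how hydrus itself does it.

```python
import re

# Hypothetical mapping from booru-style suffixes to namespaces.
# Only the two cases from the Dragonite example are listed.
SUFFIX_TO_NAMESPACE = {
    "artist": "creator",
    "pokemon": "species",
}

# "name (suffix)" at the end of the tag, e.g. "dragonite (artist)".
_SUFFIX_RE = re.compile(r"^(?P<name>.+?)\s*\((?P<suffix>[^()]+)\)$")


def move_suffix_to_namespace(tag: str) -> str:
    """Turn 'dragonite (artist)' into 'creator:dragonite'.

    Tags that already have a namespace, or whose suffix is not in the
    mapping, are returned unchanged (apart from trimming and lowercasing).
    """
    tag = tag.strip().lower()
    if ":" in tag:
        return tag
    match = _SUFFIX_RE.match(tag)
    if match is None:
        return tag
    namespace = SUFFIX_TO_NAMESPACE.get(match.group("suffix").strip())
    if namespace is None:
        return tag
    return f"{namespace}:{match.group('name')}"


for raw in ["dragonite (artist)", "Dragonite (Pokemon)", "creator:dragonite"]:
    print(raw, "->", move_suffix_to_namespace(raw))
```

Unknown suffixes are left alone on purpose, so nothing gets namespaced without someone deciding where it belongs.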

>>5911

So you get a picture of every character from Sailor Moon, and in addition to the long cast list you'd get:

>kingdom:animalia

>phylum:chordata

>class:mammalia

>order:carnivora

>family:felidae

>species:felis catus

>common name:domestic cat

>cat

>>5913

I can't think of a single time when I'd be looking for something knowing which comiket it came from, or even being aware that it came from that though.

>>5914

I thought it would be like "insect:moth".


93f5ad  No.5919

Honestly everyone should take a really close look at the current state of the PTR before suggesting anything that would over complicate things. Points of interest:

>Thousands of instances of all the tags being in a single line

>Hundreds of tags with underscores that aren't siblings with the proper tags yet

>So many people are unaware of the tag suggestions and end up making up yet another spelling for "black hair"

>Countless files with not a single namespaced tag

>Bad parses from hentaifoundry and other sites that then break the title into a separate tag for each word

>As-intended parses from hentaifoundry which puts the first and last names of a character as separate tags because the site doesn't support multi-word tags apparently

>Pictures of cats tagged with "cat ears" or "animal ears:cat"

>Some futa imageset making up the rather creepy namespace "character age:"

You guys are working on cleaning this up in the meantime, right?
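For the underscore point, a rough sketch of how candidate sibling pairs could be listed from a dump of tags. The function name and the idea of working on a plain set of strings are assumptions; the pairs would still need to be looked over by hand before petitioning anything.

```python
from typing import Iterable, List, Tuple


def underscore_sibling_candidates(tags: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (underscore_tag, spaced_tag) pairs worth reviewing.

    Underscores become spaces and runs of spaces are collapsed.
    Emoticon-style tags would get mangled by this, which is one
    reason the output is only a list of candidates.
    """
    pairs = []
    for tag in sorted(set(tags)):
        if "_" not in tag:
            continue
        spaced = " ".join(tag.replace("_", " ").split())
        if spaced and spaced != tag:
            pairs.append((tag, spaced))
    return pairs


print(underscore_sibling_candidates({"black_hair", "black hair", "cat", "bike_shorts"}))
```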


0a0d55  No.5920

>>5918

This can only work if we get some kind of conditional tags.

I only want this for real animals and dinosaurs, especially dinosaurs.

That probably will never be.

In that sailor moon example, I already think species:cat is too much.

I would have used

2 animals

cat

>>5918

>I can't think of a single time when I'd be looking for something knowing which comiket it came from, or even being aware that it came from that though.

Cosplay my friend, cosplay. And for a quick overview of the other material.

Besides tags are not only a way to search content, but also a way to hold information.

You can't have too many tags, as long as they hold correct information.


93f5ad  No.5921

>>5920

>You can't have too many tags, as long as they hold correct information.

True, I guess, I just keep running into files with nothing besides tags that serve no purpose other than to hold information and don't actually have information that could be used to search unless you know the exact full name of a porno or something.


0a0d55  No.5922

>>5919

I do; it kills me and frustrates me to see how negligent some are.

>As-intended parses from hentaifoundry which puts the first and last names of a character as separate tags because the site doesn't support multi-word tags apparently

I had that on some pixiv import.

Wasn't happy.

>Countless files with not a single namespaced tag

Either:

-it isn't the original

-the original was never tagged or preserved, and a resaved file was instead.

-you have the last copy on the net of this file. (or it is lost on an obscure site.)

If you have some lost frfr pic, plz send them here, I lack half of a specific set /joke off


93f5ad  No.5923

>>5921

To elaborate, no number of fancy namespaces will change the fact that someone decided to tag 4k files with nothing but "bike shorts" or 9k with nothing but the tag "frills".

>>5922

The ones without namespaced tags actually do have two namespaced tags in the end: "uploader:" and "category:". Most of these are photos of plastic models downloaded from myfigurecollection, for some reason. At least as far as I can figure.


7274bb  No.5928

>>5913

I would say Sourcing should not be a tag according to #1 in >>5881

>>5914

I would say kind:insect and type:moth, or animal:insect:moth if it supports tree-like tags

>>5919

Some formatting issues:

1. Should we use hyphen, underline or space for separation of words in a tag, and which other for separating different tags in a search?

2. Should we use () or [] to add extra information in a tag, and which other should we use for nested boolean algebra?

3. How do we index different types of tags? What are the specific definitions?

We definitely should fix the god damn spelling.

Also, metadata for whether it's a direct copy from the artist, watermarked by the hosting site, or a screencap should not have tags.

>>5923

>shitting on bike short fetish

>oh_come_on_now.png


0a0d55  No.5930

>>5928

Wouldn't metadata simply be tags sorted under the metadata tag group?

Do namespaces really eat that much space?

>I would say kind:insect and type:moth, or animal:insect:moth if it supports tree-like tags

The namespace doesn't provide any added usability or information over the non-namespaced tag.

It is at best a poor man's tag group.

It's the crux of the problem, isn't it.

>no riding a short bike while on a toilet

>2032

So many things to do, not enough time to do them.


93f5ad  No.5931

>>5928

Use a space to separate words, and right now we use () for extra information, though I can't say I have any experience with nested boolean algebra, so I'm working under the assumption that searches continue to function as they currently do, more or less at least.

The direct copy/watermarked/screencap thing I disagree on, it would be a help with searching for duplicates, or seeing if you have the shit version of a file or whatever. The problem with, say, watermark tags would be if they would be medium or meta. Same with other stuff like jpeg artifacts. Of course, neither of those tags are relevant when the completely original picture has either of those. It's disappointing to use saucenao to get to the original pixiv page, only to find that you've got the highest quality, the artist just compressed it really badly when saving it the first time.

And not shitting on any fetish, just seems silly to not even say who the character is.


0a0d55  No.5932

>>5931

>The problem with, say, watermark tags would be if they would be medium or meta.

Medium is a statement on the image composition, the format, or the artstyle.

background, watercolor and jpeg artifact go there.

meta is a comment on the file, often imperative: tagme, artist request, identification request, source request and the like.

Duplicate, inferior duplicate, and stripped metadata (the most common cause of duplication) go there.

Considering a watermark is a part of the image, even if disgustingly so, it should stay unnamespaced I think. It could be added as medium without raising too many eyebrows though.


93f5ad  No.5933

>>5932

>stripped metadata (the most common cause of duplication)

Shit, thanks, that's news to me. Might have a few more files with tags on the PTR if I grab (what still exists) from the source again, since before hydrus I tried a shit-ton of stuff including picasa, so my metadata is all over the place.


0a0d55  No.5935

File: c17f63068d92302⋯.png (227.25 KB, 645x1450, 129:290, c17f63068d923022cdd6ae33a4….png)

>>5933

I created the tag, but I didn't submit it yet, still 1000+ filenames to go through.

You are free to create a 'meta:stripped metadata' tag before deleting the files yourself, that'd be helpful.

To note, I do not add it if I can confirm the original files never had metadata themselves. (like all frfr artworks)

filename:tag are the ones giving me the most trouble really, I wonder if I should change them to title or leave them as the original filename when it's from an imageboard.

what do when it's a transparent from a manga titled 1.jpg?

page:1?

filename:1?

Delete the filename and forget, thus losing the ability to extract it back as it was to share?

case in point, this was posted as 01.png on 4chan/a/.


93f5ad  No.5936

>>5935

I'd leave the filename only if it's something you'd post in a filename thread, otherwise the information normally contained in it would be better off somewhere else. Like a pixiv filename would be better off tagged with the work number in the correct namespace. A booru filename of course I'd hope would already have the tags in it applied as normal tags, but, y'know. A deviantart filename contains both the artist name and the title, so that would be deleted and replaced with the 'creator:' and 'title:' tags.


8375b0  No.5939

>>5930

>>no riding a short bike while on a toilet

>Taking things out of context in CY+2

>>5931

So it is settled that it should be part of metadata. Got it.

>And not shitting on any fetish, just seems silly to not even say who the character is.

If someone wants a fetish specifically (e.g. latex or pilot suit) then they would want all the stuff they can muster.

>>5935

The filename tag should only be used when downloaded from deviantArt, FurAffinity, HentaiFoundry, Pixiv etc. where they do have a title for those images.

>>5936

…well said.


8375b0  No.5942

If we exclude language-specific spaces, these single-byte (ASCII/Latin-1) whitespace characters could be sanitized: \x09-\x0d, \x20, \x85 and \xa0

For the rest of Unicode these should also be cleaned: \u2000-\u200a, \u2028-\u2029, \u202f, \u205f-\u2060, \u3000 and \ufeff

Non-tags that could be reserved for boolean algebra: "and", "or", "xor", "not" (case insensitive for ease of use)

If words are not part of boolean algebra, these symbols could be reserved: "&", "|", "^" and "~"

Parentheses should be reserved for tag compatibility, while "[" and "]" could be used for nested boolean.

Python's own precedence is "not" > "and" > "or" for the keywords (it has no "xor" keyword) and "~" > "&" > "^" > "|" for the bitwise symbols, so the order "not" > "and" > "xor" > "or" could be accounted for.
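A rough sketch of how the above could fit together: collapse the listed space characters, reserve the four words, use "[" and "]" for nesting, and evaluate with "not" > "and" > "xor" > "or". All names here (clean_tag, tokenize, matches) are made up, and this is only a toy evaluator over one file's tags, not how hydrus searches.

```python
import re

# The space characters listed in the post, collapsed to a single space.
_SPACE_RE = re.compile(
    "[\x09-\x0d\x20\x85\xa0\u2000-\u200a\u2028\u2029\u202f\u205f\u2060\u3000\ufeff]+"
)
OPERATORS = {"not", "and", "xor", "or"}


def clean_tag(text):
    return _SPACE_RE.sub(" ", text).strip()


def tokenize(query):
    """Split into ('op', x) and ('tag', x); multi-word tags stay whole."""
    tokens, words = [], []
    for part in re.findall(r"\[|\]|[^\s\[\]]+", clean_tag(query).lower()):
        if part in ("[", "]") or part in OPERATORS:
            if words:
                tokens.append(("tag", " ".join(words)))
                words = []
            tokens.append(("op", part))
        else:
            words.append(part)
    if words:
        tokens.append(("tag", " ".join(words)))
    return tokens


def matches(query, file_tags):
    """Evaluate the query against a set of already-cleaned, lowercase tags."""
    tokens, pos = tokenize(query), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def parse_atom():
        kind, text = take()
        if (kind, text) == ("op", "["):
            value = parse_or()
            if take() != ("op", "]"):
                raise ValueError("missing ]")
            return value
        if kind != "tag":
            raise ValueError("expected a tag, got " + text)
        return text in file_tags

    def parse_not():
        if peek() == ("op", "not"):
            take()
            return not parse_not()
        return parse_atom()

    def parse_binary(op, parse_operand, combine):
        value = parse_operand()
        while peek() == ("op", op):
            take()
            value = combine(value, parse_operand())
        return value

    def parse_and():
        return parse_binary("and", parse_not, lambda a, b: a and b)

    def parse_xor():
        return parse_binary("xor", parse_and, lambda a, b: a != b)

    def parse_or():
        return parse_binary("or", parse_xor, lambda a, b: a or b)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token " + tokens[pos][1])
    return result


tags = {"cat", "black hair", "creator:dragonite"}
print(matches("[cat or dog] and black hair", tags))
```

Since parentheses are never treated as operators here, a tag like "dragonite (pokemon)" passes through as an ordinary tag, which is the compatibility point made above. The cost is that a tag literally named "and" or "or" cannot be searched.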


c3dcba  No.5944

Does anyone know of somewhere to put all this information? Maybe a wiki or Google doc would be best for a full documentation?


93f5ad  No.5945

>>5944

Yeah, we need a wiki, like Danbooru's tagging guidelines. Maybe a big flashing sign in the client itself saying not to commit shit to the PTR if you're not even looking over it, with a link to it, too.


0a0d55  No.5955

>>5945

I'm currently working on diode's nijie profile as a quick side project and came up with a tag musing.

I wonder if it would be possible, one day to present the authors like so.

Wiki:Diode

-urls

-list of aliases and their translation if needed

-type of artworks

-doujins/h-artist yes/no

taglist:

nijie profile:7228, ヂオデ, diode

With each of these elements being separately right-clickable and linked to the wiki "definition".

instead of:

creator:diode

creator:ヂオデ

nijie profile:7228


6653c2  No.5956

Added a page to the Hydrus wiki. I encourage everyone else to contribute sections and content.

http://hydrus.wikia.com/wiki/Tagging_Standard


0a0d55  No.5966

>>5956

> Art found on pixiv can be tagged with the creator's pixiv ID (useful if their name is difficult to translate or changes) in the format: creator:pixiv id 1234

And if I prefer the more efficient 'pixiv profile:1234' ?

Being able to right click the author number and paste it to another is super nice.


93f5ad  No.5970

>>5966

I thought we were using "pixiv id:1234" for that. And that would be separate from creator, but in the end you could search for that and see if the artist has a name already, by looking to see what those pictures are tagged with for the creator.


29238a  No.5980

>>5966

>>5970

Shall we go with pixiv id: as a namespace? If there is a particular use for being able to directly copy the number on its own then it makes sense.


93f5ad  No.5983

>>5980

It would be good for finding artists with more than one name, for one, and make it less likely that someone will add another creator:pixiv id #### tag.

That said, there are a bunch of "pixiv_id" namespaced tags, but downloading every file from the artist's page I've only gotten maybe one file to match up. The files just have "pixiv_id###", the artist's current display name, and "pixiv" as the tags.


0a0d55  No.5987

File: 2d10ca40dc94542⋯.jpg (708.85 KB, 1203x1004, 1203:1004, 2d10ca40dc94542801a748639b….jpg)

To begin with, when I started to use hydrus, 3 types of tagging of this kind were in use: 'creator:pixiv id #' from the boorus, 'pixiv id:#' confusingly inspired by saucenao, and 'pixiv illustration:#'.

1-lacks aesthetics, and is not so useful when copying from the namespaced tag (advice from the dev)

2-is confusing, since 'pixiv id:' refers to posted works on saucenao and predates hydrus.

3-is incorrect since it would imply all submitted works are under the illustration category; or worse, inform that it is an illustration, which should be its own tag and would be as useful as using the art tag for every image ever.

Pixiv artists love aliases, at least half of them cannot stick to a single name, which is why I need, and think everyone would benefit from, a tag based on their very account.

Naming the needed tag had been a bother for a while. I decided to use the very naming scheme pixiv uses.

That is the division between [profile]&[works]; [bookmarks]&[feeds] are not useful to us.

which gives us pixiv profile:# and pixiv work:#.

To these I add the status of the account/image IF it would qualify for a 'bad pixiv id'

Pixiv work:deleted/private

Pixiv profile:deleted

Furthermore, I also add whether the work was posted as a manga or an illustration, which may be seen as excessive information by most.
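As a sketch only, the naming scheme above could be generated like this. The function name and its parameters are hypothetical, and the status values are just the ones mentioned in this post.

```python
from typing import List, Optional

# Work statuses taken from the post; anything else is rejected.
WORK_STATUSES = ("deleted", "private")


def account_tags(
    site: str,
    profile_id: int,
    work_id: Optional[int] = None,
    profile_deleted: bool = False,
    work_status: Optional[str] = None,
) -> List[str]:
    """Build '<site> profile:#' / '<site> work:#' tags plus status tags."""
    tags = [f"{site} profile:{profile_id}"]
    if profile_deleted:
        tags.append(f"{site} profile:deleted")
    if work_id is not None:
        tags.append(f"{site} work:{work_id}")
    if work_status is not None:
        if work_status not in WORK_STATUSES:
            raise ValueError(f"unknown work status: {work_status}")
        tags.append(f"{site} work:{work_status}")
    return tags


# Placeholder numbers, not real pixiv IDs.
print(account_tags("pixiv", 1234, work_id=5678, work_status="deleted"))
```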


0a0d55  No.5988

>>5987

This can be reused almost verbatim for nijie and seiga, though I lack an account for the latter to confirm if the wording matches, or even deviant art, though I would use a slightly different wording to reflect the site's inner organisation.

I barely use it though…

[site] profile:#, (status)

[site work]:#, (category it was posted in), (status),

Would be my ideal presentation in a far hydrus future. It would be nice to have the option to present some tag groups in such a way if they share a namespace.

In such a system, it may even be a good idea to use pixiv 'profile:creator'. I still use creator:tag though, as there is no "inside" wiki to hold all the site profiles together under each creator.

For sorting purposes, as I already defended previously, pixiv work:# is useful; I'll refer you to my previous post.

https://8ch.net/hydrus/res/2231.html#q5789


93f5ad  No.5989

>>5987

>>5988

Finally found the conversation that was already had about this.

>>5665


93f5ad  No.5990

>>5988

>>5989

Also yeah, pixiv work:# is probably way better than having a separate namespace for manga work numbers, from illustration numbers, as I don't see a reason to distinguish between the two in that spot.


0fde48  No.5991

>>5990

>>5988

>>5989

>>5987

Better to have metadata than actual tags in the database.

Don't want to create unnecessary namespaces if it's heavily reused and it is certain that all images from the sites will have them.

t. OP


0a0d55  No.5993

>>5991

At the moment, it is not possible either to include this information as metadata or to sort by collection using any metadata, nor will it be a priority in the 3 months to come; the downloader will.

If metadata was an option we wouldn't bother discussing it really.

You may know an alternative that I don't to present pixiv work from a given artist's profile as a collection of work, if so please share. More knowledge is better.

Tags are tools.

1-tools to search

2-tools to inform

3-tools to sort

The sorting tags may eventually become metadata as the source:tag will, but this is at best in a year and it will cannibalize these tags (and thus may need them) to begin with.

Should title:tag and filename:tag eventually become metadata too?


8375b0  No.5995

>>5993

We will botch it until the dev fixes the problem properly and allow for metadata searching.

List of "metadata" that needs to be included in the near future:

>>5881


93f5ad  No.5999

>>5993

>>5995

Okay, wait, but realize, that different metadata in some cases will make the hash completely different. So any metadata that wouldn't completely fuck up the whole PTR concept would literally be just normal tags in the end, maybe just displayed somewhere else. So metadata isn't going to happen anytime soon, if at all. I can't think of a single reason to make another weird menu specifically for it, or anything that could be done that tags can't, or any benefit at all, period.


0a0d55  No.6001

>>5999

Altering the source in any way is blasphemy, so of course.

>>5999

> So any metadata that wouldn't completely fuck up the whole PTR concept would literally be just normal tags in the end, maybe just displayed somewhere else.

Exactly.

Which means we still need to create them.

In the end this will be solved by sorting, enabling or hiding whole tag groups, by the grace of his highness hydrus_dev.


93f5ad  No.6002

>>6001

So I figure the best place to start would be to standardize and clean up existing media and meta tags, maybe try to mass delete the filenames that have absolutely no information and/or can just be broken up into the separate tags. I was thinking, just glancing, that we could get rid of the extra wallpaper tags and just have the medium:wallpaper combined with medium:whateverratio, the only problem being that that's silly for practical usage, because I couldn't tell you any wallpaper ratio off the top of my head, and I have to stop and google it because I always feel I'm getting the numbers backwards. On the other hand, it's more practical for searching for a wallpaper to use because typically it doesn't look like shit to use a wallpaper that's too big but the same ratio.

Just as an example.

Also I need to update my list of namespaces that exist, since I just found a 'wallpaper:' one.

"wallpaper:woman"

Wow.


93f5ad  No.6003

>>6002

Also someone is using the 'medium:' namespace to tag kancolle cosplay. Fucking hate this shit.


8375b0  No.6005

>>5999

>>6001

What I mean is an out-of-file metadata set, not having the same GUI properties as other tags.

In a sense, it should be more static than dynamic, requiring more work to tamper with it.


93f5ad  No.6018

>>6005

That's great until you get the asshats that don't look to see if their tags even parsed at all before committing them, because it would then be harder to fix their fuck-ups.

So aside from artist-specific information, like their site, name, page that the file came from, what would be included as metadata? We already can search by file type, length of video (or if it's simply animated or not), height, width, ratio, number of pixels, file size, and number of words. Rating (as in sfw or nsfw)? Source/series/character/artist requests?


8375b0  No.6020

>>6018

We expect the artists information from direct in-hydrus gallery-rips to be 100% accurate, since IDs are immutable unless the artists delete their account and start anew.


93f5ad  No.6027

>>6020

>unless the artists delete their account and start anew

I've got at least one pixiv artist in my files that has had two prior accounts, a few more that have had one prior one, at least two that have two separate active accounts, and a shit-ton who deleted and may or may not have a new account. Does Deviantart have hidden account numbers? Since the other year they added the ability to change your username, luckily those links redirect, but plenty of people made new accounts several times for the sake of changing their name before that. And don't get me started on the string of name changes and account deletions I've followed for some artists on tumblr.


8375b0  No.6029

>>6027

Redirects. deviantArt does redirects by pinning names to static IDs.


4b8caa  No.6150

Just find

<div class='post-body entry-content' id='{ID}' itemprop='description articleBody'>
in the individual blog pages



