
/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.


New user? Start here ---> http://hydrusnetwork.github.io/hydrus/

Current to-do list has: 714 items

Current big job: finishing off duplicate search/filtering workflow


File: 1468844329300-0.png (50.75 KB, 478x478, 1:1, emoji.png)

File: 1468844329300-1.jpg (5.42 MB, 2953x4169, 2953:4169, 1456351071309.jpg)

d11e99 No.3197

well OP i wanted to post this because i really like your project but i don't think it gets much coverage or assistance from the imageboard community. i myself have run the app on ubuntu linux less than a dozen times across two builds. it was able to tag 200 of 10,000 images (2%) at the time. perhaps the content was too much /b/ and not /a/, but i was slightly annoyed since there was much weeb content as well. the GUI is vague and hard to operate, very unresponsive at times. and the i/o and cpu usage is outrageous.

in the end i stopped using the program altogether and would just manually pick a file out of the messy number/letter tree in the db_files directory, just shuffle around until i find something that looks ok to post for w/e, but this was also annoying. i didn't want to delete the db and files because i thought i might be able to increase the tags or something. the program is fairly complex, even if you run it all day (or two) you don't know what it's done at the end. not to mention each build requires replacing the whole app and rebuilding the local database, there is no streamlined updating mechanism inside the program.

i know there is documentation on these issues but it's mostly a diy fix. if anything the software should have a warning for traditional hdd owners: please use on SSDs only. this might help for developers, if there were any.. here are some of the metrics about the project, listed below with notes.

> Active users 10 ( https://8ch.net/boards.html , search: hydrus )

> https://github.com/hydrusnetwork/hydrus/graphs/contributors (only one other 'coder' with only 2 commits)

> https://github.com/hydrusnetwork/hydrus/network/members (11 members, many likely post on this board)

> patreon.com/hydrus_dev , five donors, total less than 20 shrekels avg/month. obviously has potential but is still in beta, IPO when?

> https://github.com/hydrusnetwork/hydrus/blob/master/license.txt

> https://en.wikipedia.org/wiki/WTFPL#Discussion ( joke license, good luck getting /g/nu-tards interested. )

as you can see there is much opportunity for growth and involvement. i have seen many projects gain attention on 4cuck imageboard and on 8ch as well, often these projects have a large following. as you can see below there is continued interest in your project, people are talking about it, they are very curious and hopeful. please look at the threads below to see what people are saying about the project and similar efforts.

> https://archive.rebeccablacktech.com/g/thread/55452530 (most discussion)

> https://archive.rebeccablacktech.com/g/thread/55309793 (2nd most)

> https://archive.rebeccablacktech.com/g/thread/55058483 (mention)

> https://www.google.com/search?q=site%3A4chan.org+hydrus (search of recent 4chan mentions, 26 results, with a few false mentions, so basically 20 proper mentions. 3 listed above. )

> https://webcache.googleusercontent.com/search?q=cache:__Syl8zd8PsJ:https://8ch.net/tech/res/597841.html+&cd=1&hl=en&ct=clnk&gl=us ( one thread on 8ch which wasn't on this board )

d11e99 No.3198

as you can see in these communities there is much support for sharing the project but it hasn't gained a foothold in the community as some other projects have. as you can see below many projects have had support from the imageboard culture, some more successful than others ofc:

> https://wiki.installgentoo.com/index.php//g/#Contributions

> https://wiki.installgentoo.com/index.php//tech/#Contributions

imo to recruit developers and to generate more interest, more user data, and feedback you could make a habit of posting your weekly releases in the daily programming threads on 4chan's /g/. there is literally one every day or two, so it should be a steady platform to share the project on and it will hopefully get the attention of other 'app' developers.

> https://wiki.installgentoo.com/index.php/Daily_programming_thread

> https://boards.4chan.org/g/thread/55623304 (here's the most recent, as of posting this)

one could also consider setting up an IRC room on any network that you prefer, rizon is the standard for quad/octa-channers usually. such a room would allow for quicker communications and discussion but would only be needed if the project grew to a certain size.

surely since the project is open source and well documented the concepts at play can be shared and improved upon by more developer involvement? i know you have a great attitude working on it when you post here. i don't mean to dampen the mood but for a project with such a grand goal certainly you can see how the information above can look.

even the name hydrus network implies a distributed approach but this isn't the case, the database of tags is on a central server (that is throttled and overworked), users may create tags and upload them (somehow) but the program is very centralized. the workload of the computing is not shared, nor is hosting the database of tags/hashes. perhaps i misunderstand but it seems like an inefficient design and a central point of failure? do we have any metrics on public repos, is there only one?

well that's enough ranting for now, please take this all with a grain of salt, from an early adopter…


d11e99 No.3199

quick note for accuracy, i fired up v162 from an old windows build from 2 yrs ago with client_files and everything. checked my tagged objects, 967 tagged files out of 13,431 files in the db, so 7% not 2%.. still not what i was expecting..

the old build is updating content from the 'public tag repo' and it added 164 files from the 'public file repo'. i would like to know how to change the build without having to remake the database by re-importing the client_files folders as raw images. i don't remember the version i was using on ubuntu but i could boot over and check.

the hydrus data itself was on another hdd from the windows OS (v7) for the build i'm testing now on this current windows setup (v8.1), it seems to be more usable than when on ubuntu in the home directory on the same hdd, it was very hard to use there. i would have kept it on the other hdd but i thought it might be faster since the hdd that linux was on was brand new, lol as if that makes a difference. ofc as i said in the first post, it might be best to run the app on an SSD but i don't have one of those to play with. my best bet would be some kind of ramdisk on windows since i have 16gb ram but windows seems fine with 4gb idling. ofc idk how that could be implemented since hydrus likes to make 5-10 gb databases, almost as bad as thunderbird. wew lad.


1214c1 No.3205

File: 1468946747191.jpg (378.12 KB, 1506x1100, 753:550, ed0ebdec6fc0f8330bce5a0fb7….jpg)

I appreciate your feedback, thank you. I'm sorry that hydrus hasn't worked out great for you.

You can upgrade quickly by extracting a release on top of a previous install. When the new client boots, it will see the old database at install_dir/db, notice it is an old version, and make any updates it has to. You don't have to rebuild from scratch every time. I've written more about this here:

http://hydrusnetwork.github.io/hydrus/help/getting_started_installing.html
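
For readers wondering what "notice it is an old version, and make any updates it has to" can look like, here is a minimal sketch of a stepwise database update using Python's sqlite3 module. It is an illustration only, not the client's actual code: the `version` table, the `MIGRATIONS` steps, and `CURRENT_VERSION` are all hypothetical names invented for the example.

```python
import sqlite3

CURRENT_VERSION = 3  # hypothetical version number of the newly extracted software

# Hypothetical update steps: entry N upgrades a database from version N to N + 1.
MIGRATIONS = {
    1: "ALTER TABLE files ADD COLUMN size INTEGER",
    2: "CREATE TABLE IF NOT EXISTS tags (hash TEXT, tag TEXT)",
}

def update_db(conn):
    """Bring an existing database up to CURRENT_VERSION one step at a time."""
    (version,) = conn.execute("SELECT version FROM version").fetchone()
    while version < CURRENT_VERSION:
        conn.execute(MIGRATIONS[version])
        version += 1
        # Record progress after every step so an interrupted update can resume.
        conn.execute("UPDATE version SET version = ?", (version,))
        conn.commit()
    return version

# Simulate an old install: a version 1 database left behind in the db folder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE version (version INTEGER)")
conn.execute("INSERT INTO version VALUES (1)")
conn.execute("CREATE TABLE files (hash TEXT)")

print(update_db(conn))
```

Because each step only knows how to go from one version to the next, a database that is many versions behind has to walk through every step in between, which is one plausible reason a very old database might need an intermediate release first.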

As for your thoughts on growth of the project, hydrus is certainly not huge, but I am nonetheless pleased and content with the current direction, and it isn't in my roadmap to go for lots of users or lots of money. I'm amazed that anyone wants to use it, and when people say they love it, it easily gives me enough of a boost to be satisfied with my work. I prefer for the growth to be driven by word of mouth, and I'd rather deliver regular honest code to enthusiastic users than desperately try to keep up with overly ambitious Patreon promises.

Although things are held together by duct tape in many places, and the user interface is often ugly and debug-tier, I think these things are improving over time. I've been pleased by the recent window sizing system, for instance, and the new disk cache has eliminated a great deal of the lag when running off an hdd rather than an ssd. Furthermore, I really prefer the independence and drama-free nature of working on my own. I've never been an open source/github or generally social person, and am not interested in working on this in a team.

So, I thank you for the suggestion, but I'm content to keep my current schedule. I apologise if that is disappointing. I really enjoy working on hydrus, and I don't want that to change.


018633 No.3212

I'm just a user but I felt like I'd give my input on this, feel free to ignore it.

>Active users 10

>based on search results

Just because people don't talk about something does not mean they're not using it. The absence of evidence is not evidence of absence. I've used hydrus for a while now and have only mentioned it to a few people in private who still use it today; they alone constitute more than 10 people.

I've also seen some users posting here with giant collections which has some merit, even if there were not many users the ones that exist are using the project extensively, which is pretty impressive to see.

>on the PTR

The public tag repository is built by the people who use it, it's not an AI like >>1553, people need to have tagged the content and shared it to the public repo before it will tag images for you, feel free to contribute your tags and help out, especially in areas that are lacking. If you're going to tag your images anyway you may as well share those tags, as you help the PTR the PTR may help you.
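
To make the "no AI" point concrete, here is a rough sketch of tag lookup by file hash. It assumes files are identified by a SHA-256 of their bytes and that the shared repository is just a mapping from hash to tags; both are simplifying assumptions for illustration, and `shared_tags`, `file_hash`, and `lookup` are hypothetical names.

```python
import hashlib

def file_hash(data: bytes) -> str:
    """Identify a file purely by the SHA-256 of its bytes (assumed scheme)."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical shared repository: tags exist only for files somebody already tagged.
shared_tags = {
    file_hash(b"picture someone already tagged"): {"character:hatsune miku", "blue hair"},
}

def lookup(data: bytes) -> set:
    """Return the shared tags for this exact file, or an empty set if nobody tagged it."""
    return shared_tags.get(file_hash(data), set())

print(lookup(b"picture someone already tagged"))
print(lookup(b"picture someone already tagged, but resized"))
```

Under this scheme a resized or re-encoded copy has different bytes, so it hashes differently and comes back with no tags, even though it looks like the same picture to a person.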

>on being heavily centralized

While hydrus is mostly client-server it's not dependent on 1 provider, you can host and/or use public and/or private services for files, tags, and more. There's also ipfs support in the client, you could extend hydrus and make some kind of replacement for the tag and file repos making it actually decentralized. I think the project owner plans to do something like this eventually but that's quite a task to do and get right. As an aside, I think for an unfinished project the current systems in place work well and the rapid development is stellar, especially when considering how open to the community the developer is.

>on guerilla marketing

That's shady and could give a bad image to the project and its users. If people want to use software like this they should be able to find it, there's a lot of presence online for search engines to pick up on. In my own anecdotal experience every time someone brings up the topic of "how do I handle large collections of files without a hierarchy" someone will have already mentioned hydrus before I even can.

More importantly more users does not inherently mean more contributions. I personally haven't had the time to make any contributions to the project source, only to the PTR and with information on this board; I would if I could.

>>3198

>surely since the project is open source and well documented the concepts at play can be shared and improved upon by more developer involvement?

I don't think hydrus has or would turn down patches, nothing is preventing this from happening now so if you have a patch you should probably submit it.


d11e99 No.3215

File: 1469027426791.jpg (624.34 KB, 1920x1200, 8:5, 1435862821956-1.jpg)

>>3205

thanks for replying. i really wanted to write a proper reply yesterday but i got busy irl, now i'm too sleepy but i will write back this evening. will try to install today's latest build on windows before posting.

noticed plenty of talk on /g/ today too: > https://boards.4chan.org/g/thread/55663442

>>3212

thanks as well, will write a proper response tonight 4 u. ;^)


d11e99 No.3282

File: 1469526033832-0.jpg (2.56 MB, 1910x1349, 1910:1349, b5a62e94bfde61d2af7a94422b….jpg)

File: 1469526033833-1.png (20.74 KB, 879x479, 879:479, repo.PNG)

>>3215

>>3205

OP here again. here's an update on my progress using this 'app';

>moved old db folder away from hydrus folder

>installed v215 over old version, on separate hdd from windows, using wizard. ran fine on empty.

>move new, 'empty' db folder up, replaced with old db folder, ran test.exe and client.exe

>program presented error, saying the old db was too old and to try v207 first.

>move old db folder up again, remove hydrus folder, replace with v207 from wizard (old thread on 8ch)

>move down db folder and replace 'empty' db, run test.exe and client.exe

>program (v207) runs splash, claims to update database for a while, then presents python ui error of some kind, try relaunching program, same python error but no status about updating database anymore.

>give up

>move old db folder up, clear everything inside except client_files, defrag drive.

>install v215 fresh, launch it, tell it to import files from client_files (as i've done before as mentioned in thread)

>it's done importing (and deleting old copies, which actually moves to windows trash, very annoying to have to purge it often b/c windows is on another hdd and it's thousands of files), setup repos using help menu.

>download public files very quickly, public tags want to download only on close (maintenance script), try this a few times during the day with little success.

>next day, notice the 'force idle mode' in debug, try that, tags seem to DL fine, but 'review services' shows incomplete download of repo ofc.

>happen to need to place an order on amazon for some junk for someone, order an ssd for cheap for myself (i need to reinstall ubuntu since v16.10 broke AMD drivers, might as well put v14.04 on an ssd for boot or w/e, wew lad).

>ssd arrives today, it's only 60gb (says 64), partition drive on ubuntu for half ext4, half ntfs, boot into windows 8.1, mounts fine.

>move hydrus folder to ssd (forget to defrag hdd first), takes a half an hour to move folder (~10gb of data, database is huge, client files is 6gb but that has thumbs in it, so probly only 3gb of actual content). edit windows shortcuts to route to new drive.

>defrag only ssd (lolwut), done in less than a min, fastest defrag ever, try 'optimize ssd' option in asslogics defragger, wew lad, takes 30 minutes.

>finally launch hydrus app, balance files (instantly), close for maintenance, runs public repo downloads and such doing 15-25000 writes/second, launch app a few more times try idle mode, etc, no changes, check services, repo is done, pic related.


d11e99 No.3283

File: 1469526966898.jpg (1.36 MB, 1500x1063, 1500:1063, 110599fe52e616e988a0488850….jpg)

>>3282

>tag stats are twice better than before upgrading app and full repo download.

>system:everything [13,430 files]

>system:number of tags > 0 [1,344 files]

>system:number of tags = 0 [12,086 files]
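
For reference, the coverage arithmetic behind these counts, as a short sketch. The numbers are the ones quoted in this thread (967 of 13,431 on v162, 1,344 of 13,430 after the v215 install and full repo download); `coverage` is just a helper name made up for the example.

```python
def coverage(tagged: int, total: int) -> float:
    """Percentage of files that have at least one tag, to one decimal place."""
    return round(100 * tagged / total, 1)

before = coverage(967, 13431)    # v162 count reported earlier in the thread
after = coverage(1344, 13430)    # v215 count from the searches above

print(before)                    # 7.2
print(after)                     # 10.0
print(round(1344 / 967, 2))      # 1.39, ratio of tagged files after vs before
```

So the tagged share went from about 7% to about 10% of the collection, roughly 1.4 times as many tagged files as on v162.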

while much of the untagged content is silly reaction faces which aren't really art, there are still hundreds/thousands of high-pics and anime stuff. i don't expect such a small project to have original content from my machine tagged, but the tagged and untagged output are a very similar-looking mix of content that was originally downloaded from community sources (imageboards, ftp, dumps, etc).

the program is 'usable' at this point but the GUI still takes many many seconds to load content when a tag is requested, even with the ssd backend, though it's noticeably faster than when i was using it on the hdd, that would take minutes. this said even though searching content can be sluggish, once a selection is ready the gallery mode and scrolling the thumbnails work fine. the app is still prone to become unresponsive when doing some functions, even ones without warning about gui stalling.

with the build coming out tomorrow i will have a chance to mess up this version or w/e using the wizard or 7zip. so that's good i guess. ideally i would like to increase the tag repos as much as possible or practice a way of doing a blitz tag with autocomplete typing manually, much of the content is similar to images that have been tagged. otherwise i would like to export/move those untagged images out of the program into a root folder on the SSD so i can try to manage them by hand/deprecated picasa app and slowly add/tag them in hydrus as needed.

this whole exercise has reminded me how much junk chantard images i have on the machine but tbh i downloaded large amounts from private ftp servers and such so i could test the tagging system in hydrus, likely only had a couple thousand images in my own collection, but that was years ago. the same applies to much data on hdds and such but i don't make efforts to organize it more than removing large files and creating backups, etc. it's all very tiresome to deal with 'big data'.

could the UI be improved if done as local booru-style web UI and a local webserver backend and javascript frontend? surely the browser would save work rendering images and caching? are there any other projects you know of that are like that? i know it's easy to speculate but surely the web browser can function to assist, it has full screen functions and multi-instances, thinking like how IPFS does its local server UI, regardless of the data gateway ofc.

tl;dr how to increase tag repos, slim down database/export images not tagged in bulk, and generally improve performance.


d11e99 No.3284

File: 1469530996764-0.jpg (1.52 MB, 1639x1156, 1639:1156, dff12e78a8a8c7d16dd684aa18….jpg)

File: 1469530996764-1.png (30.99 KB, 725x768, 725:768, 725px-1percentrule.svg.png)

>>3212

oh also wanted to reply to your points to clarify.

>active users

simply taking note of the 8ch traffic at the time, only used the meter to speak to the amount of activity on the board, did not mean to imply i thought only that many people used the app. today it has 17 for example. 70% growth!

>public tag repo

i am aware there's no AI magic to it, it's based on the images that people have tagged and shared, therefore like any MLM scheme it follows that the more people use the program and contribute tags the better the tagging system will get because of a larger sample. ofc there may be bias in who uses it and therefore what type of content they tagged but again, the more diverse the users the less chance of that kind of bias manifesting in content tagging. you put it well in your last line, personally i haven't tagged anything since until today the program was harder to operate.

>centralization

if there are any other repos being hosted online, how can they be found/shared? perhaps a wiki could be setup on the github for these kind of notes, otherwise general threads would do on this board. as for ipfs, i've used the software before and was very impressed but it is also cryptic and hard at times, even after watching the devs talks on YT, mostly the troubles manifest when trying to use it backwards for http(s) hosting since there is no browser extensions (though talk of them), so it's not as easy for nerds like mega or w/e. you can see posts about it here: >>>/tech/499795

>'guerilla marketing'

it's not shady in the least if done tastefully and interactively. there is mention of the project in the ipfs thread linked above ( >>>/tech/500911 ) , and also you can see in my OP the search engines show discussions.

>If people want to use software like this they should be able to find it, there's a lot of presence online for search engines to pick up on.

that's ridiculous and you should know better, the only people finding this project are chan users via word of mouth (aka shill/shitposting). as you said it is mentioned when questions are asked in certain communities but this is not a given, it only works when a hydrus user is present to make such a suggestion and chooses to. does the word 'viral' mean anything m8?

the dev posting in a 'programming general' once a week when there is a new build is subtle enough to keep a presence in that arena but not enough to be considered spam. if anything it could have its own general thread if the posting in the other general thread became popular. thinking long term ofc.

you do make the point that this board itself is already very interactive but one day/session a week for community outreach isn't so demanding compared to monitoring this board. a weekly hydrus general on 4ch's /g/ is not much to ask in the long run, even if only for a few hours before it 404s. much of the obscurity comes from the 4ch/8ch divide, which is silly imo.

>developers / patches

>More importantly more users does not inherently mean more contributions.

this is true but the chances of a user being interested in the source code is tangible. it's not linear but it's probabilistic, esp if people are invited to help. but as you can see in >>3205 the dev does not currently require assistance. i only think in this direction because of the nature of the content in the database and positioning of the project/meme inside the imageboard community as a whole.

i don't have magical faith in FOSS, i've only seen projects grow and sustain in the past when there is efficient communication and motivation from small groups of users/developers working together, it usually follows the '1% rule' (pic related, see wikipedia). this ratio shows that of any group of people who are potentially able to assist in a project only 10% will be motivated to do so.

therefore increasing the general userbase would increase the chances of contributions, at least in principle. ofc that only works when barriers to entry are low, but in the FOSS world that is a given, those who can code will code, when motivated/inspired. as you say in your line.

>>3205

@ dev, perhaps gather telemetry/metrics on how many requests are made to public repo tag/file server in X time period, how many instances in the wild, version counts, etc (within reason ofc, opt-in on next update, etc). the patreon/IPO comment was a joke, meant to compliment the fact you get any donations at all, it's a good sign.

tl;dr organic growth is easier when the seeds are watered and spread in the wind.


1214c1 No.3436

File: 1470607803053.jpg (68.21 KB, 740x1100, 37:55, 76da10e2020b1a1603891ca7c5….jpg)

>>3282

>>3283

>>3284

Thanks for this follow-up. I'm sorry I could only get around to it now.

I'm also sorry you had a bad experience messing around with db versions and slow hdd load times.

I think you make good points. I'm not 100% confident on what the project actually is, nor where it is going, so I will keep your thoughts in mind. Like you said about realising how many junk files you have, I think we don't yet have a good understanding of what 4.4 million file hashes means. I'm increasingly of the thought that direct file/tag mappings are going to be infeasible surprisingly quickly. I never thought we'd be able to have comprehensive tag coverage of all Anonymous images, but I am also surprised at how many files there truly are out there, and how many resizes and so on. I expect to slowly move the program towards a more decentralised network and more automatic tagging systems, with clients sharing tag descriptions (this shape might be "ribbon"-type metadata) rather than direct tag mappings.

I've also wondered, lately, if we'll soonish see something of a collapse of some sectors of new content generation. There's so much stuff already digitised and easily accessible, and people's backlogs/to-watch-lists/hydrus-inboxes are all growing faster than they can be cleared out, I figure non-topical media deflation can only be around the corner. If the rate of new manga being made reduces one day because there is already so much good old manga to read, does that change hydrus stuff? I've been going over some of my older hydrus client's files recently, and I can't remember most of them. A lot of them have the same value to me as new files, except they were already on my drive. I wonder if good search and management of existing collections will be more important than good gallery downloaders in future.

I don't know what to do about the bits of ugly ui and delays on high CPU time, at least beyond slowly improving things. The project was originally going to be browser-based, funnily enough, but javascript couldn't do all that I wanted. I think I'd like to continue having hydrus as a wx application, if for no other reason than it would be a tremendous effort to rewrite for a new graphics library. That said, writing an http api for the client remains a long-term goal so someone can write an external searcher/viewer for web or android or whatever.
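
As a rough illustration of the kind of http api mentioned above, here is a minimal sketch of a local search endpoint built on Python's standard library. Nothing in it reflects an actual hydrus interface: the `/search` path, the `tag` query parameter, port 8080, and the in-memory `FILES` index are all hypothetical choices made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical in-memory index; a real client would query its database instead.
FILES = {
    "a1b2": {"miku", "blue hair"},
    "c3d4": {"miku", "landscape"},
    "e5f6": {"reaction face"},
}

def search(tags):
    """Return the hashes of files that carry every requested tag."""
    wanted = set(tags)
    return sorted(h for h, file_tags in FILES.items() if wanted <= file_tags)

class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/search":
            self.send_error(404)
            return
        # Repeated parameters (?tag=a&tag=b) are treated as "must have all of these".
        tags = parse_qs(url.query).get("tag", [])
        body = json.dumps({"hashes": search(tags)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8080):
    """Serve on localhost only so nothing outside this machine can query it."""
    HTTPServer(("127.0.0.1", port), SearchHandler).serve_forever()
```

With `serve()` running, an external viewer could request `http://127.0.0.1:8080/search?tag=miku&tag=landscape` and get back a JSON list of matching hashes, leaving all rendering to the browser or app on the other end.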

I'm not sure about gathering any sorts of metrics–it just sets off privacy alarm bells in my head. Better I think to have completely private clients and let users tell me their info through other channels if I need it. Maybe I could throw up some polls just for fun, although the main bottleneck in my dev process is finding time and energy, not that I don't know what I want to do.

Anyway, I can't promise anything will be very polished any time soon, but I hope every new version will generally be a little better than the one before. I'm still trying to figure things out on my end.



