
/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.


New user? Start here ---> http://hydrusnetwork.github.io/hydrus/

Current to-do list has: 714 items

Current big job: finishing off duplicate search/filtering workflow



9b613f No.4726

8chan/cloudflare was having posting problems when I made this. Full post follows in parts:


698417 No.4727


9b613f No.4728


9b613f No.4729

I had a tough week, but I got a lot done. I did a bunch of bug fixes and other background work, and I overhauled the manage subscriptions dialog.

https network upgrade

tl;dr: Hydrus network is getting more secure, you don't have to do anything except update today or in the next few weeks.

A user and I were talking about the network when it came up that hydrus is still running on http, and hence not yet encrypted. Moving to a fully encrypted protocol has always been an important long-term goal, but absent a complete overhaul of the network, there aren't any really excellent solutions. The current network code is a mess of overlapping iterations, so adding robust encryption got lost in the mix of everything else I wanted to clean up first.

The user mentioned it was nonetheless very important for them to be on some sort of encrypted communication, and after discussing it more, I agreed to jump on this and get something reasonable out rather than waiting for a 'perfect' solution.

So, I will be moving the current http system to self-signed https. This is not very beautiful, but it is decentralised and easy and free to implement and will completely obscure the contents of all hydrus network traffic to any typical outside observers. I can also improve on it in future.
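
As a rough illustration of what 'self-signed https' means on the client side, here is a minimal sketch using Python's standard ssl module. It is not hydrus's actual network code, and the helper name is made up for this example. A self-signed certificate has no certificate authority behind it, so the client has to skip the usual chain and hostname checks: the traffic is encrypted against passive observers, but the server's identity is not verified.

```python
import ssl


def make_self_signed_client_context():
    """Build a TLS context that encrypts traffic but accepts any certificate.

    Hypothetical helper for illustration only, not hydrus code.
    """
    context = ssl.create_default_context()
    # check_hostname must be switched off before verify_mode can be CERT_NONE
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


context = make_self_signed_client_context()
```

A context like this could then be handed to something like http.client.HTTPSConnection( host, port, context = context ).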

Unfortunately, because of the way hydrus works, I can't simply run http and https in parallel, like a website might, so upgrading is slightly more complicated. I have a working prototype, but I do not want to flip the switch today, without any warning, because it will break anyone who does not update immediately. The change is significant enough that an old client cannot even detect what is wrong–so it won't give a 'you need to update to talk to this server' error, it'll give a deeper and uglier network-level exception.

So, this week I have updated only the client. When talking to a hydrus server, it will attempt http and then upgrade to https if required. In three weeks, on v239, I will update the server code to be https only and update my PTR at the same time, and then three weeks after that, on v242, I will make the client only ever try https. This means that anyone who updates their client within the next three weeks will see no service disruption from any server that updates in the subsequent three weeks.
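
The 'attempt http and then upgrade to https if required' behaviour can be sketched roughly as below. This is an assumption-laden illustration rather than the real client code: fetch_with_escalation, the fetch callable, and the fake server are all invented names, and the port is arbitrary.

```python
def fetch_with_escalation(host, port, path, fetch):
    """Try plain http first, then retry the same request over https.

    `fetch` is a hypothetical callable that takes a full url and either
    returns the response body or raises OSError when the connection fails.
    """
    last_error = None
    for scheme in ('http', 'https'):
        url = '{}://{}:{}{}'.format(scheme, host, port, path)
        try:
            return scheme, fetch(url)
        except OSError as e:
            last_error = e
    # both attempts failed, so surface the https error
    raise last_error


def fake_https_only_server(url):
    # stand-in for a server that has already moved to https only
    if url.startswith('http://'):
        raise ConnectionError('server no longer speaks plain http')
    return 'ok'


scheme, body = fetch_with_escalation('example.invalid', 12345, '/', fake_https_only_server)
```

Against a server that still speaks http, the first attempt succeeds and https is never tried, which is why an updated client should keep working on both sides of the server switch.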

I hope this update window ends up working out. I can appreciate how important encryption is to some people, even for the relatively stale information going across the hydrus network–I apologise, I should have put work into this sooner.


9b613f No.4730

manage subscriptions overhaul

More immediately, I have overhauled the manage subscriptions dialog as I have discussed with many users. I thought this would be too large a job to fit into my normal weekly cycle, but it came up in my list again and I thought it through and figured it might not be so difficult with the other recent window updates. I scheduled a bunch of extra hours to work on it and just got it done.

So, manage subs now works on the new sizing system, and it looks more like the new manage scripts dialog. All your subscriptions will be summarised in a big list, and you edit their fine details on a separate window. You can also do some mass-editing–selecting multiple subscriptions and telling them all to pause or resume, or retry all their failed files (which are listed in the summary), for instance.

You can now also import and export subscriptions! You can put them on your clipboard or dump them to png files, just like scripts.

Furthermore, you can now export or import multiple subscriptions or scripts at once (i.e. to/from the same single clipboard or png)! If you want to backup all your subs, or transfer them to a different client, it is now easy and takes no time at all.

If you have a lot of subs, try exporting them all to a single png and then looking at the several hundred KB of encoded mess below the descriptive text–it's pretty cool!

middle-click to open new page

This is only a small thing, but it is neat: Middle-clicking on the greyspace beside your main gui's notebook tabs will now pop up the page chooser dialog.


9b613f No.4731

future dupe plan

I had a long think about a workflow for 'hey hydrus client, show me whatever dupes I have so I can figure out what to do with them', and I am feeling good about getting started on it. I have to do some db stuff to find dupe pairs (much easier, now we have the faster search!), and create some gui stuff to actually display them and let you decide what to do, and then I need to create some new dupe metadata.

For this first version, I will present duplicate pairs for judgment and let you filter them into three broad classes:

- they are not a dupe

- they are the same image (but different resolution, quality, or one has a watermark, say)

- they have a family relationship (alternate colours/versions/costumes, clean/messy, and so on)

I will write further gui to deal with 'they are the same image' pairs, to let you delete one if you want and merge tags and ratings across either way.

Family relationship is a much more complicated problem, however, and deserves to be its own thing on the next 'big thing to work on next' poll. Family relationships often form complicated groups with sort order and hierarchies, and dealing with that is more work than I can tuck into the 'faster dupe search' job. I can however record the family relationship pairs you discover and store them in the db until this next round comes up.
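
To make the 'record the pairs and store them in the db' idea concrete, here is one possible shape for it, sketched with Python's sqlite3 module. The table name, column names, and status values are all assumptions made for this example; the real schema may look nothing like this.

```python
import enum
import sqlite3


class PairStatus(enum.IntEnum):
    # the three broad classes from the post; the numbers are arbitrary
    NOT_DUPLICATE = 0
    SAME_IMAGE = 1
    FAMILY = 2


def record_pair(db, hash_id_a, hash_id_b, status):
    # store each pair once, smaller id first, so (a, b) and (b, a) collide
    smaller, larger = sorted((hash_id_a, hash_id_b))
    db.execute(
        'INSERT OR REPLACE INTO duplicate_pairs VALUES ( ?, ?, ? );',
        (smaller, larger, int(status))
    )


db = sqlite3.connect(':memory:')
db.execute(
    'CREATE TABLE duplicate_pairs ( '
    'smaller_hash_id INTEGER, larger_hash_id INTEGER, status INTEGER, '
    'PRIMARY KEY ( smaller_hash_id, larger_hash_id ) );'
)

record_pair(db, 17, 5, PairStatus.FAMILY)
record_pair(db, 5, 17, PairStatus.FAMILY)  # same pair, other order

rows = db.execute('SELECT * FROM duplicate_pairs;').fetchall()
```

Storing only flat pairs like this keeps the first version simple and leaves group order and hierarchy for the later family relationship work.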


9b613f No.4732

full list

- in prep for a network https upgrade, the client can now detect and escalate to https when making connections to hydrus services

- import/export to png and clipboard now supports multiple objects at once!

- rewrote the manage subscriptions dialog to work on the new panel system

- the new manage subscriptions dialog has a listctrl and a sub edit dialog

- the new manage subscriptions dialog has the same add/export/import/dupe/edit/delete buttons as the manage scripts dialog

- subscriptions are now importable/exportable, including en masse with the new multiple object import/export support!

- the new manage subscriptions dialog has retry failed/pause-resume/check now/reset buttons for easy mass subs management

- the edit subscription panel has a bit of a layout makeover

- the edit subscription panel now updates itself as its buttons are hit

- the edit subscription panel disables buttons that are not applicable

- subscriptions can now be renamed!

- cleaned some misc subscription code

- relabelled initial and periodic file limit in the subscription edit panel

- middle-clicking on the main gui's greyspace (e.g. to the right of the notebook tabs) will spawn the new page chooser!

- created a simple HydrusRatingArchive class–will do more with it in future

- added ffmpeg, python, and sqlite versions to the help->about window

- harmonised daemon code

- added a new class of daemon that will not fire while a session load is occurring

- subscriptions, import and export folders, and file repo downloads now use this new daemon

- cleaned the way background daemons check for idle

- expand/collapse panels now notify the new kind of toplevelwindow that a resize may be needed when they switch state

- time deltas (like on subs edit panel or a thread watcher) now render more concisely ('7 days' instead of '7 days 0 hours')

- serialisable object png export panel now has a width parameter

- fixed a bug where tags that begin with unicode digits were accidentally identifying as numbers for the purposes of sorting and throwing errors on convert fail

- the media viewer can handle some more unusual content update combinations–for instance, if it cannot figure out which media to show next, it will revert back to the first image rather than displaying an undefined null mess

- updated and cleaned a bunch of my old misc encryption code

- misc cleanup
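
The unicode digit fix in the list above is an easy trap to fall into, so here is a small sketch of the general problem in Python 3. Characters such as superscript two report themselves as digits but cannot be converted with int(). The sort key below is a hypothetical helper, not the actual hydrus sorting code.

```python
def tag_sort_key(tag):
    """Sort tags with a leading integer numerically, everything else as text.

    Hypothetical helper, not the actual hydrus sort code.
    """
    leading = ''
    for character in tag:
        if not character.isdigit():
            break
        leading += character
    try:
        number = int(leading)
    except ValueError:
        # covers both 'no leading digits' and digit-like characters such as
        # superscript two, which pass isdigit() but cannot be converted
        return (1, 0, tag)
    return (0, number, tag)


tags = ['10 girls', '2 girls', '²fast', 'blue eyes']
sorted_tags = sorted(tags, key=tag_sort_key)
```

Numeric tags sort first by value, and anything that fails conversion falls back to plain text ordering instead of throwing.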


9b613f No.4733

next week

I'll be starting this dupe search stuff. I also started a HydrusRatingArchive this week for future easy rating migration, so I might do a bit more on that.


97baf3 No.4734

I'm getting a weird issue where images have tags but don't show up as tagged in the grid. It just happened on a batch download I made after upgrading.


9b80e4 No.4735

>>4733

>HydrusRatingArchive this week for future easy rating migration

Can you explain more what this is or how it will work? It sounds like something that will replace the current rating system.


89f638 No.4736

Noticing that the performance of similar image search hasn't improved at my end over the last two versions. Seeing about 15 seconds to search (db locked / Loading.. in the status bar) with 373k files in the database.


438ce3 No.4737

I have a problem with rule34@booru.org. The error appears a while after I start downloading something. It does not depend on which file is being downloaded: if I start the download again, everything downloads, but then it gets stuck on another picture.

UnicodeDecodeError
'ascii' codec can't decode byte 0xcf in position 2343: ordinal not in range(128)
Traceback (most recent call last):
File "include\ClientImporting.py", line 424, in _THREADWork
self._WorkOnFiles( page_key )
File "include\ClientImporting.py", line 255, in _WorkOnFiles
self._seed_cache.UpdateSeedStatus( url, status, exception = e )
File "include\ClientImporting.py", line 2009, in UpdateSeedStatus
note += traceback.format_exc()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcf in position 2343: ordinal not in range(128)
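
For anyone hitting the same traceback: this kind of error most likely means a byte string containing non-ASCII bytes was implicitly decoded as ASCII, which Python 2 does when a byte string is added to a unicode string. The sketch below reproduces the failure explicitly in Python 3 and shows one defensive way to decode such text. The helper name and sample bytes are made up, and this is not necessarily the fix that went into hydrus.

```python
def to_text(raw, encoding='utf-8'):
    """Decode bytes to text without ever raising on bad input."""
    if isinstance(raw, str):
        return raw
    # undecodable bytes become U+FFFD instead of raising
    return raw.decode(encoding, errors='replace')


# 0xcf is the byte from the error report above; the rest is invented sample data
raw_traceback = b'IOError: could not open file \xcf\xf0\xe8\xec\xe5\xf0.jpg'

try:
    raw_traceback.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

note = 'import failed: ' + to_text(raw_traceback)
```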


438ce3 No.4738

File: a668595d52a5e90⋯.png (24.27 KB, 570x815, 114:163, 123.PNG)

>>4737

Also, what is this? Why did the tab become wider?


0c3613 No.4739

>>4731

Sounds good. Will there be a dupe PTR of sorts so that this dupe info can be shared?

Also, could Hydrus mark what it thinks is the "best" version based on resolution, quality, size, etc.? That would probably speed up the process in most cases.


1c8038 No.4748

>>4730

>This is only a small thing, but it is neat: Middle-clicking on the greyspace beside your main gui's notebook tabs will now pop up the page chooser dialog.

Nice.


9b613f No.4754

File: 3edd2cec88283ea⋯.jpg (833.08 KB, 2240x1692, 560:423, 3edd2cec88283eaebf2471b2d4….jpg)

>>4734

Thank you for this report. I think the 'selection tags' taglist is getting confused and permanently missing some tag updates. I am going to look into it this week. If you open an entirely new page and find those images again, do they seem to have tags?

>>4735

Just like a HydrusTagArchive, if you are familiar with that. Basically a single external file from/to which you can import/export ratings. It'll make moving ratings from one client to another easier than messing around with raw sql.

>>4736

Thank you for this report. Please run a couple of profiles as described here:

http://hydrusnetwork.github.io/hydrus/help/reducing_lag.html

For these scenarios:

Running a similar files search on a freshly booted client (freshly booted computer would be great too, if that is convenient)

Running the same search immediately afterwards again (just hit f5).

Then go database->maintenance->regen similar files search data and repeat the search. Do this a couple times, so we have a sample of different tree shapes.

Then close the client and bundle all that profile info into a pastebin or .txt file or whatever and email it to me or post it here.

>>4737

Thank you, I will try to fix that this week.

>>4738

I changed how the import options expand/collapse panels report their size info last week, so maybe that is causing it. I'll have a look at it. For now, please drag your management panel a little wider.

>>4739

I am not sure. I used to think so, but I am leaning away from it. Deciding this stuff may or may not be very subjective and so difficult to agree upon. I personally wouldn't want to set up a 'delete images other people think are bad quality' system without putting my eyes in front of any decision commit, and if the workflow needs my approval, then what's the point of other people's input?

Family relationships are probably more useful to share in some way, and less drastic to fix if it goes wrong.

I will have something on the gui that summarises the differences between dupes, giving hints to what it automatically thinks is probably best. You can then decide what you want.
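
A hint like that could be as simple as comparing a few file properties. The sketch below is only a guess at the idea: the FileInfo fields, the weights, and the function name are all assumptions, and it only suggests, leaving the decision to the user.

```python
from collections import namedtuple

FileInfo = namedtuple('FileInfo', ['name', 'width', 'height', 'size_bytes'])


def suggest_better(a, b):
    """Return (suggested_file, reasons); hints only, the user still decides."""
    reasons = []
    score = 0  # positive favours a, negative favours b
    pixels_a, pixels_b = a.width * a.height, b.width * b.height
    if pixels_a != pixels_b:
        # resolution is weighted above filesize; the weights are arbitrary
        score += 2 if pixels_a > pixels_b else -2
        reasons.append('resolution: {}x{} vs {}x{}'.format(a.width, a.height, b.width, b.height))
    if a.size_bytes != b.size_bytes:
        score += 1 if a.size_bytes > b.size_bytes else -1
        reasons.append('filesize: {} vs {} bytes'.format(a.size_bytes, b.size_bytes))
    if score == 0:
        return None, reasons
    return (a if score > 0 else b), reasons


big = FileInfo('big.png', 2240, 1692, 850000)
small = FileInfo('small.jpg', 1120, 846, 900000)
suggestion, reasons = suggest_better(big, small)
```

Returning the reasons alongside the suggestion matters more than the score itself, since they are what a gui summary of the differences would actually display.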


0c3613 No.4766

>>4754

>re: a ptr for dupes

Hydrus automatically deleting images based on what others think is bad quality is not good, I agree. But could this not be used as a suggestion instead? Although perhaps it would be more trouble to implement than it's worth, compared to a simple suggestion by Hydrus based on file properties.

Sharing family relationships via the PTR would be very useful though.


62a17a No.4768

Hey dev, do you know of or have any plans to include support for Fuskator?

"Page of images" detects the pictures on a photoset page, but then fails with:

MimeException: Filetype is not permitted!… (Copy note to see full error)
Traceback (most recent call last):
File "include\ClientImporting.py", line 1351, in _WorkOnFiles
( status, hash ) = client_files_manager.ImportFile( temp_path, import_file_options = self._import_file_options, url = file_url )
File "include\ClientCaches.py", line 878, in ImportFile
return self._controller.WriteSynchronous( 'import_file', *args, **kwargs )
File "include\HydrusController.py", line 348, in WriteSynchronous
return self._Write( action, HC.LOW_PRIORITY, True, *args, **kwargs )
File "include\HydrusController.py", line 113, in _Write
result = self._db.Write( action, priority, synchronous, *args, **kwargs )
File "include\HydrusDB.py", line 696, in Write
if synchronous: return job.GetResult()
File "include\HydrusData.py", line 1922, in GetResult
raise e
DBException: MimeException: Filetype is not permitted!
Database Traceback (most recent call last):
File "include\HydrusDB.py", line 469, in _ProcessJob
elif job_type in ( 'write' ): result = self._Write( action, *args, **kwargs )
File "include\ClientDB.py", line 8697, in _Write
elif action == 'import_file': result = self._ImportFile( *args, **kwargs )
File "include\ClientDB.py", line 5327, in _ImportFile
( size, mime, width, height, duration, num_frames, num_words ) = HydrusFileHandling.GetFileInfo( temp_path )
File "include\HydrusFileHandling.py", line 166, in GetFileInfo
if mime not in HC.ALLOWED_MIMES: raise HydrusExceptions.MimeException( 'Filetype is not permitted!' )
MimeException: Filetype is not permitted!

But they seem to be normal JPGs.

The HTML structure is like this:

<body>
<form id="aspnetForm" method="post" action="/full/blah/blah-blah.html"> /* the path from website main directory to current page */
<div id="main">
<div class="hreview-aggregate">
<div>
<div class="imagelinks">
<a href="#2"> /* always the bookmark numbered one more than the current image - clicking an image jumps to next image */
<img id="i1" alt="blah-blah-1" ondragstart="return false" onselectstart="return false" onload="autoSizeImage('i1', 752, 1155);" src="//i6.fuskator.com/large/blah/blah-blah-1.jpg" width="752" height="1155">

I'm thinking maybe the bookmark links are messing it up or something? Or Hydrus doesn't recognize the type of jpg encoding they use?


9b613f No.4771

File: c0d384cb68dead6⋯.jpg (303.03 KB, 718x471, 718:471, c0d384cb68dead68049c3a295c….jpg)

>>4738

It was the longer 'stop searching once this…' text, which I changed this past week. I've shortened it again for tomorrow's release.

>>4768

My current stance is to not spend any more time writing parsers for new sites, but instead overhaul the downloader engine so any user can create/maintain their own. I will start this overhaul once I am finished with the current faster dupe stuff.

The page of images downloader can't handle that html–it can do:

<a href="fullsize_image.jpg"><img src="thumbnail.jpg" /></a>

which gets the fullsize url, and with the other checkbox--

<body><span><div><img src="fullsize_image.jpg" />...

In your case, I think it is probably fetching the '#2', which is just the same page back again, which is html that then causes the MimeException.
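
To show the difference between the two cases, here is a small sketch of the 'link wrapping an image' rule using Python's standard html.parser. It is not the real page of images code; the class and function names are made up, and skipping '#' links is just one way a parser could avoid re-fetching the same page.

```python
from html.parser import HTMLParser


class LinkedImageParser(HTMLParser):
    """Collect the href of every <a> that wraps an <img>."""

    def __init__(self):
        super().__init__()
        self._current_href = None
        self.file_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'a':
            self._current_href = attrs.get('href')
        elif tag == 'img' and self._current_href is not None:
            # a '#2' style href is only a bookmark into the same page, not a file
            if not self._current_href.startswith('#'):
                self.file_urls.append(self._current_href)

    def handle_endtag(self, tag):
        if tag == 'a':
            self._current_href = None


def find_linked_images(html):
    parser = LinkedImageParser()
    parser.feed(html)
    return parser.file_urls


supported = '<a href="fullsize_image.jpg"><img src="thumbnail.jpg" /></a>'
fuskator_like = '<a href="#2"><img id="i1" src="//i6.fuskator.com/large/blah/blah-blah-1.jpg"></a>'

supported_urls = find_linked_images(supported)
fuskator_urls = find_linked_images(fuskator_like)
```

The first snippet yields the fullsize url, while the second yields nothing, because the only link around the image is a bookmark.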

Writing a parser for this in the new system will take no time at all. Please hang on for now.


62a17a No.4776

>>4771

Sweet. I figured the links were messing it up because it was looking for a full image. Will eagerly anticipate the engine scripting bit.

Once you get that finished, you should consider stickying a thread where people can share all of the various things you can import and export as images.


62a17a No.4777

Also, if possible, a way to bake delays between each file download straight into each script would be handy for avoiding bans. On that same note, some sort of scripting socket to a basic rand generator could allow us to avoid downloading at even milliseconds and getting autobanned for that.
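
Something like this is what I mean, as a minimal sketch. The function name and numbers are made up; the point is just a fixed base delay plus a random extra so requests never land on an even schedule.

```python
import random


def jittered_delay(base_seconds, jitter_seconds, rng=random):
    """Return base_seconds plus a random extra in [0, jitter_seconds]."""
    return base_seconds + rng.uniform(0, jitter_seconds)


rng = random.Random(0)  # seeded only so the example is repeatable
delays = [jittered_delay(3.0, 2.0, rng) for _ in range(5)]
```

A downloader would then call time.sleep( jittered_delay( 3.0, 2.0 ) ) between files.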


0c3613 No.4780

I have a request for the parsing scripts. It would be neat if you could make it so link nodes only trigger if a certain string is found in the link address. That way you could have multiple nodes for different websites on the same parsing script, and you could for example do a single iqdb search and follow the results for every website it covers at once, instead of searching one specific site at a time, which would save a lot of bandwidth.
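
Roughly what I have in mind, as a sketch only. The handler names and the example domains are placeholders, and this is not how the parsing scripts currently work.

```python
def route_links(links, handlers):
    """Group links by the first handler whose required substring they contain.

    `handlers` maps a required substring to a hypothetical parser name.
    Links matching no handler are dropped, so they are never fetched.
    """
    routed = {}
    for link in links:
        for required_substring, parser_name in handlers.items():
            if required_substring in link:
                routed.setdefault(parser_name, []).append(link)
                break
    return routed


handlers = {'site-a.example': 'site a parser', 'site-b.example': 'site b parser'}
links = [
    'https://site-a.example/posts/1',
    'https://site-b.example/view?id=2',
    'https://unrelated.example/page',
]
routed = route_links(links, handlers)
```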



