/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.
New user? Start here ---> http://hydrusnetwork.github.io/hydrus/

Experienced user with a bit of cash who wants to help out? ---> Patreon

Current to-do list has: 2,017 items

Current big job: Catching up on Qt, MPV, tag work, and small jobs. New poll once things have calmed down.


5f3e9d  No.8708

windows

zip: https://github.com/hydrusnetwork/hydrus/releases/download/v304/Hydrus.Network.304.-.Windows.-.Extract.only.zip

exe: https://github.com/hydrusnetwork/hydrus/releases/download/v304/Hydrus.Network.304.-.Windows.-.Installer.exe

os x

app: https://github.com/hydrusnetwork/hydrus/releases/download/v304/Hydrus.Network.304.-.OS.X.-.App.dmg

tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v304/Hydrus.Network.304.-.OS.X.-.Extract.only.tar.gz

linux

tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v304/Hydrus.Network.304.-.Linux.-.Executable.tar.gz

source

tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v304.tar.gz

I had a great week. There is a bunch more downloader work and some new shortcuts.

tag blacklists

After a long time, the client now supports tag blacklists! I've put them under tag import options.

They use the newer 'tag filter' object, which scans a file's tags as they come in, and if it sees any it would exclude (like 'scat' or whatever else you might not want), it stops the file from importing automatically.

On the newer download systems, this vetoes the file before it is downloaded (saving you some time and bandwidth), but the legacy downloaders still download the file and tags together, so they can only stop the import after the download is done. For now, there are also two locations where default tag import options can be set. The ambiguity here should all be cleaned up in the coming weeks as I move everything over to the new systems.
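
As a rough illustration of the idea only (this is not the client's real code, and `TagBlacklist`, `ImportStatus` and `check_before_download` are made-up names), a blacklist check that vetoes a file as soon as its parsed tags include an unwanted one might look like this:

```python
from enum import Enum
from typing import Iterable, Set, Tuple


class ImportStatus(Enum):
    OK = "ok"
    VETOED = "vetoed"


class TagBlacklist:
    """Hypothetical stand-in for the 'tag filter' object described above."""

    def __init__(self, blocked_tags: Iterable[str]):
        # Assumption: tags are compared case-insensitively, ignoring outer whitespace.
        self._blocked: Set[str] = {t.strip().lower() for t in blocked_tags}

    def find_blocked(self, tags: Iterable[str]) -> Set[str]:
        """Return the subset of the given tags that are on the blacklist."""
        return {t.strip().lower() for t in tags} & self._blocked


def check_before_download(
    tags: Iterable[str], blacklist: TagBlacklist
) -> Tuple[ImportStatus, Set[str]]:
    """Decide whether to veto a file whose tags were parsed before the file itself."""
    bad = blacklist.find_blocked(tags)
    if bad:
        return ImportStatus.VETOED, bad
    return ImportStatus.OK, set()


blacklist = TagBlacklist(["scat"])
status, bad = check_before_download(["Scat", "creator:someone"], blacklist)
print(status.name, sorted(bad))  # VETOED ['scat']
```

In a legacy-style downloader the same check would simply run after the file and tags arrive together, so the bandwidth is already spent by the time the veto happens.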

If you have been waiting for this, please give it a go and let me know how it works for you. It seems ok in my testing, but I may have let some unusual situations fall through the cracks.

url normalisation and other downloader work

An important objective of this downloader overhaul has been to 'normalise' URLs–to collapse the different ways you can write a single URL into a single comparable format that is not only clean and pretty but also makes it easy to determine if the client has seen and downloaded it before. The new 'url classes' system does this, and in 304, the client will apply URL normalisation to all incoming import URLs. This mostly matters for boorus like e621 that append some random tags at the end of the URL as a description, but it will also convert any matched legacy http definitions or drag-and-drops to https automatically.

The way the client determines if it has seen a URL before–particularly when it has not been matched by the new system (an unknown url class)–is also much improved. The client can now better deal with conflicting data, like multiple files claiming to have the same URL, without either redownloading every time or abandoning the attempt entirely.

The main import object will now also handle certain import errors like 404 and the new tag blacklist event in a more graceful way (now called 'veto' status) and also present more information in its '23 successful, 5 skipped' status text.
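
To make the status text concrete, here is a small sketch of how such a summary could be tallied. The status names, the `summarise` and `should_abandon` helpers, and the error threshold of 5 are all assumptions for illustration. The one detail taken from the changelog is that vetoes are reported as 'ignored' and do not count as errors when a subscription decides whether to give up.

```python
from collections import Counter
from typing import Iterable

# (internal status, label shown in the summary); names are illustrative only.
LABELS = [
    ("successful", "successful"),
    ("already in db", "already in db"),
    ("vetoed", "ignored"),
    ("skipped", "skipped"),
    ("error", "failed"),
]


def summarise(statuses: Iterable[str]) -> str:
    """Build a '23 successful, 5 skipped' style string, omitting zero counts."""
    counts = Counter(statuses)
    parts = [f"{counts[status]} {label}" for status, label in LABELS if counts[status]]
    return ", ".join(parts) if parts else "no files"


def should_abandon(statuses: Iterable[str], max_errors: int = 5) -> bool:
    """Only real errors count towards abandoning a sync; vetoes (e.g. 404s) do not."""
    return Counter(statuses)["error"] >= max_errors


print(summarise(["successful"] * 23 + ["skipped"] * 5))  # 23 successful, 5 skipped
print(should_abandon(["vetoed"] * 10 + ["error"] * 2))   # False
```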

I've added several basic parsers to the simple downloader for yiff.party as well–thanks to @cuddlebear on the discord for the submission. I expect to do more here in future as well (likely making a watchable url class and full-blooded PageParser so you can 'watch' a yiff.party stream like a thread and even a subscription).

Due to a mistake in the update code, this release accidentally resets the simple downloader parsers to default (which includes the new yiff.party parsers)–if you have a bunch of custom simple downloader parsers set up, please export them before you update, or wait a week for the fixed update code (which will just add the new parsers to whatever already exists) to roll out.

new shortcuts

You can now set shortcuts for opening the new downloader pages (urls, simple, and thread watcher) and the duplicate filter and page of pages, all under the 'main_gui' shortcut set. Support for individual file pages should come in the near future.

And you can now shortcut all the duplicate-setting actions (like 'set these all as alternates' or even 'the focused one is better than all the others selected') under the 'media' shortcut set. These are advanced commands, so if you don't get the duplicate filter yet, stay away!

full list

- renamed the new 'tagcensor' object to 'tagfilter' (since it will end up doing a bunch of non-censoring jobs) and refactored it into clienttags

- attached a tag filter object to all tag import options to act as a tag blacklist. all tags that go through the import pipeline (except for a couple of old legacy instances) are now checked against the blacklist, and if a bad tag is found, the file vetoes! tag import options has some new ui to handle this and background code to deal with inheritance from defaults and so on

- new file import urls that have url classes, no matter their source, are now normalised!

- all new file import urls are now tested against both the original and normalised version of the url, so even though previously parsed urls remain un-normalised, new urls that are pre-normalised the same will not count as new! -fingers crossed-

- on update, the db will get normalised copies of all existing urls. this means many files will now have two versions of their urls–some ui to collapse everything down to only the normalised version (after some human eyes have passed in front of this big change) will come in the coming weeks

- some sites where normalisation is a consistent problem for later redownloads (like e621, which appends 'preview' tags to the post url) _should_ now be caught reliably!

- the 'allow subdomains' on edit url class panel is now named 'match subdomains' and has a tooltip to better explain how it works

- 'keep subdomains' is now 'keep matched subdomains' and has a tooltip as well

- the 'keep matched subdomains' enabled behaviour (and some normalisation calculation) is now additionally governed by the 'associate url with files' value and api url conversion info rather than just 'match subdomains' and raw url type

- fixed an issue that was stopping the 'associate url with files' option sticking in edit url class panel

- edit url matches now resorts after an add or edit action

- all listctrls with a wrapper panel now resort after an import from clipboard, png, or defaults call

- url matches now match against www*. versions of their domain regardless of 'match subdomains' settings

- updated xbooru url classes to prefer https

- the manage url class links panel now has a 'clear' button to clear a url_class->parser link

- introduced three new simple downloader parsers for yiff.party, thanks to @cuddlebear on discord for the submission

- the old 'uninteresting mime' status has been expanded to a wider 'vetoed' status to represent all file imports that are abandoned without a particular error (e.g. tag blacklist, wrong filesize or resolution)

- the import system now reports the total of 'num vetoed' as 'num ignored' in its summary statements

- it now also reports 'num skipped'

- the 'num successful' and 'num already in db' are now folded more neatly together in import cache summary statements

- file downloads that are cancelled will now set a 'veto' state rather than a 'skip' state

- improved file import exception handling across the board

- improved how single-file-result parsing vetoes propagate up to the file import status cache

- 404 network errors will now provide a 'veto' status rather than an 'error'

- vetoes will not count as errors when deciding whether a subscription should be abandoned early (so a bunch of decomp bombs or 404s will no longer stutter a subscription!)

- misc fixes and improvements to the new download stuff

- wrote a new parsing cache that saves a lot of work in the new parsing system

- improved the 'is this url known?' test to better deal with situations where all the given urls are galleries or unrecognised–a better aggregate of file status is formed, and 'already in db'/'deleted' statuses will apply if there is no evidence otherwise (the dev got the new logic for this from a legit nightmare about urls downloading over and over, so let's hope it works out)

- the 'is this url known?' logic also recovers from 1->n url->hash relationships where it does not expect them, trying to find 'already in db' hashes over 'deleted' ones

- to clear up some ambiguity, galleries or subscriptions now give a different 'checking in x seconds' status when waiting on the first page of a query

- the 'noneablebytescontrol', as seen in edit file import options, will now correctly disable/enable its bytes sub-control when it is none'ed

- a persistent issue with the new network engine sometimes failing to correctly error after certain broken connections (the computer going to sleep mid-download was a common cause here) should now be recovered from and the connection naturally reattempted

- added three new shortcuts to the 'main_gui' shortcut set that allow for opening a new 'urls', 'simple', or 'thread watcher' downloader page

- added two more shortcuts to 'main_gui' for new 'page of pages' and 'duplicate filter page'

- moved some old 'new page' menu code to the new application command system

- added numerous 'duplicates' shortcuts to the 'media' shortcut set that will work on selections of thumbnails

- the thumbnail duplicates menu actions now go through the new application command system

- fixed an issue where the current tag parents cache was not refreshing when notified

- inputting a short invalid syntactic input on a 'read' tag autocomplete such as '-' will now clear the system predicates list–system preds should now only show on a completely empty input

- fixed an issue where certain combinations of 'remove a tag, then re-add it' nullipotent actions in a single manage tags dialog transaction were not applying reliably (sometimes, the subsequent mirror action was not occurring due to a processing re-order optimisation at the db level)

- made some animation code a little safer and quieter as a test for some users who were getting blitzed with some deadwindow error spam in certain situations–let's see if this changes anything

- replaced all the em dashes in the help with double hyphens as github pages was rendering them wrong

- added CrystalDiskInfo recommendation to 'help my db is broke.txt'

- misc cleanup
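
The 'is this url known?' items in the list above can be pictured with a short sketch. This is an assumption-heavy illustration, not the real db logic: the two dict lookups, the `url_status` name and the toy normaliser are all invented. It only shows the two ideas from the changelog, namely testing both the original and the normalised form of a url, and preferring an 'already in db' hash over a 'deleted' one when a url maps to several files.

```python
from typing import Callable, Dict, Set

IN_DB, DELETED, UNKNOWN = "already in db", "deleted", "unknown"


def url_status(
    url: str,
    normalise: Callable[[str], str],
    url_to_hashes: Dict[str, Set[str]],
    hash_status: Dict[str, str],
) -> str:
    """Aggregate the file status for a url, checking raw and normalised forms."""
    hashes: Set[str] = set()
    for candidate in {url, normalise(url)}:
        hashes |= url_to_hashes.get(candidate, set())

    statuses = {hash_status.get(h, UNKNOWN) for h in hashes}

    # With 1->n url->hash relationships, prefer evidence that a file is present.
    if IN_DB in statuses:
        return IN_DB
    if DELETED in statuses:
        return DELETED
    return UNKNOWN


# Toy normaliser for the example only: drop everything after '?'.
def toy_normalise(u: str) -> str:
    return u.split("?")[0]


url_to_hashes = {"https://example.com/post/1": {"aaa", "bbb"}}
hash_status = {"aaa": DELETED, "bbb": IN_DB}
print(url_status("https://example.com/post/1?x=1", toy_normalise, url_to_hashes, hash_status))
# already in db
```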

next week

Now that all the new urls going into the system are normalised, I would like to get the gallery and subscription downloaders to start using the new system where it can find a parsing solution. Other users and I can then start adding parsers and it should all naturally migrate over the coming weeks.

I've also still got plenty of small stuff to work on.

____________________________
Disclaimer: this post and the subject matter and contents thereof - text, media, or otherwise - do not necessarily reflect the views of the 8kun administration.

44e840  No.8709

File: 28dbbc3c2fe9b5d⋯.png (885.7 KB, 800x1091, 800:1091, 28dbbc3c2fe9b5d674376a94f3….png)

>>8708

Wow, you got quite a lot done again this week. Thanks!

I can see the URL normalisation [tricky stuff that, I hope you used some lib?] & tag blacklist helping greatly going forward.

3803d2  No.8710

File: c53b41afab950a8⋯.png (21.26 KB, 152x254, 76:127, 1473057531020.png)

Thanks for the new shortcuts!

6d780e  No.8711

Nice! Really happy to see blacklists arrive. I might even pop my cherry on hydrus file repos if I can find some good ones.

Here's a request while I'm here:

Idiot-proof windowing

Yesterday I spaced out and opened two subscriptions windows, worked on the front one for a while, closed the front one, saw the second one, and very nearly applied its "changes", which would have wiped out my work in the front window by blithely overwriting it. It would be nice to get a popup warning you that you're retarded when you open multiples of something important that you normally wouldn't, like subscriptions, especially since I think I did it due to a delay in the first window appearing. That, or just bring the current subscription window to the foreground instead of opening another instance. I don't know if this is easy to implement in wx or not, but if it is, it could save someone some loss.jpg

6d780e  No.8712

File: 4d9dd4afd4d1211⋯.jpg (37.67 KB, 152x254, 76:127, 8ch-sickos.jpg)

>>8710

Here, take an upgrade.

688e36  No.8714

File: e189acbe758f746⋯.jpg (7.5 KB, 250x201, 250:201, cot.jpg)

>>8712

>upgrade

>jpg with no transparency

>almost twice the filesize

8279ff  No.8716

File: 0efee81551286f9⋯.png (13.52 KB, 152x254, 76:127, sickos.png)

>>8714

Your image wasn't transparent, it was a 4chan-colored background, so I changed it to 8ch-colored, hence "upgrade". Is joke. Maybe you're using a nostalgic 4chan CSS or something?

But here, this one is actually transparent, a png, and smaller to boot so you can save some room on that floppy

8279ff  No.8717

I should be more specific, the image was transparent but the identically-sized thumbnail was not. So I saved the thumbnail and "fixed" it, then actually fixed it. For real though, what's going on with that original image?

c0ac98  No.8718

On a given image,

https://pixiv.net/member_illust.php?illust_id=68406455

which I suppose is the normalized url, sends me to MY profile, while

https://www.pixiv.net/member_illust.php?mode=medium&illust_id=68406455

sends me to the correct page as usual.

I'm not sure if this is part of the normalized url system or a simple bug, however.

d7fd08  No.8719

File: 43ff54e2f0b716d⋯.png (31.44 KB, 495x312, 165:104, transparent.png)

>>8716

Worked on my machine.

05a442  No.8722

>>8717

>>8712

I saw that, too. It was a grey border, until you click it that is.

5aae01  No.8723

Got this error during Pixiv Sub check

Exception
The subscription Pixiv encountered several errors when downloading files, so it abandoned its sync.
Traceback (most recent call last):
File "include\ClientImporting.py", line 4846, in Sync
self._WorkOnFiles( job_key )
File "include\ClientImporting.py", line 4276, in _WorkOnFiles
raise Exception( 'The subscription ' + self._name + ' encountered several errors when downloading files, so it abandoned its sync.' )
Exception: The subscription Pixiv encountered several errors when downloading files, so it abandoned its sync.

df55bb  No.8725

File: c92ef54bfda4e79⋯.jpg (133.98 KB, 625x872, 625:872, c92ef54bfda4e798ded08bfe75….jpg)

>>8709

I didn't use a library for the URL normalisation–would you have a recommendation?

I don't do anything _too_ clever–I replace the http with https when desired, strip off undefined path components and parameters, and alphabetise parameters by key. I am not following an official standard to be super neat or anything–just collapsing some common booru url dupe problems we've come across.
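
For anyone curious, those three steps can be sketched with nothing but the standard library. This is only an illustration and not the client's actual code: `normalise_url` and its `path_depth` and `allowed_params` arguments are invented names standing in for whatever a url class defines, and dropping the fragment and trailing slash is an extra assumption of the sketch.

```python
from typing import Optional, Set
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalise_url(
    url: str,
    prefer_https: bool = True,
    path_depth: Optional[int] = None,
    allowed_params: Optional[Set[str]] = None,
) -> str:
    parts = urlsplit(url)

    # 1. Replace http with https when desired.
    scheme = "https" if prefer_https and parts.scheme == "http" else parts.scheme

    # 2a. Strip path components beyond the ones the (hypothetical) url class defines.
    segments = [s for s in parts.path.split("/") if s]
    if path_depth is not None:
        segments = segments[:path_depth]
    path = "/" + "/".join(segments)

    # 2b. Strip query parameters the url class does not define.
    params = parse_qsl(parts.query, keep_blank_values=True)
    if allowed_params is not None:
        params = [(k, v) for k, v in params if k in allowed_params]

    # 3. Alphabetise the remaining parameters by key.
    params.sort(key=lambda kv: kv[0])

    # Assumption: the fragment is dropped entirely.
    return urlunsplit((scheme, parts.netloc, path, urlencode(params), ""))


print(normalise_url(
    "http://example.com/post/show/123/some_random_tags?b=2&a=1&utm=x",
    path_depth=3,
    allowed_params={"a", "b"},
))
# https://example.com/post/show/123?a=1&b=2
```

Two differently written urls for the same post then collapse to the same string, which is what makes the 'have I seen this before?' comparison cheap.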

>>8711

Thank you for this report. Was the double-sub window issue on a recent version of the program? I believe I have fixed this problem (maybe around v301?) so it'll stop you opening another while the first is loading, but please let me know if you can still do this.

>>8718

>>8723

Thank you, I have had this report from a couple of people now. Yeah, it looks like an issue with the new url classes, maybe combined with pixiv's unusual url referral stuff. I will make sure to fix this properly for v305.

Please pause your pixiv downloaders/subs for now. I apologise for the inconvenience.

>>8717

>>8719

I see the grey background in the thumb too, but >>8716 is correct for me. c53b is a grayscale image, whereas 0efe is a 256 colour image, so I guess 8chan's thumb generator deals with transparency in the different modes differently? Could that grey be the same intensity of the blue it would otherwise put in, collapsed back down to greyspace?

8a1597  No.8726

File: 4e1969ffc02dc27⋯.jpg (248.07 KB, 870x1230, 29:41, 4e1969ffc02dc270fe9790b2ef….jpg)

> I would like to get the gallery and subscription downloaders to start using the new system where it can find a parsing solution. I and other users can then start adding parsers and it should all naturally migrate over the coming weeks.

Does that mean we'll be able to write parsers using the new system for boorus?

New to hydrus and I've been trying to strangle the current system into downloading from the drawfag booru, but it's busted on tags due to there being no classes on the <li> elements (seems like that's how hydrus tries to pull tags from gallery posts)

It sounds like the new system allows for a bit more customization on the HTML parsing side of things.

8d785b  No.8728

>>8725

It was v303, but it may have been because it was lagging, I had a lot of other programs open and it's an old laptop.

Regarding the image, when I opened the one I saved it actually had the coloration in the background, which I just deleted and left it transparent (but first I replaced it with 8ch background color for the "upgrade" jpg). I don't know if it's a thing with the image itself or for some reason 8ch is adding background colors to some images.

641b23  No.8729

>>8725

>would you have a recommendation?

For Java I'd have recommended Nutch's URL normalizers (most of them should be easy to adapt, see https://github.com/apache/nutch/blob/master/src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer/basic/BasicURLNormalizer.java for example), but for Python I've got little practical experience.

Well, something like https://github.com/alephdata/urlnormalizer looks like a reasonable enough start.

f11ffe  No.8730

File: 0d303238fb6509e⋯.jpg (308.67 KB, 550x550, 1:1, 0d303238fb6509e203239cb4df….jpg)

>>8725

Thanks for clarifying.

I tried to go back to 303 but this gave me a serious database error.

I'll have to pause it and mass save my bookmarks of the week.

If anyone is a heavy pixiv user and reads this before upgrading to 304:

DO NOT UPGRADE YET.

Thanks for the attention.

While we're discussing pixiv, what is the deal with the tags? I can see how their presentation has changed, but I don't understand why tags with a wiki/translation are ignored during download while those without are correctly added.

5b4df9  No.8731

>introduced three new simple downloader parsers for yiff.party, thanks to @cuddlebear on discord for the submission

does this mean we can rip off entire yiff.party galleries now?

5b4df9  No.8732

File: c97e744d1d32628⋯.jpg (43.29 KB, 918x749, 918:749, DICC.jpg)

>>8731

holy shit yes we can MUH DICC, hydev you're the best

273202  No.8734

>>8708

I'm skipping this version because I have a non-default parser and can't be asked to regret it.

got a question

Is it possible to have files with blacklisted tags go to trash instead of being skipped? Take your example, scat: there are images that would technically be scat but that I would still want to save. One that comes to mind is a doujin that has all the scat content invisible/clear due to magic bullshittery, so it turns into more of a gaping hole hentai than scat, but it would still be labeled scat.

A blacklist would be helpful for me, if only to get over the initial sorting, but I would like to have final discretion on what to keep.

also, I didn't get an answer for this

>>8582

To sum it up, I was wondering if there's a way to only see new images from a download

I find it useful to see all downloads from threads/sites so I know that it's working properly, but once it's done, I would like to be able to hide previously imported files.

I'm not sure if any current method works retroactively.

Is it possible to do this and if not, is it something you may have an interest in implementing?

Speaking of which, there is another thing about views that I asked about a long time ago, but I'm not sure it ever got a response.

I'm not sure if this still happens, because I'm careful now, but new pages would always default you to the view mode. After a thread of 150+ images downloaded, you would want to scroll down, and then the view changed and the import order was fucked.

Now, let's say you downloaded an image set that you already have a partial collection of. You scroll down and everything is out of order, so you sort by time imported, but that sorts by when each file was added to the archive, not by its order in the thread. Is it possible to have a 'thread order' sort that sorts by each file's position in the current page's download?

a7d789  No.8743

Any news on the API that you had planned after the downloader overhaul?

df55bb  No.8751

File: fbc5248a721852c⋯.jpg (382.21 KB, 2932x2132, 733:533, fbc5248a721852c0db4c715ac3….jpg)

>>8726

Yes, absolutely. This stuff will start rolling out for real today, but there is still more to do. In v305 today, it will be possible to write a file page parser and hence add drag-and-drop support for your booru's file pages. Gallery page parsing and search-string->gallery-page-url stuff will come later. If you would like to check the help so far, it starts here:

http://hydrusnetwork.github.io/hydrus/help/downloader_intro.html

If you don't get it or would like help with any of it, please let me know. I'll eventually be writing some let's-make-a-full-downloader-from-start-to-finish examples for it as well, but I'm still working on some of the components.

If you end up writing some parsers and url classes, please send me the pngs if you like and I'll check them out and roll them into the release as new defaults!

>>8728

I checked this out this week but unfortunately could not reproduce it (there's a simple strong lock against doing this, precisely because of the delayed manage subs launch), but in thinking about it just now I realised you might not have been talking about manage subscriptions, but instead one of the 'edit a single sub' sub-dialogs that that dialog launches. Was this it--were you able to hit the 'edit' button/double-click twice before the first dialog opened, so you had two dialogs for the same single sub?

>>8729

Thanks, I will keep this in mind.

>>8730

It looks like pixiv changed their tag layout since I first wrote its parser and they now have these jap/romaji options. I've got a new and fixed pixiv downloader/parser rolling out today that will fetch the romaji versions.

>>8731

>>8732

I've got an improved parser coming out today that works with yiff.party's API and does 'inline' images in the post body as well, and I've linked it up with some url classes so you can drag-and-drop yiff.party artist urls and spawn a thread watcher that should be able to recheck every week or whatever if you set some sensible check timings. I also expect to add thread-watchable stuff to the subscription system in this downloader overhaul, which is a more appropriate location for this sort of long-period checking. Assuming your penis survives, please let me know how it works for you.

>>8734

I think I will figure out some kind of 'blacklist = remember as deleted' system so the html pages themselves are at least not reattempted.

I am not sure about getting the actual file and sending it to trash immediately. Is the workflow here basically "it is easier to choose the three good files out of three dozen bad ones than to choose to delete thirty-three bad ones"? Would you be walking through a search for say ( 'file domain: trash', 'scat' ) and then rescuing good files and permanently deleting bad ones? Could this workflow be served just by letting all scat through and searching in 'my files' for 'scat' and sending the 'bad' examples to 'trash'?

For >>8582 , sorry, I missed it. Yeah, I don't have many good 'dynamic' display tools in for thumbnails yet. The media panel where the thumbnails are held is largely ignorant of any importer or searcher that feeds it thumbs, so it can't do 'sort by import time of my importer' or 'hide this class of file, and then show them again' yet, as hiding/removing something currently permanently removes it from the media panel.

I would like to add these things, but the media panel is probably the most fucked part of my codebase. I would like to clear out a bunch of time to focus on overhauling it so I can clean it all up and start making it aware of more things than its own tangled duct-tape mess. I'd also like to add 'grouping', to split thumbs into per-mime or whatever groups with little spacers between them, and maybe even generalise the data code a little more so I can add 'view as list' or any other view options. Sadly, it will have to wait a bit.

>>8743

I have not worked on it. I am mostly hammer and tongs on the downloader right now. I would still like to do something and would like to start by rustling up a very limited prototype that I can iterate on.

4aa73f  No.8758

>>8751

It was manage subs, not a sub edit, but I haven't been able to duplicate it either.

Maybe just a post-upgrade first launch bug or something.
