windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v311/Hydrus.Network.311.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v311/Hydrus.Network.311.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v311/Hydrus.Network.311.-.OS.X.-.App.dmg
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v311/Hydrus.Network.311.-.OS.X.-.Extract.only.tar.gz
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v311/Hydrus.Network.311.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v311.tar.gz
I had a great couple of weeks. E3 was fun to watch, and then I got back to proper work, mostly fixes and improvements to the new download systems.
pixiv fixed and other downloader stuff
I have made a new parser for the new dynamic pixiv layout. It was not simple, but it seems to work ok, including for manga. It only gets unnamespaced tags in romaji/kanji (fetching just romaji/translations was a bit of a pain), but Pixiv's unnamespaced tags have never been high quality, so unless you have a particularly important need for them, I recommend you not parse them. Your client should switch to this new parser automatically as soon as you update. My understanding is that everyone has been moved to the new layout, but if you are still on the old one, please check out network->manage url class links to roll back, and let me know if you need any more help. Also, pixiv now lists a 'page' namespace in its downloader/subscription tag import options, if you want to parse page:1, 2, 3 for manga downloads.
In a similar way, I have fixed the new inkbunny parser, which was fetching and tagging additional unwanted files. It now visits each page of multi-page Post URLs independently to get the correct File URLs. Let me know if you still have trouble with it, including any example links that break!
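As a rough illustration of what visiting each page of a multi-page post independently can look like, here is a minimal sketch. It is not the actual hydrus parser code: the function name, the example URL, and the exact '-pN-' suffix layout are assumptions made up for this example.

```python
def page_urls(post_url, page_count):
    """Return one URL per page of a multi-page post.

    Assumption (hypothetical layout): page 1 is the bare post URL and
    page N, for N >= 2, is the post URL with a '-pN-' suffix appended.
    """
    if page_count < 1:
        raise ValueError('page_count must be at least 1')

    urls = [post_url]

    for page in range(2, page_count + 1):
        urls.append('{}-p{}-'.format(post_url, page))

    return urls


# Hypothetical post URL, for illustration only.
for url in page_urls('https://example.com/s/12345', 3):
    print(url)
```

Each generated URL would then be fetched and parsed on its own, so every page contributes only its own File URL.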
The multiple watcher also has some bells and whistles--it now remembers its highlight, displays the URL of the current highlight, provides ways to set checker/file import/tag import options, and presents 'added' time in its watcher list. This added time is new, so for any existing watchers it will be set as the next 'load time', but it will remember thereafter.
Tag import options now has a 'get all tags' checkbox that advanced users may wish to use to override some missing-namespace weirdness related to the new downloader stuff currently being half-complete. I expect to do some more here in the coming weeks.
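To make the 'get all tags' idea concrete, here is a small sketch of namespace filtering with an override. The function names, the set of allowed namespaces, and the example tags are all hypothetical, and this is not how hydrus implements tag import options internally.

```python
def split_tag(tag):
    """Split 'namespace:subtag' into (namespace, subtag).

    Unnamespaced tags are given the empty namespace ''.
    """
    namespace, sep, subtag = tag.partition(':')

    if sep == '':
        return ('', tag)

    return (namespace, subtag)


def filter_tags(tags, allowed_namespaces, get_all_tags=False):
    """Keep tags whose namespace is allowed, or everything if get_all_tags is set."""
    if get_all_tags:
        return list(tags)

    return [tag for tag in tags if split_tag(tag)[0] in allowed_namespaces]


# Hypothetical parsed tags, including one namespace the checkboxes do not list.
parsed = ['creator:someone', 'page:1', 'blue sky', 'mystery:value']

print(filter_tags(parsed, {'creator', 'page'}))
print(filter_tags(parsed, {'creator', 'page'}, get_all_tags=True))
```

With the override off, the 'mystery:value' tag is dropped because its namespace was never offered as a checkbox. With the override on, it comes through with everything else.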
misc
Some kinds of regular file search are now much faster. Ratings searches, in particular, should now be pretty snappy.
Importing and exporting serialised .png objects through the new url class/parser dialogs is now easier--the little export panel now fills in better defaults and remembers the last location used, and the respective lists should now support .png drag-and-drop import. So, if I or anyone else gives you a new parser .png to try out, you can just drag it right onto the network->manage parsers dialog and it should import no prob.
Illustration2Vec project for advanced users
A user has done some really neat work integrating the machine learning Illustration2Vec project into hydrus. If you are interested in playing around with auto-tagging using ML systems, please check out the conversation starting here: >>9142
Although I am up to my neck in downloader overhaul at the moment, I am still enthusiastic and increasingly optimistic about integrating ML into hydrus in a variety of ways over the coming years. I am particularly interested in us generating our own models using our own CPU/GPU cycles. This is all extremely new tech, and my chief concern is how to make flexible and reasonable and productive workflows around it, so if you check this out, please let me know how it works well--and less well--for different situations, what you think is ultimately practical to achieve, and how you might like to integrate it into a future workflow.
full list
- wrote a new parser that muddles its way through pixiv's new dynamic javascript layout. it seems to get everything working again. it gets tags in kanji, although the unnamespaced pixiv tags remain low quality, and you may wish to just not parse them at all anyway
- fixed some misc parser text handling, unicode conversion etc...
- the new pixiv parser has a 'page' tag stub that should inform tag import options in the old downloader
- the multiple watcher now remembers the highlighted watcher through a session restart
- the multiple watcher now shows the highlighted watcher's url up top
- the multiple watcher now has checker, file import, and tag import options, which it will assign to all new watchers it creates
- the multiple watcher now has a 'set options to watchers' button that will force-set the current options to all the selected watchers
- the multiple watcher now has an 'added' column with watcher creation time listed. storing this creation time is new, so any existing watchers will get a new creation time of their next load time, but it is remembered henceforth. the listctrl here is now pretty crushed for width, so maybe we'll rejigger some stuff here
- watchers added to a multiple watcher will now have a status of 'just added' for five seconds
- watchers that are added to a multiple watcher that is already watching them will now have the status of 'already watching' for five seconds
- the multiple watcher list now has a much taller minimum height--layout here is another work in progress
- fixed the inkbunny parser (and a related tweak to the inkbunny url class)--it now uses the new 'multiple-file-per-post' import object generation to actually walk through the pages of the mini-gallery (which for inkbunny have -p2- suffixes on the url) to fetch only the correct files and url-associate them neatly
- tag import options now has a 'get all tags' checkbox, which can override the normal namespace checkboxes. it gets all tags, even those with namespaces not listed, which happens for several reasons in the new download system. (eventually, the namespace list may be replaced with a slightly different system)
- watcher tag import options no longer list 'filename' under their namespace checkboxes--they just have this 'get all tags', which works for everything (so watching yiff.party pages should now get tags)
- simplified and sped up similar files search at the db level
- sped up some ratings search code
- generalised some common file search optimisations, meaning they now apply in more situations and can take advantage of some other speed-ups:
  - similar files system predicate is now faster
  - inclusive ratings searches are now faster
  - duplicate relationship count searches with non-zero-inclusive count are now faster
- removed some clumsy old ratings search optimisation code
- exporting serialised objects as pngs is a bit easier--now, it displays current export path better, will remember the last export location used, and for single png exports will pre-fill the filename and 'title' value with a reasonable default
- the content parser, page parser, and url class listctrls now accept serialised png files when drag and dropped!
- the simple downloader should recover and continue better from malformed urls during a page parse
- the url downloader should now recover better from various situations where it cannot derive some tag import options (including urls with a 'file' url class, such as 4ch/8ch direct file links)
- parse test results will now state the priority value of urls
- gave the 'updating' section of help a pass and wrote a little more on how to do a big-version-gap update
- when a new multi-file import object inserts its child file import objects while being looked at in the ui, the listctrl should now correctly refresh the displayed indices
- subscriptions will now wait up to 90s for bandwidth (was 30s before, I think) before quitting, which should avoid a few more early-quit events
- cleaned up some server decompression bomb testing
- users with admin-level accounts can now upload decompression bombs to file repositories; better options on this will be available in future
- the manage urls dialog will now OK on the same 'manage_file_urls' shortcut action that can open it (like manage tags and ratings already do)
- fixed the string converter for new file lookup parsing scripts
- started work on some in-the-background mass file reparsing, but I want to get some nicer ui going before I pull the trigger on any of it
- file reparsing now repopulates the table for md5, sha1, and sha512 hashes if they are missing
- improved some ffmpeg error parsing
- moved from basic list to a pop-faster collections.deque for importable path parsing and duplicate search branch regen
- added a BUGFIX option to options->gui that forces minimum width for popup messages in the continuing attempt to deal with some funny fit/layout calculation in certain Linux WMs
- fixed how some 'unrepairable db' error messages are displayed in Linux systems
- cleaned up a ton of old tuple-stripping code from the db
- updated to new sqlite for windows build
- misc improvements
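For anyone curious about the list-to-deque change mentioned in the list above, here is a minimal sketch of the general idea. It assumes the work queue is consumed from the front. The function names and example paths are hypothetical and this is not the actual hydrus code.

```python
import collections


def drain_with_list(paths):
    """Consume a work queue from the front using a plain list."""
    pending = list(paths)
    done = []

    while pending:
        # list.pop(0) shifts every remaining item, so each pop is O(n).
        done.append(pending.pop(0))

    return done


def drain_with_deque(paths):
    """Consume the same queue using collections.deque."""
    pending = collections.deque(paths)
    done = []

    while pending:
        # deque.popleft() removes from the front in O(1).
        done.append(pending.popleft())

    return done


# Hypothetical paths, for illustration only.
example_paths = ['a.jpg', 'b.png', 'c.webm']

print(drain_with_list(example_paths) == drain_with_deque(example_paths))
```

Both versions yield the items in the same order. The difference only shows up in speed once the queue gets long.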
next week
I was getting a bit too tired before the break, so I gave myself a bit more sleep every day this week (and spent a bit more time keeping up with messages, my ongoing battle), and it worked well. It felt good to get back to it. I now want to hammer out the last outstanding parsers and get into the meat of the gallery parsing overhaul.