windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v420/Hydrus.Network.420.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v420/Hydrus.Network.420.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v420/Hydrus.Network.420.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v420/Hydrus.Network.420.-.Linux.-.Executable.tar.gz
I had a great week fixing a whole bunch of bugs.
bugs
I fixed taglist drag-select, which was not moving the to-be selected indices down with the scroll. Sorry for the trouble here. You can now also ctrl+drag-select to deselect.
There was a bug in the new virtual siblings and parents lookup system that meant some grandparents and siblings were not appearing. For instance, for parents, 'samus aran' might have 'metroid', and 'metroid' would have 'nintendo', but 'samus aran' would not have 'nintendo'. Thanks to help from users, I was able to reproduce it and fix the problem. When you update, the client will spend a few seconds regenerating the lookups and finding the missing links. It will queue up a bit more work for the background display sync to do later on. In my test situation, the PTR went from 189,000 sync rows to 192,000.
Autocomplete results in manage tags now list parents beneath every tag they apply to. Previously, parents could only exist in the list once, so they were accidentally de-duping and only ending up beneath the last applicable tag. Now you get plenty.
Also, these 'write' autocomplete results now show sibling and parent data for the tag that matches your input text even if that tag has no count. When that tag has sibling data, all the siblings are loaded as well. This sounds obscure, but you'll notice when you next start entering tags on a service with a lot of siblings. It should make it easier to quick-type complicated tags in manage tags.
Typing to get autocomplete results in search pages with thumbnails is now significantly faster and more responsive when 'searching immediately' is turned on. This routine gathers count-accurate results based on the thumbnails in front of you and then sends that data down to the database for sibling population. This suddenly got laggy with the new virtuals system, but I have improved how it schedules its searches and performs its tag lookups, and it should be much better now.
An unusual situation with a 'null character' tag that could not be displayed in the client should be better. This tag is detected, the rendering problem caught, and the user notified. The database routine that corrects bad tags now also fixes this tag and forces a tag presentation refresh once it is done.
speed
I spent a bunch of time in the siblings and parents system this week, and autocomplete and tags in general, running profiles on complex data and optimising various calls. I also sped up a critical new optimisation used across the program. Tag searches, autocomplete lookups, tag processing, tag display sync, and anything that runs frequently in small amounts should now be a bit faster. The only thing that may run a bit slower is tag display sync for very complicated tag parent situations, due to the more thorough logic in the fixes above. Thank you to users who helped with profiling here.
Tag display sync now only tries to run every 30 seconds (up from every ~3 seconds) when allowed to run in 'normal' time. It also takes breaks when it thinks a bunch of other work is going on. It was running hotter and faster than we needed, and it made the client too laggy, so I am pulling it back until I can make it more intelligent. I am also considering breaking the main display sync maintenance job into even more granular pieces. I will keep working.
full list
- misc:
- fixed the bad position indexing when drag-selecting taglists that were scrolled down. this also caused some weird selection when scrolled and clicks included a little mouse movement. sorry for the trouble!
- ctrl+drag-select now deselects!
- fetching tag autocomplete results when you have thumbnails and 'searching immediately' on, which has been way too slow recently, now cancels much faster. in some large page situations, it was adding multi-second lag on the first character-press. it also runs faster overall
- hydrus should now deal better with invalid tags that contain the null character (there it one we know about on the PTR, from a decode of botched Shift JIS, which could crash the client from too many errors during critical paint periods). when a tag like this turns up in a taglist, thumbnail, or canvas background, it now renders as an appropriate 'invalid tag' string, and a one-time 'woah, bad tag, run fix tags now' popup appears
- regular tag cleaning now looks for and removes null characters, so all new sources of these bad tags should now be eliminated
- _database->check and repair->fix invalid tags_ now fixes tags with the null character. it also fixes tags so broken that after cleaning they have no subtag left. it also now forces a full media tag reload when it is done for all media
- the 'regen storage mappings', 'regen display mappings', and repopulate from cache' database routines now have an additional step where you can order them only to work on one tag service, so regenning or repopulating local tags, which usually takes a couple seconds, doesn't need to wait two hours for the PTR to go as well
- added some menu help to the 'profile modes' debug menu, and gave 'reducing program lag' help page a pass
- fixed virtual display regeneration on service delete
- .
- parents and siblings:
- fixed situations where some grandparent and sibling relationships would not appear in the virtual system. it was a bug when certain links of a multi-part display 'chain' were updated at different times. when repopulating chain data, the sibling and parent update routines now correctly chase their complete chains both when wiping ideal data and repopulating from raw data, hitting all levels of the chain, ensuring to go back up and down chains when there are multiple grandsiblings/children/parents, and chasing parents where one or both members have better siblings. thank you to the several users who reported and helped figure out this problem, which was not simple to reproduce (issue #725)
- your ideal display data will be regenerated on update, which should not take more than a couple of seconds. it will likely correct some siblings and add some grandparents to be filled in by the siblings/parents sync. my PTR test environment went up from about 189,000 display rows to 192,000
- while sibling and parent lookup is more thorough (and hence more expensive), I also optimised many parts of lookup week. I believe tag display sync and tag processing will be much faster for tags with simple sibling and parent relationships, and slightly slower for tags with complex relationships and many instances to files on your drive. as always, let me know what sort of processing speeds and lag you get, and if you know how to make a db profile, please send them in when it gets bad
- when a 'write' autocomplete results list includes parent expansion rows (as in _manage tags_), parents now show duplicated and properly for all the tags that have them, including siblings and other children/grandchildren (previously, a parent label could only exist once in a list, which meant parents were ending up hanging off the last valid tag for which they applied)
- 'write' autocompletes now show results that exactly match the text entry, and all their siblings, when they do not have count but do have sibling or parent data. so, if you type in 'samus aran', and it has a sibling to 'character:samus aran', but 'samus aran' doesn't actually have count, you now get it and all siblings anyway. this may need tuning, but it solves a persistent and annoying lookup and quick-sibling-access problem in _manage tags_
- copying tags and their indented parents now removes the parent indent whitespace
- tag sync display now takes way longer breaks (now 30 seconds, was 2.5) between 'normal' background work periods. this thing was hammering people far harder than needed and could clog up db write/commit time and nobble UI responsitivity when big bumps collided
- the tag display maintenance manager now also tries to detect when many siblings or parents are streaming in (from a migration or a repository process with a heap of data), and pauses work while that continues
- greatly sped up mass imports of sibling and parent data, either from tag migration or big dialog pastes. what was 40 rows/s should now be about 1,000 rows/s
- fixed the database menu's 'regenerate tag parents lookup cache', which wasn't hooked up
- .
- boring changes:
- gave tag parents and siblings update, regen, and chain fetch a full pass, correcting bad queries to fix the above, fixing raw pair chain level navigation and parent-sibling idealisation, and optimised these lookups as well
- fixed some tag_id vs ideal_tag_id nomenclature (and related bugs) in tag parents cache
- optimised 'all known tags' autocomplete count fetching a little. tag autocomplete and search should be a bit faster in this domain
- reduced display sync pre-processing overhead by about 30% with a better random pair sampling routine
- reduced the overhead of my now very commonly used single integer memory table select optimisation. this now recycles tables after use, which reduces overhead about 50% in small number scenarios. all features of the database will enjoy this speed improvement, particularly small repetitive tag lookup jobs (such as the new display sync and repository tag processing)
- reduced overhead on some sibling chain lookup code
- reduced overhead on the sibling lookup used by manage tag dialog taglist
- reduced overhead on some parent chain lookup code
- tiny optimisation on single sibling chain lookup
- sped up the ancient OG single tag->tag-id fetching routine, seems to work about twice as fast now
- more misc optimisations, mostly list/set/dict comprehension rewriting to reduce overhead, across virtual sibling and parent code
- added a full combined siblings and parents unit tests for the main missing parent chain link problem solved this week
- added a full combined siblings and parents unit test for large real world data added in multiple pieces
- 'a file identifier was missing!' critical errors now print a stack trace to the log for further debugging info
- updated the 'help my db is broke.txt' document with a couple new comments
next week
I want to do some code cleanup, catching up on bad old code and making the duplicate potentials search non-interrupting. I'll also prototype a database mode that may improve performance on HDDs. Other than that, I want to properly plan and start work on the big network improvements that I will finish the year on.