A small addendum, since I've found something tremendously useful that other people should know about.
With Subs2SRS and "Xavier's Retimed JP Sub Pack v01" I've been experimenting with a supplemental deck for audio comprehension, and I've had very promising results so far.
First, spoken words seem to be much easier to learn when they're spoken in a context that's actually interesting to me. Testing both reading and listening on the same sentences, with the interval modifier cranked up to 200%, has made them easier to learn with far fewer reviews overall.
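To give a rough feel for why a higher interval modifier cuts the review count, here is a toy calculation. It is only a sketch: it assumes every review is answered "Good", a fixed ease of 2.5, and a first interval of one day, and it is not Anki's actual scheduler. The function name and parameters are made up for illustration.

```python
def count_reviews(days, first_interval=1.0, ease=2.5, interval_modifier=1.0):
    """Count reviews that fall within `days` under a simplified model
    where each new interval = old interval * ease * interval_modifier."""
    elapsed = first_interval   # day of the first review
    interval = first_interval
    reviews = 0
    while elapsed <= days:
        reviews += 1
        # Assumption: every answer is "Good", so the interval always grows.
        interval = interval * ease * interval_modifier
        elapsed += interval
    return reviews


if __name__ == "__main__":
    print("100% modifier:", count_reviews(365, interval_modifier=1.0))
    print("200% modifier:", count_reviews(365, interval_modifier=2.0))
```

In this toy model a single card needs fewer reviews in a year at 200% than at 100%. Real numbers will differ, since lapses and ease changes are ignored here.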
Second, I am saving an incredible amount of time making my cards. Subs2SRS spits out everything nicely formatted into fields with audio and images automatically, and I then import that into a custom note type that generates both a listening and a reading comprehension card from each note. All I have to do manually is go through the browser, mark the sentences I'm interested in, and add a definition; 90% of the tedium is done for me. No pitch lookup or coloration is required since I can just listen to the audio.
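For anyone curious what "two cards from one note" amounts to, here is a minimal sketch in Python. In Anki itself this is handled by a note type with two card templates, not by a script; the code only illustrates the idea. The column order in FIELDS and the sample row are assumptions made up for the example, since the real layout depends on your Subs2SRS settings.

```python
import csv
import io

# Hypothetical column order; check your own Subs2SRS output before relying on it.
FIELDS = ["tag", "sequence", "audio", "image", "expression"]

# Made-up sample row standing in for one line of a Subs2SRS TSV export.
SAMPLE = (
    "Show_01\t001\t[sound:Show_01_001.mp3]\t"
    '<img src="Show_01_001.jpg">\t今日はいい天気ですね\n'
)


def make_cards(tsv_text):
    """Turn each TSV row (one note) into a listening card and a reading card."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    cards = []
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # skip rows that don't match the assumed layout
        note = dict(zip(FIELDS, row))
        # Listening card: audio on the front, sentence and image on the back.
        cards.append({
            "type": "listening",
            "front": note["audio"],
            "back": note["expression"] + "<br>" + note["image"],
        })
        # Reading card: sentence on the front, audio and image on the back.
        cards.append({
            "type": "reading",
            "front": note["expression"],
            "back": note["audio"] + "<br>" + note["image"],
        })
    return cards


if __name__ == "__main__":
    for card in make_cards(SAMPLE):
        print(card["type"], "|", card["front"])
```

The definition field is left out on purpose, since that is the part I still fill in by hand.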
Just something you all may want to try out if you make your own cards.