I think as long as you can follow the narrative, it generally doesn't matter whether you can literally "see" what is being described or not.
If you already know, or remember from past listenings, what a file will ask you to visualize, it can help a little to look at photos or videos of that scene just before listening, especially for people who have serious problems imagining images. If there's a staircase countdown, you might also want to actually walk down some stairs (if there are any nearby) shortly before listening to keep that experience fresh in your mind.
As for myself, the problem with "walking on a beach" kind of imagery isn't so much that I can't imagine a beach (even if it's nowhere near photorealistic in my imagination); it's that I'm lying down while listening and have trouble imagining my feet moving.
Another scene I have problems with is forests. I can imagine a tree. I can imagine two trees. Maybe even three. But it's hard to imagine dozens of trees at the same time, so my mental forests end up looking like large empty rooms with tree trunks as pillars, an orange and green carpet, and a vaguely defined ceiling of leaves.