Xoroshiro128+ Fails PractRand Even When Truncated
Although I know a lot of effort went into Xoroshiro128+, and there are many good things that have come out of its development, I am sad to say that on balance I feel it has too many flaws to be worth recommending---there are many better choices. In this post, I'll dig a little deeper into some of its flaws.
Let's begin with what we already know:
* John D. Cook showed that it fails PractRand with just a couple of seconds of testing. Before that, Chris Doty-Humphrey, author of PractRand, showed that it didn't just fail “binary rank” tests; it also failed the DC6 test, which is a short-to-medium-range linear test.
* Daniel Lemire showed that it fails TestU01.
* I showed that it is trivially predictable.
* I showed that visualizing smaller-scale versions of the scheme reveals clear flaws.
But are the flaws superficial and easily ignored, or more troubling than that?
Evolving Author Caveats
The authors of Xoroshiro128+ do acknowledge some of these flaws. The source code admits (somewhat obliquely) that it fails PractRand's binary-rank tests, but suggests that the problem is confined to just the lowest few bits. Over time, however, these claims have been progressively weakened. The source for this generator used to say:
>with the exception of binary rank tests, which fail due to the lowest bit being an LFSR; all other bits pass all tests.
On 14 October 2017, the comments in the source were revised to say:
>with the exception of binary rank tests, as the lowest bit of this generator is an LFSR. The next bit is not an LFSR, but in the long run it will fail binary rank tests, too. The other bits have no LFSR artifacts.
Less than six weeks later, on 29 November 2017, the comments were revised yet again, to what is now their current wording (archive):
>with the exception of binary rank tests, the lowest bit of this generator is an LFSR of degree 128. The next bit can be described by an LFSR of degree 8256, but in the long run it will fail linearity tests, too. The other bits needs a much higher degree to be represented as LFSRs.
This version is quite a climbdown: it goes from saying the other bits have no LFSR artifacts to admitting that, actually, yes, they do. But the implication remains that only the lowest few bits fail statistical tests in practice. If that is the case, it means that using Xoroshiro128+ to generate 48 bits of randomness (throwing away not just the lowest two bits, but the lowest 16) is fine. But you have to wonder: is that really true?
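To make "throwing away the low bits" concrete, here is a minimal sketch in C of what such truncation looks like. It assumes the rotation and shift constants 55, 14 and 36 from the generator's reference source of that period; the type and function names (`xoroshiro128plus_state`, `xoroshiro128plus_high48`, `xoroshiro128plus_high32`) are hypothetical names chosen for this illustration, not part of any official API.

```c
#include <stdint.h>

/* Two 64-bit words of state; they must not both be zero. */
typedef struct {
    uint64_t s[2];
} xoroshiro128plus_state;

/* Rotate x left by k bits (0 < k < 64). */
static inline uint64_t rotl64(uint64_t x, int k) {
    return (x << k) | (x >> (64 - k));
}

/* One step of Xoroshiro128+, assuming the constants 55, 14, 36. */
static uint64_t xoroshiro128plus_next(xoroshiro128plus_state *st) {
    const uint64_t s0 = st->s[0];
    uint64_t s1 = st->s[1];
    const uint64_t result = s0 + s1;   /* the "+" output function */

    s1 ^= s0;
    st->s[0] = rotl64(s0, 55) ^ s1 ^ (s1 << 14);
    st->s[1] = rotl64(s1, 36);
    return result;
}

/* Hypothetical helper: discard the lowest 16 bits, keep the upper 48. */
static uint64_t xoroshiro128plus_high48(xoroshiro128plus_state *st) {
    return xoroshiro128plus_next(st) >> 16;
}

/* Hypothetical helper: discard the lowest 32 bits, keep the upper 32. */
static uint32_t xoroshiro128plus_high32(xoroshiro128plus_state *st) {
    return (uint32_t)(xoroshiro128plus_next(st) >> 32);
}
```

Both helpers advance the generator by exactly one step per call; the only difference from normal use is how many low-order bits of each 64-bit output are discarded.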
Conclusions
Of course, few people bother to run a test that lasts 30 weeks and consumes half a petabyte of random numbers, but the point remains: Xoroshiro128+ has detectable flaws even if you throw half of its output, 32 bits, away.
It seems reasonable to surmise that if you only throw 16 bits away, leaving 48 bits, you'll see statistical flaws sooner. When I get around to it, I'll test that out too, but even if it takes less than 30 weeks of testing, it might need more than the typical week of testing to unmask the issues.
It's my understanding that Vigna mostly dismisses linearity-related failures as overly technical, and has even gone so far as to argue with the author of PractRand as to whether BCFN and DC6 are valid tests at all (even though other PRNGs pass them just fine). But I'd rather use a PRNG where you don't have to worry as much. There are plenty of them out there, including some very old ones.
I left out the Tests sections. Read them here:
http://www.pcg-random.org/posts/xoroshiro-fails-truncated.html