UPDATE: Fixed in 3.1 branch
Word to the wise about the fast random8/16 functions: they have almost no independence in the value stream. This may be fine for many things, but I have already run into a number of cases where the results are not what you would expect from a random source. A number of times I just presumed I was doing something wrong, but I have now isolated a good test case. It manifests most dramatically if you test one value for a probability and then use the next value for something visible.
Take the example below.
void rain_bad() {
  memmove( &leds[1], &leds[0], (NUM_LEDS-1)*sizeof(CRGB) );
  // 1 in 60 chance per loop
  if ( !random8(60) ) {
    // the color returned by a successive call to random8() will always
    // fall in a small range of hues, in this case yellow/orange
    leds[0] += CHSV( random8(), 255, 255 );
  } else {
    leds[0].fadeToBlackBy( 50 );
  }
}
Due to the lack of independence between subsequent calls, it appears unwise to use one call to generate a probability and the next to generate a color/hue value. I have tried several variations, including testing for values other than zero ( 1, 30, 59 ), and they all result in the same very narrow output from the next call. Adding additional calls to random8 causes the hues to deviate slightly more, but they remain pretty narrow.
I ended up altering the function, replacing the less frequently called random with the system copy. Slower, but guaranteed not to be linked to the previous call.
void new_rain_working() {
  memmove( &leds[1], &leds[0], (NUM_LEDS-1)*sizeof(CRGB) );
  if ( !random8(60) ) {
    leds[0] += CHSV( random(), 255, 255 );
  } else {
    leds[0].fadeToBlackBy( 50 );
  }
}
So if random8/16 isn’t behaving like random numbers, take a look at your usage patterns and make sure you’re not falling victim to the high dependence in the number stream.
Thanks for the write up!
Are you using the latest 3.1 code from github? A significant randomness improvement went in on November 18; discussion was here: https://plus.google.com/112916219338292742137/posts/184Qd9fHK6g
If we’re still seeing poor sequential randomness even after the November 18 commit, I’ll reopen ticket number 82 (https://github.com/FastLED/FastLED/issues/82) and rework the code. If that’s the case, please do let me know.
random8/random16 are intended to be fast enough for use on every pixel, and “random enough”, with a sixteen-bit entropy pool. The Arduino library has more entropy and is fast enough to use “once per loop”, but not “once per pixel per loop”.
Pulled in random8() from 3.1 (since that’s what changed)… can confirm, results are far better. First impression is that it is resolved (for now). I will update my post to reflect that this affects 3.0.2 only. Sorry I’m late to the game and rehashing some of these issues.
P.S. On AVR, FastLED’s memmove8 is faster than Arduino/libc’s memmove. (We also provide memcpy8 and memset8, which are faster as well.) And if I were a more patient man than I am, I’d submit a minimal patch to avr-libc and argue that the one additional instruction is worth the 20%+ speedup, but the last time I saw that sort of discussion, it was shut down pretty fast with a rousing chorus of “every byte is precious”. And they’re right: storage is finite, and clock cycles go on seemingly forever. But on the other hand, if you’re running on a battery, clock cycles may be even more precious than one instruction’s worth of storage.
Oops, sorry for the rant; just one “old programmer” to another.
Anyway: memmove8 is there, and faster. Help yourself if you wish.
Not to worry- and it’s great to have you here, helping to make the library better and better. Seriously: thanks for the tire-kicking!
Oh, and of course the problem existed in ALL previous versions of the library, from 2.0 through 3.0.x.
Hmm… wonder if ent would be “more happy” about the values if the two halves were XOR’ed together. I’ll run it and see.
Actually, generally accepted wisdom in the crypto/hashing/randomness business is that adding is generally better than XOR’ing. Didn’t try that here though, so I’m curious what you find.
The other thought I have is converting the whole thing to assembly so that I can capture the overflow bit of the 16-bit multiply and add that into the low bit of the entropy sum with ADC vs ADD; captures one more bit and doesn’t cost any additional cycles at all. I’d have to test it both ways.
Anyway please do post your results if you test XOR vs ADD using ent (or something else).
ent agrees with you; just tested + and ^ with this little snippet.
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <inttypes.h>

#define RAND16_2053  ((uint16_t)(2053))
#define RAND16_13849 ((uint16_t)(13849))

uint16_t rand16seed;  // = RAND16_SEED;

uint8_t random8()
{
  rand16seed = (rand16seed * RAND16_2053) + RAND16_13849;
  // return the sum of the high and low bytes, for better
  // mixing and non-sequential correlation
  return (uint8_t)(((uint8_t)(rand16seed & 0xFF)) +
                   ((uint8_t)(rand16seed >> 8)));
}

int main()
{
  srand(time(NULL));
  rand16seed = rand();
  for (int i = 0; i < 65536; i++) {
    putchar(random8());
  }
}
Nice. How big a difference does it make? My guess would be that it is definitely measurable, but not terribly huge.
The difference between + and ^ was very large.
^ = Serial correlation coefficient is -0.012836 (totally uncorrelated = 0.0).
So if you can capture the overflow without any additional cost, it would improve things, but it’s a very small improvement.
This was my test snippet.
uint8_t random8()
{
  rand16seed = (rand16seed * RAND16_2053) + RAND16_13849;
  // return the sum of the high and low bytes, for better
  // mixing and non-sequential correlation
  uint16_t inter = (uint16_t)(((uint16_t)(rand16seed & 0xFF)) +
                              ((uint16_t)(rand16seed >> 8)));
  if ( inter & 0x100 ) {
    return (uint8_t)++inter;  // fold the carry bit back into the low bit
  } else {
    return (uint8_t)inter;
  }
}
Great, great data. Thank you!
I think I’ll not worry about the overflow bit trick today. Keeping the code in unforked, pure C helps with all kinds of things around here. Maintenance. Readability. Reuse. Inspection/validation.
Anyway, thanks very much for the data and digging.