How to Test Randomness of Numbers

At first i said the answer was to check this out: Diehard Battery of Tests of Randomness which is linked to by this page which may also be of interest: Tests for Random Number Generators.

But apparently that is the “old way” and there is a new program from NIST that you can get here: NIST test suite for random numbers, which subsequently is linked to from random.org: Random.org Statistical Analysis.

Getting that program from NIST to compile was a little bit of a chore for me on msvc 2010. The biggest hurdle i hit was that msvc 2010 doesnt have erf() and erfc() so i had to google “erf.cpp” and find an implementation. If you can’t find one, erf and erfc are part of gcc which is open sourced so you can always go that route if you need to!

After compiling, i was able to run the test on my numbers but couldn’t make much sense of the results very easily. There were a few p scores and presumably some chi squared scores somewhere, but the “summary file” was very cryptic (pun intended) so i wasn’t really sure…

Anyways, just wanted to put it here for myself and others if anyone’s looking for this in the future 😛

Thanks to my buddy James for the correction and links to the newer NIST program. Thanks man!

Bottom Line

Interestingly, the tests above use the source number data to do a bunch of different things, and then measure the statistics of the results.

For instance, it will use the numbers to shuffle a deck of cards, and then it will play poker and see if there is any bias of cards dealt, or players winning.

Or, it will use the numbers as the source of numbers for a roulette wheel and see if players win at the right rate statistically.

I guess the bottom line lesson for testing random numbers is that you should use the numbers how you intend to use them, and see if there’s any statistical anomalies.

There doesn’t seem to be a magic bullet test that works for generic randomness, but I’m guessing it’s just sort of… check for patterns in every way you can, or every way you care about, and if you don’t find any, consider it random. If you are using it for purposes where randomness really matters – like security or gambling – you then hope nobody else finds a pattern you didn’t! 😛

On that topic, check this out: Wikipedia: Michael Larson

External C++ Header Guards

A buddy at work said “I wish C++ had a two pass pre processor so that we could do external header guards”. It got me thinking about some random macro stuff i had seen before and i thought “hrm… you know, that actually might be possible to do… i’m going to give it a try”.

I ended up working something up tonight at home that’s semi-palatable. The way you use it is a little weird, but i think it satisfies the spirit of the challenge, and works as a proof of concept that you can do external header guards without having to type a bunch of stuff.

If you can think of a way to improve it, post a comment or something and let me know!!

Umm…External Header Guards? What are you Talking About??

Have you ever seen something like the below? it’s called a header guard:

#ifndef BLAH_H
#define BLAH_H

// code goes here

#endif BLAH_H

You might also have seen this variation:

#pragma once

Without the header guards, if you include the header file twice, it will complain that the classes etc have already been defined. Those make sure that doesn’t happen, by only including the contents of the file if it hasn’t already been included.

External header guards on the other hand would be guards in the place it’s included instead of in the header file itself. That is more typing (more work), but the benefit there is that the compiler doesn’t have to open the header file at all to see if it’s already been included, which could make for faster compile times in large projects.

Anyways, here’s the code:

Main.cpp

// include testoneblah.h which defines the typedef ProofIncluded__blah_h
// to prove that it was really included
#define FILESEQ (testone)(blah)
#include "Includer.h"

// try to include testoneblah.h again.  It won't get included again, and
// instead, ProofIncludeBlocked__blah_h will get typedef'd by includer.h
// to prove that the file did not get included.  Comment these lines out
// and you'll get a compiler error that ProofIncludeBlocked__blah_h is
// an undeclared identifier
#define FILESEQ (testone)(blah)
#include "Includer.h" 

int main(int argc, char **argv)
{
	ProofIncluded__blah_h a;
	ProofIncludeBlocked__blah_h b;
	return 0;
}

So it is a little weird… but to include a file, you define FILESEQ with a directory and filename (without .h on it), and then include “Includer.h”. Even though it’s weird to use, and doesn’t work for .inl files (and maybe other issues, some easily solved), it’s only one extra line of typing to do an external header guard, which is about as good as you can expect.

Ideally I wish the interface were like the below, but I haven’t been able to figure out how to make that work unfortunately.

IncludeFile_((testone)(blah))

Includer.h

//=====================================================================================================
// Rip off boost, hooray!!  boost_pp is really nice, you can just grab it from the boost bundle and
// start using it because it's just a bunch of includes.  you don't need to build or link with boost
// at all.  It's really nice.  http://www.boost.org/
//=====================================================================================================
# define BOOST_PP_EMPTY()
# define BOOST_PP_SEQ_ELEM(i, seq) BOOST_PP_SEQ_ELEM_I(i, seq)
# define BOOST_PP_SEQ_ELEM_I(i, seq) BOOST_PP_SEQ_ELEM_II((BOOST_PP_SEQ_ELEM_ ## i seq))
# define BOOST_PP_SEQ_ELEM_II(res) BOOST_PP_SEQ_ELEM_IV(BOOST_PP_SEQ_ELEM_III res)
# define BOOST_PP_SEQ_ELEM_III(x, _) x BOOST_PP_EMPTY()
# define BOOST_PP_SEQ_ELEM_IV(x) x

# define BOOST_PP_SEQ_ELEM_0(x) x, BOOST_PP_NIL
# define BOOST_PP_SEQ_ELEM_1(_) BOOST_PP_SEQ_ELEM_0
//=====================================================================================================
#define EB_COMBINETEXT(a, b) EB_COMBINETEXT_INTERNAL(a, b)
#define EB_COMBINETEXT_INTERNAL(a, b) a ## b
#define EB_TOTEXT(a) EB_TOTEXT_INTERNAL (a)
#define EB_TOTEXT_INTERNAL(a) #a
//=====================================================================================================

// extract the directory and file name
#define DIR BOOST_PP_SEQ_ELEM(0, FILESEQ)
#define FILE BOOST_PP_SEQ_ELEM(1, FILESEQ)

// create the full file name: /.h
#define THEFILENAME EB_TOTEXT(EB_COMBINETEXT(DIR, EB_COMBINETEXT(/, EB_COMBINETEXT(FILE, .h))))

#if !EB_COMBINETEXT(__, EB_COMBINETEXT(FILE, _h))
	//including file: YES
	//include the file
	#include THEFILENAME
#else
	//including file: NO
    //create a typedef called ProofIncludeBlocked___H to prove we didn't include the file
	typedef char EB_COMBINETEXT(ProofIncludeBlocked__, EB_COMBINETEXT(FILE, _h));
#endif

// clean up the things we created.  boost macros can stick around ::shrug::
#undef THEFILENAME
#undef DIR
#undef FILE

// defined by caller, but we're cleaning it up for convenience
#undef FILESEQ

I ripped some macros out of the boost preprocessor library (boost_pp) to help things out a little bit. In a nutshell what we are doing is this…

We test to see if the preprocessor value __<File>_h is not true (false, or undefined). If that is the case, we include the file <Directory>/<File>.h. Else, we define a typedef ProofIncludeBlocked<File>_h to prove that we blocked the include from happening.

Blah.h

#define __blah_h 1
class ProofIncluded__blah_h {};

Blah.h defines __blah_h as 1 (true). It’s important that it uses the same naming convention as Includer.h does (__<File>_h), otherwise this setup won’t work. If you screw it up, you’ll get compile errors about multiply defined symbols.

This file also defines a class ProofIncluded__blah_h to prove that this file actually got included, and also defines something that will complain if the file is included twice.

Issues

So, this is just a proof of concept and it has some issue including…

  • Duplicate file names – if you have the same file name in different folders, this setup has issues. It might be able to be helped by including the directory name into the header guard preprocessor symbol.
  • Referencing the same file different ways – if you reference the same file in different ways because there’s multiple ways to reach it in the include search paths, it won’t be able to tell that it’s the same file if you do the last fix. Maybe the real solution is to have another parameter defined specifying the header guard symbol? dont know…
  • Only supports .h files – It assumes a .h extension, but maybe another parameter could be the file extension to use so you could include .inl, .hpp etc.

Hopefully you find it interesting at least though (:

Is it Worth It?

Poday made some good points in the comments about it not being worth it, and my friend Doug also has this to say:

It’s not needed though all compilers already do that:

(GCC):
The GNU C preprocessor is programmed to notice when a header file uses this particular construct and handle it efficiently. If a header file is contained entirely in a `#ifndef’ conditional, then it records that fact. If a subsequent `#include’ specifies the same file, and the macro in the `#ifndef’ is already defined, then the file is entirely skipped, without even reading it.

(Clang)
The MultipleIncludeOpt class implements a really simple little state machine that is used to detect the standard “#ifndef XX / #define XX” idiom that people typically use to prevent multiple inclusion of headers. If a buffer uses this idiom and is subsequently #include‘d, the preprocessor can simply check to see whether the guarding condition is defined or not. If so, the preprocessor can completely ignore the include of the header.

(Clang still)
clang_isFileMultipleIncludeGuarded – Determine whether the given header is guarded against multiple inclusions, either with the conventional #ifndef/#define/#endif macro guards or with #pragma once.

For MSVC all I could find is Herb Sutter lead architect for MSVC and head of the C++ committee in his book ‘C++ Coding Standards: 101 Rules, Guidelines, and Best Practices’:
24. Always write internal #include guards. Never write external #include guards.
With a reason of:
Don’t try to be clever: Don’t put any code or comments before and after the guarded portion, and stick to the standard form as shown.
Today’s preprocessors can detect include guards, but they might have limited intelligence and expect the guard code to appear exactly at the beginning and end of the header.

Alloca and Realloc – Useful Tools, Not Ancient Relics

If you are a C/C++ programmer, you are likely familiar with malloc() and free(), the predecessors to C++’s new and delete operators, as well as the existence of the variations of malloc such as calloc, realloc and alloca.

If you are like me, you probably thought for a long while that malloc and it’s variations were relics of days gone by, only actually useful in a few very limited situations. Some of these guys still have use though, and don’t really have equivalents in C++ to replace them.

First the boring ones…
malloc – Allocates memory. Precursor to new operator.
calloc – Allocates memory and sets the contents to zero. C’s answer to the problem of uninitialized memory that constructors solve in C++.

Now the more interesting ones!

Alloca

Believe it or not, alloca actually allocates memory on the stack. When your function goes out of scope, the stack memory is automatically returned to the stack due to the nature of how the stack and stack pointer work. No need to free the memory allocated with alloca, and in fact if you tried, you’d probably get a crash 😛

If you are a programmer who writes high performance applications, you are probably familiar with the benefits of using the stack instead of allocating memory on the heap with new or malloc.

The benefits of using the stack include…

  • Quicker allocations – Allocating memory can be a relatively costly operation in terms of time, especially if you have multiple threads running using the same (thread safe) allocator. Allocating memory on the stack is essentially the same cost as defining a local variable. Under the hood, it’s just moving the stack pointer a little farther and gives you that empty space to use.
  • No memory leaks – when the function you’ve allocated the stack memory in exits, that memory is automatically freed. This is because the stack pointer just “moves back” to what it used to be. There is not really any memory to free.
  • Less memory fragmentation – When mixing large and small memory allocations and frees, sometimes you end up with your memory in a state where there is a lot of memory free, but just not all together in one place. For instance, your program might need to allocate 50MB, and there may be 300MB free on the heap total, but if there are small 16 byte allocations littered in the memory every 10MB, your program won’t be able to find a single 50MB region to allocate and the allocation will fail. One common cause of this problem is small allocations used for things like relatively small arrays or small string buffer allocations that exist temporarily to copy or transform some data, but are not meant to stick around very long. If you can put these on the stack instead of the heap, those small allocations don’t hit the heap, and your memory will be less fragmented in the end.
  • Increased performance (fewer cache misses) – the contents of the stack are likely already in the CPU cache, so putting your data there means less information for the CPU to have to gather from RAM which is a slow operation.

However, there are some dangers when allocating memory on the stack as well

  • If you keep a pointer to the memory, that memory could be “freed” and re-used, getting filled with other random data (local variables). That can cause crashes, memory corruption or other strange program behavior.
  • If you allocate too much on the stack you could run out of stack space. The stack isn’t really meant to hold large amounts of allocated data. You can adjust your programs stack size though if this is a route you want to pursue.

Alternatives

There are some common techniques I’ve seen people use in places that could have also used alloca instead. These include…

  • Small Pool Allocators – To get around the memory fragmentation problem, sometimes people will have different memory allocators based on the size of memory being allocated. This way, small temporary allocations for things like temporary string buffers will all be allocated from one place, while larger allocations for things like textures will be allocated elsewhere. This dramatically improves the memory fragmentation issue.
  • Object Pools – Object pools are similar to small pool allocators but they work by allocating some amount of memory for specific types of objects, and have a way to remember which objects are used and which ones are free. For instance, you may dynamically allocate an array of 100 SMyStruct objects and have a flag for each to know which ones are in use and which ones aren’t. This way, the program can ask for a new object, and it can find one currently not in use and return it to the caller without needing to hit the ACTUAL memory allocator to get the data (unless all objects are spoken for, at which point it can choose to fail, or allocate a new “page” of objects to be able to hand out). This also has an interesting side effect that cache misses can drop quite a bit since the same kinds of objects will be nearer to eachother in memory.
  • DIY Stack Allocator – When I was working at Midway, a friend (Hi Shawn!) profiled the animation code and found that a lot of time was spent in allocating temporary buffers to blend bone data together. To fix this, he rolled his own stack allocator, where there was one contiguous piece of memory on the heap that could be allocated from. There was an internal index keeping track of where the “top of the stack” was, and when memory was allocated, that stack index would just move up by however many bytes were asked for. At the end of the frame, the stack index was reset to zero, thus “freeing” the memory. This dramatically improved the animation system performance by making the temporary bone blend buffer allocations essentially free.
  • Thread Specific Memory – If you are having problems where multiple threads are trying to allocate memory at the same time, causing contention and slowdowns due to thread synchronization, another option is to give each thread it’s own chunk of memory and let it allocate from that. That way there is no contention and you won’t have the slowdown of thread synchronization due to memory allocation anymore. A problem here though can be figuring out how much memory each thread needs. One thread may need a lot of memory, and another thread may need none, and you may not have any way of knowing which in advance. In this case, you’d have to allocate “a lot” of memory for each thread in advance, and pay an extra cost in memory that you technically don’t have to. But hey, at least it’s fast, maybe the trade off is worth it in your situation!

Lastly, there’s another common trick to avoid dynamic allocations involving templates, check it out!

// define the CStaticArray class
template 
class CStaticArray
{
public:
  T m_values[N];

  // you could put functions in here to do operations on the array data to make it look more like a standard
  // data type, instead of a plain vanilla array
  unsigned int Count () { return N; }

  void SomeOtherFunction () { }
};

void MyFunc ()
{
  // make an array of 32 floats
  CStaticArray m_floatArray;

  // make an array of 128 SSomeStructs
  CStaticArray m_objectArray;

  for (unsigned int index = 0; index < m_objectArray.Count(); ++index)
  {
    m_objectArray.m_values[index].DoSomething();
  }
}

The above really shines if you have a standard API for strings or dynamic arrays in your code base. You can make a version like the above which works without dynamic allocations, but gives the same interface so it's easier for fellow programmers to use and swap in and out as needed.

Another nice benefit to the above technique is that it works for stack allocations, but you can also make them member variables of other objects. In this way, you can minimize dynamic allocations. Instead of having to dynamically allocate an object, and then dynamically allocate the array inside of it, you do a single allocation to get all the memory needed.

That is the closest thing in C++ that I've seen to alloca, but even so, alloca has the advantage that you can decide how much memory to allocate at run time. With the template method, you have to know at compile time which is fine for a lot of cases, but othertimes is a deal breaker, forcing you to have to go back to dynamic allocations (or perhaps now, alloca instead?)

Realloc

Realloc is the other interesting memory allocation function.

Like I was mentioning above, the fewer allocations you can do, the better off you are in terms of performance, and also memory fragmentation.

By writing smart containers (dynamic arrays, dynamic strings, etc) you can make it so when someone tries to make a container smaller, that instead of allocating new memory that’s smaller, copying the data over, and freeing the old memory, that instead it just remembers the new size but keeps the old, larger memory around.

Then later on, if the container was told to grow, if it was smaller than the larger size from the past, it could just use some of that old memory again.

However, if that container grows larger than it used to be, you are going to have to allocate, copy, and free (costly etc) to grow the container.

Down in the guts of your computer however, there may be memory right after the current memory that’s not being used by anything else. Wouldn’t it be great if you could just say “hey… use that memory too, i don’t want to reallocate!”.

Well, realloc does ALL of the above for you without you having to write special code.

When you realloc memory, you give the old pointer and the new size, and if it’s able to, it won’t do any allocations whatsoever, and will just return you your old pointer back to you. It may allocate the next memory block for you if the new size is larger, but would still return the old pointer value in this case. Or, if the new amount of memory is smaller, it may return you back the same memory without doing anything internally (it depends on your compiler’s specific implementation of realloc what it does when)

If realloc does have to allocate new memory though, it will copy over all the old data to the new memory that it returns to you and free the old memory. So, you don’t have to CARE whether the pointer returned is old or new, just store the return value and continue on with your life.

It’s pretty cool and can help reduce actual memory allocations, lowering memory fragmentation and increasing performance.