Getting Strongly Typed Typedefs Using Phantom Types

In C++, typedefs have a lot of really handy uses.

They can make code easier to read:


    // ugly code:
    std::map<std::string, size_t> wordCounts;
    for (std::map<std::string, size_t>::iterator it = wordCounts.begin(); it != wordCounts.end(); ++it)
    {
        ...
    }

    // typedefs
    typedef std::map<std::string, size_t> TWordCounts; 
    typedef TWordCounts::iterator TWordCountInterator;

    // prettier, simplified code:
    TWordCounts wordCounts;
    for (TWordCountInterator it = wordCounts.begin(); it != wordCounts.end(); ++it)
    {
        ...
    }

And, if you find yourself wanting to change the std::map above to a std::unordered_map, typedefs can make your code easier to maintain since you only have one place to change instead of several (yes, decltype can help too!).

Typedefs can bite you sometimes though if you aren’t careful. Look at the below, where the code’s diligent author made friendly typedefs for model and animation ids:

    typedef unsigned int TModelID;
    typedef unsigned int TAnimationID;

    TModelID id = GetModelID();
    SetAnimationID(id); // BUG!  SetAnimation() wants a TAnimationID!

Wouldn’t it be great if you could make it so even though TModelID and TAnimationID are the same underlying type, the compiler could give you a compile error when you tried to use the wrong one?

One way to solve this problem is to use C++ 11’s feature enum classes, but that isn’t the solve I want to show you today (that technique is pretty straightforward anyways…). The solve I want to show you uses a spooky technique called phantom types.

Phantom Types

Here is how you might see a phantom type in the wild:

template <typename PHANTOM_TYPE>
struct SUInt
{
public:
    SUInt (unsigned int value) : m_value(value) { }
    inline unsigned int& Value () { return &m_value; }
private:
    unsigned int m_value;
};

You might be asking yourself “why??” or “what??” or perhaps even “I wonder what the wife will be making for dinner tonight?”.

The neat thing about that is that you can create dummy types to use as the template parameter, and then if you try to mix and match objects that use a different template parameter you’ll get a compile error instead of a subtle run time bug. Fail hard and fail early!

Here’s how to use phantom types to solve our issue:

// our phantom-type-using struct
template <typename PHANTOM_TYPE>
struct SUInt
{
public:
    SUInt (unsigned int value) : m_value(value) { }
    inline unsigned int& Value () { return &m_value; }
private:
    unsigned int m_value;
};

// some "tag types" to use as phantom types
struct SModelID {};
struct SAnimationID {};

// our strongly typed typedefs, which both are really just unsigned ints
typedef SUInt<SModelID> TModelID;
typedef SUInt<SAnimationID> TAnimationID;

TModelID GetModelID ()
{
    TModelID ret(3);
    return ret;
}

void SetAnimationID (TAnimationID animId)
{
    // do something with the animation id
}

int main(int argc, char **argv)
{
    TModelID id = GetModelID();
    SetAnimationID(id);
    return 0;
}

If you try to compile that you get a very lovely compile error.

error C2664: ‘SetAnimationID’ : cannot convert parameter 1 from ‘TModelID’ to ‘TAnimationID’

And taking it a step further, you could abstract the unsigned int away to make this code, in case you want to change the underlying type of either ID in the future:

// our phantom-type-using struct
template <typename T, typename PHANTOM_TYPE>
struct SStronglyTypedType
{
public:
    SStronglyTypedType (T value) : m_value(value) { }
    inline T& Value () { return &m_value; }
private:
    T m_value;
};

// some "tag types" to use as phantom types
struct SModelID {};
struct SAnimationID {};

// our strongly typed typedefs, which both are really just unsigned ints
typedef SStronglyTypedType<unsigned int, SModelID> TModelID;
typedef SStronglyTypedType<unsigned int, SAnimationID> TAnimationID;

TModelID GetModelID ()
{
    TModelID ret(3);
    return ret;
}

void SetAnimationID (TAnimationID animId)
{
    // do something with the animation id
}

int main(int argc, char **argv)
{
    TModelID id = GetModelID();
    SetAnimationID(id);
    return 0;
}

Trying to compile that code again gives the compile error:

error C2664: ‘SetAnimationID’ : cannot convert parameter 1 from ‘TModelID’ to ‘TAnimationID’

If you are going to be using complex objects with this setup, you are going to want to more efficiently handle copy construction and such (check links section for more info), but the above solves our problem. We can no longer mix up model id’s and animation id’s, even though they are both just really an unsigned int.

And best of all, except for the copy construction type issues, this code adds no overhead in memory or run time processing costs.

Other Uses

Besides the above, this strongly typed phantom type stuff has some other interesting uses.

  • Units – If you have some functions that take units of time in seconds, and others that take it in milliseconds, you could use this setup to make sure you never passed the wrong type to the wrong function. You could perhaps even make implicit conversion to fix it for you, if you desired that.
  • Validation – This was originally shown to me as a way to make sure that raw data from the user was validated before being used. Raw data from the user would use one phantom type, and validation functions would convert it to a different, validated phantom type. In this way, you could be sure that no unvalidated (or unsanitized!) data went to something that was vulnerable.
  • Deterministic Calculations – You could use this technique if you have a deterministic simulation that you need to run the same way every time. These phantom types could ensure no “non deterministic” data got into your equations – such as data from user input, or the clock, or based on frame rate, etc.

Links

Here is the info about copy construction (etc) that i mentioned. Every C++ programmer should know these things!
Sticky Bits: The Rule of The Big Three (and a half) – Resource Management in C++
Sticky Bits: The Rule of The Big Four (and a half) – Move Semantics and Resource Management


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s