Measuring Debug STL Container Perf Cost in MSVC

The code for this post can be found at: https://github.com/Atrix256/STLCost

Some folks say STL containers are slow in debug builds; other folks say they aren’t.

Frankly, both are true. Whether “slow” is a real problem depends on what you are doing with the containers and on how much you need debug performance.

In game development, you absolutely must be able to run a game in debug mode, and you are usually manipulating a lot of data. This is true both of games at runtime and of tool-time operations.

This post looks at a pretty common tool time operation: making box filtered mip maps of a 512×512 RGBA F32 image.
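For readers who have not written one, here is a minimal sketch of what box filtered mip generation looks like with a std::vector. It is not the code from the repository above: the `MipLevel` struct and `Downsample` function are hypothetical names of mine, `MakeMips` only borrows the name used in the timing output further down, and the sketch assumes a square, power-of-two image stored as four floats (RGBA) per texel.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical layout: one mip level, 4 floats (RGBA) per texel, row-major.
struct MipLevel {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<float> texels;  // width * height * 4 floats
};

// Box filter: each destination texel is the average of a 2x2 block of source texels.
// Assumes src.width and src.height are even and at least 2.
MipLevel Downsample(const MipLevel& src) {
    MipLevel dst;
    dst.width = src.width / 2;
    dst.height = src.height / 2;
    dst.texels.resize(dst.width * dst.height * 4);  // resize() zero-fills new floats

    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            const std::size_t sx = x * 2;
            const std::size_t sy = y * 2;
            for (std::size_t c = 0; c < 4; ++c) {
                const float sum =
                    src.texels[(sy * src.width + sx) * 4 + c] +
                    src.texels[(sy * src.width + sx + 1) * 4 + c] +
                    src.texels[((sy + 1) * src.width + sx) * 4 + c] +
                    src.texels[((sy + 1) * src.width + sx + 1) * 4 + c];
                dst.texels[(y * dst.width + x) * 4 + c] = sum * 0.25f;
            }
        }
    }
    return dst;
}

// Builds the whole chain, from the base image down to 1x1.
// Assumes the base image is square with a power-of-two size.
std::vector<MipLevel> MakeMips(MipLevel base) {
    std::vector<MipLevel> chain;
    chain.push_back(std::move(base));
    while (chain.back().width > 1 && chain.back().height > 1) {
        chain.push_back(Downsample(chain.back()));
    }
    return chain;
}
```

For a 512×512 base image this produces 10 levels (512 down to 1). Every texel read and write goes through `operator[]`, which is exactly the kind of call a debug build instruments.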

This is in MSVC 2017, debug x64, using default project settings.

On my machine, it takes about 25 milliseconds to do this operation with a plain C-style array, about 240 milliseconds with a std::array, and about 1900 milliseconds with a std::vector.
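If you want to try a comparison like this yourself, here is a rough sketch of a timing harness. It is an illustration under my own assumptions, not the benchmark from the repository: `TimeMs`, `FillByIndex`, and `RunComparison` are hypothetical names, and it only times a simple indexed fill rather than the full mip generation. The large buffers use static storage because 4 MB is more than a typical default stack can hold.

```cpp
#include <array>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

constexpr std::size_t kTexelFloats = 512 * 512 * 4;  // 512x512 RGBA F32

// Static storage: these are too large to place on a typical default stack.
static float g_cArray[kTexelFloats];
static std::array<float, kTexelFloats> g_stdArray;

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double TimeMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Writes every element through operator[], so a checked debug container
// pays its per-access cost on each iteration.
template <typename Container>
void FillByIndex(Container& c, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        c[i] = static_cast<float>(i % 256) / 255.0f;
    }
}

// Times the same fill over the three container kinds and prints the results.
void RunComparison() {
    std::vector<float> vec(kTexelFloats);

    const double cMs = TimeMs([&] { FillByIndex(g_cArray, kTexelFloats); });
    const double arrayMs = TimeMs([&] { FillByIndex(g_stdArray, kTexelFloats); });
    const double vectorMs = TimeMs([&] { FillByIndex(vec, kTexelFloats); });

    std::printf("c array:     %f ms\n", cMs);
    std::printf("std::array:  %f ms\n", arrayMs);
    std::printf("std::vector: %f ms\n", vectorMs);
}
```

A single run like this is noisy, so treat the numbers as a rough guide and repeat the measurement a few times before drawing conclusions.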

This example is in line with my own experience that these basic STL containers are really slow in debug, and will bring a debug build to its knees. An unusable debug build is hell on a team and will significantly bring down productivity. It can be a project killer, especially as it makes it more difficult for folks to debug problems and validate that things are working correctly.

Maybe perf is better on other compilers though?

Have anything to add, correct, etc? Speak up! 🙂


7 comments

  1. – Code fairness is not exact: you do not need to memset() a std::vector to 0, because resize() value-initializes new elements. (This does not make much of a difference.)

    – Next, if you don’t care about security checks, you can turn them off (/GS-), turn basic runtime checks off (set to default), and disable exceptions.

    This gets me to:
    std::array:
    InitImage: 10.808100 ms
    MakeMips: 10.782800 ms

    std::vector:
    InitImage: 54.751100 ms
    MakeMips: 74.888000 ms

    c array:
    InitImage: 8.160700 ms
    MakeMips: 4.112100 ms

    – Next, if you don’t care about bounds checking (as you don’t in the C array), specify _ITERATOR_DEBUG_LEVEL=0 in the preprocessor defines. This will stop all out-of-bounds array checks, which is fair, since the C version doesn’t do them either.

    This leaves me with:
    std::array:
    InitImage: 9.207800 ms
    MakeMips: 9.356900 ms

    std::vector:
    InitImage: 21.524200 ms
    MakeMips: 29.700800 ms

    c array:
    InitImage: 7.205100 ms
    MakeMips: 3.853800 ms

    Finally, if you set inline function expansion to “Any Suitable” (/Ob2), I then get:
    std::array:
    InitImage: 4.415200 ms
    MakeMips: 5.154100 ms

    std::vector:
    InitImage: 8.167600 ms
    MakeMips: 3.931400 ms

    c array:
    InitImage: 6.910900 ms
    MakeMips: 4.394300 ms

    This is close enough for me personally, as it does allow you to mix and match what level of security checks you want on a build without having to change code. And release mode does seem to have very similar performance.

    • Hopping on this to add: we typically set up a “ReleaseWithDebug” or “ReleaseWithSymbols” build profile in VS, essentially just cloning the Release config but disabling optimizations, nothing else. It’s a good mix of both worlds: optimized versions of the std:: calls, but I can still debug my code. Just doing that gets me very similar numbers to Damian’s:

      std::array:
      InitImage: 10.626950 ms
      MakeMips: 13.185507 ms

      c array:
      InitImage: 12.482779 ms
      MakeMips: 6.782774 ms

      std::vector:
      InitImage: 19.900966 ms
      MakeMips: 15.201926 ms

    • Yeah for sure. Having a standardized interface for containers is nice for engineers moving from team to team, or from company to company. That’s no small thing. Having a better-suited-for-high-perf implementation is a nice way of having your cake and eating it too.

  2. `#define _ITERATOR_DEBUG_LEVEL 0` should solve a significant part of your problem (of course you’ll still have function call overhead in the unlocked builds).

    Frankly, I’m a bit disappointed that someone, who is so concerned with debug performance of his code that he writes a blog post + benchmark code about it, doesn’t even mention (or know) one of the most common mechanisms to speed up debug builds in msvc.

