The code for this post can be found at: https://github.com/Atrix256/STLCost
Some folks say STL containers are slow in debug; other folks say they aren’t.
Frankly, both are true. “Slow” as a negative statement depends both on what you are doing with them and on your need for debug performance.
In game development, you must absolutely be able to run a game in debug mode, and you are usually manipulating a lot of data. This is true of games at runtime as well as of tool time operations.
This post looks at a pretty common tool time operation: making box filtered mip maps of a 512×512 RGBA F32 image.
This is in MSVC 2017, debug x64, using default project settings.
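To make the operation concrete, here is a minimal sketch of a single box filter step using std::vector. It is not the code from the repository linked above: the function name `BoxFilterHalve`, the tightly packed RGBA float layout, and the even-dimensions requirement are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not taken from the STLCost repository.
// Produces the next mip level of an RGBA float image by averaging each 2x2
// block of source pixels (a box filter). Assumes even width and height.
std::vector<float> BoxFilterHalve(const std::vector<float>& src,
                                  std::size_t width, std::size_t height)
{
    const std::size_t channels = 4; // R, G, B, A
    assert(width % 2 == 0 && height % 2 == 0);
    assert(src.size() == width * height * channels);

    const std::size_t dstWidth = width / 2;
    const std::size_t dstHeight = height / 2;
    std::vector<float> dst(dstWidth * dstHeight * channels);

    for (std::size_t y = 0; y < dstHeight; ++y)
    {
        for (std::size_t x = 0; x < dstWidth; ++x)
        {
            // Offsets of the top-left and bottom-left pixels of this 2x2 block.
            const std::size_t topLeft = (2 * y * width + 2 * x) * channels;
            const std::size_t bottomLeft = topLeft + width * channels;

            for (std::size_t c = 0; c < channels; ++c)
            {
                const float sum = src[topLeft + c] +
                                  src[topLeft + channels + c] +
                                  src[bottomLeft + c] +
                                  src[bottomLeft + channels + c];
                dst[(y * dstWidth + x) * channels + c] = sum * 0.25f;
            }
        }
    }
    return dst;
}
```

Calling something like this repeatedly on a 512×512 image until the result is 1×1 would produce the full mip chain. Every `src[...]` and `dst[...]` in the inner loop is a call to `operator[]`, which is where an unoptimized build with container checks enabled can spend its time.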
On my machine, it takes about 25 milliseconds to do this operation with a plain C-style array, about 240 milliseconds with a std::array, and about 1900 milliseconds with a std::vector.
This example is in line with my own experience that these basic STL containers are really slow in debug, and will bring a debug build to its knees. An unusable debug build is hell on a team and will significantly bring down productivity. It can be a project killer, especially as it makes it more difficult for folks to debug problems and validate that things are working correctly.
Maybe perf is better on other compilers though?
Have anything to add, correct, etc? Speak up! 🙂
– Code fairness is not exact, as you do not need to memset() a std::vector to 0; resize() value-initializes the new elements (this does not make much of a difference).
– Next, if you don’t care about security checks, you can turn them off with /GS-, turn Basic Runtime Checks off (set to Default), and disable exceptions.
This gets me to:
std::array:
InitImage: 10.808100 ms
MakeMips: 10.782800 ms
std::vector:
InitImage: 54.751100 ms
MakeMips: 74.888000 ms
c array:
InitImage: 8.160700 ms
MakeMips: 4.112100 ms
– Next, if you don’t care about bounds checking (as you don’t in the C array), specify _ITERATOR_DEBUG_LEVEL=0 in the preprocessor defines. This will stop all out-of-bounds array checks, which is fair if you don’t do them in the C version.
This leaves me with:
std::array:
InitImage: 9.207800 ms
MakeMips: 9.356900 ms
std::vector:
InitImage: 21.524200 ms
MakeMips: 29.700800 ms
c array:
InitImage: 7.205100 ms
MakeMips: 3.853800 ms
Finally, if you enable inline function expansion (/Ob2, “Any Suitable”), I then get:
std::array:
InitImage: 4.415200 ms
MakeMips: 5.154100 ms
std::vector:
InitImage: 8.167600 ms
MakeMips: 3.931400 ms
c array:
InitImage: 6.910900 ms
MakeMips: 4.394300 ms
This is close enough for me personally, as it does allow you to mix and match what level of security checks you want on a build without having to change code. Release mode performance also seems very similar.
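On the first point above, about not needing memset(), here is a minimal sketch of what resize() already does. The helper names `MakeZeroedPixels` and `AllZero` are hypothetical and not part of the post's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; the names are not from the STLCost code.

// Sizes a pixel buffer with resize(). The new floats are value-initialized,
// which for float means 0.0f, so no separate memset() is needed.
std::vector<float> MakeZeroedPixels(std::size_t floatCount)
{
    std::vector<float> pixels;
    pixels.resize(floatCount);
    return pixels;
}

// Returns true if every element compares equal to zero.
bool AllZero(const std::vector<float>& values)
{
    return std::all_of(values.begin(), values.end(),
                       [](float v) { return v == 0.0f; });
}
```

For a 512×512 RGBA image, `MakeZeroedPixels(512 * 512 * 4)` would give a buffer for which `AllZero` returns true without any explicit clearing.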
Thanks for sharing this analysis!
The perf also is quite a bit better in MSVC 2019 which is nice to see. I need to post those results.
Hopping on this to add, we typically set up a “ReleaseWithDebug” or “ReleaseWithSymbols” build profile in VS. Essentially just cloning the Release config but disabling optimizations, nothing else. Good mix of both worlds: optimized versions of the std:: calls, but I can still debug my code. Just doing that gets me very similar numbers to Damian’s:
std::array:
InitImage: 10.626950 ms
MakeMips: 13.185507 ms
c array:
InitImage: 12.482779 ms
MakeMips: 6.782774 ms
std::vector:
InitImage: 19.900966 ms
MakeMips: 15.201926 ms
That is why people in the game industry are not using the STL. Try e.g. https://github.com/electronicarts/EASTL
Outside of the game industry we care more about correctness and do not care about speed in debug mode, so the STL is fine.
Yeah for sure. Having a standardized interface for containers is nice for engineers moving from team to team, or from company to company. That’s no small thing. Having a better-suited-for-high-perf implementation is a nice way of having your cake and eating it too.
`#define _ITERATOR_DEBUG_LEVEL 0` should solve a significant part of your problem (of course you’ll still have function call overhead in the unlocked builds).
Frankly, I’m a bit disappointed that someone, who is so concerned with debug performance of his code that he writes a blog post + benchmark code about it, doesn’t even mention (or know) one of the most common mechanisms to speed up debug builds in msvc.
If you have to disable key functionality to make it run fast enough to be usable, that’s a problem.