External C++ Header Guards

A buddy at work said “I wish C++ had a two pass pre processor so that we could do external header guards”. It got me thinking about some random macro stuff i had seen before and i thought “hrm… you know, that actually might be possible to do… i’m going to give it a try”.

I ended up working something up tonight at home that’s semi-palatable. The way you use it is a little weird, but i think it satisfies the spirit of the challenge, and works as a proof of concept that you can do external header guards without having to type a bunch of stuff.

If you can think of a way to improve it, post a comment or something and let me know!!

Umm…External Header Guards? What are you Talking About??

Have you ever seen something like the below? it’s called a header guard:

#ifndef BLAH_H
#define BLAH_H

// code goes here

#endif BLAH_H

You might also have seen this variation:

#pragma once

Without the header guards, if you include the header file twice, it will complain that the classes etc have already been defined. Those make sure that doesn’t happen, by only including the contents of the file if it hasn’t already been included.

External header guards on the other hand would be guards in the place it’s included instead of in the header file itself. That is more typing (more work), but the benefit there is that the compiler doesn’t have to open the header file at all to see if it’s already been included, which could make for faster compile times in large projects.

Anyways, here’s the code:

Main.cpp

// include testoneblah.h which defines the typedef ProofIncluded__blah_h
// to prove that it was really included
#define FILESEQ (testone)(blah)
#include "Includer.h"

// try to include testoneblah.h again.  It won't get included again, and
// instead, ProofIncludeBlocked__blah_h will get typedef'd by includer.h
// to prove that the file did not get included.  Comment these lines out
// and you'll get a compiler error that ProofIncludeBlocked__blah_h is
// an undeclared identifier
#define FILESEQ (testone)(blah)
#include "Includer.h" 

int main(int argc, char **argv)
{
	ProofIncluded__blah_h a;
	ProofIncludeBlocked__blah_h b;
	return 0;
}

So it is a little weird… but to include a file, you define FILESEQ with a directory and filename (without .h on it), and then include “Includer.h”. Even though it’s weird to use, and doesn’t work for .inl files (and maybe other issues, some easily solved), it’s only one extra line of typing to do an external header guard, which is about as good as you can expect.

Ideally I wish the interface were like the below, but I haven’t been able to figure out how to make that work unfortunately.

IncludeFile_((testone)(blah))

Includer.h

//=====================================================================================================
// Rip off boost, hooray!!  boost_pp is really nice, you can just grab it from the boost bundle and
// start using it because it's just a bunch of includes.  you don't need to build or link with boost
// at all.  It's really nice.  http://www.boost.org/
//=====================================================================================================
# define BOOST_PP_EMPTY()
# define BOOST_PP_SEQ_ELEM(i, seq) BOOST_PP_SEQ_ELEM_I(i, seq)
# define BOOST_PP_SEQ_ELEM_I(i, seq) BOOST_PP_SEQ_ELEM_II((BOOST_PP_SEQ_ELEM_ ## i seq))
# define BOOST_PP_SEQ_ELEM_II(res) BOOST_PP_SEQ_ELEM_IV(BOOST_PP_SEQ_ELEM_III res)
# define BOOST_PP_SEQ_ELEM_III(x, _) x BOOST_PP_EMPTY()
# define BOOST_PP_SEQ_ELEM_IV(x) x

# define BOOST_PP_SEQ_ELEM_0(x) x, BOOST_PP_NIL
# define BOOST_PP_SEQ_ELEM_1(_) BOOST_PP_SEQ_ELEM_0
//=====================================================================================================
#define EB_COMBINETEXT(a, b) EB_COMBINETEXT_INTERNAL(a, b)
#define EB_COMBINETEXT_INTERNAL(a, b) a ## b
#define EB_TOTEXT(a) EB_TOTEXT_INTERNAL (a)
#define EB_TOTEXT_INTERNAL(a) #a
//=====================================================================================================

// extract the directory and file name
#define DIR BOOST_PP_SEQ_ELEM(0, FILESEQ)
#define FILE BOOST_PP_SEQ_ELEM(1, FILESEQ)

// create the full file name: /.h
#define THEFILENAME EB_TOTEXT(EB_COMBINETEXT(DIR, EB_COMBINETEXT(/, EB_COMBINETEXT(FILE, .h))))

#if !EB_COMBINETEXT(__, EB_COMBINETEXT(FILE, _h))
	//including file: YES
	//include the file
	#include THEFILENAME
#else
	//including file: NO
    //create a typedef called ProofIncludeBlocked___H to prove we didn't include the file
	typedef char EB_COMBINETEXT(ProofIncludeBlocked__, EB_COMBINETEXT(FILE, _h));
#endif

// clean up the things we created.  boost macros can stick around ::shrug::
#undef THEFILENAME
#undef DIR
#undef FILE

// defined by caller, but we're cleaning it up for convenience
#undef FILESEQ

I ripped some macros out of the boost preprocessor library (boost_pp) to help things out a little bit. In a nutshell what we are doing is this…

We test to see if the preprocessor value __<File>_h is not true (false, or undefined). If that is the case, we include the file <Directory>/<File>.h. Else, we define a typedef ProofIncludeBlocked<File>_h to prove that we blocked the include from happening.

Blah.h

#define __blah_h 1
class ProofIncluded__blah_h {};

Blah.h defines __blah_h as 1 (true). It’s important that it uses the same naming convention as Includer.h does (__<File>_h), otherwise this setup won’t work. If you screw it up, you’ll get compile errors about multiply defined symbols.

This file also defines a class ProofIncluded__blah_h to prove that this file actually got included, and also defines something that will complain if the file is included twice.

Issues

So, this is just a proof of concept and it has some issue including…

  • Duplicate file names – if you have the same file name in different folders, this setup has issues. It might be able to be helped by including the directory name into the header guard preprocessor symbol.
  • Referencing the same file different ways – if you reference the same file in different ways because there’s multiple ways to reach it in the include search paths, it won’t be able to tell that it’s the same file if you do the last fix. Maybe the real solution is to have another parameter defined specifying the header guard symbol? dont know…
  • Only supports .h files – It assumes a .h extension, but maybe another parameter could be the file extension to use so you could include .inl, .hpp etc.

Hopefully you find it interesting at least though (:

Is it Worth It?

Poday made some good points in the comments about it not being worth it, and my friend Doug also has this to say:

It’s not needed though all compilers already do that:

(GCC):
The GNU C preprocessor is programmed to notice when a header file uses this particular construct and handle it efficiently. If a header file is contained entirely in a `#ifndef’ conditional, then it records that fact. If a subsequent `#include’ specifies the same file, and the macro in the `#ifndef’ is already defined, then the file is entirely skipped, without even reading it.

(Clang)
The MultipleIncludeOpt class implements a really simple little state machine that is used to detect the standard “#ifndef XX / #define XX” idiom that people typically use to prevent multiple inclusion of headers. If a buffer uses this idiom and is subsequently #include‘d, the preprocessor can simply check to see whether the guarding condition is defined or not. If so, the preprocessor can completely ignore the include of the header.

(Clang still)
clang_isFileMultipleIncludeGuarded – Determine whether the given header is guarded against multiple inclusions, either with the conventional #ifndef/#define/#endif macro guards or with #pragma once.

For MSVC all I could find is Herb Sutter lead architect for MSVC and head of the C++ committee in his book ‘C++ Coding Standards: 101 Rules, Guidelines, and Best Practices’:
24. Always write internal #include guards. Never write external #include guards.
With a reason of:
Don’t try to be clever: Don’t put any code or comments before and after the guarded portion, and stick to the standard form as shown.
Today’s preprocessors can detect include guards, but they might have limited intelligence and expect the guard code to appear exactly at the beginning and end of the header.


5 comments

  1. These things are generally compiler, Operating System and hardware performance dependent and are based upon assumptions on the inner workings of the compiler so they tend to be brittle.

    Let’s examine some assumptions implicit in this solution:
    1. The cost of opening/parsing a header file in aggregate is substantial and that it should be avoided when possible.
    2. The cost of opening/parsing Includer.h is inconsequential.
    3. Parsing multiple nested defines is faster then opening/parsing a header file.

    Point 2 is where the crux of the problem with this solution lies. In the naive system opening every header would first open the Includer.h header and then maybe open the real header. Assuming that opening all files have the same cost this would actually make the system slower as the cost would be (number of includes) + (number of actually read includes). But modern systems aren’t nearly so naive. Both Operating Systems and hard drives have caches that keep recently used files closer at hand. So the assumption is that the Includer.h file would stick around in the cache so that the penalty for reading it from disk would be negligible. But this cache works just as well with the other files in compilation. It’s a fair assumption that a header that’s already been included in a cpp’s include chain is still in the cache so the penalty for opening the file is negated.

    *I’ve done a lot of hand waving by ignoring OS times for allocating/copying memory and getting file handles and that the compiler itself isn’t doing any caching.

    Like

    • Well, FWIW i never intended to use this, it was just more of a challenge to see if it could be done. I was pretty sure it could be, and wanted to explore it a bit and give it a try. I think it can be done better too, and it would be interesting if it could be done in a single line (instead of a define and an include), and like you point out, my technique includes “Includer.h” which defeats some of the benefit of external header guards. It would be nice to solve that part if possible.

      Like

  2. I was more responding to your coworker’s comment: “I wish C++ had a two pass pre processor so that we could do external header guards”.

    However it’s often fun to try to solve theoretical problems in a constrained environment. So to answer your question, I don’t believe it can be done in less then 2 lines. It’s fairly trivial to force the include of a header file in a build but that include happens before the first line of the cpp file so that you don’t have a chance to set up any custom defines yet. And it’s impossible to nest pre processor directives inside a #define statement.

    Several years ago I had an idea of a “todo macro”. The concept was that someone could use “todo(“Need to check for divide by zero”)” somewhere in their code and then have a define living somewhere that would toggle printing that macro to the compiler’s warning channel or hiding the todo. I spent a lot of time looking for tricks in macros to see if there was a way I could do something like:
    #if(SHOULD_PRINT_TODO)
    #define todo(x) #pragma message(x)
    #else
    #define todo()
    #endif
    Long story short: it’s not possible (caveat it’s possible but with all the hoops that would need to be jumped through it’s easier to just search for “todo” comments).

    C++ lives in that weird world where it’s too lenient for strong static analysis unlike java and C# but it’s too rigid to allow modification to it’s compilation process unlike interpreted languages like perl and javascript.

    Like


Leave a comment