Separate C++ Template Headers (*.h) and Implementation files (*.cpp)

Problem

Separating source code into header and implementation files has long been touted as good practice for maintenance and development. However, when it comes to C++ templates, the conventional way of implementing this structure collapses completely. Personally, I feel that C++ is trying to accomplish too many things at the same time, which is why people describe C++ as a plethora of languages.

Conventional Non-Template Classes:

  foo.h    // Declaration of the Foo class       bar.h    // Declaration of the Bar class
    class Foo {                                    #include "foo.h"
    public:                                        class Bar {
        void SomeFunc();                           public:
    };                                                 void UseFooFunc();
                                                   };

  foo.cpp  // Definition of the Foo class        bar.cpp  // Definition of the Bar class
    #include "foo.h"                               #include "bar.h"
    void Foo::SomeFunc() {                         void Bar::UseFooFunc() {
        int data = 0;                                  Foo foo;
    }                                                  foo.SomeFunc();
                                                   }

  foo.o    // The object file of the Foo class   bar.o    // The object file of the Bar class

As you can see from above, foo.cpp and bar.cpp are compiled separately, and neither needs to see the other's implementation. To compile bar.cpp, the compiler only needs the header file (foo.h), which tells it which symbols to reference and leave for the linker to resolve.

                      linker -> foo.o + bar.o  -> a.out

The linker's job is to link the two object files so that the call to foo.SomeFunc() in bar.o executes Foo::SomeFunc() from foo.o.

Template Classes:

At first, I found it hard to think of a template class as just a template (sigh). Templates are like recipes for creating classes with specific data types, e.g. MyClassWithIntDataTypes, MyClassWithUserDefinedTypes. The compiler cannot generate any code from a template on its own, without an instantiation. I must stress this again: the compiler does not know in advance what kinds of classes to build simply by looking at the template files.
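
To make the recipe idea concrete, here is a minimal sketch. The Twice() member, the data_ field and the RunRecipeDemo() function are made up for this illustration and are not part of the Foo used in the rest of this post. It tries to show that the same template yields separate, unrelated classes for int and std::string, and only because some code asks for them.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// The recipe. On its own this generates no code at all.
template <class T>
class Foo {
public:
    explicit Foo(T value) : data_(value) {}

    // Hypothetical member for illustration: adds the stored value to itself.
    T Twice() const { return data_ + data_; }

private:
    T data_;
};

// Two uses of the recipe give two distinct, unrelated classes.
static_assert(!std::is_same<Foo<int>, Foo<std::string>>::value,
              "each instantiation is its own type");

// Hypothetical demo function: naming Foo<int> and Foo<std::string> here is
// what makes the compiler create those two classes.
inline void RunRecipeDemo() {
    Foo<int> number(21);
    Foo<std::string> text("ab");
    assert(number.Twice() == 42);
    assert(text.Twice() == "abab");
}
```

Calling RunRecipeDemo() from main() is enough to exercise it. If no code ever mentioned Foo<std::string>, the compiler would simply never build that class.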

  foo.h    // Declaration of the Foo template    bar.h    // Declaration of the same Bar class
    template<class T>                              #include "foo.h"
    class Foo {                                    class Bar {
    public:                                        public:
        void SomeFunc();                               void UseFooInDoubleFunc();
    };                                             };

  foo.cpp  // Definition of the Foo template     bar.cpp  // Definition of the Bar class
    #include "foo.h"                               #include "bar.h"
    template<class T>                              void Bar::UseFooInDoubleFunc() {
    void Foo<T>::SomeFunc() {                          Foo<double> foo;
        T data;                                        foo.SomeFunc();
    }                                              }

Problem 1: Compiling foo.h and foo.cpp into foo.o produces no code for Foo, because it is a template, not a class. At this point the compiler can see foo.cpp and could create the Foo class for any type. However, the compiler does not know that Foo<double> is needed by bar.cpp. The important point is that foo.cpp does not contain an instantiation for the required data types (see Solution 2).

Problem 2: When the compiler compiles bar.h and bar.cpp into bar.o, it knows it needs to create the Foo class with the double data type when it sees Foo<double> foo.

It can see the interface in foo.h, and it knows it needs a specific Foo<double> class. However, because it cannot see foo.cpp, i.e. the recipe, it cannot generate the code for Foo<double>::SomeFunc(). The compiler is stuck here: all it can do is leave the call in bar.o as an unresolved reference.

Problem 3:
Even if the compiler compiles, the linker would face problems as well (see Problem 1): no Foo class for any type has been compiled into the object files at all, so the call to Foo<double>::SomeFunc() ends in an undefined reference.

Solution

There are two solutions which I would suggest (yes, I know there are others): 1) put everything in the header file, 2) explicitly instantiate the template with the required data types. It is crucial to know that we do not need to solve both Problem 1 and Problem 2; solving either one of them solves the general problem.

Solution 1: All implementation in the header file

A simple method is to put the implementation straight into the header file (.h). There are no implementation files (.cpp), only the one header file. This solves Problem 2 straight away because the compiler can 'see' the implementation just by looking at foo.h. It is now able to create instances of the template from the header file alone.
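
A minimal sketch of this layout follows, merged into one listing so that it compiles as a single piece; the comments mark which file each part would live in. The return values and the RunHeaderOnlyDemo() function are assumptions added only so the calls have something observable to check.

```cpp
#include <cassert>

// ---- foo.h (sketch): declaration and definition live together ----
template <class T>
class Foo {
public:
    // Defined inside the class, so every file that includes foo.h sees the body.
    // Returning the value is an assumption made for this illustration.
    T SomeFunc() const {
        T data = T();  // value-initialised: zero for arithmetic types
        return data;
    }
};

// ---- bar.cpp (sketch): in a real project this would #include "foo.h" ----
class Bar {
public:
    double UseFooInDoubleFunc() const {
        Foo<double> foo;        // Foo<double> is created right here,
        return foo.SomeFunc();  // inside bar's own compilation unit
    }
};

// Hypothetical demo function for this sketch.
inline void RunHeaderOnlyDemo() {
    Bar bar;
    assert(bar.UseFooInDoubleFunc() == 0.0);
}
```

Because the body of SomeFunc() travels with the header, bar.o ends up containing its own copy of Foo<double>::SomeFunc() and the linker has nothing left to look for.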

Pros: 

  • The compiler can resolve each function call to a specific class of the template immediately when compiling the compilation unit (i.e. a *.cpp together with the *.h files it includes). At the end of compilation, the object file contains the implementation of every template function it calls.
Cons: 

  • A huge disadvantage of this method is that any change to the implementation of the template class in the header file automatically results in large rebuilds of all the source files that include this header.
  • There is a risk that the developer uses many instances of the template, resulting in very bloated object files. Note that for each Foo<T>, a whole set of instructions has to be created by the compiler.
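
The point that every Foo<T> is its own set of code can be observed with a small sketch. The CallCount() helper and the ExerciseFoo() function are hypothetical names used only for this illustration. Each instantiation gets its own copy of the static counter, just as it gets its own copy of the instructions.

```cpp
#include <cassert>

template <class T>
class Foo {
public:
    // Hypothetical helper: counts calls made through this particular instantiation.
    static int& CallCount() {
        static int count = 0;  // one copy per Foo<T>, not one shared by all
        return count;
    }

    void SomeFunc() { ++CallCount(); }
};

// Hypothetical demo function: calls SomeFunc() twice on Foo<int>, once on Foo<double>.
inline void ExerciseFoo() {
    Foo<int> a;
    a.SomeFunc();
    a.SomeFunc();

    Foo<double> b;
    b.SomeFunc();
}
```

After one call to ExerciseFoo(), Foo<int>::CallCount() and Foo<double>::CallCount() hold different values, since the two classes share nothing apart from the recipe they came from.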


Solution 2: Explicit Template Instantiation

This allows separate header (*.h) and implementation (*.cpp) files.

  foo.h    // Declaration of the Foo template    bar.h    // Declaration of the same Bar class
    template<class T>                              #include "foo.h"
    class Foo {                                    class Bar {
    public:                                        public:
        void SomeFunc();                               void UseFooInDoubleFunc();
    };                                             };

  foo.cpp  // Definition of the Foo template     bar.cpp  // Definition of the Bar class
    #include "foo.h"                               #include "bar.h"
    template<class T>                              void Bar::UseFooInDoubleFunc() {
    void Foo<T>::SomeFunc() {                          Foo<double> foo;
        T data;                                        foo.SomeFunc();
    }                                              }
    // Explicit instantiations go after the member
    // definitions so that the bodies are visible here
    template class Foo<int>;
    template class Foo<double>;

  foo.o    // Object file containing the two     bar.o    // Object file that calls a member
           // template instances, for int                 // function of a Foo instance of the
           // and double data types                       // double data type

Now the compiler can explicitly create classes for the specific data types, which solves Problem 1. This is because the explicit instantiations in foo.cpp tell it that two classes are needed, and the implementation is visible at that point. Thus it creates the two classes with the specified data types and compiles an object file containing these specific classes.

Next, the linker is able to link foo.SomeFunc() with the Foo<double> class created in foo.o. This solves Problem 3, as the linker can now link the function call with the generated member function.
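
Here is a complete sketch of the explicit instantiation approach. In a real project the three parts would sit in foo.h, foo.cpp and bar.cpp; they are merged into one listing here only so that it compiles as a single piece. The return values and the RunExplicitInstantiationDemo() function are assumptions added for illustration.

```cpp
#include <cassert>

// ---- foo.h (sketch): declaration only, no function bodies ----
template <class T>
class Foo {
public:
    T SomeFunc() const;  // returning T is an assumption for this illustration
};

// ---- foo.cpp (sketch): the definition, then the explicit instantiations ----
template <class T>
T Foo<T>::SomeFunc() const {
    T data = T();  // value-initialised: zero for arithmetic types
    return data;
}

// These lines ask the compiler to generate the full classes now.
// They come after the member definition so that its body is visible.
template class Foo<int>;
template class Foo<double>;

// ---- bar.cpp (sketch): would include foo.h only, never foo.cpp ----
class Bar {
public:
    double UseFooInDoubleFunc() const {
        Foo<double> foo;
        return foo.SomeFunc();  // resolved by the linker against foo.o
    }
};

// Hypothetical demo function for this sketch.
inline void RunExplicitInstantiationDemo() {
    Bar bar;
    assert(bar.UseFooInDoubleFunc() == 0.0);
}
```

With the files really split, a use such as Foo<long> in bar.cpp would still compile but fail at link time, because foo.cpp only instantiates the template for int and double.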

Pros:

  • All the benefits of separating header and implementation files: unlike Solution 1, where every source file that includes the template header is recompiled, here the source files that depend on instances of the template class need not be recompiled when the implementation changes.
  • We can determine which instances are no longer needed and remove them from the implementation file, e.g. if no .cpp file uses Foo<long> any more, we remove template class Foo<long>;.
  • Any use of a template function that has not been instantiated is reported at link time as an undefined reference.
Cons: 

  • On the other hand, it is difficult to keep track of how the different instances of the template class are used. Therefore, it may not be easy to determine which instances are still needed by implementation files throughout the project.
For more information, see what gcc says about this

P.S. Can someone tell me what the classes created from template classes are called?
