The include mechanism in C++ works just fine, but programmers often add a redundant #include where a simple forward declaration of a class would do.
The problem gets worse if a basic header file, one that many other files include, contains redundant #includes; in that case a small modification may trigger a complete rebuild.
Two rules which we follow in my team:
* The first statement of each xxxx.cc file is #include "xxxx.h"; this proves, in some sense, that our header files compile stand-alone.
* Each header file xxxx.h is guarded with #ifdef __xxxx___
There are other, more complex ways of reducing header file dependencies, e.g. preferring a pointer or reference member, which can be forward declared, over a regular by-value member. This may consume a bit more memory, add complexity to constructors and destructors, or even limit the optimizations available to the compiler, but it does reduce header file dependencies.
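A minimal sketch of the forward-declaration technique described above, shown as one translation unit for brevity. The class names Car and Engine and the file names in the comments are hypothetical; in a real project each section would live in its own file.

```cpp
// ---- car.h (hypothetical): a forward declaration replaces #include "engine.h" ----
class Engine;                       // incomplete type is enough for a pointer member

class Car {
public:
    explicit Car(Engine& engine) : engine_(&engine) {}
    int horsepower() const;         // defined in car.cc, where Engine is complete
private:
    Engine* engine_;                // pointer member instead of a by-value Engine
};

// ---- engine.h (hypothetical): the full definition ----
class Engine {
public:
    explicit Engine(int hp) : hp_(hp) {}
    int horsepower() const { return hp_; }
private:
    int hp_;
};

// ---- car.cc (hypothetical): the only place that needs engine.h ----
int Car::horsepower() const { return engine_->horsepower(); }

// ---- client code ----
#include <cassert>

inline void use_car() {
    Engine engine(150);
    Car car(engine);
    assert(car.horsepower() == 150);
}
```

With this layout, a change to engine.h would recompile car.cc and direct users of Engine, but not files that only include car.h.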
That having been said, getting to the point:
1) Is there a tool which can analyze code and remove redundant #includes?
2) Given the above two coding rules, is there a better tool/script?
{ please don't top-post in this group -- see the FAQ. -mod }
In terms of performance and speeding up compile times where large numbers of headers are included... what you want to use are "precompiled headers". Precompiled headers are what they sound like; the compiler reads a header and generates a binary representation of the data in the header. Reading the binary representation of the header file contents is much faster than parsing the contents of the header each time it needs to be used. You can look at your compiler's relevant documentation on how to create precompiled headers. For GCC, if you include a header file and a file exists with the same name but with a ".gch" suffix, it will assume that the ".gch" file is a precompiled header, and it will include the precompiled header instead.
With regard to preprocessor guards (that is, the use of "#ifndef" and "#define" to prevent multiple inclusions)... you should follow this practice for any header file whose contents are not expected to change on each inclusion (i.e. content not controlled by #if...#else constructs, where the variables on which they depend are expected to be potentially redefined before each inclusion -- an example of this is <cassert>). For most header files, this will be the case (i.e., you should use preprocessor guards). Preprocessor guards do more than reduce compilation time by eliminating redundant includes; they also ensure that a source file can include two separate headers that each depend on and include a common header file (i.e., even with very complex dependency graphs, the compiler will not encounter redundant declarations and, therefore, will not fail because of them).
- Michael Safyan
On Nov 1, 8:57 pm, shahav <sha...@gmail.com> wrote:
> Hi All,
> The include mechanism used in C++ is working just fine, but often > programmers tend to add redundant #include when a simple class fwd > declaration will do. [snipped]
> 1) Is there a tool which can analyze code and remove redundant > #include's? > 2) Given the above two coding style - is there a better tool/script?
shahav wrote: > * each H file xxxx.cc is guarded with #ifdef __xxxx___
A bad idea. Names containing double underscores are reserved for the C++ implementation (i.e. for the compiler and the standard library) - you should not be using them in your own code, except where allowed by the C++ Standard (__LINE__, for example).
shahav wrote: > * each H file xxxx.cc is guarded with #ifdef __xxxx___
Identifiers with two consecutive underscores and those beginning with an underscore followed by an uppercase letter are reserved.
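A small sketch of an include guard whose name avoids the reserved forms mentioned above. The macro name MYPROJECT_POINT_H and the Point struct are made up for illustration; the header text is pasted twice to imitate a double inclusion within one translation unit.

```cpp
// First "inclusion" of the header text: the guard is not yet defined.
#ifndef MYPROJECT_POINT_H        // hypothetical name: no leading underscore, no "__"
#define MYPROJECT_POINT_H
struct Point { int x; int y; };
#endif  // MYPROJECT_POINT_H

// Second "inclusion": the guard is already defined, so the preprocessor
// skips the body and Point is not defined a second time.
#ifndef MYPROJECT_POINT_H
#define MYPROJECT_POINT_H
struct Point { int x; int y; };
#endif  // MYPROJECT_POINT_H

inline int sum(const Point& p) { return p.x + p.y; }

// ---- client code ----
#include <cassert>

inline void use_point() {
    Point p = {2, 3};
    assert(sum(p) == 5);
}
```

Without the guard, the second copy of the struct would be a redefinition error.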
> There are some other, more complex ways in reducing H file dependency, > e.g. preferring member ptr or reference, which can be fwd declared, > over a on board regular member. It may consume a bit more memory, add > complexity in term of cnstr/dstr or even reduce compiler possible > optimization but it does reduce H file dependency.
shahav <sha...@gmail.com> wrote: > Two rules which we follow in my team > * each xxxx.cc file first statement is: #include xxxx.h - thats > proofs, in some sense, that our H files are compile-able stand alone.
Interesting. Since xxxx.h may want to reference various parts of the standard library, this approach requires that xxxx.h begins with
    #include <iostream>
    #include <string>
    #include <vector>
    #include <algorithm>
    #include "project_global_configuration.h"
    #include "low_level_class_used_by_xxxx.h"
    #include "another_low_level_class_used_by_xxxx.h"
and so on for any other standard library or project facilities needed in xxxx.h.
An alternative design is that header files never include other header files, but rather state their prerequisites in comments, and push the find-the-transitive-closure-of-the-needed-set-of-includes problem back to the programmer. With this approach xxxx.cc begins with
    #include <iostream>
    #include <string>
    #include <vector>
    #include <algorithm>
    #include "project_global_configuration.h"
    #include "low_level_class_used_by_xxxx.h"
    #include "another_low_level_class_used_by_xxxx.h"
I can see advantages (+) and disadvantages (-) of each approach:

Header files include all prerequisites:
+ header file is cleanly self-contained for clients
+ list of prerequisites is implicitly checked by the compiler, i.e., a missing prerequisite will be spotted at the next compile (although unnecessary prerequisites may go unnoticed)
- lots of duplicate inclusion requires all header files to be guarded
- lots of duplicate inclusion may slow build times (some compilers (e.g., gcc) specially optimize the header-file-guard #ifdef/#ifndef idiom, but there's still some extra overhead)
- looking at xxxx.cc, it's relatively hard to identify the full set of all files it includes, or perhaps more importantly, it's relatively hard to tell whether or not it directly *or* *indirectly* includes some particular header file that's just had a must-check-all-clients change committed

Header files never include other header files:
- if the list-of-prerequisites header comment is wrong, this may not be noticed for some time
- programmer has to do the transitive-closure-of-needed-set-of-includes manually, and perhaps worse, has to keep it up to date as the software structure changes and evolves (this means that if some low-level header acquires a new prerequisite, many higher-level clients may need an extra #include)
- xxxx.cc now starts with a rather lengthy list of includes (e.g., the top-level driver routine of my current C++ project shows 10 #include <...> statements, followed by 24 #include "..." statements)
+ there are no duplicate includes, so build times are fast
+ looking at xxxx.cc it's easy to tell whether or not it depends on some particular header file, and more generally, it's easy to tell what other project subsystems it uses
+ this approach follows the KISS rule
Are there other significant advantages/disadvantages?
-- -- "Jonathan Thornburg [remove -animal to reply]" <jth...@astro.indiana-zebra.edu> Dept of Astronomy, Indiana University, Bloomington, Indiana, USA "C++ is to programming as sex is to reproduction. Better ways might technically exist but they're not nearly as much fun." -- Nikolai Irgens
Jonathan Thornburg wrote: > shahav <sha...@gmail.com> wrote: >> Two rules which we follow in my team >> * each xxxx.cc file first statement is: #include xxxx.h - thats >> proofs, in some sense, that our H files are compile-able stand alone.
> Interesting. Since xxxx.h may want to reference various parts of the > standard library, this approach requires that xxxx.h begins with > #include <iostream> > #include <string> > #include <vector> > #include <algorithm> > #include "project_global_configuration.h" > #include "low_level_class_used_by_xxxx.h" > #include "another_low_level_class_used_by_xxxx.h" > and so on for any other standard library or project facilities needed > in xxxx.h.
> An alternative design is that header files never include other header > files, but rather state their prerequisites in comments, and push the > find-the-transitive-closure-of-the-needed-set-of-includes problem > back to the programmer. (....)
> I can see advantages (+) and disadvantages (-) of each approach: > Header files include all prerequisites: > + header file is cleanly self-contained for clients > + list of prerequisites is implicitly checked by the compiler, i.e., > a missing prerequisite will be spotted at the next compile (although > unnecessary prerequisites may go unnoticed) > - lots of duplicate inclusion requires all header files to be guarded
I don't see this as a (-). For me, header guards via #ifndef or some #pragma are, in 99.99% of cases, just part of the C or C++ language. A header file that lacks a guard is just buggy :-)
> - (...) > - looking at xxxx.cc, it's relatively hard to identify the full set > of all files it includes, or perhaps more importantly, it's relatively > hard to tell whether or not it directly *or* *indirectly* includes > some particular header file that's just had a must-check-all-clients > change committed
There are tools for things like that.
> Header files never include other header files:
Intriguing. I have never seen this approach. (Not to say it doesn't exist; I haven't seen that much.)
> - (...) (...) > + there are no duplicate includes, so build times are fast
This claim should only be made after measurement under real conditions. I doubt the difference is significant for modern gcc / Visual Studio.
> + (...) > + this approach follows the KISS rule
"Keep it simple, stupid" - ? What's simple about having to maintain header dependencies for each and every cpp file? I think it's just stupid ;-)
> shahav <sha...@gmail.com> wrote: >> Two rules which we follow in my team >> * each xxxx.cc file first statement is: #include xxxx.h - thats >> proofs, in some sense, that our H files are compile-able stand alone.
> Interesting. Since xxxx.h may want to reference various parts of the > standard library, this approach requires that xxxx.h begins with > #include <iostream> > #include <string> > #include <vector> > #include <algorithm> > #include "project_global_configuration.h" > #include "low_level_class_used_by_xxxx.h" > #include "another_low_level_class_used_by_xxxx.h" > and so on for any other standard library or project facilities needed > in xxxx.h.
> An alternative design is that header files never include other header > files, but rather state their prerequisites in comments, and push the > find-the-transitive-closure-of-the-needed-set-of-includes problem > back to the programmer. With this approach xxxx.cc begins with > #include <iostream> > #include <string> > #include <vector> > #include <algorithm> > #include "project_global_configuration.h" > #include "low_level_class_used_by_xxxx.h" > #include "another_low_level_class_used_by_xxxx.h"
> I can see advantages (+) and disadvantages (-) of each approach: > Header files include all prerequisites: > + header file is cleanly self-contained for clients > + list of prerequisites is implicitly checked by the compiler, i.e., > a missing prerequisite will be spotted at the next compile (although > unnecessary prerequisites may go unnoticed)
This thing with the unnecessary header files goes for both methods.
> - lots of duplicate inclusion requires all header files to be guarded
Can you really leave header files unguarded with other methods? IMO if you have a project of a certain complexity, it's difficult to guarantee that a header isn't included twice, so you have to have guards, no matter what. Unless you use special techniques like the pimpl idiom throughout.
> - lots of duplicate inclusion may slow build times (some compilers > (e.g., gcc) specially optimize the header-file-guard #ifdef/#ifndef > idiom, but there's still some extra overhead)
Is this overhead relevant? You generally have the header files in the system cache during a build.
> - looking at xxxx.cc, it's relatively hard to identify the full set > of all files it includes, or perhaps more importantly, it's relatively > hard to tell whether or not it directly *or* *indirectly* includes > some particular header file that's just had a must-check-all-clients > change committed
> Header files never include other header files: > - if the list-of-prerequisites header comment is wrong, this may not > be noticed for some time > - programmer has to do the transitive-closure-of-needed-set-of-includes > manually, and perhaps worse, has to keep it up to date as the software > structure changes and evolves (this means that if some low-level > header acquires a new prerequisite, many higher-level clients may > need an extra #include) > - xxxx.cc now starts with a rather lengthly list of includes > (e.g., the top-level driver routine of my current C++ project shows > 10 #include <...> statements, followed by 24 #include "..." statements) > + there are no duplicate includes, so build times are fast
I don't think that duplicate includes have a significant impact on build times. Is this really so?
> + looking at xxxx.cc it's easy to tell whether or not it depends on > some particular header file, and more generally, it's easy to tell > what other project subsystems it uses
IMO this is not correct. Since there is no compiler support for showing unnecessary include files, changes in the code tend to leave unnecessary include files around. Since there are typically many include files with complex dependencies, it's also not a simple task to get down to the minimal set, so generally there are unneeded include files.
> + this approach follows the KISS rule
I disagree. I think manual management of include dependencies is not simple at all; it's a real pain. I'd call it following the MIOC principle... :)
> Are there other significant advantages/disadvantages?
I think it's almost impossible to avoid including headers in other headers completely, unless you use exclusively certain coding techniques, and never use a class in another class as member, for example. I don't see a single outstanding advantage of avoiding this, and the single outstanding advantage of doing it consistently is the compiler's management of include dependencies. IMO that's an application of the KISS principle.
Jonathan Thornburg wrote: > I can see advantages (+) and disadvantages (-) of each approach: > Header files include all prerequisites: > + header file is cleanly self-contained for clients > + list of prerequisites is implicitly checked by the compiler, i.e., > a missing prerequisite will be spotted at the next compile > (although unnecessary prerequisites may go unnoticed) > - lots of duplicate inclusion requires all header files to be guarded
I would make that at worst a neutral point. It is good style to have include-guards in your header files anyway.
> - lots of duplicate inclusion may slow build times (some compilers > (e.g., gcc) specially optimize the header-file-guard #ifdef/#ifndef > idiom, but there's still some extra overhead) > - looking at xxxx.cc, it's relatively hard to identify the full set > of all files it includes, or perhaps more importantly, it's > relatively hard to tell whether or not it directly *or* > *indirectly* includes some particular header file that's just had a > must-check-all-clients change committed
+ it is possible to use a coding style where the start of xxxx.cc tells you exactly from which headers facilities are used by the code in xxxx.cc itself
> Header files never include other header files: > - if the list-of-prerequisites header comment is wrong, this may not > be noticed for some time > - programmer has to do the > transitive-closure-of-needed-set-of-includes > manually, and perhaps worse, has to keep it up to date as the > software structure changes and evolves (this means that if some > low-level header acquires a new prerequisite, many higher-level > clients may need an extra #include) > - xxxx.cc now starts with a rather lengthly list of includes > (e.g., the top-level driver routine of my current C++ project shows > 10 #include <...> statements, followed by 24 #include "..." > statements) > + there are no duplicate includes, so build times are fast > + looking at xxxx.cc it's easy to tell whether or not it depends on > some particular header file, and more generally, it's easy to tell > what other project subsystems it uses > + this approach follows the KISS rule
I am sorry, but I don't buy that last one. Having to keep track of all dependencies yourself is not something that I could call KISS.
And I would like to add:
- Even an innocent interface change (such as adding another function) can cause an avalanche of edits all over the source code.
- This approach is completely unacceptable for many organisations for third-party libraries, because integrating updates will be too painful.
- You can't know if a header was included because it is needed by the code of xxxx.cc itself or because it was a prerequisite for some other header.
> The include mechanism used in C++ is working just fine, but often > programmers tend to add redundant #include when a simple class fwd > declaration will do.
> The problem get worse if a basic H file, one that many other files > refer, has redundant dummy #includes, in such case a small > modification may trigger a complete rebuild.
In other words, the root of the problem is redundant includes.
> Two rules which we follow in my team > * each xxxx.cc file first statement is: #include xxxx.h - thats > proofs, in some sense, that our H files are compile-able stand alone. > * each H file xxxx.cc is guarded with #ifdef __xxxx___
This does not solve the problem that a header modification triggers a complete rebuild. The guard only decides whether the header text is processed again: if __xxxx___ is undefined the header gets read, otherwise it has already been read. In either case a change in that header triggers recompilation of every file that includes it directly or indirectly.
Stick with the standard include guards, weed out unnecessary includes from headers, and, more importantly, use the pimpl idiom to hide the class implementation, so that the includes needed only by the implementation go into the implementation source files.
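A rough sketch of the pimpl idiom mentioned above, collapsed into one translation unit. The class History and the file names in the comments are hypothetical; the point is only that <vector> and <numeric> appear in the implementation part and not in the header part.

```cpp
// ---- history.h (hypothetical header): only <memory> is needed here ----
#include <memory>

class History {
public:
    History();
    ~History();                     // must be defined where Impl is complete
    void record(int sample);
    int total() const;
private:
    struct Impl;                    // declared here, defined in history.cc
    std::unique_ptr<Impl> impl_;
};

// ---- history.cc (hypothetical source): implementation-only includes ----
#include <numeric>
#include <vector>

struct History::Impl {
    std::vector<int> samples;
};

History::History() : impl_(new Impl) {}
History::~History() = default;

void History::record(int sample) { impl_->samples.push_back(sample); }

int History::total() const {
    return std::accumulate(impl_->samples.begin(), impl_->samples.end(), 0);
}

// ---- client code: sees only history.h ----
#include <cassert>

inline void use_history() {
    History h;
    h.record(2);
    h.record(5);
    assert(h.total() == 7);
}
```

Replacing the vector with another container would then touch history.cc only, so clients of history.h would need relinking but not recompiling.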
Gerhard Fiedler <geli...@gmail.com> wrote: > I think it's almost impossible to avoid including headers in other > headers completely, unless you use exclusively certain coding > techniques, and never use a class in another class as member, for > example.
Could you explain why this is the case? I've been following the headers-don't-include-other-headers rule for 10+ years and haven't noticed any particular difficulties. Obviously different projects and environments have different constraints, and this thread has certainly presented arguments that headers-don't-include-other-headers might not be the *best* style guideline, but I don't see how it's "almost impossible".
Perhaps our discussion might be clearer if we focus on a specific example: Consider the following inheritance hierarchy, taken from a C++ project I'm currently working on. (This needs a window at least 119 characters wide to display the ASCII-art properly.)
***** Notation *****
D ==> B    D is-a B, i.e. D is derived from B
C --> M    C has-a or points-to-a M, i.e. C contains one or more members of type M, M*, and/or M&
Berger_Oliger<physics_system, FD_scheme> | | V mesh<physics_system, FD_scheme> | | V chunk<physics_system, FD_scheme> ==> FD_scheme::chunk_base<physics_system> ==> chunk_base_common<physics_system> | | | V | radial_fn | V
As you can see, most of the higher-level classes are templates with 2 template parameters, physics_system and FD_scheme. These templates are all instantiated 4 times, once for each combination of two different choices of physics_system and two different choices of FD_scheme.
With headers-include-other-headers, Berger_Oliger.cc would begin with #include "mesh.hh" together with any auxiliary classes directly used in the Berger_Oliger API. With headers-don't-include-other-headers, Berger_Oliger.cc would begin with includes for everything shown in the diagram, together with all the auxiliary classes used by those classes.
I can understand how one can reasonably debate which of these approaches is a preferable programming style, but I'm puzzled how either approach is "almost impossible".
-- -- "Jonathan Thornburg [remove -animal to reply]" <jth...@astro.indiana-zebra.edu> Dept of Astronomy, Indiana University, Bloomington, Indiana, USA "Most investment bankers' [...] idea of a long-term investment is thirty-six hours" -- Robert Townsend, "Up the Organization"
Jonathan Thornburg wrote: > Gerhard Fiedler <geli...@gmail.com> wrote: >> I think it's almost impossible to avoid including headers in other >> headers completely, unless you use exclusively certain coding >> techniques, and never use a class in another class as member, for >> example.
> Could you explain why this is the case? I've been following the > headers-don't-include-other-headers rule for 10+ years and haven't > had noticed any particular difficulties. Obviously different projects > and environments have different constraints, and this thread has > certainly presented arguments that headers-don't-include-other-headers > might not be the *best* style guideline, but I don't see how it's > "almost impossible".
Then how do you write the header file for the definition of a class that uses a std::string?
class x {
    std::string name;
public:
    // whatever
};
Or are you suggesting that every file that uses an x object must not only #include the header file for x but also the header <string>? That seems to result in repetitive coding which most would wish to avoid.
Francis Glassborow <francis.glassbo...@btinternet.com> wrote: > how do you write the header file for the definition of a class that > uses a std::string?
> class x { > std::string name; > public: > // whatever > };
> Or are you suggesting that every file that uses an x object must not > only #include the header file for x but also the header <string>.
Yes, that's what I'm suggesting, or at least suggesting as a reasonable alternative to be considered. ("One size fits all" is almost never a good style guide in the C++ world. Or even in C, for that matter!) More precisely, I'm suggesting that in this style the header file (let's call it x.hh) should read
//
// prerequisites:
//   <string>
//

//
// This class ...
//
class x {
    std::string name;
public:
    // whatever
};
I would like to hope that before writing #include "x.hh" a programmer will first read enough of x.hh to understand the client API provided by class x. In this programming style a prerequisite #include <string> is part of the "#include API", and is explicitly documented in x.hh .
[I emphasize that I am *not* claiming any originality in this scheme. For example, Rob Pike famously advocated it 20 years ago in his "Notes on Programming in C", http://www.lysator.liu.se/c/pikestyle.html . Of course, that doesn't mean it's necessarily a good idea for C++ today -- that's the topic under discussion.]
> That > seems to result in repetitive coding which most would wish to avoid.
Yes, this is a disadvantage of the header-don't-include-other-headers scheme. But is it a large enough disadvantage to worry about? The effort of typing (and getting correct and maintaining) #include <string> is surely much smaller than the effort of understanding and (correctly) using any nontrivial class x. And if I forget the #include or get it wrong, the compiler will complain the next time I try to compile.
The one disadvantage that does seem possibly-significant to me is that headers-include-other-headers lets a header provide a slightly cleaner abstraction (one with slightly less "mental workload") to clients:
// instructions in "foo.hh includes whatever other stuff it needs" style:
To use the foo subsystem, just
    #include "foo.hh"
then follow the comments in "foo.hh" describing the care and feeding of foo objects.

as compared with

// instructions in "clients of foo.hh include whatever other stuff foo.hh needs" style:
To use the foo subsystem, just
    #include <iostream>
    #include <string>
    #include <vector>
    #include <map>
    #include "baz.hh"
    #include "quux.hh"
    #include "foo.hh"
then follow the comments in "foo.hh" describing the care and feeding of foo objects.
I don't want to be too dogmatic: In the past I've always used the headers-don't-include-other-headers style, but maybe I'm wrong and should change in some (all?) circumstances. I find the responses in this thread (including those that I might disagree with) very insightful!
ciao,
-- -- "Jonathan Thornburg [remove -animal to reply]" <jth...@astro.indiana-zebra.edu> Dept of Astronomy, Indiana University, Bloomington, Indiana, USA "Washing one's hands of the conflict between the powerful and the powerless means to side with the powerful, not to be neutral." -- quote by Freire / poster by Oxfam
> Jonathan Thornburg wrote: >> Gerhard Fiedler <geli...@gmail.com> wrote: >>> I think it's almost impossible to avoid including headers in other >>> headers completely, unless you use exclusively certain coding >>> techniques, and never use a class in another class as member, for >>> example.
>> Could you explain why this is the case? I've been following the >> headers-don't-include-other-headers rule for 10+ years and haven't >> had noticed any particular difficulties. Obviously different projects >> and environments have different constraints, and this thread has >> certainly presented arguments that headers-don't-include-other-headers >> might not be the *best* style guideline, but I don't see how it's >> "almost impossible".
> Then how do you write the header file for the definition of a class that > uses a std::string?
> class x { > std::string name; > public: > // whatever > };
> Or are you suggesting that every file that uses an x object must not > only #include the header file for x but also the header <string>. That > seems to result in repetitive coding which most would wish to avoid.
In this case, if the header starts using a new class, say std::list<>, and does not include <list>, this change breaks the compilation of all files that include that header. That is hardly a good practice.
Jonathan Thornburg wrote: > Francis Glassborow <francis.glassbo...@btinternet.com> wrote: >> how do you write the header file for the definition of a class that >> uses a std::string?
>> class x { >> std::string name; >> public: >> // whatever >> };
>> Or are you suggesting that every file that uses an x object must not >> only #include the header file for x but also the header <string>.
> Yes, that's what I'm suggesting, or at least suggesting as a reasonable > alternative to be considered. ("One size fits all" is almost never a > good style guide in the C++ world. Or even in C, for that matter!) > More precisely, I'm suggesting that in this style the header file > (let's call it x.hh) should read
> // > // prerequisites: > // <string> > //
> // > // This class ... > // > class x { > std::string name; > public: > // whatever > };
> I would like to hope that before writing #include "x.hh" a programmer > will first read enough of x.hh to understand the client API provided > by class x. In this programming style a prerequisite > #include <string> > is part of the "#include API", and is explicitly documented in x.hh .
That seems to get awfully close to making me understand implementation details. When that class changes and no longer requires <string>, how will I know? I don't claim the impact of unnecessary headers is always a problem, but it will never be zero, and it sure seems like a suboptimal design to have unnecessary headers or to force me to stay current on the implementations of all classes.
> I'm suggesting that in this style the header file > (let's call it x.hh) should read
> // > // prerequisites: > // <string> > //
> // > // This class ... > // > class x { > std::string name; > public: > // whatever > };
> I would like to hope that before writing #include "x.hh" a programmer > will first read enough of x.hh to understand the client API provided > by class x. In this programming style a prerequisite > #include <string> > is part of the "#include API", and is explicitly documented in x.hh .
Maxim Yegorushkin <maxim.yegorush...@gmail.com> wrote: > In this case if the header start using a new class, say std::list<>, and > does not include <list>, this change breaks the compilation of all files > that include that header
In the headers-don't-include-other-headers style, the "prerequisites" comment will now read
    //
    // prerequisites:
    //   <string>
    //   <list>
    //
and all clients are required to update their source code accordingly.
> hardly a good practice.
Hmm. I agree that it's not an ideal situation, but it seems to me that it's basically the same as that for any other change in class x that's not 100% backwards compatible. Perhaps it's just that in my experience, 100% backwards compatible changes are rare.
In fact, adding new #include prerequisites is a lot *less* onerous than a non-backwards-compatible API change, because it (new #include) only requires a mechanical change for each *file* using the API, whereas the (in my experience far more common) non-backwards-compatible API change may require non-trivial (sometimes *very* non-trivial) changes at each *call site*.
-- -- "Jonathan Thornburg [remove -animal to reply]" <jth...@astro.indiana-zebra.edu> Dept of Astronomy, Indiana University, Bloomington, Indiana, USA "Most investment bankers' [...] idea of a long-term investment is thirty-six hours" -- Robert Townsend, "Up the Organization"
> In an earlier posting in this thread, I wrote: >> I'm suggesting that in this style the header file >> (let's call it x.hh) should read
>> // >> // prerequisites: >> //<string> >> //
>> // >> // This class ... >> // >> class x { >> std::string name; >> public: >> // whatever >> };
>> I would like to hope that before writing #include "x.hh" a programmer >> will first read enough of x.hh to understand the client API provided >> by class x. In this programming style a prerequisite >> #include<string> >> is part of the "#include API", and is explicitly documented in x.hh .
> Maxim Yegorushkin<maxim.yegorush...@gmail.com> wrote: >> In this case if the header start using a new class, say std::list<>, and >> does not include<list>, this change breaks the compilation of all files >> that include that header
> In the headers-don't-include-other-headers style, the "prerequisites" > comment will now read > // > // prerequisites: > //<string> > //<list> > // > and all clients are required to update their source code accordingly.
In my personal opinion this is rather unsatisfactory.
With self-contained headers (headers that include other headers as needed), a new header dependency only triggers recompilation of the clients; no editing of clients is required (provided the interfaces don't change).
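A sketch of what a self-contained x.hh could look like after it starts using std::list. The guard name X_HH_INCLUDED, the aliases member and the accessor functions are invented for the example; only the class name x and its std::string member come from the discussion.

```cpp
// ---- x.hh (sketch): the header includes everything it uses ----
#ifndef X_HH_INCLUDED               // hypothetical guard name
#define X_HH_INCLUDED

#include <cstddef>
#include <list>                     // new dependency, added here and nowhere else
#include <string>

class x {
    std::string name;
    std::list<std::string> aliases; // hypothetical new member that needs <list>
public:
    explicit x(const std::string& n) : name(n) {}
    void add_alias(const std::string& a) { aliases.push_back(a); }
    std::size_t alias_count() const { return aliases.size(); }
    const std::string& get_name() const { return name; }
};

#endif  // X_HH_INCLUDED

// ---- client.cc (sketch): written before <list> was needed, still unchanged ----
#include <cassert>

inline void use_x() {
    x obj("foo");
    obj.add_alias("bar");
    assert(obj.get_name() == "foo");
    assert(obj.alias_count() == 1);
}
```

The client never mentions <string> or <list>; when x.hh gains a dependency, the client is recompiled but its source stays as it was.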
A good practice used in Boost to ensure that headers are self-contained is to order includes from the most local headers (project headers) to the standard headers. This way a missing include in a header causes a compile-time error, because the standard headers it relies on are less likely to have been included already.
Example from boost/interprocess/mapped_region.hpp:

//////////////////////////////////////////////////////////////////////////////
//
// (C) Copyright Ion Gaztanaga 2005-2008. Distributed under the Boost
// Software License, Version 1.0. (See accompanying file
// LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
//
// See http://www.boost.org/libs/interprocess for documentation.
//
//////////////////////////////////////////////////////////////////////////////

#if (defined BOOST_INTERPROCESS_WINDOWS)
#  include <boost/interprocess/detail/win32_api.hpp>
#else
#  ifdef BOOST_HAS_UNISTD_H
#    include <fcntl.h>
#    include <sys/mman.h>     //mmap
#    include <unistd.h>
#    include <sys/stat.h>
#    include <sys/types.h>
#    include <sys/shm.h>
#    include <cassert>
#  else
#    error Unknown platform
#  endif

#endif   //#if (defined BOOST_INTERPROCESS_WINDOWS)
...
In this header boost includes go first followed by standard C++ includes, followed by system includes. (This is taken from a header, but the same include order principle applies to source files).
>> hardly a good practice.
> Hmm. I agree that it's not an ideal situation, but it seems to me > that it's basically the same as that for any other change in class x > that's not 100% backwards compatible. Perhaps it's just that in my > experience, 100% backwards compatible changes are rare.
At work we use interfaces and the Pimpl idiom a lot. In the source control system there are many more revisions to .cc files than to the corresponding .h files. In other words, implementation source files change far more often than the interface headers.
Separating interfaces and implementation does require more discipline and upfront thinking, but it makes the code base less entangled and more flexible. This way, changes to implementation source files only trigger relinking of the clients rather than recompilation (relinking is still needed because, when a shared library is relinked, make cannot know whether its binary interface has actually changed and thus has to relink its clients).
> In fact, adding new #include prerequisites is a lot *less* onerous > than a non-backwards-compatible API change, because it (new #include) > only requires a mechanical change for each *file* using the API, > whereas the (in my experience far more common) non-backwards-compatible > API change may require non-trivial (sometimes *very* non-trivial) > changes at each *call site*.
True.
The original question, however, was concerned with redundant includes which trigger recompilation that otherwise would be unnecessary.
A good way to cope with this problem has long been known:
* Make header files self-contained by having them include whatever they require.
* Don't expose implementation details in header files; separate interfaces from implementation. This can be done with the Pimpl idiom, or with abstract interfaces and factory functions.
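As a rough illustration of the second point, a Pimpl sketch might look like the following. The class name Widget, its members and the file split are assumptions made for this example, and it uses std::unique_ptr, so it assumes a C++11 compiler. Both "files" are shown in one listing.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// ---- widget.hh (hypothetical interface header) ----
// Only the headers needed by the interface appear here; whatever the
// implementation uses never leaks to clients.
class Widget {
public:
    explicit Widget(const std::string& name);
    ~Widget();                               // defined where Impl is complete
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
    void add_tag(const std::string& tag);
    std::size_t tag_count() const;
private:
    struct Impl;                             // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cc (hypothetical implementation file) ----
#include <list>   // implementation detail, invisible to clients of widget.hh

struct Widget::Impl {
    std::string name;
    std::list<std::string> tags;
};

Widget::Widget(const std::string& name) : impl_(new Impl{name, {}}) {}
Widget::~Widget() = default;
void Widget::add_tag(const std::string& tag) { impl_->tags.push_back(tag); }
std::size_t Widget::tag_count() const { return impl_->tags.size(); }
```

Replacing std::list with another container changes only widget.cc, so clients that include widget.hh are relinked but not recompiled. The price is one extra heap allocation and one extra indirection per object.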
Jonathan Thornburg wrote:
> In an earlier posting in this thread, I wrote:
>> I'm suggesting that in this style the header file
>> (let's call it x.hh) should read
>>
>> //
>> // prerequisites:
>> // <string>
>> //
>>
>> //
>> // This class ...
>> //
>> class x {
>> std::string name;
>> public:
>> // whatever
>> };
>> (...)
> Maxim Yegorushkin <maxim.yegorush...@gmail.com> wrote:
>> In this case if the header start using a new class, say std::list<>, and
>> does not include <list>, this change breaks the compilation of all files
>> that include that header

> In the headers-don't-include-other-headers style, the "prerequisites"
> comment will now read
> //
> // prerequisites:
> // <string>
> // <list>
> //
> and all clients are required to update their source code accordingly.
>> hardly a good practice.
> Hmm. I agree that it's not an ideal situation, but it seems to me > that it's basically the same as that for any other change in class x > that's not 100% backwards compatible. Perhaps it's just that in my > experience, 100% backwards compatible changes are rare.
> In fact, adding new #include prerequisites is a lot *less* onerous > than a non-backwards-compatible API change, because it (new #include) > only requires a mechanical change for each *file* using the API, > (...)
I will argue that even pure mechanical changes to code can be bad.
1.) It will mess up the change history in any SCC. For a given change, someone looking at the history will see a change to a header + n * cc-files where most of the cc-file changes are just edits to include a new header.
2.) For concurrent development, it will increase the number of files needing a merge (costing developer time) even though no functional change to these files has been applied.
Jonathan Thornburg wrote:
> Maxim Yegorushkin <maxim.yegorush...@gmail.com> wrote:
>> In this case if the header start using a new class, say std::list<>,
>> and does not include <list>, this change breaks the compilation of
>> all files that include that header

> In the headers-don't-include-other-headers style, the "prerequisites"
> comment will now read
> //
> // prerequisites:
> // <string>
> // <list>
> //
> and all clients are required to update their source code accordingly.
>> hardly a good practice.
> Hmm. I agree that it's not an ideal situation, but it seems to me > that it's basically the same as that for any other change in class x > that's not 100% backwards compatible. Perhaps it's just that in my > experience, 100% backwards compatible changes are rare.
> In fact, adding new #include prerequisites is a lot *less* onerous > than a non-backwards-compatible API change, because it (new #include) > only requires a mechanical change for each *file* using the API, > whereas the (in my experience far more common) > non-backwards-compatible API change may require non-trivial (sometimes > *very* non-trivial) changes at each *call site*.
Have you ever followed your approach in a project that involves ten developers and hundreds of thousands of lines of code? I seriously doubt it. The costs of having to check out, change, and check in hundreds of files just because an implementation detail of a frequently used, low-level utility class changes are just prohibitive.
Leaving header file inclusion to the end client amounts to needless duplication of preprocessor directives and replaces automatic handling by manual, error-prone management. It is inconceivable to me that this should work in any project that involves more than a single developer and more than a few dozens of source files.
-- Gerhard Menzl
Non-spammers may respond to my email address, which is composed of my full name, separated by a dot, followed by at, followed by "fwz", followed by a dot, followed by "aero".
On Nov 9, 4:27 am, Maxim Yegorushkin <maxim.yegorush...@gmail.com> wrote: ...
> The original question, however, was concerned with redundant includes > which trigger recompilation that otherwise would be unnecessary.
> A good way to cope with this problem has long been known: > * Make header files self-contained by having them include whatever they > require. > * Don't expose implementation details in header files by separating > interfaces and implementation. It can be done with by utilizing Pimpl > idiom or abstract interfaces and factory functions.
> -- > Max
In the last year, I've been working on a medium size project, say about 120mm. The code was a "wild west" - no standards at all. We modified the code by introducing a coding style. Besides being backed by powerful regression tests, we sometimes compared assembly code to verify automatically applied coding style changes, e.g. removing tons of redundant (), or preferring xxx-> over (*xxx). What I'm looking for is a tool with a capability similar to Doxygen's, to analyze the code and remove redundant includes. A different option would be writing a script which iteratively compiles the same module and on each iteration adds -D__xxxx__ to prevent that module's participation, which has a similar effect to removing the include.... Still, I don't want to reinvent the wheel...
I wrote [[after an implementation change so that a header file which formerly only had <string> as a prerequisite, now also has <list> as a prerequisite]]
> In the headers-don't-include-other-headers style, the "prerequisites"
> comment will now read
> //
> // prerequisites:
> // <string>
> // <list>
> //
> and all clients are required to update their source code accordingly.

Gerhard Menzl <clcppm-pos...@this.is.invalid> wrote:
> Have you ever followed your approach in a project that involves ten
> developers and hundreds of thousands of lines of code?
Not in the sense that we're discussing here. I did use it successfully on a project of that size... but it was a mixed-language project with a C core and individual modules in mostly C, Fortran 77, and Fortran 90, with only a few C++ modules (of which mine was one). By virtue of deliberate project design, there was only very limited interaction between modules, and in fact almost all of that interaction had to fit into C semantics. (Ick!)
> I seriously doubt > it. The costs of having to check out, change, and check in hundreds of > files just because an implementation detail of a frequently used, > low-level utility class changes are just prohibitive.
> Leaving header file inclusion to the end client amounts to needless > duplication of preprocessor directives and replaces automatic handling > by manual, error-prone management. It is inconceivable to me that this > should work in any project that involves more than a single developer > and more than a few dozens of source files.
You and many others in this thread make cogent points. You've persuaded me to try header-files-include-their-prerequisites the next time I'm in a position to do so.
Again, my thanks to all who have contributed to this thread -- I've found the various points of view to be very enlightening.
ciao,
-- -- "Jonathan Thornburg [remove -animal to reply]" <jth...@astro.indiana-zebra.edu> Dept of Astronomy, Indiana University, Bloomington, Indiana, USA "being able to send information a little bit faster than light is like being a little bit pregnant: it has implications which go far beyond the immediate direct importance of the phenomenon itself." -- Henry Spencer
> On Nov 9, 4:27 am, Maxim Yegorushkin<maxim.yegorush...@gmail.com> > wrote: > ...
>> The original question, however, was concerned with redundant includes >> which trigger recompilation that otherwise would be unnecessary.
>> A good way to cope with this problem has long been known: >> * Make header files self-contained by having them include whatever they >> require. >> * Don't expose implementation details in header files by separating >> interfaces and implementation. It can be done with by utilizing Pimpl >> idiom or abstract interfaces and factory functions.
>> -- >> Max
> In the last year, I've being working on a medium size project, say > about 120mm. The code was a "wiled west" - no standards at all. > We did modify the code by adding coding style. beside being supported > by a powerful regression sometimes we compared assembly code to > verify automatic applied coding style changes, e.g. removing tones of > redundant (), or preferring xxx-> over (*xxx).
Hm, (*xxx). is two symbols longer than xxx->. Why is the first one better?
> What i'm looking for, is a similar tool/capability as in doxigen, to > analyze the code and remove redundant includes. > A different option would be writing a script which iterate-ly compiles > the same module and on each iteration add -D__xxxx__ to prevent that > module participation, similar effect as removing the include....
The algorithm could work as follows:
1) Read a header file.
2) Cut all #include lines from the header, remove duplicates and store them as an array-of-includes. Let N be the length of the array.
3) All possible variants of removing headers can be represented by a string of 1s and 0s of N bits.
4) Treat that string of N bits as a binary number.
5) Do a loop for i in range(0, 2 ** N).
5.1) For every set bit in i paste the corresponding include line into the header (with all includes removed in step 2) and try to compile the header.
5.2) If the header compiles, save the header and terminate the loop. Note that counting up from 0 does not visit the values in order of set bits (3 has two bits set but comes before 4, which has one), so to guarantee the least number of includes the values of i should be tried in order of increasing number of set bits.
5.3) Otherwise, next iteration.
The complexity of this algorithm is O(2 ** N), where N is the number of unique includes in the header. Might not scale well to large Ns.
One shortcoming of this algorithm is that a successful compile may be triggered too early by an unrelated header which happens to include other headers that are required by the header being analyzed. To overcome this to some extent it may be a good idea to start analyzing headers with the least number of includes first.
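A minimal sketch of that search, under some assumptions: the "try to compile" step is passed in as a callback (CompileCheck is a hypothetical stand-in for writing the header out and invoking the real compiler), and subsets are tried in order of increasing size so that the first success really has the fewest includes. All names are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using IncludeList = std::vector<std::string>;
// Hypothetical hook: returns true if the header compiles with only these includes.
using CompileCheck = std::function<bool(const IncludeList&)>;

// Tries subsets of `includes` from smallest to largest and stores the first
// one accepted by `compiles` in `result`. Returns false if not even the full
// list compiles. Worst case is still 2 ** N compile attempts.
inline bool find_fewest_includes(const IncludeList& includes,
                                 const CompileCheck& compiles,
                                 IncludeList& result) {
    const std::size_t n = includes.size();
    for (std::size_t k = 0; k <= n; ++k) {
        // keep[i] == 1 means includes[i] is pasted back into the header.
        std::vector<int> keep(n, 0);
        std::fill(keep.begin(), keep.begin() + k, 1);
        do {
            IncludeList candidate;
            for (std::size_t i = 0; i < n; ++i)
                if (keep[i]) candidate.push_back(includes[i]);
            if (compiles(candidate)) {
                result = candidate;
                return true;
            }
        } while (std::prev_permutation(keep.begin(), keep.end()));
    }
    return false;
}
```

In a real script the callback would rewrite the header with the candidate includes and run the compiler on it. The shortcoming described above still applies, since a header kept for an unrelated reason may drag in what is actually needed.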
Is there a more interesting solution for automatically removing redundant headers?
> > In the last year, I've being working on a medium size project, say > > about 120mm. The code was a "wiled west" - no standards at all. > > We did modify the code by adding coding style. beside being supported > > by a powerful regression sometimes we compared assembly code to > > verify automatic applied coding style changes, e.g. removing tones of > > redundant (), or preferring xxx-> over (*xxx).
> Hm, (*xxx). is two symbols longer than xxx->. Why is the first one better?
Round brackets '()' violate the natural reading order from left to right, e.g.
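One possible illustration of the left-to-right point, with types Inner and Outer invented purely for this sketch (they are not from the project discussed):

```cpp
struct Inner { int value; };
struct Outer { Inner* inner; };

// With parentheses the expression has to be read from the inside out.
inline int read_with_parens(Outer* o) { return (*(*o).inner).value; }

// With -> the same access reads left to right.
inline int read_with_arrows(Outer* o) { return o->inner->value; }
```

For raw pointers both functions perform exactly the same access, which is why such a rewrite can be checked by comparing the generated assembly.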
> > What i'm looking for, is a similar tool/capability as in doxigen, to > > analyze the code and remove redundant includes. > > A different option would be writing a script which iterate-ly compiles > > the same module and on each iteration add -D__xxxx__ to prevent that > > module participation, similar effect as removing the include....
> The algorithm could work as follows:
> 1) Read a header file. > 2) Cut all #include lines from the header, remove duplicates and store > them as an array-of-includes. Let N be the length of the array. > 3) All possible variants of removing headers can be represented by a > string of 1s and 0s of N bits. > 4) Treat that string of N bits as a binary number. > 5) Do a loop for i in range(0, 2 ** N). > 5.1) For every set bit in i paste a corresponding include line into the > header (with all includes removed on step 2) and try to compile the header. > 5.2) If the header compiles, save the header and terminate the loop. As > it iterates from 0, current i has the least number of bits set and > therefore corresponds the least number of includes. > 5.3) Otherwise, next iteration.
> The complexity of this algorithm is O(2 ** N), where N is the number of > unique includes in the header. Might not scale well to large Ns.
> One shortcoming of this algorithm is that a successful compile may be > triggered too early by an unrelated header which happens to include > other headers that are required by the header being analyzed. To > overcome this to some extent it may be a good idea to start analyzing > headers with the least number of includes first.
> Is there a more interesting solution for automatically removing > redundant headers?
> -- > Max
First, it makes sense to replace all #includes with their transitive closure; otherwise, the order in which the header files are processed affects the optimality of the result. Once that is done, N could be quite big, say over 64, and therefore the exhaustive search may not be applicable. R.
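A sketch of computing that transitive closure, assuming the direct includes of each header have already been extracted into a map (IncludeGraph and the function name are hypothetical, and the extraction itself is not shown):

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a header name to the headers it includes directly.
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Returns every header reachable from `header` through direct or indirect
// includes. The `seen` set also protects against include cycles.
inline std::set<std::string> transitive_includes(const IncludeGraph& graph,
                                                 const std::string& header) {
    std::set<std::string> seen;
    std::vector<std::string> pending;  // headers still to be expanded
    pending.push_back(header);
    while (!pending.empty()) {
        const std::string current = pending.back();
        pending.pop_back();
        const auto it = graph.find(current);
        if (it == graph.end()) continue;  // leaf or unknown header
        for (const std::string& dep : it->second) {
            if (seen.insert(dep).second) pending.push_back(dep);
        }
    }
    return seen;
}
```

The size of the returned set is the N that the exhaustive search would then have to cope with, which is how it can easily exceed 64.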