Exploring Clang Tooling, Part 0: Building Your Code with Clang

This post is part of a regular series of posts where the C++ product team and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc.

Today’s post is by guest author Stephen Kelly, who is a developer at Havok, a contributor to Qt and CMake and a blogger. This is the first post in a series where he is sharing his experience using Clang tooling in his current team.

Conformance and Compatibility

During the long development of C++ compilers so far, a few compilers have dominated in mainstream domains. MSVC 6 and GCC 2.95 have sometimes been described as ‘too successful’. Their success has had the lasting effect of driving C++ adoption, but they had many incompatibilities with each other, resulting in beautiful walled gardens. In some cases (such as omitting the typename keyword inside templates) this made development tasks more convenient, even if the resulting code diverged from the C++ standard or from GCC. For developers working on code intended to be portable between different platforms, this has been a lasting annoyance – colleagues would check in code without the required typename keyword, forcing other colleagues who use a more conforming compiler to repair the code.

On Unix-like platforms where GCC dominated, standards conformance has held a higher status, in part due to the Free Software culture of many compatible tools (and people!) which can collaborate in the venue of the standard. Clang, a more recent entry into the field of C++ compilers, shared that aim of standards conformance and gained market share for several reasons, among them licence choice, implementation language choice, cross-compilation and improved diagnostic messages. Such competition has benefited both GCC and Clang as they match each other in terms of compliance and developer utility. Part of the strategy employed by Clang was to be compatible with GCC: in behavior, in reproducing GCC bugs which are relied upon in standard Unix headers, and in providing a ‘compiler driver’ which is compatible on the command line with GCC in terms of warning flags and other compile options.

The MSVC compiler has not been left behind on any front and is implementing and driving standards conformance for developers who value portability. The /permissive- option is standard conformant modulo bugs. Those remaining bugs will be resolved as releases continue.

Another development in the last several years is the emergence of tooling APIs for C++ based on Clang. Clang developers have created tooling which can be used to mechanically refactor large amounts of C++ code, including clang-tidy and clang-format. Custom tools based on clang-tidy in particular are the subject of the follow-up posts in this blog series.

Acquiring the Clang/LLVM Tools

Together with the use of Clang as a viable compiler for production use on Windows for multiple web browsers, all of this leaves the question of how you can benefit from Clang tooling. If your codebase has only ever been built with the MSVC compiler, the code must first be made buildable with Clang.

Installers of Clang for Windows are available here. One of the ways Clang has been able to repeat its success on Windows is to implement compatibility with MSVC. The Clang installer for Windows includes two ‘compiler drivers’. clang++ is GCC compatible even on Windows and may be used with a codebase which relies on MinGW. clang-cl is the second compiler driver shipped with the Clang installer and is compatible with the MSVC compiler driver. This compatibility appears in terms of accepting command line flags in the style of cl.exe and in terms of running Clang in a permissive mode in line with cl.exe.

After running the Clang installer, a toolset file is required to instruct Visual Studio 2017 to use clang-cl.exe instead of the native cl.exe. The ‘LLVM Compiler Toolchain’ extension for Visual Studio 2017 must be installed to enable the toolchain.

After installing the ‘LLVM Compiler Toolchain’ the platform toolset can be changed to llvm in Visual Studio, or by specifying it in your buildsystem generator, such as CMake with cmake .. -T llvm.

Using MSVC compiler’s /permissive- flag

Some of the C++ conformance issues detected by Clang are also detected by recent MSVC releases with the /permissive- flag. So, a first step to getting code to build with Clang may be to build it with MSVC Conformance mode. The downside of this is that MSVC Conformance mode is more strict than Clang in some cases (because Clang attempts to emulate permissive MSVC). Certain MSVC behaviors, such as incorrect two-phase lookup, are allowed by Clang-CL by default and rejected by MSVC Conformance mode.

Iterative Conformance

A more conforming mode of Clang-CL can be activated by adding the compile flags -fno-delayed-template-parsing -Werror=microsoft -Werror=typename-missing. Even after turning Microsoft diagnostics into errors, individual such diagnostics can be disabled if required by adding -Wno- versions of the documented flag such as -Wno-microsoft-sealed. Note that such a flag must appear after the -Werror=microsoft option.

Even with the compatibility that Clang-CL provides, a fallback option can be used to instruct the compiler to compile with CL.exe if required.

Several features of Clang may need to be enabled in the buildsystem to build the code. Use of intrinsics such as _mm_shuffle_epi8 requires the use of the -msse4.1 flag or similar.

As with enabling compiler warnings on a codebase which has never used them before, the most sensible approach is to start building with Clang using the most permissive settings necessary, and then introduce more conformance over time.

If your interest in Clang is limited to making use of semantically-rich tooling, it is important to remember that your code does not need to run after it is compiled. It also does not need to link, or even to compile in its entirety – judicious use of ifdefs can make most of your code reachable to Clang tooling while fencing off code which is too affected to compile, or code you don’t control. Some parts of the Windows headers do not compile with Clang yet. That is likely to affect your code too. Certain efforts within Microsoft are addressing these issues in shipped headers.
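
As a rough illustration of the ifdef approach, the sketch below hides one function from Clang while keeping the rest of the file visible to Clang-based tools. The function names are hypothetical and exist only for this example; the `__clang__` macro is predefined by Clang, including clang-cl.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper standing in for code that Clang parses without trouble.
inline std::string portableGreeting()
{
    return "hello";
}

#ifndef __clang__
// Hypothetical stand-in for code that only MSVC accepts today.
// Clang predefines __clang__, so Clang-based tools skip this region
// while the MSVC build still compiles it.
inline std::string legacyGreeting()
{
    return portableGreeting() + " from legacy code";
}
#endif

inline std::string greeting()
{
#ifndef __clang__
    return legacyGreeting();
#else
    // Under Clang only the portable path is visible.
    return portableGreeting();
#endif
}

inline void checkGreeting()
{
    // Holds for both branches of the ifdef.
    assert(greeting().rfind("hello", 0) == 0);
}
```

Keeping the excluded region as small as possible means the tooling still sees almost all of the code.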

Porting existing C++ code

Here are some issues that may occur when porting your code to build with Clang or /permissive-, as you increase the level of standard conformance in your build:

Only one user-defined conversion may be invoked implicitly
MSVC in permissive mode accepts C++ code which invokes two implicit conversions at once (instead of the standard-conforming one). Clang-CL does not accept such double-conversions. This can arise when converting between two distinct string types for example (implicitly to, then from const char*).
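
A minimal sketch of the double-conversion problem and one way to repair it. `LegacyString` and `ModernString` are hypothetical types invented for this example; they stand in for any two string classes that both convert through const char*.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical type that converts implicitly *to* const char*.
class LegacyString
{
public:
    LegacyString(const char* text) : m_text(text) {}
    operator const char*() const { return m_text.c_str(); }
private:
    std::string m_text;
};

// Hypothetical type that converts implicitly *from* const char*.
class ModernString
{
public:
    ModernString(const char* text) : m_text(text) {}
    const std::string& str() const { return m_text; }
private:
    std::string m_text;
};

inline std::size_t length(const ModernString& s)
{
    return s.str().size();
}

inline std::size_t lengthOfLegacy(const LegacyString& legacy)
{
    // return length(legacy);
    // would need LegacyString -> const char* -> ModernString, that is, two
    // user-defined conversions in a single implicit conversion sequence.

    // Spelling out one step leaves only one implicit user-defined conversion.
    return length(static_cast<const char*>(legacy));
}

inline void checkConversion()
{
    assert(lengthOfLegacy(LegacyString("abc")) == 3);
}
```
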
Both result statements in a ternary expression must have the same type
Where ternaries are used with two different result types, it will be necessary to ensure that both result types are the same without implicit conversion. This can occur with simple wrapper types, or more complex expressions, such as when using a smart pointer which converts implicitly to a pointer type:

smart_pointer<int> func(bool b)
{
    int* defaultPtr = nullptr;
    smart_pointer<int> sp;
    return b ? sp : defaultPtr;
}
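
One possible repair is to give both operands the same type explicitly. The `smart_pointer` below is a hypothetical, non-owning stand-in written for this sketch (the real class in your codebase will differ), and the extra `fallback` parameter exists only to make the example checkable.

```cpp
#include <cassert>

// Hypothetical minimal smart pointer: implicitly convertible both from and
// to a raw pointer, which is what makes the mixed ternary ambiguous.
template<typename T>
class smart_pointer
{
public:
    smart_pointer(T* ptr = nullptr) : m_ptr(ptr) {}
    operator T*() const { return m_ptr; }
private:
    T* m_ptr;
};

inline smart_pointer<int> func(bool b, int* fallback)
{
    smart_pointer<int> sp;
    // Both operands now have type smart_pointer<int>, so no implicit
    // conversion is needed to determine the type of the expression.
    return b ? sp : smart_pointer<int>(fallback);
}

inline void checkTernary()
{
    int value = 42;
    int* fromFallback = func(false, &value);
    int* fromDefault = func(true, &value);
    assert(fromFallback == &value);
    assert(fromDefault == nullptr);
}
```
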
Missing typename and template keywords
C++ requires the user to use the typename keyword in certain template contexts in order to disambiguate between types and values, and requires the template keyword to disambiguate between template functions and comparisons. MSVC in permissive mode does not enforce this.

In the case of variables within templates, the solution is usually to use auto instead of adding the typename keyword.
This can also occur when using type traits such as std::remove_reference. Developers might write (or copy from elsewhere in your codebase), code such as

template<typename T>
auto doSomething(T input)
{
    // Missing typename before the dependent type: this is what Clang reports
    std::remove_reference<T>::type value = input;
    ++value;
    return value;
}

and Clang rightly issues a diagnostic and recommends use of the typename keyword.

However, at this point it is obvious that developers are not going to remember to add such keywords unless they are reminded of it, so perhaps a better strategy is to teach developers to use std::remove_reference_t (available since C++14) instead and omit the trailing ::type. It is simpler all round and more portable. Similar transformations can be introduced for any custom type traits used in your code.
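
A short sketch of that strategy follows. The function `incremented` and the trait `element_of` are hypothetical names chosen for this example; `std::remove_reference_t` is the standard C++14 alias.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

template<typename T>
auto incremented(T&& input)
{
    // std::remove_reference_t<T> is an alias template, so there is no
    // dependent ::type member and no typename keyword to forget.
    std::remove_reference_t<T> value = input;
    ++value;
    return value;
}

// The same pattern for a hypothetical custom trait: write the typename
// keyword once, in the alias, instead of at every use.
template<typename Container>
struct element_of
{
    using type = typename Container::value_type;
};

template<typename Container>
using element_of_t = typename element_of<Container>::type;

static_assert(std::is_same<element_of_t<std::vector<int>>, int>::value,
              "element_of_t yields the container's value_type");

inline void checkIncremented()
{
    int original = 1;
    assert(incremented(original) == 2);
    assert(original == 1); // value was a copy; the argument is untouched
}
```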

Excessive typename keyword
Similarly, sometimes MSVC accepts a typename keyword where it should not.

Unfortunately, there are opposite cases where Clang requires a template keyword where it should not, so getting this right can at times require an #ifdef.

Static variables defined inside if statement
MSVC incorrectly accepts code such as

if (static bool oneTimeInit = true)
{
}

In a modern codebase, this should be changed to

if (static bool oneTimeInit = true; oneTimeInit)
{
}

This requires the use of the /std:c++17 flag when compiling with MSVC.

Implicit deletion of defaulted constructor
Clang does not allow explicitly defaulted constructors where they are required to initialize a union.

The solution here is to actually define the constructor instead of defaulting it:

B::B()
{
}
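
The sketch below shows one situation that fits this description; it is an assumption about the kind of code involved, not a reproduction of a specific case. Under the C++17 rules, a class with an anonymous union member that has a non-trivial default constructor gets a deleted defaulted constructor. `Wrapper` and the members of `B` are hypothetical.

```cpp
#include <cassert>

struct Wrapper
{
    Wrapper() {}     // user-provided, therefore non-trivial
    int value = 0;
};

struct B
{
    union
    {
        Wrapper wrapped; // variant member with a non-trivial default constructor
        int raw;
    };

    // B() = default;  // would be defined as deleted because of `wrapped`
    B();
};

// User-provided constructor that picks which union member to initialize.
B::B() : raw(0)
{
}

inline void checkUnion()
{
    B b;
    assert(b.raw == 0);
}
```
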
Clang does not accept the for each extension
The solution here is to use a standard C++11 range-based for loop (cxx_range_for) instead. Replace

for each (auto item in container)

with

for (auto item : container)
Qualify lookup in template-dependent base
MSVC incorrectly accepts (and disambiguates) use of declarations from template-dependent bases, though Clang does not. This is again due to lax implementation of two-phase lookup. The solution here is to prefix such calls with this-> or the name of the base class.
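
A minimal sketch of both qualification styles, using hypothetical `Base` and `Derived` templates:

```cpp
#include <cassert>

template<typename T>
struct Base
{
    int count() const { return 1; }
    int m_offset = 10;
};

template<typename T>
struct Derived : Base<T>
{
    int total() const
    {
        // return count() + m_offset;
        // is not found by conforming two-phase lookup: unqualified names
        // are not looked up in a base class that depends on T.

        // Either qualification makes the name dependent, so lookup is
        // deferred until the template is instantiated.
        return this->count() + Base<T>::m_offset;
    }
};

inline void checkDependentBase()
{
    assert(Derived<int>().total() == 11);
}
```
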
Always-false static asserts
static_assert(false) is sometimes attempted by developers who wish to issue an informative error message if there is an attempt to instantiate a template which should not be instantiated. Unfortunately it doesn’t work with two-phase lookup: the condition does not depend on a template parameter, so a conforming compiler may evaluate it when the template is defined rather than when it is instantiated. This is accepted by MSVC, but not by Clang. The solution is to make the value dependent on the template type.
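
One common way to make the value dependent is a small helper trait. In the sketch below, `always_false` and `Serializer` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Always false, but the compiler cannot know that until T is supplied.
template<typename T>
struct always_false : std::false_type
{
};

template<typename T>
struct Serializer
{
    // Fires only if this primary template is actually instantiated.
    static_assert(always_false<T>::value,
                  "Serializer<T> is not supported for this type");
};

// A supported type provides its own specialization.
template<>
struct Serializer<int>
{
    static int write(int value) { return value; }
};

inline void checkSerializer()
{
    assert(Serializer<int>::write(7) == 7);
    // Serializer<double>::write(1.0); // would trigger the static_assert
}
```
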
Specialization in different scope to declaration
MSVC allows specialization of a C++ template in a different namespace to the one it was declared in. Clang does not allow this. The specialization should be in the same namespace as the initial template.

template<typename T>
struct Templ
{
};

namespace NS {

template<>
struct Templ<int>
// error: class template specialization of 'Templ' must occur at global scope
{
};

}
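
A sketch of the repaired layout, assuming a template that lives in a hypothetical namespace `Lib` and a type from another hypothetical namespace `NS`: the namespace of the primary template is reopened for the specialization.

```cpp
#include <cassert>

namespace Lib {

template<typename T>
struct Templ
{
    static constexpr int id = 0;
};

} // namespace Lib

namespace NS {

struct Widget
{
};

} // namespace NS

// Reopen the namespace that declares the primary template, rather than
// specializing from inside NS.
namespace Lib {

template<>
struct Templ<NS::Widget>
{
    static constexpr int id = 1;
};

} // namespace Lib

inline void checkSpecialization()
{
    assert(Lib::Templ<int>::id == 0);
    assert(Lib::Templ<NS::Widget>::id == 1);
}
```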

Conclusion

It is important to realize that being able to build code with Clang is useful even if you cannot run it (due to #ifdefs or dependency issues). Running Clang-built code opens up the possibility of using Clang sanitizers for example, but just being able to compile the code with Clang enables use of source-transformation tools such as clang-tidy, clazy and clang-format. Further blog posts in this series will explore workflows for source-to-source transformation with Clang.

Please leave a comment if you encountered other changes you needed to make in your code to compile with Clang-CL!