April 2016

Volume 31 Number 4

[Visual C++]

Microsoft Pushes C++ into the Future

By Kenny Kerr | April 2016

Visual C++ has a reputation for lagging behind the curve. If you need the latest and greatest C++ features you should just use Clang or GCC, or so the oft-repeated story goes. I would like to suggest that there has been a change to the status quo, a glitch in the matrix, a disturbance in the force, if you will. It’s true that the Visual C++ compiler has an incredibly old code base that has made it difficult for the C++ team at Microsoft to add new features rapidly (goo.gl/PjSC7v). This is starting to change, though, with Visual C++ being ground zero for many new proposals to the C++ language and the Standard Library. I’m going to highlight a few new or improved features in the Visual C++ Update 2 release that I’ve found particularly compelling and that illustrate that there is life yet in this tenured compiler.

Modules

A few developers at Microsoft, notably Gabriel Dos Reis and Jonathan Caves, have been working on a design to add componentization support directly to the C++ language. A secondary goal is to improve build throughput, akin to a precompiled header. This design, called a module system for C++, has been proposed for C++ 17, and the new Visual C++ compiler provides a proof of concept and the start of a working implementation for modules in C++. Modules are designed to be very straightforward and natural to both create and consume for any developer using standard C++. Make sure you’ve got Visual C++ Update 2 installed, open a developer command prompt and follow along as I show you how. As the feature is still quite experimental, it lacks any IDE support and the best way to get started is by using the compiler directly from the command prompt.

Let’s imagine I have an existing C++ library that I’d like to distribute as a module, perhaps something elaborate, like this:

C:\modules> type animals.h
#pragma once
#include <stdio.h>
inline void dog()
{
  printf("woof\n");
}
inline void cat()
{
  printf("meow\n");
}

I might also have a compelling sample app to accompany my domesticated library:

C:\modules> type app.cpp
#include "animals.h"
int main()
{
  dog();
  cat();
}

Pressure from C++ activists has caused me to blush over the use of printf, but I can’t deny its unmatched performance, so I decide to turn the library into a module to obscure the truth that I prefer printf over other forms of I/O. I can start by writing the module interface:

C:\modules> type animals.ixx
module animals;
#include "animals.h"

I could, of course, just define the cat and dog functions right inside the module interface file, but including them works just as well. The module declaration tells the compiler that what follows is part of the module, but that doesn’t mean subsequent declarations are all exported as part of the module’s interface. So far, this module doesn’t export anything, unless the stdio.h header that animals.h includes happens to export something all by itself. I can even guard against that by including stdio.h prior to the module declaration. So if this module interface doesn’t actually declare any public names, how do I go about exporting something for others to consume? I need to use the export keyword; together with the module and import keywords, it is one of only three additions to the C++ language that I need to think about. This speaks to the beautiful simplicity of this new language feature.
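As an aside, the guard I just mentioned might look like this (a sketch, not the interface file I use for the rest of the article); including stdio.h ahead of the module declaration keeps it out of the module entirely:

C:\modules> type animals.ixx
#include <stdio.h>
module animals;
#include "animals.h"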

As a start, I can export the cat and dog functions. This involves updating the animals.h header and beginning both declarations with the export specifier as follows:

C:\modules> type animals.h
#pragma once
#include <stdio.h>
export inline void dog()
{
  printf("woof\n");
}
export inline void cat()
{
  printf("meow\n");
}

I can now compile the module interface file using the compiler’s experimental module option:

C:\modules> cl /c /experimental:module animals.ixx

Notice that I also included the /c option to instruct the compiler merely to compile but not link the code. At this stage, it wouldn’t make sense to have the linker attempt to create an executable. The module option instructs the compiler to produce a file containing metadata describing the interface and the implementation of the module in a binary format. This metadata isn’t machine code, but rather a binary representation for C++ language constructs. However, it’s also not source code, which is both good and bad, depending on how you look at it. It’s good in that it should improve build throughput because apps that might import the module don’t need to parse the code anew. On the other hand, it also means that there isn’t necessarily any source code for traditional tools like Visual Studio and its IntelliSense engine to visualize and parse. That means that Visual Studio, and other tools, need to be taught how to mine and visualize code within a module. The good news is that the code or metadata inside a module is stored in an open format and tools can be updated to deal with it.
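On my machine, that metadata lands in its own file next to the object file. The exact file name and extension are an implementation detail of the experimental toolset, so treat this listing as illustrative:

C:\modules> dir /b animals.*
animals.h
animals.ifc
animals.ixx
animals.obj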

Moving on, the app can now import the module rather than the library header directly:

C:\modules> type app.cpp
import animals;
int main()
{
  dog();
  cat();
}

The import declaration instructs the compiler to look for a matching module interface file. It can then use that, along with any other includes that might be present in the app, to resolve the dog and cat functions. Thankfully, the animals module exports a pair of furry functions and the app can be recompiled using the same module command-line option:

C:\modules> cl /experimental:module app.cpp animals.obj

Notice this time that I allow the compiler to call the linker because I now actually want to produce an executable. The experimental module option is still required because the import keyword is not yet official. Further, the linker also requires the object file that was produced when the module was compiled. This again hints at the fact that the new binary format containing the module’s metadata isn’t actually the “code,” but merely a description of the exported declarations, functions, classes, templates and so on. At the point at which you want to actually build the app that uses the module, you still need the object file to allow the linker to do its job of assembling the code into an executable. If all went well, I now have an executable I can run just like any other—the end result is no different from the original app using the header-only library. Put another way, a module is not a DLL.
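Running it (cl names the executable app.exe after the first source file by default) is as unremarkable as you’d hope:

C:\modules> app.exe
woof
meow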

Now, I happen to work on a rather large library, and the thought of adding export to every declaration is not at all appealing. Fortunately, the export declaration can export more than just functions. One option is to export a bunch of declarations with a pair of braces, as follows:

C:\modules> type animals.h
#pragma once
#include <stdio.h>
export
{
  inline void dog()
  {
    printf("woof\n");
  }
  inline void cat()
  {
    printf("meow\n");
  }
}

This doesn’t introduce a new scope and is merely used to group any contained declarations for export. Of course, no self-respecting C++ programmer would write a library with a bunch of declarations at global scope. Rather, it’s far more likely my animals.h header declared the dog and cat functions inside a namespace, and the namespace as a whole can be exported quite simply:

C:\modules> type animals.h
#pragma once
#include <stdio.h>
export namespace animals
{
  inline void dog()
  {
    printf("woof\n");
  }
  inline void cat()
  {
    printf("meow\n");
  }
}

Another subtle benefit of moving from a header-only library to a module is that the app can no longer accidentally take a dependency on stdio.h because that’s not part of the module’s interface. What if my header-only library includes a nested namespace including implementation details not intended for apps to use directly? Figure 1 shows a typical example of such a library.

Figure 1 Header-Only Library with an Implementation Namespace

C:\modules> type animals.h
#pragma once
#include <stdio.h>
namespace animals
{
  namespace impl
  {
    inline void print(char const * const message)
    {
      printf("%s\n", message);
    }
  }
  inline void dog()
  {
    impl::print("woof");
  }
  inline void cat()
  {
    impl::print("meow");
  }
}

A consumer of this library knows not to take a dependency on anything in the implementation namespace. Of course, the compiler won’t stop nefarious developers from doing just that:

C:\modules> type app.cpp
#include "animals.h"
using namespace animals;
int main()
{
  dog();
  cat();
  impl::print("rats");
}

Can modules help here? Sure, but keep in mind that the design of modules is based upon keeping the feature as small and simple as possible. So once a declaration is exported, everything inside it is exported unconditionally:

C:\modules> type animals.h
#pragma once
#include <stdio.h>
export namespace animals
{
  namespace impl
  {
    // Sadly, this is exported, as well
  }
  // This is exported
}

Fortunately, as Figure 2 shows, you can rearrange the code such that the animals::impl namespace is declared separately while preserving the library’s namespace structure.

Figure 2 Preserving the Library’s Namespace Structure

C:\modules> type animals.h
#pragma once
#include <stdio.h>
namespace animals
{
  namespace impl
  {
    // This is *not* exported -- yay!
  }
}
export namespace animals
{
  // This is exported
}

Now all we need is for Visual C++ to implement nested namespace definitions and this becomes quite a bit prettier to look at and a lot easier to manage for libraries with many nested namespaces:

C:\modules> type animals.h
#pragma once
#include <stdio.h>
namespace animals::impl
{
  // This is *not* exported -- yay
}
export namespace animals
{
  // This is exported
}

Hopefully this feature will arrive in Visual C++ Update 3. Fingers crossed! As it stands, the animals.h header will break existing apps that simply include the header and are perhaps built with a compiler that doesn’t yet support modules. If you need to support existing library users while slowly transitioning them to modules, you can use the dreaded preprocessor to smooth things over during the transition. This isn’t ideal; many of the newer C++ language features, including modules, are designed to make programming C++ without macros increasingly plausible. Still, until modules actually land in C++ 17 and commercial implementations are available to developers, I can use a little preprocessor trickery to make the animals library build both as a header-only library and as a module. Inside my animals.h header, I can conditionally define the ANIMALS_EXPORT macro as nothing and use that to precede any namespaces I’d like to export if this were a module (see Figure 3).

Figure 3 Building a Library Both as a Header-Only Library and as a Module

C:\modules> type animals.h
#pragma once
#include <stdio.h>
#ifndef ANIMALS_EXPORT
#define ANIMALS_EXPORT
#endif
namespace animals { namespace impl {
// Please don't look here
}}
ANIMALS_EXPORT namespace animals {
// This is all yours
}

Now any developer unfamiliar with modules, or lacking an adequate implementation, can simply include the animals.h header and use it just like any other header-only library. I can, however, update the module interface to define ANIMALS_EXPORT so that this same header can produce a set of exported declarations, as follows:

C:\modules> type animals.ixx
module animals;
#define ANIMALS_EXPORT export
#include "animals.h"
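Building the two flavors then looks much as before; the only difference is whether the consumer includes the header or imports the module (I’m assuming the same experimental switches shown earlier):

C:\modules> cl /c /experimental:module animals.ixx
C:\modules> cl /experimental:module app.cpp animals.obj

A consumer that knows nothing about modules simply compiles against the header as it always has:

C:\modules> cl app.cpp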

Like many C++ developers today, I dislike macros and would rather live in a world without them. Still, this is a useful technique as you transition a library to modules. Best of all, while the app that includes the animals.h header will see the benign macro, it won’t be visible at all to those who simply import the module. The macro is stripped out prior to the creation of the module’s metadata and, as a result, will never bleed into the app or any other libraries and modules that might make use of it.
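If you’d like to convince yourself of that, a quick hypothetical probe such as this should compile cleanly when built against the module, because ANIMALS_EXPORT never makes it into the module’s metadata (I’m assuming the dog function from the earlier listings is still among the exported declarations):

C:\modules> type probe.cpp
import animals;
#ifdef ANIMALS_EXPORT
#error The macro leaked out of the module
#endif
int main()
{
  animals::dog();
}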

Modules are a welcome addition to C++ and I look forward to a future update to the compiler with full commercial support. For now, you can experiment along with us as we push the C++ standard forward with the prospect of a module system for C++. You can learn more about modules by reading the technical specification (goo.gl/Eyp6EB) or by watching a talk given by Gabriel Dos Reis at CppCon last year (youtu.be/RwdQA0pGWa4).

Coroutines

While coroutines, previously called resumable functions, have been around a little longer in Visual C++, I continue to be excited about the prospect of having true coroutine support in the C++ language—with its deep roots in the stack-based language design of C. As I was thinking about what to write, it dawned on me that I wrote not just one but at least four articles about the topic for MSDN Magazine. I suggest you start with the latest article in the October 2015 issue (goo.gl/zpP0AO), where you’ll be introduced to the coroutines support provided in Visual C++ 2015. Rather than rehash the benefits of coroutines, let’s drill into them a little further. One of the challenges with getting coroutines adopted by C++ 17 is that the standardization committee didn’t like the idea of coroutines relying on automatic return type deduction. The type of the coroutine can be deduced by the compiler such that the developer doesn’t have to think about what that type might be:

auto get_number()
{
  await better_days {};
  return 123;
}

The compiler is more than able to produce a suitable coroutine type, and arguably this was inspired by C++ 14, which allows functions to have their return type deduced:

auto get_number()
{
  return 123;
}

Still, the standardization committee is not yet comfortable with this idea being extended to coroutines. The problem is that the C++ Standard Library doesn’t provide suitable candidates, either. The closest approximation is the clumsy std::future with its often heavy implementation and its very impractical design. It also doesn’t help much in the way of asynchronous streams produced by coroutines that yield values rather than simply returning a single value asynchronously. So if the compiler can’t provide a type and the C++ Standard Library doesn’t provide a suitable type, I need to look a little closer to see how this actually works if I’m going to make any progress with coroutines. Imagine I have the following dummy awaitable type:

struct await_nothing
{
  bool await_ready() noexcept
  {
    return true;
  }
  void await_suspend(std::experimental::coroutine_handle<>) noexcept
  {}
  void await_resume() noexcept
  {}
};

It doesn’t do anything, but allows me to construct a coroutine by awaiting on it:

coroutine<void> hello_world()
{
  await await_nothing{};
  printf("hello world\n");
}

Again, if I can’t rely on the compiler automatically deducing the coroutine’s return type and I choose not to use std::future, then how might I define this coroutine class template?

template <typename T>
struct coroutine;

Because I’m already running out of space for this article, let’s just look at the example of a coroutine returning nothing, or void. Here’s the specialization:

template <>
struct coroutine<void>
{
};

The first thing the compiler does is look for a promise_type on the coroutine’s return type. There are other ways to wire this up, particularly if you need to retrofit coroutine support into an existing library, but because I’m writing the coroutine class template I can simply declare it right there:

template <>
struct coroutine<void>
{
  struct promise_type
  {
  };
};

Next, the compiler will look for a return_void function on the coroutine promise, at least for coroutines that don’t return a value:

struct promise_type
{
  void return_void()
  {}
};

While return_void doesn’t have to do anything, it can be used by different implementations as a signal that the logical result of the coroutine is ready to be inspected. The compiler also looks for a pair of initial_suspend and final_suspend functions:

struct promise_type
{
  void return_void()
  {}
  bool initial_suspend()
  {
    return false;
  }
  bool final_suspend()
  {
    return true;
  }
};

The compiler uses these to inject some initial and final code into the coroutine, which tells the scheduler whether to begin the coroutine in a suspended state and whether to suspend the coroutine prior to completion. This pair of functions can actually return awaitable types so that in effect the compiler could await on both as follows:

coroutine<void> hello_world()
{
  coroutine<void>::promise_type & promise = ...;
  await promise.initial_suspend();
  await await_nothing{};
  printf("hello world\n");
  await promise.final_suspend();
}

Whether or not to await, and thus inject a suspension point, depends on what you’re trying to achieve. In particular, if you need to query the coroutine following its completion, you’ll want to ensure that there’s a final suspension; otherwise, the coroutine will be destroyed before you get a chance to query for any value captured by the promise.
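For example (a sketch of the trade-off just described, not a change I make to the running sample), a promise that never needs to be inspected after the fact could simply decline the final suspension:

struct promise_type
{
  // ...
  bool final_suspend()
  {
    // No final suspension point: the coroutine runs off the end and its frame
    // is cleaned up automatically, so the caller must not query (or destroy)
    // the handle afterward
    return false;
  }
};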

The next thing the compiler looks for is a way to get the coroutine object from the promise:

struct promise_type
{
  // ...
  coroutine<void> get_return_object()
  {
    return ...
  }
};

The compiler makes sure the promise_type is allocated as part of the coroutine frame. It then needs a way to produce the coroutine’s return type from this promise. This then gets returned to the caller. Here I must rely on a very low-level helper class provided by the compiler called a coroutine_handle and currently provided in the std::experimental namespace. A coroutine_handle represents one invocation of a coroutine; thus, I can store this handle as a member of my coroutine class template:

template <>
struct coroutine<void>
{
  // ...
  coroutine_handle<promise_type> handle { nullptr };
};

I initialize the handle with a nullptr to indicate that the coroutine isn’t currently in flight, but I can also add a constructor to explicitly associate a handle with a newly constructed coroutine:

explicit coroutine(coroutine_handle<promise_type> coroutine) :
  handle(coroutine)
{}

The coroutine frame is somewhat like a stack frame, but is a dynamically allocated resource and must be destroyed, so I’ll naturally use the destructor for that:

~coroutine()
{
  if (handle)
  {
    handle.destroy();
  }
}

I should also delete the copy operations and allow move semantics, but you get the idea. I can now implement the promise_type’s get_return_object function to act as a factory for coroutine objects:

struct promise_type
{
  // ...
  coroutine<void> get_return_object()
  {
    return coroutine<void>(
      coroutine_handle<promise_type>::from_promise(this));
  }
};
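As an aside, the copy and move housekeeping I glossed over a moment ago might look something like this (my own sketch, not part of the article’s listing):

coroutine(coroutine const &) = delete;
coroutine & operator=(coroutine const &) = delete;
coroutine(coroutine && other) noexcept :
  handle(other.handle)
{
  // The moved-from object gives up its handle so only one owner destroys the frame
  other.handle = nullptr;
}
coroutine & operator=(coroutine && other) noexcept
{
  if (this != &other)
  {
    if (handle)
    {
      handle.destroy();
    }
    handle = other.handle;
    other.handle = nullptr;
  }
  return *this;
}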

I should now have enough for the compiler to produce a coroutine and kick it into life. Here, again, is the coroutine followed by a simple main function:

coroutine<void> hello_world()
{
  await await_nothing{};
  printf("hello world\n");
}
int main()
{
  hello_world();
}

I haven’t done anything with the result of hello_world, yet running this program causes printf to be called and the familiar message to be printed to the console. Does that mean the coroutine actually completed? Well, I can ask the coroutine that question:

int main()
{
  coroutine<void> routine = hello_world();
  printf("done: %s\n", routine.handle.done() ? "yes" : "no");
}

This time I’m not doing anything with the coroutine but asking whether it’s done, and sure enough it is:

hello world
done: yes

Recall that the promise_type’s initial_suspend function returns false so the coroutine itself doesn’t begin life suspended. Recall also that await_nothing’s await_ready function returns true, so that doesn’t introduce a suspension point, either. The end result is a coroutine that actually completes synchronously because I gave it no reason to do otherwise. The beauty is that the compiler is able to optimize coroutines that behave synchronously and apply all of the same optimizations that make straight-line code so fast. Still, this isn’t very exciting, so let’s add some suspense, or at least some suspension points. This can be as simple as changing the await_nothing type to always suspend, even though it has nothing to do:

struct await_nothing
{
  bool await_ready() noexcept
  {
    return false;
  }
  // ...
};

In this case, the compiler will see that this awaitable object isn’t ready, suspend the coroutine at that point and return control to the caller. Now if I return to my simple hello world app:

int main()
{
  hello_world();
}

I’ll be disappointed to find that this program doesn’t print anything. The reason should be obvious: The coroutine suspended prior to calling printf and the caller that owns the coroutine object didn’t give it an opportunity to resume. Naturally, resuming a coroutine is as simple as calling the handle-provided resume function:

int main()
{
  coroutine<void> routine = hello_world();
  routine.handle.resume();
}

Now the hello_world function again returns without calling printf, but the resume function will cause the coroutine to complete. To illustrate further, I can use the handle’s done method before and after resuming, as follows:

int main()
{
  coroutine<void> routine = hello_world();
  printf("done: %s\n", routine.handle.done() ? "yes" : "no");
  routine.handle.resume();
  printf("done: %s\n", routine.handle.done() ? "yes" : "no");
}

The results clearly show the interaction between the caller and the coroutine:

done: no
hello world
done: yes

This could be very handy, particularly in embedded systems that lack sophisticated OS schedulers and threads, as I can write a lightweight cooperative multitasking system quite easily:

while (!routine.handle.done())
{
  routine.handle.resume();
  // Do other interesting work ...
}

Coroutines aren’t magical, nor do they require complex scheduling or synchronization logic to get them to work. Supporting coroutines with return types involves replacing the promise_type’s return_void function with a return_value function that accepts a value and stores it inside the promise. The caller can then retrieve the value when the coroutine completes. Coroutines that yield a stream of values require a similar yield_value function on the promise_type, but are otherwise essentially the same. The hooks provided by the compiler for coroutines are quite simple yet amazingly flexible. I’ve only scratched the surface in this short overview, but I hope it’s given you an appreciation for this amazing new language feature.
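To make that concrete, here’s a minimal sketch of how the primary coroutine template might handle a returned value, following the same informal conventions as the void specialization (the names are mine, and I’m assuming the handle exposes its promise as described in the technical specification):

template <typename T>
struct coroutine
{
  struct promise_type
  {
    T value {};
    coroutine get_return_object()
    {
      return coroutine(
        coroutine_handle<promise_type>::from_promise(this));
    }
    bool initial_suspend()
    {
      return false;
    }
    bool final_suspend()
    {
      // Keep the completed frame around so the value can still be read
      return true;
    }
    void return_value(T result)
    {
      // Called for "return result;" inside the coroutine
      value = result;
    }
  };
  explicit coroutine(coroutine_handle<promise_type> coroutine) :
    handle(coroutine)
  {}
  ~coroutine()
  {
    if (handle)
    {
      handle.destroy();
    }
  }
  T get() const
  {
    // Assumes the handle provides access to its promise
    return handle.promise().value;
  }
  coroutine_handle<promise_type> handle { nullptr };
};

With something like that in place, a caller could write coroutine<int> routine = get_number(); and read routine.get() once the coroutine has completed, resuming it first if it suspended along the way.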

Gor Nishanov, another developer on the C++ team at Microsoft, continues to push coroutines toward eventual standardization. He’s even working on adding support for coroutines to the Clang compiler! You can learn more about coroutines by reading the technical specification (goo.gl/9UDeoa) or by watching a talk given by Nishanov at CppCon last year (youtu.be/_fu0gx-xseY). James McNellis also gave a talk on coroutines at Meeting C++ (youtu.be/YYtzQ355_Co).

There’s so much more happening with C++ at Microsoft. We’re adding new C++ language features, including variable templates from C++ 14 that allow you to define a family of variables (goo.gl/1LbDJ2). Neil MacIntosh is working on new proposals to the C++ Standard Library for bounds-safe views of strings and sequences. You can read up on span<> and string_span at goo.gl/zS2Kau and goo.gl/4w6ayn, and there’s even an implementation available of both (GitHub.com/Microsoft/GSL).
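To illustrate the variable templates feature mentioned above, here’s the canonical example (mine, not taken from the proposal): a single definition of pi usable at whatever precision you ask for:

template <typename T>
constexpr T pi = T(3.1415926535897932385L);

double circumference(double radius)
{
  // pi<float> and pi<long double> work the same way
  return 2.0 * pi<double> * radius;
}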

On the back end, I recently discovered that the C++ optimizer is a lot smarter than I thought when it comes to optimizing away calls to strlen and wcslen when called with string literals. That’s not particularly new, even if it’s a well-guarded secret. What is new is that Visual C++ finally implements the complete empty base optimization, which it has lacked for well over a decade. Applying __declspec(empty_bases) to a class results in all direct empty base classes being laid out at offset zero. This isn’t yet the default because it would require a major version update to the compiler to introduce such a breaking change, and there are still some C++ Standard Library types that assume the old layout. Still, library developers can finally take advantage of this optimization. Modern C++ for the Windows Runtime (moderncpp.com) particularly benefits from this and is actually the reason why this feature was finally added to the compiler. As I mentioned in the December 2015 issue, I recently joined the Windows team at Microsoft to build a new language projection for the Windows Runtime based on moderncpp.com and this is also helping to push C++ forward at Microsoft. Make no mistake, Microsoft is serious about C++.


Kenny Kerr is a software engineer on the Windows team at Microsoft. He blogs at kennykerr.ca and you can follow him on Twitter: @kennykerr.

Thanks to the following Microsoft technical expert who reviewed this article: Andrew Pardoe

