C++ lambda preprocessor

Overview

The C++ lambda preprocessor (clamp) converts C++ code containing lambda expressions into ordinary C++ code. It was originally written by Raoul Gough, and is now maintained by Yang Zhang.

Changes that have taken place since the original clamp release:

updated the source to be more standards-conformant and build on modern platforms out of the box
added --prefix option that allows you to customize the names of the generated structs/files (to avoid name conflicts/clobbering)
added --outdir to allow you to specify an alternate output directory instead of the current working directory (to minimize clutter)

The rest of this document is mostly a direct copy of the content from the original homepage.

Introduction

The C++ lambda preprocessor (clamp) converts C++ code containing lambda expressions into ordinary C++ code. Here’s a simple example:

std::vector<int> v;
// ...
std::for_each(v.begin(), v.end(), lambda (int &p) {
  if (p == 5) p = 0;
});

This example uses the standard algorithm for_each to apply an anonymous function to each element of a vector. The anonymous function accepts an integer parameter by reference, and resets the value to zero if it is currently five (a simple, but not very useful example). The preprocessor replaces the entire lambda expression in its output, so that the C++ compiler ends up seeing something like the following:

std::for_each(v.begin(), v.end(),
  lambda_generator_1<void, int &>::generate());

The exact nature of the template lambda_generator_1 is beyond the scope of this introduction, except to say that its generate() member function returns a function object by value. The function object has, in this case, a member function void operator()(int &) which for_each applies to each element of the vector. Some people would probably prefer to use the standard transform algorithm for this example, as in:

std::transform(v.begin(), v.end(), v.begin(),
  lambda int (int p) { return p == 5 ? 0 : p; });

This example shows an anonymous function that returns a value, in this case an int. Rather than hard-wiring a value into the function body, it is also possible to include contextual information in the function object. For instance:

void reset(std::vector<int> &v, int val) {
  std::transform(v.begin(), v.end(), v.begin(),
    lambda int (int p) { return p == __ctx(val) ? 0 : p; });
}

The __ctx expression is an example of context information bound by value. The clamp preprocessor also supports reference semantics for contextual information via __ref expressions. For example:

int sum = 0;
std::for_each(v.begin(), v.end(),
  lambda (int p) { __ref(sum) += p; });

This, of course, calculates the sum of elements in the vector.
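
For comparison, here is roughly what the two kinds of context binding look like when written by hand as ordinary function objects. This is only a sketch of the idea, not clamp's actual output; the names ResetIfEqual, AddTo, total and self_check are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written stand-in for:
//   lambda int (int p) { return p == __ctx(val) ? 0 : p; }
// The context value is copied into the object when it is constructed.
struct ResetIfEqual {
  int val;  // by-value context, like __ctx(val)
  explicit ResetIfEqual(int v) : val(v) {}
  int operator()(int p) const { return p == val ? 0 : p; }
};

// Hand-written stand-in for:
//   lambda (int p) { __ref(sum) += p; }
// The object stores a reference, so updates reach the caller's variable.
struct AddTo {
  int &sum;  // by-reference context, like __ref(sum)
  explicit AddTo(int &s) : sum(s) {}
  void operator()(int p) const { sum += p; }
};

void reset(std::vector<int> &v, int val) {
  std::transform(v.begin(), v.end(), v.begin(), ResetIfEqual(val));
}

int total(const std::vector<int> &v) {
  int sum = 0;
  std::for_each(v.begin(), v.end(), AddTo(sum));
  return sum;
}

// Small usage example exercising both function objects.
void self_check() {
  std::vector<int> v;
  v.push_back(5);
  v.push_back(3);
  v.push_back(5);
  assert(total(v) == 13);
  reset(v, 5);  // the fives become zeros
  assert(total(v) == 3);
}
```

The by-value object stays valid after val goes out of scope, whereas the by-reference object must not outlive sum. The same caveat presumably applies to __ctx and __ref.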

Getting into some more complicated examples, it is possible to name the type of the function object generated by a lambda expression by simply omitting the function body. You have to do this, for instance, if you want to use an anonymous function generated by a lambda expression as a function parameter or return value. For example, the type of the expression from the previous example:

lambda (int p) { __ref(sum) += p; }

can be referred to in the code as lambda (int &) (int). The first pair of brackets contains the context binding (or closure) parameters, and the second pair contains the function parameters. The closure parameter list is optional for context-less functions, as is the return type for functions returning void, such as this one. Putting all of that together, here’s a templated function that returns a function object:

template<typename T>
lambda bool(T) (const T &)
match(const T &target) {
  return lambda bool(const T &candidate) {
    return candidate == __ctx(target);
  };
}

// Use a generated comparison object
std::vector<int>::iterator i =
  std::find_if(v.begin(), v.end(), match(7));

This find_if example returns an iterator to the first 7 in the vector (or v.end(), if none) using an instantiation of the match template with an int parameter. For a vector of strings, you could do the following:

std::vector<std::string>::iterator i =
  std::find_if(v.begin(), v.end(), match(std::string("hello")));
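
To make the role of the function object type more concrete, the following sketch writes the match template by hand, without the preprocessor. The class name Match and the helper first_index_of are hypothetical. Match<T> plays the part that the type lambda bool(T) (const T &) plays in the clamp version: one closure value of type T, one parameter of type const T &, and a bool result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical hand-written counterpart of "lambda bool(T) (const T &)".
template <typename T>
class Match {
 public:
  explicit Match(const T &target) : target_(target) {}
  bool operator()(const T &candidate) const { return candidate == target_; }

 private:
  T target_;  // copied into the object, like __ctx(target)
};

// Returns a comparison object by value, as the clamp version of match does.
template <typename T>
Match<T> match(const T &target) {
  return Match<T>(target);
}

// Index of the first element equal to target, or v.size() if there is none.
template <typename T>
std::size_t first_index_of(const std::vector<T> &v, const T &target) {
  typename std::vector<T>::const_iterator i =
      std::find_if(v.begin(), v.end(), match(target));
  return static_cast<std::size_t>(i - v.begin());
}

// Small usage example with ints and strings.
void self_check() {
  std::vector<int> v;
  v.push_back(3);
  v.push_back(7);
  assert(first_index_of(v, 7) == 1u);

  std::vector<std::string> s;
  s.push_back("hi");
  s.push_back("hello");
  assert(first_index_of(s, std::string("hello")) == 1u);
}
```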

Why a preprocessor?

I wrote the preprocessor just for fun. There doesn’t seem to be any way to achieve real lambda expressions in pure C++, since it won’t let you insert a function definition in the middle of an expression. The limits of what pure C++ allows are pretty well exhausted by the boost lambda library.

Lambda expressions simplify some coding tasks, so it would be nice to have them in C++. In the time it takes you to extract that one-liner into a named function, I bet you could write two lambda expressions. That is before counting the cases that would otherwise require a named class to hold context information.

What it does

clamp scans its input for lambda expressions, passing any plain C++ through unchanged. When it encounters a lambda expression, it extracts the function body into a separate file. It also generates a class template with a suitable operator() and (where necessary) member variables to store any context binding. This class template also goes into a separate file. The whole lambda expression is then replaced in the output by a single constructor call, which creates an object of the templated class.

The first line of the output is always a #include directive, which drags in the generated templates and (indirectly) the function bodies. The generated templates do not refer explicitly to any types used in the original lambda expressions, which is how it can be included before any user code. The actual types are only bound at the point of use. Because of this, the clamp parser doesn’t have to know what scope a lambda expression appears in, or where the required types are defined. This also makes including lambda expressions in templated code a breeze, since the type binding is done within the template scope where the expression was originally used.
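
The late type binding described above can be illustrated with a hand-written template. The sketch below is a guess at the general shape, modelled on the lambda_generator_1<void, int &>::generate() call from the introduction; the nested struct name, the inline function body and the helper reset_fives are assumptions made for illustration, and clamp's real generated files may look quite different.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical shape of a generated template. It names no concrete types:
// R (return type) and P1 (parameter type) are supplied only at the point
// where the lambda expression originally appeared.
template <typename R, typename P1>
struct lambda_generator_1 {
  struct function {
    R operator()(P1 p) const {
      // In clamp's output the body lives in a separate generated file;
      // it is written inline here to keep the sketch self-contained.
      if (p == 5) p = 0;
    }
  };
  // Returns the function object by value.
  static function generate() { return function(); }
};

void reset_fives(std::vector<int> &v) {
  // What the compiler sees in place of: lambda (int &p) { if (p == 5) p = 0; }
  std::for_each(v.begin(), v.end(),
                lambda_generator_1<void, int &>::generate());
}

// Small usage example.
void self_check() {
  std::vector<int> v;
  v.push_back(5);
  v.push_back(1);
  v.push_back(5);
  reset_fives(v);
  assert(v[0] == 0 && v[1] == 1 && v[2] == 0);
}
```

Because the template mentions only R and P1, it can be included before any user-defined types are declared, which matches the ordering described above.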

How it works

The clamp preprocessor consists of a lexical analyser (lexer) written in flex, a parser written in bison and a code generator in plain C++. The clamp parser mostly tries to ignore everything in the input file, letting the lexer copy input to output. When the lexer encounters the lambda keyword, it enters a different mode (“start condition” in flex terminology) in which it behaves like a normal lexer and supplies tokens to the parser. The parser does some messy stuff redirecting output and resetting the lexer mode as necessary.

Note: clamp is actually pretty dumb. It performs purely syntactic transformations on the input, without really understanding scope, types or variables. This will no doubt result in some incomprehensible errors from the C++ compiler if something goes wrong. This is also the reason that clamp requires the __ctx and __ref keywords, since it wouldn’t otherwise be able to tell that an expression relies on surrounding context information.
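
The two-mode scanning idea can be sketched in a few lines of plain C++. This is a hand-rolled stand-in for flex start conditions, not clamp's lexer: the names Chunk and split_lambdas are invented here, and the sketch ignores comments, string literals and body-less lambda types. It only shows the switch between copying text through and collecting a lambda expression.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One piece of input: plain C++ to copy through, or a lambda expression.
struct Chunk {
  bool is_lambda;
  std::string text;
};

static bool is_ident_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Splits the input into pass-through text and lambda expressions.
// A lambda chunk ends at the brace that matches its first '{'.
std::vector<Chunk> split_lambdas(const std::string &in) {
  std::vector<Chunk> out;
  Chunk cur;
  cur.is_lambda = false;
  int depth = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (!cur.is_lambda) {
      // Copy mode: watch for "lambda" as a whole identifier.
      bool at_word_start = (i == 0) || !is_ident_char(in[i - 1]);
      bool keyword = at_word_start && in.compare(i, 6, "lambda") == 0 &&
                     (i + 6 >= in.size() || !is_ident_char(in[i + 6]));
      if (keyword) {
        if (!cur.text.empty()) out.push_back(cur);
        cur.text.clear();
        cur.is_lambda = true;  // switch to lambda mode
        depth = 0;
      }
    }
    cur.text += in[i];
    if (cur.is_lambda) {
      if (in[i] == '{') ++depth;
      if (in[i] == '}' && depth > 0 && --depth == 0) {
        out.push_back(cur);
        cur.text.clear();
        cur.is_lambda = false;  // back to copy mode
      }
    }
  }
  if (!cur.text.empty()) out.push_back(cur);
  return out;
}

// Small usage example: one lambda expression surrounded by plain code.
void self_check() {
  std::vector<Chunk> c =
      split_lambdas("f(v, lambda (int p) { if (p) { g(); } }); h();");
  assert(c.size() == 3u);
  assert(c[1].is_lambda);
  assert(c[1].text == "lambda (int p) { if (p) { g(); } }");
}
```

Like clamp itself, the sketch is purely syntactic: it recognises the keyword and balances braces, and knows nothing about scope or types.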

Grammar

clamp introduces three keywords: lambda, __ctx and __ref. The parser recognises more or less the following grammar:

lambda-expression:
 lambda-decl lambda-body_opt
lambda-decl:
 lambda return-type_opt param-list_opt param-list
return-type:
 type-id
param-list:
 ( ) |
 ( parameter ) |
 ( parameter , … )
parameter:
 type-id identifier_opt initialiser_opt
initialiser:
 = expression
lambda-body:
 { statement_opt … }

where statement represents any valid C++ statement, possibly making use of the following extended expressions:

extended-expression:
 lambda-expression |
 __ctx ( expression ) |
 __ref ( expression )

Portability

I wrote clamp using the following tools, all under Cygwin on Windows 2000.

g++ 2.95.3-5
boost 1.25.1
flex 2.5.4
bison 1.28
gnu make 3.79.1

The preprocessor builds successfully with g++ 3.1, but the code that it generates causes an internal compiler error when taking the address of a member function. This is probably fixed in later versions of g++.

The preprocessor itself might build with yacc and traditional Unix make, and with any reasonable C++ compiler. The lexer probably won’t compile with plain lex because, according to the flex manual, lex doesn’t support exclusive start conditions.

Contact information

This page and clamp itself are Copyright (C) 2002, 2003 by Raoul Gough and may be used and distributed free of charge. Please send any clamp-related comments to . It might be a good idea to include the word “clamp” in the subject line, because that email address attracts a bit of spam.

License

C++ lambda preprocessor is released under the GNU GPL3.
