# Syntactic Aspartame – Recreational Operator Overloading

Sander Stoks

Operator overloading seems to be a controversial feature. There are opponents who feel it is merely syntactic sugar with too much potential for abuse, and it should never have made it into the language. However, one could object that `pObject->oo(x)` is merely syntactic sugar for `Foo(pObject, x)` and while many people would agree that `event += observer` is less clear than `event.Register(observer)`, many other people (especially programmers dealing with scientific applications) would certainly not like to go back to ```a = add(mul(p, q), r)```.

In fact, most of the people I know who learned C++ as a "second language" coming from C, Pascal, or Fortran pointed out operator overloading as one of the most exciting features of the language. In some cases (including my own) this lead to overzealous overloading at first. The first thing most of us tried is defining a "power" operator (didn't you?). It quickly turns out we couldn't get away with defining `operator**` (some other languages have the `**` operator for "raised to the power of") because this is parsed differently from what we expected. Next thing to try is `operator^`, which works but leads to unexpected results because of operator precedence rules.

So, maybe the initial enthousiasm has cooled off a little bit, but operator overloading is still very nice. Especially the fact that a matrix-times-vector multiplication looks much like you would write it down on paper is appealing to programmers dealing with numerical algorithms. Yet, they inevitably run into a problem with vectors.

Vectors have two kinds of multiplication, known as the dot (or inner) product, and the cross (or outer) product. For three-dimensional vectors, these are defined as

a . b = ax*bx + ay*by + az*bz
a × b = (ay*bz - az*by, az*bx - ax*bz, ax*by - ay*bx)

Unfortunately, we don't have "centered dot" or "cross" symbols at our disposal, and C++ doesn't let you define new infix operators. You're stuck with only one `operator*` for vectors, and you'll have to pick either the dot product or the cross product for your implementation.

To a mathematician, the operator symbol used for the vector product is actually often redundant. It is clear from the context which one of the two is meant: if the result appears in a scalar expression, it was the dot product; if it is in a vector expression, it was the cross product. This observation lands us on the slippery slope of Return Type Overloading, which is even more controversial than operator overloading, and which C++ doesn't support anyway.

Or does it?

As the basis of my examples, I will use a simple `Vector3` type, along with `inner()` and `outer()` functions for both types of products:

```struct Vector3
{
Vector3(double x_, double y_, double z_)
: x(x_), y(y_), z(z_) {}

double x;
double y;
double z;
};

double inner(const Vector3& lhs, const Vector3& rhs)
{
return a.x*b.x + a.y*b.y + a.z*b.z;
}

Vector3 outer(const Vector3& lhs, const Vector3& amp;rhs)
{
return Vector3(a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x);
}```

Since I will focus on multiplication, I have elided all the other stuff. If you could have code like

```Vector3 a(1, 2, 3);
Vector3 b(7, 5, 6);

double dp1 = a*b;
Vector3 cp1 = a*b;```

it would be apparent that the former multiplication is meant to be a dot product, and the latter a cross product. The first thought would be to simply define

```Vector3 operator* (const Vector3& lhs, const Vector3& rhs)
{
return outer(lhs, rhs);
}

double operator* (const Vector3& lhs, const Vector3& rhs)
{
return inner(lhs, rhs);
}```

but this is Return Type Overloading, and you can't. What you can do, however, is use Copliens proxy class pattern. You do not return a `Vector3` or a `double` from your multiplication immediately, but a special "intermediate result":

```VectorProduct operator*(const Vector3& lhs, const Vector3& rhs)
{
return VectorProduct(lhs, rhs);
}```

You define this `VectorProduct` class to be convertible into a `Vector3` or a `double`, depending on how it is "further processed":

```class VectorProduct
{
friend VectorProduct operator*(const Vector3& lhs, const Vector3& rhs);
VectorProduct(const Vector3& lhs, const Vector3& rhs): lhs_(lhs), rhs_(rhs) {}
const Vector3& lhs_;
const Vector3& rhs_;
public:
operator double() const
{
return inner(lhs_, rhs_);
}
operator Vector3() const
{
return outer(lhs_, rhs_);
}
};```

Note that its constructor is private and only accessible from the `operator*`, because that's the only context it makes sense in. With only these two additions, we have a poor man's RTO. Of course, one of the main problems with RTO is the ambiguity when an expression is used where more than one of its resulting types is valid. For example, `cout << a*b` will make your compiler complain about ambiguity, and will have to be specified with either `cout << double(a*b)` or `cout << Vector3(a*b)`.

It turns out there is also a way to define new infix operators. So as not to spoil the surprise, I will fist show the resulting client code:

```Vector3 a(1, 2, 3);
Vector3 b(7, 5, 6);

double dp2 = a dot b;
Vector3 cp2 = a cross b;```

This looks rather un-C++. It uses Copliens trick twice, with a bit of preprocessor added. When shown the definitions of `dot` and `cross`, you can probably guess how it works:

```#define dot *Dot*
#define cross *Cross*```

(Of course, the convention is to write preprocessor definitions in BIG_UGLY_CAPS, but that would have "given it away" instantly.) `Dot` and `Cross` are defined as follows:

```const struct Inner {} Dot;
const struct Outer {} Cross;```

The idea is that `a dot b` is parsed into `(a*Dot)*b`, where you can make up your own definition of `a*Dot`; in this case, resulting in an intermediate result - sort of a "half-way dot product". I've thrown in some gratuitous template code because this makes it reusable for the cross product later on:

```template <typename T>
struct VecOp
{
VecOp(const Vector3& v) : v_(v) {}
const Vector3& v_;
};

VecOp<Inner> operator* (const Vector3& lhs, const Inner& op)
{
return VecOp<Inner>(lhs);
}```

This intermediate result is then multiplied by b, and this multiplication is again overridden to yield the expected dot product:

```double operator* (const VecOp<Inner>& lhs, const Vector3& rhs)
{
return inner(lhs.v_, rhs);
}```

The version for the cross product is very similar:

```Vector3 operator* (const VecOp<Outer>& lhs, const Vector3& rhs)
{
return outer(lhs.v_, rhs);
}```

You may be tempted to use a similar trick to give you an infix ** operator, but this has some serious drawbacks. First, observe that dereferencing a `double` has no meaning. This is exactly what the compiler tells you if you accidentally type

```double a = 3;
double b = 2;
double c = a ** b;```

My compiler says "Illegal indirection", because `a ** b` is parsed as `a*(*b)`. If you could somehow make `*b` return a proxy type, you would be ready. While you are free to overload the "unary *" operator for your own types, you cannot overload `operator*(double)` because it's a builtin type (even though this particular operator isn't provided). You therefore need one change in the spelling to make the above code work:

```double a = 3;
Double b = 2;
double c = a ** b;```

Note that `b` is now of type `Double`, a little wrapper around the builtin `double`:

```struct Double
{
Double() {}
Double(double d): d_(d) {}
operator double() { return d_; }
double d_;
};```

so that you can define

```class Exponent
{
friend Exponent operator*(const Double& d);
Exponent(const Double& d): d_(d.d_) {}
public:
const double& d_;
};

Exponent operator*(const Double& d)
{
return Exponent(d);
}

double operator*(double lhs, const Exponent& rhs)
{
return pow(lhs, rhs.d_);
}```

Unfortunately, this can never work for literals. This is a nuisance since this is what a "power" operator would be used most often with. With a little help from the preprocessor:

`#define POW **(Double)`

you could type

```double a = 3;
double c = a POW 2;```

But note that this operator still gets its precedence wrong, since you would expect that `2*a**b` is parsed as `2*(a**b)`, yet it is instead parsed as `(2*a)*(*b)`, or `(2*a)` to-the-power-of `b`. Therefore, it is best to just stick to using the regular `pow(a, b)` function. Note that for the vector product examples above precedence was not a problem because of the linearity of the products.

Next, I will show one last trick which makes range checks look more "mathematic". Beginning C programmers often type in something like this:

`if (lower < x < upper) { ... }`

`if (lower < x && x < upper) { ... }`

Unfortunately, the VC6 and VC7.1 compilers only emit a warning if you try the former syntax: "unsafe use of type '`bool`' in operation". Even worse: Many compilers (I tried GCC 2.95 and GCC 3.3.3, and even Comeau 4.3.3 with the `-remarks` option) don't feel there's anything wrong with it at all, yet the meaning is quite opposite from what you would expect. It may or may not evaluate to the correct answer: If `lower < x`, the first part of the expression is `true`, and if `upper > 1`, the entire expression evaluates to `true` too (which may be correct). However, if `lower >= x`, then the first part is `false`; the expression becomes ```if (0 < upper)```, and if `upper` is then positive, the entire expression also evaluates to `true` (which is definitely incorrect)!

However, if you are prepared to make at least lower be of a custom type (`Double`, for instance), we can apply the proxy trick once more:

```class Comparison
{
friend Comparison operator<(const Double& lhs, const double& rhs);
Comparison(const double& lhs, const double& rhs) : lhs_(lhs), rhs_(rhs) {}
public:
const double& lhs_;
const double& rhs_;
operator bool() const { return lhs_ < rhs_; }
};

Comparison operator< (const Double& lhs, const double& rhs)
{
return Comparison(lhs.d_, rhs);
}

bool operator< (const Comparison& lhs, const double& rhs)
{
return lhs && lhs.rhs_ < rhs;
}```

This makes the former `if` statement work. Of course, this is highly dangerous code, especially if your compiler doesn't emit a warning if you accidentally leave the lower bound a builtin type.

The observation that comparison operators need not return a `bool` is quite interesting though, and I will conclude with a way to define new infix operators using a syntax which looks slightly more C++-ish by maximizing the use of angle brackets. First, let me set the stage. Given a simple type representing rectangles:

```typedef struct
{
int left;
int top;
int right;
int bottom;
} Rect;```

you want to write a function which tells you whether a certain rectangle is fully contained within another. One way of doing this would be a simple function:

```bool contains(const Rect& lhs, const Rect& rhs)
{
return lhs.left   <= rhs.left &&
lhs.top    <= rhs.top &&
lhs.right  >= rhs.right &&
lhs.bottom >= rhs.bottom;
}```

but when you see the function being used, you can't immediately guess which rectangle is supposed to be contained within which: `if (contains(a, b))` is at least slightly ambiguous (to the human reader, that is). If you make it a member function of the `struct`, the ambiguity goes away: `if (a.contains(b))` but we've been taught not to add too many member functions (or member operators, for that matter) if these functions can be written as external functions just as well. It looks like this "containment" function would be an ideal candidate for a new infix operator. Let's re-use the trick this article started out with using a slightly different syntax:

```const struct contains_ {} contains;

template <typename T>
struct ContainsProxy
{
ContainsProxy(const T& t): t_(t) {}
const T& t_;
};

template <typename T>
ContainsProxy<T> operator<(const T& lhs, const contains_& rhs)
{
return ContainsProxy<T>(lhs);
}

bool operator>(const ContainsProxy<Rect>& lhs, const Rect& rhs)
{
return lhs.t_.left   <= rhs.left &&
lhs.t_.top    <= rhs.top &&
lhs.t_.right  >= rhs.right &&
lhs.t_.bottom >= rhs.bottom;
}```

With this bit of code, you can write

`if (a <contains> b) ...`

which may be surprising to a human reader, but not ambiguous. Also, given a simple `Point` type with `x` and `y` members, it is easily extended:

```bool operator>(const ContainsProxy<Rect>& lhs, const Point& rhs)
{
return rhs.x >= lhs.t_.left &&
rhs.x <= lhs.t_.right &&
rhs.y >= lhs.t_.top &&
rhs.y <= lhs.t_.bottom;
}```

Finally, let's reveal why `ContainsProxy` was made a template. When you define

```template <typename U, typename V>
bool operator>(const ContainsProxy<V>& lhs, const U& rhs)
{
return find(lhs.t_.begin(), lhs.t_.end(), rhs) != lhs.t_.end();
}```

you can even do

```vector<int> v;
if (v <contains> 42) ...```

Of course, the opponents of operator overloading will have turned away in disgust by now, but the fact that you shouldn't do something, doesn't mean you can't.

By the way: my introductionary computer programming book is available here. It uses C and stays away from the type of exotic Spielerei described in this article.