Thursday, April 28, 2016

Powers of Graphs Are Almost Always Hard

Early in 2011, I spent a few months as an exchange student at the University of Melbourne. Although I was writing my thesis at the time, I was looking for topics for a small-scale research project, as I had to turn in some written assignment as part of the exchange program. I can't remember why, but I thought a bit about powers of graphs, and pondered their relation to hardness. Since it turned out to be an easy question to solve, it wasn't enough for the written assignment (let alone a research paper), but I do have a warm spot for it, so here it is:

(If you are browsing on mobile, switch to desktop view to display the math stuff nicely)

Background: Powers of Graphs 

Given a simple graph (we restrict our discussion to this type for brevity) $G(V, E)$, we define $G^2$ by adding an edge between every two vertices $u, v\in V$ for which there exists some $w\in V$ such that $\{u, w\} \in E$ and $\{v, w\} \in E$.
In other words - if there is a path of length at most 2 between $u$ and $v$ in $G$, then there's an edge between them in $G^2$. This works similarly for $G^3$ (where paths of length 3 or less in $G$ become edges) and so on.
Clearly, if the longest distance between two vertices in the graph is $d$ (in which case we say that $d$ is the diameter of $G$), then $G^d$ is simply a clique, since every two vertices in $G$ are at distance at most $d$.
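To make the definition concrete, here is a minimal sketch of my own (not part of the original write-up), which computes $G^k$ from an adjacency matrix by running a BFS from every vertex and connecting every pair of vertices at distance at most $k$:

#include <queue>
#include <vector>

// Sketch: compute the adjacency matrix of G^k by BFS from every vertex,
// connecting every pair of vertices whose distance in G is at most k.
std::vector<std::vector<bool>> graph_power(const std::vector<std::vector<bool>>& adj, int k) {
  int n = adj.size();
  std::vector<std::vector<bool>> power(n, std::vector<bool>(n, false));
  for (int s = 0; s < n; ++s) {
    std::vector<int> dist(n, -1);
    std::queue<int> q;
    dist[s] = 0;
    q.push(s);
    while (!q.empty()) {
      int v = q.front(); q.pop();
      if (dist[v] == k) continue;          // no need to explore beyond distance k
      for (int w = 0; w < n; ++w) {
        if (adj[v][w] && dist[w] == -1) {
          dist[w] = dist[v] + 1;
          q.push(w);
        }
      }
    }
    for (int v = 0; v < n; ++v)
      if (v != s && dist[v] != -1)
        power[s][v] = true;                // distance <= k becomes an edge in G^k
  }
  return power;
}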

Background: Independent Set

Given a graph $G(V, E)$, an independent set of vertices is some $U\subseteq V$ such that $u,v\in U \Rightarrow \{u,v\}\notin E$. Finding a maximum independent set (i.e. largest in size, not to be confused with maximal which means it is not strictly a subset of any other independent set) in a graph is known to be NP-hard, and also hard to approximate.
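For contrast with the hardness of finding a maximum independent set, checking that a given set is independent is trivial. A sketch, using the same adjacency-matrix representation as above:

#include <vector>

// Sketch: verify that no two vertices in the candidate set share an edge.
bool is_independent_set(const std::vector<std::vector<bool>>& adj,
                        const std::vector<int>& candidate) {
  for (size_t i = 0; i < candidate.size(); ++i)
    for (size_t j = i + 1; j < candidate.size(); ++j)
      if (adj[candidate[i]][candidate[j]])
        return false;
  return true;
}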

So here's the question:

In a clique, any single vertex forms a maximum (and also maximal) independent set, so for every simple graph $G$ with diameter $d$, solving the independent set problem for $G^d$ is easy: just choose any vertex you like, and you're done. This raises the question - if a problem is in general hard on $G$, but easy on $G^d$, how hard is it for $G^k$ where $k < d$? I mean, at which point does this very hard problem become trivial?

I'll save you the suspense - independent set is generally hard on $G^{d-1}$, so the problem is not all that interesting, and the hardness of independent set on powers of graphs is basically: hard, hard, hard, ... , hard, trivial. Not an exciting result, I admit, but let's prove it anyway.

Proof by Reduction

Proof sketch: given a graph $G$ with diameter $d$, we'll construct a graph $H$ that is only slightly larger than $G$, has the same diameter $d$, and is such that solving the independent set problem on $H^{d-1}$ gives us the solution for $G$.

First, the case of an even $d$, which is simpler:
Denote $G$'s vertices by $v_1, v_2, ... , v_n$. We construct $H$ by taking $n$ disconnected vertices denoted $u_1, u_2, ..., u_n$ - we'll call those the 'old' vertices - and adding one special vertex plus $n\cdot (\frac d 2 - 1)$ 'new' vertices to them in the following manner:
1. Add one special vertex $c$.
2. For every $i : 1\leq i \leq n$, connect $u_i$ to $c$ by a path of length $\frac d 2$ using $\frac d 2 - 1$ of the 'new' vertices (all of these paths should be disjoint, and that's okay, since we have exactly $n\cdot (\frac d 2 - 1)$ of these new vertices). Denote the vertices of the path from $c$ to $u_i$: $c\rightarrow u_i ^1 \rightarrow u_i ^ 2 \rightarrow ... \rightarrow u_i ^ {\frac d 2 - 2} \rightarrow u_i ^{\frac d 2 - 1} \rightarrow u_i$.
3. For every edge $\{v_k, v_j\} \in E$ of $G$, add to $H$ the edge $\{u_k ^1, u_j ^1\}$.
 $H$ is thus sort of a big star graph, having $c$ in the middle, with some additional edges due to step (3) above.
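In code, the construction for an even $d \geq 4$ might look like this (a sketch only - the names build_H and path_vertex are mine, and the graph is again an adjacency matrix):

#include <vector>

// Sketch of the construction for even d >= 4: indices 0..n-1 are the 'old'
// vertices u_1..u_n, index n is the center c, and the path vertices
// u_i^1..u_i^{d/2-1} occupy the remaining indices.
std::vector<std::vector<bool>> build_H(const std::vector<std::vector<bool>>& adjG, int d) {
  int n = adjG.size();
  int half = d / 2;
  int c = n;                                    // the special center vertex
  int N = n + 1 + n * (half - 1);               // total number of vertices in H
  auto path_vertex = [&](int i, int m) {        // u_i^m, for 1 <= m <= half - 1
    return n + 1 + i * (half - 1) + (m - 1);
  };
  std::vector<std::vector<bool>> adjH(N, std::vector<bool>(N, false));
  auto add_edge = [&](int a, int b) { adjH[a][b] = adjH[b][a] = true; };

  for (int i = 0; i < n; ++i) {                 // step (2): the path from c to u_i
    add_edge(c, path_vertex(i, 1));             // c -- u_i^1
    for (int m = 1; m < half - 1; ++m)
      add_edge(path_vertex(i, m), path_vertex(i, m + 1));
    add_edge(path_vertex(i, half - 1), i);      // u_i^{d/2-1} -- u_i
  }
  for (int j = 0; j < n; ++j)                   // step (3): mirror G's edges
    for (int k = j + 1; k < n; ++k)
      if (adjG[j][k])
        add_edge(path_vertex(j, 1), path_vertex(k, 1));
  return adjH;
}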

A few observations:
1. For every two vertices of the original graph that are not adjacent, i.e. $\{v_l,v_t\}\notin E$, the shortest path from $u_l$ to $u_t$ in $H$ goes through $c$ and is of length $d$. It follows that the diameter of $H$ is at least $d$.
2. In $H^{d-1}$, all of the new vertices (including $c$) form a clique, and each of them is connected by an edge to each of the old vertices, since any pair of vertices involving a new vertex is within distance $d-1$ in $H$. In particular, every pair of vertices in $H$ is within distance $d$, so the diameter of $H$ is at most $d$, and therefore exactly $d$.
3. It follows that a maximum independent set in $H^{d-1}$ (of size greater than 1) cannot contain any of the new vertices, since in $H^{d-1}$ every new vertex is adjacent to all other vertices.
4. $\{u_j, u_k\}$ is an edge in $H^{d-1}$ if and only if $\{v_j, v_k\}$ is an edge in $G$, so every independent set in $G$ corresponds to an independent set of 'old' vertices in $H^{d-1}$ and vice versa.
Putting all these observations together, every maximum independent set in $H^{d-1}$ corresponds to a maximum independent set in $G$ and vice versa.

What about an odd $d$? Well, instead of a single vertex serving as the center, we use a clique of $n$ vertices as the center of $H$, and everything else works very much the same.

Note that a slight tweak to the construction shows that independent set on the $(d-2)$-th power is no easier, and so on.

So, alas, independent set is hard on every power of a graph except the $d$-th, where it becomes trivial.


Sunday, January 31, 2016

NP-Complete Problems as Repeatable Math Puzzles


Repeatable Puzzles

 In the 3 years I worked at Matific, I was involved in designing all of the mini-games (we called them 'episodes') produced by the company. This meant that I spent a non-negligible part of my time on the hunt for math puzzles that are
  • interesting and rewarding to solve
  • rely only on elementary school math knowledge
  • worth playing over and over, with respect to the above
Most of the really cool puzzles, such as determining which - out of 12 coins - was fake, using a balance scale and only 3 weighings, fall short of the "worth repeating" notion, as there's no real pleasure or challenge in solving them a second time. So googling "Math Puzzles" and reading puzzle books, though fun, was often of little use.

As time went by, and development progressed, puzzles turned out to be only a small part of what we did. A typical episode was instead required to help in understanding some deeper principle of a previously taught skill, via hands-on experience, which I think we managed to do well - but that's a topic for another post.

Nevertheless, there were some puzzles in the mix, for example: a game where the user is given 4-5 numbers (e.g. 2, 3, 4, 7), and is asked to reach a target number (e.g. 23) using some of the four basic operations and each of the initial numbers (e.g. 4/2 + 3*7).

Another puzzle, for 1st graders, asked the student to arrange toys on shelves, subject to some constraints, e.g. - the doll must be above the teddybear, the car has to be to the right of the dinosaur, and so on.

A third puzzle was simply variants of a well known water-pouring puzzle: one has 2-3 containers, each of different volume (e.g. 5L and 3L) and the objective is to obtain another quantity (e.g. 4L) via pouring and filling.


Starting With the Solution

 As we designed more of those repeatable puzzles, a few common traits emerged:

  •  they involved more than one skill. The pouring puzzles, for example, involve addition, subtraction, understanding of volume, and strategizing.
  •  we had to generate problems with well-tuned difficulty levels.
Let me focus on that last one: when one designs a puzzle, the first instinct may be to randomize the problem (e.g. how much water in each container? what is the target quantity?) and then write a solver for it, but that is rarely the right thing to do, for two reasons: 1) it leaves the difficulty level to chance, and 2) sometimes the problem has no quick solver, so the solver needs to go over all possibilities to find a solution, or to conclude that there is none, in which case a new problem has to be generated at random and checked, and so on. Our standard strategy, which applied to most of the cases, was starting from the solution.


For example, choose 4 random numbers in some range, e.g. 2, 5, 9, 11, and now choose operations at random, e.g. *, +, -, which gives 2 * 5 + 9 - 11 = 8. And so we present the user with the numbers 2, 5, 9, 11, setting 8 as the target number. This approach has its drawbacks, such as generating a problem from one particular solution while a much simpler solution exists, but they are less severe, product-wise.
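In code, this kind of generator is only a few lines. Here's a sketch (names and ranges are made up, division is left out to keep things integral, and the expression is evaluated left to right):

#include <random>
#include <vector>

// Sketch of the "start from the solution" approach: pick random numbers and
// random operations, evaluate, and use the result as the target - so the
// generated problem is solvable by construction.
struct Puzzle {
  std::vector<int> numbers;
  int target;
};

Puzzle generate_puzzle(std::mt19937& rng) {
  std::uniform_int_distribution<int> num_dist(2, 12);
  std::uniform_int_distribution<int> op_dist(0, 2);   // 0: '+', 1: '-', 2: '*'
  std::vector<int> numbers(4);
  for (int& x : numbers) x = num_dist(rng);

  int target = numbers[0];
  for (size_t i = 1; i < numbers.size(); ++i) {       // evaluate left to right
    switch (op_dist(rng)) {
      case 0: target += numbers[i]; break;
      case 1: target -= numbers[i]; break;
      case 2: target *= numbers[i]; break;
    }
  }
  return {numbers, target};
}

A production version would presumably also reject degenerate targets (negative, too large, and so on) and re-roll.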
This "start from the solution" approach took us a while to figure out, in which time we tried the naive approach, of randomizing a problem and running a solver. After a few of these, I noticed that we often realize that the problem is NP-hard, or in some cases assume it is so, not being sure how to prove it. A solver for such puzzles is problematic and can't scale to larger instances of the problem, especially seeing how the code had to be very light-weight to run on a browser. Lastly, a crucial part of programming a puzzle is checking the user's solution, which was almost always considerably easier than generating the problem. And so at some point it became clear to me that many of our so-called puzzles, were simply instances of NP-complete problems.
This was neat because it gave us something to draw inspiration from, in addition to poring over puzzle books.
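To illustrate the earlier point about checking being easy: for the pouring puzzle, verifying a user's sequence of moves is just a short simulation. A sketch (the move encoding here is made up for illustration):

#include <algorithm>
#include <utility>
#include <vector>

// Sketch: verify a solution to the pouring puzzle by simulating the moves.
// A move is a (from, to) pair of container indices; -1 stands for the tap
// (when filling) or the drain (when emptying).
bool check_pouring_solution(const std::vector<int>& capacity,
                            const std::vector<std::pair<int, int>>& moves,
                            int target) {
  std::vector<int> level(capacity.size(), 0);          // all containers start empty
  for (const auto& move : moves) {
    int from = move.first, to = move.second;
    if (from == -1) {
      level[to] = capacity[to];                        // fill 'to' from the tap
    } else if (to == -1) {
      level[from] = 0;                                 // empty 'from' down the drain
    } else {                                           // pour 'from' into 'to'
      int amount = std::min(level[from], capacity[to] - level[to]);
      level[from] -= amount;
      level[to] += amount;
    }
  }
  for (int l : level)
    if (l == target) return true;                      // some container holds the target
  return false;
}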

Some Thoughts 

I think that what makes NP-complete problems such useful puzzles is that they somehow seem to come from the real world, at least more than problems in harder complexity classes, which means many of them can be explained to a layman in a minute. The NP-completeness also means that there's no trick that the user learns once, only to be bored by subsequent puzzles that are solved using that same trick. Being in NP also means that solutions can be checked easily, so the software can check the user's solution; and when the user fails and the software shows a correct solution, the user can quickly see that it is indeed correct. Lastly, they encourage a heuristic, cross-topical approach to problem solving - there's no title saying "this is a subtraction problem, solve it using subtraction". Neither the method nor the tools used by the user (or student) are part of the problem's definition, and that is how math usually presents itself in real life.

Wednesday, December 23, 2015

Casting to double is faster than Kahan's summation

A floating problem

Floating point precision problems are notoriously annoying, and personally I was somewhat surprised they are still an issue in this day and age, but they are.
For instance, by the end of the following code:
float x = 0;
float f = 1e-6;
for (unsigned i = 0; i < 1e6; ++i)
  x = x + f;
x's value will be 1.00904 and not 1.
Even much simpler code reveals the issue:
float x = 1e6;
float y = 1e-6;
float z = x + y;

z's value here will be exactly 1,000,000 and not 1,000,000.000001.
The reason for this is that 32 bits (the size of a float) can only represent a number to a limited precision, so the format is designed to represent small numbers with high precision, and to sacrifice that precision gradually as the magnitude of the number increases.
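One way to see this concretely: near 1,000,000 the gap between consecutive representable floats is 0.0625, far larger than the 1e-6 we are trying to add, so the addition rounds right back to the same value. A small check (assuming the usual IEEE-754 32-bit float):

#include <cmath>
#include <cstdio>

int main() {
  float x = 1e6f;
  float next = std::nextafter(x, 2e6f);          // the next representable float above 1,000,000
  std::printf("gap near 1e6: %g\n", next - x);   // 0.0625 for IEEE-754 floats
  std::printf("1e6f + 1e-6f == 1e6f? %d\n", (x + 1e-6f) == x);   // 1 (true)
  return 0;
}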

Why use floats when you can use doubles?

Well, in many cases, doubles simply solve the problem. A double is also a finite representation, but it is far more precise than a float: it has a 53-bit significand versus a float's 24 bits, i.e. roughly 16 significant decimal digits versus 7. Its range is also much larger - the smallest positive (normalized) number a float can represent is about 1e-38, whereas for a double it's something around 2e-308. To emphasize, the ratio between the two is something like 1e270, which is considerably more than the number of atoms in the universe raised to the third power.
The reason to still use floats is that they occupy half the space of doubles, and when writing code that is meant to be vectorized by the compiler, as one often does in high-performance computing, this can translate to a factor-of-2 speedup. Additionally, if your data is basically a bunch of real numbers, storing it as floats makes it roughly half the size it would be in doubles, so transmitting it over a network may be twice as fast, and so on.

Kahan's summation algorithm

For running sums, William Kahan suggested the following algorithm, cleverly using a second variable to keep track of the deviation from the "true" sum (the following snippet is from Wikipedia, and not in C++, but straightforward nevertheless):

function KahanSum(input)
    var sum = 0.0
    var y,t                      // Temporary values.
    var c = 0.0                  // A running compensation for lost low-order bits.
    for i = 1 to input.length do
        y = input[i] - c         // So far, so good: c is zero.
        t = sum + y              // Alas, sum is big, y small, so low-order digits of y are lost.
        c = (t - sum) - y        // (t - sum) recovers the high-order part of y; subtracting y recovers -(low part of y)
        sum = t                  // Algebraically, c should always be zero. Beware overly-aggressive optimizing compilers!
    next i                       // Next time around, the lost low part will be added to y in a fresh attempt.
    return sum

And this is, in my opinion, really beautiful. 
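For reference, a direct C++ translation of the snippet above might look like this (a sketch; note that aggressive floating-point optimizations such as -ffast-math may re-associate the arithmetic and cancel the compensation, which is exactly the "overly-aggressive optimizing compilers" warning above):

#include <vector>

// A direct translation of the pseudocode above to C++.
float kahan_sum(const std::vector<float>& input) {
  float sum = 0.0f;
  float c = 0.0f;                // running compensation for lost low-order bits
  for (float value : input) {
    float y = value - c;         // apply the compensation to the next input
    float t = sum + y;           // low-order bits of y are lost in this addition...
    c = (t - sum) - y;           // ...and recovered here into c
    sum = t;
  }
  return sum;
}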
However, as it turns out:

Casting to double is faster

As it turns out, the dumber, less beautiful solution is faster:

double x = 0;
float f = 1e-6;
for (unsigned i = 0; i < 1e6; ++i) {
  x = x + f; // f is implicitly converted to double here
}

I ran some experiments, and for a very large set of numbers (adding 1e10 numbers, each equal to 1e-7), casting to double was twice as fast as using Kahan's summation algorithm.
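A rough sketch of this kind of measurement (the count here is smaller than the 1e10 above, and the exact timings will of course depend on compiler flags and hardware):

#include <chrono>
#include <cstdio>

// Rough timing sketch: sum the same constant many times, once with Kahan
// summation in float and once by plain accumulation into a double.
int main() {
  const unsigned long long n = 100000000ULL;    // 1e8 additions
  const float f = 1e-7f;

  auto t0 = std::chrono::steady_clock::now();
  float sum = 0.0f, c = 0.0f;
  for (unsigned long long i = 0; i < n; ++i) {  // Kahan summation
    float y = f - c;
    float t = sum + y;
    c = (t - sum) - y;
    sum = t;
  }
  auto t1 = std::chrono::steady_clock::now();

  double d = 0.0;
  for (unsigned long long i = 0; i < n; ++i)    // plain accumulation into a double
    d += f;
  auto t2 = std::chrono::steady_clock::now();

  auto ms = [](auto a, auto b) {
    return std::chrono::duration_cast<std::chrono::milliseconds>(b - a).count();
  };
  std::printf("kahan (float):  %f in %lld ms\n", sum, (long long)ms(t0, t1));
  std::printf("plain (double): %f in %lld ms\n", d, (long long)ms(t1, t2));
  return 0;
}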
A cool algorithm, though.