Wednesday, October 5, 2016
Notes from my "Intro to Information Theory" time
For two years, starting October 2010, I was a teaching assistant in the Introduction to Information Theory course at the Hebrew University, which was taught by Prof. Michael Werman, and the following year by Prof. Michael Ben-Or.
During that time, I a weekly two-hour tutorial, which was sometimes an extension of what was discussed in the lecture, and sometimes (as in the case of error correction codes) simply dealt with other topics.
After writing the previous post on check digits, I dug up the old notes, and found them to be quite helpful in refreshing my memory on the matter.
So, hoping it will prove helpful to others as well, and under an assumption that I'm not violating anyone's copyrights, here they are:
Tuesday, September 27, 2016
Check digit: the poor man's error correction code
The Israeli ID number and the Luhn Algorithm
The check-digit is computed using the Luhn algortihm which is:
1. Multiply digits in even places by 2.
2. Add the digits in the odd places (not including the check-digit itself) to the sum of digits of numbers you got in (1).
3. Set the 9th digit to be the number required to complete the sum from (2) to the next multiple of 10.
For example
9 - odd place, leave as is -> 9
9 - even place, multiply by 2 and take sum of digits -> 18 -> 9
0 - odd place, leave as is -> 0
3 - even place - multiply by 2 -> 6
5 - odd place, leave as is -> 5
6 - even place, multiply by 2 and take sum of digits -> 12 -> 3
1 - odd place, leave as is -> 1
2 - even place, multiply by 2 -> 4
sum all the above to obtain 37, which requires adding 3 to complete it to the nearest multiple of 10, thus 3 is chosen to be the check digit.
Had this been a valid ID, the check digit would have been 2 (not 9) (image source: nbn.org.il) |
Why all the fuss?
The Luhn Algorithm's "unfixable" fault
Interestingly, the Luhn algorithm cannot detect all transpositions of two adjacent digits. Specifically, the sequences 09 and 90 both contribute 9 to the sum computed in step (2) so this transposition will not cause a change in the check digit and will not be detected.
So I grabbed a piece of paper and tried to come up with a similar algorithm that will catch *all* transpositions, only to find out that - if we restrict ourselves to similar algorithms (wanting to preserve the simplicity of the computation) - there is no such solution.
To be precise, let $f:\{0,...,9\}\rightarrow\{0,...,9\}$ be a permutaion, then there exist $a \neq b \in \{0,...,9\}$ such that $f(a) + b \equiv a + f(b) (\textrm{mod} 10)$.
My friend, Lior Yanovski, suggested an elegant proof for this:
Assume, by way of negation, that $a\neq b \Rightarrow f(a) + b \not\equiv a + f(b) (\textrm{mod} 10)$.
It follows that $ f(a) - a \not\equiv f(b) - b (\textrm{mod} 10)$, which means the function $g(x) := f(x) - x (\textrm{mod} 10)$ is a permutation.
If we sum $g(x)$ over $x$ we obtain $\sum_x f(x) - \sum_x x \equiv 0 (\textrm{mod} 10)$ simply because both $f$ and the identity are permutaions adding up to 55. On the other hand, if $g$ is indeed a permutation - summing up its values would give $5 (\textrm{mod} 10)$, i.e., a contradiction!
So if we were to look for a function to replace the "multiply by 2 and take sum of digits" from the Luhn algorithm, while maintaining the rest of the algorithm's structure, and covering all adjacent digits transpositions - we won't find one.
Hans Peter Luhn 1896-1964 (Image source: asist.org) |
Fix #0: Find only transpositions
We could multiply every digit by its place, and take this sum modulo 10, i.e. define the check digit to be $\sum_{i=1} ^n i \cdot a_i (\textrm{mod} 10)$. This will detect all adjacent transpositions, but it will do a very poor job in detecting single digit errors. Seeing how 10 shares divisors with most of the numbers in $0,...,9$, multiplying by either of them and taking the remainder modulo 10 discards some information. For example $5\cdot x$ will always be either 0 or 5 (assuming $x$ is an integer). This means that a random error in the fifth digit will cause a change in the check digit only 50% of the times, and somewhat similar problems arise with digits in even places.
Fix #1: Abandon base 10
A closer inspection would reveal that the proof of the Luhn algorithm's fault relies heavily on the fact that 10 is an even number, since the expression $\sum_{i=1} ^n i (\textrm{mod} n)$ is $\frac n 2$ for even values of $n$ but $0$ for odd values of $n$. If we had used an odd base for this entire computation - things would be different, but perhaps less intuitive to compute by hand.
Indeed, some check digit systems, such as the one used in ISBN 10 (for bank transfers) employ a similar trick with 11 as the base of the modular operations.
Fix #2: Abandon simplicity
If we allow yet more complex computations, then there are some algorithms that produce a check digit that will detect all single-digit errors as well as all transpositions of adjacent digits. The oldest one probably being Verhoeff algorithm, from 1967. Verhoeff, who is now retired and an avid creator of math-inspired sculptures, took an empirical approach, by analyzing data from the Dutch postal service, classifying and counting common errors. He followed to create a code that not only detects all adjacent digits transpositions and single digit errors, but also detects errors based on phonetic similarities in dutch, such as confusing 15 (vijftien) with 50 (vijftig).
Jacobus Verhoeff, with his sculptures (Image source: groene.nl) |
Some tests
Running some quick scripts in python, I wanted to find out how well the Luhn algorithm, the Damm algorithm (I skipped implementing Verhoeff's) and the algorithm proposed in Fix #0 above fare. I also added two more algorithms for fun:
- Naive - $|\sum_{i = 1} ^{n - 1} (a_i - a_{i+1})| (\textrm{mod} 10)$.
- RandomTable - the Damm algorithm is based on using a table of a quasi group of order 10. So I wondered how well a random table would do, so I took 10 random permutations, one above the other, and used it in the same way the Damm algorithm is implemented.
The errors introduced were:
- Adjacent transposition: 12345 -> 12435
- One-over transposition: 12345 -> 14325
- Random 2-digit transposition: 12345 -> 42315
- Single digit random error: 12345 -> 19345
- Two-digit random errors: 12345 -> 22346
- 3, 4, 5... 9 random errors.
The table shows the miss rate. Note that 10% miss rate is the magic number that they all converge to, given enough noise. This is because if we were to choose some random function $f:(0...9)^n\rightarrow\{0...9\}$ then for two random inputs $x \neq y \in (0...9)^n$, clearly $Pr(f(x) \neq f(y)) = 0.9$. A random function on random data has a 10% miss rate, so useful algorithms try to get significantly better than 10% for specific cases, without causing spikes in the miss rate for other errors.
error \ algo | Luhn | Damm | Fix #0 | Naive | Random |
---|---|---|---|---|---|
Adjacent transposition | 2% | 0% | 0% | 70% | 34% |
One-over transposition | 87% | 10% | 9% | 63% | 22% |
Random transposition | 45% | 8% | 10% | 53% | 27% |
1-digit error | 0% | 0% | 10% | 67% | 36% |
2-digit error | 10% | 10% | 10% | 44% | 25% |
3-digit error | 10% | 10% | 10% | 27% | 18% |
4-digit error | 10% | 10% | 10% | 17% | 14% |
5-digit error | 10% | 10% | 10% | 12% | 11% |
6-digit error | 10% | 10% | 10% | 10% | 11% |
7-digit error | 10% | 10% | 10% | 10% | 10% |
8-digit error | 10% | 10% | 10% | 10% | 10% |
9-digit error | 10% | 10% | 10% | 10% | 10% |
Error correction codes for edit distance?
So, while this is a pre-computing-age problem, that is nowadays a toy problem, it highlights an interesting issue: the standard error-correction codes treat a very specific domain, where errors are modeled as noisy coordinates in a vector, and so the notion of distance is just the Hamming distance. Are there good codes for domains where the distance is more complicated, such as Levenshtein distance?
It turns out that once we tweak the noise model, we quickly step into the realm of open problems. While for the Binary Erasure Channel, which models noise as flipping bits, we've known the capacity ever since it was introduced, for the Deletion Channel, which models noise as removing a bit from the input string, the capacity is still an open problem.
To summarize - edit distance correction codes are an interesting idea, but I doubt we'll see any progress in them soon, as even the fundamental questions of Information Theory turn out to be extremely challenging in this domain.
Tuesday, August 2, 2016
Pokémon GO meets FarmVille: a game pitch
So, here's a sketch for a mobile location-based farming simulation game I thought of:
Pokémon GO meets FarmVille: players walk around in the real world, where real world locations have in-game roles. The players grow crops, and harvest them, and since this is the real world - they can 'own' land.
Walking around in the real world, the player can see which pieces of land are available for buying, and for how much. Abandoned pieces of land are priced by their size, and the price of land owned by other users is determined by that user, who can also choose "not selling". Any player can bid for any piece of land they pass by, as long as they have the money. Selling land does not require the player to be present at the spot.
Various achievements can be unlocked, such as managing 10 workers at once, growing exotic crops, owning a lot of land.
Elevator pitch:
Basic principles:
- Each player can "own" portions of land in the real world, and use it to grow crops.
- Crops can be sold in markets, which are also places in the real world (kind of like PokéStops).
- Land can be bought and sold, workers can be hired, however, a player must visit his/her land often enough, otherwise it becomes abandoned, in which case other players can buy it cheaply.
Gameplay:
Buying land
Each player starts the game with some in-game money, no workers, and some basic seeds (wheat, corn, or rice, depending on the climate).Walking around in the real world, the player can see which pieces of land are available for buying, and for how much. Abandoned pieces of land are priced by their size, and the price of land owned by other users is determined by that user, who can also choose "not selling". Any player can bid for any piece of land they pass by, as long as they have the money. Selling land does not require the player to be present at the spot.
Working the land
Having bought a new piece of land, the player has to work the land and prepare it for seeding. If the player has hired workers - they can be left there to do the job, but otherwise - the player has to stand close to the land for a minute, tapping the screen of the mobile device, thereby "working the land".Crops
Seeds for various crops can be bought at any market and planted in the land, requiring the player to visit the land in order to do so. Similarly to FarmVille, the crops take time to grow, and can rot if not harvested in time. Harvesting, like any type of work on the land, can be done by the player standing next to it and tapping the screen, or by workers left there for that purpose.Markets
These are places in the real world, that play a part in the game but cannot be bought or sold and are determined by the system, similarly to PokéStops. Standing close to a market, a player can buy seeds, machinery, sell crops, and hire workers (for a predetermined time) with various levels of expertise.Weather and Location
Crops can be grown only in places where the weather makes sense for it to happen, or in greenhouses, and real-world weather affects the virtual crops and their prices, so a storm may kill some of the crops, but raise prices.Advancement
Players advance through levels through experience points that are gained by playing the game (growing crops, buying land, harvesting, hiring help, etc.).Various achievements can be unlocked, such as managing 10 workers at once, growing exotic crops, owning a lot of land.
Social aspect?
Not sure, I haven't thought about it much, since I really don't like the way FarmVille did it...Sunday, July 10, 2016
My Perspective on Prisma: It's Burning Money Fast
For the past couple of weeks - the photo-processing app Prisma has been gaining huge attention, and rightfully so - they use CNNs (+3 points for trendy buzz word) and AI (+2.5 points) to process its users' photos and turn them into truly impressive art work, inspired by great artists such as Van Gogh, Picasso, and others (+1.5 points for name dropping).
Here's what it looks like on the famous Success Kid's image:
Which is it pretty impressive. As usual with these things, well lit highly contrasted photos work best, and it's loads of fun to play with.
Since the app is free, it begs the question: how do these people plan to make money?
Well, here are a few things to consider:
Here's what it looks like on the famous Success Kid's image:
Since the app is free, it begs the question: how do these people plan to make money?
Well, here are a few things to consider:
- Some of their filters are branded (e.g. by Palmolive), thus generating revenue through advertising.
- Since yesterday, the photos come with a water mark on their bottom-right corner, which I guess won't be there in some premium paid version.
- As a friend of mine pointed out: it is very hard to do these computations on the device, and as he found out - indeed when you put your device into airplane mode - the app doesn't work, so the heavy lifting must be done in the cloud. To quote my friend: "this is one free app with a huge AWS bill".
- In its Product Hunt thread, one of the Co-founders responded to an inquiry about their business model by promising that the app will always be free:
Conclusion: smells like trouble
So while this company is obviously putting out a good product (especially if we ignore the annoying 2-3 seconds wait when applying a filter), I believe balancing their cloud bill with revenue from advertisers seems like a big challenge. Of course, they may be aiming at getting bought quickly by the big guys, but the model of allowing users to heavily drain cloud resources, with a limited ability to monetize (unlike search results, social networks feed, etc.), may mean they won't live long enough for this to happen.Friday, July 1, 2016
Deliberate Learning: 3 Tips for Acquiring New Skills
Recently, through Freakonomics, I came across the notion of deliberate practice. The idea, originally introduced by Ericsson et al in 1993, and more notably popularized by Malcolm Gladwell's "Outliers", is that the contribution of natural talent to greatness in art, science, and sports, is far smaller than we tend to assume.
The tl;dr version of it is: the thing that differentiates the greatest among us from the rest is not their talent, but the manner in which they practice. Specifically, they practice an awful lot, and do so in a reflective process that efficiently translates the invested time to a consistent improvement in performance.
While I don't expect to become the world's greatest anything, some principles of deliberate practice struck me as familiar, and made me think about the way that I learn. I like learning new things, and having done so in a relatively wide variety of areas (painting, music, education, math, algorithms, poetry, design), I wanted to share what works for me when I come across new stuff to learn. To clarify, this is not deliberate practice. Deliberate practice is about becoming great at something over a long period of time. I want to focus on something much more common and useful for me: becoming reasonably proficient at something quickly.
Let's call it deliberate learning.
This seems to me particularly useful, while being quite challenging for anyone with a critical mind. It's also very obvious and most people do it to some extent, but it's a good idea to do so consciously as it allows us to switch it on when we need to learn something, whereas the natural tendency, especially with age, is to do it less and less, mostly due to ego.
The idea is to find a teacher, who may be someone teaching you a course at the university, a colleague at work, or even your superior. This person should be very experienced and much better than you at what you're trying to learn, and should be available to you on a regular basis. You should buy into this person's point of view on the subject you're trying to learn: imitate this person's methods, ask for advice, ask questions, and just believe everything they tell you (professionally). For the limited time that you're adopting this person as your mentor, you should also suspend your disbelief about their professional shortcomings. It is a surprisingly powerful approach, as it let's you really run with ideas and concepts along a path someone else already paved, and with their help. After the limited time has passed (e.g., the university term), you can reevaluate all the stuff you used to embrace, and gradually throw away whatever you disagree with, or adapt it to your own style. One last point: this mentor doesn't have to be a person - it may even be a book.
This is obviously necessary, and naturally hard on the ego, so encourage yourself with the following as you go:
- Paraphrasing Fight Club - you are not your work, so your work can and will suck sometimes, which doesn't mean you, in general, do.
- Going from terrible to okay is way easier than going from good to great, so if your work is shitty, this is good news as it means with little work you can improve a lot.
- Crummy work counts for experience too, so it wasn't a waste of time.
The tl;dr version of it is: the thing that differentiates the greatest among us from the rest is not their talent, but the manner in which they practice. Specifically, they practice an awful lot, and do so in a reflective process that efficiently translates the invested time to a consistent improvement in performance.
Mozart - a guy who was great at something |
Let's call it deliberate learning.
1. Choose a mentor and follow blindly, for a limited time
For a limited time, blindly follow you will |
The idea is to find a teacher, who may be someone teaching you a course at the university, a colleague at work, or even your superior. This person should be very experienced and much better than you at what you're trying to learn, and should be available to you on a regular basis. You should buy into this person's point of view on the subject you're trying to learn: imitate this person's methods, ask for advice, ask questions, and just believe everything they tell you (professionally). For the limited time that you're adopting this person as your mentor, you should also suspend your disbelief about their professional shortcomings. It is a surprisingly powerful approach, as it let's you really run with ideas and concepts along a path someone else already paved, and with their help. After the limited time has passed (e.g., the university term), you can reevaluate all the stuff you used to embrace, and gradually throw away whatever you disagree with, or adapt it to your own style. One last point: this mentor doesn't have to be a person - it may even be a book.
2. Steer toward the boring and the difficult
Boring and difficult? Do go on! |
Like the mentor thing, this is about suspending the self for a while. After you finish your learning period on the subject (I know the common wisdom is one should always be learning, but very rarely can one maintain a state of learning about everything) you'll naturally stay away from the stuff that bores you or that you don't understand well. This means that now is the only time you'll have to gain some ground in these areas. This being a learning period, you're also relatively energetic, and can muster the courage to face those areas, so it is very literally now or never. So be on the lookout for boring or difficult topics, and fearlessly dive into them. Your environment (at work or school) knows you're just learning, so it will be more forgiving to failure, which is inevitable when trying new and difficult things.
3. Get people whose judgement you trust to criticize your work frequently
My work sucks? I demand a trial by combat! |
- Paraphrasing Fight Club - you are not your work, so your work can and will suck sometimes, which doesn't mean you, in general, do.
- Going from terrible to okay is way easier than going from good to great, so if your work is shitty, this is good news as it means with little work you can improve a lot.
- Crummy work counts for experience too, so it wasn't a waste of time.
Tuesday, June 7, 2016
The God Awful UX of Israeli Train Stations
It does not take a UX expert to see that the electronic ticket validation system in Israel Railways is poorly designed.
Here's how its designers apparently expect it to be used:
- The passenger walks along the lane (1).
- The rotating bar (2) lies about 70cm above ground, and is locked in position until the ticket is validated.
- The passenger presses the ticket against the screen (3).
- The ticket is successfully validated, the passenger pushes the bar (2) that subsequently rotates to allow the passenger to enter the ramp.
Well, it doesn't exactly play out that way.
About 50% of the time the ticket isn't validated correctly and the rotating bar is locked in place, leaving the passenger stumbling against it, delaying a group of annoyed people standing behind him or her.
Why isn't the passenger validating again? Well, indeed an error message is shown on the screen, and there's also some kind of beep to indicate something went wrong, but none of this matters because:
1) The ever-so-gentle beeping sound is inaudible in any average train station.
2) The screen is parallel to the passenger's field of vision, so one has to bend backward to see it.
3) The screen is often blocked at this point by the passenger's own hand holding the ticket against it to be validated.
4) The screen is almost always dirty because - again - this is where people stick their hand/wallet to validate.
All this together means that every other passenger gets stuck in the lane. It takes a couple of seconds to figure out that waiting a bit longer isn't fixing it, thereby delaying all other passengers. To top it all, every subsequent validation attempt takes a couple of seconds more than it should - due to the same design flaws. I get annoyed all over again just writing this!
Mind you, this is not some ground-breaking product that provides completely new design challenges. It's not as if trains or electronic tickets were invented by Israel Railways. Rather, there are tons of examples to research from all over the world!
It's one of those cases where poor design makes hundreds of thousands of people's daily routine just that much more annoying, as we are helplessly victimized by the laziness and just general inadequacy of others.
Here's how its designers apparently expect it to be used:
- The passenger walks along the lane (1).
- The rotating bar (2) lies about 70cm above ground, and is locked in position until the ticket is validated.
- The passenger presses the ticket against the screen (3).
- The ticket is successfully validated, the passenger pushes the bar (2) that subsequently rotates to allow the passenger to enter the ramp.
Well, it doesn't exactly play out that way.
About 50% of the time the ticket isn't validated correctly and the rotating bar is locked in place, leaving the passenger stumbling against it, delaying a group of annoyed people standing behind him or her.
Why isn't the passenger validating again? Well, indeed an error message is shown on the screen, and there's also some kind of beep to indicate something went wrong, but none of this matters because:
1) The ever-so-gentle beeping sound is inaudible in any average train station.
2) The screen is parallel to the passenger's field of vision, so one has to bend backward to see it.
3) The screen is often blocked at this point by the passenger's own hand holding the ticket against it to be validated.
4) The screen is almost always dirty because - again - this is where people stick their hand/wallet to validate.
All this together means that every other passenger gets stuck in the lane. It takes a couple of seconds to figure out that waiting a bit longer isn't fixing it, thereby delaying all other passengers. To top it all, every subsequent validation attempt takes a couple of seconds more than it should - due to the same design flaws. I get annoyed all over again just writing this!
Mind you, this is not some ground-breaking product that provides completely new design challenges. It's not as if trains or electronic tickets were invented by Israel Railways. Rather, there are tons of examples to research from all over the world!
It's one of those cases where poor design makes hundreds of thousands of people's daily routine just that much more annoying, as we are helplessly victimized by the laziness and just general inadequacy of others.
Wednesday, May 18, 2016
Triangular numbers, a 3D cube, and the octahedral group
For some computational geometry project, I had to understand the orbits of the action of the octahedral group on a 3D cube, and like so many times with math, I did it the hard way only to find out later that a simple observation would have saved me the effort. Well, some of the effort.
For simplicity, let's assume that the 3D cube is actually a discrete object, comprised of $n^3$ cells for some even value of $n$ (having an odd $n$ doesn't change the outcome by a lot, but is a bit different). The group action breaks the cube into orbits, for example, the corner cells form a single orbit of size 8. Note that the group elements can be also thought of as $3\times 3$ matrices, all of whose determinents are $1$ or $-1$ (the latter if the element induces a reflection), and the action thus preserves norms. To put it plainly, the distance of a cell from the cube's center is an invariant of the group action. This observation means that we can think of the cube as comprised of $\frac n 2$ nested hollow cubes of dimensions $n\times n \times n, (n-2)\times(n-2)\times(n-2), ... 4\times 4\times 4, 2\times 2\times 2$, sort of like a matryoshka doll of 3D cubes.
It suffices to understand how the group acts only on one of these hollow cubes, so let's examine only the largest one - the $n\times n\times n$ hollow cube.
Observe that using this notation, together with thinking of the group elements as permuting coordinates and/or pointwise multiplying them by $-1$, makes it evident that, up to a permutation, there are only 4 cases, or types of orbits:
Case 1: $\frac n 2 = |i| \neq |j|, |i|\neq |k|, |j| \neq |k|$, choosing such indices defines an orbit of size 48, since no permutation of the coordinates or multiplication of them by $-1$ leaves the indices as they were.
Case 2: $i = |j| = |\frac n 2|, |j|\neq |k|$, these are all the cells that lie on the edges of the hollow cube. A cell with such indices is invariant under elements switching $i$ and $j$, so its orbit's size is 24. To be precise, in the case where $j = -i = \pm \frac n 2$, swapping $i$ and $j$ is just like pointwise multiplying both by $-1$, so the orbit is obtained simply by moving $k$ around and pointwise multiplication by $-1$, and indeed the orbit size is 24.
Case 3: $|i| = |j|, |j| < |k| = \frac n 2$, these are all the cells that lie on the diagonals that run along the face of the hollow cube. For the same arguments as Case 2 - the orbit size here is 24.
Case 4: $|i| = |j| = |k| = \frac n 2$, these are the corner cells. The difference between two corners is only the sign of the coordinates, so the size of such an orbit is 8.
Case 1: The choice is of some unordered pair of elements from $\{1,2,...,\frac n 2 - 1\}$. The pair is unordered since the group action puts $(i,j,k)$ in the same orbit as $(j,i,k)$, and we omit the negatives since they are obtained via the pointwise multiplication by $-1$. The number of ways to choose such a pair is therefore $\frac 1 2 (\frac n 2 - 1)(\frac n 2 - 2)$.
Case 2: Since $|j|$ and $|i|$ must be equal to $\frac n 2$, the only choice here is of $1\leq k < \frac n 2$ (negative values of $k$ are obtained via the pointwise multiplication by $-1$ of the group action). So clearly there are $\frac n 2 - 1$ of those.
Case 3: Like Case 2, the only choice here is of $1 \leq |i| < \frac n 2$, which in turn determins $|j|$. So there are $\frac n 2 - 1$ such orbits.
Case 4: This is the corners' orbit, so there is exactly one for a hollow cube.
Putting it together, a hollow 3D cube of dimension $n$ has $\frac 1 2 (\frac n 2 - 1)(\frac n 2 - 2) + 2 \cdot (\frac n 2 - 1) + 1 = \frac 1 8 (n^2 + 2n)$.
Since the full 3D cube, with an even dimension $s$, is comprised of $\frac s 2$ such hollow cubes, the number of orbits is: $\frac 1 8 \sum _{n=1} ^{\frac s 2} ((2n)^2 + 2(2n))$, which surprisingly gives us a sum of triangular numbers $\sum_{n=1} ^{\frac s 2} \frac {n(n+1)} 2$, which has a nice closed form $\frac {\frac s 2 \cdot (\frac s 2 + 1) \cdot (\frac s 2 + 2)} 6$.
As noted in the previous paragraph, the result is a sum of triangular numbers, which are called thus because they also represent the number of blocks you'd use to construct a triangle where the bottom row has $n$ blocks, the next row has $n-1$ blocks and so on.
The image below shows one representative of each orbit in the outer layer of an $8\times 8 \times 8$ cube. The Case 1 representatives are red, Case 2 - orange, Case 3 - yellow, and Case 4 - green.
And indeed these representatives form a triangle, and the sum of triangles really represents (sort of) a pyramid that goes from the outer layer of the cube inward, the colorful part of the image below being its base.
tl;dr
Upon counting the orbits of the cells of a 3D cube under the octahedral group action, I found out the result is a sum of triangular numbers, which has a very intuitive geometric reason.
The gory details
The octahedral group can be thought of as the group of symmetries obtained by applying a series of 90-degree rotations and/or face-parallel reflections to a cube. Another way of thinking about it is this: if the cube is centered at the origin, applying an element of the group to the cube is equivalent to shuffling the coordinates and/or multiplying some of the coordinates by $-1$. Since there are 6 permutations of the coordinates, and 8 choices as to which coordinates to multiply by $-1$, the number of elements in the group is 48.For simplicity, let's assume that the 3D cube is actually a discrete object, comprised of $n^3$ cells for some even value of $n$ (having an odd $n$ doesn't change the outcome by a lot, but is a bit different). The group action breaks the cube into orbits, for example, the corner cells form a single orbit of size 8. Note that the group elements can be also thought of as $3\times 3$ matrices, all of whose determinents are $1$ or $-1$ (the latter if the element induces a reflection), and the action thus preserves norms. To put it plainly, the distance of a cell from the cube's center is an invariant of the group action. This observation means that we can think of the cube as comprised of $\frac n 2$ nested hollow cubes of dimensions $n\times n \times n, (n-2)\times(n-2)\times(n-2), ... 4\times 4\times 4, 2\times 2\times 2$, sort of like a matryoshka doll of 3D cubes.
Image credit: Wikipedia |
It suffices to understand how the group acts only on one of these hollow cubes, so let's examine only the largest one - the $n\times n\times n$ hollow cube.
Classifying orbits
Suppose the cells are indexed w.r.t. the center of cube, so each cell is indexed by some $i, j, k\in \{-\frac n 2,..., \frac n 2\}\setminus \{0\}$ such that at least one of $i, j, k$ is $\pm \frac n 2$. We omit zero because $n$ is even and no cell is the center cell of the cube.Observe that using this notation, together with thinking of the group elements as permuting coordinates and/or pointwise multiplying them by $-1$, makes it evident that, up to a permutation, there are only 4 cases, or types of orbits:
Case 1: $\frac n 2 = |i| \neq |j|, |i|\neq |k|, |j| \neq |k|$, choosing such indices defines an orbit of size 48, since no permutation of the coordinates or multiplication of them by $-1$ leaves the indices as they were.
Case 2: $i = |j| = |\frac n 2|, |j|\neq |k|$, these are all the cells that lie on the edges of the hollow cube. A cell with such indices is invariant under elements switching $i$ and $j$, so its orbit's size is 24. To be precise, in the case where $j = -i = \pm \frac n 2$, swapping $i$ and $j$ is just like pointwise multiplying both by $-1$, so the orbit is obtained simply by moving $k$ around and pointwise multiplication by $-1$, and indeed the orbit size is 24.
Case 3: $|i| = |j|, |j| < |k| = \frac n 2$, these are all the cells that lie on the diagonals that run along the face of the hollow cube. For the same arguments as Case 2 - the orbit size here is 24.
Case 4: $|i| = |j| = |k| = \frac n 2$, these are the corner cells. The difference between two corners is only the sign of the coordinates, so the size of such an orbit is 8.
Counting orbits
It is now left to understand how many orbits there are for each case. We choose a representative for each orbit - $(i, j, k)$, and count those.Case 1: The choice is of some unordered pair of elements from $\{1,2,...,\frac n 2 - 1\}$. The pair is unordered since the group action puts $(i,j,k)$ in the same orbit as $(j,i,k)$, and we omit the negatives since they are obtained via the pointwise multiplication by $-1$. The number of ways to choose such a pair is therefore $\frac 1 2 (\frac n 2 - 1)(\frac n 2 - 2)$.
Case 2: Since $|j|$ and $|i|$ must be equal to $\frac n 2$, the only choice here is of $1\leq k < \frac n 2$ (negative values of $k$ are obtained via the pointwise multiplication by $-1$ of the group action). So clearly there are $\frac n 2 - 1$ of those.
Case 3: Like Case 2, the only choice here is of $1 \leq |i| < \frac n 2$, which in turn determins $|j|$. So there are $\frac n 2 - 1$ such orbits.
Case 4: This is the corners' orbit, so there is exactly one for a hollow cube.
Putting it together, a hollow 3D cube of dimension $n$ has $\frac 1 2 (\frac n 2 - 1)(\frac n 2 - 2) + 2 \cdot (\frac n 2 - 1) + 1 = \frac 1 8 (n^2 + 2n)$.
Since the full 3D cube, with an even dimension $s$, is comprised of $\frac s 2$ such hollow cubes, the number of orbits is: $\frac 1 8 \sum _{n=1} ^{\frac s 2} ((2n)^2 + 2(2n))$, which surprisingly gives us a sum of triangular numbers $\sum_{n=1} ^{\frac s 2} \frac {n(n+1)} 2$, which has a nice closed form $\frac {\frac s 2 \cdot (\frac s 2 + 1) \cdot (\frac s 2 + 2)} 6$.
Why triangular numbers?
This is a satisfyingly clean formula to ed up with, after a tedious computation, and as one would expect, it is so for a reason:As noted in the previous paragraph, the result is a sum of triangular numbers, which are called thus because they also represent the number of blocks you'd use to construct a triangle where the bottom row has $n$ blocks, the next row has $n-1$ blocks and so on.
The image below shows one representative of each orbit in the outer layer of an $8\times 8 \times 8$ cube. The Case 1 representatives are red, Case 2 - orange, Case 3 - yellow, and Case 4 - green.
And indeed these representatives form a triangle, and the sum of triangles really represents (sort of) a pyramid that goes from the outer layer of the cube inward, the colorful part of the image below being its base.
Thursday, April 28, 2016
Powers of Graphs Are Almost Always Hard
Early in 2011, I spent a few months as an exchange student in the University of Melbourne. Although I was writing my thesis at the time, I was looking for topics for a small-scale research, as I had to turn in some written assignment as part of the exchange program. I can't remember why, but I thought a bit about powers of graphs, and pondered about their relation to hardness. Since it turned out to be an easy question to solve, it wasn't enough for the written assignment (let alone a research paper), but I do have a warm spot for it, so here it is:
(If you are browsing on mobile, switch to desktop view to display the math stuff nicely)
(If you are browsing on mobile, switch to desktop view to display the math stuff nicely)
Background: Powers of Graphs
Given a simple graph (we restrict our discussion to this type for brevity) $G(V, E)$, we define $G^2$ by adding an edge between every two vertices $v,j\in V$ if there exists some $w\in V$ such that $\{v, w\} \in E$, and $\{u, w\} \in E$.
In other words - if there was a path of length 2 between $v$ and $u$ in $G$, then there's an edge between them in $G^2$. This works similarly for $G^3$ (where paths of length 3 or less in $G$ become edges) and so on.
Clearly, if the longest distance in the graph is of length $d$ (in which case we say that $d$ is the diameter of $G$), then $G^d$ is simply a clique, since all paths in $G$ are at most of length $d$.
Background - Independent Set
Given a graph $G(V, E)$, an independent set of vertices is some $U\subseteq V$ such that $u,v\in U \Rightarrow \{u,v\}\notin E$. Finding a maximum independent set (i.e. largest in size, not to be confused with maximal which means it is not strictly a subset of any other independent set) in a graph is known to be NP-hard, and also hard to approximate.
So here's the question:
In a clique, every vertex is a maximum (and also maximal) independent set, so for every simple graph $G$ with diameter $d$, solving the independent set problem for $G^d$ is easy: just choose any vertex you like, and you're done. This raises the question - if a problem is in general hard on $G$, but easy on $G^d$, how hard is it for $G^k$ where $k < d$? I mean, at which point does this very hard problem becomes trivial?
I'll save you the suspense - independent set is generally hard on $G^{d-1}$, so the problem is not all that interesting, and the hardness of independent set on powers of graphs is basically: hard, hard, hard, ... , hard, trivial. Not an exciting result, I admit, but let's prove it anyway.
Proof by Reduction
Proof sketch: given a graph $G$, we'll construct a graph $H$, that is only slightly larger than $G$, has the same diameter, and is such that solving the independent set problem for $H^{d-1}$ will give us the solution for $G$.
First, the case of an even $d$, which is simpler:
Denote $G$'s vertices by $v_1, v_2, ... , v_n$, we construct $H$ by taking $n$ disconnected vertices denoted $u_1, u_2, ..., u_n$ - we'll call those the 'old' vertices - and adding $n\cdot (\frac d 2 - 1)$ 'new' vertices to them in the following manner:
1. Add one special vertex $c$.
2. For every $i : 1\leq i \leq n$, connect $u_i$ to $c$ by a path of length $\frac d 2$ using $\frac d 2 - 1$ of the 'new' vertices (all of these paths should be disjoint, and that's okay, since we have exactly $n\cdot (\frac d 2 - 1)$ of these new vertices). Denote the vertices of the path from $c$ to $u_i$: $c\rightarrow u_i ^1 \rightarrow u_i ^ 2 \rightarrow ... \rightarrow u_i ^ {\frac d 2 - 2} \rightarrow u_i ^{\frac d 2 - 1} \rightarrow u_i$.
3. For every $\{v_k, v_j\} \in E$ that form an edge in $G$, add to $H$ the edge $\{u_k ^1, u_j ^1\}$.
$H$ is thus sort of a big star graph, having $c$ in the middle, with some additional edges due to step (3) above.
A few observations:
1. For every two vertices of the original graph $\{v_l,v_t\}\notin E$, the shortest path from $u_l$ to $u_j$ in $H$ goes through $c$ and is of length $d$. It follows that the diameter of $H$ is at least $d$.
2. In $H^{d-1}$, all of the new vertices form a clique, and each of them is connected by an edge to each of the old vertices. The diameter of $H$ is therefore at most $d$, and so it is exactly $d$.
3. It follows that a maximum independent set in $H^{d-1}$ cannot contain any of the new vertices.
4. $\{u_j, u_k\}$ is an edge in $H^{d-1}$ if and only if $\{v_j, v_k\}$ is an edge in $G$, so every independent set in $G$ correlates to an independent set of 'old' vertices in $H^{d-1}$ and vice versa.
Putting all these observations together boils down to the fact tha every maximum independent in $H^{d-1}$ correlates to a maximum independent set in $G$ and vice versa.
What about an odd $d$? well, instead of a single vertex serving as the center, we use a clique of n vertices that serves as center of $H$, and everything else works very much the same.
Note that a slight tweak can be used to show that $H^{d-2}$ is no easier, and so on.
So, alas, independent set is hard on all but the $d$-th power of a graph, in which case it is trivial.
Sunday, January 31, 2016
NP-Complete Problems as Repeatable Math Puzzles
Repeatable Puzzles
In the 3 years I worked at Matific, I was involved in designing all of the mini-games (we called them 'episodes') produced by the company. This meant that I spent a non-negligible part of my time on the hunt for math puzzles that are
- interesting and rewarding to solve
- rely only on elementary school math knowledge
- worth playing over and over, with respect to the above
Most of the really cool puzzles, such as determining which - out of 12 coins - was fake, using a balance scale and only 3 weighings, fall short of the "worth repeating" notion, as there's no real pleasure or challenge in solving them a second time. So googling "Math Puzzles" and reading puzzle books, though fun, was often of little use.
As time went by, and development progressed, puzzles turned out to be only a small part of what we did. A typical episode was rather required to help in understanding some deeper principal of a previously taught skill, via hands-on experience, which I think we managed to do well, and that's a topic for another post.
Nevertheless, there were some puzzles in the mix, for example: a game where the user is given 4-5 numbers (e.g. 2, 3, 4, 7), and is asked to get to a target number (e.g. 23) using the some of the four operations and each of the initial numbers (e.g. 4/2 + 3*7).
Another puzzle, for 1st graders, asked the student to arrange toys on shelves, subject to some constraints, e.g. - the doll must be above the teddybear, the car has to be to the right of the dinosaur, and so on.
A third puzzle was simply variants of a well known water-pouring puzzle: one has 2-3 containers, each of different volume (e.g. 5L and 3L) and the objective is to obtain another quantity (e.g. 4L) via pouring and filling.
As we designed more of those repeatable puzzles, a few common traits emerged:
For example, choose 4 random numbers in some range, e.g. 2, 5, 9, 11, and now choose operations at random, e.g. *, +, -, which gets 2 * 5 + 9 - 11 = 8. And so we present the user with the numbers 2, 5, 9, 11, setting 8 to be the target number. This approach has its drawbacks, such as producing problems using a certain solution while a simpler one exists, but they are less severe, product-wise.
This "start from the solution" approach took us a while to figure out, in which time we tried the naive approach, of randomizing a problem and running a solver. After a few of these, I noticed that we often realize that the problem is NP-hard, or in some cases assume it is so, not being sure how to prove it. A solver for such puzzles is problematic and can't scale to larger instances of the problem, especially seeing how the code had to be very light-weight to run on a browser. Lastly, a crucial part of programming a puzzle is checking the user's solution, which was almost always considerably easier than generating the problem. And so at some point it became clear to me that many of our so-called puzzles, were simply instances of NP-complete problems.
This was neat because it gave us something to draw inspiration from, in addition to poring over puzzle books.
A third puzzle was simply variants of a well known water-pouring puzzle: one has 2-3 containers, each of different volume (e.g. 5L and 3L) and the objective is to obtain another quantity (e.g. 4L) via pouring and filling.
Starting With the Solution
As we designed more of those repeatable puzzles, a few common traits emerged:- they involved a more than one skill. The pouring puzzles, for example, involves addition, subtraction, understanding of volume, and strategizing.
- we had to generate problems with well-tuned difficulty levels.
Let me focus on that last one: When one designs a puzzle, the first instinct may be to randomize the problem (e.g. how much water in each container? what is the target quantity?) and then write a solver that solves it, but that is rarely the right thing to do, for two reasons: 1) This leaves the difficulty level to chance, 2) Sometimes the problem has no quick solver, and the solver will needs to go over all possibilites to find the solution, or reach a conclusion that it has no solution, in which case a new problem has to be generated at random and checked and so on. Our standard strategy, which applied to most of the cases, was starting from the solution.
For example, choose 4 random numbers in some range, e.g. 2, 5, 9, 11, and now choose operations at random, e.g. *, +, -, which gets 2 * 5 + 9 - 11 = 8. And so we present the user with the numbers 2, 5, 9, 11, setting 8 to be the target number. This approach has its drawbacks, such as producing problems using a certain solution while a simpler one exists, but they are less severe, product-wise.
This "start from the solution" approach took us a while to figure out, in which time we tried the naive approach, of randomizing a problem and running a solver. After a few of these, I noticed that we often realize that the problem is NP-hard, or in some cases assume it is so, not being sure how to prove it. A solver for such puzzles is problematic and can't scale to larger instances of the problem, especially seeing how the code had to be very light-weight to run on a browser. Lastly, a crucial part of programming a puzzle is checking the user's solution, which was almost always considerably easier than generating the problem. And so at some point it became clear to me that many of our so-called puzzles, were simply instances of NP-complete problems.
This was neat because it gave us something to draw inspiration from, in addition to poring over puzzle books.
Some Thoughts
I think that what makes NP-complete problems such useful puzzles is that they somehow seem to come from the real world, at least more than those in harder complexity classes, which means many of them can be explained to a layman in a minute. The NP-completeness also means that there's no trick that the user learns once, and is then bored by subsequent puzzles that are solved using that same trick. Being in NP also means that solutions can be checked easily, so the software can check the user's solution, but also, when the user fails - and the software shows a correct solution, the user can see very quickly that it is correct. Lastly, they encourage a heuristic cross-topical approach to problem solving, which is to say there's no title saying "this is a subtraction problem, solve it using subtraction". Both the method and the tools used by the user (or student) are not part of the problem's definition, and that is how math usually presents itself in real life.
Subscribe to:
Posts (Atom)