Tuesday, March 28, 2017

The UX of Work Emails

The polite way to get a co-worker's attention to discuss an issue without bothering someone, is to send an email or some other form of asynchronous communication.


In recent years I have found that it is very useful to think, when writing an email, more like a UX designer than a writer. This is because an email is meant to induce behavior, specifically, in addition to getting my message across, I want a quick response.

So when composing an email I...
  1. Use bullet points: they make replying to different points in the email much easier, and requires less explaining when doing so.
  2. Highlight important stuff, and in general, make scheming through it easy and informative. 
  3. Use images when I can, because people love images and respond to them more than to text (as any Facebook page manager will tell you). This is not to say I start digging through image banks, rather - I use plenty of screenshots all the time (I use Shutter on Ubuntu and the Windows Snipping Tool on Win7). And yes, the occasional meme, if I have a big block of text with no relevant image to accompany it.
  4. For a long email - start it with a tl;dr version of it, summarizing the whole thing in a sentence or two. This works in a similar way to a news story's headline - it gives the appropriate framing and context.
  5. Use the subject line to help recipients sort through their emails. I believe that the subject line is usually ignored when actually reading an email. It is read when people sort through a long list of emails, or get a notification. It therefore should be short, and useful in reminding the reader how this email differs from others on the list.
  6. Use code that looks like code. I use this website to format the code parts of an email and make them look like code. It helps readability, and looks cool.
    int RandomInt() {
      return 4;
    }
  7. Make it cross-platform. People read their emails on various devices, so I try to be very conservative with attachments, and limit them to PDFs and other easily-read formats, when possible. When I have some 3D model in a proprietary format that illustrates my point - I may send it, but I send an image of it as well, so that colleagues can make sense of it even if they are on the train, using their phone to read emails.
An image. Yeah, I know, amazing.


Tuesday, January 31, 2017

Generate a Puzzle for Your Math Class in 5 Minutes


There's a famous puzzle, often referred to as the Zebra Puzzle. It goes something like this:


  1. There are five houses.
  2. The Englishman lives in the red house.
  3. The Spaniard owns the dog.
  4. ...
  5. ...
    ...
Here follow many more facts detailing who lives next to whom, and which pets they own and so on, until the final question is given: Who owns the zebra?
There is no quick and clean solution for this puzzle that involves a single big "Aha!" moment, but rather a series of steps, making ad-hoc assumptions, and trying to see if they sit well with each other and with the facts given, until a solution is found.


Constraint Satisfaction Problems

This puzzle belongs to a class of problems called Constraint Satisfaction Problems, or CSP. 
While CSPs in general are hard to solve, even for computers, small instances of them are both easy to create, and make for popular puzzles. An example of such puzzle is Sudoku.

Floors Puzzles

Here's a really easy CSP I just came up with:
  1. There are 4 floors in the building.
  2. The rooster does not live on the top floor.
  3. The cat lives two floors under the lion.
  4. the dog lives in a lower floor than the rooster.

Which animal lives on which floor?
Spoiler alert: if you like to solve this puzzle, do so now, since the next section will reveal the solution. The puzzle is, by the way, so easy that I suggest you solve it yourself anyway, just to get an understanding of what it feels like.
An animal that does not live on the top floor

This is a nice puzzle for 10 year old students. A similar puzzle with 3 floors will be suitable for 7 year olds, and naturally, more floors can make it increasingly harder to solve.


Create Your Own Floors Puzzle in 5 Minutes

  • Choose animals and number of floors. 
  • Draw a diagram of which animal goes where, e.g.:
  • Write down rules that describe your building - e.g. "the dog lives 3 floors below the lion".




Over a certain number - some of the rules will be redundant, that is - the puzzle may be solved even if those rules are not given, but don't worry about that. As a rule of thumb - the number of rules should be slightly smaller than the number of floors.

Checking Solutions

Note that it may be the case that there will be correct solutions that are not identical to the one you used to write the puzzle, but that's okay, because the puzzle is about finding a solution, not the solution. While it is possible, in principle, to check whether the rules you chose have more than one solution, it will probably take more than the 5 minutes promised in the title of this post... 

So there you have it - generate a puzzle for your math class in five minutes or less. Enjoy.



P.S.

On a less-practical note, about a year ago I wrote a post tying "good" puzzles to NP-complete problems, and this current post is really just another example of that.

Wednesday, October 5, 2016

Notes from my "Intro to Information Theory" time


For two years, starting October 2010, I was a teaching assistant in the Introduction to Information Theory course at the Hebrew University, which was taught by Prof. Michael Werman, and the following year by Prof. Michael Ben-Or.
During that time, I a weekly two-hour tutorial, which was sometimes an extension of what was discussed in the lecture, and sometimes (as in the case of error correction codes) simply dealt with other topics.

After writing the previous post on check digits, I dug up the old notes, and found them to be quite helpful in refreshing my memory on the matter.

So, hoping it will prove helpful to others as well, and under an assumption that I'm not violating anyone's copyrights, here they are:

Tuesday, September 27, 2016

Check digit: the poor man's error correction code

The Israeli ID number and the Luhn Algorithm


The Israeli ID number, assigned to every Israeli resident or citizen, has 9 digits: the left 8 digits are the actual number, whereas the right-most digit is a check digit.
The check-digit is computed using the Luhn algortihm which is:
1. Multiply digits in even places by 2.
2. Add the digits in the odd places (not including the check-digit itself) to the sum of digits of numbers you got in (1).
3. Set the 9th digit to be the number required to complete the sum from (2) to the next multiple of 10.

For example


The number 99035612 will be processed as follows:
9 - odd place, leave as is -> 9
9 - even place, multiply by 2 and take sum of digits -> 18 -> 9
0 - odd place, leave as is -> 0
3 - even place - multiply by 2 -> 6
5 - odd place, leave as is -> 5
6 - even place, multiply by 2 and take sum of digits -> 12 -> 3
1 - odd place, leave as is -> 1
2 - even place, multiply by 2 -> 4

sum all the above to obtain 37, which requires adding 3 to complete it to the nearest multiple of 10, thus 3 is chosen to be the check digit.

Had this been a valid ID, the check digit would have been 2 (not 9)
(image source: nbn.org.il)

Why all the fuss?


The most obvious idea for a check digit would be sum of digits modulo 10, so why all this odd-even business? It turns out it is intended to catch the very common transcription mistake of transposing two adjacent digits, and by the by also catches most of the cases of twin digits mistakes (i.e. aa->bb). In addition, it is simple, and therefore realistically usable for clerks in the pre-computers age.


The Luhn Algorithm's "unfixable" fault


Interestingly, the Luhn algorithm cannot detect all transpositions of two adjacent digits. Specifically, the sequences 09 and 90 both contribute 9 to the sum computed in step (2) so this transposition will not cause a change in the check digit and will not be detected.
So I grabbed a piece of paper and tried to come up with a similar algorithm that will catch *all* transpositions, only to find out that - if we restrict ourselves to similar algorithms (wanting to preserve the simplicity of the computation) - there is no such solution.
To be precise, let $f:\{0,...,9\}\rightarrow\{0,...,9\}$ be a permutaion, then there exist $a \neq b \in \{0,...,9\}$ such that $f(a) + b \equiv a + f(b) (\textrm{mod} 10)$.
My friend, Lior Yanovski, suggested an elegant proof for this:
Assume, by way of negation, that $a\neq b \Rightarrow f(a) + b \not\equiv a + f(b) (\textrm{mod} 10)$.
It follows that $ f(a) - a \not\equiv f(b) - b (\textrm{mod} 10)$, which means the function $g(x) := f(x) - x (\textrm{mod} 10)$ is a permutation.
If we sum $g(x)$ over $x$ we obtain $\sum_x f(x) - \sum_x x \equiv 0 (\textrm{mod} 10)$ simply because both $f$ and the identity are permutaions adding up to 55. On the other hand, if $g$ is indeed a permutation - summing up its values would give $5 (\textrm{mod} 10)$, i.e., a contradiction!
So if we were to look for a function to replace the "multiply by 2 and take sum of digits" from the Luhn algorithm, while maintaining the rest of the algorithm's structure, and covering all adjacent digits transpositions - we won't find one.


Hans Peter Luhn 1896-1964
(Image source: asist.org)


Fix #0: Find only transpositions


We could multiply every digit by its place, and take this sum modulo 10, i.e. define the check digit to be $\sum_{i=1} ^n i \cdot a_i (\textrm{mod} 10)$. This will detect all adjacent transpositions, but it will do a very poor job in detecting single digit errors. Seeing how 10 shares divisors with most of the numbers in $0,...,9$, multiplying by either of them and taking the remainder modulo 10 discards some information. For example $5\cdot x$ will always be either 0 or 5 (assuming $x$ is an integer). This means that a random error in the fifth digit will cause a change in the check digit only 50% of the times, and somewhat similar problems arise with digits in even places.

Fix #1: Abandon base 10


A closer inspection would reveal that the proof of the Luhn algorithm's fault relies heavily on the fact that 10 is an even number, since the expression $\sum_{i=1} ^n i (\textrm{mod} n)$ is $\frac n 2$ for even values of $n$ but $0$ for odd values of $n$. If we had used an odd base for this entire computation - things would be different, but perhaps less intuitive to compute by hand.
Indeed, some check digit systems, such as the one used in ISBN 10 (for bank transfers) employ a similar trick with 11 as the base of the modular operations.


Fix #2: Abandon simplicity


If we allow yet more complex computations, then there are some algorithms that produce a check digit that will detect all single-digit errors as well as all transpositions of adjacent digits. The oldest one probably being Verhoeff algorithm, from 1967. Verhoeff, who is now retired and an avid creator of math-inspired sculptures, took an empirical approach, by analyzing data from the Dutch postal service, classifying and counting common errors. He followed to create a code that not only detects all adjacent digits transpositions and single digit errors, but also detects errors based on phonetic similarities in dutch, such as confusing 15 (vijftien) with 50 (vijftig).

Jacobus Verhoeff, with his sculptures
(Image source: groene.nl)
More recently (2004), H. Michael Damm proposed an algorithm that is slightly less complicated (but only very slightly), and has similar qualities.



Some tests


Running some quick scripts in python, I wanted to find out how well the Luhn algorithm, the Damm algorithm (I skipped implementing Verhoeff's) and the algorithm proposed in Fix #0 above fare. I also added two more algorithms for fun:

  • Naive - $|\sum_{i = 1} ^{n - 1} (a_i - a_{i+1})| (\textrm{mod} 10)$.
  • RandomTable - the Damm algorithm is based on using a table of a quasi group of order 10. So I wondered how well a random table would do, so I took 10 random permutations, one above the other, and used it in the same way the Damm algorithm is implemented.

The errors introduced were:

  • Adjacent transposition: 12345 -> 12435
  • One-over transposition: 12345 -> 14325
  • Random 2-digit transposition: 12345 -> 42315
  • Single digit random error: 12345 -> 19345
  • Two-digit random errors: 12345 -> 22346
  • 3, 4, 5... 9 random errors.

The table shows the miss rate. Note that 10% miss rate is the magic number that they all converge to, given enough noise. This is because if we were to choose some random function $f:(0...9)^n\rightarrow\{0...9\}$ then for two random inputs $x \neq y \in (0...9)^n$, clearly $Pr(f(x) \neq f(y)) = 0.9$. A random function on random data has a 10% miss rate, so useful algorithms try to get significantly better than 10% for specific cases, without causing spikes in the miss rate for other errors.



error \ algo Luhn Damm Fix #0 Naive Random
Adjacent transposition 2% 0% 0% 70% 34%
One-over transposition 87% 10% 9% 63% 22%
Random transposition 45% 8% 10% 53% 27%
1-digit error 0% 0% 10% 67% 36%
2-digit error 10% 10% 10% 44% 25%
3-digit error 10% 10% 10% 27% 18%
4-digit error 10% 10% 10% 17% 14%
5-digit error 10% 10% 10% 12% 11%
6-digit error 10% 10% 10% 10% 11%
7-digit error 10% 10% 10% 10% 10%
8-digit error 10% 10% 10% 10% 10%
9-digit error 10% 10% 10% 10% 10%



Error correction codes for edit distance?


So, while this is a pre-computing-age problem, that is nowadays a toy problem, it highlights an interesting issue: the standard error-correction codes treat a very specific domain, where errors are modeled as noisy coordinates in a vector, and so the notion of distance is just the Hamming distance. Are there good codes for domains where the distance is more complicated, such as Levenshtein distance?


It turns out that once we tweak the noise model, we quickly step into the realm of open problems. While for the Binary Erasure Channel, which models noise as flipping bits, we've known the capacity ever since it was introduced, for the Deletion Channel, which models noise as removing a bit from the input string, the capacity is still an open problem.
To summarize - edit distance correction codes are an interesting idea, but I doubt we'll see any progress in them soon, as even the fundamental questions of Information Theory turn out to be extremely challenging in this domain.

Tuesday, August 2, 2016

Pokémon GO meets FarmVille: a game pitch

So, here's a sketch for a mobile location-based farming simulation game I thought of:

Elevator pitch:


Pokémon GO meets FarmVille: players walk around in the real world, where real world locations have in-game roles. The players grow crops, and harvest them, and since this is the real world - they can 'own' land.


Basic principles:


  • Each player can "own" portions of land in the real world, and use it to grow crops.
  • Crops can be sold in markets, which are also places in the real world (kind of like PokéStops).
  • Land can be bought and sold, workers can be hired, however, a player must visit his/her land often enough, otherwise it becomes abandoned, in which case other players can buy it cheaply.
The latter is crucial to the game, as it plays to users habits (making it easy to buy and maintain land next to home / work, or on the way there), and prevents early adopters from hoarding land.



Gameplay:


Buying land

Each player starts the game with some in-game money, no workers, and some basic seeds (wheat, corn, or rice, depending on the climate).
Walking around in the real world, the player can see which pieces of land are available for buying, and for how much. Abandoned pieces of land are priced by their size, and the price of land owned by other users is determined by that user, who can also choose "not selling". Any player can bid for any piece of land they pass by, as long as they have the money. Selling land does not require the player to be present at the spot.

Working the land

Having bought a new piece of land, the player has to work the land and prepare it for seeding. If the player has hired workers - they can be left there to do the job, but otherwise - the player has to stand close to the land for a minute, tapping the screen of the mobile device, thereby "working the land".

Crops 

Seeds for various crops can be bought at any market and planted in the land, requiring the player to visit the land in order to do so. Similarly to FarmVille, the crops take time to grow, and can rot if not harvested in time. Harvesting, like any type of work on the land, can be done by the player standing next to it and tapping the screen, or by workers left there for that purpose.

Markets

These are places in the real world, that play a part in the game but cannot be bought or sold and are determined by the system, similarly to PokéStops. Standing close to a market, a player can buy seeds, machinery, sell crops, and hire workers (for a predetermined time) with various levels of expertise.

Weather and Location

Crops can be grown only in places where the weather makes sense for it to happen, or in greenhouses, and real-world weather affects the virtual crops and their prices, so a storm may kill some of the crops, but raise prices.

Advancement

Players advance through levels through experience points that are gained by playing the game (growing crops, buying land, harvesting, hiring help, etc.).
Various achievements can be unlocked, such as managing 10 workers at once, growing exotic crops, owning a lot of land.

Social aspect?

Not sure, I haven't thought about it much, since I really don't like the way FarmVille did it...

Sunday, July 10, 2016

My Perspective on Prisma: It's Burning Money Fast

For the past couple of weeks - the photo-processing app Prisma has been gaining huge attention, and rightfully so - they use CNNs (+3 points for trendy buzz word) and AI (+2.5 points) to process its users' photos and turn them into truly impressive art work, inspired by great artists such as Van Gogh, Picasso, and others (+1.5 points for name dropping).

Here's what it looks like on the famous Success Kid's image:


 Which is it pretty impressive. As usual with these things, well lit highly contrasted photos work best, and it's loads of fun to play with.


Since the app is free, it begs the question: how do these people plan to make money?
Well, here are a few things to consider:

  1. Some of their filters are branded (e.g. by Palmolive), thus generating revenue through advertising.
  2. Since yesterday, the photos come with a water mark on their bottom-right corner, which I guess won't be there in some premium paid version.
  3. As a friend of mine pointed out: it is very hard to do these computations on the device, and as he found out - indeed when you put your device into airplane mode - the app doesn't work, so the heavy lifting must be done in the cloud. To quote my friend: "this is one free app with a huge AWS bill".
  4. In its Product Hunt thread, one of the Co-founders responded to an inquiry about their business model by promising that the app will always be free:



Conclusion: smells like trouble

So while this company is obviously putting out a good product (especially if we ignore the annoying 2-3 seconds wait when applying a filter), I believe balancing their cloud bill with revenue from advertisers seems like a big challenge. Of course, they may be aiming at getting bought quickly by the big guys, but the model of allowing users to heavily drain cloud resources, with a limited ability to monetize (unlike search results, social networks feed, etc.), may mean they won't live long enough for this to happen.



Friday, July 1, 2016

Deliberate Learning: 3 Tips for Acquiring New Skills

Recently, through Freakonomics, I came across the notion of deliberate practice. The idea, originally introduced by Ericsson et al in 1993, and more notably popularized by Malcolm Gladwell's "Outliers", is that the contribution of natural talent to greatness in art, science, and sports, is far smaller than we tend to assume.
The tl;dr version of it is: the thing that differentiates the greatest among us from the rest is not their talent, but the manner in which they practice. Specifically, they practice an awful lot, and do so in a reflective process that efficiently translates the invested time to a consistent improvement in performance.

Mozart - a guy who was great at something
 While I don't expect to become the world's greatest anything, some principles of deliberate practice struck me as familiar, and made me think about the way that I learn. I like learning new things, and having done so in a relatively wide variety of areas (painting, music, education, math, algorithms, poetry, design), I wanted to share what works for me when I come across new stuff to learn. To clarify, this is not deliberate practice. Deliberate practice is about becoming great at something over a long period of time. I want to focus on something much more common and useful for me: becoming reasonably proficient at something quickly.
Let's call it deliberate learning.


1. Choose a mentor and follow blindly, for a limited time

For a limited time, blindly follow you will
This seems to me particularly useful, while being quite challenging for anyone with a critical mind. It's also very obvious and most people do it to some extent, but it's a good idea to do so consciously as it allows us to switch it on when we need to learn something, whereas the natural tendency, especially with age, is to do it less and less, mostly due to ego.
 The idea is to find a teacher, who may be someone teaching you a course at the university, a colleague at work, or even your superior. This person should be very experienced and much better than you at what you're trying to learn, and should be available to you on a regular basis. You should buy into this person's point of view on the subject you're trying to learn: imitate this person's methods, ask for advice, ask questions, and just believe everything they tell you (professionally). For the limited time that you're adopting this person as your mentor, you should also suspend your disbelief about their professional shortcomings. It is a surprisingly powerful approach, as it let's you really run with ideas and concepts along a path someone else already paved, and with their help. After the limited time has passed (e.g., the university term), you can reevaluate all the stuff you used to embrace, and gradually throw away whatever you disagree with, or adapt it to your own style. One last point: this mentor doesn't have to be a person - it may even be a book.

2. Steer toward the boring and the difficult

Boring and difficult? Do go on!
Like the mentor thing, this is about suspending the self for a while. After you finish your learning period on the subject (I know the common wisdom is one should always be learning, but very rarely can one maintain a state of learning about everything) you'll naturally stay away from the stuff that bores you or that you don't understand well. This means that now is the only time you'll have to gain some ground in these areas. This being a learning period, you're also relatively energetic, and can muster the courage to face those areas, so it is very literally now or never. So be on the lookout for boring or difficult topics, and fearlessly dive into them. Your environment (at work or school) knows you're just learning, so it will be more forgiving to failure, which is inevitable when trying new and difficult things.

3. Get people whose judgement you trust to criticize your work frequently

My work sucks? I demand a trial by combat!
This is obviously necessary, and naturally hard on the ego, so encourage yourself with the following as you go:
- Paraphrasing Fight Club - you are not your work, so your work can and will suck sometimes, which doesn't mean you, in general, do.
- Going from terrible to okay is way easier than going from good to great, so if your work is shitty, this is good news as it means with little work you can improve a lot.
- Crummy work counts for experience too, so it wasn't a waste of time.