Ziv Scully
PhD student, Computer Science Department, Carnegie Mellon University
http://ziv.codes/
Fri, 09 Jun 2017 07:27:03 +0000
How I Draw Slides
<p>After my MAMA talk a few days ago,
many people were curious how I made my <a href="/pdf/multitask-talk.pdf">slides</a>.
The short version:
I drew them with a tablet in an SVG editor, and I recommend it!
Details below.</p>
<h2 id="hardware">Hardware</h2>
<p>I use an <a href="https://us-store.wacom.com/Product/Intuos-Art-Medium-S01">Intuos Art</a>, medium size,
which I got specifically for this purpose.
Its only fancy feature is pressure sensitivity,
but this is enough for my simple drawings.
Drawing on the tablet while looking at the computer screen
takes some getting used to.
The tablet has a grid of dots that make it feasible
to draw while looking at the tablet rather than the screen,
which I found easier for shapes with lots of right angles.
I’d consider getting a Surface or iPad Pro in the future
to be able to see more directly what I’m drawing.</p>
<h2 id="software">Software</h2>
<p>There are two steps to making the slides:
drawing pictures and assembling them with text as a presentation.
I did not find any tool that was good at both,
but I did find a pair that works well.</p>
<p>I use <a href="https://graphic.com">Autodesk Graphic</a> for drawing
and <a href="https://www.apple.com/keynote/">Keynote</a> for slides.
The key feature of this pair is that
you can copy and paste any selected part of a drawing
directly from Graphic into Keynote—no exporting or importing required!
This is invaluable for iterating quickly.
I would have needed maybe 50 files if I had to export and import
each animated component individually,
plus an extra 10 or so from slides that got cut.</p>
<p>Here are some other programs I tried.</p>
<ul>
<li>PowerPoint seems to work with Graphic well, too.</li>
<li>I actually found the most paper-like drawing experience
in raster (pixel-by-pixel) art programs like Corel Painter
(a version of which comes bundled with the Intuos Art),
but I’m willing to put up with a little clunkiness
to produce SVGs (scalable vector graphics).</li>
<li>Inkscape could maybe replace Graphic on Windows or Linux,
but on my Mac, I could only copy-paste raster images from it.
I think this is some sort of Pasteboard-CLIPBOARD incompatibility.</li>
<li>Curiously, OneNote creates vector graphics
while feeling as natural as the raster programs.
However, I couldn’t copy-paste vector drawings from OneNote
into Keynote, PowerPoint, Preview, or anything else I tried.
For instance, when pasting into PowerPoint, the image gets rasterized.
The workaround is clunky and involves exporting and importing.</li>
<li>Xournal also creates vector graphics and has a good drawing feel,
but it doesn’t use smooth curves in its paths,
which makes it look worse than Graphic and OneNote.</li>
</ul>
<p>Here are some more details about how I use Graphic.</p>
<ul>
<li>I use the brush tool in Graphic with 10% smoothing.
This works pretty well, but I have to try drawing each shape a few times.
When a shape comes out well except for a small part,
I sometimes manually tweak it with the path tool.</li>
<li>I use lots of layers in Graphic to break up drawings into pieces.
If a complicated drawing has a complicated animation,
each stage of the animation gets its own layer at the very least.
<ul>
<li>Graphic makes it very easy to select all objects
that live in an arbitrary subset of layers,
so parts of the drawing that are common to multiple animation stages
each get their own layer, too.</li>
<li>For example, the animation on <a href="/pdf/multitask-talk.pdf">slide 2</a> has 7 layers.
In order: the queue, jobs, speech bubbles,
one for each of the red, green, and blue distributions,
and the coordinate axes.</li>
<li>Some of my layers collect many small utility drawings.
One of them has several arrows and curly braces, for instance.</li>
</ul>
</li>
<li>I made a <a href="/pdf/gittins-index-intro.pdf">previous presentation</a>
by drawing each slide individually in Graphic.
This worked okay, but each slide being a separate file
made it too cumbersome to make animations.</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>I find that tablet-drawn slides give a presentation a friendly vibe
while keeping a crisp look.
Perhaps more importantly,
drawing on a tablet reduces the activation energy (for me, at least)
of including pictures and animations.
After finding the right setup,
I’m now faster at drawing pictures in Graphic
than building similar diagrams directly in PowerPoint or Keynote
(or—<em>shudder</em>—TikZ).
If you give this approach a try, let me know how it goes!</p>
Thu, 08 Jun 2017 00:00:00 +0000
http://ziv.codes/2017/06/08/how-i-draw-slides.html
Adjoint Functors and Computation
<p>I’ve been sitting on this post not finishing it for almost a year. At this point it’s as finished as it will ever be, so I’m putting what I’ve got so far out there.</p>
<hr />
<p>This post is about category theory. If you don’t yet see the “fun” in “functor”, it will probably be difficult to follow. If you want to try to follow along anyway, look up what categories/product objects/exponential objects/functors/natural transformations/adjunctions are, try to read this post, fail to get very far, find three or four more introductions to category theory, read all of them to gain as much intuition as possible from their slightly different perspectives, try again, fail again, become a monk at one of those isolated-in-the-mountains math temples, vow not to speak until you truly understand the Yoneda Lemma, realize this all escalated rather quickly, decide you’ve had enough, move to a small island in the Caribbean, conclude that maybe the beach is more fun than math, have an epiphany while brushing teeth revealing a tiny aspect of what adjoint functors might be all about, and write a blog post about it. That’s more or less how I got here.</p>
<p>Only slightly more seriously: you can probably get something out of this post only if you know what categories and functors are, and familiarity with their basic concepts and notation shall be mercilessly assumed. Before continuing, if you do not yet know what a natural transformation is, you should at least attempt to read a precise definition of the term elsewhere, because you won’t find one here. We’ll start with an attempt to grasp that formal definition intuitively, and we’ll continue that trend throughout.</p>
<h2 id="definitions">Definitions</h2>
<p>If <script type="math/tex">F</script> is a functor and <script type="math/tex">X</script> is an object, then <script type="math/tex">FX</script> is in some sense an object with “outer structure” of type <script type="math/tex">F</script> and “inner information” of type <script type="math/tex">X</script>. A functor “maps” a morphism by using the morphism only on the level of inner information while leaving the outer structure intact. For example, consider the list functor, <script type="math/tex">L : \mathbf{Set} \to \mathbf{Set}</script>. On objects, it brings each set <script type="math/tex">X</script> to the set of lists of elements of <script type="math/tex">X</script>. On morphisms, it maps each function <script type="math/tex">f</script> to “mapped <script type="math/tex">f</script>”, which, given a list as input, outputs the list of results of applying <script type="math/tex">f</script> to each member of the list. For instance, if <script type="math/tex">f(x) = x + 3</script>, then <script type="math/tex">Lf([1,2,3]) = [4,5,6]</script>. Mapped functions deal only with inner information, applying a function to individual elements of a list, but they don’t modify outer structure by adding, removing, or rearranging elements of the list.</p>
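<p>As a small illustration of the list functor's action on morphisms, here is a minimal Python sketch. The name <code>list_map</code> is made up for this example, and Python lists stand in for elements of the set of lists; the functor laws are only spot-checked on one input, not proved.</p>

```python
from typing import Callable, List, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def list_map(f: Callable[[X], Y]) -> Callable[[List[X]], List[Y]]:
    """Action of the list functor on a function f: returns "mapped f"."""
    def mapped(xs: List[X]) -> List[Y]:
        # Apply f to each element; the length and order of the list are untouched.
        return [f(x) for x in xs]
    return mapped


def add_three(x: int) -> int:
    return x + 3


def double(x: int) -> int:
    return 2 * x


result = list_map(add_three)([1, 2, 3])

# Functor laws, spot-checked on a single list.
identity_ok = list_map(lambda x: x)([1, 2, 3]) == [1, 2, 3]
composition_ok = (
    list_map(lambda x: double(add_three(x)))([1, 2, 3])
    == list_map(double)(list_map(add_three)([1, 2, 3]))
)
```

<p>Here <code>result</code> is the list from the example above: only the inner information changes, and the outer list shape stays the same.</p>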
<p>Natural transformations are the opposite: they are morphisms that act on outer structure only, leaving the inner information intact. A natural transformation <script type="math/tex">\alpha : F \to G</script> maps outer structure <script type="math/tex">F</script> to outer structure <script type="math/tex">G</script>. Of course, <script type="math/tex">F</script> and <script type="math/tex">G</script> aren’t objects (of the categories we care about right now), so <script type="math/tex">\alpha</script> is represented as a collection of <em>components</em>, with an <script type="math/tex">\alpha_X : FX \to GX</script> for each object <script type="math/tex">X</script> in the domain of <script type="math/tex">F</script>, but in a certain sense all its components do the same thing. As an example, for any <script type="math/tex">X</script>, we can define a list reversal function <script type="math/tex">\rho_X : LX \to LX</script>. But this is kind of tedious: to reverse a list, we don’t care which set its members are from. We just change their order. We just change the outer structure. List reversal is “polymorphic” in that any choice can be made for what’s inside the list being reversed. That is, list reversal is a natural transformation <script type="math/tex">\rho : L \to L</script>.</p>
<p>The notion of operating on outer structure only is made precise by the naturality condition. Given a morphism <script type="math/tex">f : X \to Y</script>, which acts only on inner information, and a natural transformation <script type="math/tex">\alpha : F \to G</script>, which acts only on outer structure, there are two ways we can imagine building a morphism that transforms both, <script type="math/tex">FX \to GY</script>: either use a map of <script type="math/tex">f</script> on inner information followed by <script type="math/tex">\alpha</script> on outer structure, or vice versa. If <script type="math/tex">\alpha</script> really does ignore inner information and maps of <script type="math/tex">f</script> really do ignore outer structure, these two choices should be the same. The naturality condition captures this in an equation: <script type="math/tex">Gf \circ \alpha_X = \alpha_Y \circ Ff</script>.</p>
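<p>The naturality condition can be spot-checked in Python for the list functor. This is a rough sketch, not a proof: <code>naturality_holds</code> is a hypothetical helper that tests the equation on a single list, and sorting is included only as a contrasting family of functions (defined here just for lists of numbers) that peeks at inner information.</p>

```python
def list_map(f, xs):
    """Mapped f: apply f to every element of the list xs."""
    return [f(x) for x in xs]


def reverse(xs):
    """List reversal: rearranges outer structure, never inspects elements."""
    return xs[::-1]


def naturality_holds(alpha, f, xs):
    """Check (mapped f) after alpha == alpha after (mapped f) on one input."""
    return list_map(f, alpha(xs)) == alpha(list_map(f, xs))


def negate(x):
    return -x


# Reversal commutes with mapping, as a natural transformation should.
reverse_ok = naturality_holds(reverse, negate, [1, 2, 3])

# Sorting compares elements, so it uses inner information and fails the check.
sort_ok = naturality_holds(sorted, negate, [1, 2, 3])
```

<p>With these inputs, reversal passes the check and sorting fails it, because negating the elements flips the order that sorting depends on.</p>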
<p>An <em>adjunction</em> of two functors <script type="math/tex">F : \mathcal{C} \to \mathcal{D}</script> and <script type="math/tex">G : \mathcal{D} \to \mathcal{C}</script> is a pair of natural transformations:</p>
<ul>
<li>the <em>unit</em>, <script type="math/tex">\eta : 1_{\mathcal{C}} \to GF</script>, and</li>
<li>the <em>counit</em>, <script type="math/tex">\varepsilon : FG \to 1_{\mathcal{D}}</script>;</li>
</ul>
<p>satisfying a pair of natural transformation composition laws called the <em>triangle identities</em> (because when drawn as commuting diagrams, each equation is a triangle): for all objects <script type="math/tex">X</script> in <script type="math/tex">\mathcal{C}</script> and <script type="math/tex">Y</script> in <script type="math/tex">\mathcal{D}</script>,</p>
<ul>
<li><script type="math/tex">\varepsilon_{FX} \circ F\eta_X = 1_{FX}</script>, and</li>
<li><script type="math/tex">G\varepsilon_Y \circ \eta_{GY} = 1_{GY}</script>.</li>
</ul>
<p>If you didn’t know what an adjunction was already, well, now you… probably still don’t. But don’t panic! If you followed most of the discussion of natural transformations, you’re all set to keep reading. The internet is full of many detailed explanations of the definition of adjunctions written by people who know it better than I do. My personal favorite is a series of videos by <a href="https://youtu.be/loOJxIOmShE">The Catsters</a>, but, as mentioned in the introduction, seeing many explanations and intuitive perspectives helped me a lot. Instead of giving additional definitional detail, this post introduces another such intuitive perspective: some adjunctions can be thought of as describing <em>evaluation of computation</em>.</p>
<h2 id="free-and-forgetful-functors">Free and Forgetful Functors</h2>
<p>Some classic adjoint functor pair examples are “free” and “forgetful” functors for various algebraic structures over sets, such as groups, rings, and monoids. For concreteness, we consider monoids, which are quickly defined and explained below.</p>
<p>A <em>monoid</em> is a set with an associative binary operation and an element that’s the left and right identity of that operation. Monoids are like groups in which inverses might not exist. Indeed, all groups are also monoids. One monoid that isn’t a group is the set of all <script type="math/tex">n \times n</script> matrices: multiplication is an associative operation with an identity, but not all matrices are invertible. A <em>monoid homomorphism</em> is, analogous to a group or ring homomorphism, a function that preserves the operation and its identity. In equations, using <script type="math/tex">\bullet_A</script> and <script type="math/tex">1_A</script> to denote the operation and identity element of a monoid <script type="math/tex">A</script>, we say <script type="math/tex">f : A \to B</script> is a monoid homomorphism if <script type="math/tex">f(x \, \bullet_A \, y) = f(x) \, \bullet_B \, f(y)</script> and <script type="math/tex">f(1_A) = 1_B</script>. There is a category of all monoids, which we creatively call <script type="math/tex">\mathbf{Mon}</script>, with all monoids as objects and all monoid homomorphisms as morphisms. As with groups, by default we call the operation “multiplication”, and we write it as juxtaposition, often without parentheses, which associativity makes unnecessary.</p>
<p>We have two functors to define: the <em>free</em> functor, <script type="math/tex">F : \mathbf{Set} \to \mathbf{Mon}</script>, and the <em>forgetful</em> functor, <script type="math/tex">G : \mathbf{Mon} \to \mathbf{Set}</script>. The forgetful functor is easy to describe. On objects, it maps each monoid to its underlying set of elements, “forgetting” what the operation does and which element is the identity. On morphisms, it maps each monoid homomorphism to its underlying function between two sets, “forgetting” that the function happened to satisfy any equations.</p>
<p>The free functor is slightly trickier to describe. The <em>free monoid</em> on a set <script type="math/tex">X</script>, written <script type="math/tex">FX</script> (hint, hint), is a monoid “freely generated” by the elements of <script type="math/tex">X</script>. This means two things.</p>
<ul>
<li>By “generated”, we mean that the underlying set of <script type="math/tex">FX</script> has all the elements of <script type="math/tex">X</script> plus anything else needed to be a monoid. For example, if we were to generate a monoid from <script type="math/tex">\{17,42\}</script> using <script type="math/tex">+</script> as the operation, our generated monoid would need <script type="math/tex">0</script>, because it’s the identity, <script type="math/tex">59</script>, because it’s <script type="math/tex">17+42</script>, and many more numbers.</li>
<li>By “freely”, we mean that the operation of <script type="math/tex">FX</script> never assumes two things are equal if they don’t have to be. For example, if <script type="math/tex">X = \{x,y,z\}</script>, then <script type="math/tex">x(yz) = (xy)z</script> is required by associativity, but <script type="math/tex">xy \neq yx</script> because no monoid axiom says they have to be equal.</li>
</ul>
<p>It turns out that <script type="math/tex">FX</script> has a concise interpretation: the free monoid on <script type="math/tex">X</script> is <em>lists of elements of <script type="math/tex">X</script></em>, with concatenation of lists as the operation. For instance, concatenating the lists <script type="math/tex">[x,y,y]</script> and <script type="math/tex">[x,z,x]</script> gives</p>
<script type="math/tex; mode=display">[x,y,y][x,z,x] = [x,y,y,x,z,x].</script>
<p>The identity of <script type="math/tex">FX</script> is the empty list, <script type="math/tex">[]</script>. We sometimes call the free monoid the “list monoid”.</p>
<p>As suggested by our notation, the free functor <script type="math/tex">F : \mathbf{Set} \to \mathbf{Mon}</script> maps each set to the free monoid on it. To finish the definition, we need to define how to turn a function <script type="math/tex">f : X \to Y</script> into a monoid homomorphism <script type="math/tex">Ff : FX \to FY</script>. Ignoring the monoid homomorphism conditions, our task is this: given a function <script type="math/tex">f : X \to Y</script> and a list of elements of <script type="math/tex">X</script>, generate a list of elements of <script type="math/tex">Y</script>. Recalling our discussion of the list functor <script type="math/tex">L</script>, we take <script type="math/tex">Ff</script> to be list-mapped <script type="math/tex">f</script>. It’s not hard to check that this satisfies the axioms for both functors and monoid homomorphisms.</p>
<p>A subtle distinction bears mentioning: though we call both “list-mapped <script type="math/tex">f</script>”, <script type="math/tex">Ff</script> and <script type="math/tex">Lf</script> are not the same thing. They’re not even the same type of thing! The former is a monoid homomorphism, and the latter is a function. That said, they are related: <script type="math/tex">Lf</script> is the underlying function of <script type="math/tex">Ff</script>. (In fact, <script type="math/tex">GF = L</script>. More on this in a bit.)</p>
<p>Similar notions exist for groups and rings. We’ll focus on monoids in the next section but will mention rings as well, for which we’ll need the following result (with proof left as an exercise, of course): the free (commutative) ring on a set <script type="math/tex">X</script> is the polynomial ring with integer coefficients where each element of <script type="math/tex">X</script> is a variable.</p>
<h2 id="the-free-forgetful-counit-is-expression-evaluation">The Free-Forgetful Counit is Expression Evaluation</h2>
<p>Let’s summarize the story so far.</p>
<ul>
<li>The free functor, <script type="math/tex">F : \mathbf{Set} \to \mathbf{Mon}</script>, maps each set to its list monoid and each function to its list-mapped version.</li>
<li>The forgetful functor, <script type="math/tex">G : \mathbf{Mon} \to \mathbf{Set}</script>, maps each monoid to its underlying set and each monoid homomorphism to its underlying function.</li>
</ul>
<p>As mentioned at the beginning of the previous section, these are adjoint functors, which means there are natural transformations <script type="math/tex">\eta : 1_{\mathbf{Set}} \to GF</script> and <script type="math/tex">\varepsilon : FG \to 1_{\mathbf{Mon}}</script> satisfying the triangle identities. Before trying to figure out what <script type="math/tex">\eta</script> and <script type="math/tex">\varepsilon</script> are, let’s first understand what the relevant functor compositions are.</p>
<ul>
<li><script type="math/tex">GF : \mathbf{Set} \to \mathbf{Set}</script> brings a set <script type="math/tex">X</script> to the underlying set of the list monoid on <script type="math/tex">X</script>, which is the set of lists of elements of <script type="math/tex">X</script>. We’ve actually seen <script type="math/tex">GF</script> before: it’s the list functor <script type="math/tex">L</script> from the discussion of natural transformations.</li>
<li><script type="math/tex">FG : \mathbf{Mon} \to \mathbf{Mon}</script> brings a monoid <script type="math/tex">Y</script> to the list monoid on the underlying set of <script type="math/tex">Y</script>. I like to think of this as the monoid of “unevaluated expressions” in <script type="math/tex">Y</script> by thinking of a list of elements of <script type="math/tex">Y</script> as a list of terms to be multiplied. Multiplying unevaluated expressions corresponds to list concatenation. For example, we can multiply <script type="math/tex">17 \times 42</script> and <script type="math/tex">38 \times 99</script> without simplifying to get <script type="math/tex">17 \times 42 \times 38 \times 99</script>.</li>
</ul>
<p>This, along with the intuition of natural transformations as “polymorphic” morphisms, is enough to guess what the unit and counit are.</p>
<p>Let’s start with the unit, <script type="math/tex">\eta : 1_{\mathbf{Set}} \to GF</script>. Given a set <script type="math/tex">X</script>, a component <script type="math/tex">\eta_X : X \to GFX</script> is a function from <script type="math/tex">X</script> to lists of elements of <script type="math/tex">X</script>, which we called <script type="math/tex">LX</script> earlier on and call <script type="math/tex">GFX</script> now. That is, <script type="math/tex">\eta_X</script> gets a single element of <script type="math/tex">X</script> as input and has to produce a list of elements as output. A simple way to do this is to produce a singleton list, so we define</p>
<script type="math/tex; mode=display">\eta_X(x) = [x].</script>
<p>It’s straightforward to mechanically verify that <script type="math/tex">\eta</script> is a natural transformation. It certainly fits our polymorphism intuition. Each component <script type="math/tex">\eta_X</script> wraps a list “outer structure” around its argument in the exact same way, without regard for the “inner information” about what the argument is or what set it’s from.</p>
<p>We turn to the counit, <script type="math/tex">\varepsilon : FG \to 1_{\mathbf{Mon}}</script>. Given a monoid <script type="math/tex">Y</script>, a component <script type="math/tex">\varepsilon_Y : FGY \to Y</script> is a monoid homomorphism from the list monoid on the underlying set of <script type="math/tex">Y</script> to <script type="math/tex">Y</script> itself. That is, <script type="math/tex">\varepsilon_Y</script> gets a list of elements of <script type="math/tex">Y</script> as input and has to produce a single element as output. A simple way to do this is to multiply everything in the list together to produce a single result (with the empty list mapping to the identity of <script type="math/tex">Y</script>), so we define</p>
<script type="math/tex; mode=display">\varepsilon_Y([y_1, y_2, \ldots, y_n]) = y_1 y_2 \cdots y_n.</script>
<p>That is, if we think of a list of elements of <script type="math/tex">Y</script> as an unevaluated monoid expression, then <script type="math/tex">\varepsilon_Y</script> <em>evaluates</em> the expression.</p>
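<p>To make "evaluating an unevaluated expression" concrete, here is a minimal Python sketch of the counit component for one monoid. The <code>Monoid</code> class and the <code>counit</code> function are names invented for this illustration; the monoid used is the integers under multiplication.</p>

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Monoid(Generic[T]):
    """A monoid given by its operation and identity (laws are not checked)."""
    op: Callable[[T, T], T]
    identity: T


def counit(m: Monoid[T]) -> Callable[[List[T]], T]:
    """Component of the counit at m: multiply out a list of elements of m."""
    def evaluate(ys: List[T]) -> T:
        # The empty list evaluates to the identity of the monoid.
        return reduce(m.op, ys, m.identity)
    return evaluate


int_mul = Monoid(op=lambda a, b: a * b, identity=1)
evaluate = counit(int_mul)

product_value = evaluate([17, 42, 38, 99])
empty_value = evaluate([])

# Homomorphism property: concatenating lists corresponds to multiplying results.
respects_concat = (
    evaluate([17, 42] + [38, 99]) == evaluate([17, 42]) * evaluate([38, 99])
)
```

<p>The last check is the monoid homomorphism condition on one example: list concatenation on the input side matches multiplication on the output side.</p>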
<p>It’s straightforward to mechanically verify that <script type="math/tex">\varepsilon</script> is a natural transformation. That said, it doesn’t clearly fit our polymorphism intuition because we use multiplication, which feels like using “inner information”. However, as we’re about to see, this feeling is wrong!</p>
<p>In <script type="math/tex">\mathbf{Set}</script>, given multiple arbitrary elements of an arbitrary set, there’s no way for the multiple elements to interact. Natural transformations can move elements around, as we saw with our earlier example of list reversal, but there’s no way to use the given elements to get a new element of the set. If this seems restrictive, it’s because it is. The morphisms in <script type="math/tex">\mathbf{Set}</script> are arbitrary functions, so the inner information that morphisms can modify is basically as unrestricted as possible. This flexibility when modifying inner information is what puts such strong restrictions on modifying outer structure. Given functors <script type="math/tex">F, G : \mathbf{Set} \to \mathbf{Set}</script>, if a family of functions <script type="math/tex">\alpha_X : FX \to GX</script> does anything too fancy, we can find some function <script type="math/tex">f : X \to Y</script> such that <script type="math/tex">Gf \circ \alpha_X \neq \alpha_Y \circ Ff</script> because there are so many things arbitrary functions can do.</p>
<p>The story is different for <script type="math/tex">\mathbf{Mon}</script> because its morphisms are more restricted than those in <script type="math/tex">\mathbf{Set}</script>: they preserve multiplication and identity elements. Furthermore, given multiple arbitrary elements of an arbitrary monoid, there are two ways we can get new elements that weren’t initially given: multiplication and getting the identity. Together, these facts mean both multiplication and using the identity are fair game when modifying outer structure. This is intuitively why <script type="math/tex">\varepsilon</script> is a natural transformation: it uses only identity (when given the empty list) and multiplication (when given a list with more than one element), both of which are outer structure for the purposes of monoid homomorphisms.</p>
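<p>Here is a hedged Python spot-check of this intuition. The string-length function is a monoid homomorphism from strings under concatenation to integers under addition, so evaluation commutes with it; a function that is not a homomorphism (the made-up <code>length_plus_one</code>) breaks the square. The helper names are assumptions for this sketch, and a single example is of course not a proof.</p>

```python
from functools import reduce


def evaluate(op, identity, ys):
    """Multiply out a list of monoid elements using op and identity."""
    return reduce(op, ys, identity)


def concat(a, b):
    return a + b


def add(a, b):
    return a + b


def length_plus_one(s):
    """Not a homomorphism: it sends the empty string to 1, not to 0."""
    return len(s) + 1


words = ["ab", "cde", ""]

# len preserves the operation and identity, so both paths agree.
evaluate_then_len = len(evaluate(concat, "", words))
len_then_evaluate = evaluate(add, 0, [len(w) for w in words])
natural_for_len = evaluate_then_len == len_then_evaluate

# length_plus_one is an arbitrary function, and the two paths disagree.
evaluate_then_g = length_plus_one(evaluate(concat, "", words))
g_then_evaluate = evaluate(add, 0, [length_plus_one(w) for w in words])
natural_for_g = evaluate_then_g == g_then_evaluate
```

<p>In this sketch the square commutes for <code>len</code> and fails for <code>length_plus_one</code>, which matches the idea that evaluation only needs to be compatible with maps that preserve multiplication and identity.</p>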
<p>Showing that <script type="math/tex">\eta</script> and <script type="math/tex">\varepsilon</script> actually satisfy the triangle identities is an unsurprising exercise that can be left for another day.</p>
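<p>Short of a proof, both triangle identities can at least be spot-checked in Python. In this sketch (helper names are my own), wrapping each element of a list in a singleton and then concatenating should give back the original list, and wrapping a monoid element in a singleton and then evaluating should give back the element.</p>

```python
from functools import reduce


def unit(x):
    """Unit component: wrap an element in a singleton list."""
    return [x]


def evaluate(op, identity, ys):
    """Counit component: multiply out a list using op and identity."""
    return reduce(op, ys, identity)


# First identity, on the free monoid over {"x", "y", "z"}:
# map the unit over a list, then evaluate in the list monoid (concatenation).
xs = ["x", "y", "y"]
wrapped = [unit(x) for x in xs]
flattened = evaluate(lambda a, b: a + b, [], wrapped)
first_identity_ok = flattened == xs

# Second identity, on the integers under multiplication:
# wrap an element in a singleton, then multiply out the singleton.
y = 42
second_identity_ok = evaluate(lambda a, b: a * b, 1, unit(y)) == y
```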
<h2 id="more-than-monoids">More than Monoids</h2>
<p>Hmmm, this post is pretty long already. Here are the two further examples I was going to talk about before my arms fell off.</p>
<ul>
<li>The free-forgetful counit for rings is polynomial evaluation. The reasoning is pretty similar to what we’ve seen for monoids, except we use <script type="math/tex">\mathbf{Ring}</script> instead of <script type="math/tex">\mathbf{Mon}</script>. This means the outer structure that natural transformations can use includes addition, subtraction, and additive identity as well as the multiplication and multiplicative identity we had for monoids.</li>
<li>As an example that does not involve a free-forgetful adjunction: the product-exponential counit is function evaluation.</li>
</ul>
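<p>For the product-exponential case, a minimal Python sketch looks like this. It assumes we model the exponential object as Python functions and the product as tuples; <code>counit</code> and <code>unit</code> are names chosen for this illustration.</p>

```python
def counit(pair):
    """Counit component: take (function, argument) and apply the function."""
    f, x = pair
    return f(x)


def unit(a):
    """Unit component: send a to the function that pairs a with its argument."""
    return lambda x: (a, x)


# Function evaluation as the counit.
evaluated = counit((lambda n: n + 3, 4))

# One triangle identity, spot-checked: pairing and then evaluating is the identity.
pair = ("a", 7)
round_trip = counit((unit(pair[0]), pair[1]))
```

<p>Here <code>evaluated</code> is just the result of applying the function to its argument, and <code>round_trip</code> returns the pair we started with.</p>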
<p>Hopefully this helped in understanding what natural transformations, or maybe even adjunctions, are at a level that’s intuitive but not just hand-waving. If you’re curious for more material on relating adjunctions to computation, I’m pretty sure “something something free algebra” is a relevant next step.</p>
Tue, 02 May 2017 00:00:00 +0000
http://ziv.codes/2017/05/02/adjoint-functors-and-computation.html
Knowing that Everyone Knows
<p>We consider a classic “paradox” where a simple inductive proof seems to clash with intuition. Though the proof makes clear that the naive intuition is wrong, it’s hard to pinpoint exactly where the intuition’s logical error is. After discussing the paradox at some length with my family, we came up with an angle of attack that gives an intuitive framework that both matches the math and makes the problem with the naive intuition clearer.</p>
<p>The situation is as follows. Dragons, as you probably already know, are a perfectly honest and rational species with color vision and either red or blue eyes. One hundred red-eyed dragons are on an island, sworn to a two-part pact:</p>
<ul>
<li>they will not communicate with each other, look at reflections, or otherwise directly find out what color eyes they have, and</li>
<li>if any dragon can logically deduce some day that they have red eyes, then that dragon will leave the island the following night.</li>
</ul>
<p>The dragons live for years on the island, each of them seeing ninety-nine red-eyed dragons but none of them able to logically deduce that they too have red eyes. One day, a perfectly honest visitor comes to the island, announces that at least one of the dragons has red eyes, and leaves.</p>
<p>If you haven’t heard this before, try to figure out before continuing: what happens?</p>
<hr />
<p><em>On the one hundredth night after being told that at least one of them has red eyes, all the dragons leave the island!</em></p>
<p>Here’s the argument.</p>
<ul>
<li>If there were exactly one dragon <script type="math/tex">X</script> with red eyes, they would have seen only blue eyes and deduced that they must be the one with red eyes, so <script type="math/tex">X</script> would leave on the first night following the announcement.</li>
<li>If there were exactly two dragons <script type="math/tex">X</script> and <script type="math/tex">Y</script> with red eyes, they would both stay the first night. The following day, each would see that the other hadn’t already left. <script type="math/tex">X</script> knows by the previous bullet point that if <script type="math/tex">Y</script> were the only dragon with red eyes, then <script type="math/tex">Y</script> would have left on the first night. This didn’t happen, so <script type="math/tex">X</script> deduces that they must also have red eyes. Symmetrically, so does <script type="math/tex">Y</script>, and both leave on the second night.</li>
<li>More generally, if exactly <script type="math/tex">k</script> dragons have red eyes, then after <script type="math/tex">k-1</script> nights of no dragons leaving, each of them realizes that, if the other <script type="math/tex">k-1</script> red-eyed dragons were the only dragons with red eyes, they would have left on night <script type="math/tex">k-1</script>. This didn’t happen, so they deduce that they must also have red eyes, and all <script type="math/tex">k</script> red-eyed dragons leave on night <script type="math/tex">k</script>.</li>
</ul>
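<p>The argument can be checked by brute force for a handful of dragons. The sketch below is one possible model, not the only one: a "world" is a tuple of booleans (True for red eyes), <code>possible</code> is the set of worlds that are still commonly considered possible, and each quiet night rules out every world in which someone would have left. All function names are invented for this example, and the code stops at the first night on which anyone leaves.</p>

```python
from itertools import product


def knows_red(i, world, possible):
    """Dragon i knows they are red if they are red in every possible world
    that matches what they see (everyone's eyes except their own)."""
    consistent = [
        w for w in possible
        if all(w[j] == world[j] for j in range(len(world)) if j != i)
    ]
    return all(w[i] for w in consistent)


def leavers(world, possible):
    """Red-eyed dragons who would leave tonight if `world` were the real one."""
    return frozenset(
        i for i, red in enumerate(world)
        if red and knows_red(i, world, possible)
    )


def first_departure(actual, announced, max_nights=20):
    """Return (night, dragons) for the first departure, or None if nobody
    leaves within max_nights. If announced, actual must contain a red dragon."""
    possible = set(product([False, True], repeat=len(actual)))
    if announced:
        # The public announcement removes the all-blue world for everyone.
        possible = {w for w in possible if any(w)}
    for night in range(1, max_nights + 1):
        gone = leavers(actual, possible)
        if gone:
            return night, sorted(gone)
        # Nobody left, so worlds in which somebody would have left are ruled out.
        previous = possible
        possible = {w for w in previous if not leavers(w, previous)}
    return None


with_announcement = first_departure((True, True, True, True), announced=True)
without_announcement = first_departure((True, True, True, True), announced=False)
```

<p>With four red-eyed dragons, the model has them all leave on the fourth night after the announcement, and nobody ever leaves without it, even though every dragon already knew the announced fact.</p>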
<p>This is a pretty simple inductive argument, but there’s an apparent paradox: the announcement made by the visitor was something all of the dragons already knew! What difference does it make? The typical (and entirely correct) answer is that without the announcement, the first bullet point doesn’t hold. That bullet point is the crucial base case of the inductive argument each dragon uses to deduce they have red eyes. But even though I know how induction works, I find it very counterintuitive that this should matter, because every dragon sees at least two other red-eyed dragons and therefore knows they aren’t in the base case!</p>
<p>The rough reason that the base case matters, even though all the dragons know they aren’t in it, is that we have to not just consider what each dragon knows, but also what each dragon knows about what each other dragon knows… and what each dragon knows about what each other dragon knows about what each other dragon knows, and so on. I was able to figure things out for up to three red-eyed dragons, but after that there were too many cases to keep track of.</p>
<p>Following a common mathematical theme, to give ourselves better intuition about a complicated situation, we’re going to define a new concept and build intuition about that new concept instead of about the situation directly. Let us call a dragon <em><script type="math/tex">k</script>-aware</em> for positive integer <script type="math/tex">k</script> under the following conditions.</p>
<ul>
<li>A dragon is <script type="math/tex">1</script>-aware when they know at least one dragon has red eyes.</li>
<li>For <script type="math/tex">k \geq 2</script>, a dragon is <script type="math/tex">k</script>-aware when they know every dragon is <script type="math/tex">(k-1)</script>-aware.</li>
</ul>
<p>For example, if only one dragon <script type="math/tex">X</script> has red eyes, then every other dragon is <script type="math/tex">1</script>-aware. If only two dragons <script type="math/tex">X</script> and <script type="math/tex">Y</script> have red eyes, then they are both <script type="math/tex">1</script>-aware and every other dragon is <script type="math/tex">2</script>-aware: not only do the other dragons know that <script type="math/tex">X</script> and <script type="math/tex">Y</script> have red eyes, but they know that each of <script type="math/tex">X</script> and <script type="math/tex">Y</script> can see the other, so they know that every dragon can see a red-eyed dragon. We can generalize this.</p>
<h4 id="theorem"><strong>Theorem.</strong></h4>
<p>Before the visitor’s announcement, if a dragon can see at least <script type="math/tex">k</script> red-eyed dragons, they are <script type="math/tex">k</script>-aware, and if they can see at most <script type="math/tex">k</script> red-eyed dragons, they are not <script type="math/tex">(k+1)</script>-aware.</p>
<h4 id="proof"><em>Proof.</em></h4>
<p>We prove each statement separately by induction.</p>
<ul>
<li>If a dragon can see another red-eyed dragon, then they are <script type="math/tex">1</script>-aware.</li>
<li>If a dragon can see at least <script type="math/tex">k \geq 2</script> red-eyed dragons, then they know that every other dragon can see at least <script type="math/tex">k-1</script> red-eyed dragons. By the inductive hypothesis, they know every other dragon is <script type="math/tex">(k-1)</script>-aware, so they are <script type="math/tex">k</script>-aware.</li>
<li>If a dragon sees no red-eyed dragons, then they are not <script type="math/tex">1</script>-aware.</li>
<li>If a dragon <script type="math/tex">X</script> can see at most <script type="math/tex">k \geq 1</script> red-eyed dragons, then because <script type="math/tex">X</script> must consider that they might have blue eyes, it is possible that each of those red-eyed dragons can see just <script type="math/tex">k-1</script> other red-eyed dragons. By the inductive hypothesis, <script type="math/tex">X</script> cannot know for sure that those red-eyed dragons are <script type="math/tex">k</script>-aware, so <script type="math/tex">X</script> is not <script type="math/tex">(k+1)</script>-aware. <script type="math/tex">\square</script></li>
</ul>
<p>This theorem means that before the visitor’s announcement, the dragons are all <script type="math/tex">99</script>-aware but not <script type="math/tex">100</script>-aware. After the visitor’s announcement, <em>the dragons become <script type="math/tex">k</script>-aware for every <script type="math/tex">k \geq 1</script></em> because of the public nature of the announcement: not only does everyone know that at least one dragon has red eyes, but everyone knows that everyone knows this, and everyone knows that everyone knows that everyone knows this, and so on. This makes all the difference.</p>
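<p>The first theorem can also be checked by brute force on small islands. This sketch assumes the usual possible-worlds reading of "knows": a dragon knows a fact when it holds in both worlds they cannot tell apart (their own eyes red or blue, everything else as seen). The function names are hypothetical, and the enumeration only scales to a few dragons.</p>

```python
from itertools import product


def alternatives(world, i):
    """The two worlds dragon i cannot tell apart: `world` with i's eyes
    set to blue (False) or red (True)."""
    return [world[:i] + (c,) + world[i + 1:] for c in (False, True)]


def awareness_level(actual, i, max_k=10):
    """Largest k (up to max_k) such that dragon i is k-aware in `actual`
    before any announcement; 0 if dragon i is not even 1-aware."""
    n = len(actual)
    worlds = set(product([False, True], repeat=n))
    # Start from the fact "at least one dragon has red eyes".
    fact = {w for w in worlds if any(w)}
    level = 0
    for k in range(1, max_k + 1):
        # knowers[j] holds the worlds in which dragon j knows `fact`.
        knowers = [
            {w for w in worlds if all(a in fact for a in alternatives(w, j))}
            for j in range(n)
        ]
        if actual not in knowers[i]:
            break
        level = k
        # Next fact to be known: "every dragon is k-aware".
        fact = set.intersection(*knowers)
    return level


all_red_level = awareness_level((True, True, True, True), 0)
red_dragon_level = awareness_level((True, True, False), 0)
blue_dragon_level = awareness_level((True, True, False), 2)
```

<p>In each case the computed level equals the number of red-eyed dragons that the chosen dragon can see, as the theorem predicts.</p>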
<h4 id="theorem-1"><strong>Theorem.</strong></h4>
<p>If there are exactly <script type="math/tex">k</script> red-eyed dragons and they simultaneously become <script type="math/tex">k</script>-aware, they will leave <script type="math/tex">k</script> nights later.</p>
<h4 id="proof-1"><em>Proof.</em></h4>
<p>We prove only that the dragons that are supposed to leave do so at the right time, given that the other dragons stay. It’s not too hard to add the details to rigorously show that all the other dragons do indeed stay.</p>
<ul>
<li>Suppose a dragon sees no red-eyed dragons but becomes <script type="math/tex">1</script>-aware. They immediately deduce they have red eyes because nobody else does, so they leave on the first possible night.</li>
<li>Suppose for some <script type="math/tex">k \geq 2</script> that a dragon <script type="math/tex">X</script> is <script type="math/tex">k</script>-aware and sees exactly <script type="math/tex">k-1</script> red-eyed dragons. By <script type="math/tex">k</script>-awareness, <script type="math/tex">X</script> knows that those red-eyed dragons are all <script type="math/tex">(k-1)</script>-aware. <script type="math/tex">X</script> reasons that if they had blue eyes, then those red-eyed dragons would have each seen exactly <script type="math/tex">k-2</script> red-eyed dragons and, by the inductive hypothesis, would have left on night <script type="math/tex">k-1</script>. Therefore, if this doesn’t happen, <script type="math/tex">X</script> can deduce that they must have red eyes and will leave on the next night, which is night <script type="math/tex">k</script>. <script type="math/tex">\square</script></li>
</ul>
<p>The above proof is essentially the same as the initial argument, but the explicit definition and usage of <script type="math/tex">k</script>-awareness helped me (and, hopefully, you!) build better intuition for it.</p>
Sat, 09 Jan 2016 00:00:00 +0000
http://ziv.codes/2016/01/09/knowing-that-everyone-knows.html