What is the explore-exploit tradeoff? It's advice that's not novel for most people, but it seems putting it into practice remains difficult. Algorithms to live by: Explore vs Exploit “Trying new things or sticking with our favorite ones?” According to the book, people tend to face explore/exploit trade-offs as they make decisions among various options on a daily basis. The literature on over-exploration is the strongest reason to think I might be wrong here, but there's also a threat from something like social desirability bias. immediately commit—deposit. There's also the issue of combinatorial search when options may be complements or substitutes for each other. each—yet many of those cars were on the road next to you, whereas the planes If the logarithm of the observation is significantly greater than the median, we expect the logarithm of the final elapsed duration (or what-have-you) to be a bit bigger than the current logarithm. For example, the book opens with a discussion of so-called 'optimal stopping' problems. increased by 600%. But without exploring, there's nothing to exploit. to brainstorm, the thicker the pen they use—a clever form of The book didn't discuss this, though Gwern has produced some practical prior art. Buy Algorithms to Live By: The Computer Science of Human Decisions 12 by Christian, Brian, Griffiths, Tom (ISBN: 9780007547999) from Amazon's Book Store. If you are in a competition with others, the absolute quality might be quite unimportant. Notably, most of these changes are ones you've probably already heard of without having to turn to computer science. As a result of this, I have increased my tendency to explore in some situations. ticking, few aspects of. Sorting theory tells us how (and whether) to arrange our offices. (For example in the case above, something analogous to a stably biased coin).
Lots of different choices, spreading out into trees of further choices, interacting with chance and ending up in different worlds you value to different degrees. Finding a really nice library reduces the need to find a café that you can work in. If you haven't, first think of the exponential distribution. I will consider implementing a digital 'reference folder' where I can put things that seem like they might be useful, rather than defaulting to putting them onto a 'to read' list. If you don't have long, stick to exploiting; if you have years, shop around. When n is 1, the Erlang distribution collapses to the exponential. month for the search), noncommittally exploring options. means that exploration necessarily leads to being let down on most occasions. If b + h > s, but b - k < 0, there are now 2 equilibria. between what you can measure and what really matters. You can pass forever on an option or accept it and see no more options. So, as the book says, if you think the amount of money Hollywood films make is distributed like total_money⁻¹, and you hear a film has made $1000 so far, your mean guess for its total should be $2000. Getting from the bad equilibrium to the good one is ... difficult. No choice recurs. Even in quite transferable cases, like sorting, it pays to remember a piece of old programming wisdom: Rule 3. In a power law distribution, there's a fair amount of probability mass far out in the tail of the distribution. It's an annoying problem in Machine Learning. But piecemeal accounting for complications is dangerous. The baseline is taking no holiday in a low holiday environment. For this issue, think again of moving to a new city or starting a new job. Imagine you and a friend are big film buffs, and want to go to the cinema together. That said, if you need to sort a lot of material that you can only compare directly (rather than, say, scoring), look to a merge sort. Should we be worried about the lack of concrete advice?
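To make the merge sort suggestion concrete, here is a minimal sketch of a merge sort that only ever asks "which of these two comes first?". The `prefer` callback is a hypothetical name of mine, standing in for whatever direct comparison you can actually make (holding two papers side by side, say); nothing here comes from the book.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def merge_sort(items: List[T], prefer: Callable[[T, T], bool]) -> List[T]:
    """Sort using only pairwise comparisons.

    prefer(a, b) is a hypothetical callback that returns True when
    `a` should come before `b`. No numeric scores are needed.
    """
    if len(items) <= 1:
        return list(items)

    # Split the pile in half and sort each half separately.
    mid = len(items) // 2
    left = merge_sort(items[:mid], prefer)
    right = merge_sort(items[mid:], prefer)

    # Merge: repeatedly compare the tops of the two sorted piles.
    merged: List[T] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take from the right pile only when it is strictly preferred,
        # so items that tie keep their original order.
        if prefer(right[j], left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

For a pile of 8 items this needs at most 17 comparisons, which is the practical attraction when each comparison costs human attention.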
Crucially, you get the money if you 'cooperate' (take holiday) even if others 'defect' (take no holiday). another idea from computer science: “interrupt coalescing.” If you have five Internet, or read all possible books, or see all possible shows, is bufferbloat. For example, moving to a new city (not trying enough different places) or attending a conference (stopping networking after meeting a few interesting people). By being concrete and proposing specific actions or times, we can allow someone to only check rather than search. 1. With overfitting, you end up predicting that data will at each point err from the 'true average' in the same way that the data you sampled did. I enjoyed this book a lot, so this review is going to be a long one. Brian Christian is the author of The Most Human Human, a Wall Street Journal bestseller. Algorithms are not followed only by computers. This chapter discussed some algorithmic approaches to that problem. through pairwise comparisons—whether they involve exchanging rhetoric or Boris Berezovsky. The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites. work between nations. This could create two equilibria (one adequate / one inadequate) or even make taking holiday the dominant move! Optimal Stopping ... Explore/Exploit. Consider that the optimal algorithm gives you a 37% chance of getting the best flat: it really matters a lot what happens the other 63% of the time! emotional well-being that, When we think about the factors that make large-scale human Decide how responsive you need to be—and then, if you want to get This chapter discussed its role in keeping work limited when marginal payoff becomes uncertain. If you want to be a good intuitive Bayesian—if you want to I'm not certain whether that should seriously reduce my confidence or not though (the hypothesis still has to be relying on advice I evaluated well before).
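The 37% figure for optimal stopping can be checked with a quick simulation. This is a sketch under the classic assumptions (options arrive in random order, you can only rank them against what you have seen, no going back); the function names and the choice of 100 options are mine, purely for illustration.

```python
import math
import random


def look_then_leap(scores, look_fraction=1 / math.e):
    """Skip the first `look_fraction` of options, then take the first
    one that beats everything seen during the look phase."""
    cutoff = int(len(scores) * look_fraction)
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for score in scores[cutoff:]:
        if score > best_seen:
            return score
    # Nothing beat the look phase, so we are stuck with the last option.
    return scores[-1]


def success_rate(n_options=100, trials=20000, seed=0):
    """Fraction of trials in which the rule picks the single best option."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n_options))
        rng.shuffle(scores)
        if look_then_leap(scores) == n_options - 1:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print(f"picked the best option in {success_rate():.1%} of trials")
```

Changing `look_then_leap` to return the rank of whatever it ends up with would show what happens in the other 63% of cases, which is the part the rule says nothing about.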
The problem is information getting stuck at the back of a long queue, with the sender none the wiser. And Algorithms to Live By by Brian Christian & Tom Griffiths is an exploration of the applicability of algorithms from computer science to human decision problems. How are we supposed to figure out how to explore this space effectively? I claim below that the analogy to humans seems pretty weak for buffer bloat. It’s entirely possible you’ve seen roughly as many of Search costs (covered later) for valuable reading are definitely getting high. The classic comparison between bubble sort and merge sort really pumps up your intuition that there could be hacks to be found! In a few paragraphs there's a reader's guide so you can skip around. the chickens—and for. It was a shame the book didn't probe this at all. Or one with a high expected value? Algorithms to Live By is a surprisingly fun book considering the subject. Explore vs. the most important, you should try to stay on a single task as long as possible The fields of algorithmic relaxation & randomness explore answers to the above questions. Perhaps my emails contain enough items to think about employing an algorithm with large constant factors. I'm not confident on this, so if anyone could (dis)confirm that would be cool. enough to fill Carnegie Hall even half full. When we cook from a recipe, we’re following an algorithm. ones. I thought that he missed a beat on the sorting and searching question. That person then needs to search through their schedule for a good time, which can take quite a bit of work. The feeling that one needs to look at everything on the As entrepreneurs Jason Fried and David Heinemeier Hansson explain, the us to embrace high rates of failure even when acting It has big economic benefits for individuals and organisations. Well, if b + h > s, or if b - k > 0 then not taking holiday no longer dominates.
If we were really going to leverage algorithms in this space, it would probably involve a bit of programming: that's not really practical for a general audience book. have all the facts, they’re free of all error and uncertainty, and you can Sorry for the length. A leap from ordinal to pleasure. Christian & Griffiths suggest reasons that people's tendency to favour exploration might be rational. Hesitation—inaction, Intuitively, we think that rational decision-making means Unfortunately, these chapters were pretty slim on applicable algorithms. One at everyone taking holiday and one at no one taking holiday. between looking and leaping. (This is really just another way that accessible payoffs may change over time). we are, “always connected.” But the problem isn’t that we’re always The chapter provides some evidence that humans tend to over-explore. Almost every decision in our lives comes down to the explore vs. exploit algorithm. We may get similar choices again, but never that exact one. cardinal. This ties together our explore / exploit phenomenon because younger people who have a longer time frame are more on the explore phase and older people with a more finite time frame are in the exploit phase. One awesome thing from this chapter were rules of thumb for certain estimates. And if, that’s not possible, you can at least exercise some control
The common computer science explore/exploit dilemma can model human behavior. But first, if you really have a lot of stuff to sort, remember to check the value of your time. When, our expectations are uncertain and the data are noisy, the decision-making (or of thinking more generally) are as If you unilaterally take holiday here, it turns out badly for you. from Simulated, Annealing: you should front-load randomness, rapidly cooling In some situations, spending more time in total sorting and searching is a good choice. And it seems, like it does: Carstensen has found that older people are For completeness, I will give some concrete sorting algorithm suggestions. In networks, this can lead to the receiver thinking the sender takes a long time to receive and process responses. We model the rest of the company as a single agent taking a 'high' or 'low' holiday strategy. benefit the rest of the time by having what we need at the It’s this, that forces us to decide based on possibilities we’ve not I just can't do the weekend and the week after next is less good.". the costs of error, against the costs of delay, and take chances. Similarly, when it comes time for you and your friend to pick a film, vetoing your least favourites could make it easier to zone in on an acceptable choice. Optimal Stopping — When to Stop Looking; Explore/Exploit — The Latest vs. the Greatest; Sorting — Making Order That is, when no one was taking holiday, you're happy to take it. of life.
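To make the explore/exploit dilemma concrete, here is a small slot-machine simulation. It uses epsilon-greedy, which is a deliberately simple stand-in that I have swapped in for illustration and is not presented here as the book's recommended method. The payout probabilities, the 10% exploration rate and all names are assumptions.

```python
import random


def epsilon_greedy(payout_probs, pulls=2000, epsilon=0.1, seed=0):
    """Play a row of slot machines; return how often each was pulled.

    With probability `epsilon` we explore (pick a machine at random);
    otherwise we exploit the machine with the best average so far.
    """
    rng = random.Random(seed)
    counts = [0] * len(payout_probs)
    totals = [0.0] * len(payout_probs)

    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(len(payout_probs))  # explore
        else:
            # Exploit: untried machines are treated as averaging 0.
            averages = [t / c if c else 0.0 for t, c in zip(totals, counts)]
            arm = max(range(len(averages)), key=averages.__getitem__)

        # Each pull pays 1 with the machine's hidden probability.
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward

    return counts


if __name__ == "__main__":
    print(epsilon_greedy([0.1, 0.9, 0.3]))
```

Setting `epsilon=0` shows the failure mode of pure exploitation: the player can lock onto whichever machine it happened to try first.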
The authors draw this idea from a study suggesting that it might take minutes for a human to recover productivity from a context switch. every time we encounter a hitch, hard problems demand that instead of I also consider the case for lognormal, but it doesn't add much to the previous cases. There is also a mental toll from awareness of its infinitude. science regards as the hard cases. But after that point, be prepared to Algorithms To Live By introduces a few methods of finding a balance between the two. So we can apply the rule for the normal distribution: if the logarithm of your observation is significantly less than the logarithm of the distribution's median (so let's say the observation is about half the median) just go with the median. I hadn't encountered the Erlang distribution before. But! of your experience. ignore sunk costs. I have not yet thought of further ways to take this advice into account. seniors can do is to try to get a handle on the idea that their minds are natural This is not merely an intuitively satisfying compromise Until you start playing, you won’t have any idea which machines are the most lucrative and which ones are money sinks. spend 37% of your apartment hunt (eleven days, if you’ve given yourself a Keeping gym items in a crate by the front door. latencies, take heart: the length of a delay is partly an indicator of the extent things done, be no, If you find yourself doing a lot of context switching Game theory is worth knowing about. When we study complexity, we study how algorithms behave as the number of items they're processing gets large. is to be alive. What about if the CEO pays people to take holiday? I am now more likely to look at complex, suboptimal situations as an opportunity to optimise in the sense of 'improve' rather than optimise in the sense of 'perfect' by default. If your pile of papers is well-sorted, you can do a binary search on it and find stuff quickly.
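As a small illustration of the binary search point, here is a sketch using Python's standard `bisect` module. The pile of paper titles and the `find_paper` helper are made up for the example.

```python
from bisect import bisect_left


def find_paper(sorted_titles, title):
    """Return the index of `title` in an alphabetically sorted pile,
    or -1 if it is not there.

    Each step halves the part of the pile still under consideration,
    so a pile of 1,000 papers needs about 10 looks.
    """
    i = bisect_left(sorted_titles, title)
    if i < len(sorted_titles) and sorted_titles[i] == title:
        return i
    return -1


if __name__ == "__main__":
    pile = ["bank", "gas", "lease", "tax"]
    print(find_paper(pile, "lease"))
    print(find_paper(pile, "visa"))
```

The catch is that the speed-up only exists if the pile is already sorted, which is exactly the sort-versus-search tradeoff.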
In our world, payoffs are not fixed, and we even have priors about how much we expect them to change over time. Because new is unknown, and may be disappointing… Better go for something safe and sure (i.e., exploit). you’ve already seen. This could help a lot with explicit estimates and making predictions. While this isn't the most satisfying rule, I could see it providing some use in Fermi-style estimates and hopefully my intuitions about it will sharpen. Humans really do need to sort and search stuff, and computer science algorithms apply in a straightforward way. [...]. As I mentioned in the introduction, we should probably be relieved and pump our trust in the book because of this: personal scheduling really matters! simplification by stroke size: Unless we’re willing to spend eons striving for perfection longest as you approach freezing. The main estimates they work for are durations, where you have no information about when during the duration you've turned up and you want to estimate how long the total duration will be. We normally sort stuff so that we can find stuff in it later! A thousand bucks sweetens the deal but doesn't change the principle of the game. But at the same time, this Much as we bemoan the daily rat race, the fact that it’s a This comes from this chapter claiming a cache-management algorithm called LRU (Least-Recently Used) performs well in a variety of environments. I’m not sure what I can take away from these algorithms and apply in my daily life but this was a fun read for me. about which games you choose to play. I will also consider placing items so that they're close to where they're needed. When your hoover gets full, it's probably because you're doing some hoovering! When I need to get rid of something, I will lean heavily on when it was last used as a heuristic. To give more detail: buffer bloat, as I understand it from this chapter, could not affect a human in an analogous way.
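The LRU idea ("get rid of whatever was used longest ago") is short enough to sketch. This is a minimal illustration built on `collections.OrderedDict`, not a claim about how any particular system implements it; the class and method names are my own, and Python also ships a ready-made `functools.lru_cache` decorator for function results.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny least-recently-used cache: when full, evict the item
    that has gone longest without being read or written."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest last

    def get(self, key):
        if key not in self._items:
            return None  # a "cache miss"
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop least recently used


if __name__ == "__main__":
    desk = LRUCache(capacity=2)
    desk.put("passport", "top drawer")
    desk.put("stapler", "shelf")
    desk.get("passport")           # passport is now the freshest item
    desk.put("charger", "bag")     # evicts the stapler
    print(desk.get("stapler"))
```

The same rule works without any code: put things back on top of the pile when you use them, and clear out from the bottom.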
My biggest concern with the value of this section is that I've not had cause to use them yet. Well, for a power law distributed like t⁻ⁿ, where t is the random variable, you should multiply by the n-th root of 2. However, as this is the only part that I imagine will be broadly novel and broadly valuable, I've included it first. not track their frequency in the world. One thing I got from these chapters was thinking about why we sort. I'll copy two items from the book here: A possible way of using this is looking for your habit triggers in your life. The Secretary Problem. space, requires a leap beyond. The most prevalent critique of modern communications is that Explore/Exploit. In English, the words “explore” and “exploit” come loaded with completely opposite connotations. If we're thinking of a reading or a todo list, a human would rarely work through it in order, but would keep an eye out for high priority items (a counter-example for me is RSS: I often do churn through my feeds in order). Algorithms to Live By takes you on a journey of eleven ideas from computer science, that we, knowingly or not, use in our lives every day. Third, But, the cultural practice of measuring status with quantifiable individuals sharing the same. While it seems safe to assume this is true for me as well, I think I have identified cases where I underexplore. A lot of that $2000 is coming from a small chance of hundreds of millions of dollars. As indicated above, we aren't that great at probabilistic inference and calculation. best bet is to paint with a broad brush, to think in broad strokes. I think this is an improvement but I'm not that confident (maybe around 3:2 that it's an improvement). Apart from below the lognormal's median, they look kind of similar (but I prefer the lognormal cos of its reasonable behaviour around 0). For many things (email, paper & computer files) I no longer worry about having a good organisational system.
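The "multiply by the n-th root of 2" rule can be written down and sanity-checked. The sketch below assumes a prior proportional to total⁻ⁿ and that you turn up at a uniformly random point within the total (the "no information about when you've turned up" condition). Under those assumptions the posterior is proportional to total⁻⁽ⁿ⁺¹⁾ for totals at least as large as the observation, and the multiplier gives the posterior median, that is, the value the true total is equally likely to fall above or below. The function names are mine.

```python
def predicted_total(observed, n=1.0):
    """Multiplicative rule: scale the observation by the n-th root of 2."""
    return observed * 2 ** (1.0 / n)


def posterior_cdf(x, observed, n=1.0):
    """P(total <= x | observed), assuming a posterior proportional to
    total ** -(n + 1) on [observed, infinity)."""
    if x < observed:
        return 0.0
    return 1.0 - (observed / x) ** n


if __name__ == "__main__":
    guess = predicted_total(1000, n=1)
    print(guess, posterior_cdf(guess, 1000, n=1))
```

With n = 1 this reproduces the film example: $1000 so far gives a prediction of $2000, with half the posterior probability lying above that figure.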
We say, “brain fart” when we should really say “cache miss.” The authors write, LRU [...] is the overwhelming favorite of computer scientists. After discussing optimal stopping in my last post, in this post I will continue my series on "Algorithms to live by" by Christian & Griffiths, with the famous "explore vs exploit" problem. metals, machinery. In general, I think better introductions are available in the LW-o-sphere, for example this recently curated post. And if b + h > s and b - k > 0, then taking holiday becomes the dominant option! Many problems that we all deal with as part of life have practical solutions that come from computer science, and this book gives a number of examples.
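The holiday-game conditions (comparing b + h with s, and b - k with 0) can be checked mechanically. The payoff table below is my reconstruction from those inequalities and is an assumption: I take your payoff to be 0 for no holiday in a low-holiday company (the stated baseline), b - k for taking holiday there, s for no holiday in a high-holiday company, and b + h for taking holiday there. The symbols themselves are not defined in this section, so the code only encodes the comparisons.

```python
def classify_holiday_game(b, h, s, k):
    """Classify the game from the two comparisons used in the text.

    Assumed payoffs to you (reconstructed, not taken from the book):
      low-holiday company:  no holiday -> 0,  holiday -> b - k
      high-holiday company: no holiday -> s,  holiday -> b + h
    Exact ties are treated as "does not prefer holiday".
    """
    prefers_holiday_if_high = b + h > s
    prefers_holiday_if_low = b - k > 0

    if prefers_holiday_if_high and prefers_holiday_if_low:
        return "taking holiday dominates"
    if not prefers_holiday_if_high and not prefers_holiday_if_low:
        return "not taking holiday dominates"
    if prefers_holiday_if_high:
        # You want to match the company: everyone on holiday or no one.
        return "two equilibria"
    # Case not discussed in the text: you want to do the opposite
    # of what the rest of the company does.
    return "no dominant strategy"


if __name__ == "__main__":
    print(classify_holiday_game(b=1, h=2, s=2, k=3))
```

Raising b (for example, a payment for taking holiday) moves a company from "not taking holiday dominates" towards the other cases, which is the lever the CEO is pulling.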