The Unbundling of AWS

(I’m using AWS as an example, but this post applies just as much to other cloud providers)

Over the years AWS has grown to dozens of different services, providing virtual machines, databases, monitoring, and deployment tools on demand. Today, it would be considered foolish to manage your own Postgres/MySQL server when you can set up an RDS instance with excellent scalability and availability characteristics in a matter of minutes.

But that’s changing.

Container infrastructure is starting to provide similar abstractions and benefits: One-click deployments, load balancing, auto-scaling, rolling deploys, recovery from failures, data migration, resource usage monitoring, and more. Increasingly, I see companies moving away from cloud provider services in favor of containers and container orchestration platforms. Core services like EC2 and S3 aren’t easily replaced, but others are, and there are good reasons to do so:

  • Costs. AWS prides itself on pay-per-use pricing, but many services aren’t fulfilling that promise. For example, an Elastic Load Balancer costs a fixed $25/month, even if it receives only a few requests. A database that runs only a few queries per day (like this blog) also comes with a fixed price tag. Containers are almost free – instead of paying with dollars you pay with the CPU/network resources actually used by the service. Often, that turns out to be cheaper.
  • Features. Many AWS services are based either directly or indirectly on open source projects. But these open source projects are typically more feature-complete than their AWS counterparts. An Elastic Load Balancer has a limited set of features compared to a HAProxy instance; so does AWS Kinesis compared to Apache Kafka. Even services that are running open source software under the hood (such as EMR with Hadoop/Spark) don’t typically support the latest versions.
  • No cloud lock-in: Most container orchestration solutions work across clouds out of the box. This means you can host parts of your infrastructure on AWS, Google Cloud, Azure, DigitalOcean, your private cloud, or whatever else is the best fit.
  • Full control: When things don’t work as expected you’re relying on AWS support for help. That can be convenient, but usually it’s faster to debug a problem yourself. That’s only possible with access to the internals of a service, something you don’t have with hosted solutions. What if there’s a simple feature that’d take a small configuration change or a few lines of code to implement? AWS support can’t do that for you. With containers and open source software you can.

Just like Craigslist has been unbundled by purpose-built websites, it seems natural to me (and not only to me) that cloud providers like AWS will be unbundled by purpose-built open source software. In a way that’s ironic because the value proposition of AWS is the exact opposite – bundling open source software in a centralized place and giving it a consistent look and feel. Up until now we didn’t have the right technology to enable unbundling of PaaS solutions. It’s only recently that container infrastructure and orchestration have become mature enough to make this possible.


A Brief Guide to the Docker Ecosystem

Few other technologies have penetrated technology companies as rapidly as Docker (or more generally, containers). It seems like the majority of developers and companies are using containers in one way or another. Many use containers to simplify the setup of local development environments, but more and more companies are starting to completely re-architect their infrastructure and deployment processes around containers. In this post I’m hoping to provide a brief overview of the current state of the ecosystem.

Engines / Runtimes

Container Engines are the core piece of Container technology. The engine builds and runs containers, typically based on some declarative description, such as a Dockerfile.  When people talk about Docker, they typically refer to the Docker Engine, and not necessarily the rest of the ecosystem.

  • Docker Engine is the current industry standard and by far the most popular engine.
  • rkt is an open-source initiative to take on the Docker Engine, led by the CoreOS team.

Cloud Services with built-in Docker support

Cloud providers have been quick to offer solutions for running containers on top of their platforms. Some built solutions in-house, and others rely on open source software. Of course, one could manually install Docker and run containers on any server, but most cloud providers go a step further and provide user interfaces that make managing containers easier.

  • Amazon EC2 Container Service allows running containerized applications on existing EC2 instances. ECS itself is free; you only pay for the EC2 usage.
  • Google Container Engine is built on top of Kubernetes, an open-source container orchestration project started by Google.
  • Azure has announced support for Docker containers on top of Mesos.
  • Stackdock provides hosting for Docker containers.
  • Tutum provides hosting for Docker containers.
  • GiantSwarm is a cloud platform for defining and deploying microservice architectures running inside containers.
  • Joyent Triton provides hosting and monitoring for Docker containers.
  • Jelastic Docker provides cloud hosted orchestration for container deployments.

Container Orchestration

Container Orchestration is one of the most contended areas right now. Working with a few containers is easy, but scheduling, managing and monitoring containers at scale is extremely challenging. Container Orchestration software handles a variety of tasks, such as finding the best place/server to run a container, handling failures, sharing storage volumes, and creating load balancers and overlay networks to allow communication between containers.
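To make the "finding the best place/server to run a container" task a bit more concrete, here is a minimal sketch of a placement heuristic. It is a toy under stated assumptions, not how any of the tools below actually schedule: the function name `schedule`, the memory-only resource model, and the "largest container first, emptiest server first" rule are all made up for illustration.

```python
def schedule(containers: dict[str, int], servers: dict[str, int]) -> dict[str, str]:
    """Assign each container to a server.

    containers: container name -> required memory in MB (hypothetical model)
    servers:    server name    -> free memory in MB
    Heuristic (an assumption): place the largest containers first, each on
    the server that currently has the most free memory.
    """
    free = dict(servers)  # copy so the caller's dict is not modified
    placement: dict[str, str] = {}
    for name, need in sorted(containers.items(), key=lambda kv: -kv[1]):
        host = max(free, key=free.get)  # server with the most free memory
        if free[host] < need:
            raise RuntimeError(f"no server can fit container {name!r}")
        free[host] -= need
        placement[name] = host
    return placement


placement = schedule(
    {"web": 512, "db": 1024, "cache": 256},
    {"a": 1024, "b": 2048},
)
print(placement)
```

Real orchestrators also weigh CPU, disk, affinity rules, and failures, which is a large part of why the problem gets hard at scale.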

  • Kubernetes is an open source effort started by Google. Kubernetes is based on Google’s internal container infrastructure, and in terms of features it is the most advanced orchestration platform currently available.
  • Docker Swarm allows scheduling containers on a cluster of Docker hosts. It is tightly integrated with the rest of the Docker ecosystem.
  • Rancher manages application stacks (linked containers) on a cluster of machines. Rancher features an intuitive user interface, excellent documentation, and runs inside a container itself.
  • Mesosphere is a general purpose datacenter operating system. It was not specifically built for Docker, but it includes primitives that make it easy to run containers, or other orchestration systems like Kubernetes, next to traditional services like Hadoop.
  • CoreOS fleet is part of the CoreOS operating system and manages the scheduling of arbitrary commands (such as running Docker/rkt containers) within a CoreOS cluster.
  • Nomad is a general-purpose application scheduler with built-in support for Docker.
  • Centurion is a deployment tool developed and used internally by New Relic.
  • Flocker assists with data/volume migration among containers running on different hosts.
  • Weave Run provides service discovery, routing, load balancing, and address management for microservice architectures.

Operating Systems

You can run containers on any operating system, but companies are increasingly moving towards containerizing their whole infrastructure. As such, it makes sense to run a minimal operating system optimized for Docker and related services.

  • CoreOS is designed for automatic updates and focuses on running containers across clusters of machines. It ships with fleet, a scheduler inspired by systemd, but also supports other orchestration systems.
  • Project Atomic is a lightweight operating system that runs Docker, Kubernetes, rpm and systemd.
  • Rancher OS is a 20MB Linux distribution that runs the entire operating system within containers. It differentiates between “system containers” and “user containers”, each running in a separate Docker daemon.
  • Project Photon is an open source effort from VMware.

Container Image Registries

Image Registries are the “GitHub for container images” and allow you to share container images with your team, or the world.

  • Docker Registry is the most popular open source registry. You can run it on your own infrastructure or use Docker Hub.
  • Docker Hub provides an intuitive UI, automated builds, private repositories, and a large number of official images maintained directly by the authors of the software.
  • is a container registry developed by the CoreOS team.
  • CoreOS Enterprise Registry focuses on providing fine-grained permission and audit trails.


Monitoring and Logging

Containers write log files that can be ingested into any existing log collection tool. Container monitoring software typically focuses on resource usage (CPU, memory) broken down by container.

  • cAdvisor is an open source project by Google. It analyzes resource usage and performance characteristics of running containers and optionally uses InfluxDB as a storage backend for analytics.
  • Datadog Docker is an agent that collects statistics of running Docker containers and sends them to Datadog for further analysis.
  • New Relic Docker sends container statistics to New Relic’s cloud service.
  • Sysdig can also monitor container resource usage.
  • Weave Scope automatically generates a map of your containers, helping you understand, monitor, and control your applications.
  • AppFormix provides real-time infrastructure monitoring that works with Docker containers.


A Few Tips To Make Distributed Teams Work Well

An increasing number of startups seem to be building distributed teams, particularly in engineering. But making a distributed team work well isn’t easy. From what I’ve seen, most distributed teams are less productive than their centralized counterparts. Here are a few things from my own experience that I think are crucial to the effectiveness of such teams. If I had to summarize all of the below I would do it as: Remove synchronous communication, and when in doubt, over-communicate.

Hire people who thrive in remote environments

An otherwise excellent hire may underperform in a distributed environment. This has nothing to do with skill; it’s mostly a result of past experience and personality. Some of us enjoy the social atmosphere and tighter supervision of an office environment. Others thrive with little or no supervision, away from all distractions. I found that the latter type of people tend to have experience with starting or managing their own projects, e.g. an open source project, side project, or startup. Explain the company’s goals and they will figure out the little details themselves, without the need for a lot of meetings. Look for these people.

Be proactive, not reactive, in giving access to resources

Communication in a remote environment is asynchronous, so your goal should be to minimize the need for synchronous communication. For example, a developer may be blocked by not having access to a code repository or SaaS service the company uses. In an office environment this isn’t a big deal: she can just walk over to a manager or teammate and get access. In a remote environment she may have to wait several hours to get it. If this happens frequently it adds up to a lot of lost time. Be proactive in giving everyone the resources they need to make their own decisions.

Don’t be a hybrid

It’s not uncommon to simultaneously run a distributed team and a “core” team in a centralized location. This is appealing, but it’s also difficult to get right. It only works if everyone follows the same procedures. It’s tempting for members of the core team to make quick decisions through in-person meetings and neglect discussing them with people who aren’t there. If everyone talks in Slack then the core team should do the same. Not doing so will result in an imbalance of information, or worse, frustration of people who are not “in the loop”.

Focus on process, not outcome

As long as a centralized team is small you can run it without a lot of formal procedures. Not so with distributed teams. An example for engineering teams would be creating formal procedures for Pull Requests. What should be in the title/description and acceptance criteria? Who should review/merge, and when? Defining all of this formally seems like overkill, but in a remote environment it ensures that everyone is on the same page and knows what to expect. More generally, your goal shouldn’t be to ship the next version of your product as quickly as possible (though that’s a nice side effect). It should be to build a scalable process, get out of the way, and make your team productive without requiring a lot of supervision. Formal processes help do that.

Have the right technology stack in place

There’s lots of software that makes working in a distributed team easier. It would be foolish not to use it. Use Slack for team communication. Blossom or Pivotal Tracker for task tracking. Screenhero for screen sharing. Google Hangouts for meetings. And so on. Of course the above are just examples; use any software that meets your needs. Again, make sure that you have processes in place that define how to use your software stack (e.g. how to manage and review tasks).

Get everyone on the same page

I found that many inefficiencies in distributed teams stem from the fact that team members aren’t on the same page about company priorities, or that they don’t know what everyone else is working on. This is absolutely crucial. Processes that I’ve found effective include using company and team OKRs, regular standups (either in Slack or via video chat), and doing weekly retrospectives of what has been accomplished, what didn’t go so well, and what the goals for next week are.

Consider Transparency

Buffer is probably the best example of a company that’s incredibly transparent. Transparency works well with distributed teams because it removes the need for communication. If something is open to everyone, employees don’t need to ask around for access. You don’t need to become as transparent as Buffer is, but it’s worth considering what you could be transparent about both publicly and internally.


Why Startups Really Succeed: Strings of Luck

Luck plays a huge role in everything we do, and where we’re born is perhaps the biggest lottery of our lives. But acknowledging luck makes us feel uncomfortable. Our brain seeks causal stories, and tries to create them from whatever information is currently available. This helps us maintain the illusion that the world is an orderly place we have control over.

Startups are no exception. The stories of those that succeeded, and the post-mortems of those that failed, are always causal stories. In the case of success they are typically stories about visionary founders in a fast-growing market pursuing an idea at just the right time. That’s exactly the kind of story that appeals to our brains (and the press). There’s no mention of luck. Surely, if we could turn back time, and those founders were to start the business again under the same circumstances, it would also succeed, right?

That’s an illusion.

We tend to overestimate the influence that founders, or any element we can control, have on the outcome. I am not discrediting the hard work of startup founders. The intelligence, resilience, resourcefulness, and optimism of the founders certainly play a big role in the success of a startup. But I believe that these are a necessary, not a sufficient, condition. Let’s take Airbnb as an example. Paul Graham writes:

Airbnb now seems like an unstoppable juggernaut, but early on it was so fragile that about 30 days of going out and engaging in person with users made the difference between success and failure.

There are an infinite number of events, from family problems to legal issues, that did not happen but would have resulted in Airbnb going out of business at some time during its inception. A chance encounter with someone offering an attractive job to the founders would probably have been enough (the founders started renting out mattresses because they couldn’t afford rent in SF). It was lucky that none of this happened.

The combined absence of all events that would’ve resulted in the founders shutting down Airbnb was very unlikely. Similarly, there were a few crucial (lucky) events that had a large impact on Airbnb. What if the initial 2 customers had never seen the website? What if nobody ever recommended that the founders take prettier pictures of the listed places? You can come up with similar examples for most other billion-dollar startups. Google almost sold their company for $750k in 1999 and just barely escaped death. All companies are fragile in their early days, and it’s usually a string of random events that leads the founders to continue instead of shutting down or prematurely selling the business.

In Thinking, Fast and Slow, Nobel laureate Daniel Kahneman puts it well:

 Narrative fallacies arise inevitably from our continuous attempt to make sense of the world. The explanatory stories that people find compelling are simple; are concrete rather than abstract; assign a larger role to talent, stupidity, and intentions than to luck; and focus on a few striking events that happened rather than on the countless events that failed to happen.

This also gives us the top reason startups fail: Because it’s the default action. In the absence of continuous random events that keep a startup alive there are just too many things that can go wrong, and too many seemingly better opportunities the founders could choose to pursue. Statistically, it is more likely that something leads to the (voluntary or involuntary) shutdown of a startup than it is that everything goes just according to plan. That’s the reason VCs don’t focus on “Will this startup succeed?”, but on “If this startup succeeds, how big could it be?” Some have recognized that there are just too many variables to consider, and that it’s impossible to predict the future of a startup.

So many successful startups come out of Silicon Valley because it’s a numbers game. SV has the highest concentration of startups anywhere in the world (maybe even more than the rest of the world combined). People move to SV to start risky companies. Statistically it should come as no surprise that most successes start here. To avoid sounding like a hopeless pessimist I want to clarify that I am not saying that all the other factors (culture, availability of talent, etc.) are irrelevant. It’s just that we tend to overvalue them because they make for good stories.

Optimism, or blissful ignorance, could be called the secret sauce of startup founders. Being relentlessly optimistic leads the founder to make the (irrational) decision of continuing with their startup when they could be pursuing an opportunity with a higher expected value. And given the large number of samples, this works out just fine in Silicon Valley.


Reimagining Language Learning with NLP and Reinforcement Learning

The way we learn natural languages hasn’t really changed for decades. We now have beautiful apps like Duolingo and Spaced Repetition software like Anki, but I’m talking about our fundamental approach. We still follow pre-defined curricula, and do essentially random exercises. Learning isn’t personalized, and learning isn’t driven by data. And I think there’s a big opportunity to change that. With the unlimited supply of natural language data online, and with the advances in Natural Language Processing (NLP) techniques, shouldn’t we be able to do something smarter? Here’s what I’m thinking.

The foundation: Modeling Knowledge

At the heart of making learning more efficient is the ability to model a learner’s knowledge. Once you understand what a learner knows you can present her with material that’s most beneficial. Modeling knowledge in general is a difficult problem. How would you quantify your knowledge about ancient Rome, English literature or mechanics? Knowledge in most disciplines is based on connecting disparate facts and then reasoning about them in one way or another. Language learning is different, and it’s unique in that it’s quite simple. Comprehending a sentence doesn’t require higher level reasoning, and we can actually measure a learner’s knowledge by presenting her with the right challenges, such as sentence comprehension or completion.

We also need to model language itself. In order to present a learner with a sentence she can comprehend we must know which knowledge (vocabulary, grammar, etc.) that sentence depends on. In a way that’s what courses do “manually”. They present you with a predefined sequence of material, each piece building on the previous ones. I believe we can do this automatically. NLP techniques are sufficiently sophisticated that we should be able to figure out the knowledge dependencies of a text. And that would open up a whole new world of possibilities.

A mathematical formulation

To make things concrete, let’s actually try to define the above mathematically. What follows is inevitably an oversimplification of language learning, but I think it’s a useful enough model to do something interesting with. Let’s assume a learner’s language knowledge can be quantified by how well she knows vocabulary and grammar items. I’m not saying this is the right, or the only definition, but it’s something that has worked quite well in practice. It’s what most courses and textbooks do.

A learner’s knowledge is defined by a state s, which captures our belief about what the learner knows. For example, s could be a sparse vector of real numbers where each element (e.g. 0.73) is a score quantifying how well a learner knows a word or grammar rule. The score could be calculated based on the learner’s performance on reading/listening comprehension and writing/speaking production tasks. Note that s models our belief about the learner’s knowledge, not necessarily the actual state of the world. Thus it would probably be a good idea to also include uncertainty (in the form of confidence bounds or distributions) in the representation above. But it’s easier to think about s as just a vector of scores.
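As a rough illustration of the state s, here is a minimal sketch of a sparse score vector in Python. The class name `KnowledgeState`, the item keys such as "word:haus", and the convention that unseen items score 0.0 are assumptions made for this example, and uncertainty is left out for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeState:
    """Our belief about what a learner knows: item -> score in [0, 1].

    The vector is sparse: items we have never observed are simply absent
    and treated as score 0.0 (an assumption of this sketch).
    """

    scores: dict[str, float] = field(default_factory=dict)

    def score(self, item: str) -> float:
        return self.scores.get(item, 0.0)


# Hypothetical learner: knows one word fairly well, one grammar rule less so.
s = KnowledgeState({"word:haus": 0.73, "grammar:dative": 0.40})
print(s.score("word:haus"))   # stored score
print(s.score("word:katze"))  # unseen item falls back to 0.0
```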

We can perform actions a \in A to modify a learner’s knowledge. Actions could include vocabulary reviews, sentence comprehension tasks, or grammar exercises. Just think about what textbooks do. All of these actions have an effect on s. They could increase or decrease the scores based on how well the learner did (or could change the uncertainty about our beliefs). In other words, if a learner is in state s_t at time t, then an action a \in A will transition her to a new state s_{t+1}. The number of possible states is obviously huge, or infinite.
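The transition from s_t to s_{t+1} can be sketched as a small function. This is only one possible update rule, chosen for illustration: the name `apply_action`, the `rate` parameter, and the idea of nudging scores towards 1.0 on success and 0.0 on failure are assumptions, not something the model above prescribes.

```python
def apply_action(
    state: dict[str, float],
    items: list[str],
    correct: bool,
    rate: float = 0.3,
) -> dict[str, float]:
    """Return the next state s_{t+1} after an action that exercised `items`.

    Each exercised item's score moves a fraction `rate` of the way towards
    1.0 if the learner succeeded, or towards 0.0 if she failed. Both the
    rule and the default rate are assumptions of this sketch.
    """
    target = 1.0 if correct else 0.0
    next_state = dict(state)  # leave s_t untouched
    for item in items:
        old = next_state.get(item, 0.0)
        next_state[item] = old + rate * (target - old)
    return next_state


s_t = {"word:haus": 0.5}
s_next = apply_action(s_t, ["word:haus", "word:katze"], correct=True)
print(s_next)
```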

This now starts to look a bit like a Markov Decision Process (MDP), except that we don’t have uncertainty in our state transition, and that we haven’t defined a reward function.

Learning towards a specific goal

Most approaches ignore the fact that students have different motivations for learning a language. That’s clearly a mistake. The knowledge required to understand your favorite TV drama is different from the knowledge required to comprehend scientific journals. Obviously there is a lot of overlap, but taking a class focused on daily conversation probably isn’t the fastest way towards reading academic literature. With the ability to model knowledge on a fine-grained level we can have truly personalized learning.

Let’s assume the learner’s goal is to understand a certain text, an online article or YouTube video for example. Because we know the knowledge dependencies of that text we know which target states s^t would allow the learner to comprehend it (with high probability at least). Our goal is to find a policy \pi(s_t) that tells us which actions to take at any given point in time in order to reach some target state as quickly as possible. The policy tells us the stochastically optimal path towards a learner’s goal. In an MDP, the optimal policy is the one that maximizes the expected sum of rewards from some reward function R_a(s, s'), and by defining that function in the right way we can solve the problem of finding an optimal policy using Reinforcement Learning techniques.
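One possible way to define the reward so that "as quickly as possible" falls out of the maximization is a constant cost per step that stops once a target state is reached. This is a sketch under assumptions: the symbol S^{*} (the set of target states) and T (the first time step at which the learner is in a target state) are notation introduced here, not part of the model above.

```latex
R_a(s, s') =
\begin{cases}
  0  & \text{if } s' \in S^{*} \\
  -1 & \text{otherwise}
\end{cases}
\qquad
\pi^{*} = \arg\max_{\pi} \;
  \mathbb{E}\!\left[ \sum_{t=0}^{T-1} R_{a_t}(s_t, s_{t+1})
  \;\middle|\; a_t = \pi(s_t) \right]
```

With this choice the sum of rewards equals -(T-1), so maximizing the expected reward is the same as minimizing the expected number of actions needed to reach a target state.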

This task is challenging for several reasons. The state space is infinite and actions have stochastic results. We can never explicitly model the whole space. We may also need to trade immediate rewards for long term rewards. For example, instead of learning a complicated term that frequently appears in the target text it may be better to learn a common word that has a low frequency in the target text but makes more actions available to the learner in the future. Luckily, all these are well-known problems that have been solved in one way or another.

Picking the right actions

A key challenge in language learning is to present the learner with material that is neither too difficult nor too easy, or the learner will become frustrated or bored, respectively. In text comprehension there is research that shows that one unknown word for about 50 known words is a good ratio to encourage learning. Learning vocabulary from context is generally more effective than rote memorization because it forces the brain to make connections to things you already know. If we can accurately model the knowledge of a learner and the knowledge dependencies of text, then this task becomes trivial. We could find articles, social media posts or other content that are just right for the learner’s current level and create actions based on them. And of course, by presenting such material to the learner we would refine our model of what the learner actually knows. In other words, the set of actions available at a state s_t should be limited to those actions that are appropriate for a learner at that stage.
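Restricting the available actions to appropriate material could look roughly like the filter below. It is a simplified sketch: the function names, the 0.5 cut-off for counting a word as "known", and treating a text as a plain list of tokens (ignoring grammar and repeated words) are all assumptions made for illustration.

```python
def count_unknown(
    tokens: list[str],
    scores: dict[str, float],
    known_threshold: float = 0.5,
) -> int:
    """Count tokens whose score is below the (assumed) 'known' threshold."""
    return sum(1 for tok in tokens if scores.get(tok, 0.0) < known_threshold)


def is_appropriate(
    tokens: list[str],
    scores: dict[str, float],
    known_per_unknown: int = 50,
) -> bool:
    """True if the text contains something new, but at most about one
    unknown word per `known_per_unknown` known words."""
    unknown = count_unknown(tokens, scores)
    known = len(tokens) - unknown
    return unknown > 0 and unknown * known_per_unknown <= known


# Hypothetical learner who knows "a" well and has never seen "x".
scores = {"a": 0.9}
just_right = ["a"] * 50 + ["x"]  # 1 unknown per 50 known words
too_hard = ["a"] * 10 + ["x"]    # 1 unknown per 10 known words
too_easy = ["a"] * 50            # nothing new to learn

print(is_appropriate(just_right, scores))
print(is_appropriate(too_hard, scores))
print(is_appropriate(too_easy, scores))
```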

Data Network Effects

The more actions a learner performs the more accurately we will be able to model her knowledge, and the more confident we can be in presenting her with the right actions. But that’s not all. As more learners are performing actions we can become more certain about how actions affect a learner’s state, essentially answering the question: Which material is most effective for a learner with a certain background knowledge and goal? This not only allows for making optimal recommendations about what a learner should do next, but may even provide insights about how people learn in general.
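A very simple way to pool observations across learners is to keep a running average of the score change each action produces. This is a toy sketch, not a full model: the class `ActionEffectEstimator`, the action labels, and the idea of summarizing an action's effect as a single average gain (ignoring the learner's background and goal) are assumptions.

```python
class ActionEffectEstimator:
    """Running mean of the observed score gain per action, across learners."""

    def __init__(self) -> None:
        self._count: dict[str, int] = {}
        self._mean: dict[str, float] = {}

    def observe(self, action: str, gain: float) -> None:
        """Record that `action` changed some learner's score by `gain`."""
        n = self._count.get(action, 0) + 1
        mean = self._mean.get(action, 0.0)
        self._count[action] = n
        self._mean[action] = mean + (gain - mean) / n  # incremental mean

    def estimate(self, action: str) -> tuple[float, int]:
        """Return (average gain, number of observations) for `action`."""
        return self._mean.get(action, 0.0), self._count.get(action, 0)


est = ActionEffectEstimator()
est.observe("review:haus", 0.2)  # hypothetical learner 1
est.observe("review:haus", 0.4)  # hypothetical learner 2
mean_gain, n_obs = est.estimate("review:haus")
print(mean_gain, n_obs)
```

The observation count gives a crude handle on confidence: the more learners have performed an action, the more the average can be trusted.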

These are just some examples of things we can do with a more analytical approach to language learning, but it’s already pretty exciting.

The human side of technical debt

When we talk about technical debt we typically talk about its business impact. It allows us to gain short-term efficiency at the expense of long-term productivity. Technical debt isn’t always bad. Sometimes it’s the right business decision. When running experiments that may or may not become part of a final product, taking on technical debt is often a good idea. If the experiment doesn’t work out we can throw away the piece that incurred the debt and there’s no need to “pay it back”.

But technical debt has side effects that we often forget about. Developers hate working with technical debt that isn’t their own. Nobody likes cleaning up somebody else’s mess. Whenever a developer touches code or infrastructure plagued by technical debt she is likely to feel frustrated and demotivated, and that feeling will spill over to other aspects of her work. This effect on human motivation is hard to quantify, but extremely important to consider.

Then there’s the cascading effect. The presence of technical debt increases the probability of developers adding more technical debt in adjacent components. It’s messed up anyway, so adding a bit more won’t hurt, right? Individual developers often make such decisions on the spot, and the result can get out of hand rather quickly.

The price of technical debt grows with the number of people and the number of components that touch it. We need to think of its impact in terms of people, not just code or infrastructure. As much as possible, debt should stay confined to isolated components that are “owned” by individuals or teams who are responsible for managing and paying back the debt over time.


How to make decisions, stick with them, and be productive

The list of new technologies I want to learn about grows every day. My Amazon wish list contains hundreds of books on more than a dozen subjects, ranging from Machine Learning to Public Speaking. I have a spreadsheet with over 50 business ideas and associated research. Some I briefly started working on, only to abandon them a few weeks later and move on to the next shiny thing.

What the hell am I doing?

We live in a world of unlimited opportunity. But deciding how to spend our time is becoming increasingly difficult. There are so many choices. How do we pick the best one? Which metric do we use to define best?  And once we’ve made a decision, how can we follow through and avoid getting distracted? I often feel paralyzed. I am scared that whatever choice I make is suboptimal and may lead to misery down the road.

I’ve always admired people who can set their sight on a goal and follow through without getting distracted by other opportunities. I have a couple of friends who are like that. How are some people so incredibly productive? And why do some people feel empty even though they’ve achieved huge success? I figured that learning about human motivation may help me solve my problem. So that’s what I did and what this post is about.

I haven’t found one answer that applies to all situations but I thought it would be good to share what I’ve learned nonetheless. So this post is a collection of ideas and techniques that helped me become more motivated and productive. Not all of them may work for you. Everyone faces unique challenges after all.

Before diving into the practical stuff let’s try to break down our problem. I think there are two subproblems:

  1. How do we decide what to spend our time on?
  2. Once we’ve decided, how do we stick with our decision and become more productive?

Of course, these two problems are interrelated. Picking a project we’re passionate about may lead to increased productivity down the road. And knowing why we do something helps us stick with it and ignore distractions. Still, I think it’s useful to consider these two questions separately.

How do we decide what to spend our time on?

Don’t follow your passion

Successful people will tell you that they love what they do. There clearly is a correlation, but nobody has been able to prove causation. Don’t confuse the two. In other words, these people didn’t go into deep introspection, figure out what they were passionate about, and then become successful because of that. They learned to love what they were doing. There’s a good chance that you are passionate about what you’re already good at, because that’s the reason you became good at it in the first place.

Ben Horowitz recently gave a speech at Columbia University about this. Two points that stood out to me are that passions are hard to prioritize and that passions change. Are you more passionate about math or engineering? Are you more passionate about history or literature? What you’re passionate about now may not be the same thing you’re passionate about in 10 years.

In his book So Good They Can’t Ignore You, Cal Newport, a computer science professor and the author of the popular Study Hacks blog, suggests looking at your strengths to figure out what you can offer to the world. What are you uniquely destined to do? Passion will be the side effect.

Follow Resistance

If you haven’t read Steven Pressfield’s The War of Art you should. Out of all the resources in this post it’s probably the one that has had the biggest impact on me. The basic idea is that a force called Resistance is trying to prevent us from doing meaningful things in our life. It’s not a scientifically rigorous theory, but a very helpful metaphor to think about human motivation.

Any act that rejects immediate gratification in favor of long-term growth, health, or integrity. Or, expressed another way, any act that derives from our higher nature instead of our lower. Any of these will elicit Resistance.

Goals that commonly evoke resistance include dieting, exercising, starting a business, and creative work like writing or drawing. Resistance is stronger the more meaningful the project is. That sucks, right? Nope, actually it’s great. It allows you to use resistance as a compass. The project you’re most scared about, the one that has been on your list forever, is probably the most fulfilling thing you can do.

I’ll talk a bit more about how Resistance manifests itself later in the post.

Be optimistic and have a bias towards action

When I read the biographies of successful people (Getting There: A Book of Mentors is a good resource) it stood out to me that many zig-zagged through their lives and made drastic career changes several times. It’s important to realize that the decisions you make are not set in stone. It’s usually impossible to determine the best path in advance. You’re better off making a well-informed decision quickly than overanalyzing the situation and doing nothing. This is also known as satisficing. Picking a strategy that achieves a minimum threshold of “goodness” often leads to better outcomes than spending a huge amount of time finding the optimal decision.

One thing to be aware of is that we tend to overestimate the potential downsides of our decisions because we’re naturally loss-averse. We’re scared to lose our comfort and safety. In reality there are few career choices that will completely ruin your life. There’s almost always a way to recover. Look at upside opportunity instead of downside risk and be optimistic about uncertainty. If an opportunity has a lot of uncertainty (an entrepreneurial venture for example) you are certain to learn a lot from it, even if it doesn’t work out.

Inaction is often riskier than we think. Supposedly stable jobs are a good example. They are stable until they aren’t. A market crash, legal dispute or shareholder issues can lead to layoffs that nobody expected. Relying on the stability of your job is riskier than continuously reinventing yourself (e.g. by developing skills currently in demand or switching jobs) and diversifying your activities.


Most of us ignore death. It’s not something we typically think about. We act as if we are in this world forever. That just isn’t true. All of us will die eventually. Once we let death enter our life everything changes.

The moment a person learns he’s got terminal cancer, a profound shift takes place in his psyche. At one stroke in the doctor’s office he becomes aware of what really matters to him. Things that sixty seconds earlier had seemed all-important suddenly appear meaningless, while people and concerns that he had till then dismissed at once take on supreme importance. What about that gift he had for music? What became of the passion he once felt to work with the sick and the homeless? Why do these unlived lives return now with such power and poignancy?

Steve Jobs shares a similar experience in his graduation speech at Stanford. Remembering that he’ll be dead soon helped him forget about external expectations, pride, and fear of embarrassment, and figure out what’s important to him. One of the biggest regrets of the dying is “I wish I’d had the courage to live a life true to myself, not the life others expected of me”. Regularly thinking about death is a useful technique that may help us get closer to that.

How do we stick with our decisions and become more productive?

Label Your Enemy

Resistance (as described in The War of Art) manifests itself in many forms: self-doubt, fear of public evaluation, rationalization, or fantasizing about the outcome of your decisions. My personal favorite is rationalization:

Rationalization is Resistance’s right-hand man. Its job is to keep us from feeling the shame we would feel if we truly faced what cowards we are for not doing our work.

Simply understanding what forms of Resistance we experience is helpful. Next time you experience any of the above, try to identify and label it. This is also known as affect labeling and has long been used in meditation to manage negative emotions. By noticing “I am rationalizing my behavior right now” you may just be able to get rid of the feeling. Actually noticing a specific feeling when it arises is difficult; often you’re just “swept away” by it, but you get better at it with practice.

Time Blocking and Pomodoro

Time blocking is a technique that has had a huge impact on my productivity. The basic idea is that you divide your day into fixed chunks of time and assign tasks to them. During each block you work on nothing else but the task at hand. If you finish your task early you can work on a secondary task. If you don’t finish within the allotted time you can reschedule your existing blocks or continue tomorrow. It’s important that you evaluate your day based on how well you adhere to your blocks, not based on how many tasks you finish. Cal Newport describes this technique in more detail on his blog. A side effect of time blocking is that it helps you build habits by using the same block structure every day.

Directly related to time blocking is Parkinson’s Law:

Work expands so as to fill the time available for its completion.

By assigning work to discrete time slots you are forcing yourself to be more productive and cut down on unnecessary actions.

If you have trouble staying focused for a full time block you can break it down into 25-minute intervals with short breaks in between. These are also known as Pomodoros.
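To make the idea concrete, here is a minimal Python sketch that splits one time block into Pomodoros. The 25-minute work interval comes from the text above; the 5-minute break length and the function name split_into_pomodoros are my own assumptions for illustration, so adjust them to taste.

```python
from datetime import datetime, timedelta


def split_into_pomodoros(start, end, work_minutes=25, break_minutes=5):
    """Split a time block into alternating ("work" | "break", start, end) segments.

    break_minutes=5 is an assumed default; the technique only asks for short breaks.
    """
    work = timedelta(minutes=work_minutes)
    rest = timedelta(minutes=break_minutes)
    segments = []
    cursor = start
    while cursor < end:
        # Work interval, cut short if the block ends first.
        work_end = min(cursor + work, end)
        segments.append(("work", cursor, work_end))
        cursor = work_end
        if cursor < end:
            # Short break, also cut short at the end of the block.
            break_end = min(cursor + rest, end)
            segments.append(("break", cursor, break_end))
            cursor = break_end
    return segments


if __name__ == "__main__":
    block_start = datetime(2024, 1, 8, 9, 0)   # hypothetical 9:00-11:00 block
    block_end = datetime(2024, 1, 8, 11, 0)
    for kind, seg_start, seg_end in split_into_pomodoros(block_start, block_end):
        print(f"{kind:5} {seg_start:%H:%M} - {seg_end:%H:%M}")
```

A two-hour block yields four work intervals this way, which is usually a realistic amount of focused work for one sitting.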


Autonomy, Mastery and Purpose. According to Daniel Pink’s Drive these are the three pillars of intrinsic motivation. Most of us already have autonomy and I talked about purpose in the first part of this post. But what about Mastery?

Mastery is the feeling of progression, the feeling that your skills are improving. In video games Mastery often takes on the form of levels, ranks or better items. It’s one element that makes games so addictive. But how do we implement this in our own lives?

In order to gain a sense of mastery from your actions you need to set measurable goals. I like to use Google’s OKR system for this. (Okay, it wasn’t invented by Google, but at least they made it popular.) OKR stands for Objectives and Key Results. An objective is something like “Lose 10 pounds” and its associated key results could be “Go to the gym 15 times for at least one hour” and “Run a total of 50 miles”. Key Results are always fully quantifiable.

An important characteristic of OKRs is that they are ambitious. Achieving 60 to 70% of an Objective at the end of a period is what you’re aiming for.

If you get 100%, you’re not crushing it, you’re sandbagging.

I use a spreadsheet to manage my personal OKRs. I review and update their progress once a week and create new OKRs at the end of each month. The OKRs feed directly into my time blocking schedule. Every time block is related to a Key Result.
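If you’d rather script this than keep a spreadsheet, here is a rough Python sketch of how an objective could be scored from its Key Results. Averaging the Key Results and capping each one at 100% are my own assumptions, not an official OKR rule, and the progress numbers in the example are made up.

```python
from dataclasses import dataclass


@dataclass
class KeyResult:
    description: str
    target: float          # the quantity you are aiming for
    current: float = 0.0   # progress so far

    def score(self) -> float:
        """Fraction of the target achieved, capped at 1.0."""
        if self.target <= 0:
            return 0.0
        return min(self.current / self.target, 1.0)


def objective_score(key_results):
    """Assumed scoring rule: plain average of the Key Result scores."""
    if not key_results:
        return 0.0
    return sum(kr.score() for kr in key_results) / len(key_results)


if __name__ == "__main__":
    # Key Results from the "Lose 10 pounds" example; current values are hypothetical.
    lose_weight = [
        KeyResult("Go to the gym 15 times for at least one hour", target=15, current=9),
        KeyResult("Run a total of 50 miles", target=50, current=35),
    ]
    print(f"Objective score: {objective_score(lose_weight):.0%}")
```

With these made-up numbers the objective lands in the 60 to 70% range, which is the zone you are aiming for with an ambitious OKR.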

Building Habits

According to The Power of Habit up to 40% of our waking life may be dictated by habits. During these times we’re on autopilot. Habits are hard to change. This is both good and bad. Once you’ve created a habit of going to the gym every day it’s difficult to break it. But when you’re in the middle of doing your work and somehow end up on Facebook without realizing it, that’s a habit too.

Habits are made up of a cue/trigger, behavior and reward. The reward satisfies some kind of craving you have. It could be hunger, the need for socialization, or boredom. To get rid of a bad habit you can use the following procedure:

  1. Identify the behavior, e.g. visiting Facebook. This is the easiest part because it’s usually obvious.
  2. Experiment with various rewards to figure out your craving. What if you went for a walk instead? What if you listened to music for a few minutes? Does that satisfy you?
  3. Try to identify the cue. It could be location, time, emotional state, other people, preceding action or something else.
  4. Make a new plan based on your cue and try to replace your habit. E.g. “At 3pm listen to music for 5 minutes.” Write it down and follow your plan when you experience the cue next time.


Regular meditation practice can help with various aspects of productivity. It can help you stay focused for longer periods of time. It will also help you to identify feelings and thoughts (e.g. rationalization) you may not notice otherwise. If you’re looking for an easy way to get started with meditation I recommend the Headspace app.

If you made it this far, thanks for reading. What are your favorite techniques for staying motivated and productive?

Why I’ll never go to an office again

Over the past few years I’ve had the chance to work with many distributed teams. Most of them were startups, but I’ve also done remote projects for larger corporations. I honestly have never been happier with my work-life balance and I firmly believe that distributed teams are the future of engineering organizations. I can no longer imagine living and working any other way. In this post I want to discuss what about remote work had the biggest impact on my personal happiness as well as some of the things I’ve learned along the way.

Increased productivity

Working remotely has helped me to better understand my body. Previously I didn’t have much choice in setting my hours. By tracking my productivity over the course of the day I found that I am most productive in the morning (6-11am) and early evening (4-6pm). I essentially can’t get any intellectually demanding work done around noon or late at night. Being in the office from 11-3 is a waste of time for me and for the company I work for. I don’t produce any output and I feel horrible. Setting my own schedule means that I can take a break at noon, get lunch, work out or go for a run, take a nap, and run some errands. Afterwards I feel happy, re-energized and ready to get stuff done again. I know plenty of people who love to work late at night. So why not let them?

While some centralized teams offer flexible hours, they usually come with limitations. If you have a long commute (next point) what can you really do within 4 hours of free time? What if there’s no gym or track around your office? What if you want to buy groceries for your home? What about the peer pressure of eating and hanging out with co-workers? If you’re the only person in the office at midnight what’s the reason for being there at all?

No commute

Studies have found that a long commute has negative effects on life satisfaction. The longer the commute the more pronounced these effects are. Some people succeed in setting up a home office but for many (including me) it’s rather difficult to get serious work done at home. I typically work from a co-working space pretty close to my house. So it isn’t quite true that there is no commute, but usually the commute is much shorter and doesn’t feel like a commute. For example, I walk to my favorite coffee shop in the morning and enjoy getting some fresh air. That’s pretty different from a 1-hour drive and being stuck in traffic.

Reducing the commute has been huge for me. You may have gotten used to it already, but don’t underestimate the stress that comes from commuting. I no longer worry about when exactly to leave home, whether or not there’s traffic, or when I will arrive at my office. As a result I feel less stressed and more productive throughout the day.

Flexibility and ability to travel

Centralized teams have set up a structure of synchronous communication where not being in the office at usual times breaks processes like scheduled meetings. Distributed teams typically design their communication structure to be asynchronous and less reliant on people being available at the same time. This means that occasionally changing the times you work isn’t such a big deal. When I have an appointment, errands to run, or want to show a friend around town I simply change my hours to make room for it. I can also satisfy my urge to travel to other places or countries. My favorite places to live in are currently Japan and Thailand, and I’ve been traveling back and forth between them.

Office politics and gossip

This point isn’t really a result of working remotely but rather of how distributed teams are typically set up and run. Humans are social animals and as the team grows it’s inevitable that office politics and social hierarchies develop. I am not talking about professional relations, which are usually pretty clear, but interpersonal ones. Things that are seemingly unrelated to work, such as forgetting someone’s birthday, deciding what to get for lunch, or not going out for beers with the others inevitably spill over to professional interactions (sometimes just subconsciously). While some people benefit from these things, they can become a huge factor of stress and work dissatisfaction for others.

Distributed teams don’t avoid office politics completely, but almost. Restricting in-person social interaction and focusing on trackable results and metrics leads to significantly less politics and stress. Separating work and personal life is much easier when working remotely.

What about the employer’s perspective?

To make distributed teams the norm we need to make it lucrative for both sides, the employee and the employer. This post is written from the employee’s perspective and that’s arguably the easier side to convince. The benefits for the employee are pretty well understood and there are enough people who want to work remotely.

However, making distributed teams more productive than centralized ones is hard. Asynchronous communication is probably the main reason for this. Most distributed teams I’ve worked with were probably less effective than if everyone had been in the same space. But I don’t think it has to be this way. With the right processes in place I believe that we can make distributed teams work for both employees and employers. I’ll discuss some of the things I’ve learned from the employer’s side in a future post.

I’d love to hear about your experiences with distributed teams.


Product Idea: Smart & Beautiful Spaced Repetition

This post is part of the product idea series. I’ll take an idea for a product I have, do some research, and then present it. Usually these ideas are something I want for myself but don’t have the time or motivation to work on. I’ll structure each idea roughly according to a YC application because I think its questions are a pretty good idea filter.

What is your company going to make?

We build a flashcard-based Spaced Repetition application and API to help people be maximally effective when memorizing vocabulary, for example when learning a new language. Our application has a beautiful mobile UI and our platform makes it easy for developers to create integrations. We also use collective review data to make smart suggestions to our users.

Why did you pick this idea to work on? Do you have domain expertise in this area? How do you know people need what you’re making?

I have been a long-time user of Anki, a popular Spaced Repetition System (SRS) that has been around for about 10 years. I used Anki extensively when I was studying Japanese, and I recently started using it again to memorize Korean vocabulary. Anki’s functionality has been immensely helpful in memorizing vocabulary and grammar.

Anki hasn’t changed much in the past decade. By today’s standards data input feels clunky, the user experience is out of date, and it’s not optimized for mobile. I typically manage my flash cards on the desktop but do all my reviews on a mobile device. The syncing functionality is slow and requires manual work. The plugin system is hard to use for developers. There are no good alternatives to Anki that offer the features I need.

The need for Spaced Repetition is well established and there are a couple of successful commercial products out there. Besides language learning, subjects like GMAT, SAT, LSAT and medical term memorization are good use cases for flashcard-based SRS.

What’s new about what you’re making? What substitutes do people resort to because it doesn’t exist yet (or they don’t know about it)?

People who need to memorize vocabulary (or other things) can be categorized by increasing level of sophistication:

  1. Those who do not use any system to memorize facts. These people go to classes, read textbooks, and repeat words to themselves, but do not use a formal system to make memorization more efficient.
  2. Those who make paper/electronic flashcards. They may use software such as Quizlet that is not customizable and doesn’t use Spaced Repetition.
  3. Those who download various specialized apps from the app store. For example a user may download a Learn Japanese vocabulary app on the iPhone. Such apps are easy to use but leave the user with very little control over what to learn or how to structure their learning.
  4. Those who use sophisticated and customizable SRS software such as Anki. These people want control over what they learn and how they learn it. Because software like Anki is feature-packed it is typically difficult to use for non-technical people and not as accessible to the general population. A lot of Anki users seem to be developers, science and medical students, or people somehow involved in the tech industry.

SRS software like Anki is relatively unknown to the general population so it could be that people in groups 1-3 simply don’t know about it. For example, college students taking language classes are generally tech-savvy but may have never heard of SRS.

The key to our product is to combine the advantages of 3 and 4. There are three key points I want to touch on here (there are others, but I want to keep the post relatively short): integrations, data and mobile UX.

Despite many attempts to commercialize flashcard software Anki has stayed popular and is the SRS of choice for sophisticated users. This is mainly due to the flexibility and extensibility that Anki offers. Let me give you an example. When I enter a new vocabulary word in Anki I can activate a plugin that automatically downloads the Google Translation Audio for the word, and plays it when I review the card. There also exists a tool to create flashcards from movies and their subtitle files. That’s something that’s impossible to achieve with any other flashcard software. However, while it’s possible to do these things with Anki, it’s not easy by any means. Plugins are not easy to create for the developer and not easy to use for the end-user. That should change. If you compare Anki’s plugin system to the modern APIs we are used to (think Slack) then it becomes clear that it’s outdated. Having a modern developer API that allows people to create plugins and integrations is key.

The second point is data. Anki (and other flashcard apps) collect a huge amount of data on what people are learning and reviewing. But they don’t use it. Here are some simple examples of how one could make use of collective data:

  1. Doing entity disambiguation to figure out which users are studying the same facts and which flash cards refer to the same entities. We could use that knowledge when someone inputs new data. For example, if one user has added a picture for the word bird then the same picture could be used by another user who is studying that word (even in another language).
  2. Making recommendations based on which terms are typically studied together by many users.
  3. Analyzing review data to find an efficient “study order”. For example, it may be easier to learn the meaning of the word imagine before unimaginative.

There are many more interesting things you can do with collective review data and I have lots of ideas. This has been totally unexplored in the solutions I’m aware of.
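To give a feel for the recommendation idea, here is a toy Python sketch that counts which terms show up together in users’ decks and suggests the most frequent companions of a given term. The deck contents and the function names build_cooccurrence and recommend are invented for illustration; a real system would need entity disambiguation first and far more data.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(decks):
    """Count how many decks contain each unordered pair of terms."""
    counts = Counter()
    for terms in decks:
        # sorted(set(...)) makes each pair canonical and ignores duplicates in a deck.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(term, counts, top_n=3):
    """Return up to top_n terms most often studied together with `term`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical decks, one list of terms per user.
    decks = [
        ["bird", "cat", "dog"],
        ["bird", "cat"],
        ["bird", "fish"],
    ]
    counts = build_cooccurrence(decks)
    print(recommend("bird", counts, top_n=1))
```

Simple pair counting like this favors very common terms, so a production version would probably normalize by how often each term appears overall.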

The third point is a mobile-first user experience. This doesn’t need much explaining. The app must sync seamlessly across devices and have a beautiful mobile and desktop user experience.

Who are your competitors, and who might become competitors? Who do you fear most?

As explained above, there are solutions with varying degrees of sophistication.

  • Customizable open source software like Anki
  • General-purpose closed-source apps like Memorang, Quizlet or Brainscape. These solutions usually don’t offer much flexibility and their main value lies in the pre-made flashcard decks they provide.
  • Subject-specific flashcard apps found in the app store

What do you understand about your business that other companies in it just don’t get?

There are two camps of companies, both of which have their own deficiencies. We also touched upon this above.

Anki and its users are very tech-savvy. Anki allows people to track and customize anything, but does not care much about the end-user or integration developer experience.

Companies like Quizlet have a fairly good user experience but do not appeal to power users due to the lack of customization. They are not as effective as Anki due to not making use of spaced repetition algorithms, community plugins, or data.

What we really need is a combination of the above, where unsophisticated users can easily benefit from the sophistication of others, and where sophisticated users can benefit from a good UX and API.

Another point that competitors don’t understand is that review data has network effects. With more usage of the product we can use the data to provide additional value to existing and new users in the form of recommendations, better user experience (due to smart suggestions), pre-made decks, and social features. Other companies in the space don’t seem to be executing on this idea.

How will you get users? If your idea is the type that faces a chicken-and-egg problem in the sense that it won’t be attractive to users till it has a lot of users (e.g. a marketplace, a dating site, an ad network), how will you overcome that?

I believe we can make people from Anki switch over by offering a compatible data format (they can import/export to and from Anki) while providing a superior user experience and API. People using Anki fit the early adopter profile of being a tech-savvy crowd that likes to try out new things. We need pre-made card decks for specific languages to attract such users.

Other acquisition channels could include promoting the app for in-classroom use at schools or growth-hacking language exchange websites and social networks.

How will you make money?

I believe it’s too early to think about monetization for a consumer product and all the focus should be on getting traction. As mentioned above, network effects would allow the right product to lock in users and monetize them later on. I can see several obvious paths to monetization, such as paid integrations/plugins (e.g. a developer marketplace), paid hosting of review data, paid personal analytics, or upselling of other services. Another possibility may be to have a paid product for teachers/classrooms with team features while the product for individuals stays free.

Being in Silicon Valley as a Founder – Pros and Cons

Many founders working on early-stage startups seem to want to move to Silicon Valley. The biggest hurdles are usually visa and money. But is moving to SV actually worth it? I’m not an SV veteran, I’ve only been here since 2008, but over the past years I’ve spent a fair amount of time living outside of SV. I believe this helped me gain some perspective on what Silicon Valley can and cannot provide to founders.

Common Misconceptions

Let’s first look at things that people believe to be advantages of being in SV, but which in my experience also have a flip side. I see these as not inherently positive or negative, but depending on your situation they may be worth considering.

Raising (seed) money is easy. It’s true that the amount of capital available in SV is huge, but raising money is not as easy as you may think (or as the press makes it out to be). One reason for this is that lots of startups are competing for the money. Expectations of what companies are required to show are constantly shifting. Manu Kumar has an excellent post on how seed rounds are looking more like the Series A rounds of a couple of years ago. To have a shot at raising a seed round in Silicon Valley you better have a post-MVP product with traction and promising metrics. Having a pitch deck or an MVP without customers doesn’t usually cut it. So, SV is a great place to raise money only if you’re at the right stage.

Lots of events and communities. You can attend startup mixers, meetups, hackathons and pitch competitions all day long. You’ll have beers with founders who are usually looking for co-founders or trying to sell their product. In my experience going to these events is mostly a waste of time. The time is better spent building your product and talking to users. Also, don’t expect to meet any influential people at such events. Most of them are too busy to attend and have realized that there is little value to be gained. 

The best talent is in SV. Silicon Valley companies and famous Bay Area universities attract people from all over the world. True, but that doesn’t immediately benefit you or your startup. Sought-after people have dozens of excellent opportunities and you would need to offer something pretty special to attract someone to your company over the other gazillion startups out there. Expect to pay salaries or hourly rates that are 3-5x above the rates in other parts of the country (or world). You may actually be able to attract better talent in areas where people don’t have as many options as they do in SV.


Karma really exists in SV. I’m constantly amazed at how helpful people are, particularly those that I expected would be too busy to reply to me. I’ve asked favors of influential people who had absolutely zero reason to talk to me, but they still did. This is something that newcomers, in my experience especially those from a sales background, seem to have a hard time getting used to. They try to strike a deal whenever someone asks them for a favor. I think this culture is pretty unique to SV and I haven’t seen anything comparable in other cities or countries.

Selling to startups is easy. I often hear people discouraging startups from selling to other startups. I disagree. It’s bad if your total available market consists of startups only, but there is nothing wrong with having startups as early adopters and then moving upmarket. In accelerators like YC it’s extremely common that startups start out selling to the other cohort and alumni companies. There’s no better place to sell to startups than SV. Getting meetings is relatively easy, and companies are open to trying out and implementing new solutions, helping you to iterate quickly on your product.

It’s easy to relate to people. Living in other parts of the world I found it very difficult to explain what exactly I am doing to others. In some places I got tired of explaining and settled with “I’m an engineer and work for a software company” or something similar. Most people just aren’t familiar with the concept of a startup. In Silicon Valley you could probably strike up a conversation about churn rates with your barista (not that I’ve tried…) and she’ll tell you that she’s working on her own mobile app during break time. Knowing that people understand what you are doing is a very comfortable feeling. Not being able to talk to anyone about your problems can be quite depressing.


High burn rates. Median rent for a 1BR apartment in San Francisco is about $2,500 to $3,500 and increasing. There are cheaper options, such as living in the East Bay, but since almost all meetings happen in SF you would be commuting all the time. That time/money may be better spent on your product, and commuting is among the most significant factors of work and life dissatisfaction. If you’re in the early stages and haven’t raised a big chunk of money then these are serious expenses. Startups are a marathon, and lowering your burn rate may just make the difference between making it or not. The Airbnb story is a good example of this.

An influx of people who want to make a quick buck. Even in the relatively short time I’ve been in SV I’ve seen the crowd change. Many of those who used to go into finance are now going into startups hoping to make a quick buck. I’m all for more people getting into the startup ecosystem and bringing positive change, but it also seems like startups in SV have become a more hostile environment as a result. Legal disputes between founders and people trying to trick each other with strange non-standard equity clauses seem to be becoming more common. As startups are entering mainstream media the situation will likely worsen, and SV is the place that will feel it the most.

High Pressure. At Stanford we have something called the “duck syndrome”. On the surface it looks like you are gliding along effortlessly, but in reality you’re vehemently paddling underneath. Talk to any startup founder in Silicon Valley and they’ll tell you how they’re “crushing it”. In reality most founders work crazy hours and things aren’t going nearly as well as it may seem. But you’ll never find out about that. As humans we tend to compare ourselves to those around us. In SV this can lead to extreme stress and pressure. Founder depression is common and not to be underestimated.

For anyone who has made the move, I’d love to hear your experience.