HexHowells

Unaligned Incentives

Jul 24 2025


There are many goals in this world: the goals of companies, of job roles, of your own individual life. However, each of these is driven by separate incentives that often do not perfectly align with the goal itself. Looking at the gap between a goal and its incentives forms a good framework for understanding the true nature of a system and its effectiveness.

Are companies evil?

A common case of this is in business. The "mission" of a company and its incentives (almost always making money) are not the same thing, and in some cases are unaligned with each other.

Take the case of dating apps. Ideally, their goal should be to find everyone a perfect match in a short time span. However, they profit from premium users paying them money, so the longer users stay on the app, the more money they make. These two things are intrinsically unaligned, so you cannot trust the system to be effective. Sure, if you get no dates at all you'll eventually stop paying, but that only produces a semi-effective system. Now imagine a dating app where you pay a fixed amount up front for a year, and as each month goes by without you finding a perfect match, part of that money gets refunded to you. You would probably be off the app and in a relationship within the first week (assuming that's why you're on the app in the first place).
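As a toy illustration (the fee structure and numbers here are made up, not from any real app), here's a quick sketch of how the refund model flips the provider's incentive:

```python
# Toy comparison of two dating-app revenue models (all numbers hypothetical).
# Subscription: revenue grows the longer a user stays unmatched.
# Refund: revenue shrinks the longer a user stays unmatched, so the provider
# earns the most when it matches people quickly.

def subscription_revenue(months_until_match, monthly_fee=20):
    """Provider earns the fee for every month the user remains on the app."""
    return months_until_match * monthly_fee

def refund_revenue(months_until_match, upfront_fee=240):
    """User pays up front; 1/12 of the fee is refunded per unmatched month."""
    months_unmatched = min(months_until_match, 12)
    return upfront_fee - (upfront_fee / 12) * months_unmatched

for months in (1, 6, 12):
    print(months,
          subscription_revenue(months),   # higher when matching is slow
          refund_revenue(months))         # higher when matching is fast
```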

There are companies that look more trustworthy when viewed through this framework, where their profits and goals are aligned. Take TVs: a manufacturer makes a good TV, sells it for more than it costs to make, and you buy it because it's good; if it's not good, it doesn't get bought. TV companies are incentivized to make good TVs. Similarly, streaming platforms are reasonably aligned: if they don't have a good selection of shows that entertain you, you stop paying for the platform.

Obviously, things like planned obsolescence and monopolies muddy this, but the closer a company's incentives and its goals become, the better.

Search

Over recent years, the top results of many search engines have often been regarded as fairly low quality, with many sites bloated, SEO-optimised, and filled with calls to action and pop-ups. SEO, by definition, is the augmentation of content and web pages to boost their rank on search engines. Do people really want to build websites with all this unnecessary noise, or are they incentivised to by search engines?

I feel the web as a whole could be vastly improved by search rankings that prioritise fast web pages with minimal pop-ups and screen noise, and content that gets to the core of the information you want without all the additional bloat. The simplicity and signal-to-noise ratio of LLMs and LLM-powered search show how effective this can be, and if nothing is done, systems like these are likely to take over traditional search.
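To make that concrete, here's a hypothetical ranking score along those lines. The features and weights are purely illustrative assumptions on my part, not how any real search engine ranks pages:

```python
# Hypothetical page-quality score: reward relevance, penalize bloat and noise.
# Feature names and weights are illustrative assumptions, not a real ranking formula.

def page_score(relevance, load_time_s, page_weight_mb, num_popups, filler_ratio):
    """
    relevance:      0..1, how well the content matches the query
    load_time_s:    time to load the page, in seconds
    page_weight_mb: total transfer size of the page, in megabytes
    num_popups:     count of pop-ups / interstitials / call-to-action overlays
    filler_ratio:   0..1, fraction of text that isn't core information
    """
    score = relevance
    score -= 0.05 * load_time_s       # slow pages rank lower
    score -= 0.02 * page_weight_mb    # bloated pages rank lower
    score -= 0.10 * num_popups        # screen noise ranks lower
    score -= 0.30 * filler_ratio      # padded content ranks lower
    return score

# A lean page beats a bloated, SEO-padded one with the same relevance.
print(page_score(0.9, load_time_s=0.5, page_weight_mb=0.3, num_popups=0, filler_ratio=0.1))
print(page_score(0.9, load_time_s=4.0, page_weight_mb=5.0, num_popups=3, filler_ratio=0.6))
```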

The self-alignment test

Think about this for yourself: what is your chosen life's purpose? Does it align with what you do? With how you make money? Are you unaligned with yourself?

Most people have goals in life, maybe it's to be more intelligent, or healthier, or to code more. To track whether you're achieving a goal, you may set yourself proxy metrics. Say you want to be smarter, and you determine that reading books is a valuable way to gain more knowledge, so you set yourself the proxy of reading 50 books this year. That alignment between goal and incentive can diminish pretty quickly once you start upping your book count by reading children's books.

You also see this with programming. GitHub has a nice feature where you can see your activity on the platform over time. This can easily be mistaken for a measure of how much you program, or of your improvement as a programmer. However, it's pretty easy to create small commits, or even use bots that create daily activity, giving yourself the illusion of progress.

I love metrics as a way to track progress and keep yourself motivated towards a goal. But it's important to understand that these proxies will eventually become incentives in themselves, and you may forget about the original goal. So make sure the metric is aligned with your original goal, and that you cannot reward hack your own terminal goals.

The current state of AI

My biggest gripe with the current state of AI is the shift in incentives over the last few years. Now that neural networks provide enough value to generate revenue, many have pivoted their efforts away from solving intelligence and towards optimizing the current models to squeeze out value and maximize profit, which is a sad direction to see great minds go in.

Even for more pure research efforts, Goodhart's law and unaligned incentives have still poisoned the well. The heavy focus on benchmarks has led researchers to be less experimental with their work; how many papers do you see about improving RNNs? We have trodden down a path with transformers that has proven fruitful for now. However, I believe the true path to AGI will be radically different, and I am afraid that to get there we will need to take many steps back, building models that do not beat the state of the art, for now.

A good example of this is Geoffrey Hinton's forward-forward algorithm, a new learning algorithm proposed as a replacement for backpropagation, which doesn't really reflect the learning process of biological brains. To me this was super exciting to see; however, since the paper tested the algorithm on the MNIST dataset, many people were left unimpressed. It's sad to me that models which cannot land on the big LLM leaderboards are dismissed so quickly. GPT-2 would perform poorly compared to today's models, but it paved the way to what we currently have, and new methods will do the same.
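For anyone curious, here's a rough sketch of the core idea as I understand it: each layer is trained on its own local objective, pushing the "goodness" (sum of squared activations) of positive data above a threshold and the goodness of negative data below it, with no backward pass through the whole network. This is a simplified illustration rather than the paper's reference implementation; the layer sizes, threshold, and placeholder data are my own assumptions:

```python
import torch
import torch.nn as nn

# Minimal sketch of a forward-forward style layer (simplified; not the paper's code).
# Each layer has a local objective: push "goodness" (sum of squared activations)
# above a threshold for positive samples and below it for negative samples.
# No gradients flow between layers, so there is no network-wide backward pass.

class FFLayer(nn.Module):
    def __init__(self, in_dim, out_dim, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.linear.parameters(), lr=lr)

    def forward(self, x):
        # Normalize the input so only its direction (not the previous layer's
        # goodness) is passed on to this layer.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)   # goodness of positive data
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)   # goodness of negative data
        # Logistic loss: positives above the threshold, negatives below it.
        loss = torch.log1p(torch.exp(torch.cat([
            self.threshold - g_pos,
            g_neg - self.threshold,
        ]))).mean()
        self.opt.zero_grad()
        loss.backward()                                  # local to this layer only
        self.opt.step()
        # Pass detached activations on to the next layer.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()

# Toy usage: two layers trained greedily, layer by layer, on random placeholder data.
layers = [FFLayer(784, 256), FFLayer(256, 256)]
x_pos, x_neg = torch.rand(64, 784), torch.rand(64, 784)  # stand-ins for pos/neg samples
for layer in layers:
    x_pos, x_neg = layer.train_step(x_pos, x_neg)
```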

If the goal is to understand and replicate intelligence, then we need metrics and incentives that progress us towards that goal. Making money from these models is not an aligned incentive, and neither are benchmarks about small pattern-recognition tasks. We should instead either build more general benchmarks that are actually hard (such as a robot that can do 100 varied tasks), or develop algorithms that reflect the human brain: if it's not done like a biological brain, then drop it.

Make sure you are aligned

It's important to set terminal goals and principles that you follow strictly, whether that is solving intelligence, preserving freedom, improving the lives of others, etc. That way you can constantly ground-truth your actions instead of overfitting on a false proxy. Once you understand what you care about in life, make sure that the incentives and proxies you set yourself are fully aligned with it; they should boost you towards that goal, not steer you away from it.