The future of values 2: explicit vs. implicit

Relatively minor technological change can move the balance of power between values that already fight within each human. Beeminder empowers a person’s explicit, considered values over their visceral urges, in much the same way that the development of better slingshots empowers one tribe over another.

In the arms race of slingshots, the other tribe may soon develop their own weaponry. In the spontaneous urges vs. explicit values conflict though, I think technology should generally tend to push in one direction. I’m not completely sure which direction that is however.

At first glance, it seems to me that explicit values will tend to have a much better weapons research program. This is because they have the ear of explicit reasoning, which is fairly central to conscious research efforts. It seems hard to intentionally optimize something without admitting at some point in the process that you want it.

When I want to better achieve my explicit goal of eating healthy and cheap food for instance, I can sit down and come up with novel ways to achieve this. Sometimes such schemes even involve trickery of the parts of myself that don’t agree with this goal, so divorced are they from this process. When I want to fulfill my urge to eat cookie dough on the other hand, I less commonly deal with this by strategizing to make cookie dough easier to eat in the future, or to trick other parts of myself into thinking eating cookie dough is a prudent plan.

However this is probably at least partly due to the cookie dough eating values being shortsighted. I’m having trouble thinking of longer term values I have that aren’t explicit on which to test this theory, or at least having trouble admitting to them. This is not very surprising; if they are not explicit, presumably I’m either unaware of them or don’t endorse them.

This model in which explicit values win out could be doubted for other reasons. Perhaps it’s pretty easy to determine unconsciously that you want to live in another suburb because someone you like lives there, and then, after you have justified it by saying it will be good for your commute, all the logistics that you need to be conscious for can still be carried out. In this case it’s easy to almost-optimize something consciously without admitting that you want it. Maybe most cases are like this.

Also note that this model seems to be in conflict with the model of human reasoning as basically involving implicit urges followed up by rationalization. And sometimes at least, my explicit reasoning does seem to find innovative ways to fulfill my spontaneous urges. For instance, it suggests that if I do some more work, then I should be able to eat some cookie dough. One might frame this as conscious reasoning merely manipulating laziness and gluttony to get a better deal for my explicit values. But then rationalization would say that. I think this is ambiguous in practice.

Robin Hanson responds to my question by saying there are not even two sets of values here to conflict, but rather one which sometimes pretends to be another. I think it’s not obvious how that is different, if pretending involves a lot of carrying out what an agent with those values would do.

An important consideration is that a lot of innovation is done by people other than those using it. Even if explicit reasoning helps a lot with innovation, other people’s explicit reasoning may side with your inchoate hankerings. So a big question is whether it’s easier to sell weaponry to implicit or explicit values. On this I’m not sure. Self-improvement products seem relatively popular, and to be sold directly to people more often than any kind of products explicitly designed to e.g. weaken willpower. However products that weaken willpower without an explicit mandate are perhaps more common. Also much R&D for helping people reduce their self-control is sponsored by other organizations, e.g. sellers of sugar in various guises, and never actually sold directly to the customer (they just get the sugar).

I’d weakly guess that explicit values will win the war. I expect future people to have better self-control, and do more of what they say they want to do. However this is partly because of other distinctions that implicit and explicit values tend to go along with; e.g. farsighted vs. not. It doesn’t seem that implausible that implicit urges really wear the pants in directing innovation.

The future of values

Humans of today control everything. They can decide who gets born and what gets built. So you might think that they would basically get to decide the future. Nevertheless, there are some reasons to doubt this. In one way or another, resources threaten to escape our hands and land in the laps of others, fueling projects we don’t condone, in aid of values we don’t care for.

A big source of such concern is robots. The problem of getting unsupervised strangers to carry out one’s will, rather than carrying out something almost but not quite like one’s will, has eternally plagued everyone with a cent to tempt such a stranger with. There are reasons to suppose the advent of increasingly autonomous robots with potentially arbitrary goals and psychological tendencies will not improve this problem.

If we avoid being immediately trodden on by a suddenly super-superhuman AI with accidentally alien values, you might still expect a vast new labor class of diligent geniuses with exotic priorities would snatch a bit of influence here and there, and eventually do something you didn’t want with the future we employed them to help out with.

The best scenario for human values surviving far into an era of artificial intelligences may be the brain emulation scenario. Here the robot minds start out as close replicas of human minds, naturally with the same values. But this seems bound to be short-lived. It would likely be a competitive world, with strong selection pressures. There would be the motivation and technology to muck around with the minds of existing emulations to produce more useful minds. Many changes that would make a person more useful for another person might involve altering that person’s values.

Regardless of robots, it seems humans will have more scope to change humans’ values in the future. Genetic technologies, drugs, and even simple behavioral hacks could alter values. In general, we understand ourselves better over time, and better understanding yields better control. At first it may seem that more control over the values of humans should cause values to stay more fixed. Designer babies could fall much closer to the tree than children traditionally have, so we might hope to pass our wealth and influence along to a more agreeable next generation.

However even if parents could choose their children to perfectly match their own values, selection effects would determine who had how many children – somewhat more strongly than they can now – and humanity’s values would drift over the years. If parents also choose based on other criteria – if they decide that their children could do without their own soft spot for fudge, and would benefit from a stronger work ethic – then values could change very fast. Or genetic engineering may just produce shifts in values as a byproduct. In the past we have had a safety net because every generation is basically the same genetically, and so we can’t erode what is fundamentally human about ourselves. But this could be unravelled.

Even if individual humans maintain the same values, you might expect innovations in institution design to shift the balance of power between them. For instance, what was once an even fight between selfishness and altruism within you could easily be tipped by the rest of the world making things easier for the side of altruism (as they might like to do, if they were either selfish or altruistic).

Even if you have very conservative expectations about the future, you probably face qualitatively similar changes. If things continue exactly as they have for the last thousands of years, your distant descendants’ values will be as strange to you as yours are to your own distant ancestors.

In sum, there is a general problem with the future: we seem likely to lose control of a lot of it. And while in principle some technology seems like it should help with this problem, it could also create an even tougher challenge.

These concerns have often been voiced, and seem plausible to me. But I summarize them mainly because I wanted to ask another question: what kinds of values are likely to lose influence in the future, and what kinds are likely to gain it? (Selfish values? Far mode values? Long term values? Biologically determined values?)

I expect there are many general predictions you could make about this. And as a critical input into what the future looks like, future values seem like an excellent thing to make predictions about. I have predictions of my own; but before I tell you mine, what are yours?

Which stage of effectiveness matters most?

Many altruistic endeavors seem overwhelmingly likely to be ineffective compared to what is possible. For instance building schools, funding expensive AIDS treatment, and raising awareness about breast cancer and low status.

For many other endeavors, it is possible to tell a story under which they are massively important, and hard to conclusively show that we don’t live in that story. Yet it is also hard to make a very strong case that they are better than a huge number of other activities. For instance, changing policy discourse in China, averting rainforest deforestation or pushing for US immigration reform.

There are also (at least in theory) endeavors that can be reasonably expected to be much better than anything else available. Given current disagreement over what fits in this category, it seems to either be empty at the moment, or highly dependent on values.

An important question for those interested in effective altruism is whether most of the gains from effectiveness are to come from people who support the obviously ineffective endeavors moving to plausibly effective ones, or from people who support the plausibly effective endeavors moving to the very probably effective ones.

One reason this matters is that the first jump requires hardly any new research about actual endeavors, while the second seems to require a lot of it. Another is that the first plan involves engaging quite a different demographic to the second, and probably in a different way. Finally, the second plan requires intellectual standards that can actually filter out the plausible endeavors from the very good ones. Such standards seem hard to develop and maintain. Upholding norms that filter terrible interventions from plausible ones is plenty of work, and probably easier.

My own intuition has been that most of the value will come from the second possibility. However I suspect others have the opposite feeling, or at least aim to exploit the first possibility more at the moment. What do you think? Is the distinction even just?

Why would evolution favor more bad?

Sometimes people argue that pain and suffering should be expected to overwhelm the world because bad experiences are ‘stronger’ in some sense than good ones. People generally wouldn’t take five minutes of the worst suffering they have ever had for five minutes of the best pleasure (or so I’m told). An evolutionary explanation sometimes given is that the things that happen to animals tend to be mildly beneficial for them most of the time, then occasionally very bad. For instance, eating food is a bit good, but one meal won’t guarantee you evolutionary success. If on the other hand someone else eats you, you have lost pretty badly.

This seems intuitively plausible. Many processes have the characteristic that you can add more bricks and gradually reach your goal, but taking away a brick causes the whole thing to crumble. However good and bad outcomes are relative. If you see a snake and are deciding whether to go near it or not, there is a worse outcome of it biting you, and a better outcome of it not biting you. The good outcome here is super valuable, even if it doesn’t buy you immediate evolutionary success. It is just as important for you to get the good outcome as for you to not get the bad outcome. So what exactly do we mean by bad outcomes being worse than good outcomes are good? It seems we are judging outcomes relative to some default. So we need an explanation for why the default is where it is.

I think the most obvious guess is that the default is something like expectations, or ‘business as usual’. If you generally expect to go through your morning not being killed, then the avoiding being bitten by the snake option is close to neutral, whereas the being bitten option is very bad. But if the default is expectations, then the expected badness and the expected goodness should roughly cancel out – if suffering just tends to be stronger, then it should also tend to be rare enough to cancel. So on this model you shouldn’t expect life to be net bad especially.

At least on this model the badness and goodness should have cancelled out in the evolutionary environment. Our responses to good and bad situations don’t seem to change with our own expectations that much – even if you have been planning to go to the dentist for months, and it isn’t as bad as you thought, it can still be pretty traumatic. So you might think the default is fairly stable, and after we have been pushed far from our evolutionary environment, joy and suffering could be out of balance. Since we have been the ones pushing ourselves from the evolutionary environment, you might think we have been pushed basically in the direction of things we like (living longer, avoiding illness and harsh physical conditions, minimizing hard labor). So you might expect it is out of balance in the direction of more joy.

This story has some gaps. Why would we experience positive and negative emotions relative to rough expectations? Is the issue really expectations, or just something that looks a bit like that? To answer these questions one would seem to need a much better understanding of the functions of emotional reactions than I have. For now though, a picture where positive and negative emotions were roughly equal in some sense at some point seems plausible, and on that picture, I expect they are now net positive for humans, and roughly neutral for animals (by that same measure). This contributes to my lack of concern for both wild animal suffering, and the possibility that human lives are broadly not worth living.

There are many further issues unresolved. The notion that pleasure and pain should be roughly balanced for some reason is given much of its intuitive support by the observation that they are close enough that which is greater seems somewhat controversial. But perhaps net pleasure and pain only seem to be broadly comparable because humans are bad at comparing things, especially nebulous things. It is not uncommon to be both unclear on whether to go to school A or school B, and also unclear on whether you should go to school A with $10,000 or school B. Another issue is whether the measure by which there were similar amounts of pleasure and suffering actually aligns with your values. Perhaps positive and negative emotions use similar amounts of total mental energy, but mental energy translates to experiences you like more efficiently than to ones you don’t. Another concern is whether animals in general should be in such an equilibrium, or whether perhaps only animals that survive should, and all the offspring produced that die immediately don’t come into the calculus and can just suffer wantonly.

I think it is hard to give a conclusive account of this issue at the moment, but as it stands I don’t see how evolutionary considerations suggest we should expect bad feelings to dominate.

Writers as scientists

Matthew Yglesias a while back on Quine’s Word and Object:

It’s tempting (and conventional) to imagine language working neatly through such correspondences. Each word refers to some object in the world; each sentence describes a fact. Quine’s somewhat fanciful speculations on radical translation serve to undermine this account of meaning. Language is a social phenomenon, and languages are social practices with no guarantee of such direct correspondences. Quine observes that if we hear of a place where the local inhabitants describe pelicans as their half-brothers, it would be foolish to interpret this as a sign of profound genetic misunderstanding on their part. Instead, we see that their words don’t quite line up with ours, and a concept exists that somehow refers to half-brothers and pelicans alike.

[...] Those of us who try to describe the world for a living aren’t just poor handmaidens of those who try to uncover the truth about it… Rather, the process of description is the process of discovery. Language and science are, together, a joint process of discovery. Quine uses the phrase “ontic decision” to bypass the traditional question of what kinds of things are “real” as opposed to merely nominal. As he puts it, “The quest of a simplest, clearest overall pattern of canonical notation is not to be distinguished from a quest of ultimate categories, a limning of the most general traits of reality.” To paraphrase loosely — no doubt a bit too loosely for the tastes of one of the most precise writers I’ve ever read — a writer’s search for better, clearer, more concise descriptions of what we know is fundamentally of a piece with the searches for new knowledge.

I agree that a writer’s choosing concepts to describe things is in some ways basically similar to the part of science which involves ‘explaining’, or finding simple theories to describe the complexity we see. Both activities can be about spotting regularities in the messy world and re-describing what’s left of the mess with those patterns factored out. However it seems to me that writers’ efforts along these lines are badly constrained by conflicting rules. A scientist might notice that everything falls at the same rate, so they might name this ‘gravity’ and in future just say that all the stuff they describe falls with gravity, instead of describing each trajectory individually. A social commentator might notice many situations have in common an “irresistible force that draws you back to bed, or toward any mattress, couch, or other soft horizontal surface”, after which they need only mention ‘bed gravity’. Or more poetically, a writer may notice a similar pattern in the behavior of a certain man and that of a disease, and say “He is sooner caught than the pestilence, and the taker runs presently mad”.

Notice that ‘more poetic’ here goes with a much narrower categorization. One man’s behavior is like a disease, whereas ‘bed gravity’ links a large range of familiar situations. ‘Gravity’ is an even more widespread pattern. This points to what I think is a difference between writers and scientists: scientists are actually looking for patterns that apply most generally, whereas a writer is cliched or dull if they aim to use the same metaphor again and again. Or worse, to re-use metaphors many others have used before them. They often can’t even use the same word or phrase their recent self has used before them without it sounding weird.

If writers were in charge of science they would say ‘you can’t put this whirlpool down to the Coriolis effect too! You just used it for the jet streams and boundary currents. And it’s so hackneyed already!’ Some writers do what scientists do, and look for ways to describe reality concisely. But usually I think this is a topic for their writing: to say it is part of writing is like saying that gardening is a part of writing by virtue of there being books about it.

This is all mostly for novel categorizations and phrases. Writers do get away with re-using English words and common phrases. And using words presumably involves a bit of changing the meanings of the concepts around the edges as one goes. This is similar to the part of science where a scientist sees a particular flower and says, ‘ah, that’s a marigold’, perhaps ever so gradually thereby shifting exactly what it means to be a marigold.

Obvious points

As mentioned previously, pointing out obvious things seems embarrassing to me. However, it also often seems very valuable. That might seem obvious to you. Even so, this post will elaborate on this obvious point.

The set of things an intellectual would like to claim are obvious will tend to be much larger than the set that is reliably casually inferable by a random person with three minutes to devote to the issue. It is probably even much larger than the set of things reliably inferable by that intellectual earlier in their life. Many questions have obvious answers, while the questions themselves are not obvious. Many questions are obviously important once you notice them, but were not salient beforehand. Many points are obvious intellectually, yet not automatically integrated into one’s worldview and actions. And arguably, the more important and true and valuable a point, the more likely it is to look obvious once you know it.

I sometimes think of considerations that are so obvious to me now that I can barely articulate the converse, yet which it seems I must have been unaware of when younger. In general, if something is too incoherent to articulate, this seems like a strong mark against its appropriateness as a focus of discussion. Its falsity is probably obvious. So I’m not very inclined to write blog posts about such topics. Yet it would usually have been very valuable for my younger self to read such a post – I’d guess more than hundreds of times as valuable as it is costly for me to write such posts, which is much worse again than the cost to more knowledgeable readers of seeing a discussion of something they already knew. And unless I am unusually dense (in which case my blogging strategy seems unimportant), others probably make similar errors to the ones I seem to have made. So it seems probably socially beneficial to write posts about points as obvious as those.

If writing obvious things is costly to the author, does it matter much that it is socially beneficial? It makes more difference than you might suppose: if the author endorses writing socially beneficial obvious things, then when others see the author writing obvious things, they should be less inclined to infer that the author thought the point was non-obvious (as long as endorsing this coincides at all with writing things that seem obvious, which appears plausible). On that note then, I just wanted to say how important I think writing obvious things is.

The landscape of altruistic interventions

Suppose you want to figure out what the best things to do are. One approach is to start by prioritizing high level causes: is it better broadly to work on developing world health, or on technological development? Then you can work your way downwards: is it better to work on treating infectious diseases or on preventative measures? Malaria or HIV? Direct bed-net distribution or political interventions? Which politician? Which tactic? Which day?

This should work well if the landscape of interventions is kind of smooth – if the best interventions are found with the pretty excellent interventions, which are in larger categories with the great interventions, etc. This approach might work well for finding a person who really likes hockey for instance. The extreme hockey lovers will be found with the fairly enthusiastic hockey lovers, who will probably ultimately be in countries of hockey lovers. It should not on the other hand work very well for finding the reddest objects in your house – the most red thing is not likely to be in the room which has the most overall red. Which of these is more similar to finding good altruistic interventions?

This method would work well for finding the reddest things in your house if the redness of things was influenced a lot by color of the lights, and you had very different colored lights throughout your house. Similarly, if most of the variation in value between different altruistic interventions comes from general characteristics of high level causes, we should expect this method to work better there. You might also expect it to work well if the important levels could be mixed and matched – if the best high level cause could be combined with the best generic method of pursuing a cause, and done with the best people. These things seem plausible to me in the case of altruistic interventions, but I’m not really sure. What do you think?

High level climate intervention considerations

I’ve lately helped Giving What We Can extend their charity evaluation to climate change mitigation charities. This is a less abridged draft of a more polished post up on their blog.

Suppose you wanted to prevent climate change. What methods would get you the most emissions reduction for your money?

GWWC research has recently tried to answer this question, with a preliminary investigation of a number of climate change mitigation charities. Another time, I’ll discuss our investigation and its results in more detail. This time I’m going to tell you about some of the high level arguments and considerations we encountered for focusing on some kinds of mitigation methods over others.

The binding budget consideration

The world’s nations have been trying to negotiate agreements, limiting their future emissions in concert. The emissions targets chosen in such agreements are intended to sum up to meet a level deemed ‘safe’. Suppose some day such agreements are achieved. It seems then that any emissions you have reduced in advance will just be extra that someone will be allowed to emit after that agreement.

This argument implies political strategies are better than more direct means of reducing emissions. In particular, political strategies directed at causing such an agreement to come about.

This argument may sound plausible, but note that it relies on the following assumptions:

  1. the probability of such an agreement being formed is not substantially altered by prior emissions reductions

  2. the emissions targets set in such an agreement are not sensitive to the cost of achieving them

  3. such targets will be met, or we will fail to meet them by a similar margin regardless of how far we begin from them.

None of these is very plausible. Agreement seems more likely if it will be cheaper for the parties to uphold, or if it is more expensive to have no agreement. These are both altered by prior emissions reductions. There is no threshold of danger at which targets will automatically be set; more expensive targets are presumably less likely to be chosen. Two degrees is especially likely due to past discussions; however, as it becomes harder to meet, it becomes less likely to be retained as the goal. The further we begin from the targets we set, the less likely we are to attain them. Overall, it seems unclear whether reducing emissions by a tonne yourself will encourage more or less abatement through future large scale agreements. Either way, it is probably not a large effect. Consequently no adjustment is made for this consideration in our analysis.

Correcting feedback adjustments

Suppose you protect a hectare of rainforest from being felled. The people who would have bought the wood still want wood though, so the price of wood increases a little. This encourages others to fell their forests a little more, canceling some of your gains.

This is how prices work in general: when you buy something, the world makes a bit more of that thing, but not as much as you bought. If you buy a barrel of oil and bury it, you reduce the total oil to be burned, but by less than one barrel. Others respond to the higher price of oil after you buy some by drilling for more.

These considerations are real, and well known by economists. The big question is, how much do these feedbacks reduce the effect of your efforts?

This depends on what are known as the ‘price elasticity of supply’ and the ‘price elasticity of demand’. These measure how much more wood is harvested if the price of wood goes up by one percent, and how much more wood is wanted if the price goes down by one percent. Let’s call these ES and ED. If you ‘buy’ one unit of forest and keep it from being logged, the reduction in logged forest is ED/(ED + ES). Supply and demand elasticities are known for many items. If we can’t find these figures however, we may estimate ES and ED to be roughly equal, so estimate the real effect of reducing logging to be half of what it first seems.
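As a rough illustration of the arithmetic above, here is a minimal Python sketch of the ED/(ED + ES) formula. The function name and the example elasticity values are hypothetical, chosen only for illustration, and ED is treated as a positive magnitude, as in the text.

```python
def net_reduction_fraction(es: float, ed: float) -> float:
    """Fraction of a protected unit that is a real reduction: ED / (ED + ES).

    es: price elasticity of supply (non-negative)
    ed: price elasticity of demand, as a positive magnitude
    """
    if es < 0 or ed < 0:
        raise ValueError("elasticities must be non-negative magnitudes")
    if es + ed == 0:
        raise ValueError("at least one elasticity must be positive")
    return ed / (ed + es)


# Equal elasticities: half of the apparent effect survives.
print(net_reduction_fraction(1.0, 1.0))  # 0.5
# Hypothetical case where supply responds more strongly than demand.
print(net_reduction_fraction(3.0, 1.0))  # 0.25
```

The more readily others can expand supply in response to a higher price (larger ES), the smaller the share of your protected forest that counts as a real reduction.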

Many other kinds of correcting feedbacks work in a similar way. If you reduce carbon emissions by a tonne, everyone else will be a tiny bit less concerned about climate change in expectation, and make a tiny bit less effort to prevent it. If you put an extra tonne of carbon dioxide in the atmosphere, plants and the oceans will absorb carbon dioxide a tiny bit faster, so the total added to the atmosphere will be less than a tonne.

The selfish tech concern

New technologies could greatly aid climate change mitigation. Unlike with many other approaches however, private businesses have large economic incentives to pursue innovation projects. This is often seen as reason to avoid paying for technological progress: if you didn’t donate, businesses would do it anyway. Plus they have probably already taken the good opportunities.

The truth appears to be quite the opposite. Suppose we break projects up into two categories: those that have attracted some private investment, and those that have not. A random project from the first category is actually likely to be better than a project from the second category.

Self-interested companies will invest in clean energy research until the costs exceed the private benefits (the gains that return to them, instead of everyone else). This means at the point that they stop, you know that the costs and the private gains are about equal. If you buy more at this point, to get public gains, on the margin this is close to free for you because private gains almost cancel the costs.

For a random project without private investment, you just know that the private gains are somewhere below the costs. Probably they are far below, so it is substantially more expensive. This could be made up for if it had larger public benefits, but there seems little reason to expect this. In particular, if private and public gains are correlated, you would not expect this. In general, funding extra work on self-interested projects will be more effective than funding projects that only altruists ever cared for.

The worthless tonne concern

What if you can only reduce carbon emissions by a single puny tonne? Or if you have a project to reduce emissions, but it can’t get to the ‘heart of the problem’, merely make a small dent cheaply then run out of steam?

Many people feel that since climate change is a very big problem, contributing a small amount to its solution is not worth much, compared to completely solving a proportionally smaller problem, such as one person’s illness. If you contribute a tiny bit, other people may not contribute the rest of what is needed to solve the problem. Or China might increase its emissions so much as to dwarf reduction efforts in your country. A common sense is that your efforts have then been wasted.

This would be true if the amount of carbon in the atmosphere didn’t make much difference except at a threshold. That is, if ‘solving climate change’ was worth a lot, while ‘almost solving climate change’ was worth little.

This is not the situation we are in. Firstly, as far as we know the costs from climate change don’t come at big thresholds like that – each extra bit of carbon dioxide in the atmosphere makes climate change a bit worse. ‘Safety’ targets such as two degrees do not signify steep changes in harm. They are lines chosen to represent costs ‘too large’ by some agreed standards, to focus mitigation efforts.

Secondly, even if there were steep thresholds, we don’t know where they are, which makes reducing emissions on the margin as good in expectation as if there weren’t thresholds, though more chancy. Often your effort will do nothing, while sometimes it does everything. This is similar to running for a bus which leaves at an unknown time – at many times your running won’t help, but sometimes it will make all the difference. Overall, if you run a bit more you’re a bit more likely to catch the bus.
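The expectation claim can be checked with a small sketch. It assumes, purely for illustration, that there is a single threshold, that it is equally likely to sit at any whole tonne from 1 to 1000, and that crossing it causes a fixed harm of 1000 units. None of these numbers come from any climate model.

```python
from fractions import Fraction

THRESHOLDS = range(1, 1001)  # hypothetical: threshold equally likely at 1..1000 tonnes
HARM = 1000                  # hypothetical harm if the threshold is crossed


def expected_harm(emissions):
    """Expected harm when the threshold's position is unknown (uniform prior)."""
    crossed = sum(1 for t in THRESHOLDS if emissions >= t)
    return Fraction(crossed, len(THRESHOLDS)) * HARM


def benefit_of_one_tonne(emissions):
    """Reduction in expected harm from emitting one tonne less."""
    return expected_harm(emissions) - expected_harm(emissions - 1)


# The marginal benefit is the same whether total emissions are low or high.
print(benefit_of_one_tonne(10), benefit_of_one_tonne(500), benefit_of_one_tonne(900))
```

Under these assumptions each tonne avoided is worth the same in expectation wherever you start, just as if harm rose smoothly with emissions. A prior that is not uniform would make some tonnes worth more than others, but no tonne would be worthless.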

So a tonne of reduced emissions is worth about as much whether it is the only tonne you contribute, or one of millions.

Hidden help complications

Suppose a charity tries to shut down coal plants, and coal plants are indeed shut down. This is not strong evidence that the charity has achieved anything. Other charities may also have been trying to shut down coal plants, and coal plants close for many reasons. On the other hand, the charity may have made many other power plants more likely to close, which you don’t see because they in fact stayed open. How can you say how much good this charity has done?

There is not a simple answer. You will want to find a way to estimate what would have happened otherwise. You will need to decide whether to credit a charity with the difference in probability of outcomes they seem to have caused, or with what actually happened. The former avoids extra randomness and better counts the effort that you want, while the latter is much easier to measure, and harder to manipulate. Another question is whether to credit charities with the marginal or average value of contributing to a project alongside other charities, or something else. For instance, if the first charity working on something makes a large difference, but each added charity helps less, do you divide the gains between them, credit each with almost nothing, or credit each successive one with less?
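The three crediting options in that last question can be laid side by side in a short sketch. The function name and the numbers are hypothetical: three charities, with total good done of 100, 150 and 175 units as each one joins.

```python
def credit_schemes(total_value_by_count):
    """Compare ways of crediting charities working on the same project.

    total_value_by_count[n] is the total good done when n charities work on it
    (index 0 must be the value with no charities). All numbers are illustrative.
    """
    n = len(total_value_by_count) - 1
    total = total_value_by_count[n]
    # Divide the gains equally between the charities (average value).
    average = [total / n] * n
    # Credit each with what would be lost if it alone left (marginal value).
    marginal = [total - total_value_by_count[n - 1]] * n
    # Credit each successive charity with what it added on arrival.
    successive = [
        total_value_by_count[i] - total_value_by_count[i - 1]
        for i in range(1, n + 1)
    ]
    return {"average": average, "marginal": marginal, "successive": successive}


schemes = credit_schemes([0, 100, 150, 175])
for name, credits in schemes.items():
    print(name, credits)
```

Only the first and third schemes add up to the total good done. Marginal crediting hands out far less than the total when returns diminish, which is the "credit each with almost nothing" case.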

The unruly future consideration

Suppose you reduce emissions by stopping some forest from being logged. Even if you do a good job of this, it might be hard to protect it from being logged in fifty years. You have bought the people in the future the option of continuing to lock up the carbon, but circumstances and economic incentives will be different, and it’s not clear whether they will take it. If the forest is logged in fifty years, you will have basically delayed some climate change for fifty years, ignoring e.g. short term emissions exacerbating feedbacks and producing more emissions.

Thus protecting the forest reduces most of the harm it appears to in the short term, but an increasingly small fraction of harms moving into the future, as the cumulative probability that it will be logged rises. How much this is worth overall depends on where the harms are concentrated. The increasing costs of the climate moving further from what we are used to suggest that harms will be concentrated in the further future. But wealth, technological progress and adaptation push hard in the other direction. Also, people are more likely to continue your mitigation in cases where climate change turns out to be worse in the future. I am not sure of the overall effect. This consideration could erode a large fraction of the value of a mitigation project.
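A rough sketch shows how much the answer depends on the timing of harms. It assumes a constant 2% chance each year that the forest is logged, and two invented harm profiles over a century, one weighted toward the near future and one toward the far future. All of these are illustrative assumptions, not estimates.

```python
def fraction_of_value_kept(harm_by_year, annual_logging_prob):
    """Share of the apparent benefit that survives the risk of later logging.

    harm_by_year[t] is the harm avoided in year t if the forest still stands.
    The forest is assumed to survive to year t with probability (1 - p) ** t.
    """
    total = sum(harm_by_year)
    kept = sum(
        harm * (1 - annual_logging_prob) ** year
        for year, harm in enumerate(harm_by_year)
    )
    return kept / total


YEARS = 100
harms_mostly_early = [YEARS - t for t in range(YEARS)]  # hypothetical: harms shrink over time
harms_mostly_late = [t + 1 for t in range(YEARS)]       # hypothetical: harms grow over time

early_fraction = fraction_of_value_kept(harms_mostly_early, annual_logging_prob=0.02)
late_fraction = fraction_of_value_kept(harms_mostly_late, annual_logging_prob=0.02)
print(early_fraction, late_fraction)
```

The project keeps more of its apparent value when harms are concentrated early than when they are concentrated late. The sketch leaves out the point that people may be more likely to keep protecting the forest if climate change turns out worse.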


These have been some of the issues considered in our quest to find the best organizations for turning dollars into reduced greenhouse emissions. If our analyses of them are adequate, next time we will bring you the finest climate change charities a brief investigation can find.


An illicit theory of costly signaling

I’m sympathetic to the view that many human behaviors are for signaling. However so far it doesn’t seem like a very tight theory. We have a motley pile of actions labeled as ‘maybe signaling’, connected to a diverse range of characteristics one might want to signal. We have a story for why each would make sense, and also why lots of behaviors that don’t exist would make sense. However I don’t know why we would use the signals we do in particular, or why we would particularly signal the characteristics that we do. When I predict whether a middle class Tahitian man would want to appear to his work colleagues as if he was widely traveled, and whether he would do this by showing them photographs, my answers are entirely based on my intuitive picture of humans and holidays and so on; I don’t see how to derive them from my theory of signaling. Here are two more niggling puzzles:

Why would we use message-specific costly signals for some messages, when we use explicit language + social retribution for so many others?

Much of the time when you speak to others, your values diverge from theirs at least a little. Often they would forward their own interests best by deceiving you, ignoring social costs and conscience. But even in situations where risks from their dishonesty are large, your usual mode of communication is probably spoken or written language.

This is still a kind of costly signaling, as long as the person faces the right threats of social retribution. Which they usually do I think. If a person says to you that they have a swimming pool, or that they write for the Economist, or that your boyfriend said you should give his car keys to them, you will usually trust them. You are usually safe trusting such claims, because if someone made them dishonestly they could expect to be found out with some probability, and punished. In cases where this isn’t so – for instance if it is a stranger trying to borrow your boyfriend’s car – you will be much less trusting accordingly.

This mode of costly signaling seems very flexible – spoken language can represent any message you might want to send, and the same machinery of social sanctions can be used to guard many messages at once. And we do use this for a lot of our communication. So why do we use different one-off codes for some small class of messages? What sets that class apart?

The main obvious limitation of language + social sanctions is that it requires a threat of social retribution large enough to discourage lying. This might be hard to arrange, if for instance there are very large gains from lying, if lies are hard to find, or if the person who might lie doesn’t rely on good relationships with the people who might be offended by the lies. So maybe we use message-specific costly signaling in those cases?

In many of those cases we do use a kind of costly signal, yet a different variant again to the kind hypothesized to covertly pervade human interactions. This type of signal is the explicit credential. When a taxi-driver-to-be takes a driving test or has a background check, then displays his qualifications, this is a signaling display. Acquiring these documents is much cheaper for a person who can drive and has a clean background, and you (or the taxi company) know this and treat him differently if he makes these signals. I say this seems different from the social signaling we usually think of because it is explicitly intended as a signal, and everyone readily accepts that that is the goal, and is fine with it. In conclusion, it’s not clear whether the signaling that we usually think of as such mostly occurs in situations where language and social sanctions are hard to use, but it is at least not the only thing used in such cases. Which almost brings me to the next puzzle.

Why is signaling seen as bad? Why don’t we know about our own signaling?

It is often taken as given that signaling is bad. If a person comes to believe that a behavior they once partook in is for signaling, it is not unusual for them to give it up on those grounds alone, without even noticing the step of inference required between ‘is for signaling’ and ‘is bad’. A signaling theory is apparently a cynical theory.

This seems odd, as badness is not implied at all by the theoretical costly signaling model. There, signaling can be bad or good socially, depending on the costs of carrying it out. There are gains from assorting people well – it is better if the good people do the important jobs for instance – but no guarantee that the costs of the fight won’t overwhelm the gains.

Another related oddity is that people are supposed to be mostly unaware that they are signaling. Nobody bats an eyelid when a person claims to realize that they were doing a thing for signaling in the past. Talk of signaling is full of ‘Maybe I’m just doing this for signaling, but …’. Yet in the naive model of human psychology, it is at least a bit odd to be unaware of your motives in taking an action until months later. It’s true that people quite often don’t appear to have a good grasp of their own drives, yet in signaling this seems to be the normal expectation. And again, the theoretical model of costly signaling says nothing of this. It’s not obvious why you should expect this at all, given that model.

Another reason this seems strange is that we do have a lot of other explicit forms of signaling that we are aware of and ok with, as mentioned above (qualifying tests, ID cards, licenses). It is not that we have a problem with spending effort on almost-zero-sum games, or paying costs to look good.

An explanation

I’d like to suggest an explanation: costly signaling (of the message-specific unconscious variety) is largely used to communicate illicit messages. For instance, many messages about one’s own wealth, accomplishments, status, or sexual situation, and other messages about social maneuvering and judgement, seem to be illicit. Such things are also common targets of signaling theories, though my reasons for suggesting this explanation are mostly theoretical.

Illicit messages can’t be honestly transmitted using language and social norms, for a few reasons. Illicit things often shouldn’t be said explicitly, for plausible deniability, to avoid common knowledge, etc. This means you generally can’t use language to communicate illicit things, because language is explicit. This is one reason language + social retribution doesn’t work well for illicit messages. But also if you successfully have plausible deniability or prevent the message spreading far, both of these make social retribution hard to arrange. So illicit messages are quite hard to make honest through language + social retribution. Or through explicit verification for that matter, which is similar. Yet if such messages are to be listened to at all, they need some other guarantee, which other kinds of non-explicit costly signaling can provide. So this would explain the first puzzle.

If we had a set of signals just for illicit messages, it would be very silly to claim that we were aware of sending such things, and perhaps upsetting to believe that we were and to lie about it. So for the usual reasons that people are thought to be unaware of their less desirable tendencies, it wouldn’t be surprising if people were unaware of the signals they were sending. And if such signals were largely used for illicit messages, it would be unsurprising if we universally thought of signaling as an illicit activity. So this would explain the second puzzle.

An unusual counterargument

Oftentimes, the correct response to an argument is ‘your argument appears after cursory investigation to make sense, however the fact that many smart people have never mentioned this to me suggests that there are good counterarguments, so I remain unconvinced’.

I basically never hear this response, which suggests that there are good counterarguments. Or alternatively that it is unappealing to respond accurately in such cases. The latter seems very plausible, because it suggests one cannot assess any argument at the drop of a hat.

If so, what do people actually say instead? My guess is the first argument they can think of that points in the direction that seems right. This seems unfortunate, as the ensuing discussion of that counterargument that nobody believes can’t possibly resolve the debate, nor is of much interest to anyone.