Tag Archives: values

More convergent instrumental goals

Nick Bostrom follows Steve Omohundro in exploring the types of instrumental goals that intelligences with arbitrary ultimate goals might converge on. This is important for two reasons. First, it means predicting the behavior of arbitrary intelligences might be a tiny bit easier than you’d think. Second, it draws attention to the difficulty of creating a creature that doesn’t want to get mixed up in taking resources and seeking longevity and that sort of thing.

Between Nick and Steve we have these convergent instrumental goals:

  1. Self-preservation
  2. Preserve your values
  3. Self-improvement
  4. Rationality
  5. Other cognitive enhancement
  6. Technological perfection
  7. Get resources
  8. Avoid counterfeit utility

I think acquiring information is included in cognitive enhancement here, though to me it seems big and different enough that I’d put it by itself.

I’d like to add three more, incidentally all to do with interacting with other intelligences. They are not relevant if you are the one all-powerful singleton, but handy if there are a number of AIs which are of a similar enough level that they might be useful to one another (like sheep and humans are, or humans and humans are). Here’s my list:

Spread your values
This can be achieved by making other entities which have at least some of the same values, or more traditionally by buying advertising space, exercising social influence, or having children. This makes sense for the same reason that ensuring that you maintain your values in the future does. If you think of yourself as just your values (as Bostrom mentions might be appropriate for some kinds of intelligences), spreading values can come under the heading of self-preservation perhaps (or reproduction). But usually we don’t think like this. If I convince you that Bach is pretty good, we don’t think of this as me replacing part of you with part of me. So I think it’s worth having a different category.

Social influence
If you can’t get others to have your values, the next best thing is to get them to do what you want anyway. This can be done in many ways, e.g. deceit, deals, forming relationships with useful people, sending good signals, being likable. Perhaps manipulating others comes under perfection of technology: making friends is in a way like better programming your computer to do what you type in. Both can be seen as investing some effort in arranging a complex system such that in future you will get what you want from it with less input. But the analogy is not that close, so I give these things a new category. The extent to which it is worth influencing other intelligences depends on how useful they are. If the world only had me and sheep and stupider things in it, influencing sheep would be handy, but not nearly as handy as social skills are to me in this world. The next can be thought of as a subcategory of this one.

Acquire commonly valued goods
Nick and Steve mention acquiring ‘resources’, but from their elaboration it seems like they mean goods that are direct inputs to the intelligence’s goals. A goal instrumental to acquiring directly useful resources is acquiring stuff that is valuable to someone else. This apparently roundabout route is common because of comparative advantage. So while an AI would presumably like to have energy and fast hardware, it would also like a yacht or intellectual property rights for a new kind of frozen dinner, assuming anyone exists who will trade. With few assumptions, acquiring something like the latter will be easier than acquiring the former directly. This is pretty much what humans spend their working lives on.
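
To make the roundabout route concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the function name, the production rates and the price are made up, and it only compares two ways of ending up with the resource you actually want, rather than modelling comparative advantage in full.

```python
def best_route(hours, direct_rate, trade_good_rate, price_in_target):
    """Compare making a wanted resource directly with making a
    commonly valued good and trading it for that resource.

    All parameters are hypothetical:
      hours            -- time available
      direct_rate      -- units of the wanted resource made per hour
      trade_good_rate  -- units of the tradeable good made per hour
      price_in_target  -- units of the wanted resource received per
                          unit of the tradeable good
    """
    direct = hours * direct_rate
    via_trade = hours * trade_good_rate * price_in_target
    if via_trade > direct:
        return ("trade", via_trade)
    return ("direct", direct)


# An agent that is mediocre at generating energy but good at designing
# frozen dinners ends up with more energy by trading for it.
print(best_route(hours=10, direct_rate=1.0, trade_good_rate=2.0, price_in_target=3.0))
```

With these made-up numbers the trading route wins; if nobody exists who will trade (a price of zero), the direct route is all that is left.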

The appeal of fictional conflict

Robert Wiblin asks why stories celebrate conflict rather than compromise:

As I was watching the film Avatar and the cinemagoers around me were cheering on the Na’vi heroes in their fight against human invaders, I couldn’t help but wonder how many of us would actually want to live alongside such an uncompromising society…it is hardly an isolated case. In our stories we love idealistic heroes to fight for what they believe in against all odds…

We could tell stories of the countless political compromises reached through well-functioning democratic institutions. We could tell the stories of all the terrible wars that never happened because of careful diplomacy. We could tell the story of the merchant who buys low and sells high, leaving everyone they deal with a little better off. These are the everyday tales which make modern society so great to live in. But will any such movie gross a billion dollars in the near future? I suspect not.

Incidentally, the one line I still remember from Avatar:

They’re not going to give up their home — they’re not gonna make a deal. For what? Lite beer and shopping channel? There’s nothing we have that they want.

Nothing at all…oh, except control over the destruction of everything they care about. You’re right, you really have no bargaining power. As Rob elaborates further, the premise of the extreme conflict was so flimsy, one must infer that it was pretty important to have an extreme conflict.

Rob guesses the popularity of such stubborn warring in our stories is to do with what we subconsciously want our tastes to say about us. When we don’t pay the costs of fictional war, we may as well stand up for principles as strongly as possible.

I think he might be roughly right. But why wouldn’t finding good deals and balancing compromises well be ideals we would want to celebrate? When there are no costs to yourself, why aren’t you itching to go all out and celebrate the most extravagant tales of successful trading and extreme sagas of mutually beneficial political compromise?

I think because there is no point in demonstrating that you will compromise. As a default, everyone can be expected to compromise, because it’s the rational thing to do at the time. However it’s often good to look like you won’t easily compromise, so that other people will try to win you over with better deals. Celebrating ruthless adherence to idealistic principles is a way of advertising that you are insane, for the purpose of improving your bargaining position. If you somehow convince me that you’re the kind of person who would die fighting for their magic tree, I’ll probably try to come up with a pretty appealing deal for you before I even bring up my interest in checking out the deposits under any trees you have.

Of course the whole point of being a bloody-minded idealist is lost if you keep it a secret. So you probably won’t do that. Which means just not going out of your way to celebrate uncompromising fights to the death is a credible signal of willingness to compromise.

Motivation on the margin of saving the world

Most people feel that they have certain responsibilities in life. If they achieve those they feel good about themselves, and anything they do beyond that to make the world better is an increasingly imperceptible bonus.

Some people with unusual moral positions or preferences feel responsible for making everything in the world as good as they can make it, and feel bad about the gap between what they achieve and what they could.

In both cases people have a kind of baseline that they care especially about. In the first case they are usually so far above it that nothing they do makes much difference to their feelings. In the second case they are often so far below it that nothing they do makes much difference to their feelings.

Games are engaging when you have a decent chance at both winning and losing. Every move you make matters, so you long to make that one more move. 

I expect the same is true of motivating altruistic consequentialists. I’m not sure how to make achievements on the margin more emotionally salient, but perhaps you do?

What’s wrong with advertising?

These two views seem to go together often:

  1. People are consuming too much
  2. The advertising industry makes people want things they wouldn’t otherwise want, worsening the problem

The reasoning behind 1) is usually that consumption requires natural resources, and those resources will run out. It follows from this that less natural-resource intensive consumption is better* i.e. the environmentalist prefers you to spend your money attending a dance or a psychologist than buying new clothes or jet skis, assuming the psychologist and dance organisers don’t spend all their income on clothes and jet skis and such.

How does the advertising industry get people to buy things they wouldn’t otherwise buy? One practice they are commonly accused of is selling dreams, ideals, identities and attitudes along with products. They convince you (at some level) that if you had that champagne your whole life would be that much more classy. So you buy into the dream though you would have walked right past the yellow bubbly liquid.

But doesn’t this just mean they are selling you a less natural-resource-intensive product? The advertisers have packaged the natural-resource intensive drink with a very non-natural-resource intensive thing – classiness – and sold you the two together.

Yes, maybe you have bought a drink you wouldn’t otherwise have bought. But overall this deal seems likely to be a good thing from the environmentalist perspective: it’s hard to just sell pure classiness, but the classy champagne is much less resource intensive per dollar than a similar bottle of unclassy drink, and you were going to spend your dollars on something (effectively – you may have just not earned them, which is equivalent to spending them on leisure).

If the advertiser can manufacture enough classiness for thousands of people with a video camera and some actors, this is probably a more environmentally friendly choice for those after classiness than most of their alternatives, such as ordering stuff in from France. My guess is that in general, buying intangible ideas along with more resource intensive products is better for the environment than the average alternative purchase a given person would make.  There at least seems little reason to think it is worse.

Of course that isn’t the only way advertisers make people want things they wouldn’t otherwise want. Sometimes they manufacture fake intangible things, so that when you get the champagne it doesn’t really make you feel classy. That’s a problem with dishonest people in every industry though. Is there any reason to blame ‘advertisers’ rather than ‘cheats’?

Another thing advertisers do is tell you about things you wouldn’t have thought of wanting otherwise, or remind you of things you had forgotten about. When innovators and entrepreneurs do this we celebrate it. Is there any difference when advertisers do it? Perhaps the problem is that advertisers tend to remind you of resource intensive, material desires more often than they remind you to consume more time with your brother, or to meditate more. This is somewhat at odds with the complaint that they try to sell you dreams and attitudes etc, but perhaps they do a bit of both.

Or perhaps they try to sell you material goods to satisfy longings you would otherwise fulfil non-materially? For instance recommending new clothes where you might otherwise have sought self-confidence through posture or public speaking practice or doing something worthy of respect. Some such effect seems plausible, though I doubt a huge one.

Overall it seems advertisers probably have effects in both directions. It’s not clear to me which is stronger. But insofar as they manage to package up and sell feelings and identities and other intangibles, those who care for the environment should praise them.

*This is not to suggest that I believe natural resource conservation is particularly important, compared to using human time well for instance.

I am anti-awareness and you should be too

People seem to like raising awareness a lot. One might suspect too much, assuming the purpose is to efficiently solve whatever problem the awareness is being raised about. It’s hard to tell whether it is too much by working out how much is the right amount then checking if it matches what people do. But a feasible heuristic approach is to consider factors that might bias people one way or the other, relative to what is optimal.

Christian Lander at Stuff White People Like suggests some reasons raising awareness should be an inefficiently popular solution to other people’s problems:

This belief [that raising awareness will solve everything] allows them to feel that sweet self-satisfaction without actually having to solve anything or face any difficult challenges…

What makes this even more appealing for white people is that you can raise “awareness” through expensive dinners, parties, marathons, selling t-shirts, fashion shows, concerts, eating at restaurants and bracelets.  In other words, white people just have to keep doing stuff they like, EXCEPT now they can feel better about making a difference…

So to summarize – you get all the benefits of helping (self satisfaction, telling other people) but no need for difficult decisions or the ensuing criticism (how do you criticize awareness?)…

He seems to suspect that people are not trying to solve problems, but I shan’t argue about that here. At least some people think that they are trying to effectively campaign; this post is concerned with biases they might face. Christian may or may not demonstrate a bias for these people. All things equal, it is better to solve problems in easy, fun, safe ways. However if it is easier to overestimate the effectiveness of easy, fun, safe things, we probably raise awareness too much. I suspect this is true. I will add three more reasons to expect awareness to be over-raised.

First, people tend to identify with their moral concerns. People identify with moral concerns much more than they do with their personal, practical concerns for instance. Those who think the environment is being removed too fast are proudly environmentalists while those who think the bushes on their property are withering too fast do not bother to advertise themselves with any particular term, even if they spend much more time trying to correct the problem. It’s not part of their identity.

People like others to know about their identities. And raising awareness is perfect for this. Continually incorporating one’s concern about foreign forestry practices into conversations can be awkward, effortful and embarrassing. Raising awareness displays your identity even more prominently, while making this an unintended side effect of costly altruism for the cause rather than purposeful self advertisement.

That raising awareness is driven in part by desire to identify is evidenced by the fact that while ‘preaching to the converted’ is the epitome of verbal uselessness, it is still a favorite activity for those raising awareness, for instance at rallies, dinners and lectures. Wanting to raise awareness to people who are already well aware suggests that the information you hope to transmit is not about the worthiness of the cause. What else new could you be showing them? An obvious answer is that they learn who else is with the cause. Which is some information about the worthiness of the cause, but has other reasons for being presented. Robin Hanson has pointed out that breast cancer awareness campaign strategy relies on everyone already knowing about not just breast cancer but about the campaign. He similarly concluded that the aim is probably to show a political affiliation.

In many cases of identifying with a group to oppose some foe, it is useful for the group if you often declare your identity proudly and commit yourself to the group. If we are too keen to raise awareness about our identities, perhaps we are just used to those cases, and treat breast cancer like any other enemy who might be scared off by assembling a large and loyal army who don’t like it. I don’t know. But for whatever reason, I think our enthusiasm for increased awareness of everything is given a strong push by our enthusiasm for visibly identifying with moral causes.

Secondly and relatedly, moral issues arouse a person’s drive to determine who is good and who is bad, and to blame the bad ones. This urge to judge and blame should for instance increase the salience of everyone around you eating meat if you are a vegetarian. This is at the expense of giving attention to any of the larger scale features of the world which contribute to how much meat people eat and how good or bad this is for animals. Rather than finding a particularly good way to solve the problem of too many animals suffering, you could easily be sidetracked by the fact that your friends are being evil. Raising awareness seems like a pretty good solution if the glaring problem is that everyone around you is committing horrible sins, perhaps inadvertently.

Lastly, raising awareness is specifically designed to be visible, so it is intrinsically especially likely to spread among creatures who copy one another. If I am concerned about climate change, possible actions that will come to mind will be those I have seen others do. I have seen in great detail how people march in the streets or have stalls or stickers or tell their friends. I have little idea how people develop more efficient technologies or orchestrate less publicly visible political influence, or even how they change the insulation in their houses. This doesn’t necessarily mean that there is too much awareness raising; it is less effort to do things you already know how to do, so it is better to do them, all things equal. However too much awareness raising will happen if we don’t account for there being a big selection effect other than effectiveness in which solutions we will know about, and expend a bit more effort finding much more effective solutions accordingly.

So there are my reasons to expect too much awareness is raised. It’s easy and fun, it lets you advertise your identity, it’s the obvious thing to do when you are struck by the badness of those around you, and it is the obvious thing to do full stop. Are there any opposing reasons people would tend to be biased against raising awareness? If not, perhaps I should reconsider stopping telling you about this problem and finding a more effective way to lower awareness instead.

Ignorance of non-existent preferences

I often hear it said that since you can’t know what non existent people or creatures want, you can’t count bringing them into existence as a benefit to them even if you guess they will probably like it. For instance Adam Ozimek makes this argument here.

Does this absolute agnosticism about non-existent preferences mean it is also a neutral act to bring someone into existence when you expect them to have a net nasty experience?

Does it look like it’s all about happiness?

Humanity’s obsession with status and money is often attributed to a misguided belief that these will bring the happiness we truly hunger for. Would-be reformers repeat the worldview-shattering news that we can be happier just by being grateful and spending more time with our families and on other admirable activities. Yet the crowds begging for happiness do not appear to heed them.

This popular theory doesn’t explain why people are so ignorant after billions of lifetimes of data about what brings happiness, or alternatively why they are helpless to direct their behavior toward it with the information. The usual counterargument to this story is simply that money and status and all that do in fact bring happiness, so people aren’t that silly after all.

Another explanation for the observed facts is that we don’t actually want happiness that badly; we like status and money too even at the expense of happiness. That requires the opposite explanation, of why we think we like happiness so much.

But first, what’s the evidence that we really want happiness or don’t? Here is some I can think of (please add):

For “We are mostly trying to get happiness and failing”:

  • We discuss plans in life, even in detail, as if the purpose were happiness
  • When we are wondering if something was a good choice we ask things like ‘are you happy with it?’
  • Some things don’t seem to lead to much benefit but enjoyment and are avidly sought, such as some entertainment.
  • We seem by all accounts both motivated in and fine at getting happiness in immediate term activities – we don’t accidentally watch a TV show or eat chocolate for long before noticing whether we enjoy it. The confusion seems about long term activities and investments.

“We often aren’t trying to get happiness”:

  • The recent happiness research appears to have fuelled lots of writing and not much hungry implementation of advice. E.g. I’ve noticed no fashion starting up for writing down at night what you are grateful for. Have I just missed it?
  • Few people get a few years into a prestigious job, realize status and money don’t bring happiness, declare it all a mistake, and take up joyful, poor, low-status activities
  • Most things take less than years to evaluate
  • I don’t seem to do the things that I think would make me most happy.
  • It seems we pursue romance and sex at the expense of happiness often, incapable of giving it up in the face of anticipated misery. Status and money have traditionally been closely involved with romance and sex, so it would be unsurprising if we were driven to have them too in spite of happiness implications.
  • Most of the things we seek that make us happy also make us more successful in other ways. People are generally happy when they receive more money than usual, or sex, or a better job, or compliments. So the fact that we often seek things that make us happy doesn’t tell us much.
  • Explicitly seeking status, money and sex looks bad, but seeking happiness does not. Thus if we were seeking sex or status we would be more likely to claim we were seeking happiness than those things.
  • Many people accept that lowering their standards would make them happier, but don’t try to.
  • We seem, and believe ourselves to be, willing to forgo our own happiness often for the sake of ‘higher’ principles such as ethics

It looks to me like we don’t care only about happiness, though we do a bit. I suspect we care more about happiness currently and more about other things in the long term, thus are confused when long term plans don’t seem to lead to happiness because introspection says we like it.

Perfect principles are for bargaining

When people commit to principles, they often consider one transgression ruinous to the whole agenda. Eating a sausage by drunken accident can end years of vegetarianism.

As a child I thought this crazy. Couldn’t vegetarians just eat meat when it was cheap under their rationale? Scrumptious leftovers at our restaurant, otherwise to be thrown away, couldn’t tempt vegetarian kids I knew. It would break their vegetarianism. Break it? Why did the integrity of the whole string of meals matter?  Any given sausage was such a tiny effect.

I eventually found two explanations. First, it’s easier to thwart temptation if you stake the whole deal on every choice. This is similar to betting a thousand dollars that you won’t eat chocolate this month. Second, commitment without gaps makes you seem a nicer, more reliable person to deal with. Viewers can’t necessarily judge the worthiness of each transgression, so they suspect the selectively committed of hypocrisy. Plus everyone can better rely on and trust a person who honors his commitments with less regard to consequence.

There’s another good reason though, which is related to the first. For almost any commitment there are constantly other people saying things like ‘What?! You want me to cook a separate meal because you have some fuzzy notion that there will be slightly less carbon emitted somewhere if you don’t eat this steak?’ Maintaining an ideal requires constantly negotiating with other parties who must suffer for it. Placing a lot of value on unmarred principles gives you a big advantage in these negotiations.

In negotiating generally, it is often useful to arrange visible costs to yourself for relinquishing too much ground. This is to persuade the other party that if they insist on the agreement being in that region, you will truly not be able to make a deal. So they are forced to agree to a position more favorable to you. This is the idea behind arranging for your parents to viciously punish you for smoking with your friends if you don’t want to smoke much. Similarly, attaching a visible large cost – the symbolic sacrifice of your principles – to relieving a friend of cooking tofu persuades your friend that you just can’t eat with them unless they concede. So that whole conversation is avoided, determined in your favor from the outset.
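
The bargaining logic can be sketched as a toy model in Python. This is my own illustration under stated assumptions, not anything taken from Schelling: the function name, the integer offers and the penalty sizes are all hypothetical. The other party offers you a share, rejecting gets you nothing, and accepting any share below a threshold triggers a visible cost to you.

```python
def min_acceptable_offer(offers, threshold, penalty):
    """Return the smallest offer the responder would still accept.

    Hypothetical model: rejecting yields 0, and accepting an offer
    below `threshold` costs the responder `penalty` (for example the
    symbolic sacrifice of a principle). An offer is accepted only if
    its net payoff is strictly positive. Returns None if no offer
    is acceptable.
    """
    for offer in sorted(offers):
        cost = penalty if offer < threshold else 0
        if offer - cost > 0:
            return offer
    return None


shares = range(0, 11)  # possible shares of a surplus of 10

# With no commitment, a share of 1 is already better than nothing.
print(min_acceptable_offer(shares, threshold=0, penalty=0))

# A large cost for accepting less than 6 makes 6 the lowest offer
# worth making, so the other party has to start there.
print(min_acceptable_offer(shares, threshold=6, penalty=100))

# A small cost only shifts things a little: 3 now beats refusing.
print(min_acceptable_offer(shares, threshold=6, penalty=2))
```

The point of the sketch is only that the size and visibility of the cost do the work: a principle that is cheap to break buys little bargaining power.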

I used to be a vegetarian, and it was much less embarrassing to ask for vegetarian food then than it was afterward, when I merely wanted to eat vegetarian most of the time. Not only does absolute commitment get you a better deal, but it allows you to commit to such a position without disrespectfully insisting on sacrificing the other’s interests for a small benefit.

Prompted by The Strategy of Conflict by Thomas Schelling.

Romantic idealism: true love conquers almost all

More romantic people tend to be vocally in favor of more romantic fidelity in my experience. If you think about it though, faith in romance is not a very romantic ideal. True love should overcome all things! The highest mountains, the furthest distances, social classes, families, inconveniences, ugliness, but NOT previous love apparently. There shouldn’t be any competition there. The love that got there first is automatically the better one, winning the support and protection of the sentimental against all other love on offer. Other impediments are allowed to test love, sweetened with ‘yes, you must move a thousand miles apart, but if it’s really true love, he’ll wait for you’. You can’t say, ‘yes, he has another girlfriend, but if you really are better for him he’ll come back – may the truest love win!’.

Perhaps more commitment in general allows better and more romance? There are costs as well as benefits to being tied to anything though. Just as it’s not clear that more commitment in society to stay with your current job would be pro-productivity, it’s hard to see that more commitment to stay with your current partner would be especially pro-romance. Of course this is all silly – being romantic and vocally supporting faithfulness are about signaling that you will stick around, not about having consistent values or any real preference about the rest of the world. Is there some other explanation?


Why will we be extra wrong about AI values?

I recently discussed the unlikelihood of an AI taking off and leaving the rest of society behind. The other part I mentioned of Singularitarian concern is that powerful AIs will be programmed with the wrong values. This would be bad even if the AIs did not take over the world entirely, but just became a powerful influence. Is that likely to happen?

Don’t get confused by talk of ‘values’. When people hear this they often think an AI could fail to have values at all, or that we would need to work out how to give an AI values. ‘Values’ just means what the AI does. In the same sense your refrigerator might value making things inside it cold (or for that matter making things behind it warm). Every program you write has values in this sense. It might value outputting ‘#t’ if and only if it’s given a prime number for instance.
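
As a concrete case, here is a minimal sketch in Python of a program with exactly that ‘value’. The function names are hypothetical, trial division is just one simple way to test primality, and printing ‘#f’ for non-primes is my assumption (the text only says when ‘#t’ appears).

```python
def is_prime(n: int) -> bool:
    """Return True if and only if n is prime, using trial division."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2  # even divisors were ruled out above
    return True


def prime_valuer(n: int) -> str:
    """Output '#t' if and only if n is prime; '#f' otherwise (assumed)."""
    return "#t" if is_prime(n) else "#f"


for candidate in (1, 2, 9, 97):
    print(candidate, prime_valuer(candidate))
```

Nobody had to work out how to ‘give’ this program values; what it does is all there is to them.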

The fear then is that a super-AI will do something other than what we want. We are unfortunately picky, and most things other than what we want, we really don’t want. Situations such as being enslaved by an army of giant killer robots, or having your job taken by a simulated mind are really incredibly close to what you do want compared to situations such as your universe being efficiently remodeled into stationery. If you have a machine with random values and the ability to manipulate everything in the universe, the chance of its final product having humans and tea and crumpets in it is unfathomably small. Some SIAI members seem to believe that almost anyone who manages to make a powerful general AI will be so incapable of giving it suitable values as to approximate a random selection from mind design space.

The fear is not that whoever picks the AI’s goals will do so at random, but rather that they won’t foresee the extent of the AI’s influence, and will pick narrow goals that may as well be random when they act on the world outside the realm they were intended for. For instance an AI programmed to like finding really big prime numbers might find methods that are outside the box, such as hacking computers to covertly divert others’ computing power to the task. If it improves its own intelligence immensely and copies itself we might quickly find ourselves amongst a race of superintelligent creatures whose only value is to find prime numbers. The first thing they would presumably do is stop this needless waste of resources worldwide on everything other than doing that.

Having an impact outside the intended realm is a problem that could exist for any technology. For a certain time our devices do what we want, but at some point they diverge if left long enough, depending on how well we have designed them to do what we want. In the past a car driving itself would diverge from what you wanted at the first corner, whereas after more work they diverge at the point another car gets in their way, and after more work they will diverge at the point that you unexpectedly need to pee.

Notice that at all stages we know over what realm the car’s values coincide with ours, and design it to run accordingly. The same goes with just about all the technology I can think of. Because your toaster’s values and yours diverge as soon as you cease to want bread heated, your toaster is programmed to turn off at that point and not to be very powerful.

Perhaps the concern about strong AI having the wrong goals is like saying ‘one day there will be cars that can drive themselves. It’s much easier to make a car that drives by itself than to make it steer well, so when this technology is developed, the cars will probably have the wrong goals and drive off the road.’ The error here is assuming that the technology will be used outside the realm in which it does what we want, just because the imagined amazing prototype can be and because programming what we do want it to do seems hard. In practice we hardly ever encounter this problem because we know approximately what our creations will do, and can control where they are set to do something. Is AI different?

One suggestion that it might be different comes from looking at technologies that intervene in very messy systems. Medicines, public policies and attempts to intervene in ecosystems for instance are used without total knowledge of their effects, and often to broader and iller effects than anticipated. If it’s hard to design a single policy with known consequences, and hard to tell what the consequences are, safely designing a machine which will intervene in everything in ways you don’t anticipate is presumably harder. But it seems effects of medicine and policy aren’t usually orders of magnitude larger than anticipated. Nobody accidentally starts a holocaust by changing the road rules. Also in the societal cases, the unanticipated effects are often from society reacting to the intervention, rather than from the mechanism used having unpredictable reach. For example, it is not often that a policy which intends to improve childhood literacy accidentally improves adult literacy as well, but it might change where people want to send their children to school and hence where they live and what children do in their spare time. This is not such a problem, as human reactions presumably reflect human goals. It seems incredibly unlikely that AI will not have huge social effects of this sort.

Another suggestion that human level AI might have the ‘wrong’ values is that the more flexible and complicated things are, the harder it is to predict them in all of the circumstances in which they might be used. Software sometimes has bugs and failures because those making it could not think of every relevant difference between the situations it will be used in. But again, we have an idea of how fast these errors turn up, and we don’t move forward faster than enough of them are corrected.

The main reason that the space in which to trust technology to please us is predictable is that we accumulate technology incrementally and in pace with the corresponding science, so have knowledge and similar cases to go by. So another reason AI could be different is that there might be a sudden huge jump in AI ability. As far as I can tell this is the basis for SIAI concern. For instance, suppose that after years of playing with not very useful code, a researcher suddenly figures out a fundamental equation of intelligence and finds the reachable universe at his command. Because he hasn’t seen anything like it, when he runs it he has virtually no idea how much it will influence or what it will do. So the danger of bad values is dependent on the danger of a big jump in progress. As I explained previously, a jump seems unlikely. If artificial intelligence is reached more incrementally, even if it ends up being a powerful influence in society, there is little reason to think it will have particularly bad values.