Gordon Irlam on the BEGuide

This post summarizes a conversation which was part of the Cause Prioritization Shallow, all parts of which are available here. Previously in this series: conversations with Owen Cotton-Barratt, Paul Christiano, and Paul Penley.



Gordon Irlam: Philanthropist and Founder of Back of the Envelope Guide to Philanthropy

Katja Grace: Research Assistant, Machine Intelligence Research Institute


This is a summary of points made by Gordon Irlam during an interview with Katja on March 4th 2014. The summary was made by an anonymous author.

The Basics

The Back of the Envelope Guide to Philanthropy (BEGuide) is a one-man project that aims to quantify the impact of philanthropic causes and share information, helping philanthropists to prioritise spending in a way that delivers the highest value for money.

Gordon Irlam has been running the BEGuide for about 10 years and estimates that he’s spent between 200 and 600 hours on it in total, working on it in short sprees whenever promising new information catches his attention.

Gordon chooses which causes to investigate by gut instinct, looking at anything that he feels might turn out to be worthwhile. Almost every cause that he’s evaluated has turned out to promise a positive net impact.

Work normally begins with a seed in a newspaper article or radio program and grows within a few hours to an evaluation that can be published on the BEGuide website.

Gordon also runs the Gordon R Irlam Charitable Foundation, and his work on the BEGuide is now the basis for around 99% of that foundation’s spending, expected to be roughly $200,000 annually in coming years.

What Gets Done

Gordon does no original research, but spends his time compiling available information and processing it to get results for the BEGuide.

Gordon considers this kind of work important. He laments that academic researchers are funded to go out and gather information, but once they write a paper on it, they seldom put it anywhere it can be compared with other papers, so the knowledge doesn’t come into use and the papers are quickly forgotten.

Uploading the work to the BEGuide also takes time, but almost all of Gordon’s effort goes into the research. He muses that it would take a lot of time if he were ever to overhaul the website, which currently has a simple design.

Gordon continues to update the BEGuide as new information emerges, both adding new causes and revising those he’s already worked on. He still revisits many of the organisations he has followed over the years, because many of those he looked into initially are still doing relevant work and expanding. The BEGuide has expanded alongside them, though Gordon’s own approach hasn’t changed since he began.

How It Works

Gordon’s method is, he says, quick and dirty, just as the name suggests. The back of the envelope calculation estimates the value of an activity as:

Value = (How big is the problem?) / (How much would it cost to do something about it?)

Once Gordon knows what the problem is, he searches the web, looks through journals and takes in information from other organisations until he has enough facts and figures to fill out the equation.
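As an illustration, the calculation above can be sketched in a few lines of code. The problem, the dollar figures, and the 10% adjustment below are all invented placeholders for illustration, not figures from the BEGuide.

```python
# A hypothetical back-of-the-envelope evaluation in the spirit of the BEGuide.
# All figures are invented placeholders, not numbers from the actual site.

def leverage_factor(problem_size_dollars: float, cost_dollars: float) -> float:
    """Societal value delivered per dollar of philanthropic input."""
    return problem_size_dollars / cost_dollars

# Suppose a problem causes $500M of harm and a $10M programme could address it.
hi = leverage_factor(500e6, 10e6)        # optimistic: all of the harm averted
lo = leverage_factor(500e6 * 0.1, 10e6)  # pessimistic: only 10% of it averted

print(f"Leverage factor: {lo:.0f} to {hi:.0f}")  # prints "Leverage factor: 5 to 50"
```

The point of the range is the same one Gordon makes: with Fermi-estimate inputs, the answer is only credible to within an order of magnitude or so, so an interval is more honest than a single number.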

Gordon says that his background in physics has helped him to recognise that a lot of the problems in philanthropic evaluation can be addressed with Fermi estimates. He thinks this is a good approach because the field doesn’t have access to the detailed, accurate data that is necessary for more detailed calculations. Having studied pure mathematics and science, Gordon feels that it’s deep in his nature to appreciate and use these tools. He believes that quantifying everything is the right way to make philanthropic decisions.

Because of the complexity of many of the causes that he’s worked on, Gordon tries to home in on the main thing that any funding decision would achieve, ignoring flow-on effects, which are, in his experience, harder to quantify and, because of his selection process, inherently less important. If he became aware of good information on the long-term effects of a funding decision, he would use it, but he believes it would be highly unusual for such information to be available.

Who Uses It?

The BEGuide website gets one or two visitors per day and Gordon doesn’t believe it’s used by any philanthropists outside his own organisation.

While smaller parties could find the BEGuide useful, Gordon posits that the follow-up work of identifying which organisations best pursue the BEGuide’s top causes is too much for anyone but large philanthropic organisations.

Gordon initially expected more interest in the website. He made it a wiki at one point, but it was quickly overrun with vandals, forcing him to retreat from that initiative.

Gordon says that he’s done very little to publicise the BEGuide. He initially wrote to Engaged Donors for Global Equity, who published a blurb for the website in their newsletter, but he has not maintained contact with any publication or actively advertised to other philanthropic organisations since then.

Gordon regrets that he has no real contact with anyone who might find the BEGuide useful and says he’d like to talk with other organisations about his methodology; however, he hasn’t taken any steps to do so. The Laura and John Arnold Foundation approached him and expressed an interest in using similar techniques to quantify the value of some 800 nonprofit CEOs.

The Results

Causes are organised on the BEGuide by what it gives as the upper limit of their Leverage Factor, which is, in simple terms, value for money.

A Leverage Factor of 15 is equivalent to $15 worth of societal value per dollar of input. One virtue of this unit is that it deemphasizes the necessary fact that in quantifying and comparing various philanthropic causes, the BEGuide puts dollar values on such things as human lives. Exactly what that dollar value is, for various results where it might be controversial, can be found on the website, but not on its main page.
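The ordering by upper bound might be sketched as follows. The causes and leverage ranges here are placeholders, not the BEGuide’s actual figures.

```python
# Sort causes by the upper bound of their estimated leverage factor,
# as the BEGuide's listing does. All entries are invented placeholders.

causes = [
    {"name": "cause A", "low": 2, "high": 15},
    {"name": "cause B", "low": 100_000, "high": 11_000_000},
    {"name": "cause C", "low": 0.5, "high": 3},
]

ranked = sorted(causes, key=lambda c: c["high"], reverse=True)
for c in ranked:
    print(f'{c["name"]}: {c["low"]}x to {c["high"]}x')
```

Sorting on the optimistic end of the range is a design choice that favours high-upside, high-uncertainty causes, which fits the pattern the BEGuide reports below: high value causes tend to be high risk.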

If the BEGuide has found one trend in cause evaluations, it’s that high value causes tend to be high risk.

The biggest example so far is hostile artificial intelligence, which the BEGuide ranks as the most fundable cause, with a leverage factor between 100,000 and 11,000,000.

Gordon thinks philanthropic organisations tend to be unwilling to take risks, probably because they feel an obligation to have every cent spent on something that will have a result. He doesn’t see why this reasoning should apply more to philanthropists than to anyone else, comparing the opportunities in philanthropy to those of ordinary finance.

Another tendency that the BEGuide has found is that popular problems, such as solving global warming, are low value because the associated costs are so high.

This means that the highest value funding opportunities turn out to be high risk endeavours that aim to reduce problems that not many people know about, which is a problem, because this type of funding is understandably very unpopular with philanthropists.

Expanding the Audience

The most important barrier between the BEGuide and philanthropic organisations may be publicity. If Gordon had resources to add to any part of the processes that enable the BEGuide vision, he would use them to market to philanthropists.

Because philanthropists are regularly swamped by organisations looking for funding, they tend to develop a barrier that makes them very difficult to reach directly.

For this reason, Gordon suggests that the most efficient way to improve interest in projects like the BEGuide would be to deliver workshops at philanthropy conferences, such as the upcoming EDGE conference in Berkeley or those listed by the Chronicle of Philanthropy. These are places where philanthropists might have their guards down and be willing to find out that methodological cause valuation exists and to learn how it might benefit them.

What Else Is There?

Gordon has been in contact with GiveWell and is a member of Giving What We Can, but he feels that these organisations are limited to examining things like developing world issues.

Gordon believes that the BEGuide is unique in indiscriminately assessing the possible impacts of causes and feels that there is a need for a lot more aggregating and sorting of the existing evaluations that are produced by disparate organisations.

Other bodies that Gordon has used to source data for the BEGuide include the Copenhagen Consensus Centre and the Disease Control Priorities Project, both of which produce publications.

Gordon also stresses that it’s important to know the market. The Centre for Effective Global Action at UC Berkeley, for instance, provides data that’s useful for the US Agency for International Development and might at first glance seem like a good place for philanthropists or philanthropic data agglomerators to look for information. However, it investigates problems like the comparative value of giving $10 in aid, or $5 and a chicken, while philanthropists often only want to know which organisation to support.

Without knowing much about other methodological approaches to evaluating causes, Gordon expresses the sentiment that philanthropy as a whole is probably missing any methodology, and that perhaps work on something like a marketing statistics approach would be of more value to the field than simply doing more research.

If other people are interested in contributing to research, Gordon feels that while there may be high value causes that have escaped his attention so far, newcomers’ time would probably best be spent looking in more detail at the highest leverage factor interventions on the BEGuide, as those causes could be much more attractive to philanthropists if their value were clearer.

On top of that, some of the causes on the BEGuide need regular updating and others haven’t been fully explored. Gordon worries that for his hostile AI evaluation, he has so far only been able to look at advocacy, but there are other possible solutions to that problem which might be more effective.

AI: is research like board games or seeing?

‘The computer scientist Donald Knuth was struck that “AI has by now succeeded in doing essentially everything that requires ‘thinking’ but has failed to do most of what people and animals do ‘without thinking’ – that, somehow, is so much harder!”‘
- Nick Bostrom, Superintelligence, p14

There are some activities we think of as involving substantial thinking that we haven’t tried to automate much, presumably because they require some of the ‘not thinking’ skills as precursors. For instance, theorizing about the world, making up grand schemes, winning political struggles, and starting successful companies. If we had successfully automated the ‘without thinking’ tasks like vision and common sense, do you think these remaining kinds of thinking tasks would come easily to AI – like chess in a new domain – or be hard like the ‘without thinking’ tasks?

Sebastian Hagen points out that we haven’t automated math, programming, or debugging, and these seem much like research and don’t require complicated interfacing with the world at least.

Crossposted from Superintelligence Reading Group.

Discontinuous paths

In my understanding, technological progress almost always proceeds relatively smoothly (see algorithmic progress, the performance curves database, and this brief investigation). Brain emulations seem to represent an unusual possibility for an abrupt jump in technological capability, because we would basically be ‘stealing’ the technology rather than designing it from scratch.

Similarly, if an advanced civilization kept their nanotechnology locked up nearby, then our incremental progress in lock-picking tools might suddenly give rise to a huge leap in nanotechnology from our perspective, whereas earlier lock picking progress wouldn’t have given us any noticeable nanotechnology progress.

If this is an unusual situation however, it seems strange that the other most salient route to superintelligence – artificial intelligence designed by humans – is also often expected to involve a discontinuous jump in capability, but for entirely different reasons. Is there some unifying reason to expect jumps in both routes to superintelligence, or is it just coincidence? Or do I overstate the ubiquity of incremental progress?

Crossposted from my own comment on the Superintelligence reading group. Commenters are encouraged to respond over there.

High ulterior motives

People with ulterior motives are often treated with suspicion and contempt. In a world driven round substantially by ulterior motives, this can lead to despair, both for the ulteriorly motivated and for the suspicious and contemptuous. How terrible to not be able to trust your friends, your brothers, yourself!

At this point it is worth noting two things:

  1. This kind of extreme concern about everyone being corrupt only seems to afflict more philosophically minded people. This suggests that few practical problems arise from everyone having poor motives.
  2. In general, if you have a scorecard on which you always score zero, it is likely that you are not using the most useful scoring system. You shouldn’t necessarily change your overall goal of doing better on that metric, but for now it might be convenient to differentiate the space within ‘zero’.

It seems to me that an important way in which ulterior motives vary is the extent to which they align with the non-ulterior motives you would like the person in question to have.

Suppose you would like to leave your small child with a babysitter, Sam. Unfortunately, you have learned that Sam is not motivated purely by the desire to care for your child. He has an ulterior motive for agreeing to babysit. How much does this trouble you?

  • If Sam runs a babysitting company, and really he just wants his babysitting company to thrive, then you should basically not be concerned at all.
  • If Sam just wants to try out babysitting once, to see what it’s like, you should be more concerned.
  • If Sam really just wants a chance to use your big-screen TV this one time you should be even more concerned.
  • If Sam just wants a chance to steal your baby so that he can sell it on the black market, you should be truly very worried.

Your worry in these cases tracks the extent to which Sam’s ulterior motives will cause him to do exactly what he would do if he just fundamentally wanted to care for your baby. If he wants his business to go well, he will do what you want him to do, to the extent that you can tell and are willing to pay. If he wants to try out babysitting, he will probably at least hang out with your child and do the basic babysitting motions. If he wants to use your TV, there’s not much reason he will do anything besides spend part of the evening in the same building as your child. If he wants to steal your child, his motives diverge from yours from the moment he arrives at your house.

I claim that in general, ulterior motives are more troubling to us – and should be – if they are less well aligned with the purported high motives. I suspect they also feel more ‘ulterior’ when they are less well aligned, both to the person who has them and to the observer.

Ulterior motives like ‘make money’ and ‘get respect’ tend to be relatively well aligned I think. If you are aiming to do task X, but really you just want respect, and your actions or success at X will be visible to someone who might give you respect, then you will act like a person who wants to do task X, down to (at least some) minor details.

Ulterior motives that are troubling tend not to be well aligned with purported motives. Either in the sense that the person will not do the thing they are purporting to care about, or often in the sense that they will do it, but simultaneously do something you don’t want.

For instance, suppose you give me a compliment, with the hope that I will then help you move house. Your overt motive is implicitly something like honestly communicating to me, while your ulterior motive is to get moving help. A compliment motivated by your ulterior motive will probably not also be honest communication with me, so your behavior hardly aligns with your overt motive at all. On top of that, your ulterior motive means you will try to cause me to help you move house, a random other thing I don’t want to happen which has nothing to do with your overt motive.

This is not the only axis on which ulterior motives are better or worse. A different kind of reason ulterior motives might be particularly bad is if it is the motives that matter to you, rather than the behavior. For instance, if a person is merely friends with you to get your money, regardless of how friendly this makes them, you may be dissatisfied.

I think it would be better if we distinguished ‘low ulterior motives’ – which involve hardly caring about the overt goal – from ‘high ulterior motives’ – which are closely aligned with the overt goal consistently across many circumstances. Some people (perhaps read ‘all people’) pretend that they want to do what is best for the world, when in fact they also strongly want to be respected and praised and so on. Some people want to steal your baby. Conflating the two does not seem great, terminologically or psychologically.

I’m not saying that we shouldn’t criticize motives like desire for respect or money (or that we should). I merely suggest that if we want to do those things, we criticize these motives on their own merits, rather than lumping them in with much more hazardous low motives and cheaply criticizing ‘ulterior motives’ in general. 

No ulterior motives

AI surprise research: high value?

If artificial intelligence was about to become ‘human-level’, do you think we (society) would get advance notice? Would artificial intelligence researchers have been talking about it for years? Would tell-tale precursor technologies have triggered the alert? Would it be completely unsurprising, because AIs had been able to do almost everything that humans could do for a decade, and were catching up at a steady pace?

Whether we would be surprised then seems to make a big difference to what we should do now. Suppose that there are things someone should do before human-level AI appears (a premise to most current efforts to mitigate AI impacts). If there will be a period in which many people anticipate human-level AI soon, then probably someone will do the highest priority things. If you try to do them now, you might replace them or just fail because it is hard to see what needs doing so far ahead of time. So if you think AI will not be surprising, then the best things to do regarding AI now will tend to be the high value things which require a longer lead time. This might include building better institutions and capabilities; shifting AI research trajectories; doing technical work that is hard to parallelize; or looking for ways to get the clear warning earlier.

Anders Sandberg has put some thought into warning signs for AI (http://www.aleph.se/andart/archives/2006/10/warning_signs_for_tomorrow.html).

On the other hand, if the advent of human-level AI was very surprising, then only a small group of people will ever respond to the anticipation of human-level AI (including those who are already concerned about it). This makes it more likely that a person who anticipates human-level AI now – as a member of that small group – should work on the highest priority things that will ever need to be done about it. This might include object-level tools for controlling moderately powerful intelligences, or design of features that would lower any risks of those intelligences.

I have just argued that the best approach for dealing with concerns about human-level AI should depend on how surprising we expect it to be. I also think there are relatively cheap ways to shed light on this question that (as far as I know) haven’t received much attention. For instance one could investigate:

  1. How well can practitioners in related areas usually predict upcoming developments? (especially for large developments, and for closely related fields)
  2. To what extent is progress in AI driven by conceptual progress, and to what extent is it driven by improvements in hardware?
  3. Do these happen in parallel for a given application, or e.g. does some level of hardware development prompt software development?
  4. Looking at other areas of technological development, what warnings have ever been visible of large otherwise surprising changes? What properties go along with surprising changes?
  5. What kinds of evidence of upcoming change motivate people to action, historically?
  6. What is the historical prevalence of discontinuous progress in analogous areas, e.g. technology in general, software, and algorithms? (I’ve investigated this a bit; very preliminary results suggest discontinuous progress is rare.)

Whether brain emulations, brain-inspired AI, or more artificial AI come first is also relevant to this question, as are our expectations about the time until human-level AI appears. So investigations which shed light on those issues should also shed light on this one.

Several of the questions above might be substantially clarified with less than a person-year of effort. With that degree of sophistication I think they have a good chance of changing our best guess about the degree of warning to expect, and perhaps about what people concerned about AI risk should do now. Such a shift seems valuable.

I have claimed that the surprisingness of human-level AI makes quite a difference to what those concerned about AI risk should do now, and that learning more about this surprisingness is cheap and neglected. So it seems pretty plausible to me that investigating the likely surprisingness of human-level AI is a better deal at this point than acting on our current understanding.

I haven’t made a watertight case for the superiority of research into AI surprise, and don’t necessarily believe it. I have gestured at a case however. Do you think it is wrong? Do you know of work on these topics that I should know about?

happy AI day