Suppose you want someone to calculate the current price of cheap computing hardware.
Alice knows roughly how much computing hardware cost a year ago, knows what the different kinds of computing hardware are and how prices vary by time and market, and as she does the calculation she thinks things like ‘ooh, that is cheap’.
Bob doesn’t know anything about hardware and also isn’t trying to think about it. He mechanically follows an abstract research process he devised without reference to the subject matter, and makes no attempt to form an intuitive model. He reasons that in order to find the price of X, it will be necessary to locate the cheapest readily available version of X. He searches in various ways for cheap X. He writes down the best X prices he can find. He doesn’t notice if the prices are similar, or if there is a pattern behind the different X and prices associated with them. He determines the lowest price on his list, and returns it as the cheapest X. He writes words like ‘hardware’ instead of X, but it doesn’t really matter for his process.
I expect Alice to produce better research. Here are some advantages I expect her to have over Bob:
- She will know to look for cheap prices for common kinds of computing hardware such as supercomputers, CPUs, and GPUs.
- If her searches only return CPUs, she will notice that maybe supercomputers aren’t coming up for some reason other than their not being cheap, and look them up directly.
- If Alice finds numbers that are very different from the number a year ago, she will be surprised and check more carefully.
- Alice will have an idea of whether the GPU prices she has found so far are reasonably representative of cheap GPUs, based on things like how much they vary and what brands they are from.
- If any numbers or patterns of numbers are surprising or interesting, Alice will notice and be able to do further research on a relevant sub-question, such as ‘is Moore’s law speeding up for computing cost?’
- She will remember much more of what she has found so far, because it is meaningful to her.
- Basically, Alice is getting feedback, while Bob may as well have set up the whole research procedure to run mechanically at the start, because he is not getting any information from what he finds. And often feedback is good.
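To make the contrast concrete, here is a minimal Python sketch of the two styles. Everything in it is an assumption made for illustration: the `Quote` type, the function names, the `surprise_ratio` threshold, and the example prices are all invented, and Alice's real advantages are of course much richer than two checks.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    kind: str     # e.g. "CPU" or "GPU" (hypothetical labels)
    price: float  # price per unit of computing, arbitrary positive units


def bob_research(quotes):
    """Low immersement: return the lowest price and look at nothing else."""
    return min(q.price for q in quotes)


def alice_research(quotes, last_year_price, expected_kinds, surprise_ratio=3.0):
    """High immersement: same minimum, but react to what the data looks like.

    surprise_ratio is an invented threshold for 'very different from last year'.
    Prices are assumed to be positive.
    """
    notes = []
    # Notice kinds of hardware that never showed up in the search results.
    seen = {q.kind for q in quotes}
    for kind in sorted(expected_kinds - seen):
        notes.append(f"no results for {kind}; look it up directly")
    # Be surprised by numbers very different from the one a year ago.
    best = min(q.price for q in quotes)
    ratio = max(best, last_year_price) / min(best, last_year_price)
    if ratio > surprise_ratio:
        notes.append(f"best price differs from last year by {ratio:.1f}x; check it")
    return best, notes


# Invented example data.
quotes = [Quote("CPU", 0.50), Quote("CPU", 0.40), Quote("GPU", 0.02)]
cheapest_bob = bob_research(quotes)
cheapest_alice, notes = alice_research(
    quotes,
    last_year_price=0.10,
    expected_kinds={"CPU", "GPU", "supercomputer"},
)
```

Both functions return the same number here. The difference is that Alice also comes away with a list of things to check, while Bob's procedure could have been fixed in advance and run unattended.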
You can do research without engaging with the topic: carrying out abstract calculations while ignoring the numbers you find, not updating your views, not feeling curiosity about the subject, feeling nothing. Or you can know what you are studying like you know your lover’s face: understand what each number means, and see what it implies for everything else. Usually you will do something in the middle of this spectrum, but it is good to consider what the ends are like. This is an axis of research I often think about, because sometimes I look more like Alice and sometimes I look more like Bob. I’ll tentatively call Alice’s research style ‘high immersement’, and Bob’s ‘low immersement’. I think of Bob’s as ‘Chinese room style research’.
I have assumed that high immersement is just overwhelmingly better, perhaps because I have some unreasonable bias in favor of knowing what one is doing. Or because of the list of reasons above, and ones like them. (Or because it is more impressive?)
But I expect one advantage of low immersement is that it should be less biased. Bob basically doesn’t know what his data means, so it is harder for him to push the results one way or another. He is like a blinded researcher, and it is often considered highly valuable to have everyone involved in research be blinded. And because Bob doesn’t notice errors, he avoids the kind of problem where he mostly corrects errors that disagree with his own intuitions, biasing the findings toward his own intuitions. I can’t think of obvious reasons he would be more biased in a particular direction. Though maybe in some topics, looking into a thing in more detail reliably leads to higher or lower numbers, and Bob might look into things in less detail. But overall Bob’s research seems likely to be more wrong and less biased.
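The selective-correction problem can be shown with a toy calculation. The Python sketch below is only an illustration under loud assumptions: the true value, the researcher's intuition, the re-check threshold, and the measurements are all invented, and a re-check is assumed to always recover the true value.

```python
from statistics import mean

TRUE_VALUE = 100.0        # hypothetical ground truth
INTUITION = 80.0          # hypothetical prior belief of the immersed researcher
RECHECK_DISTANCE = 25.0   # invented: re-check anything this far from intuition

# Invented measurements with errors symmetric around the true value.
measurements = [70.0, 85.0, 100.0, 115.0, 130.0]


def recheck(value):
    """Assumption: a careful re-check always recovers the true value."""
    return TRUE_VALUE


def selective_correction(values, intuition, distance):
    """Only re-check the values that look surprising given the intuition."""
    return [recheck(v) if abs(v - intuition) > distance else v for v in values]


# The blinded researcher corrects nothing, so the symmetric errors cancel.
blind_estimate = mean(measurements)

# The immersed researcher fixes only the errors that disagree with intuition.
corrected_estimate = mean(
    selective_correction(measurements, INTUITION, RECHECK_DISTANCE)
)
```

With these made-up numbers the uncorrected average is 100, while the selectively corrected average is 91, pulled toward the intuition of 80 even though every individual correction was accurate. Each number in the corrected list is at least as close to the truth as before, so this is a case where less error per data point comes with more bias in the summary.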
Most research probably happens somewhere in the middle of the spectrum, and maybe the potential for bias depends non-linearly on how much idea you have of what you are doing. Even so, I think I expect less bias from people who have less idea what they are doing.
My guess is still that high immersement research is generally much better, but to the extent you can control it, I wonder if it is worth being in low immersement mode to do some research tasks. In particular, tasks where you don’t really want your own interpretation of things to influence your behavior. For instance, in the survey I’ve been working on lately it might have been good to be in high immersement mode for looking into how to ask questions, and for interviewing people to inform question writing, and low immersement mode for turning the results into graphs and statistics, and then high immersement mode for looking at the graphs and statistics and speculating about what they mean.