Image by Curtis MacNewton via Unsplash https://bit.ly/2RNPE6s

The closing panel of our recent Behavioural Economics and Public Policy conference included leading politicians and policy makers sharing their views on behavioural insights (BI) and government. The full audio of the panel is available here. This two-part blog piece presents some of the most interesting issues, approaches and questions considered by the panel, extracted from the transcript. Find Part 2 here.

Participants

  • Chair: Miranda Stewart, University of Melbourne.
  • Jennie Granger PSM, former Director General, Her Majesty’s Revenue and Customs (HMRC).
  • David Gruen, Deputy Secretary, Prime Minister and Cabinet.
  • Andrew Leigh, Shadow Assistant Treasurer and Federal Member for Fenner, ACT.
  • Jane Mitchell, Behavioural Insights Team, Australian Taxation Office (ATO).

Our first panellist is Jane Mitchell, who is on the front line of BI research in government services, as Director of the BI team of the ATO.

Jane Mitchell

Behavioural Insights Team, Australian Taxation Office (ATO)

First, some context about the ATO. We are an organisation of 20,000 staff. We are broken into about 20 business lines, or divisions, that have specific focus areas, across almost 20 sites around Australia. We are an agency that administers the tax and superannuation legislation. That means that we have all the client-facing positions, and we are not a policy agency. Treasury is responsible for policy. The ATO may influence what the law changes might be, but ultimately, Treasury takes leadership of tax and super law. For example – how many people here do their own tax return through MyTax? [Big show of hands] That is an ATO system. We also have big phone contact-centres, and we do compliance activities.

The focus for the ATO is on encouraging willing participation in the tax and super systems; for those who are not complying, we focus on detecting this and treating those issues. The largest part of our efforts is all about encouraging willing participation. Behavioural insights dovetails very well with that goal. For us, it is about how we can use BI interventions to make it as likely as possible that people will do the thing that we want them to do, while preserving freedom of choice. That is the whole purpose behind BI in the ATO.

In the ATO, we have been applying a BI approach since about 2011, but BI was not applied consistently nor was it embraced across the organisation. So, a decision was made in early 2016 to create a dedicated BI unit – we are very small – to drive the use of BI across the organisation. Today, in 2018, about a quarter of the organisation has done BI training and the ATO has about 200 BI activities under its belt. We are recognised as a leader in BI among domestic and international agencies and we proactively share our approaches with like-minded agencies.

Essentially, as shown in the one-page handout, I was given the role of taking work that was already there in the organisation, extending it across the ATO and proving that it is useful. The important thing was that we were very clear at the outset about our plan, that is, what were we aiming to achieve? It was not just about having a set of experts that were great at economics, sitting in one area; it was about driving adoption and awareness in the organisation. We developed a framework that we call Hub and Spoke. We have got a small central unit, with about eight staff, and we have developed BI champions across our 20 business lines.

In terms of raising awareness, we did a lot of work to communicate with staff, which went hand-in-glove with capability development including foundation, intermediate, and advanced training activities. One goal of training 25 per cent of staff was to grow support for BI approaches. Initially, it was very difficult to get support, or to get staff to look for opportunities, when they did not understand what the value proposition was. Today, BI is enormously popular and widely embraced in the ATO. At a webinar last week, we had 1,500 staff registered.

The BI team has a combination of skills including project management, learning and development, communications, statistics, and strong business knowledge, particularly in a call centre environment and in business strategy. The team also has a focus on practical application, matching what we are delivering with what the internal ATO business wants or needs.

I appreciate, and we all agree, that randomised controlled trials (RCTs) are best practice and where we can do them, we do. But we did not want to lose momentum by turning people away from BI approaches where a randomised trial was not possible. There are plenty of other opportunities to improve the community’s experience with the tax system with BI approaches. So, our key question in taking on a project with BI impact is really ‘is it measurable?’.

It is a core part of our role as a BI team to look for opportunities to apply it in the internal environment of the ATO. Sometimes that may not align with the sort of skills that may normally appear in an economics-focused team. But, selling your successes to different areas of the ATO so they know the value that could be added, and identifying wherever BI might be useful, is critical.

Miranda Stewart: The ANU, and others, have partnered with the ATO in doing BI projects including randomised trials, so we know first-hand that there is a substantial enthusiasm at senior levels and through the organisation for learning and research, and for BI. Even so, it is hard to do randomised trials in that environment. Is that your experience?

Jane Mitchell: Yes, randomised trials can be difficult to do. It is not that the business (ATO) is trying to be obstructive. Probably the biggest reason for the difficulty is time. A division of the ATO might approach you when there are only two months before a new policy will take effect and you do not have the time to set up an RCT. The BI team will seek to make change over time if we can get in early enough on the actual measure that is being implemented.

Another big issue for a large organisation with complex information technology (IT) like the ATO, is that IT systems are not as flexible as they could be. It may cost a lot of money to change settings in a system to enable an RCT. In terms of resources, we have an IT priority list and changing a system to facilitate an RCT may not be at the top of the list of priorities.

Sometimes it comes down to being practical: if you cannot do a randomised trial, there are other things that you can do. For example, this could be staggering implementation of interventions, where you can identify and measure the effect of something progressively as you go. It is better to do that than walk away from the opportunity without having learned anything at all. If we were not practical, we would probably lose 90 per cent of the BI interventions that we have done, which have added value to the community.

Miranda Stewart: Our next panellist is Andrew Leigh, MP, author and empirical researcher. You may know Andrew’s most recent book, Randomistas: How radical researchers changed our world. It summarises about 1,000 randomised trials; they are really fascinating. Are you a missionary for randomised trials in government? We have already heard some of the efforts and constraints in a real organisation like the ATO.

Andrew Leigh

Shadow Assistant Treasurer and Federal Member for Fenner, ACT

Well, if I am going to get called a missionary I guess the least I should do is stand up.

I remember an overseas expert telling me once that he came to Australia about a decade ago, and said that he really thought we needed to do more randomised trials, and people shook their head and said ‘no, I do not think that will fly’. Then he came back a few years later and he said ‘I think we need to get more behavioural economics into policymaking’. And they said ‘oh, that is fascinating, how would we test those ideas?’ and he said ‘randomised trials’ and they said ‘great, when do we start?’. In much the same spirit, I am going to attempt to sneak a talk about Randomistas into the broad topic of behavioural economics and, let’s face it, a lot of what is being done in medicine around the world right now actually has a lot more of randomised trials than behavioural economics.

So let me start with a story.

In 2013, a group of Finnish doctors published the results of a randomised trial of knee surgery performed for a torn meniscus. It is an operation called a meniscectomy, performed millions of times around the world every year. And the results were based on sham surgery, in which the treatment group gets the true surgery and the control group is sliced open under an anaesthetic with easy-listening music playing, and they are sewn back up again, and neither group ever knows whether they got the true surgery or the sham surgery. The Finnish study found that among middle-aged patients, sham surgery for meniscectomy was no less effective than the true surgery itself. Not everyone welcomed the results. The editors of the journal Arthroscopy thundered that results from a sham surgery trial were ‘ludicrous’. Their argument went as follows: they said because no right-minded patients would participate in sham surgery, the results from such trials were not generalisable to mentally healthy patients.

But sham surgery trials are growing in importance. One survey of the literature finds that about half the cases record a sham surgery result that is no worse than the true surgery itself, suggesting that there are millions of people every year undergoing surgeries which may well be unnecessary. This builds on the trend within medicine to move from eminence-based medicine to evidence-based medicine, going right back to James Lind’s scurvy trials, Ambroise Paré’s trials on battlefield burns, and indeed the study of randomised trials that eventually showed that bloodletting did not cure patients, conducted just a decade or two after the discipline of medicine had decided to call one of their top journals The Lancet. There was a trial in 1954 which randomly injected 600,000 American schoolchildren with either the polio vaccine or salt water. It found that those who received the polio vaccine were less likely to contract the disease, and the following year polio vaccines were rolled out across America. Remember that when people tell you that you cannot implement the results of randomised trials quickly.

Randomised trials have doubtless shaped the way in which you look after your health; they have certainly shaped mine. After I read the randomised trials on multivitamins, I decided not to take multivitamin tablets. I wear compression socks after marathons, after the randomised trials on Sydney, Melbourne and Gold Coast marathoners showed they aid recovery. And if I have to remove a band-aid from one of my three little boys, I will turn to the James Cook University study, which compared the fast removal method to the slow removal method, and concluded the fast removal method is less painful.

ANU has a great tradition of doing randomised trials. For example, restorative justice trials both here in Canberra and in Britain found that the probability that the victim would want to later take revenge on the offender was significantly lower when the case was allocated to restorative justice.

Miranda Stewart has asked me to say a few words about the insights that we have garnered within policymaking. For that I would turn to one of my favourite policy ‘randomistas’, Judith Gueron, who has worked at the MDRC. After conducting dozens of randomised trials involving hundreds of thousands of participants, she has a number of maxims for randomised trials in a policy context. She says ‘never say that something about the research is too complex to get into’. She says ‘if someone is unreservedly enthusiastic about the study from the outset, that is because they do not understand it’. And she says, ‘if your detractors claim that it is unfair to turn away worthy recipients, then just keep expanding your treatment group, until ultimately you have spent all the money on the treatment group, you have used your last dollar of funding, so you are not turning away any more people by doing the randomised trial’.

Randomised trials are happening at a large policy scale in developing countries. There is a randomised trial in Indonesia estimating the impact on student performance of doubling teacher pay. There is one in India, in the state of Andhra Pradesh, which had a sample size of 19 million and looked at the impact on corruption of the rollout of biometric identification. There is a study in the Mexican city of Acayucan, where the mayor had only enough money to pave half the roads. She collaborated with a group of university researchers to choose which half, creating a randomised trial that let her look at the subsequent impact of paving on property values.

The randomistas are running rampant across business. One commentator says that every pixel on the Amazon homepage has had to justify its existence through randomised trials. The Netflix algorithm, when it decides which movies to offer you next, is based on randomised trials. When Western Union decides what combination of mark-ups and fixed fees to charge, they are basing that decision on randomised trials. If you are wondering why about half of all public prices end in ‘9’, you can thank the randomised trials conducted in marketing.

If you are wondering why the Google toolbar has its particular shade of blue, you can thank Marissa Mayer, then a vice-president at Google, who tried 40 different shades of blue and chose the one which got the maximum number of users. The company estimates that it has added hundreds of millions of dollars to its bottom line as a result. There is an important point to note in understanding that Google does randomised trials: Google has access to around 15 exabytes of data. Every second it gets 40,000 searches. Google has no big data problem. If they think they need to do randomisation, then no one else on the planet can say ‘I have got plenty of data, I do not need to randomise’. Intuit, Humana, Lyft, Uber – a whole range of enterprises are doing randomised trials. Indeed, one firm says they have three cardinal rules: you do not harass women, you do not steal, and you must have a control group. And for breaking any of those rules you can lose your job.

Randomised trials are not without their critics. My book Randomistas got its title from Angus Deaton, who was using it not as a compliment, but as a critique of what he saw as the ‘overuse’ of randomised trials in development economics. But in most other areas of politics, I think there is much more of an underuse than an overuse problem. Certainly there are areas in which you would not conduct randomised trials. As one wag noted, just because parachutes have not been tested in a randomised trial, you would not necessarily jump out of a plane without one. Nonetheless, if you are a parachutist you do benefit from randomised trials on different chute types, or on the benefits of ankle braces, which have been shown in a parachuting randomised trial to minimise the risk of ankle injury.

Yes, we have to take into account generalisability: what works in 1960s Ypsilanti [Michigan] would not necessarily work in modern day Yemen. But that is true of any form of evaluation, not just randomised trials. Sure, they can cost millions of dollars and take decades – like Perry Preschool or the RAND health insurance experiment. But they do not have to. You can tweak processes very quickly. Indeed, the subtitle of my book came from a randomised trial on Google which took a week and cost about 50 dollars.

The White House under President Obama, working with the Arnold Foundation, ran a competition for low-cost randomised trials: those which cost $200,000 or less. The winners tended to use administrative data and included programs such as a federal government department carrying out unexpected workplace health and safety inspections, and a Boston non-profit providing intensive counselling to low-income youth. And the Arnold Foundation in the US has now taken up that approach.

We do need to make sure in academia that we have a better meld between theory and practice. When he reflects on what he has learned from his randomised trials in Liberia, development economics researcher Chris Blattman asserts that, instead of asking ‘does the program work?’, he should have been asking ‘how does the world work?’. And he argues that we need to work on testing fundamental assumptions within academia and leave the blue-envelope-versus-red-envelope studies to the marketing professionals.

One more story, to conclude. In the early 2000s, successful businessman Blake Mycoskie was visiting villages outside Buenos Aires. Blake Mycoskie was in his early thirties, had already set up three successful businesses and for the first time now was seeing face to face the true impacts of poverty. And he was particularly struck by the impact of shoelessness, what that did to the pain for the kids running around and, potentially, what it might do to deter them from attending school. So Mycoskie set up a firm called TOMS ‘Shoes for Better Tomorrows’ in which, for every pair of shoes purchased by someone in a developed country, there would be a pair given free to a child in a developing country.

Over the course of the next decade, they gave away some 60 million pairs of shoes. Then Mycoskie decided that he would collaborate with randomistas to see how the program worked. He worked with a team run by Bruce Wydick, who randomised across 18 communities in El Salvador. And when the results came in, they were an utter debunking of the TOMS model. They found that those who received the shoes were using them to replace earlier pairs of shoes, that they were having no impact on the health of the children, no impact on the school attendance of the children, and when they surveyed them about their attitudes, this free shoe distribution was making those children feel more dependent on outsiders. Now, many organisations and many governments would have immediately looked to discredit the researchers. But as Bruce Wydick wrote, TOMS’ response was anything but. They were nimble. They shifted from giving out loafers to giving out sneakers. They shifted from giving out free shoes to everyone to working to make shoes an incentive for attending school or getting better grades. They set up TOMS based on the best available knowledge, and they adapted TOMS as the new knowledge came in. Test, learn, adapt.

Frankly, that reflects the fact that failure is surprisingly common. If we look at the stats, from the drugs that look promising coming out of the lab, only about one in ten make it through stage one, two, and three trials and then onto market. If we look at education, only about a tenth of the studies commissioned by the What Works Clearinghouse produce positive results. When Google does randomised experiments, only about a fifth help them improve the product. Rigorous social policy experiments find that only about a quarter have a strong positive effect.

Randomised trials flourish where modesty meets numeracy. The philosophy in the experimenting society is not that we know what works, but that we are willing to bring good science into the service of solving big social problems.

Miranda Stewart: One question before we move on: you said researchers should leave the red and blue letters to the marketing specialists. What about randomised trials in government? Should governments be doing this, or are there better places to do it, or can we leave a lot of it to the market, and learn from what the marketers tell us?

Andrew Leigh: We should certainly be learning from the market. I probably should have said ‘we should leave it to the non-academics’. Academics should not be doing red and blue trials.

That is also true of the BI interventions being done by the ATO. But one of the really valid critiques of some of the randomised trials that were published in that early wave a decade or so ago, was that too few of them were testing deep economic questions. And that is one way in which the economics profession is now recalibrating. You are getting these fascinating studies which are melding structural models and randomised trials and embedding the randomised trials deep into the theory.

 

Further reading

Behavioural Insights and Public Policy: A Discussion – Part 2, by Maria Sandoval Guzman

Nudging Businesses to Pay Their Taxes: Does Timing Matter?, by Christian Gillitzer and Mathias Sinning

This article has 1 comment

  1. I am a great believer in randomized trials. It may not seem relevant to us tax policy makers and tax lawyers, but I found the free MIT courses https://www.edx.org/course/designing-and-running-randomized-evaluations-1 and the https://www.edx.org/course/foundations-of-development-policy-1 brilliant.