It is time to plan the retirement of A/B testing.

20210430 Power Of Data.md Copyright© Miklos Szegedi, 2021

Advancing the data age

A lot has happened during the past year. The pandemic showed that there is a lot we can do to improve science and its application in practice.

The traditional way of doing scientific business, A/B testing, faced a strong challenge. The world needed reliable, lifesaving Covid-19 vaccines fast. The traditional approach of testing the real vaccine and a fake one, called a placebo, on a random set of people has proven to have issues.

Science has a new ally, and that is data. Data is available, and it is growing exponentially. We collect more as devices get more affordable. Historic information is kept longer. This will have a tremendous impact on research. You can buy access cheaply. You can share raw information and make collection and storage affordable, even in the long run. You can even compare historic results and point out the dynamics.

You can also do clean and comparative analysis. Your data is there, and you can correlate it with other research. There is no interference. You can slice, dice, and clean the data set so that you can verify your results.

You can apply statistical regression, correlation analysis, and machine learning. All of these provide strong evidence using only computational tools. As computation becomes cheaper, research becomes more affordable as well, and it can scale up to find granular results. It will allow us to research rare illnesses.
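As a minimal sketch of what correlation analysis and regression look like on existing data, the Python below computes a Pearson correlation and a least-squares line using only the standard library. The data set and the names `pearson_r`, `linear_fit`, `exercise_hours`, and `resting_heart_rate` are hypothetical and chosen only for illustration; they do not come from any real study.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return the means and the centered sums that both statistics share."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two series."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("x must vary to fit a line")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical observational records, not real measurements.
    exercise_hours = [0, 1, 2, 3, 4, 5, 6]
    resting_heart_rate = [81, 79, 77, 76, 72, 71, 69]

    r = pearson_r(exercise_hours, resting_heart_rate)
    slope, intercept = linear_fit(exercise_hours, resting_heart_rate)
    print(f"correlation: {r:.3f}")
    print(f"fitted line: rate = {slope:.2f} * hours + {intercept:.2f}")
```

No participant has to be recruited or given a placebo for this kind of analysis; the cost is the computation and the access to the records.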

It is very easy to reuse data. There are only a few limitations. One is privacy, but there are numerous methods that use careful filtering to limit the harm that data can do. There is a lot that can be done in this space. Most importantly, as data becomes abundant, the focus will shift to protecting individuals by limiting the actions that can be taken using it.
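One possible form of such careful filtering is sketched below, assuming a simple rule in the spirit of k-anonymity: drop direct identifiers, generalize exact ages into bands, and suppress any group that has fewer than `k` members. The record fields (`name`, `age`, `region`, `outcome`) and the function names are hypothetical; a real data set would need its own choice of identifying fields and thresholds.

```python
from collections import Counter


def age_band(age, width=10):
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def filter_records(records, k=2):
    """Drop names, generalize ages, and suppress groups smaller than k."""
    generalized = [
        {
            "age_band": age_band(record["age"]),
            "region": record["region"],
            "outcome": record["outcome"],
        }
        for record in records
    ]
    # Count how many people share each (age band, region) combination.
    group_sizes = Counter((g["age_band"], g["region"]) for g in generalized)
    return [
        g for g in generalized
        if group_sizes[(g["age_band"], g["region"])] >= k
    ]


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    records = [
        {"name": "A", "age": 34, "region": "north", "outcome": 1},
        {"name": "B", "age": 37, "region": "north", "outcome": 0},
        {"name": "C", "age": 52, "region": "south", "outcome": 1},
    ]
    for row in filter_records(records, k=2):
        print(row)
```

The third record is suppressed because nobody else shares its age band and region, so publishing it could single out one person.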

One important fact is that data comes from the real world instead of an experimental scenario. While laboratory results are important as final proofs of concept, rich information can be collected and verified to test a hypothesis and its impact on real people and society.

So, what is wrong with A/B?

Money. First, it takes time and money to carry out tests. You need to recruit participants, do the legal work, obtain consent, and collect the results. All this is money. It is also hard to find volunteers, and it is even more difficult to extrapolate the results to the general population.

Freedom. Second, A/B testing was invented when society was less democratic. The general population has more access to information in the data age. Hierarchies that were built on the superiority of one’s knowledge become less important. Aristocracy, bureaucracy, meritocracy, and bourgeois ethics were all based on information that was not available to the public. A test that is based on withholding the truth will slowly lose its credibility.

Information. A/B testing also takes away part of the freedom of an individual. This may not have an impact in the case of a single test, but if science uses more A/B testing, eventually people will anticipate being lied to all the time. Do I have the freedom to know what I eat? Am I being tested? Do I get the real thing or just a placebo?

Psychology. A lot of psychological research can be done on this topic. Does A/B testing have a long-term psychological impact on an individual who is never tested? Does it change behavior or personality traits? Does it affect future results? I believe A/B testing has a long-term psychological cost embedded in it.

Interference. A/B testing usually relies on a statistically significant data set. This is a limitation, as it does not scale well. Eventually, it requires the general population, which limits its usability in a single town, for example. Moreover, multiple tests on subsets of the same population can have negative effects on each other. Let us take the example of the Covid vaccine. Is the accidental death of six vaccinated people because of the vaccine alone, or is it due to the interaction with the A/B test of another experimental medicine or a psychological test carried out through one’s browser? What is the loss of statistical evidence in this case? Not many serious A/B tests can be done on the same population at the same time without anticipating these effects. Repeating helps, but it incurs costs.
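To make the scaling problem concrete, here is a rough sketch of the standard sample-size approximation for comparing two proportions with a two-sided test. The function name `sample_size_per_group`, the default significance level and power, and the example rates are assumptions chosen for illustration, not figures from any real trial.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate participants needed per group to detect the difference
    between two outcome rates with a two-sided test."""
    if p_control == p_treatment:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Hypothetical rates: the smaller the effect, the larger the test.
    for treated_rate in (0.05, 0.08, 0.09):
        n = sample_size_per_group(0.10, treated_rate)
        print(f"10% vs {treated_rate:.0%}: about {n} participants per group")
```

Halving the difference to be detected roughly quadruples the required group size, so a few concurrent tests of small effects can exhaust the population of a town before any interaction between them is even considered.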

Learning. The EU has recently pushed into AI territory. There are proposals to limit learning on the real general population, for example to prevent accidents. Even more important is the impact on human learning. Will humans and AI learn and react differently when they realize they may be under testing? Humans may learn the effects of medicine, and the immune system may simulate responses. How much do tests and artificial learning affect individual freedom and happiness?

Temporary. Also, society changes. There may be issues that affect the current results. Only a repeated, expensive test can prove that these results were temporary. All this is time and money.

Ethics. Conscience also matters. Using data does not require changing the real world. It still provides the evidence necessary. A/B testing needs intrusion and intervention. Will you exclude expectant mothers? Will you exclude people with mental illness to prevent distress or suicide? What is the impact of these on the usefulness of the results?

Psychosomatic. Statistical medicine also has the issue that it addresses the general population as a whole. Side effects happen, and they usually affect only a subset of the population. Is this because of a domino effect of experiences in the past? We all have life experiences with learned responses and emotional attachment. Is someone’s cultural background or race a reason to exclude or include them? Should extensive quantitative analysis be done before accepting participation?

Theatre. There was a case in Portland, Oregon, of an alleged terrorist who pressed a button that the FBI had created. The case was very interesting. Somebody commits a criminal act, but in a simulated environment. No harm could be done, because the environment is fake. Still, the individual would have carried out the act otherwise, not knowing about the situation. Is this a crime, or should the individual be treated for delusions and mental illness? A lot of research will be done in the future regarding such issues, I think.

Equality. A/B testing divides people, even if it is random and not arbitrary. Still, if the impact is strong enough, this carries a sad message. Those people will do similar acts in their private and professional lives. Is dividing people ethical? Is telling fake news to test the reactions worth a positive relationship?

Visibility. Also, participating in impactful research may improve visibility, and the life experience makes people stand out. The resulting personality changes may affect them in the long term. A/B participants may favor more visibility, or they may be open to more observation. Some, on the other hand, may become withdrawn or hostile.

Betting. A/B testing has the economic structure of betting. This may collide with the ethics of people who avoid principal-agent relationships. A bet is made on the health or happiness of others, and they face many consequences, as stated above.

Use existing data over A/B.

The author has two important suggestions regarding A/B testing. If existing data can be used, it is better to avoid A/B testing to prevent the side effects above. If A/B testing must be carried out anyway for scientific and regulatory reasons, it is important that full information is provided and consent is requested. This removes the placebo effect, but the placebo effect is removed anyway by the participation. It is also important to accept that A/B testing can have long-term health and psychological impact, so participants need to be checked periodically before the full result of the research is accepted.