Covid-19 is throwing our need for and use of information into stark relief. Data is no longer a nice-to-have, but a necessity: something we have to invest in, build and maintain as diligently as we do our health services, medical supply chains and personal hygiene.
The drive to quickly develop treatments and a vaccine is leading to rapid, collaborative sharing and use of data amongst scientists. The barriers that prevent this from happening in normal circumstances are being dismantled. The Wellcome Trust has rightly called on researchers, journals and funders to ensure that research findings and data relevant to this outbreak are shared rapidly and openly. Geneticists are sharing data through existing data bodies such as GenBank and GISAID, which help new visualisations to be created, donations of data to be cited and scientists to be recognised for their work. Publication of pre-print papers on hubs such as bioRxiv, the Lancet and Elsevier is growing rapidly. As many articles have pointed out, the coronavirus outbreak has galvanised open access to research, and collaboration around the world.
Minimising the impact of coronavirus outbreaks requires authorities and communities to work together. Data performs a key role here as well. In times of uncertainty, we hunger for information. When official sources are missing, we latch onto unofficial ones: rumours, misinformation, and hunches that can spread both panic and complacency.
We as individuals should be practising good information hygiene, just as we should good physical hygiene, especially if we are super-spreaders, such as journalists or influencers. If you’re passing something on, cite its source. If you’re reading something, think critically: check the source, and ask for the source if there isn’t one. Just like washing your hands, these are always good practices to follow, but particularly important in an emergency.
Equally, just as authorities are thinking about the production and distribution of equipment like testing kits and face masks, they need to be ensuring a good supply of data that helps everyone make informed decisions. Again, there are known good practices and sometimes it will require some hard tradeoffs.
Data about the outbreak at the international level helps individuals and businesses plan overseas travel, and helps authorities manage risks from people returning from different locations. The World Health Organisation is publishing daily bulletins, but as PDFs. These suit people who want to read text, but are less useful for automated processing that can generate alerts or provide alternative presentations. The European Centre for Disease Prevention and Control publishes both human-readable situation reports and a machine-readable spreadsheet. Researchers at the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University have described how they create their dashboard through a combination of automated and manual processes, and they make that data available through GitHub.
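To make the difference concrete, here is a minimal sketch in Python of the kind of automated alert that machine-readable data allows. The rows are made-up sample values, not real case figures, and the column names (`date`, `country`, `new_cases`), the `find_alerts` function and the doubling threshold are all assumptions for illustration, not the schema or method of any of the publishers mentioned above.

```python
import csv
import io

# Made-up sample rows for illustration only; these are not real case figures,
# and the column names are assumptions rather than any publisher's schema.
SAMPLE_CSV = """date,country,new_cases
2020-03-01,Exampleland,12
2020-03-02,Exampleland,30
2020-03-01,Otherland,5
2020-03-02,Otherland,6
"""


def find_alerts(csv_text, threshold=2.0):
    """Return (country, date) pairs where new cases grew by at least
    `threshold` times compared with that country's previous row.

    Assumes each country's rows appear in date order.
    """
    previous = {}  # country -> new_cases from the last row seen for it
    alerts = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = row["country"]
        cases = int(row["new_cases"])
        last = previous.get(country)
        # Skip the first row for a country and avoid comparing against zero.
        if last is not None and last > 0 and cases >= threshold * last:
            alerts.append((country, row["date"]))
        previous[country] = cases
    return alerts


print(find_alerts(SAMPLE_CSV))
# → [('Exampleland', '2020-03-02')]
```

Doing the same from a PDF bulletin would first mean extracting the figures from the page layout, which is exactly the step that machine-readable publication removes.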
In our lives and in our work, we need information to make other decisions too. Should I work from home or travel in to work? Should we cancel events? As further social distancing measures are taken, we may be looking for answers to questions like which schools or public buildings are closed, and which public transport is running. If we or our loved ones feel ill, we will want to know how quickly and how well we or they will be tested and treated. If we have to self-isolate, we will want to know where to get supplies with as little human contact as possible.
Governments should be planning for how to collect and distribute this kind of information, because a lack of authoritative sources will mean people turn to unofficial (and inaccurate) ones. But some of this requires businesses, such as retailers, transport providers and venue suppliers, to share data as well. And some could be crowd-sourced in well-managed and trustworthy ways, by communities.
Sometimes the data will tell a hard story. It might reveal a lack of testing kits, hospital beds or equipment. Authorities often feel an urge to hide this information to avoid a negative backlash. But if the reaction to the US Centers for Disease Control and Prevention removing figures from its dashboard, or to the UK saying it would only publish figures on a weekly basis, tells us anything, it’s that a lack of transparency in itself is a red flag. People will understand if health services are stretched; they will not be so forgiving if that is covered up.
Having said that, we should also be alert to and minimise the negative side-effects of making data available. Singapore has published a lot of details about every infected person, including their age, gender, workplace, where they have visited and how they had contact with other infected people. South Korea is sending text alerts containing similar information. This may be helping others be alert to their own symptoms, if they have visited the same places. But it is also reportedly leading to fewer customers for small businesses just because they were once visited by someone with coronavirus, and the public shaming of people thought to have been caught having affairs or engaging in illicit activities.
Setting aside the ethical problems of infringing people’s right to privacy, the distribution and use of individual case information has negative side effects. When disclosing information might damage ourselves or things we care about, we are likely to lie. If people who contract Covid-19 dissemble about their movements and contacts, officials will be less likely to be able to trace and contain further cases. This in turn will increase the spread of the disease.
The data infrastructure we build to support us through the Covid-19 outbreak could last a long time. We should make sure it is open, trustworthy and built responsibly.
Jeni Tennison is CEO of the Open Data Institute