Analysis of research data in health: opportunities within reach

9 Feb
Xavier Serra-Picamal

Data generation and storage are omnipresent nowadays. Their costs have fallen drastically, and the health sector is no exception. To illustrate this, it is worth looking at the following graph, created by the National Institutes of Health, which shows how the cost of sequencing a human genome has evolved:

Figure: Cost per genome

As can be seen, the cost of sequencing a genome has fallen dramatically since 2007. Having one’s own genome sequenced is now possible, and in the future it may become commonplace. Bearing in mind that a copy of the human genome is made up of approximately 3 billion base pairs (3 billion adenines, thymines, cytosines or guanines arranged sequentially in 23 chromosomes), it is easy to infer that, in this field too, the quantity of data generated in the coming years will be massive.
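To give a sense of scale, the inference above can be sketched as a back-of-the-envelope calculation. This is only an illustrative estimate, assuming the commonly cited figure of roughly 3 billion base pairs per genome copy and a minimal encoding of 2 bits per base. The function names and the cohort size are hypothetical, chosen for the example.

```python
# Rough, illustrative estimate only: the constants below are assumptions,
# not measurements taken from any sequencing project.
BASE_PAIRS_PER_GENOME = 3_000_000_000  # approximate size of one genome copy
BITS_PER_BASE = 2                      # four possible bases (A, C, G, T) fit in 2 bits


def genome_size_bytes(base_pairs: int = BASE_PAIRS_PER_GENOME,
                      bits_per_base: int = BITS_PER_BASE) -> int:
    """Minimum bytes needed to store one genome copy with a fixed-width encoding."""
    total_bits = base_pairs * bits_per_base
    return (total_bits + 7) // 8  # round up to whole bytes


def cohort_size_bytes(n_genomes: int) -> int:
    """Storage for n_genomes, ignoring compression, quality scores and raw reads."""
    return n_genomes * genome_size_bytes()


if __name__ == "__main__":
    one_genome = genome_size_bytes()
    print(f"One genome: about {one_genome / 1e6:.0f} MB")

    hypothetical_cohort = 1_000_000  # hypothetical number of sequenced people
    total = cohort_size_bytes(hypothetical_cohort)
    print(f"{hypothetical_cohort:,} genomes: about {total / 1e12:.0f} TB")
```

Under these assumptions a single genome takes on the order of 750 MB, and a million genomes on the order of 750 TB. Raw sequencing output, which includes repeated reads and quality information, is typically far larger than this lower bound.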

This trend is repeated in other areas of health care: electronic clinical records, medical imaging, primary care data and drug consumption data, among others, are collected and stored in registries that are generally structured and interlinked. The potential of this data for research aimed at better health care is notable: faster and more accurate diagnoses, improved therapeutic approaches and better management of the system.

To analyse the challenges and opportunities at the European level, the Directorate-General for Research and Innovation of the European Commission organised a work session in Luxembourg, in which representatives from AQuAS took part. The points discussed are gathered in the article Making sense of big data in health research: Towards an EU action plan, published in the open-access journal Genome Medicine. As the article explains, using this information to provide better healthcare is a challenge but also a great opportunity.

Making sense of big data in health research

Nevertheless, a great effort is required to transform this data into knowledge and specific actions. However much the costs of generating and storing data may drop, managing the information, interpreting it and generating knowledge from it need considerable investment and resources. This means having adequate information systems, as well as the economic and human resources, so that the data can be processed efficiently and the protection of individual rights guaranteed. In addition, the participation, commitment and effective communication of all the agents in the system are needed (including the scientific community, patients, citizens, the administration, and so on) to guarantee that this data is used efficiently and responsibly, and that it promotes high-quality research.

Catalonia, because of the size of its population, its integrated health system and the work done over many years, is well positioned to promote the reuse of health data for research. At an international level, comparable projects already exist and new ones are emerging with the goal of integrating and consolidating data from different sources, among them some very ambitious and attractive programmes. The PADRIS Programme, presented on 12 January, aims to centralise the data generated in health and make it available to researchers at research centres and universities in Catalonia, so as to provide better healthcare with maximum guarantees of security and privacy. The work to be done is considerable, and so are the resources needed. But the opportunities to provide better research and better healthcare are within reach.

Post written by Xavier Serra-Picamal, researcher at the Karolinska Institutet (Sweden).

* TERMCAT (the centre for terminology in the Catalan language) has recently dealt with the question of how to say data scientist in Catalan. The subject is very much a current issue!