Intelligent ecosystem to improve the governance, the sharing, and the re-use of health data for rare cancers

Hot takes from the third consortium meeting in Madrid

The third IDEA4RC consortium meeting was held at the Universidad Politécnica de Madrid on 22-23 November 2023, hosted by the Life Supporting Technologies lab, where many members of the consortium carry out their research activities.

“The second year of the project is for health professionals and providers”, said project coordinator Annalisa Trama in her opening remarks. “We really need your contribution to achieve the milestones planned for the end of this second year.”

The three milestones for the second year are:

  • Infrastructure deployment. The 11 clinical centers of the EURACAN reference network involved in the consortium will test the capsule environment with synthetic data.
  • Data ingestion. Each clinical center will need to locate the variables included in the common data model (see deliverable “Metadata Taxonomy” here) inside its data warehouse and identify which are already structured and which instead require Natural Language Processing (NLP) of free text, such as medical notes and pathology reports.
  • Governance and legal framework. Who can access the data? How will data be accessed, depending on the type of user and the purpose of their research project? In addition, the legal framework will be set up, taking into account the approaches of the clinical centers as well as the legal landscape of the European Union regarding the secondary use of health data. The data governance layer, which will regulate access to the data stored in the capsules, will be designed on the basis of the governance approach and the legal framework that the consortium agrees upon. The data governance layer should transform a paper-based approach into a dynamic and fluid process for requesting and granting data permits.

To achieve infrastructure deployment and data ingestion, three working groups have been established: one for architecture, one for activities related to NLP training and testing, and a third for pilot deployment, which functions as a coordination group involving both healthcare professionals and hospital data managers, in addition to the consortium’s technical partners. The working group on pilot deployment meets once a month and aims to ensure that all the clinical centers are kept up to date and on track with the proposed timeline.

The second year will also see the development of the first specification for the virtual assistant, which will serve as the interface between users and the platform. To this end, partners from UPM have conducted focus groups and interviews to validate the functionalities and the intended users of the virtual assistant. Their results were shared and discussed with clinicians and researchers during the consortium meeting. The first mockup of the assistant is expected to be ready by the end of January 2024.

Scenario validation workshop

The second co-creation workshop was held in Madrid at the end of the consortium meeting, on 23 November. Claudia Egher from Utrecht University led the workshop, which involved some 50 participants from all consortium partners.

The aim of the workshop was to discuss the three implementation scenarios identified for the IDEA4RC data ecosystem and outlined in deliverable “Data ecosystem value analysis and scenarios” (you can read a summary of the deliverable here).

For each scenario, participants were asked to discuss in pairs the desirable end situation and the societal values that the scenario would sustain. The insights emerging from this workshop will inform the debate on the governance approach that should be adopted for the IDEA4RC data ecosystem.

In the first scenario, IDEA4RC will facilitate research on rare cancers. The values participants considered important here are researchers’ autonomy, the potential impact on the design of clinical trials, and the reproducibility of studies.

In the second scenario, IDEA4RC would also offer support for clinical decision making. On this point, participants highlighted the implications of such a system in terms of equitable access to care and better quality of care, which are especially important for rare cancer patients. Participants also highlighted the importance of building reliable quality indicators for data generated by NLP models within the IDEA4RC ecosystem in order to generate trust among researchers and clinicians.

In the third scenario, IDEA4RC users could also include patients, public health agencies and regulatory bodies. Participants pointed out that the interface for patients’ access to the ecosystem should be different from that of clinicians and researchers and very carefully designed.