Intelligent ecosystem to improve
the governance, the sharing,
and the re-use of health data for rare cancers


June 10, 2024


In this newsletter we update you on the fourth plenary meeting of the IDEA4RC consortium, which took place at the National Cancer Institute in Milan on 29 and 30 April. We also share with you a new animated video introducing IDEA4RC, its vision and its ultimate goals.

If you missed the previous issue, where we interviewed Vasiliki Tsiompanidou, legal researcher at ECCP and partner of IDEA4RC, about the European legal landscape on the re-use of health data, you can find it here.

By subscribing to the newsletter, you will receive bi-monthly updates on the project’s advancements. If you want to invite your friends to subscribe, send them this link.


Hot takes from the fourth
consortium meeting in Milan


IDEA4RC members gathered in Milan on 29 and 30 April for the fourth consortium meeting, hosted by the National Cancer Institute. It was an occasion for the various working groups to share updates on their most recent activities, and to validate the first version of the virtual assistant, the software component which will allow IDEA4RC users to explore the data, request a permit to access them and finally run their analyses.

The meeting was kicked off by Christina Kyriakopoulou, scientific officer at DG Research and Innovation, who stressed the importance of collaboration among the projects of the cluster “Innovative Tools for Electronic Health Records and patient registries”: AIDAVA, DataTools4Heart, eCREAM, IDEA4RC and RES-Q+. She also highlighted the possibility of sharing expertise with the QUANTUM project, started in January 2024, which aims to develop health data quality standards, an effort that IDEA4RC researchers are also undertaking.

Vasiliki Tsiompanidou, legal researcher at ECCP, summarised the work done so far on the data agreements necessary to run pilot projects. An advanced draft is under revision by the legal experts of the centres of expertise, the 11 clinical centres of the EURACAN network involved in IDEA4RC. The draft has been developed based on a survey run among the legal experts about three different scenarios in which the data processing could happen within the ecosystem. A summary of the survey results and a description of the scenarios have been included in deliverable “Pilot data governance”, a summary of which is available here. The centres of expertise are expected to send their final feedback by the end of June.

Claudia Egher, sociologist at Utrecht University, summarised the results of a survey run among IDEA4RC members about the governance model of the data ecosystem to be adopted after the end of the project. These results are also discussed at length in deliverable “Pilot data governance”. As a next step, a survey on data governance will be run among the legal representatives of the centres of expertise. This survey will gather insights into the rules and procedures of the various centres in order to understand how the high-level picture of the governance model that emerged from the first survey could be implemented.

Eugenio Gaeta and Franco Mercalli gave an update on the deployment of the IDEA4RC architecture and of the pilot projects, which will start later this year.

Ioanna Drympeta, researcher at CERTH, updated participants on the development of the Data Governance layer, the software component of the IDEA4RC ecosystem which will manage the data permit application phase. This layer will allow researchers interested in running analyses over IDEA4RC data to easily submit multiple applications to the centres whose data they wish to include in their study. Drympeta gave details about the privileges and responsibilities of the various profiles involved in this phase, both on the data holders' side (the centres of expertise) and on the data users' side (the research team). A chatbot trained on scientific publications concerning ethical and legal aspects of the use of data and AI algorithms will guide the researchers during this phase.

Unai Zulaika, engineer at Deusto University, reviewed all the steps that led to the formulation of the two IDEA4RC data models, one for sarcomas and the other for head and neck cancers. The process started with clinicians identifying the relevant variables for each rare cancer family. These variables were then converted into a set of entities related to each other through a diagram. The two resulting diagrams represent the IDEA4RC data models, one for each of the rare cancer families considered by the project.

Laura Lopez and Itziar Alonso, engineers at UPM, shared with the participants the first mock-ups of the virtual assistant which will allow researchers to interact with the IDEA4RC data ecosystem, from data discovery to data access application and data use. A lively discussion with the oncology researchers followed. It gave useful insights to tailor the virtual assistant to their needs and practices.

The meeting ended with a few demonstrations from technical partners. Frank Martin, software engineer at IKNL, showed how to use the suite of federated learning tools of Vantage6, which will be integrated into the IDEA4RC ecosystem. Soumitra Ghosh and Alberto Lavelli, researchers at FBK, showed the latest results of the large language model developed to process clinical texts, such as physicians' notes and pathology reports, with the aim of extracting data from them.


Meetings, results
and updates


FBK researchers are among the authors of a recent paper about a new open-source multilingual large language model for the medical domain, which also stemmed from their work within the IDEA4RC project. It was published in the Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, which was held in Turin from 20 to 25 May 2024. You can find the abstract and the full text here.


We are glad to announce the first publications acknowledging the support of IDEA4RC. You can find them here.


Claudia Egher took part in the annual conference of the European Forum for Studies on Policies for Research and Innovation, whose focus was “Governing Technology, Research, and Innovation for Better Worlds”. She talked about the work done within the IDEA4RC project, and you can find her slides here.


The next IDEA4RC consortium meeting will be held in Rome on 21-22 November 2024, hosted by Engineering Ingegneria Informatica.


What’s up in health
data sharing and reuse
in the EU


Earlier this month, an important change to the Italian Personal Data Protection Code was introduced, facilitating the re-use of health data for scientific research purposes. IDEA4RC contributed to this change through an open and ongoing dialogue with the Italian Data Protection Authority, started at the 2023 Privacy Symposium in Venice. Read more about this change and its implications here (in Italian).


In its manifesto ahead of the European Parliament elections, ESMO called for rare cancers to be prioritised with specific work streams within the research programme. Read the ESMO manifesto “An evidence-based approach to optimise Europe’s Beating Cancer Plan” here.


To mark the European Week Against Cancer (27-31 May), Euractiv magazine took the opportunity to highlight the issue of rare cancers in its weekly Health Brief. Catherine Feore, health editor at Euractiv, interviewed Jean-Yves Blay, general director of the Centre Léon Bérard and IDEA4RC partner. “When it comes to the particular challenges of diagnosis and treatment of rare cancers, there are clear benefits from cooperation at the EU level, and Blay stressed as a great example of how we are stronger when we act together.” Read the full article and the linked interview here.