Intelligent ecosystem to improve the governance, the sharing, and the re-use of health data for rare cancers


IDEA4RC has been presented in the poster session of the 5th International Conference on Rare Diseases (RARE2023), held in Budapest on 30 November and 1 December 2023.

The conference gathers experts from across the world presenting ground-breaking and cutting-edge news on diagnosis, treatments and networking in Rare Diseases.

Among the new topics of RARE2023 are autoimmune disorders, palliative care in rare disease, surgical procedures in ultra-rare disorders, and newly approved innovative therapies for genetic and non-genetic rare diseases.

IDEA4RC aims to build a platform that fosters the sharing and re-use of health data on two families of rare cancers, head and neck tumours and sarcomas, with the ultimate objective of advancing research and improving care. Its framework can be expanded to other rare cancer families and other rare diseases.

You can download the poster PDF here.

Rare cancers need large and diverse datasets to advance research and care

Every year in Europe 650’000 people receive a rare cancer diagnosis. Taken together, these account for nearly 25% of all cancer diagnoses on the continent. However, there are nearly 200 different rare cancers, each diagnosed in fewer than 6 people per 100’000 each year. Each rare cancer therefore lacks sufficiently large and diverse datasets, which hinders the comprehensive analysis needed to understand disease progression, discover novel diagnostic and prognostic factors, and evaluate the effectiveness of treatments.
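The figures above can be checked with a little arithmetic. The sketch below is only an illustration: the function names, the population figure and the case count are hypothetical and do not come from IDEA4RC. It applies the "fewer than 6 per 100,000 per year" threshold and derives the total number of cancer diagnoses implied by the two quoted numbers, treating "nearly 25%" as exactly 25%.

```python
RARE_THRESHOLD_PER_100K = 6.0  # threshold quoted in the text above


def incidence_per_100k(new_cases: int, population: int) -> float:
    """Yearly new cases per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases * 100_000 / population


def is_rare(new_cases: int, population: int) -> bool:
    """True when the yearly incidence is strictly below the threshold."""
    return incidence_per_100k(new_cases, population) < RARE_THRESHOLD_PER_100K


# Made-up example: 9,000 new cases a year in a population of 450 million.
example_incidence = incidence_per_100k(9_000, 450_000_000)
example_is_rare = is_rare(9_000, 450_000_000)

# If 650,000 rare cancer diagnoses are about 25% of all cancer diagnoses,
# the implied yearly total is roughly 650,000 / 0.25.
implied_total = 650_000 / 0.25
```

Under these assumptions the example cancer has an incidence of 2 per 100,000 and counts as rare, and the implied total is about 2.6 million cancer diagnoses per year.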

Sharing and re-use of such data is currently hampered by the lack of interoperability across centers, and sometimes even between different data sources in the same center, and by the privacy and security requirements imposed by EU regulations on personal data. IDEA4RC aims to overcome these obstacles by developing a data ecosystem that implements data protection and privacy by design and by default and harmonizes data standards to ensure interoperability, with the ultimate objective of facilitating and speeding up the re-use of health data on rare cancers.

Federated Learning

IDEA4RC will adopt a federated learning approach, which makes it possible to collectively analyze multiple datasets without moving them from their original location. Each dataset will be processed locally inside a so-called secure processing environment, enabling the implementation of data protection principles through design and default settings. Clinicians and researchers will access the ecosystem through a virtual assistant, which will allow them to explore and assess the quality of the data, build cohorts and run analyses.
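The idea of analyzing data without moving it can be illustrated with a minimal sketch. This is not the IDEA4RC implementation: it shows the simplest federated pattern, in which each center computes aggregate statistics locally and only those aggregates are sent to a coordinator. The class, function names, center names and ages are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalSummary:
    """Aggregates that leave a center; no patient-level rows are included."""
    n: int
    total: float


def summarize_locally(ages: List[float]) -> LocalSummary:
    # Runs inside one center's (hypothetical) secure processing environment.
    return LocalSummary(n=len(ages), total=sum(ages))


def combine(summaries: List[LocalSummary]) -> float:
    # Runs at the coordinator, which only ever sees the summaries.
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no records across centers")
    return sum(s.total for s in summaries) / n


# Made-up ages at diagnosis held by three hypothetical centers.
site_data = {
    "center_a": [54.0, 61.0, 47.0],
    "center_b": [70.0, 66.0],
    "center_c": [58.0],
}

summaries = [summarize_locally(rows) for rows in site_data.values()]
pooled_mean = combine(summaries)
```

The pooled mean equals the mean that would be obtained by merging all records in one place, yet the individual records never leave their center. Training a model in a federated way follows the same pattern, with model updates exchanged instead of sums and counts.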

Natural Language Processing

IDEA4RC will leverage the power of machine learning to extract information from texts written in natural language, such as clinician notes and medical reports, which are currently underexploited. IDEA4RC researchers will train large language models to understand texts produced in the different languages spoken across the consortium. They will do so by annotating the existing texts according to the IDEA4RC data model, in order to merge these unstructured sources with the structured ones.

Testing the ecosystem

The ecosystem will be tested through pilot projects deployed in the 11 clinical centers of the European Reference Network for Rare Adult Solid Cancers (EURACAN) that belong to the consortium and are based in 8 EU member countries. The projects will focus on two rare cancer families, head and neck tumors and sarcomas, but the framework could be extended to other rare cancers as well as other rare diseases.

Co-creating the governance model

The data governance layer will implement the sharing preferences of each clinical center. These preferences are shaped by the values and positions of the different stakeholders that can use and benefit from the ecosystem. Understanding these evolving preferences is a challenge, especially in a dynamic legal landscape. To address this, the consortium will engage stakeholders in co-creation workshops, fostering discussions on implementation scenarios and adapting the governance model accordingly.