Quality Data Drives Discovery: Open and Free Datasets Support AlphaGenome

The newly launched AlphaGenome model by Google DeepMind highlights how open, free, and high-quality datasets enable cutting-edge scientific advancements. Built using publicly available data, AlphaGenome demonstrates what becomes possible when researchers have access to reliable and reusable resources.

It is great to see the eQTL Catalogue’s data featured in AlphaGenome’s variant interpretation benchmarks, including gene expression, splicing, and polyadenylation quantitative trait loci (QTLs). This kind of data reuse underscores a key principle of FAIR data: the more a dataset is reused, the more valuable it becomes, both for the science it enables and as recognition of the effort researchers invest in creating, curating, and maintaining these resources.

In bioinformatics, real progress relies not just on having data, but on ensuring it is open, accessible, high-quality, and supported by robust, free-to-use tools. Services like the eQTL Catalogue, developed as a partnership between ELIXIR Estonia, EMBL-EBI, and Open Targets, ensure that well-structured data remains usable and impactful across domains and over time. Every reuse is a reminder that thoughtful data stewardship benefits the entire scientific community.

🔗 Learn more about AlphaGenome

📄 Read the preprint

🔍 Explore the eQTL Catalogue

🔍 Explore the eQTL Catalogue Browser