University of Nova Gorica Co-develops a New Machine-Learning Method to Analyse Complex Data
A group of researchers led by Uros Zivanovic (University of Trieste) and including Gabriella Contardo (SMASH fellow at the Center for Astrophysics and Cosmology, University of Nova Gorica) has developed a versatile new AI method for processing complex data, such as the data typically encountered in astronomy. They will present their work at NeurIPS, one of the most important machine learning conferences, which takes place on December 2-7 in San Diego, USA, and is expected to draw more than 10,000 machine learning experts. This new method offers exciting opportunities for analysing the upcoming data from the Vera Rubin Observatory, of which the University of Nova Gorica is an international partner.
Astronomical observatories like the Vera Rubin Observatory in Chile may observe the sky every night, but they do not necessarily observe the same regions of the sky at regular intervals. Bad weather can also prevent observations. From these observations, astronomers extract light curves, which show how an astronomical object's brightness changes over time and can indicate, for instance, a star flaring, exploding, or being consumed by a black hole. However, because the time intervals between observations are irregular, these light curves often contain gaps (missing data).
Researchers across many scientific fields who work with complex sequential or image data have often turned to modern AI methods such as Transformers, which were originally developed for language tasks (where they led to significant improvements in chatbot quality). These methods use attention mechanisms, which let the model learn which parts of the data are most relevant. However, they struggle with data like astronomical light curves, because they expect data to arrive at regular intervals, like words in a sentence. Previous solutions to this problem proposed specialized architectural modifications or different ways of providing the model with information about when each measurement was taken.
In their recent work, the group of researchers led by Uros Zivanovic proposes a new approach that combines two existing techniques: one that encodes time information flexibly (Rotary Position Embeddings), and another that trains a model by hiding parts of the data and learning to reconstruct them (Masked Autoencoder). The new method, named Rotary Masked Autoencoder (RoMAE), yields a model that is robust to irregularity and gaps in the data.
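The two ingredients named above can be illustrated with a small Python sketch. This is not the authors' implementation: the function names (`rotate_pairs`, `random_mask`), the toy observation times, the vectors `q` and `k`, and the 50% masking ratio are all assumptions chosen for illustration. The first part shows the key property of rotary embeddings, namely that the attention score between two measurements depends only on the time elapsed between them, so irregular timestamps can be used directly. The second part shows the masking step, in which a random subset of observations is hidden for the model to reconstruct.

```python
import math
import random


def rotate_pairs(vec, t, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to time `t`.

    This is the standard rotary-embedding rotation, applied here to a
    real-valued timestamp instead of an integer word position.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = t * base ** (-i / d)  # one rotation frequency per pair
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an unnormalised attention score."""
    return sum(x * y for x, y in zip(a, b))


def random_mask(n, mask_ratio, rng):
    """Return the sorted indices of the observations to hide."""
    n_hidden = int(round(n * mask_ratio))
    return sorted(rng.sample(range(n), n_hidden))


# Hypothetical, irregularly spaced observation times (in days).
times = [0.0, 0.7, 3.1, 3.4, 9.8, 15.2]

# Hypothetical query and key vectors for two observations.
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Two pairs of timestamps with the same separation (2.4 days) but
# different absolute times give the same score.
score_a = dot(rotate_pairs(q, 3.1), rotate_pairs(k, 0.7))
score_b = dot(rotate_pairs(q, 12.4), rotate_pairs(k, 10.0))

# Hide half of the observations; a masked autoencoder would be trained
# to reconstruct the hidden ones from the visible ones.
rng = random.Random(0)
hidden = random_mask(len(times), 0.5, rng)
visible = [i for i in range(len(times)) if i not in hidden]
```

In this sketch, `score_a` and `score_b` agree up to floating-point rounding, which is why gaps and uneven spacing in a light curve do not need any special handling at the attention step.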
The researchers demonstrate RoMAE's versatility across multiple data types (time series, audio, images) and various tasks. Notably, RoMAE outperforms specialized methods on a challenge that mimics the data the Vera Rubin Observatory will start producing at the beginning of 2026. The observatory will track and collect millions of brightness-changing events in the sky each night, far too many to analyse by hand, making novel approaches like RoMAE crucial.
“The new method opens really exciting perspectives for future astronomical research”, says Gabriella Contardo. “With researchers at the University of Nova Gorica's Center for Astrophysics and Cosmology and our collaborators, we are exploring the possibilities of using RoMAE for distinguishing different types of astronomical events and detecting new, previously unknown types of events that might be hiding in Vera Rubin Observatory data.”
Gabriella Contardo, co-author of RoMAE, is a postdoctoral fellow in the European project SMASH (Machine Learning for Science and Humanities), coordinated by the University of Nova Gorica. In the project, 50 postdoctoral researchers, together with their supervisors at five Slovenian research institutions, are using AI methods in their research in the fields of precision medicine, climate science, communication, particle physics, cosmology, and astrophysics.
List of other authors with their institutions:
Uros Zivanovic (1,2), Serafina Di Gioia (2,3), Andre Scaffidi (2), Martín de los Rios (2), and Roberto Trotta (2,4,5,6)
1: University of Trieste, Italy
2: Scuola Internazionale Superiore di Studi Avanzati (SISSA), Italy
3: International Centre for Theoretical Physics (ICTP), Italy
4: INFN National Institute for Nuclear Physics, Italy
5: ICSC, Centro Nazionale di Ricerca in High Performance Computing, Italy
6: Imperial College London, United Kingdom