Center for Decoding the Universe (C4DU) Journal Club
Event Details:
Title: Decoding the Chemical Fossil Record: Machine Learning and Foundation Models for Near-Field Cosmology
Abstract: Flagship spectroscopic surveys such as DESI, LAMOST, SDSS, and PFS are accumulating an unprecedented dataset of stars spanning the Milky Way and the broader Local Group. In near-field cosmology, these ancient stars act as time capsules: their chemical abundances encode the star formation history and hierarchical assembly of our Galaxy. Maximizing the scientific yield of these heterogeneous datasets is a major challenge, as traditional stellar parameter pipelines often struggle at low-to-medium spectral resolution and suffer from domain mismatch across survey instruments. This creates a cross-domain transfer and label-scarce learning problem at astronomical scale.
In this talk, I will present data-driven machine learning frameworks designed to bridge the cross-survey divide. First, I will demonstrate how simple, parameter-efficient neural networks (multilayer perceptrons, MLPs) pre-trained on low-resolution data (LAMOST) can rapidly adapt to medium-resolution surveys (DESI) using few-shot transfer learning. This approach recovers the chemical bimodality of the Galactic thin and thick disks from DESI spectra, a separation that is smeared out in the classical DESI stellar parameter pipeline, and improves agreement with high-resolution reference labels.
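As a rough illustration of the few-shot transfer idea described above, the following sketch pre-trains a tiny MLP on a large synthetic "source survey" and then adapts it to a shifted "target survey" using only 20 labelled examples. Everything here is an assumption made for illustration and is not taken from the talk: the toy 8-pixel spectra, the gain/offset model of instrument mismatch, the function names, and the choice to freeze the hidden layer and refit only the output layer by ridge regression shrunk toward the pre-trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_spectra(n, gain=1.0, offset=0.0):
    """Toy 'spectra': 8 pixels whose line depth scales with a hidden label."""
    label = rng.uniform(-1.0, 1.0, size=n)
    template = np.linspace(0.5, 1.5, 8)            # fixed line-strength pattern
    flux = np.outer(label, template) + 0.05 * rng.normal(size=(n, 8))
    return gain * flux + offset, label             # gain/offset mimic a new instrument

def hidden(x, W1, b1):
    """Hidden-layer activations of the one-hidden-layer MLP."""
    return np.tanh(x @ W1 + b1)

def pretrain(x, y, n_hidden=16, lr=0.05, steps=500):
    """Full-batch gradient descent on 0.5 * mean squared error."""
    W1 = 0.3 * rng.normal(size=(x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = np.zeros(n_hidden)
    for _ in range(steps):
        h = hidden(x, W1, b1)
        err = h @ w2 - y
        grad_pre = np.outer(err, w2) * (1.0 - h**2)  # backprop through tanh
        w2 -= lr * h.T @ err / len(y)
        W1 -= lr * x.T @ grad_pre / len(y)
        b1 -= lr * grad_pre.mean(axis=0)
    return W1, b1, w2

def finetune_head(h, y, w_pre, lam=1.0):
    """Minimise ||h w - y||^2 + lam * ||w - w_pre||^2 in closed form."""
    A = h.T @ h + lam * np.eye(h.shape[1])
    return w_pre + np.linalg.solve(A, h.T @ (y - h @ w_pre))

# Pre-train on plentiful source data (the low-resolution survey in this analogy).
x_src, y_src = make_spectra(2000)
W1, b1, w2 = pretrain(x_src, y_src)

# Adapt with 20 labelled target spectra; the hidden layer stays frozen.
x_few, y_few = make_spectra(20, gain=1.3, offset=0.2)
h_few = hidden(x_few, W1, b1)
w2_ft = finetune_head(h_few, y_few, w2)

sse_before = float(np.sum((h_few @ w2 - y_few) ** 2))
sse_after = float(np.sum((h_few @ w2_ft - y_few) ** 2))
print(f"few-shot SSE before: {sse_before:.4f}, after: {sse_after:.4f}")
```

Shrinking toward the pre-trained weights means the fit on the few-shot set can never be worse than the pre-trained head on that same set. Held-out performance is not guaranteed and would need a separate validation sample.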
Next, I will introduce SpecCLIP, a spectral foundation model that uses contrastive learning with auxiliary reconstruction and cross-modal prediction decoders to align different spectral modalities (LAMOST low-resolution and Gaia XP spectra) into a unified embedding space while preserving modality-specific information. SpecCLIP delivers competitive performance across a broad range of stellar parameters and enables cross-survey spectral retrieval and prediction. I will also discuss how embeddings from SpecCLIP generalize to DESI fine-tuning, examining the regimes where foundation model representations offer advantages over direct spectral inputs and where simple pre-trained MLPs remain competitive. Together, these approaches offer a practical path toward extracting the chemical fossil record at scale, providing observational constraints on early galaxy assembly and chemical enrichment history, with broader implications for multi-modal transfer learning in any domain where instruments are mismatched and labels are scarce.
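To make the contrastive-alignment idea concrete, here is a minimal sketch of a symmetric CLIP-style (InfoNCE) loss between two sets of embeddings, plus a cosine-similarity retrieval step. It illustrates only the generic contrastive term. It does not include the auxiliary reconstruction or cross-modal prediction decoders mentioned above, and it is not SpecCLIP's actual implementation. The function names, the temperature value, and the random stand-in embeddings are all assumptions.

```python
import numpy as np

def l2_normalize(z):
    """Scale each row to unit length so dot products are cosine similarities."""
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_loss(z_a, z_b, temperature=0.07):
    """Symmetric InfoNCE; row i of z_a and row i of z_b describe the same star."""
    logits = l2_normalize(z_a) @ l2_normalize(z_b).T / temperature
    idx = np.arange(len(logits))
    loss_a_to_b = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_b_to_a = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_a_to_b + loss_b_to_a)

def retrieve(z_query, z_bank):
    """For each query, the index of the most similar bank embedding."""
    return np.argmax(l2_normalize(z_query) @ l2_normalize(z_bank).T, axis=1)

rng = np.random.default_rng(0)
z_a = rng.normal(size=(32, 16))                      # stand-in for modality A embeddings
z_b_aligned = z_a + 0.01 * rng.normal(size=(32, 16)) # well-aligned modality B
z_b_random = rng.normal(size=(32, 16))               # unaligned modality B

loss_aligned = clip_loss(z_a, z_b_aligned)
loss_random = clip_loss(z_a, z_b_random)
matches = retrieve(z_a, z_b_aligned)
print(f"aligned loss: {loss_aligned:.4f}, random loss: {loss_random:.4f}")
```

In a trained model the two embedding sets would come from modality-specific encoders. Minimizing this loss pulls matched pairs together and pushes mismatched pairs apart, which is what makes nearest-neighbour retrieval across modalities possible.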