Week 45, 2025

2511.01344v1

BALNet: Deep Learning-Based Detection and Measurement of Broad Absorption Lines in Quasar Spectra

Theme match 2/5

Yangyang Li, Zhijian Luo, Shaohua Zhang, Du Wang, Jianzhen Chen, Zhu Chen, Hubing Xiao, Chenggang Shu

First listed 2025-11-03 | Last updated 2025-11-03

Abstract

Broad absorption line (BAL) quasars serve as critical probes for understanding active galactic nucleus (AGN) outflows, black hole accretion, and cosmic evolution. To address the limitations of manual classification in large-scale spectroscopic surveys - where the number of quasar spectra is growing exponentially - we propose BALNet, a deep learning approach consisting of a one-dimensional convolutional neural network (1D-CNN) and bidirectional long short-term memory (Bi-LSTM) networks to automatically detect BAL troughs in quasar spectra. BALNet enables both the identification of BAL quasars and the measurement of their BAL troughs. We construct a simulated dataset for training and testing by combining non-BAL quasar spectra and BAL troughs, both derived from SDSS DR16 observations. Experimental results in the testing set show that: (1) BAL trough detection achieves 83.0% completeness, 90.7% purity, and an F1-score of 86.7%; (2) BAL quasar classification achieves 90.8% completeness and 94.4% purity; (3) the predicted BAL velocities agree closely with simulated ground truth labels, confirming BALNet's robustness and accuracy. When applied to the SDSS DR16 data within the redshift range 1.5<z<5.7, at least one BAL trough is detected in 20.4% of spectra. Notably, more than a quarter of these are newly identified sources with significant absorption, 8.8% correspond to redshifted systems, and some narrow/weak absorption features were missed. BALNet greatly improves the efficiency of large-scale BAL trough detection and enables more effective scientific analysis of quasar spectra.

Short digest

Presents BALNet, a 1D-CNN + Bi-LSTM model that detects and measures C IV BAL troughs directly along quasar spectra. It is trained and tested on mock spectra built by combining non-BAL quasar spectra with BAL troughs, both drawn from SDSS DR16. On the test set it recovers troughs with 83.0% completeness and 90.7% purity (F1 = 86.7%) and classifies BAL quasars at 90.8% completeness and 94.4% purity, with predicted BAL velocities closely matching the simulated labels. Applied to SDSS DR16 spectra at 1.5<z<5.7, BALNet finds at least one BAL trough in 20.4% of spectra; more than a quarter of these are newly identified sources with significant absorption, and 8.8% are redshifted systems, which adds inflow candidates and supports a less biased BAL census. Caveat: some narrow/weak absorption features are still missed.
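As a quick consistency check on the quoted detection metrics, F1 is the harmonic mean of purity (precision) and completeness (recall). The sketch below is illustrative only: the function name is hypothetical and nothing here comes from the BALNet code. The abstract quotes an F1 only for trough detection; the quasar-level value is simply the same formula applied to the quoted purity and completeness, not a number reported by the authors.

```python
def f1_from_purity_completeness(purity: float, completeness: float) -> float:
    """Harmonic mean of purity (precision) and completeness (recall)."""
    if purity + completeness == 0.0:
        return 0.0
    return 2.0 * purity * completeness / (purity + completeness)


# Trough detection: values quoted in the abstract.
trough_f1 = f1_from_purity_completeness(purity=0.907, completeness=0.830)

# BAL quasar classification: derived here, not quoted in the abstract.
quasar_f1 = f1_from_purity_completeness(purity=0.944, completeness=0.908)

print(f"trough F1 = {trough_f1:.3f}")  # 0.867, matching the quoted 86.7%
print(f"quasar F1 = {quasar_f1:.3f}")
```

The trough-level result reproduces the quoted 86.7%, so the three reported numbers are mutually consistent.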

Key figures to inspect

  • Figure 1: Compare observed vs simulated distributions of the number of C IV BAL troughs per spectrum to judge whether the mock set reproduces multi‑trough incidence and avoids overproducing simple single‑trough cases.
  • Figure 2: Inspect the mock‑construction pipeline and label vector; verify how real unabsorbed spectra are combined with randomly placed BAL regions and whether the setup can encode redshifted (inflow) as well as blueshifted features.
  • Figure 4: Examine how the 1165‑pixel input is downsampled to a 387‑element probability vector (kernel_size=7, stride=3) and whether this resolution sets a practical floor on detectable trough width—relevant to the stated misses of narrow/weak absorption.
  • Figure 3: Review the LSTM gating schematic to understand how sequential context along the spectrum is modeled, which underpins the reported agreement between predicted velocities and labels.
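The 1165 to 387 reduction mentioned for Figure 4 is what the standard output-length formula for a strided 1D convolution gives with kernel_size=7 and stride=3, assuming no padding and no dilation (both are assumptions here; the digest does not state them). The minimal sketch below uses hypothetical helper names. It also maps an output element back to the input pixels it covers, which helps when judging how narrow a trough the probability vector can localize.

```python
def conv1d_output_length(n_in: int, kernel_size: int, stride: int, padding: int = 0) -> int:
    """Output length of a 1D convolution with dilation 1."""
    return (n_in + 2 * padding - kernel_size) // stride + 1


def output_index_to_pixel_window(j: int, kernel_size: int = 7, stride: int = 3) -> tuple[int, int]:
    """Inclusive (first, last) input pixel indices seen by output element j,
    assuming a single unpadded convolution."""
    first = j * stride
    return first, first + kernel_size - 1


n_out = conv1d_output_length(1165, kernel_size=7, stride=3)
print(n_out)  # 387

print(output_index_to_pixel_window(0))          # (0, 6)
print(output_index_to_pixel_window(n_out - 1))  # (1158, 1164)
```

Under these assumptions, each output element advances by 3 input pixels and sees a 7-pixel window, and the last window ends exactly at the final pixel of the 1165-pixel input.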
