IEEE International Workshop on
Machine Learning for Signal Processing (MLSP) 2025
August 31-September 3, Istanbul/Turkey
Signal Processing in the age of
Large Language Models

POSTER SESSION PROGRAM

September 1

Signal Decomposition and Estimation Methods

  • (R223-oral) Continuous-Time Signal Decomposition: An Implicit Neural Generalization of PCA and ICA
  • (R207-oral) Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis
  • (R101-oral) LANM: Learned Atomic Norm Minimization for Superfast Gridless Spectral Compressed Sensing
  • (R18-oral) prNet: Efficient and Robust Phase Retrieval via Stochastic Refinement
  • (R200-oral) Coordinate Ascent Neural Kalman-MLE for State Estimation
  • (R33) Efficient algorithms for the Hadamard decomposition
  • (ISPS14) Outlier-Resilient Model Fitting via Percentile Losses: Methods for General and Convex Residuals
  • (R189) Efficient Algorithms for Estimating the Parameters of Mixed Linear Regression Models
  • (R6) Hankel Surrogate for Model-Bridge Simulation Calibration
  • (R75) Nonlinear Matrix Decomposition with the Sigmoid function
  • (R197) TSNPC: Learning from partially observed data using tensor network structured probabilistic circuits
  • (R180) Online Frequency Estimation with Adaptive Locally Differentially Private Mechanisms

NOCASA Data Challenge

  • Non-native Children's Automatic Speech Assessment Challenge (NOCASA)
  • Automated Pronunciation Scoring of Child L2 Learners with Score Calibration for Imbalanced Distributions
  • (oral) Comparison of End-to-end Speech Assessment Models for the NOCASA 2025 Challenge

Audio and Speech Processing

  • (R139-oral) DDL: A Dataset for Drone Detection and Localization from Multi-Channel Audio and a Deep Uncertainty-Aware Framework
  • (R203-oral) Audio Prototypical Network for Controllable Music Recommendation
  • (R206-oral) Re-Bottleneck: Latent Re-Structuring for Neural Audio Autoencoders
  • (R95-oral) Input Conditioned Layer Dropping in Speech Foundation Models
  • (R13) AudioMAE++: learning better masked audio representations with SwiGLU FFNs
  • (R214) HuBERT-Derived SSL Features and ECAPA-TDNN Matching for Robust Audio Deep-Fake Detection
  • (R213) Efficient representation learning for music via likelihood factorisation of a variational autoencoder
  • (R198) Semi-Supervised Audio-Visual Action Recognition with Audio Source Localization Guided Mixup
  • (R128) SAGE: Spliced-Audio Generated Data for Enhancing Foundational Models in Low-Resource Arabic-English Code-Switched Speech Recognition
  • (R120) From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems
  • (R2) Whisper speaker identification: leveraging pre-trained multilingual transformers for robust speaker embeddings
  • (R38) Prototypical contrastive learning for improved few shot audio classification
  • (R71) Tiny Noise-Robust Voice Activity Detector for Voice Assistants
  • (R90) Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss
  • (R168) An analysis of associations between maternal vocalizations and infant stress recovery using speech emotion recognition models
  • (S33) Child speech assessment through large language model speech synthesis: Preliminary results

The Sampling-Assisted Pathloss Radio Map Prediction Challenge

  • Efficient Indoor Radio Map Prediction with Improved Transformers and Active Sampling Strategies
  • IRM-NET: An Enhanced Attention Networks for Indoor Radio Map Estimation
  • Sparse-Guided RadioUNet with Adaptive Sampling for the MLSP 2025 Sampling-Assisted Pathloss Radio Map Prediction Data Competition
  • Radio Map Prediction via Neural Networks with Ground Truth Shortcuts and Selective Sampling
  • U-Net Based Indoor Radio Map Prediction under Sparse Sampling
  • SAIPP-Net: A Sampling-Assisted Indoor Pathloss Prediction Method for Wireless Communication Systems
  • U-Net for Indoor Pathloss Prediction from Sparse Measurements with Physics-Informed Features
  • The Sampling-Assisted Pathloss Radio Map Prediction Competition

LLMs and Deep Learning Architectures

  • (R74) Convolutional spiking-based GRU cell for spatio-temporal data
  • (R68) Data Aware Differentiable Neural Architecture Search for Tiny Keyword Spotting Applications
  • (R46) An Alternating Algorithm for Neural Collapse in Deep Classifier Neural Network with Arbitrary Number of Classes
  • (R39) Neural-ANOVA: Analytical Model Decomposition using Automatic Integration
  • (R70) Asymptotic Study of In-context Learning with Random Transformers through Equivalent Models
  • (R85) Tackling Distribution Shift in LLM via KILO: Knowledge-Instructed Learning for Continual Adaptation
  • (R122) Post-inference guided transformer for anomaly interval localization in multivariate time series
  • (R177) Semantic Chunking and Chain-of-Thought Reasoning for RAG-based Document Processing

Computer Vision Applications

  • (R56) Efficient license plate recognition via pseudo-labeled supervision with Grounding DINO and YOLOv8
  • (R81) Pose-guided focal loss for enhancing vision transformers in continuous sign language recognition
  • (R8) Estimation of NDVI from UAV RGB imagery using deep learning models
  • (R138) Deep learning based polyculture structural phenotyping
  • (R146) 3D face morph generation using geometry-aware template inversion
  • (R147) Generating synthetic face recognition datasets using Brownian identity diffusion and a foundation model
  • (R155) Post-training quantization for Vision Mamba with k-scaled quantization and reparameterization
  • (R164) Attention and edge-aware band selection for efficient hyperspectral classification of burned vegetation
  • (R211) Bridging discrete and continuous: A multimodal strategy for complex emotion detection
  • (R210) MCFF-Det: Multispectral coarse-to-fine fusion for object detection
  • (R208) Efficient mouth alignment for visual speech recognition
  • (R171) Long range constraints for neural texture synthesis using sliced Wasserstein loss

September 2

ML for Neuroscience

  • (ISPS2) Latent representation learning for multimodal brain activity translation
  • (R42) UniPhyNet: A unified network for multimodal physiological raw signal classification
  • (R49) MicroWaveNet: Lightweight CBAM-augmented wavelet-attentive networks for robust EEG denoising
  • (R87) Towards generalizable learning models for EEG-based identification of pain perception
  • (R203) DECIFRA: Deep extraction of causally informed features via restricted architecture
  • (R178) Uncovering k-way balanced consensus communities in signed multilayer brain networks
  • (R170) Hypergraph overlapping community detection for brain networks

Learning Theory, Optimization and Algorithms

  • (R104-oral) Fast and Robust Training of Deep Learning Models with Multiplicative Adagrad
  • (R73-oral) Information Entropy-Based Scheduling for Communication-Efficient Decentralized Learning
  • (R80-oral) Model Recycling Framework for Multi-Source Data-Free Supervised Transfer Learning
  • (R162-oral) Meta-Tree: Bayesian Approach to Avoid Overfitting in Decision Trees and Analysis on the Application to Boosting
  • (R114) Model-Agnostic Uncertainty Calibration for Noisy Constraint Modeling in Bainitic Steel Optimization
  • (R173) Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals
  • (ISPS5) Attention Augmented Structure-centric Bias Mitigation with Feature Disentanglement
  • (R136-oral) APA: Domain Generalization Using Frequency Based Augmentation
  • (R150-oral) Perturbation-based Multiview Graph Learning with Consensus Graph
  • (R3) Shapley-Based Data Valuation with Mutual Information: A Key to Modified K-Nearest Neighbors
  • (R72) Toward Sustainable Continual Learning: Detection and Knowledge Repurposing for Reoccurring Tasks
  • (R105) Online Topology Identification of Higher-Order Cell Structures
  • (R215) Benefits of Online Tilted Empirical Risk Minimization: A Case Study of Outlier Detection and Robust Regression
  • (R184) EMORF-II: Adaptive EM-Based Outlier-Robust Filtering with Correlated Measurement Noise

Special Session, Applications of AI in the Analysis of Cultural and Artistic Heritage

  • (S26-oral) Solving Jigsaw Puzzles in the Wild: Human-Guided Reconstruction of Cultural Heritage Fragments
  • (S20) The Cow of Rembrandt - Analyzing Artistic Prompt Interpretation in Text-to-Image Models
  • (S11) Speaking Images: A Novel Framework for the Automated Self-Description of Artworks
  • (S6) Experimenting Active and Sequential Learning in a Medieval Music Manuscript
  • (S25) Deep Learning for Fine-Grained Classification of Montelupo Majolica: Benchmarking and Explainability
  • (S30) DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art
  • (S31) Restoration and Enhancement of Historical Manuscript Images Using Diffusion Model
  • (S10) Multimodal Artwork Topic Modeling via Fine-Tuned CLIP and Knowledge-Driven Prompts

Special Session, LEAP: Low-Energy AI For Edge Learning and Processing

  • (S32-oral) Improving Communication-Efficiency for Decentralized Federated Clustering
  • (S1) Unseen Speaker and Language Adaptation for Lightweight Text-To-Speech with Adapters
  • (S2) Utilizing Dynamic Sparsity on Pretrained DETR
  • (S7) Neural Successive Cancellation Decoder for Polar Codes Using Analog In-Memory Computing with Memristors
  • (S19) Self-Supervised Learning at the Edge: The Cost of Labeling
  • (S21) Energy-Information Trade-Off in Self-Directed Channel Memristors

Special Session, Large Vision Language Models (LVLMs) and their Application to Document Understanding

  • (S22-oral) Trust the Model: Compact VLMs as In-Context Judges for Image-Text Data Quality
  • (S4) Multi-Agent Interactive Question Generation Framework for Long Document Understanding
  • (S14) Enhanced Arabic Text Retrieval with Attentive Relevance Scoring
  • (S28) Learning or Cheating? Assessing Data Contamination in Large Vision-Language Models
  • (S29) ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models

Applications in Wireless, Radio and Radar Communications

  • (R11) Few-Shot Radar Signal Recognition through Self-Supervised Learning and Radio Frequency Domain Adaptation
  • (ISPS9) Signal Processing Challenges in Automotive Radar
  • (ISPS8) Advancing Single-Snapshot DOA Estimation with Siamese Neural Networks for Sparse Linear Arrays
  • (ISPS7) Advancing High-Resolution and Efficient Automotive Radar Imaging Through Domain-Informed 1D Deep Learning
  • (R50-oral) RadioTrace: Bridging Diffusion Priors and RSS Measurements for Accurate Radio Map Estimation
  • (R62) Robust Imputation SwinLSTM for Spectrum Map Prediction of Incomplete Data
  • (R127) Online Gaussian Process for Dynamic Radio Map Updating
  • (R10) Signal Prediction for Loss Mitigation in Tactile Internet: A Leader-Follower Game-Theoretic Approach
  • (R154) Benchmarking Transfer Learning in Passive Sonar: An Evaluation Study
  • (R219) Robust and Efficient Kernel-Based Digital Self-Interference Cancellation Using A Priori Knowledge in Full-Duplex Transceivers

Diffusion, Generative and Representation Learning

  • (R37) MPRDiff: Mixed Precision Restorative Diffusion Model with Incoherence Processing
  • (R14) Backdoor Inversion in Neural-Activation Space
  • (R22) Contrastive Disentanglement Learning for Empathetic Generation
  • (R51) Compression Beyond Pixels: Semantic Compression with Multimodal Foundation Models
  • (R21) Diffusion-Based Connectionist Temporal Classification
  • (R5) Foundation Model-Aided Video Semantic Communication: Framework Design and Prototype Validation
  • (R4) FreNBRDF: A Frequency-Rectified Neural Material Representation
  • (R95) Input Conditioned Layer Dropping in Speech Foundation Models
  • (R204-oral) Ptychographic Image Reconstruction from Limited Data via Score-Based Diffusion Models with Physics-Guidance
  • (R58) Stability and Performance Analysis of Diffusion Learning for Two-Network Competing Problems

September 3

ML for Medical Applications

  • (R137-oral) Cycle-Consistent Diffusion Model with Vessel-Aware Attention for Endoscopic Image Translation
  • (ISPS3) DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images
  • (ISPS1) ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
  • (R24) SVD Based Least Squares For X-Ray Pneumonia Classification Using Deep Features
  • (R31-oral) RSR-NF: Neural Field Regularization by Static Restoration Priors for Computed Dynamic Imaging
  • (R182-oral) Closing the Gap in Multimodal Medical Representation Alignment
  • (R7) Adaptable Non-parametric Approach for Speech-based Symptom Assessment: Isolating Private Medical Data in a Retrieval Datastore
  • (R15) "Digital Washing" of Semen Time-lapse Images
  • (R148) StrokeVision-Bench: A Multimodal Video and 2D Pose Benchmark for Tracking Stroke Recovery
  • (R163) Does Language Matter for Early Detection of Parkinson's Disease from Speech?

Human Activity Recognition, Physiological Signals & Wearables

  • (ISPS13) OSR: Toward Developing Efficient Federated Learning-based Human Activity Recognition using Optimal Server Representations
  • (R41) Compressed and Lightweight CNN for Real-Time Parkinson's Tremor Detection from Wearable IMU Data
  • (R142) Subject Invariant Contrastive Learning for Human Activity Recognition
  • (R98) Graph Structure Learning with Local Connectivity Refinement for Improved Physiological Emotion Recognition
  • (R188) GEMS: Group Emotion Profiling through Multimodal Situational Understanding
  • (ISPS11) Interbeat Interval Filtering
  • (R20) Classification Filtering
  • (R27) Embedded Inter-Subject Variability in Adversarial Learning for Inertial Sensor-Based Human Activity Recognition

Decision Making, Bandits, and Recommendation Systems

  • (ISPS4) A Distillation-based Future-aware Graph Neural Network for Stock Trend Prediction
  • (R156) Learning from Multiple Noisily Optimal Demonstrators in Stochastic Multi-Armed Bandits
  • (R194) Networked Contextual Bandits with Anomaly-Aware Learning
  • (R193) State Prediction for Offline Reinforcement Learning via Sequence-to-Sequence Modeling
  • (R190) Poisson-based Modeling and Curvature-aware Optimization for Neural Collaborative Filtering in Recommendation Systems
  • (R157) Integrating Adaptive Prediction with an Optimization-based Methodology for Data-Driven Efficiency Evaluation in Education
  • (ISPS6) What Does an Audio Deepfake Detector Focus on? A Study in the Time Domain
  • (R141) CollabPersona: A Framework for Collaborative Decision Analysis in Persona Driven LLM-based Multi-Agent Systems

Federated & Decentralized Learning

  • (ISPS10) Federated Domain Generalization with Label Smoothing and Balanced Decentralized Training
  • (ISPS12) Enhancing Federated Learning Convergence With Dynamic Data Queue and Data-Entropy-Driven Participant Selection
  • (R222) Provable Reduction in Communication Rounds for Non-Smooth Convex Federated Learning
  • (R111) Joint Graph Estimation and Signal Restoration for Robust Federated Learning
  • (R107) Memory-Efficient Correlated Noise for Locally Differentially Private Momentum in Distributed Learning
  • (R65) DeMem: Privacy-Enhanced Robust Adversarial Learning via De-Memorization

Special Session, Decoding the Brain Time Series

  • (S12-oral) Toward a Gaze-Independent Brain-Computer Interface Using the Code-Modulated Visual Evoked Potentials
  • (S3S) Assessing the Capabilities of Large Brainwave Foundation Models
  • (S5) Uncertainty Quantification for Motor Imagery BCI - Machine Learning vs. Deep Learning
  • (S8) On the Role of Low-Level Visual Features in EEG-Based Image Reconstruction
  • (S9) Deep Learning of Mesoscale Cortical Dynamics for Real-Time Classification of Forelimb Movement in Mice
  • (S16) Interpretability of Riemannian Tools Used in Brain-Computer Interfaces
  • (S18) AbsoluteNet: A Deep Learning Neural Network to Classify Cerebral Hemodynamic Responses of Auditory Processing
  • (S23) Riemannian Fusions of EEG-Based Features for Motor Imagery Detection under Propofol Sedation
  • (S27) Evaluating Manifold Alignment of Motor Imagery for Transfer Learning in EEG-Based BCIs

Data Competition, VEELA - Vessel Extraction and Extrication for Liver Analysis

  • Hepatic Vessel Segmentation and Classification in CTA Images Using NNU-Net with Centerline Regression
  • Hybrid Boundary Sensitive Tversky 3D U-Net for Liver Vessel Segmentation
  • Veela Challenge - Vessel Extraction and Extrication for Liver Analysis