POSTER SESSION PROGRAM
September 1
Signal Decomposition and Estimation Methods
- (R223-oral) Continuous-Time Signal Decomposition: An Implicit Neural Generalization of PCA and ICA
- (R207-oral) Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis
- (R101-oral) LANM: Learned Atomic Norm Minimization for Superfast Gridless Spectral Compressed Sensing
- (R18-oral) prNet: Efficient and Robust Phase Retrieval via Stochastic Refinement
- (R200-oral) Coordinate Ascent Neural Kalman-MLE for State Estimation
- (R33) Efficient algorithms for the Hadamard decomposition
- (ISPS14) Outlier-Resilient Model Fitting via Percentile Losses: Methods for General and Convex Residuals
- (R189) Efficient Algorithms for Estimating the Parameters of Mixed Linear Regression Models
- (R6) Hankel Surrogate for Model-Bridge Simulation Calibration
- (R75) Nonlinear Matrix Decomposition with the Sigmoid function
- (R197) TSNPC: Learning from partially observed data using tensor network structured probabilistic circuits
- (R180) Online Frequency Estimation with Adaptive Locally Differentially Private Mechanisms
NOCASA Data Challenge
- Non-native Children's Automatic Speech Assessment Challenge (NOCASA)
- Automated Pronunciation Scoring of Child L2 Learners with Score Calibration for Imbalanced Distributions
- (oral) Comparison of End-to-end Speech Assessment Models for the NOCASA 2025 Challenge
Audio and Speech Processing
- (R139-oral) DDL: A Dataset for Drone Detection and Localization from Multi-Channel Audio and a Deep Uncertainty-Aware Framework
- (R203-oral) Audio Prototypical Network for Controllable Music Recommendation
- (R206-oral) Re-Bottleneck: Latent Re-Structuring for Neural Audio Autoencoders
- (R95-oral) Input Conditioned Layer Dropping in Speech Foundation Models
- (R13) AudioMAE++: learning better masked audio representations with SwiGLU FFNs
- (R214) HuBERT-Derived SSL Features and ECAPA-TDNN Matching for Robust Audio Deep-Fake Detection
- (R213) Efficient representation learning for music via likelihood factorisation of a variational autoencoder
- (R198) Semi-Supervised Audio-Visual Action Recognition with Audio Source Localization Guided Mixup
- (R128) SAGE: Spliced-Audio Generated Data for Enhancing Foundational Models in Low-Resource Arabic-English Code-Switched Speech Recognition
- (R120) From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems
- (R2) Whisper speaker identification: leveraging pre-trained multilingual transformers for robust speaker embeddings
- (R38) Prototypical contrastive learning for improved few shot audio classification
- (R71) Tiny Noise-Robust Voice Activity Detector for Voice Assistants
- (R90) Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss
- (R168) An analysis of associations between maternal vocalizations and infant stress recovery using speech emotion recognition models
- (S33) Child speech assessment through large language model speech synthesis: Preliminary results
The Sampling-Assisted Pathloss Radio Map Prediction Challenge
- Efficient Indoor Radio Map Prediction with Improved Transformers and Active Sampling Strategies
- IRM-NET: An Enhanced Attention Networks for Indoor Radio Map Estimation
- Sparse-Guided RadioUNet with Adaptive Sampling for the MLSP 2025 Sampling-Assisted Pathloss Radio Map Prediction Data Competition
- Radio Map Prediction via Neural Networks with Ground Truth Shortcuts and Selective Sampling
- U-Net Based Indoor Radio Map Prediction under Sparse Sampling
- SAIPP-Net: A Sampling-Assisted Indoor Pathloss Prediction Method for Wireless Communication Systems
- U-Net for Indoor Pathloss Prediction from Sparse Measurements with Physics-Informed Features
- The Sampling-Assisted Pathloss Radio Map Prediction Competition
LLMs and Deep Learning Architectures
- (R74) Convolutional spiking-based GRU cell for spatio-temporal data
- (R68) Data Aware Differentiable Neural Architecture Search for Tiny Keyword Spotting Applications
- (R46) An Alternating Algorithm for Neural Collapse in Deep Classifier Neural Network with Arbitrary Number of Classes
- (R39) Neural-ANOVA: Analytical Model Decomposition using Automatic Integration
- (R70) Asymptotic Study of In-context Learning with Random Transformers through Equivalent Models
- (R85) Tackling Distribution Shift in LLM via KILO: Knowledge-Instructed Learning for Continual Adaptation
- (R122) Post-inference guided transformer for anomaly interval localization in multivariate time series
- (R177) Semantic Chunking and Chain-of-Thought Reasoning for RAG-based Document Processing
Computer Vision Applications
- (R56) Efficient license plate recognition via pseudo-labeled supervision with Grounding DINO and YOLOv8
- (R81) Pose-guided focal loss for enhancing vision transformers in continuous sign language recognition
- (R8) Estimation of NDVI from UAV RGB imagery using deep learning models
- (R138) Deep learning based polyculture structural phenotyping
- (R146) 3D face morph generation using geometry-aware template inversion
- (R147) Generating synthetic face recognition datasets using Brownian identity diffusion and a foundation model
- (R155) Post-training quantization for Vision Mamba with k-scaled quantization and reparameterization
- (R164) Attention and edge-aware band selection for efficient hyperspectral classification of burned vegetation
- (R211) Bridging discrete and continuous: A multimodal strategy for complex emotion detection
- (R210) MCFF-Det: Multispectral coarse-to-fine fusion for object detection
- (R208) Efficient mouth alignment for visual speech recognition
- (R171) Long range constraints for neural texture synthesis using sliced Wasserstein loss
September 2
ML for Neuroscience
- (ISPS2) Latent representation learning for multimodal brain activity translation
- (R42) UniPhyNet: A unified network for multimodal physiological raw signal classification
- (R49) MicroWaveNet: Lightweight CBAM-augmented wavelet-attentive networks for robust EEG denoising
- (R87) Towards generalizable learning models for EEG-based identification of pain perception
- (R203) DECIFRA: Deep extraction of causally informed features via restricted architecture
- (R178) Uncovering k-way balanced consensus communities in signed multilayer brain networks
- (R170) Hypergraph overlapping community detection for brain networks
Learning Theory, Optimization and Algorithms
- (R104-oral) Fast and Robust Training of Deep Learning Models with Multiplicative Adagrad
- (R73-oral) Information Entropy-Based Scheduling for Communication-Efficient Decentralized Learning
- (R80-oral) Model Recycling Framework for Multi-Source Data-Free Supervised Transfer Learning
- (R162-oral) Meta-Tree: Bayesian Approach to Avoid Overfitting in Decision Trees and Analysis on the Application to Boosting
- (R114) Model-Agnostic Uncertainty Calibration for Noisy Constraint Modeling in Bainitic Steel Optimization
- (R173) Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals
- (ISPS5) Attention Augmented Structure-centric Bias Mitigation with Feature Disentanglement
- (R136-oral) APA: Domain Generalization Using Frequency Based Augmentation
- (R150-oral) Perturbation-based Multiview Graph Learning with Consensus Graph
- (R3) Shapley-Based Data Valuation with Mutual Information: A Key to Modified K-Nearest Neighbors
- (R72) Toward Sustainable Continual Learning: Detection and Knowledge Repurposing for Reoccurring Tasks
- (R105) Online Topology Identification of Higher-Order Cell Structures
- (R215) Benefits of Online Tilted Empirical Risk Minimization: A Case Study of Outlier Detection and Robust Regression
- (R184) EMORF-II: Adaptive EM-Based Outlier-Robust Filtering with Correlated Measurement Noise
Special Session, Applications of AI in the Analysis of Cultural and Artistic Heritage
- (S26-oral) Solving Jigsaw Puzzles in the Wild: Human-Guided Reconstruction of Cultural Heritage Fragments
- (S20) The Cow of Rembrandt - Analyzing Artistic Prompt Interpretation in Text-to-Image Models
- (S11) Speaking Images: A Novel Framework for the Automated Self-Description of Artworks
- (S6) Experimenting Active and Sequential Learning in a Medieval Music Manuscript
- (S25) Deep Learning for Fine-Grained Classification of Montelupo Majolica: Benchmarking and Explainability
- (S30) DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art
- (S31) Restoration and Enhancement of Historical Manuscript Images Using Diffusion Model
- (S10) Multimodal Artwork Topic Modeling via Fine-Tuned CLIP and Knowledge-Driven Prompts
Special Session, LEAP: Low-Energy AI For Edge Learning and Processing
- (S32-oral) Improving Communication-Efficiency for Decentralized Federated Clustering
- (S1) Unseen Speaker and Language Adaptation for Lightweight Text-To-Speech with Adapters
- (S2) Utilizing Dynamic Sparsity on Pretrained DETR
- (S7) Neural Successive Cancellation Decoder for Polar Codes Using Analog In-Memory Computing with Memristors
- (S19) Self-Supervised Learning at the Edge: The Cost of Labeling
- (S21) Energy-Information Trade-Off in Self-Directed Channel Memristors
Special Session, Large Vision Language Models (LVLMs) and their Application to Document Understanding
- (S22-oral) Trust the Model: Compact VLMs as In-Context Judges for Image-Text Data Quality
- (S4) Multi-Agent Interactive Question Generation Framework for Long Document Understanding
- (S14) Enhanced Arabic Text Retrieval with Attentive Relevance Scoring
- (S28) Learning or Cheating? Assessing Data Contamination in Large Vision-Language Models
- (S29) ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models
Applications in Wireless, Radio and Radar Communications
- (R11) Few-Shot Radar Signal Recognition through Self-Supervised Learning and Radio Frequency Domain Adaptation
- (ISPS9) Signal Processing Challenges in Automotive Radar
- (ISPS8) Advancing Single-Snapshot DOA Estimation with Siamese Neural Networks for Sparse Linear Arrays
- (ISPS7) Advancing High-Resolution and Efficient Automotive Radar Imaging Through Domain-Informed 1D Deep Learning
- (R50-oral) RadioTrace: Bridging Diffusion Priors and RSS Measurements for Accurate Radio Map Estimation
- (R62) Robust Imputation SwinLSTM for Spectrum Map Prediction of Incomplete Data
- (R127) Online Gaussian Process for Dynamic Radio Map Updating
- (R10) Signal Prediction for Loss Mitigation in Tactile Internet: A Leader-Follower Game-Theoretic Approach
- (R154) Benchmarking Transfer Learning in Passive Sonar: An Evaluation Study
- (R219) Robust and Efficient Kernel-Based Digital Self-Interference Cancellation Using A Priori Knowledge in Full-Duplex Transceivers
Diffusion, Generative and Representation Learning
- (R37) MPRDiff: Mixed Precision Restorative Diffusion Model with Incoherence Processing
- (R14) Backdoor Inversion in Neural-Activation Space
- (R22) Contrastive Disentanglement Learning for Empathetic Generation
- (R51) Compression Beyond Pixels: Semantic Compression with Multimodal Foundation Models
- (R21) Diffusion-Based Connectionist Temporal Classification
- (R5) Foundation Model-Aided Video Semantic Communication: Framework Design and Prototype Validation
- (R4) FreNBRDF: A Frequency-Rectified Neural Material Representation
- (R95) Input Conditioned Layer Dropping in Speech Foundation Models
- (R204-oral) Ptychographic Image Reconstruction from Limited Data via Score-Based Diffusion Models with Physics-Guidance
- (R58) Stability and Performance Analysis of Diffusion Learning for Two-Network Competing Problems
September 3
ML for Medical Applications
- (R137-oral) Cycle-Consistent Diffusion Model with Vessel-Aware Attention for Endoscopic Image Translation
- (ISPS3) DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images
- (ISPS1) ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images
- (R24) SVD Based Least Squares For X-Ray Pneumonia Classification Using Deep Features
- (R31-oral) RSR-NF: Neural Field Regularization by Static Restoration Priors for Computed Dynamic Imaging
- (R182-oral) Closing the Gap in Multimodal Medical Representation Alignment
- (R7) Adaptable Non-parametric Approach for Speech-based Symptom Assessment: Isolating Private Medical Data in a Retrieval Datastore
- (R15) "Digital Washing" of Semen Time-lapse Images
- (R148) StrokeVision-Bench: A Multimodal Video and 2D Pose Benchmark for Tracking Stroke Recovery
- (R163) Does Language Matter for Early Detection of Parkinson's Disease from Speech?
Human Activity Recognition, Physiological Signals & Wearables
- (ISPS13) OSR: Toward Developing Efficient Federated Learning-based Human Activity Recognition using Optimal Server Representations
- (R41) Compressed and Lightweight CNN for Real-Time Parkinson's Tremor Detection from Wearable IMU Data
- (R142) Subject Invariant Contrastive Learning for Human Activity Recognition
- (R98) Graph Structure Learning with Local Connectivity Refinement for Improved Physiological Emotion Recognition
- (R188) GEMS: Group Emotion Profiling through Multimodal Situational Understanding
- (ISPS11) Interbeat Interval Filtering
- (R20) Classification Filtering
- (R27) Embedded Inter-Subject Variability in Adversarial Learning for Inertial Sensor-Based Human Activity Recognition
Decision Making, Bandits, and Recommendation Systems
- (ISPS4) A Distillation-based Future-aware Graph Neural Network for Stock Trend Prediction
- (R156) Learning from Multiple Noisily Optimal Demonstrators in Stochastic Multi-Armed Bandits
- (R194) Networked Contextual Bandits with Anomaly-Aware Learning
- (R193) State Prediction for Offline Reinforcement Learning via Sequence-to-Sequence Modeling
- (R190) Poisson-based Modeling and Curvature-aware Optimization for Neural Collaborative Filtering in Recommendation Systems
- (R157) Integrating Adaptive Prediction with an Optimization-based Methodology for Data-Driven Efficiency Evaluation in Education
- (ISPS6) What Does an Audio Deepfake Detector Focus on? A Study in the Time Domain
- (R141) CollabPersona: A Framework for Collaborative Decision Analysis in Persona Driven LLM-based Multi-Agent Systems
Federated & Decentralized Learning
- (ISPS10) Federated Domain Generalization with Label Smoothing and Balanced Decentralized Training
- (ISPS12) Enhancing Federated Learning Convergence With Dynamic Data Queue and Data-Entropy-Driven Participant Selection
- (R222) Provable Reduction in Communication Rounds for Non-Smooth Convex Federated Learning
- (R111) Joint Graph Estimation and Signal Restoration for Robust Federated Learning
- (R107) Memory-Efficient Correlated Noise for Locally Differentially Private Momentum in Distributed Learning
- (R65) DeMem: Privacy-Enhanced Robust Adversarial Learning via De-Memorization
Special Session, Decoding the Brain Time Series
- (S12-oral) Toward a Gaze-Independent Brain-Computer Interface Using the Code-Modulated Visual Evoked Potentials
- (S3S) Assessing the Capabilities of Large Brainwave Foundation Models
- (S5) Uncertainty Quantification for Motor Imagery BCI - Machine Learning vs. Deep Learning
- (S8) On the Role of Low-Level Visual Features in EEG-Based Image Reconstruction
- (S9) Deep Learning of Mesoscale Cortical Dynamics for Real-Time Classification of Forelimb Movement in Mice
- (S16) Interpretability of Riemannian Tools Used in Brain-Computer Interfaces
- (S18) AbsoluteNet: A Deep Learning Neural Network to Classify Cerebral Hemodynamic Responses of Auditory Processing
- (S23) Riemannian Fusions of EEG-Based Features for Motor Imagery Detection under Propofol Sedation
- (S27) Evaluating Manifold Alignment of Motor Imagery for Transfer Learning in EEG-Based BCIs
Data Competition, VEELA - Vessel Extraction and Extrication for Liver Analysis
- Hepatic Vessel Segmentation and Classification in CTA Images Using nnU-Net with Centerline Regression
- Hybrid Boundary Sensitive Tversky 3D U-Net for Liver Vessel Segmentation
- VEELA Challenge - Vessel Extraction and Extrication for Liver Analysis