Meta Industry Workshop at IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Session Chair: Sriram Srinivasan
Date/Time: 2-3:30 PM IST, April 9, Wednesday
Venue: Hall 6
Topic: GenAI, Signal Processing and Media Innovations Across Meta’s Family of Apps and Devices
Abstract: Join us for an exciting industry workshop at the 2025 IEEE ICASSP in Hyderabad, India, where Meta’s researchers will share their expertise on cutting-edge topics such as Generative AI, Speech, Audio, Video, and Neural interfaces. This interactive session aims to foster collaboration and knowledge-sharing between Meta’s teams and the broader research community, exploring innovative solutions to challenging problems in these fields.
Time | Name | Title
Session Chair | Sriram Srinivasan | -
2.00 - 2.15 | Shane Moon | Multimodal Assistant for Smart Glasses
2.15 - 2.30 | Niko Moritz | Speech and Audio at Meta Powering Generative AI and Wearable Devices
2.30 - 2.45 | Ioannis Katsavounidis | AV1 investments at Meta
2.45 - 3.00 | Tim Harris | Next Generation of Open Audio Coding and Rendering Standards
3.00 - 3.15 | Juan Azcarreta Ortiz | Augmented Hearing for Egocentric Cocktail Party scenarios
3.15 - 3.30 | Dan Hill | Leveraging Noninvasive Neuromotor Signals for Text Input
Talks:
Introduction and Session Chair, Sriram Srinivasan, Director of Engineering, Meta

Dr. Sriram Srinivasan
Bio: Dr. Sriram Srinivasan leads the media teams working on technologies for next-gen real-time audio-video calling, full-duplex MetaAI Voice, media messaging and augmented reality effects across Meta’s family of apps such as Facebook, Instagram, Messenger and WhatsApp. Prior to Meta, he was at Microsoft leading the teams working on audio technologies for Microsoft Teams. He has 15+ years of real-time signal processing and ML experience in areas such as echo cancellation, noise suppression, low bitrate codecs (Satin, MLow), spatial audio and network resilience algorithms. He holds a PhD in audio signal processing, 25+ granted US patents and over 50 peer-reviewed publications.
Multimodal Assistant for Smart Glasses, Shane Moon, Research Lead, Wearables AI, Meta

Dr. Seungwhan Moon
Bio: Dr. Seungwhan Moon is an AI Research Scientist at Meta Reality Labs, leading research in multimodal learning for AR/Smart Glasses applications. His recent projects have focused on cutting-edge multimodal and knowledge-grounded conversational AI. He received his PhD in Language Technologies at the School of Computer Science, Carnegie Mellon University, under Prof. Jaime Carbonell. Before joining Meta, he also worked at research institutions including Snapchat Research and Disney Research. He is a recipient of the Samsung Fellowship, LTI Research Fellowship, Olin Merit Scholarship, and Korean Presidential Scholarship.
Speech and Audio at Meta Powering Wearable Devices, Content Understanding and Generative AI, Niko Moritz, Research Scientist, GenAI, Meta

Niko Moritz
Bio: Niko Moritz is a research scientist on Meta’s GenAI team. He holds a PhD from the University of Oldenburg, Germany, and has over 15 years of experience working on speech and audio technologies. Niko joined Meta in 2021 as a research scientist, working on projects including speech translation technologies for smart glasses and next-generation speech technologies in generative AI. Prior to joining Meta, Niko worked as a research scientist and applied scientist at Mitsubishi Electric Research Labs (MERL) in Boston, USA, and at Fraunhofer IDMT in Oldenburg, Germany.
AV1 investments at Meta, Ioannis Katsavounidis, Research Scientist, Video Infrastructure, Meta

Dr. Ioannis Katsavounidis
Bio: Dr. Ioannis Katsavounidis is part of the Video Infrastructure team, leading technical efforts to improve video quality and quality of experience across all video products at Meta. Before joining Meta, he spent 3.5 years at Netflix, contributing to the development and popularization of VMAF, Netflix’s open-source video quality metric, and inventing the Dynamic Optimizer, a shot-based perceptual video quality optimization framework that brought significant bitrate savings across the whole video streaming spectrum. He was a professor for 8 years at the University of Thessaly’s Electrical and Computer Engineering Department in Greece, teaching video compression, signal processing and information theory. He was one of the co-founders of Cidana, a mobile multimedia software company in Shanghai, China. In the early 2000s he was the director of software for advanced video codecs at InterVideo, makers of the popular software DVD player WinDVD, and he also worked for 4 years in experimental high-energy physics in Italy. He is one of the co-chairs of the statistical analysis methods (SAM) and no-reference metrics (NORM) groups at the Video Quality Experts Group (VQEG), and is actively involved in the Alliance for Open Media (AOMedia) as co-chair of the software implementation working group (SWIG). He has over 150 publications, including 50 patents. His research interests lie in video coding, quality of experience, adaptive streaming, and energy-efficient HW/SW multimedia processing.
Next Generation of Open Audio Coding and Rendering Standards, Tim Harris, Engineering Manager, Video Infrastructure, Meta

Dr. Timothy Harris
Bio: Dr. Timothy Harris supports teams at Meta deploying and scaling video and audio transcoding and processing tools. He is the current Alliance for Open Media (AOMedia) Audio Codec Working Group chair. His Engineering Doctorate is in System Level Integration from iSLI, with a thesis on scaling high-end video processing technologies on general-purpose compute hardware, supervised at Edinburgh University. Before Meta, he led technology and engineering at Mirriad, a video advertising start-up.
Leveraging Noninvasive Neuromotor Signals for Text Input, Dan Hill, AI Scientist, Meta

Dr. Daniel Hill
Bio: Dr. Daniel Hill is an AI Scientist in Meta’s EMG Engineering and Research Team, where he focuses on machine learning algorithms for handwriting recognition. There he has led research into personalization, closed-loop modeling, and multi-task learning. He received his PhD in computational neuroscience from UC San Diego, studying sensory-motor interaction in the rat whisker system. He then performed his post-doctoral research at the Technical University of Munich where he investigated synaptic integration in deep layers of primary motor cortex. Before Meta, he worked for Amazon Search, leading projects to personalize search results and to optimize the exploration-exploitation tradeoff via multi-armed bandits.
Augmented Hearing for Egocentric Cocktail Party Scenarios, Juan Azcarreta Ortiz, Research Engineer, Reality Labs Research Audio, Meta

Juan Azcarreta Ortiz
Bio: Juan Azcarreta Ortiz is a Research Engineer at Meta Reality Labs Research Audio, tackling the egocentric cocktail party problem for augmented reality applications. Juan’s expertise lies in creating efficient machine learning algorithms to enable novel audio experiences. He joined Reality Labs in November 2022 and works on efficient multimodal speech technologies to shape the next generation of smart glasses. Previously, Juan worked at the Cambridge start-up Audio Analytic, creating tiny machine learning models for acoustic event detection that have shipped to millions of smart home devices worldwide. Juan also worked in the Signal Processing group at NTT in Japan, conducting research on statistical methods for source separation.