ISCA SPSC Symposium

2nd Symposium on Security and Privacy in Speech Communication

held jointly with

2nd VoicePrivacy Challenge Workshop

September 23 & 24, 2022, as a satellite event of Interspeech 2022


Speech and voice are media through which we express ourselves. Speech communication can be used to command virtual assistants, to convey emotion or to identify oneself. How can we strengthen security and privacy for speech representation types in user-centric human/machine interaction?

Interdisciplinary exchange is in high demand. The need to better understand and develop user-centric security solutions and privacy safeguards in speech communication is of growing importance for commercial, forensic, and government applications. The SPSC Symposium is a platform for seeking better-designed services and products as well as better-informed policy papers for legislators and governance bodies. The symposium is organized by the ISCA SPSC special interest group and the VoicePrivacy Challenge Team.


The second edition of the Symposium on Security & Privacy in Speech Communication focuses on speech and voice, the media through which we express ourselves. As speech communication can be used to command virtual assistants, to convey emotion or to identify oneself, the symposium encourages participants to address how we can strengthen security and privacy for speech representation types in user-centric human/machine interaction. Because interdisciplinary exchange is in high demand, the symposium aims to bring together researchers and practitioners across multiple disciplines, more specifically: signal processing, cryptography, security, human-computer interaction, law and anthropology.

The second edition of the VoicePrivacy Challenge Workshop is spearheading the effort to develop privacy preservation solutions for speech technology. It aims to consolidate the newly formed community to develop the task and metrics and to benchmark progress in anonymization solutions using common datasets, protocols and metrics. VoicePrivacy takes the form of a competitive challenge. Participants are required to develop anonymization algorithms which conceal speaker identity within speech signals. At the same time, they should preserve linguistic content and naturalness. VoicePrivacy 2022 Challenge participants are encouraged to submit to the SPSC Symposium papers related to their Challenge entry, as well as other scientific papers related to voice privacy and anonymization.

To strengthen the efforts of both events, ease joint discussions, and extend the interdisciplinary exchange, we decided to combine our teams and organize a joint event. For the general symposium, we welcome contributions on related topics, as well as progress reports, project dissemination, theoretical discussions and “work in progress”. In addition, guests from academia, industry and public institutions as well as interested students are welcome to attend the conference without having to make their own contribution.

Although we aim to meet all of you on-site, virtual presentations are also possible during the workshop.


The Symposium is held at Incheon National University: Small Theatre, 2F, Shops & Service Centre (see the campus map).

How to reach?


It is approximately a 40-minute walk from the Interspeech convention center to Incheon National University (INU).


The bus stop is called Incheon National Univ. College of Engineering (인천대학교공과대학). The following buses serve it:

  • 6 (every 13 minutes)
  • 6-1 (every 13 minutes)
  • 98 (every 25 minutes)
  • 41 (every 24 minutes)
  • 42 (every 28 minutes)
  • 43 (every 28 minutes)
From the bus stop to the venue:

We have a large lecture hall with 286 seats. The room is close to the campus dining facilities for lunch.


Keynote Speakers

Opening Keynote

Deepfakes: regulatory challenges for the synthetic society by Bart van der Sloot (Tilburg University, Netherlands)


Abstract: With the rise of deepfakes and synthetic media, the question as to what is real and what is not will become increasingly important and politicized. Deepfakes can be used to spread fake news, influence elections, introduce highly realistic fake evidence in courts and make fake pornographic videos. Each of these applications potentially has a big impact on society, social relationships, democracy and the rule of law. The question this talk will assess is whether the current regulatory regime suffices to address these potential harms and, if not, which additional rules and principles should be adopted. It will discuss several potential amendments to the privacy and data protection regime, limitations to the freedom of expression and ex ante rules on the distribution and use of deepfake technologies.

Bio: Bart van der Sloot specializes in the area of Privacy and Big Data. He also publishes regularly on the liability of Internet Intermediaries, data protection and internet regulation. He has studied both philosophy (BA; MA) and law (BA; MA) in the Netherlands and Italy, also successfully completing the Honours Research Programme. He is an associate professor at the Tilburg Institute for Law, Technology, and Society of the University of Tilburg, Netherlands. Bart formerly worked for the Institute for Information Law, University of Amsterdam, where he wrote his PhD on privacy and virtue ethics, and for the Scientific Council for Government Policy (WRR) (part of the Prime Minister’s Office of the Netherlands) to co-author a report on the regulation of Big Data. Bart van der Sloot is the general editor of the international privacy journal European Data Protection Law Review. He also served as the director of the Privacy & Identity Lab from 2016 to 2021. From 2010 to 2020, he was the founder and coordinator of the Amsterdam Platform for Privacy Research (APPR), the minor Privacy Studies and the Amsterdam Privacy Conferences 2012, 2015 and 2018. Bart was awarded two highly prestigious research stipends by the Dutch Scientific Organisation. The Top Talent Research Grant fully covered his PhD project and the Veni grant (2021-2025) covers a research project called: the right to be let alone ... by yourself.

Closing Keynote

The Five Issues Between Voice and Its Value by Jon Stine (Portland, Oregon, USA)


Abstract: How big should voice be? How big can voice become? For some, conversational AI is a toy-like technology of convenience – an alarm, a music player, a teller of tales and news headlines. For businesses, conversational AI is most often a technology of automation and efficiency – one that alleviates call center burdens. But let’s pause and consider, for a moment, the possibilities. Here is an inclusive interface that will soon be resident on every digital device – in a world where every device is digital. Here’s a capability that will exist within every website, every smart system, every AI. We’re at the edge – and let’s dare to say it – of a worldwide voice web. Join Jon Stine, Executive Director of the Open Voice Network, a community of the Linux Foundation, for an exploration of where voice is, where it can go (for optimal societal and economic benefit), and what stands in its way. (Spoiler alert: he’s going to talk about interoperability and data ownership.)

Bio: Jon Stine is the Executive Director of The Open Voice Network, an open-source community of the Linux Foundation dedicated to developing technical standards and usage guidelines for the emerging world of artificial intelligence-enabled voice assistance. https://www.linkedin.com/in/jonstine/.

He brings to the task more than 30 years of executive leadership in the retail and technology industries. He led sales of a national apparel brand to better US department and specialty stores before joining the Intel Corporation in 2000 to head its first global outreach to the retail and consumer goods industries.

He joined Cisco Systems' retail and consumer goods consulting team in late 2006, and later headed Cisco’s North America consulting practice for retail-CPG. In 2014, he returned to Intel as the Global Enterprise Sales General Manager for the retail, hospitality, and consumer goods industries. He left Intel in 2019 to build The Open Voice Network.

He resides in Portland, Oregon, USA.

Invited Speakers

Choose a pseudonym. Legal perspective on pseudonymisation of speech data by Pawel Kamocki (Leibniz-Institut für Deutsche Sprache, Mannheim, Germany)


Abstract: Often overlooked, pseudonymisation can be an interesting alternative to anonymisation, especially in the context of speech data. Recognised as a safeguard for the rights and freedoms of data subjects, pseudonymisation can make GDPR compliance considerably easier to achieve. This talk will discuss the advantages of pseudonymisation, its special role for speech data (e.g. according to the European Data Protection Board's guidelines on voice assistants), and its seemingly bright future under the EU Data Governance Act.

Bio: Dr. iur. Paweł Kamocki is a legal expert at the Leibniz-Institut für Deutsche Sprache, Mannheim. He studied linguistics and law, and in 2017 obtained his doctorate in law from the universities of Paris and Münster for a thesis on legal aspects of data-intensive university research, with a focus on Knowledge Commons. He worked as a research and teaching assistant at the Paris Descartes university (now: Université de Paris), then also in the private sector. He is certified to work as an attorney in France. An active member of the CLARIN community since 2012, he currently chairs the CLARIN Legal and Ethical Issues Committee. He has also worked with other projects and initiatives in the field of research data policy (RDA, EUDAT) and co-created several LegalTech tools for researchers. One of his main research interests is legal issues in Machine Translation.

Detecting manipulated and synthetic audio by Zhizheng Wu (Chinese University of Hong Kong, Shenzhen, China)


Abstract: In recent years, we have witnessed the astonishing advancement of speech generation technology, thanks to the rapid development of deep learning. The state-of-the-art speech synthesis technology can clone a speaker’s voice with a few training samples and generate natural-sounding audio samples that the speaker never said. The technology can be misused to create misinformation, which spreads farther, faster, and more broadly than the truth and erodes our trust in online information. It can also be misused to attack voice biometric systems. This talk will first present a high-level overview of approaches to manipulate and synthesize audio. Then, it will highlight recent technical developments to detect manipulated and synthetic audio. This talk will also discuss some current challenges and the needs from a user point of view.

Bio: Zhizheng Wu is an associate professor in the School of Data Science, the Chinese University of Hong Kong, Shenzhen. He received his Ph.D. from Nanyang Technological University, Singapore in 2015 and worked for Meta (formerly known as Facebook), JD.COM, Apple, University of Edinburgh, and Microsoft Research Asia. Zhizheng was the creator of Merlin, an open-source speech synthesis toolkit. He co-initiated and co-organized the first Automatic Speaker Verification Spoofing and Countermeasures (ASVspoof) challenge at Interspeech 2015, the Voice Conversion Challenge 2016, and organized the Blizzard Challenge 2019. He is a member of the IEEE Speech and Language Processing Technical Committee (2021-2023).


We are glad to announce the program. All times are in KST (Korea Standard Time):

Friday, September 23, 2022

09:00 - 09:10 Opening Ceremony
09:10 - 10:10 Opening Keynote Deepfakes: regulatory challenges for the synthetic society with Bart van der Sloot
10:10 - 11:10 Tutorial 1 Anonymization Part 1 by Emmanuel Vincent
11:10 - 11:40 Break
11:40 - 12:00 Tutorial 2 Anonymization Part 2 by Xin Wang
12:00 - 12:20 Introduction of the VoicePrivacy Challenge with Natalia Tomashenko
12:20 - 13:00 Discussion about the VoicePrivacy Challenge with VPC Organizers
13:00 - 14:30 Lunch Break (self-service)
14:30 - 15:30 Privacy and security using speech and speaker recognition techniques Part 1
  • New Challenges for Content Privacy in Speech and Audio
    Jennifer Williams, Karla Pizzi, Shuvayanti Das and Paul-Gauthier Noé
  • Towards Real-time Privacy-preserving Audio-Visual Speech Enhancement
    Mandar Gogate, Kia Dashtipour and Amir Hussain
  • Introducing Model Inversion Attacks on Automatic Speaker Recognition
    Karla Pizzi, Franziska Boenisch, Ugur Sahin and Konstantin Böttinger
15:30 - 16:00 Invited Talk Detecting manipulated and synthetic audio with Zhizheng Wu
16:00 - 16:30 Break
16:30 - 17:00 Invited Talk Choose a pseudonym. Legal perspective on pseudonymisation of speech data with Pawel Kamocki
17:00 - 18:00 SIG's SPSC Townhall Meeting What has been done, what will be done in SPSC with Tom Backström
18:00 - open Social Event for on-site participants

Saturday, September 24, 2022

09:00 - 11:00 VoicePrivacy Challenge
  • Speaker Anonymization by Pitch Shifting Based on Time-Scale Modification
    Candy Olivia Mawalim, Shogo Okada and Masashi Unoki
  • Voice Privacy - Leveraging Multi-Scale Blocks with ECAPA-TDNN SE-Res2NeXt Extension for Speaker Anonymization
    Razieh Khamsehashari, Yamini Sinha, Jan Hintz, Suhita Ghosh, Tim Polzehl, Carlos Franzreb, Sebastian Stober and Ingo Siegert
  • Cascade of Phonetic Speech Recognition, Speaker Embeddings GAN and Multispeaker Speech Synthesis for the VoicePrivacy 2022 Challenge
    Sarina Meyer, Pascal Tilli, Florian Lux, Pavel Denisov, Julia Koch, Ngoc Thang Vu
  • NWPU-ASLP System for the VoicePrivacy 2022 Challenge
    Jixun Yao, Qing Wang, Li Zhang, Pengcheng Guo, Yuhao Liang, Lei Xie
  • System Description for Voice Privacy Challenge 2022
    Xiaojiao Chen, Guangxing Li, Hao Huang, Wangjin Zhou, Sheng Li, Yang Cao, Yi Zhao
  • VoicePrivacy 2022 System Description: Speaker Anonymization with Feature-matched F0 Trajectories
    Unal Ege Gaznepoglu, Anna Leschanowsky, Nils Peters
11:00 - 11:30 Break
11:30 - 12:30 Privacy and security using speech and speaker recognition techniques Part 2
  • Why Eli Roth should not use TTS-Systems for anonymization
    Yamini Sinha, Jan Hintz, Matthias Busch, Tim Polzehl, Matthias Hasse, Andreas Wendemuth and Ingo Siegert
  • Zero-shot Cross-lingual Speech Emotion Recognition: A Study of Loss Functions and Feature Importance
    Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen and Nicklas Leander Lund
  • Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification
    Yen-Lun Liao, Xuanjun Chen, Chung-Che Wang and Jyh-Shing Roger Jang
12:30 - 13:30 Closing Keynote The Five Issues Between Voice and Its Value with Jon Stine
13:30 - 13:40 Closing Ceremony

All times are given in the KST time zone. You can use a time zone converter to check the times in your time zone.


Registration fees for the event:

  • In person (Incheon National University, Korea) Full: 30€
  • In person (Incheon National University, Korea) Student: 20€
  • Virtual: free

Registration for the workshop can be completed through the Interspeech registration system. For event-only registration (without attending INTERSPEECH 2022), please use the following link: Event-only Registration

The event is open to everyone, regardless of their contribution to the VoicePrivacy challenge or SPSC symposium.

In addition, all VoicePrivacy Challenge participants who submit results and system descriptions by July 31st are encouraged to present their work during the event (even if they did not submit papers to the SPSC symposium).

VoicePrivacy 2022 Challenge

The VoicePrivacy initiative aims to promote the development of privacy preservation tools for speech technology and foster progress in the development of anonymization and pseudonymization solutions which suppress personally identifiable information contained within recordings of speech while preserving linguistic content and speech naturalness. VoicePrivacy takes the form of a competitive benchmarking challenge, with common datasets, protocols and metrics.

Challenge Details


  • May 5th, 2022: Paper submission opens
  • June 25th, 2022 (extended from May 18th): Long paper submission deadline
  • June 25th, 2022 (extended from May 18th): VoicePrivacy Challenge paper & short paper submission deadline
  • July 25th, 2022: Acceptance notification
  • July 31st, 2022: VoicePrivacy Challenge results and system description update deadline
  • End of July/beginning of August: Registration deadline (on-site)
  • September 5th, 2022: Final paper submission
  • September 23rd-24th, 2022: Symposium and workshop


Organizing Committee

Ingo SIEGERT, Otto von Guericke University Magdeburg, Germany

Karla MARKERT, Fraunhofer AISEC, Germany

Tom BÄCKSTRÖM, Aalto University, Finland

Irina ILLINA, University of Lorraine, France

Hung-yi LEE, National Taiwan University, Taiwan

Jennifer WILLIAMS, University of Southampton, UK

Shri NARAYANAN, University of Southern California, USA

Salima MEDHAFFAR, LIA - Avignon University, France

Gerald PENN, University of Toronto, Canada

Natalia TOMASHENKO, LIA - Avignon University, France