SMACK: Semantically Meaningful Adversarial Audio Attack

  • Zhiyuan Yu
  • , Yuanhaur Chang
  • , Ning Zhang
  • , Chaowei Xiao

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

34 Scopus citations

Abstract

Voice controllable systems rely on speech recognition and speaker identification as the key enabling technologies. While they bring revolutionary changes to our daily lives, their security has become a growing concern. Existing work has demonstrated the feasibility of using maliciously crafted perturbations to manipulate speech or speaker recognition. Although these attacks vary in targets and techniques, they all require the addition of noise perturbations. While these perturbations are generally restricted to Lp-bounded neighborhood, the added noises inevitably leave unnatural traces recognizable by humans, and can be used for defense. To address this limitation, we introduce a new class of adversarial audio attack, named Semantically Meaningful Adversarial Audio AttaCK (SMACK), where the inherent speech attributes (such as prosody) are modified such that they still semantically represent the same speech and preserves the speech quality. The efficacy of SMACK was evaluated against five transcription systems and two speaker recognition systems in a black-box manner. By manipulating semantic attributes, our adversarial audio examples are capable of evading the state-of-the-art defenses, with better speech naturalness compared to traditional Lp-bounded attacks in the human perceptual study.

Original languageEnglish
Title of host publication32nd USENIX Security Symposium, USENIX Security 2023
PublisherUSENIX Association
Pages3799-3816
Number of pages18
ISBN (Electronic)9781713879497
StatePublished - 2023
Event32nd USENIX Security Symposium, USENIX Security 2023 - Anaheim, United States
Duration: Aug 9 2023Aug 11 2023

Publication series

Name32nd USENIX Security Symposium, USENIX Security 2023
Volume6

Conference

Conference32nd USENIX Security Symposium, USENIX Security 2023
Country/TerritoryUnited States
CityAnaheim
Period08/9/2308/11/23

Fingerprint

Dive into the research topics of 'SMACK: Semantically Meaningful Adversarial Audio Attack'. Together they form a unique fingerprint.

Cite this