TY - JOUR
T1 - Brain-to-text
T2 - Decoding spoken phrases from phone representations in the brain
AU - Herff, Christian
AU - Heger, Dominic
AU - de Pesters, Adriana
AU - Telaar, Dominic
AU - Brunner, Peter
AU - Schalk, Gerwin
AU - Schultz, Tanja
N1 - Publisher Copyright:
© 2015 Herff, Heger, De Pesters, Telaar, Brunner, Schalk and Schultz.
PY - 2015
Y1 - 2015
N2 - It has long been speculated whether communication between humans and machines based on natural speech-related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones, or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system achieved word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
KW - Automatic speech recognition
KW - Brain-computer interface
KW - Broadband gamma
KW - ECoG
KW - Electrocorticography
KW - Pattern recognition
KW - Speech decoding
KW - Speech production
UR - http://www.scopus.com/inward/record.url?scp=84931262369&partnerID=8YFLogxK
U2 - 10.3389/fnins.2015.00217
DO - 10.3389/fnins.2015.00217
M3 - Article
AN - SCOPUS:84931262369
SN - 1662-4548
VL - 9
JO - Frontiers in Neuroscience
JF - Frontiers in Neuroscience
IS - MAY
M1 - 6
ER -