A scientist in Japan has developed a technique that uses brain scans and artificial intelligence to turn a person’s mental images into accurate, descriptive sentences, reports BritPanorama.
This breakthrough builds on earlier progress in translating imagined words into text; converting complex mental images into language has remained harder, according to Tomoyasu Horikawa, author of a study published on November 5 in the journal Science Advances.
Horikawa’s new method, termed “mind-captioning,” generates descriptive text reflecting information in the brain about visual details, including objects, places, actions, and events, as well as their relationships.
Horikawa, a researcher at NTT’s Communication Science Laboratories near Tokyo, analyzed the brain activity of four men and two women, native Japanese speakers aged 22 to 37, while they viewed 2,180 silent video clips depicting a wide range of scenes and objects.
Large language models, AI systems trained on extensive datasets, transformed the video clips’ captions into numerical sequences. Horikawa then trained simpler AI models, known as “decoders,” to map the brain activity recorded while participants watched each clip onto the corresponding numerical sequence.
He then used these decoders to interpret participants’ brain activity while they watched, or simply recalled, videos the AI had not previously seen. A separate algorithm generated word sequences that best matched the decoded brain activity.
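The two-stage pipeline described above, decoding brain activity into a language model’s embedding space and then choosing text whose embedding best matches, can be sketched in miniature. Everything below is invented for illustration: the vectors stand in for fMRI patterns and caption embeddings, and a nearest-neighbour lookup stands in for the study’s trained linear decoders.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy training set: made-up "brain activity" patterns paired with the
# embedding of the caption of the clip being watched.
train_activity = [
    [0.9, 0.1, 0.0],   # viewing "a dog runs on grass"
    [0.1, 0.8, 0.2],   # viewing "a car drives down a road"
    [0.0, 0.2, 0.9],   # viewing "waves crash on a beach"
]
train_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Candidate descriptions with their (made-up) embeddings.
candidate_captions = [
    ("a dog runs on grass", [1.0, 0.0, 0.0]),
    ("a car drives down a road", [0.0, 1.0, 0.0]),
    ("waves crash on a beach", [0.0, 0.0, 1.0]),
]

def decode_embedding(activity):
    # Stand-in for the trained decoders: return the embedding paired
    # with the most similar training activity pattern.
    best = max(range(len(train_activity)),
               key=lambda i: cosine(activity, train_activity[i]))
    return train_embeddings[best]

def caption_for(activity):
    # Stage 2: pick the caption whose embedding best matches the
    # decoded one (the study instead generates text iteratively).
    pred = decode_embedding(activity)
    text, _ = max(candidate_captions, key=lambda c: cosine(pred, c[1]))
    return text

new_activity = [0.05, 0.15, 0.85]  # an unseen "scan"
print(caption_for(new_activity))   # → waves crash on a beach
```

The key design point the sketch preserves is that text selection happens entirely in embedding space: the decoder never predicts words directly, which is why the study’s approach does not depend on the brain’s language network.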
As the AI learned from the data, its capacity to articulate descriptions based on the brain scans improved significantly. “It’s just one additional step forward in the direction of what, in my view, we can legitimately call brain-reading or mind-reading,” remarked Marcello Ienca, a professor of AI ethics and neuroscience at the Technical University of Munich, though he did not participate in the study.
Potential for ‘profound’ health interventions
Remarkably, the AI model was able to generate English text, despite the participants being non-native English speakers. Horikawa noted that the method can produce comprehensive descriptions of visual content without activating the brain’s language network. This suggests the technique could be utilized even if an individual has sustained damage to this area of the brain.
The technology has the potential to assist individuals with aphasia, a condition that impairs language expression due to damage in the language areas of the brain, or in cases of amyotrophic lateral sclerosis (ALS), a neurodegenerative disease affecting speech.
Psychologist Scott Barry Kaufman, who was not involved in the study, expressed that the research could lead to significant interventions for those who struggle to communicate, including non-verbal autistic people. However, he emphasized the need for careful applications, stating, “we have to use it carefully and make sure we aren’t being invasive and that everyone consents to it.”
‘The ultimate privacy challenge’
This successful method raises ethical issues concerning privacy, particularly the potential for revealing an individual’s private thoughts. Ienca cautioned that, if such technologies are developed for consumer use beyond medical applications, they could face significant privacy challenges. He pointed out that many companies, such as Neuralink, are advancing claims about developing neural implants for a broader audience.
Ienca insisted on strict regulations regarding access to mental and neural data, drawing attention to the sensitive nature of personal brain information, including early signatures of psychiatric disorders and conditions like depression.
The study also suggested that safeguards could be built in, for instance requiring decoding to be unlocked by specific keywords, to reduce the leakage of private thoughts.
As neuroscience progresses rapidly, the advantages of assistive technologies are apparent, but the protection of mental privacy and the freedom of thought must also evolve. “We should treat neural data as sensitive by default and prioritize on-device processing with user-controlled unlocking mechanisms,” stated Łukasz Szoszkiewicz, a social scientist who was not involved in the research.
Horikawa acknowledged that while this technology holds promise, it requires substantial data collection with active participant involvement and is not yet resource-efficient for practical applications. Additionally, it remains uncertain whether the technique can effectively capture less predictable mental images. The current methodology, therefore, cannot easily invade a person’s private thoughts, he concluded.
The story of this technological advancement continues to unfold as the implications for both medical and ethical domains are explored.