Louisiana French, often called Cajun French, is a variety of French that has been shaped over generations by contact with English, Spanish, African and Indigenous languages.
But time has not been kind to Louisiana French. State policies that discouraged, and in many cases prohibited, the use of French in public schools during the early 20th century accelerated its decline. Today, according to U.S. Census data, only about 20,000 fluent Louisiana French speakers remain in the United States, most of them older adults. Like hundreds of other endangered languages around the world, Louisiana French now faces an uncertain future.
AI is emerging as an unexpected tool in efforts to preserve languages that are at risk of disappearing. While AI cannot stop language loss on its own, researchers say it can document speech, transcribe recordings, translate rare languages and make historical collections more accessible, work that once required years of painstaking effort.
According to the United Nations, an Indigenous language disappears about every two weeks. UNESCO estimates that nearly half of the world’s roughly 7,000 languages could disappear by the end of this century. When a language dies, experts say, communities lose not only words but also oral histories, traditional knowledge, cultural practices and a unique way of understanding the world.
At the University of Louisiana at Lafayette, researchers hope AI can help preserve Louisiana French by teaching computers to recognize a language that has long been overlooked by mainstream technology.
The university’s Center for Louisiana Studies recently launched the LaFLEUR Project — Louisiana AI for French Language Exploration, Understanding and Research — its first effort to develop AI tools capable of recognizing the distinctive sounds and grammar of Louisiana French, according to the university.
The project grew out of a frustrating experience with a smart speaker. Dr. Joshua Caffery, director of the Center for Louisiana Studies, said he once asked Amazon Alexa to play music by legendary Cajun fiddler Dewey Balfa. Instead, the device played songs by pop star Dua Lipa.
“That just struck me as a frustrating thing,” Caffery said. “What happens when we are increasingly integrating these systems into our lives that is privileged on certain kinds of information?”
The problem reflects how most large language models are built. AI systems are trained primarily on enormous amounts of internet data dominated by widely spoken languages, leaving regional dialects and endangered languages with little digital representation.
Louisiana French presents an additional challenge because much of the language survives in oral histories, interviews and music rather than books or written records. Many recordings preserved by the Center for Louisiana Studies are decades old and contain background noise, music and overlapping voices that make them difficult for AI systems to understand.
When researchers tested a standard AI model on a recording of storyteller Evelia Boudreaux, the software repeatedly misinterpreted familiar Louisiana French expressions. According to the university, one common phrase was translated into standard French with an entirely different meaning.
“These errors occur because the machine struggles with the phonetic, syntactical or grammatical differences unique to Louisiana,” Dr. Rachel Doherty, assistant director of the Center for Louisiana Studies, told the university.
To improve the technology, researchers are creating what they call a “Ground Truth” dataset: short audio clips paired with carefully verified transcriptions that teach AI what authentic Louisiana French sounds like.
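For readers curious about how verified transcriptions can be used, one common approach is to score a model's output against them with word error rate, the share of reference words the model got wrong. The sketch below is illustrative only and is not the LaFLEUR Project's code. The `GroundTruthClip` record, the file name and the counting-word sample text are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundTruthClip:
    """One short audio clip paired with a human-verified transcription."""
    audio_path: str   # hypothetical file name, for illustration only
    transcript: str   # text checked by a fluent speaker


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # model dropped a word
                            curr[j - 1] + 1,      # model added a word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Placeholder clip: the text is simple counting words, not archive material.
clip = GroundTruthClip(audio_path="clip_0001.wav",
                       transcript="un deux trois quatre")
model_output = "un deux trois"  # pretend the model missed the last word
print(word_error_rate(clip.transcript, model_output))  # 0.25
```

A score of 0 means the model matched the verified text word for word, and higher scores mean more errors. Tracking that number as more verified clips are added is one way to tell whether a model is improving.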
“We’re trying to create an automatic system that works seamlessly without a lot of human effort,” said Peyton Leathem-Boe, a research scientist at the Informatics Research Institute.
Researchers hope the technology will eventually allow families to upload recordings of relatives speaking Louisiana French and receive accurate transcriptions, helping preserve voices and stories that might otherwise be lost.
“We want to make sure that these machines have soul in the future,” Caffery said. “I think this is a good way to do that.”
Related AI tools are also helping preserve written history.
At the same university, computer scientist Dr. Boisy Gene Pitre is using AI to organize more than 10,000 historical documents discovered in the attic of his great-grandfather’s home in Opelousas. By combining optical character recognition with large language models, Pitre is converting fragile records into searchable databases that allow historians to identify patterns and relationships that would be difficult to uncover manually.
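To give a sense of what "searchable" means in practice, the sketch below builds a small inverted index, a lookup table from each word to the documents that contain it. It is a simplified illustration, not Pitre's system. It assumes the optical character recognition step has already produced plain text, and the document IDs and sample texts are invented placeholders.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words (accents are kept)."""
    return re.findall(r"\w+", text.lower())


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of document IDs in which it appears."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the IDs of documents that contain every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Invented placeholder texts standing in for OCR output.
ocr_text = {
    "doc-001": "Receipt for goods delivered in Opelousas",
    "doc-002": "Letter concerning a land sale near Opelousas",
    "doc-003": "Receipt for a land survey",
}
index = build_index(ocr_text)
print(search(index, "Opelousas receipt"))  # ['doc-001']
print(search(index, "land"))               # ['doc-002', 'doc-003']
```

An index like this only finds exact words. The pattern-finding Pitre describes goes further by using large language models on top of the digitized text.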
“LLMs excel at helping us identify patterns, relationships and recurring themes across vast collections of documents,” Pitre said. “Once you understand the questions that can be asked, then you can start asking even more interesting questions.”
Pitre also warns that AI is not a substitute for careful scholarship.
“LLMs don’t have an independent sense of historical truth; they work with the information we give them,” he said. “That’s why human review, cross-checking and domain knowledge remain essential.”
Researchers at Dartmouth College are applying AI to an endangered script. They have developed NüshuRescue, an AI system designed to preserve Nüshu, a rare writing system created centuries ago by women in China’s Hunan province. According to Dartmouth, researchers showed that a large language model could begin translating the script using only a small number of expert-verified examples, creating new opportunities to digitize and preserve it.
“Our work demonstrates that generative AI and large language models significantly lower barriers to revitalizing endangered languages,” Dartmouth researcher Soroush Vosoughi said. He added that native speakers and linguists remain essential to ensure the technology accurately reflects the language and culture it is intended to preserve.
Researchers around the world are pursuing similar goals. AI-powered speech recognition is being used to transcribe Cook Islands Māori and Indigenous languages in Costa Rica. Other researchers are developing tools that better recognize Navajo after existing language identification systems repeatedly failed to detect it.
Technology companies are also contributing. Google’s Woolaroo app uses image recognition and recordings from native speakers to teach endangered languages through everyday objects, while Meta’s No Language Left Behind initiative is expanding machine translation to hundreds of low-resource languages. Echoing Dr. Pitre at the University of Louisiana at Lafayette, researchers caution that these systems require careful oversight because translation errors can distort languages that already have limited written records.
As more communication moves into digital spaces, researchers argue that languages must also exist there if they are to remain relevant for future generations. For Louisiana French, the goal is to ensure that a language rich in history, culture and identity continues to be understood long after today’s fluent speakers are gone.