As AI becomes more deeply embedded in business systems, South African companies need to assess whether the underlying language tools are fit for the local market. This is the view of Warren Hawkins, Managing Director of Euphoria Telecom.
“Transcription accuracy is becoming a key issue as businesses expand their use of AI across customer engagement, workflow automation, meeting intelligence and compliance-driven environments,” he says. “It matters even more as transcription is increasingly used for meeting summaries, customer service platforms, legal documentation and healthcare administration.”
Most global transcription services are trained primarily on General American English or British Received Pronunciation. While they can process English, they are not always built to handle South African English, local accents or multilingual speech patterns.
South African English has its own distinct pronunciation, vowel patterns and rhythm, which global systems often misread as distortion, error or background noise, says Hawkins. “This affects more than the transcript itself. It reduces the quality of summaries, weakens search results and introduces errors into any AI system relying on spoken input as source data.”
The challenge is even greater when local languages are involved. isiZulu and isiXhosa, like other Nguni languages, are structurally very different from English. Meaning is carried through a complex system of prefixes and suffixes, and a single word can often convey what would take a full sentence in English. Generic global models regularly struggle with this, especially when they try to force local language structures into English-based patterns.
Afrikaans creates a different but equally important problem. International engines often model it too closely on Dutch, which can lead to false similarities, awkward phrasing and incorrect meaning. In business, that creates a quality and reliability issue.
Code-switching and multilingual communication are common in both formal and informal local business settings. Research on code-switched automatic speech recognition in five South African languages, published on ScienceDirect, found that these systems perform significantly worse on multilingual and code-switched South African speech than on single-language speech. This underscores the limits of generic global models in the local market.
Transcription errors do not stay confined to the transcript. They affect meeting records, summaries, search functions, compliance processes and any downstream AI system that depends on accurate language input.
“Businesses are moving quickly to adopt AI transcription, but that only delivers value if the system can understand how we actually speak,” continues Hawkins. “If a model cannot handle South African English, Afrikaans, isiXhosa or isiZulu properly, it introduces risk and reduces usability.”
The risk is particularly high in sectors where transcription accuracy has legal, financial or operational consequences. In healthcare and legal environments, inaccurate records can create liability. In regulated industries such as banking, insurance and customer service, poor transcription can also affect compliance and service quality.
“Proprietary transcription models trained on South African data are becoming essential,” he says. “In our market, employees and customers often switch between languages within the same sentence.”
The same automatic speech recognition study found that this kind of intrasentential code-switching is especially difficult for speech recognition systems to process, and noted that a shortage of large, balanced local datasets remains a major challenge.
There’s also an inclusion issue. If AI systems cannot understand local languages, accents and code-switching, many employees and customers are excluded from using them properly. That creates an accessibility gap at the very point AI is meant to improve productivity and digital participation.
“Language capability is part of the core infrastructure that determines whether these systems are accurate and usable in practice,” says Hawkins. “For businesses investing in AI, generic transcription is inadequate. If a model cannot understand how South Africans actually speak, the technology will struggle to deliver where it counts.”