Sped Up the Speech-to-text Conversion Process by 50%

Benevis is a Norwegian company, founded in 2014, that provides a service for transcribing recorded speech into text. The product is built on machine learning models trained to recognize speech.


Benevis’ users upload audio files, and the algorithms process them and return the finished text. A built-in mechanism then splits the text into sentences and inserts punctuation marks. The platform supports Norwegian, Swedish, and Danish, including their national dialects.


Product: Automatic speech-to-text conversion system

The scope of our work: Development of the STT (speech-to-text) post-processing mechanism, covering semantic and punctuation analysis

Solution: Machine Learning, Text Processing


Location: Norway
Industry: Computer Software
Client Goal:

Our client’s goal was to use machine learning algorithms to convert Norwegian speech automatically into finished, readable text.

Our Solutions
Automatic speech-to-text feature

The goal was to build a reliable processing flow that turns Norwegian speech audio into text.

The main task was to process the raw STT output: split it into sentences and insert punctuation marks. The final result had to be well-structured sentences rather than a plain sequence of words.
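The post-processing step can be pictured as labelling each word with the punctuation mark that follows it and then rebuilding sentences from those labels. The sketch below is only an illustration under that assumption: the label names (`O`, `COMMA`, `PERIOD`, `QUESTION`) and the function `restore_punctuation` are hypothetical, and the labels are supplied by hand here, whereas in a real system they would come from a trained model.

```python
from typing import List

# Hypothetical label set: the mark that follows each word ("O" means none).
PUNCT = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}
SENTENCE_END = {"PERIOD", "QUESTION"}


def restore_punctuation(words: List[str], labels: List[str]) -> List[str]:
    """Rebuild punctuated, capitalized sentences from words and per-word labels."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")

    sentences: List[str] = []
    current: List[str] = []
    for word, label in zip(words, labels):
        # Capitalize only the first word of a sentence; leave other casing alone.
        token = word[:1].upper() + word[1:] if not current else word
        current.append(token + PUNCT[label])
        if label in SENTENCE_END:
            sentences.append(" ".join(current))
            current = []

    # Close a trailing fragment that the labels left without an end mark.
    if current:
        sentences.append(" ".join(current).rstrip(",") + ".")
    return sentences


if __name__ == "__main__":
    raw = "hei hvordan går det jeg har det bra".split()
    tags = ["COMMA", "O", "O", "QUESTION", "O", "O", "O", "PERIOD"]
    for sentence in restore_punctuation(raw, tags):
        print(sentence)
```

Treating punctuation as a per-word label keeps the original words untouched, so the output can always be traced back to the raw STT result.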

In addition, the audio-to-text function had to support two variants of the Norwegian language (formal and informal).

Our team used BERT, a pretrained neural language model from Google that is widely used for natural language processing tasks such as analyzing and tagging text.

We refined the rules applied around the BERT network and upgraded the system that compares the original text with the final variant.

We used an extensive text training set to test the speech-to-text function. Fine-tuning the rules and the processing model took a long time. Our team kept improving the BERT-based model until it converted the stream of words into text with good results.
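One way to read the comparison of the original text with the final variant is as a safety check: post-processing should change punctuation and capitalization but not the words themselves. The sketch below illustrates that idea with Python's standard `difflib` module. It is an assumption about what such a check might look like, not a description of the system that was built, and the function names `normalize` and `word_changes` are hypothetical.

```python
import difflib
import string
from typing import List, Tuple


def normalize(text: str) -> List[str]:
    """Lowercase, drop ASCII punctuation, and split into words."""
    # Note: string.punctuation covers ASCII marks only, not e.g. « ».
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_changes(raw: str, processed: str) -> List[Tuple[str, List[str], List[str]]]:
    """Return word-level differences between raw STT text and the processed text.

    An empty list means only punctuation or casing was changed.
    """
    a, b = normalize(raw), normalize(processed)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    print(word_changes("hei hvordan går det", "Hei, hvordan går det?"))
    print(word_changes("jeg har det bra", "Jeg har det veldig bra."))
```

A check like this makes it easy to flag outputs where the processing step accidentally added, dropped, or replaced a word.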

Figure: Benevis, "My Transcription: History"


Figure: Benevis, "Audio-To-Text Result Page"

Optimization of the text processing speed

Our team also helped the client optimize text processing speed. The goal was a fast conversion from speech audio to text that would not keep users waiting.


The speed-up was achieved by using BERT together with an improved system of rules and exceptions. As a result, the whole speech-to-text process now takes about half the duration of the audio clip.
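The relation between processing time and audio length is often expressed as a real-time factor: processing time divided by audio duration. A factor of 0.5 corresponds to the result stated above. The small sketch below shows the arithmetic; the function names and the example durations are illustrative assumptions, not measurements from the project.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def estimated_wait(audio_seconds: float, rtf: float = 0.5) -> float:
    """Estimated processing time for a clip, given a real-time factor."""
    return audio_seconds * rtf


if __name__ == "__main__":
    # A hypothetical 10-minute recording processed in 5 minutes.
    print(real_time_factor(300, 600))  # 0.5
    print(estimated_wait(600))         # 300.0
```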


As a result of our collaboration, Benevis received an automated speech-to-text feature that can:


  • convert Norwegian audio speech into text in two language variants (both formal and informal styles)
  • complete the whole speech-to-text process in about half the duration of the audio, so the waiting time depends on the audio length.
Our process

November 2019 - April 2020

1 ML developer | BA
