Natural Speech and Transcription Technologies


Natural Speech and Transcription Technologies (NSTT) is a suite of tools that complements the National Speech Corpus (NSC) and is intended to ease and expedite the development of local speech applications.

It comprises the following components:

  • Speech Activity Detection Engine (SADE)
  • Automated Speech Recognition Engine (ASR)
  • Speech Synthesis Mark-up Language (SSML) Module for Text-to-Speech Engine (TTS)
  • Dialogue Toolkit
  • Speaker Voice Recognition


Some benefits of NSTT include:

  • Speech Activity Detection Engine (SADE)

SADE is an audio extraction tool that distinguishes speech from non-speech audio such as music and silence. It was developed for speech application developers who want to stream speech segments as audio input into the Automated Speech Recognition Engine (ASR). SADE reduces audio file size and improves speech recognition accuracy in speech application development.
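The idea of separating speech from non-speech before recognition can be illustrated with a very simple energy-based detector. This is only a minimal sketch and not SADE's actual method, which is not described here: the function names, the frame length and the threshold are all assumptions made for illustration, and a plain energy threshold can only separate sound from silence, not speech from music.

```python
import math

def make_demo_signal(sample_rate=16000):
    """Build a synthetic signal: silence, a short tone standing in for speech, silence."""
    silence = [0.0] * 320
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(480)]
    return silence + tone + silence

def frame_energies(samples, frame_len):
    """Return the RMS energy of each consecutive, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return energies

def detect_active_segments(samples, frame_len=160, threshold=0.1):
    """Return (start_sample, end_sample) spans whose frame energy exceeds the threshold.

    frame_len and threshold are illustrative values, not tuned settings.
    """
    segments = []
    seg_start = None
    energies = frame_energies(samples, frame_len)
    for i, energy in enumerate(energies):
        active = energy > threshold
        if active and seg_start is None:
            seg_start = i * frame_len          # a new active span begins
        elif not active and seg_start is not None:
            segments.append((seg_start, i * frame_len))  # the span just ended
            seg_start = None
    if seg_start is not None:                  # signal ended while still active
        segments.append((seg_start, len(energies) * frame_len))
    return segments

if __name__ == "__main__":
    print(detect_active_segments(make_demo_signal()))  # [(320, 800)]
```

Only the samples inside the returned spans would then be passed on to a recogniser, which is why a detector of this kind can shrink the audio that needs to be processed.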

  • Automated Speech Recognition Engine (ASR)

The ASR is a proof-of-concept tool that transcribes speech audio into text, supporting human-machine interfaces.

  • Speech Synthesis Mark-up Language (SSML) Module for Text-to-Speech Engine (TTS) 

The SSML module marks up text input so that TTS engines pronounce local terms correctly. It can be used by developers who want to provide an authentic, locally accented audio response to text input.
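As an illustration of how marked-up text can steer pronunciation, the sketch below wraps known local terms in the standard SSML `<sub alias="...">` element. It is a hedged example rather than the NSTT module itself: the `to_ssml` function, the lexicon and the respelling in it are hypothetical, and the respelling is a placeholder, not an authoritative pronunciation.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical lexicon mapping a local term to a respelling a TTS engine can read.
# The respelling is a placeholder for illustration only.
LOCAL_TERMS = {"Ang Mo Kio": "ahng moh kyoh"}

SPEAK_OPEN = ('<speak version="1.1" '
              'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-SG">')

def to_ssml(text, lexicon):
    """Wrap every lexicon term found in text in an SSML <sub alias="..."> element."""
    if not lexicon:
        return SPEAK_OPEN + escape(text) + "</speak>"
    # Match longer terms first so overlapping entries resolve predictably.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    parts = []
    pos = 0
    for match in pattern.finditer(text):
        parts.append(escape(text[pos:match.start()]))   # plain text before the term
        term = match.group(0)
        parts.append("<sub alias=%s>%s</sub>" % (quoteattr(lexicon[term]), escape(term)))
        pos = match.end()
    parts.append(escape(text[pos:]))                     # plain text after the last term
    return SPEAK_OPEN + "".join(parts) + "</speak>"

if __name__ == "__main__":
    print(to_ssml("Turn left at Ang Mo Kio & stop.", LOCAL_TERMS))
```

An engine that supports SSML would speak the alias in place of the written term. Where a respelling is not precise enough, the standard `<phoneme>` element is the usual alternative.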

  • Dialogue Toolkit

The Dialogue Toolkit contains the methodology and taxonomy for creating replies to inputs. Example use cases include conversational AI and virtual assistants.

  • Speaker Voice Recognition 

Speaker Voice Recognition (SVR) aims to recognize the speaker in a speech recording. Example use cases include meeting-minutes transcription and interview transcription, where there is a need to identify which participant spoke each sentence.
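To show how speaker information might be combined with a transcript in such use cases, the sketch below assigns each transcribed sentence to the speaker whose turn overlaps it most in time. Everything here is an assumption for illustration: the tuple formats, the function names and the idea that speaker turns and sentence timestamps arrive as separate lists are hypothetical, and nothing is implied about how SVR itself works.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Return the length in seconds of the overlap between two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def label_sentences(sentences, turns):
    """Attach a speaker label to each sentence.

    sentences: list of (text, start_sec, end_sec) from a transcript (assumed format)
    turns:     list of (speaker, start_sec, end_sec) from speaker recognition (assumed format)
    """
    labelled = []
    for text, start, end in sentences:
        # Pick the turn sharing the most time with this sentence.
        best = max(turns, key=lambda t: overlap(start, end, t[1], t[2]), default=None)
        if best is None or overlap(start, end, best[1], best[2]) == 0.0:
            speaker = "unknown"    # no turn covers this sentence at all
        else:
            speaker = best[0]
        labelled.append((speaker, text))
    return labelled

if __name__ == "__main__":
    turns = [("A", 0.0, 4.0), ("B", 4.0, 9.0)]
    sentences = [("Shall we begin?", 0.5, 2.0),
                 ("Yes, go ahead.", 4.2, 5.5),
                 ("Thanks.", 9.5, 10.0)]
    for speaker, text in label_sentences(sentences, turns):
        print(speaker + ": " + text)
```

The result reads like meeting minutes, with each line prefixed by its speaker, and sentences that no turn covers are marked "unknown" rather than guessed.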

NSTT is a suite of technologies licensed under IMDA’s Technology Licensing terms. Industry partners are welcome to speak to DSL to access the technologies.

Who is it for?

Technology providers and developers, Institutes of Higher Learning (IHLs), Research Institutes (RIs) and individuals are welcome to use our technologies.

Demo Tool

For NSTT demo enquiries, please get in contact with us.

What Our Partners Say

“An overall 8-hour exercise can drop to about 2 to 3 hours … 90-plus per cent (sub-titling accuracy) is quite good considering a big chunk of it gets done within minutes … This software has now been developed to a stage where it even does automatic generation of punctuation, capitalisation, full-stops. All of which are critical for the sub-titling process.”

Mr. Anil Nihalani, Head of Connected Media, MediaCorp Pte Ltd, on how the NSTT has improved the efficiency of the sub-titling process.

Everybody loves inspiring Success Stories. Find out how the technologies have been implemented in the industry here.


1. Will there be future updates to the software?

The NSTT will be updated whenever there are new developments.

2. How do I use the demo?

Please contact us so that we can discuss your use cases and advise you appropriately.

3. What are the terms of use for this tool?

If you are keen to obtain the source code for the NSTT, please reach us by email. The transfer of the source code requires a Technology Licensing Agreement with IMDA.


For further enquiries on the Natural Speech and Transcription Technologies, please contact

Last updated on: 30 Oct 2019