Cost-Effective Speech to Text Without Compromising Functionality

Two requirements must be balanced for sustainable use of a Speech to Text service in any type of application. First, the service has to transcribe speech accurately. Second, its cost has to be acceptable.

Suppose that the scenario below is your Speech to Text requirement.

The Scenario

You have a 32-channel call center application running in a bank, and you are thinking of implementing the following two functions with Speech to Text.

Function 1:

When a call is received, your application says to the caller, "Welcome to X Bank. Please say what action you want to perform." It then gets the caller's transcript through Speech to Text and redirects the caller to the appropriate department or jumps to the next IVR node.

  • Approximately 70,000 requests are processed per day.

Function 2:

Your application listens to a customer complaint for 1 minute and sends the transcribed text to the related department as an email message.

  • Approximately 80 requests are processed per day.

What is the cost of the above scenario?

Google Cost

If you use the Google SR service, your cost will be about $12,657 at the end of the month.

Function-1: (70,000 * 30 * $0.006) = $12,600

Function-2: (80 * 30 * $0.024) = $57.60
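
The arithmetic above can be reproduced with a short script. This is only a sketch: the per-request prices and daily volumes are the ones quoted in the scenario, the 30-day month is the assumption used in the calculation, and the function name `monthly_cost` is introduced here for illustration.

```python
# Monthly cost under a charge-per-request model, using the scenario's figures.
DAYS_PER_MONTH = 30  # the calculation above assumes a 30-day month


def monthly_cost(requests_per_day: int, price_per_request: float) -> float:
    """Return the monthly cost in dollars for a fixed daily request volume."""
    return requests_per_day * DAYS_PER_MONTH * price_per_request


function_1 = monthly_cost(70_000, 0.006)  # short IVR prompts
function_2 = monthly_cost(80, 0.024)      # 1-minute complaint recordings
total = function_1 + function_2

print(f"Function-1: ${function_1:,.2f}")
print(f"Function-2: ${function_2:,.2f}")
print(f"Total:      ${total:,.2f}")
```

Changing the daily volumes or prices in the two calls shows how quickly a per-request bill grows with call volume.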

Teknoses Cost

Depending on the type of license you use:

  • If you use a "Teknoses High Volume Cloud License", your cost will be approximately 80% less than Google's.
  • If you use a "Teknoses On-Premise License", you will not have to pay any additional fees beyond your license price.


How do we offer this exciting reduction without compromising functionality?

Typically, all existing Speech to Text providers except Teknoses offer only the CPR (Charge Per Request) pricing model.

We offer two licensing models at the same time, so that you can strike the best balance between requirements and cost. The two models can be used separately or in mixed mode. In addition to CPRL (Charge Per Request License), we offer LGL (Limited Grammar License).

  • CPRL is best suited if you want the full transcript of the speech.
  • LGL is best suited if it is enough to get the important words or phrases in the speech (this reduces the Speech to Text cost).

LGL is a channel-based, fixed-price, limited-grammar Speech to Text license. LGL only works with the words or phrases you specify in your grammar.
  • LGL returns the words it finds if they are defined in your grammar.
  • LGL returns the "NIG" keyword if the words it finds are not defined in your grammar.
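
The two rules above can be illustrated with a small simulation. This is a hypothetical sketch, not the Teknoses API: it works on already-typed text rather than audio, and the function name `recognize_with_grammar` and the sample grammar entries are assumptions made for the example. Only the "NIG" keyword comes from the description above.

```python
# Hypothetical simulation of limited-grammar behaviour; not the Teknoses API.
NOT_IN_GRAMMAR = "NIG"  # keyword returned for speech outside the grammar


def recognize_with_grammar(utterance: str, grammar: list[str]) -> list[str]:
    """Return the grammar entries found in the utterance, or ["NIG"] if none."""
    words = utterance.lower().split()
    found = []
    for entry in grammar:
        entry_words = entry.lower().split()
        n = len(entry_words)
        # An entry matches when its words appear as a consecutive run.
        if any(words[i:i + n] == entry_words for i in range(len(words) - n + 1)):
            found.append(entry)
    return found or [NOT_IN_GRAMMAR]


# Sample grammar, made up for illustration.
grammar = ["lost", "card", "balance", "money transfer"]

print(recognize_with_grammar("I lost my card yesterday", grammar))
print(recognize_with_grammar("what is the weather today", grammar))
```

The first call returns the in-grammar words "lost" and "card"; the second contains nothing from the grammar, so the result is the single "NIG" keyword.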


But you may think you need all of the words.

All right, please let us explain.


Are all words really required?
  • Function-1: Definitely not. Not all words are required. You can use LGL. (Note that about 99.5% of the cost comes from Function-1.)
  • Function-2: Yes, all words are required. You have to use CPRL.

If you look at Function-1 carefully, you will find that it actually has a "static" structure. By static we mean that it depends on predefined rules. Function-1 gets the caller's transcription, searches it for keywords or phrases, and then executes the most suitable action.

For example, Function-1 redirects the current call to the lost-stolen department if it finds "I lost my card" in the transcription, or if it finds the words "lost" and "card" together.
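
A rule of this kind could be sketched as follows. The rule table, the department names other than "lost-stolen", and the `operator` fallback are all hypothetical and only illustrate the idea of keyword-based routing; a real IVR flow would define its own rules.

```python
import re

# Hypothetical routing rules for Function-1; only "lost-stolen" comes from
# the example above, the other names are made up for illustration.
ROUTING_RULES = [
    # (words that must all be present, target department)
    ({"lost", "card"}, "lost-stolen"),
    ({"stolen", "card"}, "lost-stolen"),
    ({"balance"}, "account-info"),
]
DEFAULT_TARGET = "operator"  # assumed fallback when no rule matches


def route_call(transcription: str) -> str:
    """Return the department of the first rule whose words all appear."""
    present = set(re.findall(r"[a-z']+", transcription.lower()))
    for required, department in ROUTING_RULES:
        if required <= present:  # every required word was found
            return department
    return DEFAULT_TARGET


print(route_call("I lost my card."))
print(route_call("hello, I have a question"))
```

Because the rule only checks for "lost" and "card", the phrase "I lost my card" is covered without a separate entry, and any utterance that matches no rule goes to the fallback.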


Here is the key point:

Function-1 is not actually interested in all of the words. It is only interested in the words or phrases you have predefined, and this is what allows you to reduce your cost. There is therefore no barrier to using LGL in Function-1 without compromising functionality. You can simply specify a grammar for Function-1 that includes the keywords and phrases you need, and then quickly start using LGL-licensed Speech to Text.