Using a speech engine to highlight words: Difference between revisions


Revision as of 13:52, 1 January 2013

Send a piece of text in a simple GUI through a text-to-speech engine (producing spoken output). At the same time as each word is being spoken, highlight the word in the GUI. (The GUI does not need to be interactive, but some extra kudos for allowing users of the code to provide their own text.) In languages where cursor control and highlighting are not possible, it is permissible to output each word as it is spoken.

AutoHotkey

We use the simple SAPI.SPVoice COM object and a parsing loop. The highlighting is done with EM_SETSEL in Notepad. Rather crude, but it works. Because the parsing loop only emits a word when it reaches a space, the text must end with a space.

<lang AutoHotkey>SetTitleMatchMode 2
EM_SETSEL := 0x00B1

Run notepad,,,pid
WinWaitActive ahk_pid %pid%
ControlSetText, Edit1, % text := "AutoHotkey was the first to implement this task! ", ahk_pid %pid%

pVoice := ComObjCreate("Sapi.spvoice"), i := 1 ; the spvoice COM Object ships with the OS

; parse the text

While lf := SubStr(text, i, 1) {

  If lf = %A_Space%
  {
     SendMessage, EM_SetSel, % i-StrLen(word)-1, % i-1, Edit1, ahk_pid %pid%
     pVoice.speak(word), word := "", i++
  }
  Else word .= lf, i++

}</lang>
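
The offset arithmetic in the loop above can also be expressed with a regular-expression scan. The following Ruby sketch is illustrative only and is not part of the AutoHotkey solution; `word_spans` is a hypothetical helper name. For each word it returns the zero-based start offset and the offset just past the last character, which is the same pair the loop passes to EM_SETSEL, and it does not require the text to end with a space.

```ruby
# Hypothetical helper: returns [word, start, end] triples, where start is the
# zero-based character offset of the word and end is the offset just past it.
def word_spans(text)
  spans = []
  text.scan(/\S+/) do |word|
    match = Regexp.last_match          # MatchData for the current word
    spans << [word, match.begin(0), match.end(0)]
  end
  spans
end
```

For the text "AutoHotkey was first", the first triple is `["AutoHotkey", 0, 10]`, matching the values `i-StrLen(word)-1` and `i-1` that the loop computes when it reaches the first space.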

Mathematica

<lang>DynamicModule[{text = "This is some text.", words, i = 0},

Panel@Column@{Dynamic[
    Row[Riffle[
      If[i != 0, MapAt[Style[#, Red] &, #, i], #] &@(words = 
         StringSplit@text), " "]]], InputField[Dynamic@text, String],
    Button["Speak", 
    While[i < Length@words, i++; FinishDynamic[]; Speak[words[[i]]]; 
     Pause[Max[0.7, 0.12 StringLength[words[[i]]]]]]; i = 0]}]</lang>

Ruby

Library: Shoes

I'm having difficulty figuring out how to get Shoes to update the GUI (like Tk's update command), so the user must click the button once for each word.

Uses the Ruby code from Speech synthesis.

<lang ruby>load 'speechsynthesis.rb'

if ARGV.length == 1

 $text = "This is default text for the highlight and speak program"

else

 $text = ARGV[1..-1].join(" ")

end
$words = $text.split

Shoes.app do

 @idx = 0

 stack do
   @sentence = para(strong($words[0] + " "), $words[1..-1].map {|word| span(word + " ")})
   button "Say word" do
     say_and_highlight
   end
 end

 keypress do |key|
   case key
   when :control_q, "\x11" then exit
   end
 end

 def say_and_highlight
   speak $words[@idx]
   @idx = (@idx + 1) % $words.length
   @sentence.replace($words.each_with_index.map {|word, idx| idx == @idx ? strong(word + " ") : span(word + " ")})
 end

end</lang>
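
The task also permits printing each word as it is spoken when highlighting is impractical. A minimal console sketch of that fallback follows. `speak_and_echo` and `DEFAULT_SPEAKER` are hypothetical names, and the default speaker assumes the macOS `say` command at /usr/bin/say. Any callable can be passed instead, for example one that wraps the `speak` method loaded from speechsynthesis.rb.

```ruby
# Assumption: macOS `say` is installed at /usr/bin/say. system() blocks until
# the word has been spoken, which keeps the output and the speech in step.
DEFAULT_SPEAKER = ->(word) { system("/usr/bin/say", word) }

# Hypothetical helper: print each word, then speak it, one word at a time.
def speak_and_echo(text, speaker: DEFAULT_SPEAKER, out: $stdout)
  text.split.each do |word|
    out.puts word        # show the word just before it is spoken
    out.flush            # make sure it is visible before the speaker blocks
    speaker.call(word)
  end
end
```

Calling `speak_and_echo("This is default text")` prints one word per line while speaking it. Passing `speaker: ->(w) { speak w }` would reuse the existing speech code, at the cost of the same word-by-word delivery.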

Tcl

This code uses the external /usr/bin/say program (known to be available on Mac OS X) as its interface to the speech engine. The resulting speech is rather stilted because the text is spoken one word at a time rather than as a whole sentence, which is what keeps the highlighting synchronized.

Library: Tk

<lang tcl>package require Tcl 8.5
package require Tk 8.5
proc say {text button} {

   grab $button
   $button configure -state disabled -cursor watch
   update
   set starts [$text search -all -regexp -count lengths {\S+} 1.0]
   foreach start $starts length $lengths {

       lappend strings [$text get $start "$start + $length char"]
       lappend ends [$text index "$start + $length char"]

   }
   $text tag remove sel 1.0 end
   foreach from $starts str $strings to $ends {

       $text tag add sel $from $to
       update idletasks
       exec /usr/bin/say << $str
       $text tag remove sel 1.0 end

   }
   grab release $button
   $button configure -state normal -cursor {}

}

pack [text .t]
pack [button .b -text "Speak, computer!" -command {say .t .b}] -fill x
.t insert 1.0 "This is an example of speech synthesis with Tcl/Tk."</lang>