Class: Google::Cloud::Speech::Audio
- Inherits: Object
  - Object
    - Google::Cloud::Speech::Audio
- Defined in: lib/google/cloud/speech/audio.rb
Overview
Audio
Represents a source of audio data, with related metadata such as the audio encoding, sample rate, and language.
See Project#audio.
Instance Attribute Summary
- #encoding ⇒ String, Symbol
  Encoding of audio data to be recognized.
- #language ⇒ String, Symbol
  The language of the supplied audio as a BCP-47 language code.
- #sample_rate ⇒ Integer
  Sample rate in Hertz of the audio data to be recognized.
Instance Method Summary
- #process(max_alternatives: nil, profanity_filter: nil, phrases: nil) ⇒ Operation
  (also: #long_running_recognize, #recognize_job)
  Performs asynchronous speech recognition.
- #recognize(max_alternatives: nil, profanity_filter: nil, phrases: nil) ⇒ Array<Result>
  Performs synchronous speech recognition.
Instance Attribute Details
#encoding ⇒ String, Symbol
Encoding of audio data to be recognized.
Acceptable values are:
- linear16: Uncompressed 16-bit signed little-endian samples. (LINEAR16)
- flac: The Free Lossless Audio Codec encoding. Only 16-bit samples are supported. Not all fields in STREAMINFO are supported. (FLAC)
- mulaw: 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law. (MULAW)
- amr: Adaptive Multi-Rate Narrowband codec. (sample_rate must be 8000 Hz.) (AMR)
- amr_wb: Adaptive Multi-Rate Wideband codec. (sample_rate must be 16000 Hz.) (AMR_WB)
- ogg_opus: Ogg Mapping for Opus. (OGG_OPUS) Lossy codecs are not recommended, as they result in lower-quality speech transcription.
- speex: Speex with header byte. (SPEEX_WITH_HEADER_BYTE) Lossy codecs are not recommended, as they result in lower-quality speech transcription. If you must use a low-bitrate encoder, OGG_OPUS is preferred.
# File 'lib/google/cloud/speech/audio.rb', line 98
def encoding
  @encoding
end
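The list of accepted values can be captured in a small lookup table. The sketch below is illustrative only: `ENCODING_CONSTANTS`, `REQUIRED_SAMPLE_RATES`, `encoding_constant`, and `sample_rate_matches_encoding?` are hypothetical names that are not part of the gem. It pairs each short name with the constant shown in parentheses above and checks the fixed sample rates stated for the two AMR codecs.

```ruby
# Hypothetical helpers, not part of google-cloud-speech.
# Short encoding names paired with the constants listed in the docs above.
ENCODING_CONSTANTS = {
  linear16: "LINEAR16",
  flac:     "FLAC",
  mulaw:    "MULAW",
  amr:      "AMR",
  amr_wb:   "AMR_WB",
  ogg_opus: "OGG_OPUS",
  speex:    "SPEEX_WITH_HEADER_BYTE"
}.freeze

# Sample rates (Hz) that the docs above require for the AMR codecs.
REQUIRED_SAMPLE_RATES = { amr: 8000, amr_wb: 16000 }.freeze

# Accepts a String or Symbol in any case and returns the matching constant.
def encoding_constant(encoding)
  key = encoding.to_s.downcase.to_sym
  ENCODING_CONSTANTS.fetch(key) do
    raise ArgumentError, "unknown encoding: #{encoding.inspect}"
  end
end

# True when the encoding has no fixed sample rate, or the given rate matches it.
def sample_rate_matches_encoding?(encoding, sample_rate)
  required = REQUIRED_SAMPLE_RATES[encoding.to_s.downcase.to_sym]
  required.nil? || required == sample_rate
end
```

Because #encoding is documented as String or Symbol, the helper normalizes both forms before the lookup.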
#language ⇒ String, Symbol
The language of the supplied audio as a BCP-47 language code, for example "en-US" for English (United States), "en-GB" for English (United Kingdom), or "fr-FR" for French (France). See Language Support for a list of the currently supported language codes.
# File 'lib/google/cloud/speech/audio.rb', line 122
def language
  @language
end
#sample_rate ⇒ Integer
Sample rate in Hertz of the audio data to be recognized. Valid values are: 8000-48000. 16000 is optimal. For best results, set the sampling rate of the audio source to 16000 Hz. If that's not possible, use the native sample rate of the audio source (instead of re-sampling).
# File 'lib/google/cloud/speech/audio.rb', line 144
def sample_rate
  @sample_rate
end
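A caller may want to check a sample rate locally before sending audio. The following is a minimal sketch of the documented 8000-48000 Hz range and the 16000 Hz recommendation; `VALID_SAMPLE_RATES`, `OPTIMAL_SAMPLE_RATE`, and `check_sample_rate` are hypothetical names, and the gem itself is not assumed to perform this check.

```ruby
# Hypothetical helper, not part of google-cloud-speech.
VALID_SAMPLE_RATES  = (8000..48000).freeze # documented valid range, in Hz
OPTIMAL_SAMPLE_RATE = 16000                # documented optimal rate, in Hz

# Raises if the rate is outside the documented range; otherwise reports
# whether it is the recommended rate.
def check_sample_rate(sample_rate)
  unless VALID_SAMPLE_RATES.cover?(sample_rate)
    raise ArgumentError,
          "sample_rate must be between 8000 and 48000 Hz, got #{sample_rate}"
  end
  sample_rate == OPTIMAL_SAMPLE_RATE ? :optimal : :acceptable
end
```

A result of :acceptable is not a prompt to resample: the text above advises using the source's native rate when 16000 Hz is not possible.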
Instance Method Details
#process(max_alternatives: nil, profanity_filter: nil, phrases: nil) ⇒ Operation
Also known as: long_running_recognize, recognize_job
Performs asynchronous speech recognition. Requests are processed asynchronously: an Operation is returned as soon as the audio data has been sent, and it can be refreshed to retrieve the recognition results once the audio data has been processed.
# File 'lib/google/cloud/speech/audio.rb', line 262
def process max_alternatives: nil, profanity_filter: nil, phrases: nil
  ensure_speech!

  speech.process self, encoding: encoding,
                       sample_rate: sample_rate,
                       language: language,
                       max_alternatives: max_alternatives,
                       profanity_filter: profanity_filter,
                       phrases: phrases
end
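The refresh-until-done pattern described for #process can be sketched as a simple polling loop. This is an illustration under stated assumptions: the operation is assumed to respond to `done?`, `reload!`, and `results`, and `FakeOperation` and `poll_for_results` are hypothetical stand-ins so that the sketch runs without credentials. Check the Operation class documentation for the methods the gem actually provides.

```ruby
# Hypothetical stand-in for the Operation returned by #process.
# It reports done after a fixed number of refreshes.
class FakeOperation
  def initialize(refreshes_needed)
    @remaining = refreshes_needed
  end

  def done?
    @remaining <= 0
  end

  # Simulates one refresh of the operation's state.
  def reload!
    @remaining -= 1 if @remaining > 0
    self
  end

  def results
    done? ? ["transcript placeholder"] : []
  end
end

# Refreshes the operation until it is done or the attempt budget runs out.
# Returns the results, or nil if the operation did not finish in time.
def poll_for_results(operation, max_attempts: 10, interval: 1.0)
  max_attempts.times do
    return operation.results if operation.done?
    sleep interval
    operation.reload!
  end
  operation.done? ? operation.results : nil
end

results = poll_for_results(FakeOperation.new(2), interval: 0)
```

Bounding the number of attempts keeps a caller from waiting indefinitely on an operation that never completes.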
#recognize(max_alternatives: nil, profanity_filter: nil, phrases: nil) ⇒ Array<Result>
Performs synchronous speech recognition. Sends audio data to the Speech API, which performs recognition on that data, and returns results only after all audio has been processed. Limited to audio data of 1 minute or less in duration.
The Speech API will take roughly the same amount of time to process audio data sent synchronously as the duration of the supplied audio data. That is, if you send audio data of 30 seconds in length, expect the synchronous request to take approximately 30 seconds to return results.
# File 'lib/google/cloud/speech/audio.rb', line 212
def recognize max_alternatives: nil, profanity_filter: nil, phrases: nil
  ensure_speech!

  speech.recognize self, encoding: encoding,
                         sample_rate: sample_rate,
                         language: language,
                         max_alternatives: max_alternatives,
                         profanity_filter: profanity_filter,
                         phrases: phrases
end
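Since #recognize is limited to one minute of audio, a caller can estimate the duration up front and fall back to #process for longer input. The sketch below assumes headerless linear16 audio (2 bytes per sample) and mono by default; `SYNC_LIMIT_SECONDS`, `linear16_duration_seconds`, and `recognition_method_for` are hypothetical names, not part of the gem.

```ruby
# Hypothetical helpers, not part of google-cloud-speech.
SYNC_LIMIT_SECONDS = 60 # documented limit for synchronous recognition

# Duration of raw (headerless) linear16 audio: 2 bytes per sample per channel.
def linear16_duration_seconds(byte_size, sample_rate, channels: 1)
  byte_size.to_f / (sample_rate * 2 * channels)
end

# Picks the method name to call on the audio object for a given duration.
def recognition_method_for(duration_seconds)
  duration_seconds <= SYNC_LIMIT_SECONDS ? :recognize : :process
end

duration = linear16_duration_seconds(960_000, 16000) # 30.0 seconds
method_name = recognition_method_for(duration)       # :recognize
```

By the estimate given above, a synchronous request for this 30-second clip would be expected to take roughly 30 seconds to return.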