Module: Google::Cloud::Speech
Defined in:
lib/google/cloud/speech.rb,
lib/google/cloud/speech/job.rb,
lib/google/cloud/speech/audio.rb,
lib/google/cloud/speech/result.rb,
lib/google/cloud/speech/stream.rb,
lib/google/cloud/speech/project.rb,
lib/google/cloud/speech/service.rb,
lib/google/cloud/speech/version.rb,
lib/google/cloud/speech/credentials.rb
Overview
Google Cloud Speech
Google Cloud Speech API enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. The API recognizes over 80 languages and variants to support your global user base. You can transcribe the text of users dictating to an application's microphone, enable command-and-control through voice, or transcribe audio files, among many other use cases. You can recognize audio uploaded in the request or stored on Google Cloud Storage, using the same technology Google uses to power its own products.
For more information about Google Cloud Speech API, read the Google Cloud Speech API Documentation.
The goal of google-cloud is to provide an API that is comfortable to Rubyists. Authentication is handled by #speech. You can provide the project and credential information to connect to the Cloud Speech service, or if you are running on Google Compute Engine this configuration is taken care of for you. You can read more about the options for connecting in the Authentication Guide.
Creating audio sources
You can create an audio object that holds a reference to any one of several types of audio data source, along with metadata such as the audio encoding type.
Use Project#audio to create audio sources for the Cloud Speech API. You can provide a file path:
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
                     encoding: :raw, sample_rate: 16000
Or, you can initialize the audio instance with a Google Cloud Storage URI:
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio "gs://bucket-name/path/to/audio.raw",
                     encoding: :raw, sample_rate: 16000
Or, with a Google Cloud Storage File object:
require "google/cloud/storage"
storage = Google::Cloud::Storage.new
bucket = storage.bucket "bucket-name"
file = bucket.file "path/to/audio.raw"
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio file, encoding: :raw, sample_rate: 16000
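The three kinds of source shown above differ only in the first argument. As a rough illustration, the following sketch classifies a value the way a caller might before handing it to Project#audio. The helper name audio_source_kind and the symbols it returns are hypothetical and not part of the gem, and the check for to_gs_url assumes that a Storage File object responds to that method.

```ruby
# Hypothetical helper, not part of google-cloud-speech: reports which kind
# of audio source a value looks like before it is passed to Project#audio.
def audio_source_kind source
  if source.is_a?(String) && source.start_with?("gs://")
    :storage_uri   # Google Cloud Storage URI
  elsif source.is_a?(String)
    :file_path     # path to a local file
  elsif source.respond_to?(:to_gs_url)
    :storage_file  # assumed duck type of a Storage File object
  else
    :unknown
  end
end

audio_source_kind "gs://bucket-name/path/to/audio.raw" #=> :storage_uri
audio_source_kind "path/to/audio.raw"                  #=> :file_path
```

The sketch makes no API calls; it only inspects the value it is given.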
Recognizing speech
The instance methods on Audio can be used to invoke both synchronous and asynchronous versions of the Cloud Speech API speech recognition operation.
Use Audio#recognize for synchronous speech recognition that returns Result objects only after all audio has been processed. This method is limited to audio data of 1 minute or less in duration, and will take roughly the same amount of time to process as the duration of the supplied audio data.
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
                     encoding: :raw, sample_rate: 16000
results = audio.recognize
result = results.first
result.transcript #=> "how old is the Brooklyn Bridge"
result.confidence #=> 0.9826789498329163
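Because Audio#recognize is limited to 1 minute of audio, it can help to estimate the duration of a raw file before choosing a method. The sketch below is one way to do that, assuming :raw audio is single-channel with 16-bit (2-byte) samples. The helper names raw_audio_duration and recognition_method are hypothetical and not part of the gem.

```ruby
# Hypothetical helper: estimated duration in seconds of raw audio data,
# assuming single-channel audio with 2 bytes per sample.
def raw_audio_duration byte_size, sample_rate, bytes_per_sample: 2
  byte_size.to_f / (sample_rate * bytes_per_sample)
end

# Hypothetical helper: name of the Audio method suited to the duration.
# The 60-second limit is the one stated for synchronous recognition.
def recognition_method byte_size, sample_rate, limit: 60
  if raw_audio_duration(byte_size, sample_rate) <= limit
    :recognize
  else
    :recognize_job
  end
end

raw_audio_duration 320_000, 16000   #=> 10.0
recognition_method 320_000, 16000   #=> :recognize
recognition_method 3_200_000, 16000 #=> :recognize_job
```

For a local file, File.size("path/to/audio.raw") would supply the byte count.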
Use Audio#recognize_job for asynchronous speech recognition, in which a Job is returned immediately after the audio data has been sent. The job can be refreshed to retrieve Result objects once the audio data has been processed.
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
                     encoding: :raw, sample_rate: 16000
job = audio.recognize_job
job.done? #=> false
job.reload!
job.done? #=> true
results = job.results
result = results.first
result.transcript #=> "how old is the Brooklyn Bridge"
result.confidence #=> 0.9826789498329163
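In practice a job is rarely done after a single reload. The following sketch shows one possible polling loop built only on the done? and reload! methods used above. The helper name wait_for_job and its attempts and delay parameters are hypothetical and not part of the gem.

```ruby
# Hypothetical helper: reloads a job until it reports done or the attempts
# run out. Works with any object that responds to done? and reload!.
# Returns the job when it is done, or nil if it never finished.
def wait_for_job job, attempts: 10, delay: 1.0
  attempts.times do
    return job if job.done?
    sleep delay   # pause between refreshes
    job.reload!
  end
  job.done? ? job : nil
end
```

A caller could pass the return value of audio.recognize_job to wait_for_job and read results from the returned job, treating nil as a timeout.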
Use Project#stream for streaming audio data for speech recognition, in which a Stream is returned. The stream object can receive results while audio is still being sent, because it performs bidirectional streaming speech recognition.
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
stream = speech.stream encoding: :raw, sample_rate: 16000
# register callback for when a result is returned
stream.on_result do |results|
  result = results.first
  result.transcript #=> "how old is the Brooklyn Bridge"
  result.confidence #=> 0.9826789498329163
end
# Stream 5 seconds of audio from the microphone
# Actual implementation of microphone input varies by platform
5.times do
  stream.send MicrophoneInput.read(32000)
end
stream.stop
Obtaining audio data from input sources such as a microphone is outside the scope of this document.
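The 32000 bytes per read in the example above corresponds to one second of audio at 16000 Hz, assuming single-channel 16-bit samples. As a stand-in for microphone input, the sketch below reads any IO object in chunks of that size. The helper name each_audio_chunk is hypothetical and not part of the gem, and a StringIO filled with silence takes the place of a real audio file.

```ruby
require "stringio"

# Hypothetical helper: reads an IO in fixed-size chunks and yields each one.
# The final chunk may be shorter than chunk_size.
def each_audio_chunk io, chunk_size: 32000
  # IO#read with a positive length returns nil at end of file
  while (chunk = io.read(chunk_size))
    yield chunk
  end
end

# Stand-in for an audio file: 80000 bytes, or 2.5 seconds of silence
io = StringIO.new("\x00".b * 80_000)
sizes = []
each_audio_chunk(io) { |chunk| sizes << chunk.bytesize }
sizes #=> [32000, 32000, 16000]
```

With a real stream, the block would pass each chunk to stream.send in place of collecting sizes.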
Defined Under Namespace
Classes: Audio, InterimResult, Job, Project, Result, Stream
Constant Summary
VERSION = "0.23.0"
Class Method Summary
.new(project: nil, keyfile: nil, scope: nil, timeout: nil, client_config: nil) ⇒ Google::Cloud::Speech::Project
Creates a new object for connecting to the Speech service.
Class Method Details
.new(project: nil, keyfile: nil, scope: nil, timeout: nil, client_config: nil) ⇒ Google::Cloud::Speech::Project
Creates a new object for connecting to the Speech service. Each call creates a new connection.
For more information on connecting to Google Cloud see the Authentication Guide.
# File 'lib/google/cloud/speech.rb', line 208
def self.new project: nil, keyfile: nil, scope: nil, timeout: nil,
             client_config: nil
  project ||= Google::Cloud::Speech::Project.default_project
  if keyfile.nil?
    credentials = Google::Cloud::Speech::Credentials.default scope: scope
  else
    credentials = Google::Cloud::Speech::Credentials.new(
      keyfile, scope: scope)
  end
  Google::Cloud::Speech::Project.new(
    Google::Cloud::Speech::Service.new(
      project, credentials, timeout: timeout, client_config: client_config))
end