Class: Google::Cloud::Language::Document

Inherits:
Object
Defined in:
lib/google/cloud/language/document.rb

Overview

Document

Represents a document for the Language service.

Note that only English, Spanish, and Japanese content is supported.

See Project#document.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

content = "Star Wars is a great movie. The Death Star is fearsome."
document = language.document content
annotation = document.annotate

annotation.entities.count #=> 3
annotation.sentiment.score #=> 0.10000000149011612
annotation.sentiment.magnitude #=> 1.100000023841858
annotation.sentences.count #=> 2
annotation.tokens.count #=> 13

Instance Method Details

#annotate(sentiment: false, entities: false, syntax: false) ⇒ Annotation

Also known as: mark, detect

Analyzes the document and returns sentiment, entity, and syntactic results, depending on the feature flags. Calling annotate with no arguments performs every analysis. Each feature is priced separately; see Pricing for details.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

content = "Star Wars is a great movie. The Death Star is fearsome."
document = language.document content
annotation = document.annotate

annotation.sentiment.score #=> 0.10000000149011612
annotation.sentiment.magnitude #=> 1.100000023841858
annotation.entities.count #=> 3
annotation.sentences.count #=> 2
annotation.tokens.count #=> 13

With feature flags:

require "google/cloud/language"

language = Google::Cloud::Language.new

content = "Star Wars is a great movie. The Death Star is fearsome."
document = language.document content
annotation = document.annotate entities: true, syntax: true

annotation.entities.count #=> 3
annotation.sentences.count #=> 2
annotation.tokens.count #=> 13

Parameters:

  • sentiment (Boolean)

    Whether to perform sentiment analysis. Optional. The default is false. If every feature option is false, all features will be performed.

  • entities (Boolean)

    Whether to perform entity analysis. Optional. The default is false. If every feature option is false, all features will be performed.

  • syntax (Boolean)

    Whether to perform syntactic analysis. Optional. The default is false. If every feature option is false, all features will be performed.

Returns:

  • (Annotation)

    The results of the content analysis.
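The flag rule described in the parameters can be sketched in isolation. The helper below is hypothetical (`resolve_features` is not part of google-cloud-language) and makes no API call; it only mirrors the documented behavior that leaving every flag false requests all features, wherever that defaulting actually happens in the library or the service.

```ruby
# Hypothetical helper, not part of google-cloud-language. It mirrors the
# documented rule: if every feature flag is false, all features are run.
def resolve_features sentiment: false, entities: false, syntax: false
  flags = { sentiment: sentiment, entities: entities, syntax: syntax }
  # No flag set: fall back to requesting every analysis.
  return flags.transform_values { true } if flags.values.none?
  flags
end

all_features = resolve_features
all_features[:sentiment] # true, because no flag was given

some_features = resolve_features entities: true, syntax: true
some_features[:sentiment] # false, because other flags were set explicitly
```

Because each feature is priced separately, passing explicit flags is the way to avoid paying for analyses that are not needed.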



# File 'lib/google/cloud/language/document.rb', line 216

def annotate sentiment: false, entities: false, syntax: false
  ensure_service!
  grpc = service.annotate to_grpc, sentiment: sentiment,
                                   entities: entities,
                                   syntax: syntax
  Annotation.from_grpc grpc
end

#entities ⇒ Annotation::Entities

Entity analysis inspects the given text for known entities (proper nouns such as public figures, landmarks, etc.) and returns information about those entities.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

content = "Star Wars is a great movie. The Death Star is fearsome."
document = language.document content
entities = document.entities # API call

entities.count #=> 3
entities.first.name #=> "Star Wars"
entities.first.type #=> :WORK_OF_ART
entities.first.mid #=> "/m/06mmr"

Returns:

  • (Annotation::Entities)



# File 'lib/google/cloud/language/document.rb', line 284

def entities
  ensure_service!
  grpc = service.entities to_grpc
  Annotation::Entities.from_grpc grpc
end

#format ⇒ Symbol

The document's format.

Returns:

  • (Symbol)

    :text or :html



# File 'lib/google/cloud/language/document.rb', line 85

def format
  return :text if text?
  return :html if html?
end

#format=(new_format) ⇒ Object

Sets the document's format.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

document = language.document "<p>The Old Man and the Sea</p>"
document.format = :html

Parameters:

  • new_format (Symbol, String)

    Accepted values are :text or :html.



# File 'lib/google/cloud/language/document.rb', line 104

def format= new_format
  @grpc.type = :PLAIN_TEXT if new_format.to_s == "text"
  @grpc.type = :HTML       if new_format.to_s == "html"
  @grpc.type
end
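The format reader and writer can be exercised without the service. The sketch below copies their logic onto a stand-in: `FakeGrpcDocument` and `FormatSketch` are hypothetical names, and the initial `:PLAIN_TEXT` type is an assumption made only so the example has a starting state. It shows that Strings and Symbols are both accepted and that an unrecognized value leaves the type unchanged.

```ruby
# Stand-in for the gRPC document message; only the type field is modeled.
FakeGrpcDocument = Struct.new(:type)

# Hypothetical class that copies the format logic from the source listings.
class FormatSketch
  def initialize
    # Assumption: start as plain text so the example has a known state.
    @grpc = FakeGrpcDocument.new(:PLAIN_TEXT)
  end

  def format
    return :text if @grpc.type == :PLAIN_TEXT
    return :html if @grpc.type == :HTML
  end

  def format= new_format
    @grpc.type = :PLAIN_TEXT if new_format.to_s == "text"
    @grpc.type = :HTML       if new_format.to_s == "html"
    @grpc.type
  end
end

doc = FormatSketch.new
doc.format = "html"     # a String works as well as a Symbol
doc.format              # :html
doc.format = :markdown  # unrecognized, so the type is left as it was
doc.format              # still :html
```

Since an unsupported value is ignored rather than rejected, reading the format back after setting it is a cheap way to confirm the change took effect.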

#html! ⇒ Object

Sets the document to the HTML format.



# File 'lib/google/cloud/language/document.rb', line 138

def html!
  @grpc.type = :HTML
end

#html? ⇒ Boolean

Whether the document is in the HTML format.

Returns:

  • (Boolean)


# File 'lib/google/cloud/language/document.rb', line 131

def html?
  @grpc.type == :HTML
end

#language ⇒ String

The document's language. ISO and BCP-47 language codes are supported.

Returns:

  • (String)


# File 'lib/google/cloud/language/document.rb', line 147

def language
  @grpc.language
end

#language=(new_language) ⇒ Object

Sets the document's language.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

document = language.document "<p>El viejo y el mar</p>"
document.language = "es"

Parameters:

  • new_language (String, Symbol)

    ISO and BCP-47 language codes are accepted.



# File 'lib/google/cloud/language/document.rb', line 165

def language= new_language
  @grpc.language = new_language.to_s
end
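The setter converts its argument with `to_s`, so a Symbol is stored as a String. A minimal sketch with a stand-in message follows; `FakeLanguageMessage` and `LanguageSketch` are hypothetical names, and the empty initial language is an assumption.

```ruby
# Stand-in for the gRPC document message; only the language field is modeled.
FakeLanguageMessage = Struct.new(:language)

# Hypothetical class that copies the language accessors from the listings.
class LanguageSketch
  def initialize
    # Assumption: no language set yet.
    @grpc = FakeLanguageMessage.new("")
  end

  def language
    @grpc.language
  end

  def language= new_language
    @grpc.language = new_language.to_s
  end
end

doc = LanguageSketch.new
doc.language = :es
doc.language # "es", a String even though a Symbol was assigned
```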

#sentiment ⇒ Annotation::Sentiment

Sentiment analysis inspects the given text and identifies the prevailing emotional opinion within the text, especially to determine a writer's attitude as positive, negative, or neutral. Currently, only English is supported for sentiment analysis.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

content = "Star Wars is a great movie. The Death Star is fearsome."
document = language.document content

sentiment = document.sentiment

sentiment.score #=> 0.10000000149011612
sentiment.magnitude #=> 1.100000023841858
sentiment.language #=> "en"

sentence = sentiment.sentences.first
sentence.sentiment.score #=> 0.699999988079071
sentence.sentiment.magnitude #=> 0.699999988079071

Returns:

  • (Annotation::Sentiment)



# File 'lib/google/cloud/language/document.rb', line 317

def sentiment
  ensure_service!
  grpc = service.sentiment to_grpc
  Annotation::Sentiment.from_grpc grpc
end
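The long decimals in the example output, such as 0.10000000149011612, look like single-precision values widened to a Ruby Float. Assuming that is their origin (the page does not say so), the sketch below reproduces the effect locally and shows rounding for display. `to_single_precision` is a hypothetical helper, not a library method.

```ruby
# Hypothetical helper: round-trips a Ruby Float (double precision) through a
# single-precision encoding, the assumed source of the long decimals.
def to_single_precision value
  [value].pack("e").unpack1("e")
end

score = to_single_precision 0.1
score          # 0.10000000149011612
score.round(1) # 0.1
```

Comparing scores with a tolerance, or rounding them first, avoids surprises from exact equality checks such as `score == 0.1`.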

#syntax ⇒ Annotation::Syntax

Syntactic analysis extracts linguistic information, breaking up the given text into a series of sentences and tokens (generally, word boundaries), providing further analysis on those tokens.

Examples:

require "google/cloud/language"

language = Google::Cloud::Language.new

content = "Star Wars is a great movie. The Death Star is fearsome."
document = language.document content

syntax = document.syntax

sentence = syntax.sentences.last
sentence.text #=> "The Death Star is fearsome."
sentence.offset #=> 28

syntax.tokens.count #=> 13
token = syntax.tokens.first

token.text #=> "Star"
token.offset #=> 0
token.part_of_speech.tag #=> :NOUN
token.head_token_index #=> 1
token.label #=> :TITLE
token.lemma #=> "Star"

Returns:

  • (Annotation::Syntax)



# File 'lib/google/cloud/language/document.rb', line 257

def syntax
  ensure_service!
  grpc = service.syntax to_grpc
  Annotation::Syntax.from_grpc grpc
end
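The offsets in the example above can be checked against the original content without calling the service. This sketch assumes offsets are zero-based character positions, which coincide with byte positions for ASCII text like this sample; content with multibyte characters may behave differently.

```ruby
content = "Star Wars is a great movie. The Death Star is fearsome."

# Values copied from the example output above.
sentence_text   = "The Death Star is fearsome."
sentence_offset = 28

# Slice the original content using the reported offset and the text length.
content[sentence_offset, sentence_text.length] # "The Death Star is fearsome."

# The reverse lookup agrees with the reported offset.
content.index(sentence_text) # 28
```

The same slicing works for tokens: the first token, "Star", has offset 0, so `content[0, 4]` returns it.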

#text! ⇒ Object

Sets the document to the TEXT format.



# File 'lib/google/cloud/language/document.rb', line 122

def text!
  @grpc.type = :PLAIN_TEXT
end

#text? ⇒ Boolean

Whether the document is in the TEXT format.

Returns:

  • (Boolean)


# File 'lib/google/cloud/language/document.rb', line 115

def text?
  @grpc.type == :PLAIN_TEXT
end