Class: Google::Cloud::Bigquery::Schema

Inherits:
Object
Defined in:
lib/google/cloud/bigquery/schema.rb,
lib/google/cloud/bigquery/schema/field.rb

Overview

Table Schema

A builder for BigQuery table schemas. It is yielded to the blocks given to Dataset#create_table and Table#schema. Nested and repeated fields are declared with a nested block.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end

Defined Under Namespace

Classes: Field

Class Method Details

.dump(schema, destination) ⇒ Schema

Write a schema as JSON to a file.

The JSON schema file uses the same format as the bq CLI.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
schema = Google::Cloud::Bigquery::Schema.dump(
  table.schema,
  "schema.json"
)

Parameters:

  • schema (Schema)

    A Google::Cloud::Bigquery::Schema.

  • destination (IO, String)

    An IO to which to write the schema, or a String containing the filename to write to.

Returns:

  • (Schema)

    The schema so that commands are chainable.



# File 'lib/google/cloud/bigquery/schema.rb', line 105

def dump schema, destination
  schema.dump destination
end

.load(source) ⇒ Schema

Load a schema from a JSON file.

The JSON schema file uses the same format as the bq CLI: an array of JSON objects, each containing the following properties:

  • name: The column name
  • type: The column's data type
  • description: (Optional) The column's description
  • mode: (Optional) The column's mode (if unspecified, mode defaults to NULLABLE)
  • fields: If type is RECORD, an array of objects defining child fields with these properties
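
As an illustration of the file format described above, here is a minimal sketch that builds such an array with Ruby's standard json library. The column names and the SCHEMA_FIELDS constant are hypothetical, chosen to match the cities_lived example in the overview.

```ruby
require "json"

# Hypothetical schema: one required string column and one repeated record.
# The keys follow the properties listed above; the column names are made up.
SCHEMA_FIELDS = [
  { name: "first_name", type: "STRING", mode: "REQUIRED" },
  {
    name: "cities_lived",
    type: "RECORD",
    mode: "REPEATED",
    description: "Cities the person has lived in",
    fields: [
      { name: "place", type: "STRING", mode: "REQUIRED" },
      { name: "number_of_years", type: "INTEGER", mode: "REQUIRED" }
    ]
  }
].freeze

# Serialize to the JSON array form that a schema.json file would contain.
def schema_json
  JSON.pretty_generate SCHEMA_FIELDS
end

puts schema_json
```

The resulting string is the kind of input the source parameter accepts as a String; writing it to a file first is optional.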

Examples:

require "google/cloud/bigquery"

schema = Google::Cloud::Bigquery::Schema.load(
  File.read("schema.json")
)

Parameters:

  • source (IO, String, Array<Hash>)

    An IO containing the JSON schema, a String containing the JSON schema, or an Array of Hashes containing the schema details.

Returns:

  • (Schema)

    The loaded schema.



# File 'lib/google/cloud/bigquery/schema.rb', line 77

def load source
  new.load source
end

Instance Method Details

#boolean(name, description: nil, mode: :nullable) ⇒ Object

Adds a boolean field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 343

def boolean name, description: nil, mode: :nullable
  add_field name, :boolean, description: description, mode: mode
end
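
The naming rule stated for the name parameter applies to every field-adding method on this page. The listing above does not show the client checking names itself, so the sketch below is a hypothetical pre-flight helper (valid_field_name? and FIELD_NAME_PATTERN are not part of the library) that encodes the rule as a regular expression.

```ruby
# Hypothetical helper, not part of the library: checks a name against the
# documented rule (letters, digits, underscores; starts with a letter or
# underscore; at most 128 characters).
FIELD_NAME_PATTERN = /\A[A-Za-z_][A-Za-z0-9_]{0,127}\z/

def valid_field_name? name
  FIELD_NAME_PATTERN.match? name.to_s
end

puts valid_field_name?("first_name")   # letters and underscores only
puts valid_field_name?("2nd_address")  # starts with a digit
puts valid_field_name?("a" * 129)      # one character too long
```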

#bytes(name, description: nil, mode: :nullable) ⇒ Object

Adds a bytes field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 359

def bytes name, description: nil, mode: :nullable
  add_field name, :bytes, description: description, mode: mode
end

#date(name, description: nil, mode: :nullable) ⇒ Object

Adds a date field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 422

def date name, description: nil, mode: :nullable
  add_field name, :date, description: description, mode: mode
end

#datetime(name, description: nil, mode: :nullable) ⇒ Object

Adds a datetime field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 406

def datetime name, description: nil, mode: :nullable
  add_field name, :datetime, description: description, mode: mode
end

#dump(destination) ⇒ Schema

Write the schema as JSON to a file.

The JSON schema file uses the same format as the bq CLI.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
table.schema.dump "schema.json"

Parameters:

  • destination (IO, String)

    An IO to which to write the schema, or a String containing the filename to write to.

Returns:

  • (Schema)

    The schema so that commands are chainable.



# File 'lib/google/cloud/bigquery/schema.rb', line 254

def dump destination
  if destination.respond_to?(:rewind) && destination.respond_to?(:write)
    destination.rewind
    destination.write JSON.dump(fields.map(&:to_hash))
  else
    File.write String(destination), JSON.dump(fields.map(&:to_hash))
  end

  self
end
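
The destination handling above can be tried without a BigQuery connection. The following sketch copies the same duck-typing check into a standalone helper; write_schema_json and demo_rows are hypothetical names, and the rows stand in for fields.map(&:to_hash).

```ruby
require "json"
require "stringio"
require "tempfile"

# Illustrative stand-in for the destination handling in Schema#dump.
# `rows` plays the role of fields.map(&:to_hash).
def write_schema_json destination, rows
  json = JSON.dump rows
  if destination.respond_to?(:rewind) && destination.respond_to?(:write)
    # IO-like object: write from the beginning of the stream.
    destination.rewind
    destination.write json
  else
    # Anything else is converted to a String and used as a filename.
    File.write String(destination), json
  end
  destination
end

def demo_rows
  [{ "name" => "first_name", "type" => "STRING", "mode" => "REQUIRED" }]
end

io = write_schema_json StringIO.new, demo_rows
puts io.string

Tempfile.create ["schema", ".json"] do |file|
  write_schema_json file.path, demo_rows
  puts File.read(file.path)
end
```

Note that the IO branch rewinds but does not truncate, so passing an IO that already holds longer content may leave trailing bytes after the JSON.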

#empty? ⇒ Boolean

Whether the schema has no fields defined.

Returns:

  • (Boolean)

    true when there are no fields, false otherwise.



# File 'lib/google/cloud/bigquery/schema.rb', line 184

def empty?
  fields.empty?
end

#field(name) {|f| ... } ⇒ Field

Retrieve a field by name.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

field = table.schema.field "name"
field.required? #=> true

Yields:

  • (f)

    The matching field, only when one is found.

Returns:

  • (Field, nil)

    The field object, or nil if no field has the given name.



# File 'lib/google/cloud/bigquery/schema.rb', line 172

def field name
  f = fields.find { |fld| fld.name == name.to_s }
  return nil if f.nil?
  yield f if block_given?
  f
end
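
The lookup logic above has three behaviors worth seeing side by side: the name is converted with to_s, a missing field gives nil, and the block runs only on a match. The sketch below reproduces the method body on hypothetical stand-in classes (LookupField and LookupSchema are not library classes).

```ruby
# Minimal stand-ins for Field and Schema; illustrative only.
LookupField = Struct.new :name, :mode

class LookupSchema
  def initialize fields
    @fields = fields
  end

  # Same shape as the method above: find by name, yield if found, return it.
  def field name
    f = @fields.find { |fld| fld.name == name.to_s }
    return nil if f.nil?
    yield f if block_given?
    f
  end
end

def lookup_demo_schema
  LookupSchema.new [LookupField.new("first_name", "REQUIRED")]
end

schema = lookup_demo_schema
puts schema.field(:first_name).mode            # a Symbol works via name.to_s
puts schema.field("missing").inspect           # no such field
schema.field("first_name") { |f| puts f.name } # block runs on a match
schema.field("missing") { |f| puts f.name }    # block is skipped
```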

#fields ⇒ Array<Field>

The fields of the table schema.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

schema = table.schema

schema.fields.each do |field|
  puts field.name
end

Returns:

  • (Array<Field>)

    An array of field objects.



# File 'lib/google/cloud/bigquery/schema.rb', line 127

def fields
  if frozen?
    Array(@gapi.fields).map { |f| Field.from_gapi(f).freeze }.freeze
  else
    Array(@gapi.fields).map { |f| Field.from_gapi f }
  end
end
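
Two consequences follow from the listing above: each call builds a new array, and a frozen schema hands out frozen fields in a frozen array. The sketch below shows both on hypothetical stand-ins (FrozenDemoField and FrozenDemoSchema are not library classes, and a list of names replaces @gapi.fields).

```ruby
# Illustrative stand-ins; a list of names replaces @gapi.fields.
FrozenDemoField = Struct.new :name

class FrozenDemoSchema
  def initialize names
    @names = names
  end

  # Mirrors the branch above: frozen schemas return frozen results.
  def fields
    if frozen?
      @names.map { |n| FrozenDemoField.new(n).freeze }.freeze
    else
      @names.map { |n| FrozenDemoField.new n }
    end
  end
end

live_schema = FrozenDemoSchema.new ["first_name"]
live_schema.fields << FrozenDemoField.new("extra") # changes only the copy
puts live_schema.fields.size

frozen_schema = FrozenDemoSchema.new(["first_name"]).freeze
begin
  frozen_schema.fields << FrozenDemoField.new("extra")
rescue FrozenError => e
  puts "cannot modify: #{e.class}"
end
```

Because the array is rebuilt on every call, appending to the returned array never adds a field to the schema; the field-adding methods are the way to do that.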

#float(name, description: nil, mode: :nullable) ⇒ Object

Adds a floating-point number field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 309

def float name, description: nil, mode: :nullable
  add_field name, :float, description: description, mode: mode
end

#headers ⇒ Array<Symbol>

The names of the fields as symbols.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

schema = table.schema

schema.headers.each do |header|
  puts header
end

Returns:

  • (Array<Symbol>)

    An array of column names.



# File 'lib/google/cloud/bigquery/schema.rb', line 153

def headers
  fields.map(&:name).map(&:to_sym)
end
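
The conversion above is a plain map from field names to symbols. A minimal sketch on a hypothetical stand-in (HeaderDemoField and demo_headers are not library names) shows the shape of the result.

```ruby
# Stand-in for a field that only has a name; illustrative, not library code.
HeaderDemoField = Struct.new :name

# Same chain as the method above, applied to any list of named objects.
def demo_headers fields
  fields.map(&:name).map(&:to_sym)
end

columns = [HeaderDemoField.new("first_name"), HeaderDemoField.new("cities_lived")]
p demo_headers(columns) #=> [:first_name, :cities_lived]
```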

#integer(name, description: nil, mode: :nullable) ⇒ Object

Adds an integer field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 293

def integer name, description: nil, mode: :nullable
  add_field name, :integer, description: description, mode: mode
end

#load(source) ⇒ Schema

Load the schema from a JSON file.

The JSON schema file uses the same format as the bq CLI: an array of JSON objects, each containing the following properties:

  • name: The column name
  • type: The column's data type
  • description: (Optional) The column's description
  • mode: (Optional) The column's mode (if unspecified, mode defaults to NULLABLE)
  • fields: If type is RECORD, an array of objects defining child fields with these properties

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
table.schema do |schema|
  schema.load File.read("path/to/schema.json")
end

Parameters:

  • source (IO, String, Array<Hash>)

    An IO containing the JSON schema, a String containing the JSON schema, or an Array of Hashes containing the schema details.

Returns:

  • (Schema)

    The schema so that commands are chainable.



# File 'lib/google/cloud/bigquery/schema.rb', line 218

def load source
  if source.respond_to?(:rewind) && source.respond_to?(:read)
    source.rewind
    schema_json = String source.read
  elsif source.is_a? Array
    schema_json = JSON.dump source
  else
    schema_json = String source
  end

  schema_json = %({"fields":#{schema_json}})

  @gapi = Google::Apis::BigqueryV2::TableSchema.from_json schema_json

  self
end
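
The three accepted source types all end up as the same wrapped JSON text before it is parsed. The sketch below isolates that normalization step in a hypothetical helper (wrap_schema_source and sample_fields are not library names) so it can be run without the BigQuery client.

```ruby
require "json"
require "stringio"

# Illustrative stand-in for the input handling in Schema#load: returns the
# JSON text that the method above hands to TableSchema.from_json.
def wrap_schema_source source
  if source.respond_to?(:rewind) && source.respond_to?(:read)
    source.rewind
    schema_json = String source.read
  elsif source.is_a? Array
    schema_json = JSON.dump source
  else
    schema_json = String source
  end
  # The bq-style file is a bare array, so it is wrapped in a "fields" object.
  %({"fields":#{schema_json}})
end

def sample_fields
  [{ "name" => "first_name", "type" => "STRING" }]
end

from_array  = wrap_schema_source sample_fields
from_string = wrap_schema_source JSON.dump(sample_fields)
from_io     = wrap_schema_source StringIO.new(JSON.dump(sample_fields))

puts from_array
puts from_array == from_string && from_string == from_io
```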

#numeric(name, description: nil, mode: :nullable) ⇒ Object

Adds a numeric field to the schema. NUMERIC is a fixed-precision decimal type with 38 digits of precision, 9 of which follow the decimal point.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 327

def numeric name, description: nil, mode: :nullable
  add_field name, :numeric, description: description, mode: mode
end
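
The precision stated above (38 digits, 9 after the decimal point) leaves at most 29 digits before the point. As a rough sketch of what that means for input values, the hypothetical helper below (fits_numeric? and NUMERIC_LITERAL are not part of the library) checks a plain decimal literal against those limits. It assumes the literal has no exponent and no redundant leading zeros.

```ruby
# Hypothetical helper: at most 29 digits before the decimal point and at
# most 9 after it, matching 38 digits of precision with a scale of 9.
NUMERIC_LITERAL = /\A-?\d{1,29}(\.\d{1,9})?\z/

def fits_numeric? literal
  NUMERIC_LITERAL.match? literal.to_s
end

puts fits_numeric?("123.456789012") # 9 fractional digits
puts fits_numeric?("1.0123456789")  # 10 fractional digits
puts fits_numeric?("9" * 30)        # 30 integer digits
```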

#record(name, description: nil, mode: nil) {|field| ... } ⇒ Object

Adds a record field to the schema. A block describing the nested fields of the record must be passed. For more information about nested and repeated records, see Preparing Data for BigQuery.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The signature defaults to nil; an unspecified mode is treated as :nullable.

Yields:

  • (field)

    a block for setting the nested record's schema

Yield Parameters:

  • field (Field)

    the object accepting the nested schema

Raises:

  • (ArgumentError)

    if no block is given


# File 'lib/google/cloud/bigquery/schema.rb', line 459

def record name, description: nil, mode: nil
  # TODO: do we need to raise if no block was given?
  raise ArgumentError, "a block is required" unless block_given?

  nested_field = add_field name, :record, description: description,
                                          mode: mode
  yield nested_field
  nested_field
end
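
The pattern above (add the record, yield it so the caller can declare children, return it) can be sketched without the BigQuery client. SketchField below is a hypothetical stand-in, not the library's Field class, and it implements only string, integer, and record.

```ruby
# Illustrative stand-in for the nested builder; not the library's code.
class SketchField
  attr_reader :name, :type, :mode, :fields

  def initialize name, type, mode
    @name = name
    @type = type
    @mode = mode
    @fields = []
  end

  def string name, mode: :nullable
    add_field name, :string, mode
  end

  def integer name, mode: :nullable
    add_field name, :integer, mode
  end

  # A record needs a block, because the block is where children are declared.
  def record name, mode: nil
    raise ArgumentError, "a block is required" unless block_given?
    nested = add_field name, :record, mode
    yield nested
    nested
  end

  private

  def add_field name, type, mode
    field = SketchField.new name, type, mode
    @fields << field
    field
  end
end

# Builds the same shape as the cities_lived example above.
def sketch_schema
  root = SketchField.new "root", :record, nil
  root.string "first_name", mode: :required
  root.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
  root
end

p sketch_schema.fields.map(&:name)
```

Calling record without a block raises ArgumentError before anything is added, which matches the order of the checks in the listing above.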

#string(name, description: nil, mode: :nullable) ⇒ Object

Adds a string field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 277

def string name, description: nil, mode: :nullable
  add_field name, :string, description: description, mode: mode
end

#time(name, description: nil, mode: :nullable) ⇒ Object

Adds a time field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 390

def time name, description: nil, mode: :nullable
  add_field name, :time, description: description, mode: mode
end

#timestamp(name, description: nil, mode: :nullable) ⇒ Object

Adds a timestamp field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String)

    A description of the field.

  • mode (Symbol)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/schema.rb', line 374

def timestamp name, description: nil, mode: :nullable
  add_field name, :timestamp, description: description, mode: mode
end