Class: Google::Cloud::Bigquery::LoadJob

Inherits:
Job
  • Object
Defined in:
lib/google/cloud/bigquery/load_job.rb

Overview

LoadJob

A Job subclass representing a load operation that may be performed on a Table. A LoadJob instance is created when you call Table#load_job or, as in the example below, Dataset#load_job.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end

load_job.wait_until_done!
load_job.done? #=> true
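
Once the job is done, its outcome can be read through the inherited #failed? and #error readers and the #output_rows and #output_bytes readers documented below. The following is a minimal sketch rather than library code: FakeLoadJob and summarize_load are hypothetical names, and the Struct only stands in for a real LoadJob so the snippet runs without credentials.

```ruby
# Hypothetical stand-in for a finished LoadJob; only the readers used
# below are modeled. A real job comes from the example above.
FakeLoadJob = Struct.new(:failed, :error, :output_rows, :output_bytes) do
  def failed?
    failed
  end
end

# Builds a one-line summary from any object that responds to
# #failed?, #error, #output_rows, and #output_bytes.
def summarize_load(job)
  return "load failed: #{job.error.inspect}" if job.failed?
  "loaded #{job.output_rows} rows (#{job.output_bytes} bytes)"
end

puts summarize_load(FakeLoadJob.new(false, nil, 3, 120))
```

Because the helper relies only on duck typing, it should accept a real load_job in place of the stand-in.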

Instance Method Summary

Methods inherited from Job

#cancel, #configuration, #created_at, #done?, #ended_at, #error, #errors, #failed?, #job_id, #labels, #pending?, #project_id, #reload!, #rerun!, #running?, #started_at, #state, #statistics, #status, #user_email, #wait_until_done!

Instance Method Details

#allow_jagged_rows? ⇒ Boolean

Checks if the load operation accepts rows that are missing trailing optional columns. If true, the missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an error is returned. The default value is false. This option applies only to CSV and is ignored for other formats.

Returns:

  • (Boolean)

    true when jagged rows are allowed, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 236

def allow_jagged_rows?
  val = @gapi.configuration.load.allow_jagged_rows
  val = false if val.nil?
  val
end
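
To make the jagged-row behavior concrete, here is a rough sketch of the rule described above. It is not how BigQuery implements the check: normalize_row is a hypothetical helper, it does not distinguish optional from required columns, and it treats rows with too many fields as bad records for simplicity.

```ruby
# Hypothetical helper: pads a row that is missing trailing columns with nil
# when jagged rows are allowed, and returns :bad_record otherwise.
def normalize_row(fields, column_count, allow_jagged_rows: false)
  return fields if fields.size == column_count
  return :bad_record if fields.size > column_count || !allow_jagged_rows
  fields + Array.new(column_count - fields.size)
end

row = "Alice,Seattle".split(",")
p normalize_row(row, 3, allow_jagged_rows: true) # padded with a trailing nil
p normalize_row(row, 3)                          # counted as a bad record
```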

#autodetect? ⇒ Boolean

Checks if BigQuery should automatically infer the options and schema for CSV and JSON sources. The default is false.

Returns:

  • (Boolean)

    true when autodetect is enabled, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 185

def autodetect?
  val = @gapi.configuration.load.autodetect
  val = false if val.nil?
  val
end

#backup? ⇒ Boolean

Checks if the source data is a Google Cloud Datastore backup.

Returns:

  • (Boolean)

    true when the source format is DATASTORE_BACKUP, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 221

def backup?
  val = @gapi.configuration.load.source_format
  val == "DATASTORE_BACKUP"
end

#csv? ⇒ Boolean

Checks if the format of the source data is CSV. CSV is the default format, so this returns true when no source format has been set.

Returns:

  • (Boolean)

    true when the source format is CSV, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 209

def csv?
  val = @gapi.configuration.load.source_format
  return true if val.nil?
  val == "CSV"
end
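
The three format predicates in this class (#csv?, #json?, and #backup?) all read the same source_format value, and only #csv? treats a missing value as a match. The sketch below restates that relationship in one place; FakeLoadConfig and format_name are hypothetical names used for illustration only.

```ruby
# Hypothetical stand-in for the load configuration; only source_format
# is modeled.
FakeLoadConfig = Struct.new(:source_format)

# Mirrors the predicate logic shown in this section: a missing format
# counts as CSV, while the other formats must be named explicitly.
def format_name(config)
  fmt = config.source_format
  return :csv if fmt.nil? || fmt == "CSV"
  return :json if fmt == "NEWLINE_DELIMITED_JSON"
  return :backup if fmt == "DATASTORE_BACKUP"
  :other
end

p format_name(FakeLoadConfig.new(nil))
p format_name(FakeLoadConfig.new("NEWLINE_DELIMITED_JSON"))
```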

#delimiter ⇒ String

The delimiter used between fields in the source data. The default is a comma (,).

Returns:

  • (String)

    A string containing the character, such as ",".



# File 'lib/google/cloud/bigquery/load_job.rb', line 80

def delimiter
  @gapi.configuration.load.field_delimiter || ","
end

#destination ⇒ Table

The table into which the operation loads data. This is the table on which Table#load_job was invoked.

Returns:

  • (Table)

    A table instance.



# File 'lib/google/cloud/bigquery/load_job.rb', line 66

def destination
  table = @gapi.configuration.load.destination_table
  return nil unless table
  retrieve_table table.project_id,
                 table.dataset_id,
                 table.table_id
end

#ignore_unknown_values? ⇒ Boolean

Checks if the load operation allows extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned. The default is false.

Returns:

  • (Boolean)

    true when unknown values are ignored, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 252

def ignore_unknown_values?
  val = @gapi.configuration.load.ignore_unknown_values
  val = false if val.nil?
  val
end
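
As an illustration of the rule above, the following sketch applies it to a single record held as a Hash. It is an assumption-laden approximation, not BigQuery's implementation: filter_record is a hypothetical helper, and the schema is reduced to a list of field names.

```ruby
# Hypothetical helper: keeps only the fields named in the schema when
# unknown values are ignored; otherwise flags the record as bad.
def filter_record(record, schema_fields, ignore_unknown_values: false)
  extra = record.keys - schema_fields
  return record if extra.empty?
  return :bad_record unless ignore_unknown_values
  record.slice(*schema_fields)
end

record = { "first_name" => "Alice", "age" => 30 }
p filter_record(record, ["first_name"], ignore_unknown_values: true)
p filter_record(record, ["first_name"])
```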

#input_file_bytes ⇒ Integer

The number of bytes of source data in the load job.

Returns:

  • (Integer)

    The number of bytes.



# File 'lib/google/cloud/bigquery/load_job.rb', line 288

def input_file_bytes
  Integer @gapi.statistics.load.input_file_bytes
rescue
  nil
end
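
The statistics readers in this class share one pattern: convert with Kernel#Integer and fall back to nil if anything raises, which is what happens while the load statistics are still missing. The sketch below reproduces that pattern in isolation. FakeLoadStats, FakeStatistics, and read_input_file_bytes are hypothetical names, and the assumption that the value may arrive as a numeric string is mine rather than something stated in the source.

```ruby
# Hypothetical stand-ins for the statistics objects.
FakeLoadStats = Struct.new(:input_file_bytes)
FakeStatistics = Struct.new(:load)

# Same shape as the readers in this section: convert the value with
# Kernel#Integer and return nil when the conversion or lookup raises.
def read_input_file_bytes(statistics)
  Integer statistics.load.input_file_bytes
rescue
  nil
end

p read_input_file_bytes(FakeStatistics.new(FakeLoadStats.new("2048"))) # => 2048
p read_input_file_bytes(FakeStatistics.new(nil))                       # => nil
```

In practice this means callers should expect nil, not 0, until the job has reported statistics.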

#input_files ⇒ Integer

The number of source data files in the load job.

Returns:

  • (Integer)

    The number of source files.



# File 'lib/google/cloud/bigquery/load_job.rb', line 277

def input_files
  Integer @gapi.statistics.load.input_files
rescue
  nil
end

#iso8859_1? ⇒ Boolean

Checks if the character encoding of the data is ISO-8859-1.

Returns:

  • (Boolean)

    true when the character encoding is ISO-8859-1, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 115

def iso8859_1?
  val = @gapi.configuration.load.encoding
  val == "ISO-8859-1"
end
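
For readers unsure what the encoding setting means for the bytes in a source file, here is a small Ruby sketch of reading ISO-8859-1 bytes as text. It illustrates the encoding difference only and makes no claim about how BigQuery decodes data internally; to_utf8 is a hypothetical helper.

```ruby
# Hypothetical helper: interprets raw bytes in the declared encoding and
# returns a UTF-8 string. A nil encoding falls back to UTF-8, matching the
# default reported by #utf8?.
def to_utf8(bytes, encoding = nil)
  bytes.dup.force_encoding(encoding || "UTF-8").encode("UTF-8")
end

latin1 = "caf\xE9".b # "café" stored as ISO-8859-1 bytes
p to_utf8(latin1, "ISO-8859-1").bytes # => [99, 97, 102, 195, 169]
```

The single byte 0xE9 becomes the two-byte UTF-8 sequence 0xC3 0xA9, which is why declaring the wrong encoding garbles accented characters.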

#json? ⇒ Boolean

Checks if the format of the source data is newline-delimited JSON. The default is false.

Returns:

  • (Boolean)

    true when the source format is NEWLINE_DELIMITED_JSON, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 198

def json?
  val = @gapi.configuration.load.source_format
  val == "NEWLINE_DELIMITED_JSON"
end

#max_bad_records ⇒ Integer

The maximum number of bad records that the load operation can ignore. If the number of bad records exceeds this value, an error is returned. The default value is 0, which requires that all records be valid.

Returns:

  • (Integer)

    The maximum number of bad records.



# File 'lib/google/cloud/bigquery/load_job.rb', line 142

def max_bad_records
  val = @gapi.configuration.load.max_bad_records
  val = 0 if val.nil?
  val
end
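
The threshold described above can be sketched as a simple comparison. This is only an illustration of the rule, with a hypothetical check_bad_records! helper and an arbitrary choice of ArgumentError; the actual error BigQuery reports is not modeled here.

```ruby
# Hypothetical check: raises once the count of bad records exceeds the
# limit. The default limit of 0 requires every record to be valid.
def check_bad_records!(bad_count, max_bad_records = 0)
  return if bad_count <= max_bad_records
  raise ArgumentError,
        "#{bad_count} bad records exceed the limit of #{max_bad_records}"
end

check_bad_records!(2, 2) # within the limit, nothing happens

begin
  check_bad_records!(1)  # default limit of 0
rescue ArgumentError => e
  puts e.message
end
```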

#null_marker ⇒ String

Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error when an empty string is present in a column of any data type other than STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.

Returns:

  • (String)

    A string representing a null value in a CSV file.



# File 'lib/google/cloud/bigquery/load_job.rb', line 159

def null_marker
  val = @gapi.configuration.load.null_marker
  val = "" if val.nil?
  val
end
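
A short sketch may help show what the marker does to parsed fields. apply_null_marker is a hypothetical helper; it covers only the substitution itself and does not model the error raised for empty strings in non-string columns when a custom marker is set.

```ruby
# Hypothetical helper: maps fields equal to the null marker to nil and
# leaves all other fields unchanged.
def apply_null_marker(fields, null_marker = "")
  fields.map { |field| field == null_marker ? nil : field }
end

p apply_null_marker(["Alice", "\\N", ""], "\\N") # custom marker \N
p apply_null_marker(["Alice", ""])               # default marker ""
```

With the custom marker the empty string survives as a value, while with the default marker the empty string itself becomes null.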

#output_bytes ⇒ Integer

The number of bytes that have been loaded into the table. While an import job is in the running state, this value may change.

Returns:

  • (Integer)

    The number of bytes that have been loaded.



# File 'lib/google/cloud/bigquery/load_job.rb', line 312

def output_bytes
  Integer @gapi.statistics.load.output_bytes
rescue
  nil
end

#output_rows ⇒ Integer

The number of rows that have been loaded into the table. While an import job is in the running state, this value may change.

Returns:

  • (Integer)

    The number of rows that have been loaded.



# File 'lib/google/cloud/bigquery/load_job.rb', line 300

def output_rows
  Integer @gapi.statistics.load.output_rows
rescue
  nil
end

#quote ⇒ String

The value that is used to quote data sections in a CSV file. The default value is a double-quote ("). If your data does not contain quoted sections, the value should be an empty string. If your data contains quoted newline characters, #quoted_newlines? should return true.

Returns:

  • (String)

    A string containing the character, such as "\"".



# File 'lib/google/cloud/bigquery/load_job.rb', line 129

def quote
  val = @gapi.configuration.load.quote
  val = "\"" if val.nil?
  val
end

#quoted_newlines? ⇒ Boolean

Checks if quoted data sections may contain newline characters in a CSV file. The default is false.

Returns:

  • (Boolean)

    true when quoted newlines are allowed, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 172

def quoted_newlines?
  val = @gapi.configuration.load.allow_quoted_newlines
  val = false if val.nil?
  val
end

#schema ⇒ Schema?

The schema for the destination table. The schema can be omitted if the destination table already exists, or if you're loading data from Google Cloud Datastore.

The returned object is frozen and changes are not allowed. Use Table#schema to update the schema.

Returns:

  • (Schema, nil)

    A schema object, or nil.



# File 'lib/google/cloud/bigquery/load_job.rb', line 268

def schema
  Schema.from_gapi(@gapi.configuration.load.schema).freeze
end
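
The effect of returning a frozen object can be seen with plain Ruby. The sketch below uses a hypothetical FakeSchema class, not the library's Schema, and shows only the behavior of Object#freeze, which is shallow; whether the real schema's nested fields are also protected is not something this example establishes.

```ruby
# Hypothetical stand-in for a schema object with one writable attribute.
class FakeSchema
  attr_accessor :fields

  def initialize(fields)
    @fields = fields
  end
end

schema = FakeSchema.new(["first_name"]).freeze

begin
  schema.fields = [] # assignment on a frozen object raises
rescue FrozenError => e
  puts "cannot modify: #{e.class}"
end
```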

#skip_leading_rows ⇒ Integer

The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.

Returns:

  • (Integer)

    The number of header rows at the top of a CSV file to skip.



# File 'lib/google/cloud/bigquery/load_job.rb', line 92

def skip_leading_rows
  @gapi.configuration.load.skip_leading_rows || 0
end
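
In Ruby terms, skipping leading rows amounts to dropping the first n lines before the data is read. A minimal sketch, with data_rows as a hypothetical helper and the file contents held in an array:

```ruby
# Hypothetical helper: returns the lines that remain after skipping the
# given number of header rows. The default of 0 keeps every line.
def data_rows(lines, skip_leading_rows = 0)
  lines.drop(skip_leading_rows)
end

lines = ["first_name,city", "Alice,Seattle", "Bob,Austin"]
p data_rows(lines, 1) # header row removed
p data_rows(lines)    # nothing skipped
```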

#sources ⇒ Object

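The URI or URIs representing the Google Cloud Storage files from which the operation loads data. The value is wrapped with Array, so it is returned as an array even when only one URI is configured.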



# File 'lib/google/cloud/bigquery/load_job.rb', line 56

def sources
  Array @gapi.configuration.load.source_uris
end
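
The method body relies on Kernel#Array, which normalizes nil, a single value, or an existing array into an array. The sketch below shows those three cases; normalize_sources is a hypothetical wrapper and the bucket paths are illustrative.

```ruby
# Hypothetical wrapper around Kernel#Array, the same conversion the
# method above applies to the configured source URIs.
def normalize_sources(value)
  Array(value)
end

p normalize_sources(nil)                            # no URIs configured
p normalize_sources("gs://my-bucket/file-name.csv") # single URI
p normalize_sources(["gs://my-bucket/a.csv", "gs://my-bucket/b.csv"])
```

Callers can therefore iterate over the result without checking for nil first.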

#utf8? ⇒ Boolean

Checks if the character encoding of the data is UTF-8. This is the default.

Returns:

  • (Boolean)

    true when the character encoding is UTF-8, false otherwise.



# File 'lib/google/cloud/bigquery/load_job.rb', line 103

def utf8?
  val = @gapi.configuration.load.encoding
  return true if val.nil?
  val == "UTF-8"
end