Creating a New Table and Setting the Expiration Date in BigQuery Using Python
Solution 1:
If you want to set an expiration time for your table, this might do the trick:
import time
import uuid
from datetime import datetime, timedelta

from google.cloud import bigquery
from google.cloud.bigquery.schema import SchemaField


def wait_for_job(job):
    # Not shown in the original answer; a minimal polling helper is assumed here.
    while True:
        job.reload()
        if job.state == 'DONE':
            if job.error_result:
                raise RuntimeError(job.errors)
            return
        time.sleep(1)


def load_data_from_gcs(dataset, table_name, table_schema, source,
                       source_format, expiration_time):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset)
    table = dataset.table(table_name)
    table.schema = table_schema
    table.expires = expiration_time  # must be a datetime
    # Ask the API whether the table exists; `created` is empty on a fresh local object.
    if not table.exists():
        table.create()
    job_name = str(uuid.uuid4())
    job = bigquery_client.load_table_from_storage(job_name, table, source)
    job.source_format = source_format
    job.begin()
    wait_for_job(job)


dataset = 'FirebaseArchive'
table_name = 'test12'
gcs_source = 'gs://dataworks-356fa-backups/firetobq.json'
source_format = 'NEWLINE_DELIMITED_JSON'
# field1, field2, ... are placeholders for your own column definitions.
table_schema = [SchemaField(field1), SchemaField(field2), (...)]
expiration_time = datetime.now() + timedelta(seconds=604800)
load_data_from_gcs(dataset, table_name, table_schema, gcs_source,
                   source_format, expiration_time)
Notice that the only difference is the line that sets table.expires = expiration_time, whose value must be of type datetime (here defined as expiration_time = datetime.now() + timedelta(seconds=604800), i.e. seven days from now).
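To make the arithmetic explicit, 604800 seconds is exactly seven days. The sketch below builds the same kind of value with a timezone-aware datetime. Using UTC is an assumption on my part (the original answer uses a naive datetime.now()), and the helper name expiration_after is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expiration_after(days, now=None):
    """Return a timezone-aware datetime `days` days after `now`.

    `now` defaults to the current time in UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now + timedelta(days=days)


# 604800 seconds is the same interval as 7 days.
assert timedelta(seconds=604800) == timedelta(days=7)

# Value that could be assigned to table.expires.
expiration_time = expiration_after(7)
```

Passing an explicit now makes the helper easy to check against a fixed date.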
Not sure if it's possible to use schema auto-detection using the Python API, but you can still send this information using SchemaField objects. For instance, if your table has two fields, user_id and job_id, both being INTEGER, then the schema would be:
table_schema = [SchemaField('user_id', field_type='INT64'),
                SchemaField('job_id', field_type='INT64')]
You can find more information on how schemas work in BigQuery in the BigQuery documentation.
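The calls used in this answer (table.create(), load_table_from_storage) come from an older release of google-cloud-bigquery. As a rough sketch of the same idea with the newer client interface, the table and its expiration could be set up as below. This assumes a fully qualified table ID of the form project.dataset.table and default credentials; EXAMPLE_SCHEMA and create_table_with_expiration are hypothetical names, not part of the library.

```python
from datetime import datetime, timedelta, timezone

# Column names and types from the example above, as (name, type) pairs.
EXAMPLE_SCHEMA = [("user_id", "INT64"), ("job_id", "INT64")]


def create_table_with_expiration(table_id, schema_spec, days):
    """Create `table_id` with the given columns, expiring `days` days from now."""
    # Imported here so the module still loads where the library is not installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    schema = [
        bigquery.SchemaField(name, field_type)
        for name, field_type in schema_spec
    ]
    table = bigquery.Table(table_id, schema=schema)
    table.expires = datetime.now(timezone.utc) + timedelta(days=days)
    # exists_ok=True returns the existing table unchanged if it is already there.
    return client.create_table(table, exists_ok=True)
```

A call would look like create_table_with_expiration("my-project.FirebaseArchive.test12", EXAMPLE_SCHEMA, 7), where my-project is a placeholder for your own project ID.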
[EDIT]:
Just saw your other question. If you want to truncate the table and then write data to it, you can set the write disposition in your load_data_from_gcs function just before starting the job:
job.write_disposition = 'WRITE_TRUNCATE'
job.begin()
Note that WRITE_TRUNCATE is a write disposition, not a create disposition. With it, the load job overwrites the existing table's data with the data from your storage file. You won't have to define a schema for that, as the table already has one (so this might be a much easier solution for you).
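For comparison, here is a hedged sketch of the same truncate-and-load step with the newer client interface, where the disposition is passed through a LoadJobConfig. It assumes newline-delimited JSON input and an existing destination table; write_disposition_for and load_json_from_gcs are hypothetical helper names.

```python
def write_disposition_for(overwrite):
    """Map a boolean onto a BigQuery write disposition name."""
    return "WRITE_TRUNCATE" if overwrite else "WRITE_APPEND"


def load_json_from_gcs(table_id, gcs_uri, overwrite=True):
    """Load newline-delimited JSON from Cloud Storage into `table_id`."""
    # Imported here so the module still loads where the library is not installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        write_disposition=write_disposition_for(overwrite),
    )
    job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    # result() blocks until the load job finishes and raises if it failed.
    return job.result()
```

With overwrite=False the same function appends to the table instead of replacing its contents.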