HTTP to Google Cloud Storage Transfer Operator
Google Cloud Storage is a Google service for storing large amounts of data from various applications. HTTP (Hypertext Transfer Protocol) is an application-layer protocol designed to transfer information between networked devices; it runs on top of the other layers of the network protocol stack.
Prerequisite Tasks
To use these operators, you must do a few things:
Select or create a Cloud Platform project using the Cloud Console.
Enable billing for your project, as described in the Google Cloud documentation.
Enable the API, as described in the Cloud Console documentation.
Install API libraries via pip.
    pip install 'apache-airflow[google]'
Detailed information is available in the Installation guide.
Operator
Transferring files from an HTTP endpoint to Google Cloud Storage is performed with the HttpToGCSOperator operator.
Use Jinja templating with http_conn_id, endpoint, data, headers, gcp_conn_id, bucket_name, and object_name to define values dynamically.
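As a minimal sketch of templating, the keyword arguments below embed the built-in Airflow `{{ ds }}` variable (the run's logical date as YYYY-MM-DD) in `endpoint` and `object_name`, so each run would fetch and store a date-specific file. The connection ID, bucket name, paths, and the helper `build_templated_task` are illustrative assumptions, not values from this guide. The import path is the one the Google provider is assumed to use, and the import is deferred into the helper only so that the argument dictionary can be inspected without Airflow installed.

```python
# Hypothetical values for illustration; replace them with your own.
TEMPLATED_KWARGS = {
    "task_id": "http_to_gcs_templated",
    "http_conn_id": "http_default",         # assumed HTTP connection ID
    "endpoint": "/reports/{{ ds }}.csv",     # Jinja expression, rendered per run
    "bucket_name": "my-example-bucket",      # assumed bucket name
    "object_name": "reports/{{ ds }}.csv",   # Jinja expression, rendered per run
}


def build_templated_task():
    """Create the operator; Airflow renders the templated fields at run time."""
    # Imported here so this module can be loaded without Airflow installed.
    from airflow.providers.google.cloud.transfers.http_to_gcs import HttpToGCSOperator

    return HttpToGCSOperator(**TEMPLATED_KWARGS)
```

In a DAG file, the helper would typically be called inside a `with DAG(...)` block so that the task is attached to that DAG; the strings stay unrendered until the task instance runs.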
Copying single files
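The following operator copies a single file.
from airflow.providers.google.cloud.transfers.http_to_gcs import HttpToGCSOperator

# conn_id_name and BUCKET_NAME are assumed to be defined earlier in the DAG file.
http_to_gcs_task = HttpToGCSOperator(
    task_id="http_to_gcs_task",
    http_conn_id=conn_id_name,  # ID of the Airflow HTTP connection to read from
    endpoint="/test_file",      # path requested on the HTTP connection's host
    bucket_name=BUCKET_NAME,    # destination Cloud Storage bucket
    object_name="test_file",    # name of the object created in the bucket
)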
Reference
For more information, see