
Getting data from a GCS bucket

To get data from a GCS bucket, you need to execute connect get gcs with the following mandatory parameter:

  • --bucket followed by the GCS bucket name

Depending on the type of authentication, you either need to use --auth-json for key-based authentication or use the service account assigned to the instance (if the server is running on GCP).

To use a JSON key for authentication, you need to provide:

  • --auth-json followed by the path to the file containing the auth JSON

To authenticate with Application Default Credentials (ADC), set the GOOGLE_APPLICATION_CREDENTIALS environment variable so that it points to the key file.

Terminal window
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key-file.json"

You can then omit the key file, as it will be obtained from the environment variable.

For user credentials, ADC can be set up by running the following command:

Terminal window
$ gcloud auth application-default login

Once authenticated, you can connect without specifying the JSON key.
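
For example, once ADC is available the --auth-json option can simply be omitted; the bucket and file name below are just the ones used in the examples on this page:

Terminal window
$ connect get gcs --file file1.zip --bucket my-gcs-bucket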

To get a single file from a GCS bucket, use

  • --file followed by the file name to be downloaded

The following command will get the file file1.zip from the GCS bucket my-gcs-bucket using a JSON key for authentication.

Terminal window
$ connect get gcs --file file1.zip --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 11:45:28 INFO Getting files from gcs://my-gcs-bucket/
2025/06/05 11:45:28 INFO Files to get: count=1
[file1.zip] 10.00 MiB / 10.00 MiB done

If you’d like to get multiple files from a GCS bucket, you need to specify either a file mask or a regular expression for the files.

  • --file-mask a file mask
  • --file-regex regular expression

This will download all files that match the given mask or pattern. You can also provide an optional source folder that contains the files to be downloaded:

  • -s, --src-folder folder containing the files to download

If you don’t provide the source folder, files will be downloaded from the “root folder” of the GCS bucket.
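
If you prefer a regular expression over a mask, the command could look like the following; the pattern is only an illustration and matches names such as file1.zip or file2.zip in the root of the bucket:

Terminal window
$ connect get gcs --file-regex "^file[0-9]+\.zip$" --bucket my-gcs-bucket --auth-json /path/to/key-file.json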

The following command will get all .zip files from the data folder of the GCS bucket my-gcs-bucket using a JSON key for authentication.

Terminal window
$ connect get gcs --file-mask "*.zip" -s data --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:12:14 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:12:15 INFO Files to get: count=3
[file1.zip] 10.00 MiB / 10.00 MiB done
[file3.zip] 10.00 MiB / 10.00 MiB done
[file2.zip] 10.00 MiB / 10.00 MiB done

You can also provide an optional destination folder to which the files will be downloaded:

  • -d, --dst-folder target folder where to download files

If you don’t provide the destination folder, files will be downloaded to the current folder.

In the example below, all .zip files from the data folder of the GCS bucket my-gcs-bucket will be downloaded to the local folder using a JSON key for authentication.

Terminal window
$ connect get gcs --file-mask "*.zip" -s data -d local --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:13:12 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:13:12 INFO Files to get: count=3
[file1.zip] 10.00 MiB / 10.00 MiB done
[file3.zip] 10.00 MiB / 10.00 MiB done
[file2.zip] 10.00 MiB / 10.00 MiB done

Getting files from a GCS bucket in batch mode


When using the application in batch or non-interactive mode, such as in a script or scheduler, you should disable the progress bar by using

  • --batch to get files in batch mode

The following command will get all .zip files from the data folder of the GCS bucket my-gcs-bucket and place them in the local folder, using batch mode and a JSON key for authentication.

Terminal window
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:14:37 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:14:37 INFO Files to get: count=3
2025/06/05 12:14:37 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:14:39 INFO File transferred successfully bytesTransferred=10485760 duration=1s821ms transferRate=5.49MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:14:39 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:14:41 INFO File transferred successfully bytesTransferred=10485760 duration=1s830ms transferRate=5.46MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:14:41 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:14:43 INFO File transferred successfully bytesTransferred=10485760 duration=1s839ms transferRate=5.44MB/s file=data/file2.zip targetFolder=local
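
In batch mode, the notification options listed in the command reference at the end of this page (--from-mail, --to-mail-success, --to-mail-failure) can be added to the same command. The combination below is shown only as an illustration, and the addresses are placeholders:

Terminal window
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --bucket my-gcs-bucket --auth-json /path/to/key-file.json --from-mail connect@example.com --to-mail-failure ops@example.com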

Getting files from a GCS bucket using scheduler


When running the application from a scheduler, it’s recommended to ensure only one instance operates on a specific folder. It’s possible to configure the application to create a flag file at startup and delete it upon completion. If another instance tries to start while one is already running, it will detect the flag file and exit. This approach ensures that only one instance of the application runs at any given time. To do that, use

  • --flag path to the flag file

The following command will get all .zip files from the data folder of the GCS bucket my-gcs-bucket and place them in the local folder, using /tmp/flag1 as a flag file and a JSON key for authentication. Note that we are using batch mode, as it’s required for the application to run correctly from a script or scheduler.

Terminal window
$ connect get gcs --flag /tmp/flag1 --batch --file-mask "*.zip" -s data -d local --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:15:21 INFO Using flag file: /tmp/flag1
2025/06/05 12:15:21 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:15:21 INFO Files to get: count=3
2025/06/05 12:15:22 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:15:23 INFO File transferred successfully bytesTransferred=10485760 duration=1s824ms transferRate=5.48MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:15:24 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:15:25 INFO File transferred successfully bytesTransferred=10485760 duration=1s820ms transferRate=5.49MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:15:26 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:15:28 INFO File transferred successfully bytesTransferred=10485760 duration=1s846ms transferRate=5.42MB/s file=data/file2.zip targetFolder=local
2025/06/05 12:15:28 INFO Removing flag file: /tmp/flag1
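
As an illustration, a crontab entry that runs this transfer periodically could look like the following; the schedule, binary location and log path are assumptions and should be adapted to your environment:

Terminal window
# Runs every 15 minutes; the flag file prevents overlapping runs on the same folder
*/15 * * * * /usr/local/bin/connect get gcs --flag /tmp/flag1 --batch --file-mask "*.zip" -s data -d local --bucket my-gcs-bucket --auth-json /path/to/key-file.json >> /var/log/connect-gcs.log 2>&1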

Getting files from a GCS bucket in parallel


To speed up file downloads, you can fetch files simultaneously. The application will then get files using multiple sessions. To specify the number of concurrent GCS sessions, use:

  • --parallel number of GCS sessions

The following command will get all .zip files from the data folder of the GCS bucket my-gcs-bucket using a JSON key for authentication and 3 parallel sessions.

Terminal window
$ connect get gcs --file-mask "*.zip" -s data --batch --parallel 3 --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:16:03 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:16:03 INFO Files to get: count=3
2025/06/05 12:16:04 INFO Transferring file=data/file3.zip targetFolder=
2025/06/05 12:16:04 INFO Transferring file=data/file1.zip targetFolder=
2025/06/05 12:16:04 INFO Transferring file=data/file2.zip targetFolder=
2025/06/05 12:16:05 INFO File transferred successfully bytesTransferred=10485760 duration=1s817ms transferRate=5.50MB/s file=data/file1.zip targetFolder=
2025/06/05 12:16:05 INFO File transferred successfully bytesTransferred=10485760 duration=1s821ms transferRate=5.49MB/s file=data/file3.zip targetFolder=
2025/06/05 12:16:05 INFO File transferred successfully bytesTransferred=10485760 duration=1s625ms transferRate=6.15MB/s file=data/file2.zip targetFolder=

Getting files from a GCS bucket in sequence


If you don’t specify the --parallel option, files will be downloaded using a single GCS session in the same order in which they appeared in the GCS bucket (oldest files first).

Action after file is downloaded from a GCS bucket


After a file is successfully downloaded to the local system, it’s possible to remove it from the source or move it to a different folder on the source system.

  • --delete delete a file after it’s downloaded
  • --move-folder target folder to move the file after it’s downloaded

The following command will get all .zip files from the data folder of the GCS bucket my-gcs-bucket using a JSON key for authentication and 3 parallel sessions, and store them in the local folder. The files will then be moved to the archive folder in the GCS bucket.

Terminal window
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --parallel 3 --bucket my-gcs-bucket --move-folder archive --auth-json /path/to/key-file.json
2025/06/05 12:16:42 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:16:42 INFO Files to get: count=3
2025/06/05 12:16:42 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:16:44 INFO File moved to folder folder=archive file=data/file1.zip
2025/06/05 12:16:44 INFO File transferred successfully bytesTransferred=10485760 duration=1s458ms transferRate=6.86MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:16:44 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:16:47 INFO File moved to folder folder=archive file=data/file3.zip
2025/06/05 12:16:47 INFO File transferred successfully bytesTransferred=10485760 duration=1s848ms transferRate=5.41MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:16:47 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:16:50 INFO File moved to folder folder=archive file=data/file2.zip
2025/06/05 12:16:50 INFO File transferred successfully bytesTransferred=10485760 duration=1s817ms transferRate=5.50MB/s file=data/file2.zip targetFolder=local

The following command will get all .zip files from the data folder of the GCS bucket my-gcs-bucket using a JSON key for authentication and 3 parallel sessions, and then remove the files from the GCS bucket. Files will be downloaded to the local folder.

Terminal window
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --parallel 3 --bucket my-gcs-bucket --delete --auth-json /path/to/key-file.json
2025/06/05 12:18:22 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:18:22 INFO Files to get: count=3
2025/06/05 12:18:23 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:18:25 INFO File deleted file=data/file1.zip
2025/06/05 12:18:25 INFO File transferred successfully bytesTransferred=10485760 duration=1s815ms transferRate=5.51MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:18:25 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:18:26 INFO File deleted file=data/file3.zip
2025/06/05 12:18:26 INFO File transferred successfully bytesTransferred=10485760 duration=1s452ms transferRate=6.89MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:18:27 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:18:29 INFO File deleted file=data/file2.zip
2025/06/05 12:18:29 INFO File transferred successfully bytesTransferred=10485760 duration=1s844ms transferRate=5.42MB/s file=data/file2.zip targetFolder=local
For reference, the full usage and flags of connect get gcs are shown below.

Terminal window
Usage:
  connect get gcs [command] [flags]

Flags:
      --auth-json string          GCS authentication json file
      --bucket string             GCS bucket

Global Flags:
      --batch                     No progress bars
      --delete                    Try to delete files after successful get
  -d, --dst-folder string         Destination folder for retrieved files
      --file string               File name to be get
      --file-mask string          File mask to filter files
      --file-regex string         File regex to filter files
      --flag string               Flag file name
      --from-mail string          From mail used to send notifications (only in batch mode)
      --godebug                   Turns on debug mode
      --help                      Prints help for the command
      --log-format string         Log output format: text|json (default "text")
      --mail-format string        Mail format [text|html] (only in batch mode) (default "plain")
      --move-folder string        Folder to move files after successful get
      --no-color                  Do not use colors in logs
      --parallel uint             Number of sessions used to get files (default 1)
      --quiet                     Makes no output
  -s, --src-folder string         Folder to look for files
      --to-mail-failure strings   Email list to send failure notification (only in batch mode)
      --to-mail-success strings   Email list to send success notification (only in batch mode)

Please check here for information on the possible errors.