Getting data from a GCS bucket
Connecting to a GCS bucket
To get data from a GCS bucket, run connect get gcs
with the following mandatory parameter:
--bucket
the GCS bucket name
Depending on the type of authentication, either use --auth-json for key-based authentication or rely on the service account assigned to the instance (if the server is running on GCP).
To use a JSON key for authentication, provide:
--auth-json
the path to the file containing the auth JSON
To authenticate with Application Default Credentials (ADC), set the GOOGLE_APPLICATION_CREDENTIALS environment variable so that it points to the key file:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key-file.json"
You can then omit --auth-json, because the key file is read from the environment variable.
For user credentials, set up ADC by running the following command:
$ gcloud auth application-default login
Once authenticated, you can connect without specifying the JSON key.
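A wrapper script may need to support both authentication styles. The sketch below is one possible way to do that in plain shell; the function name gcs_get_command and the KEY_FILE variable are hypothetical names introduced for illustration and are not part of connect. It adds --auth-json only when a key file path is supplied and otherwise leaves authentication to ADC or the instance service account. The command is printed rather than executed.

```shell
# Hypothetical helper: builds a "connect get gcs" command line.
# $1 = path to a JSON key file (may be empty), remaining arguments = other options.
gcs_get_command() {
  key_file="${1:-}"
  if [ "$#" -gt 0 ]; then
    shift
  fi
  set -- connect get gcs "$@"
  if [ -n "$key_file" ]; then
    # Key-based authentication: pass the JSON key explicitly.
    set -- "$@" --auth-json "$key_file"
  fi
  # Without a key file nothing is added: credentials are expected to come
  # from ADC (GOOGLE_APPLICATION_CREDENTIALS or gcloud) or the service account.
  printf '%s\n' "$*"
}

# KEY_FILE is an assumed variable name chosen for this sketch.
gcs_get_command "${KEY_FILE:-}" --bucket my-gcs-bucket --file file1.zip
```

To run the command instead of printing it, the final printf in the function could be replaced with "$@".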
Getting single file from a GCS bucket
To get a single file from a GCS bucket, use
--file
followed by the name of the file to download
The following command gets file1.zip from the GCS bucket my-gcs-bucket, using a JSON key for authentication.
$ connect get gcs --file file1.zip --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 11:45:28 INFO Getting files from gcs://my-gcs-bucket/
2025/06/05 11:45:28 INFO Files to get: count=1
[file1.zip] 10.00 MiB / 10.00 MiB done
Getting multiple files from a GCS bucket
To get multiple files from a GCS bucket, specify either a file mask or a regular expression for the file names:
--file-mask
a file mask
--file-regex
a regular expression
This downloads all files that match the mask or pattern. You can also provide an optional source folder that contains the files to be downloaded:
-s, --src-folder
folder containing the files to download
If you don’t provide a source folder, files are downloaded from the root folder of the GCS bucket.
The following command gets all .zip files from the data folder of the GCS bucket my-gcs-bucket, using a JSON key for authentication.
$ connect get gcs --file-mask "*.zip" -s data --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:12:14 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:12:15 INFO Files to get: count=3
[file1.zip] 10.00 MiB / 10.00 MiB done
[file3.zip] 10.00 MiB / 10.00 MiB done
[file2.zip] 10.00 MiB / 10.00 MiB done
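The difference between a file mask and a regular expression can be tried out locally before running a transfer. The sketch below uses plain shell pattern matching and grep -E as stand-ins. Whether connect interprets --file-mask exactly like a shell glob, and which regular expression dialect --file-regex accepts, are assumptions here and should be checked against the tool. The function names and the sample pattern ^file[0-9]+\.zip$ are made up for illustration.

```shell
# Stand-in for --file-mask: shell glob matching (assumed, not confirmed, semantics).
matches_mask() {
  # $1 = file name, $2 = mask; $2 is left unquoted so it acts as a pattern.
  case "$1" in
    $2) return 0 ;;
    *) return 1 ;;
  esac
}

# Stand-in for --file-regex: POSIX extended regular expression via grep -E.
matches_regex() {
  # $1 = file name, $2 = regular expression
  printf '%s\n' "$1" | grep -Eq -- "$2"
}

# Compare both filters on a few sample names.
for name in file1.zip file2.zip backup.zip notes.txt; do
  if matches_mask "$name" '*.zip'; then mask=yes; else mask=no; fi
  if matches_regex "$name" '^file[0-9]+\.zip$'; then regex=yes; else regex=no; fi
  printf '%-12s mask=%s regex=%s\n' "$name" "$mask" "$regex"
done
```

In this sketch backup.zip matches the mask but not the regular expression, which is the kind of narrowing a regular expression allows over a simple mask.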
You can also provide an optional destination folder to download the files to:
-d, --dst-folder
target folder for the downloaded files
If you don’t provide a destination folder, files are downloaded to the current folder.
In the example below, all .zip files from the data folder of the GCS bucket my-gcs-bucket are downloaded to the local folder, using a JSON key for authentication.
$ connect get gcs --file-mask "*.zip" -s data -d local --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:13:12 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:13:12 INFO Files to get: count=3
[file1.zip] 10.00 MiB / 10.00 MiB done
[file3.zip] 10.00 MiB / 10.00 MiB done
[file2.zip] 10.00 MiB / 10.00 MiB done
Getting files from a GCS bucket in batch mode
When running the application in batch or non-interactive mode, such as from a script or scheduler, disable the progress bar with
--batch
get files in batch mode
The following command gets all .zip files from the data folder of the GCS bucket my-gcs-bucket and places them in the local folder, using batch mode and a JSON key for authentication.
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:14:37 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:14:37 INFO Files to get: count=3
2025/06/05 12:14:37 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:14:39 INFO File transferred successfully bytesTransferred=10485760 duration=1s821ms transferRate=5.49MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:14:39 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:14:41 INFO File transferred successfully bytesTransferred=10485760 duration=1s830ms transferRate=5.46MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:14:41 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:14:43 INFO File transferred successfully bytesTransferred=10485760 duration=1s839ms transferRate=5.44MB/s file=data/file2.zip targetFolder=local
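A script that is sometimes run by hand and sometimes from a scheduler can add --batch only when no terminal is attached. The following is a minimal sketch under that assumption; is_interactive is a hypothetical helper, and the command is printed rather than executed.

```shell
# Hypothetical helper: succeeds when the given file descriptor
# (default 1, standard output) is attached to a terminal.
is_interactive() {
  [ -t "${1:-1}" ]
}

# Base command taken from the example above, without --batch or --auth-json.
set -- connect get gcs --file-mask "*.zip" -s data -d local --bucket my-gcs-bucket

# Add --batch when output goes to a file, a pipe, or a scheduler log.
if ! is_interactive; then
  set -- "$@" --batch
fi

# Print the resulting command; replace this line with "$@" to run it.
printf '%s\n' "$*"
```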
Getting files from a GCS bucket using scheduler
When running the application from a scheduler, make sure that only one instance operates on a given folder at a time. You can configure the application to create a flag file at startup and delete it on completion. If another instance starts while one is already running, it detects the flag file and exits. This ensures that only one instance of the application runs at any given time. To enable this, use
--flag
path to the flag file
The following command gets all .zip files from the data folder of the GCS bucket my-gcs-bucket and places them in the local folder, using /tmp/flag1 as the flag file and a JSON key for authentication.
Note that the example uses batch mode, which is required for the application to run correctly from a script or scheduler.
$ connect get gcs --flag /tmp/flag1 --batch --file-mask "*.zip" -s data -d local --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:15:21 INFO Using flag file: /tmp/flag1
2025/06/05 12:15:21 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:15:21 INFO Files to get: count=3
2025/06/05 12:15:22 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:15:23 INFO File transferred successfully bytesTransferred=10485760 duration=1s824ms transferRate=5.48MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:15:24 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:15:25 INFO File transferred successfully bytesTransferred=10485760 duration=1s820ms transferRate=5.49MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:15:26 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:15:28 INFO File transferred successfully bytesTransferred=10485760 duration=1s846ms transferRate=5.42MB/s file=data/file2.zip targetFolder=local
2025/06/05 12:15:28 INFO Removing flag file: /tmp/flag1
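The flag-file behaviour described above follows a common locking pattern. The sketch below shows that general pattern in plain shell; it is an illustration only and not how connect implements --flag, and run_with_flag is a hypothetical name. When --flag is used, connect handles the flag file itself, so no such wrapper is needed.

```shell
# Illustration only: a generic flag-file guard, not connect's actual code.
# $1 = path to the flag file, remaining arguments = command to run.
run_with_flag() {
  flag="$1"
  shift
  # With noclobber (set -C) the redirection fails if the file already exists,
  # so checking for the flag and creating it happen in a single step.
  if ! ( set -C; : > "$flag" ) 2>/dev/null; then
    echo "flag file $flag exists, another instance seems to be running" >&2
    return 1
  fi
  "$@"
  status=$?
  # Remove the flag on completion, whether the command succeeded or not.
  rm -f "$flag"
  return "$status"
}

# Example call (commented out so that the sketch has no side effects):
# run_with_flag /tmp/flag1 echo "only one of these runs at a time"
```

A flag file left behind by a crashed run blocks later runs in this sketch until it is removed by hand.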
Getting files from a GCS bucket in parallel
To speed up downloads, you can fetch files simultaneously. The application then gets files using multiple sessions. To specify the number of concurrent GCS sessions, use:
--parallel
number of GCS sessions
The following command gets all .zip files from the data folder of the GCS bucket my-gcs-bucket, using a JSON key for authentication and 3 parallel sessions.
$ connect get gcs --file-mask "*.zip" -s data --batch --parallel 3 --bucket my-gcs-bucket --auth-json /path/to/key-file.json
2025/06/05 12:16:03 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:16:03 INFO Files to get: count=3
2025/06/05 12:16:04 INFO Transferring file=data/file3.zip targetFolder=
2025/06/05 12:16:04 INFO Transferring file=data/file1.zip targetFolder=
2025/06/05 12:16:04 INFO Transferring file=data/file2.zip targetFolder=
2025/06/05 12:16:05 INFO File transferred successfully bytesTransferred=10485760 duration=1s817ms transferRate=5.50MB/s file=data/file1.zip targetFolder=
2025/06/05 12:16:05 INFO File transferred successfully bytesTransferred=10485760 duration=1s821ms transferRate=5.49MB/s file=data/file3.zip targetFolder=
2025/06/05 12:16:05 INFO File transferred successfully bytesTransferred=10485760 duration=1s625ms transferRate=6.15MB/s file=data/file2.zip targetFolder=
Getting files from a GCS bucket in sequence
If you don’t specify the --parallel option, files are downloaded using a single GCS session, in the order in which they appeared in the GCS bucket (oldest files first).
Action after file is downloaded from a GCS bucket
After a file is successfully downloaded to the local system, it can be removed from the source or moved to a different folder in the source bucket.
--delete
delete a file after it’s downloaded
--move-folder
target folder to move the file to after it’s downloaded
The following command gets all .zip files from the data folder of the GCS bucket my-gcs-bucket, using a JSON key for authentication and 3 parallel sessions, and stores them in the local folder. The files are then moved to the archive folder in the GCS bucket.
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --parallel 3 --bucket my-gcs-bucket --move-folder archive --auth-json /path/to/key-file.json
2025/06/05 12:16:42 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:16:42 INFO Files to get: count=3
2025/06/05 12:16:42 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:16:44 INFO File moved to folder folder=archive file=data/file1.zip
2025/06/05 12:16:44 INFO File transferred successfully bytesTransferred=10485760 duration=1s458ms transferRate=6.86MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:16:44 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:16:47 INFO File moved to folder folder=archive file=data/file3.zip
2025/06/05 12:16:47 INFO File transferred successfully bytesTransferred=10485760 duration=1s848ms transferRate=5.41MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:16:47 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:16:50 INFO File moved to folder folder=archive file=data/file2.zip
2025/06/05 12:16:50 INFO File transferred successfully bytesTransferred=10485760 duration=1s817ms transferRate=5.50MB/s file=data/file2.zip targetFolder=local
The following command gets all .zip files from the data folder of the GCS bucket my-gcs-bucket, using a JSON key for authentication and 3 parallel sessions, and then removes the files from the GCS bucket. Files are downloaded to the local folder.
$ connect get gcs --file-mask "*.zip" -s data -d local --batch --parallel 3 --bucket my-gcs-bucket --delete --auth-json /path/to/key-file.json
2025/06/05 12:18:22 INFO Getting files from gcs://my-gcs-bucket/data/
2025/06/05 12:18:22 INFO Files to get: count=3
2025/06/05 12:18:23 INFO Transferring file=data/file1.zip targetFolder=local
2025/06/05 12:18:25 INFO File deleted file=data/file1.zip
2025/06/05 12:18:25 INFO File transferred successfully bytesTransferred=10485760 duration=1s815ms transferRate=5.51MB/s file=data/file1.zip targetFolder=local
2025/06/05 12:18:25 INFO Transferring file=data/file3.zip targetFolder=local
2025/06/05 12:18:26 INFO File deleted file=data/file3.zip
2025/06/05 12:18:26 INFO File transferred successfully bytesTransferred=10485760 duration=1s452ms transferRate=6.89MB/s file=data/file3.zip targetFolder=local
2025/06/05 12:18:27 INFO Transferring file=data/file2.zip targetFolder=local
2025/06/05 12:18:29 INFO File deleted file=data/file2.zip
2025/06/05 12:18:29 INFO File transferred successfully bytesTransferred=10485760 duration=1s844ms transferRate=5.42MB/s file=data/file2.zip targetFolder=local
All GCS Get Options
Usage:
  connect get gcs [command] [flags]

Flags:
      --auth-json string          GCS authentication json file
      --bucket string             GCS bucket

Global Flags:
      --batch                     No progress bars
      --delete                    Try to delete files after successful get
  -d, --dst-folder string         Destination folder for retrieved files
      --file string               File name to be get
      --file-mask string          File mask to filter files
      --file-regex string         File regex to filter files
      --flag string               Flag file name
      --from-mail string          From mail used to send notifications (only in batch mode)
      --godebug                   Turns on debug mode
      --help                      Prints help for the command
      --log-format string         Log output format: text|json (default "text")
      --mail-format string        Mail format [text|html] (only in batch mode) (default "plain")
      --move-folder string        Folder to move files after successful get
      --no-color                  Do not use colors in logs
      --parallel uint             Number of sessions used to get files (default 1)
      --quiet                     Makes no output
  -s, --src-folder string         Folder to look for files
      --to-mail-failure strings   Email list to send failure notification (only in batch mode)
      --to-mail-success strings   Email list to send success notification (only in batch mode)
Errors
Please check here for information on the possible errors.