I noticed that there is no option to download an entire S3 bucket, or even an entire directory, from the AWS Management Console to your local system.
Searching the web turns up many options for doing this, but the best one, at least for me, is the AWS CLI.
The following steps work 100% for me; I have downloaded all the files from an AWS S3 bucket with their respective paths.
- Download and install the AWS CLI on your machine using this page. Make sure to follow the procedure for your OS (Linux, Windows or macOS).
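For example, on Linux x86_64 the install boils down to the following commands from the official instructions (adjust for your platform):
$ curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
$ unzip awscliv2.zip
$ sudo ./aws/install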
- Verify the installation by checking the AWS CLI version:
$ aws --version
Here is an example from my Ubuntu instance running on WSL.
alfredo@Mi-Pro:~$ aws --version
aws-cli/2.13.30 Python/3.11.6 Linux/5.10.16.3-microsoft-standard-WSL2 exe/x86_64.ubuntu.22 prompt/off
alfredo@Mi-Pro:~$
- Configure AWS CLI:
$ aws configure
AWS Access Key ID [None]: {Your_AWS_Access_Key_ID}
AWS Secret Access Key [None]: {Your_Secret_Access_Key}
Default region name [None]: eu-west-3
Default output format [None]: json
Make sure you enter valid access and secret keys, which you received when you created the account.
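A quick way to double-check what the CLI picked up is aws configure list, which prints the active profile, the (partially masked) keys, and the region:
$ aws configure list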
- Sync the S3 bucket using:
$ aws s3 sync s3://yourbucket_name /local/destination/path
In the above command, replace the following fields:
yourbucket_name
>> the S3 bucket that you want to download.
/local/destination/path
>> the path in your local system where you want to download all the files. You can also simply use . (dot) if you want the objects from the bucket to be saved in the current directory.
Check out the official documentation for details and examples.
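As a side note, sync also accepts --exclude and --include filters, so you can pull down only part of a bucket. A small sketch with placeholder names that downloads only .csv objects into the current directory:
$ aws s3 sync s3://yourbucket_name . --exclude "*" --include "*.csv"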
While the above example shows how to download an entire bucket, you can also download a specific folder recursively with the following command:
$ aws s3 cp s3://bucket_name/path/to/folder LocalFolderName --recursive
This aws s3 cp command will instruct the CLI to download all files and folder keys recursively within the path/to/folder directory of the bucket_name bucket.
With this command you can also explore the --dryrun flag, which displays the operations that would be performed without actually running them.
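For example, appending --dryrun to the copy above (same placeholder names) lists what would be transferred without writing anything locally:
$ aws s3 cp s3://bucket_name/path/to/folder LocalFolderName --recursive --dryrun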
Bonus
If you have a ton of data and don’t want to wait forever, and/or are facing issues with the download speed/performance, you may want to read “AWS CLI S3 Configuration” for tips on tweaking your configuration to meet your needs.
For instance, the following commands will tell the AWS CLI to use 1,000 threads to execute jobs (each a small file or one part of a multipart copy) and look ahead 100,000 jobs:
$ aws configure set default.s3.max_concurrent_requests 1000
$ aws configure set default.s3.max_queue_size 100000
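If I remember correctly, you can read a value back with aws configure get to confirm the change took effect (the settings are stored in your ~/.aws/config file):
$ aws configure get default.s3.max_concurrent_requests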
I hope this helps and saves you some time.