Sunday 14 February 2016

Bulk Upload/Copy a Folder Structure and Files to an Amazon S3 Bucket Using Windows PowerShell

There are two ways to solve this problem:

Solution 1 - Write your own recursive function that iterates through all the files and folders in your specified folder structure.
Solution 2 - Use a simple tool called "AWS Command Line Interface," provided by Amazon.

Let's now discuss both of these solutions in more detail:

Solution 1.

In my previous post there is an example of how you can write your own Recursive Function to List All Files in Folders and Sub-folders using Windows PowerShell. You can use that code as a starting point and build working code for uploading the folder structure to an Amazon S3 bucket.

Quick tip here: according to the AWS documentation, "Amazon S3 has a flat structure with no hierarchy like you would see in a typical file system. However, for the sake of organizational simplicity, the Amazon S3 console supports the folder concept as a means of grouping objects. Amazon S3 does this by using key name prefixes for objects."

So you don't have to create a folder in the S3 bucket before uploading files. All you need to do is include the path in the object key, e.g. photos/abc.png, and the S3 console will show abc.png inside a photos folder, because "photos/" is simply a prefix of the key.
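
As a rough illustration of the key-prefix idea, here is a minimal sketch that walks a local folder and works out the S3 key each file would get. The function name Get-S3KeyMap and its parameters are made up for this example, and it does not upload anything; the actual upload call (for instance Write-S3Object from the AWS Tools for PowerShell, if you have that module installed) is left out on purpose.

```powershell
# Hypothetical helper (not part of any AWS tool): lists every file under $Root
# together with the S3 key it would be stored under.
function Get-S3KeyMap {
    param(
        [Parameter(Mandatory = $true)][string]$Root,
        [string]$Prefix = ""
    )

    # Full path without a trailing separator, so the relative path lines up.
    $rootPath = (Resolve-Path -LiteralPath $Root).ProviderPath.TrimEnd([char[]]'\/')

    Get-ChildItem -LiteralPath $rootPath -Recurse |
        Where-Object { -not $_.PSIsContainer } |   # files only, skip folders
        ForEach-Object {
            # Path relative to the root, e.g. photos\abc.png
            $relative = $_.FullName.Substring($rootPath.Length + 1)
            New-Object PSObject -Property @{
                File = $_.FullName
                # S3 keys use forward slashes as the "folder" separator.
                Key  = $Prefix + ($relative -replace '\\', '/')
            }
        }
}
```

Calling Get-S3KeyMap -Root "c:\MyDirectory" -Prefix "1.3/" would then give you one File/Key pair per file, which you could feed to whatever upload command you prefer.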

The downside of this approach is that it is very slow. For my folder (2,155 files and 87 folders, total size 32.8 MB), it took 41 minutes to upload everything to my AWS S3 bucket.

Solution 2.

Before we go through this approach we need to know, "What Is the AWS Command Line Interface?"

According to AWS CLI documentation "The AWS Command Line Interface is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts."

In this tutorial we will use the AWS CLI tools (the recommended approach).

Before we go further, you need to keep a few things in mind:

- We will be using Microsoft Windows 7 or above.
- Make sure you download and install the AWS CLI tools in your Windows environment.
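
A quick way to check the second point from PowerShell is sketched below. The function name Test-AwsCliInstalled is just a name chosen for this example; it only assumes that the installer has put an "aws" command on your PATH.

```powershell
# Hypothetical helper: returns $true when an "aws" command can be found on the PATH.
function Test-AwsCliInstalled {
    $command = Get-Command aws -ErrorAction SilentlyContinue
    return [bool]$command
}

if (Test-AwsCliInstalled) {
    # Print the installed CLI version.
    aws --version
} else {
    Write-Warning "AWS CLI not found. Install it and open a new PowerShell window."
}
```

If the warning still shows up right after installing, opening a new PowerShell window usually helps, since an already open window keeps its old PATH.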

Here is the documentation for the S3 API via CLI:
http://docs.aws.amazon.com/cli/latest/reference/s3/index.html
If you read through the documentation, here is how our statement should look:
aws s3 cp /tmp/foo/ s3://bucket/ --recursive --exclude "*" --include "*.jpg"

If you try to do everything according to the documentation, you will run into the problem of setting up environment variables (that's what happened to me, at least). You need to set the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION.

Tip: The AWS documentation describes setting environment variables like this, for example:
AWS_ACCESS_KEY_ID =AK0BJUTDXP2PPTT
This approach didn't work for me in PowerShell, so I ended up setting the environment variables a different way:
[Environment]::SetEnvironmentVariable("AWS_ACCESS_KEY_ID","AK0BJUTDXP2PPTT")
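
My guess is that the documented form is meant for cmd.exe or a Linux shell rather than PowerShell. For completeness, PowerShell also has its own syntax for environment variables, the $env: drive. The sketch below uses the same placeholder key as above (not a real key) and shows that both forms end up in the same place, namely the environment of the current PowerShell process.

```powershell
# PowerShell-native syntax; the value is a placeholder, not a real access key.
$env:AWS_ACCESS_KEY_ID = "AK0BJUTDXP2PPTT"

# Reading it back through .NET returns the value set above, because both
# forms write to the environment of the current process.
$readBack = [Environment]::GetEnvironmentVariable("AWS_ACCESS_KEY_ID")
Write-Output $readBack
```

Either way, the variable only lives as long as the current PowerShell window, so it has to be set again in every new session or at the top of the script.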

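Final code:
# Replace the placeholders below with your own values.
$region = "AWS-BUCKET-REGION"
$folder = "c:\MyDirectory\"
$bucket = "AWS-BUCKET"
$accessKey = "AWS-ACCESS-KEY"
$secretKey = "AWS-SECRET-KEY"
# Used as the key prefix (the top-level "folder") inside the bucket.
$version = "1.3"

# Set credentials and region for the current PowerShell process only.
[Environment]::SetEnvironmentVariable("AWS_ACCESS_KEY_ID",$accessKey)
[Environment]::SetEnvironmentVariable("AWS_SECRET_ACCESS_KEY",$secretKey)
[Environment]::SetEnvironmentVariable("AWS_DEFAULT_REGION",$region)

# Upload the whole folder structure recursively.
aws s3 cp $folder s3://$bucket/$version/ --recursive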
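
Before uploading a large folder for real, it can be handy to preview what would be copied. The sketch below builds the argument list for "aws s3 cp" and optionally appends the CLI's --dryrun flag, which lists the operations without performing them. The function name Get-S3CopyArguments and its parameters are invented for this example, and the actual call is commented out so the snippet can be run without the CLI or credentials.

```powershell
# Hypothetical helper: builds the argument list for "aws s3 cp".
function Get-S3CopyArguments {
    param(
        [Parameter(Mandatory = $true)][string]$Folder,
        [Parameter(Mandatory = $true)][string]$Bucket,
        [string]$Version = "",
        [switch]$DryRun
    )

    # The target is the bucket plus an optional key prefix such as "1.3/".
    $target = "s3://$Bucket/"
    if ($Version) { $target += "$Version/" }

    $arguments = @("s3", "cp", $Folder, $target, "--recursive")
    if ($DryRun) { $arguments += "--dryrun" }
    return $arguments
}

$arguments = Get-S3CopyArguments -Folder "c:\MyDirectory" -Bucket "AWS-BUCKET" -Version "1.3" -DryRun

# Uncomment once the CLI is installed and the environment variables are set:
# & aws @arguments
# if ($LASTEXITCODE -ne 0) { throw "aws s3 cp failed with exit code $LASTEXITCODE" }
```

Once the preview looks right, drop the -DryRun switch and run the same call again to do the actual upload.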

/Adnan