AWS DataSync for Data Transmission

Mohammad Mahmud Hasan
Aug 17, 2023


AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving large volumes of data to and from AWS storage services over the Internet or AWS Direct Connect. It is well suited to large-scale data migrations, whether between clouds or from on-premises systems to the cloud.

You can use DataSync to transfer data between Network File System (NFS) or Server Message Block (SMB) file servers, Amazon Simple Storage Service (S3), Amazon Elastic File System (EFS), and Amazon FSx for Windows File Server.

I’ll outline the steps for a hypothetical scenario: moving data from an S3 bucket in one AWS account to an S3 bucket in another account.

If you want to know more in-depth, check out this.

S3-to-S3 transfers between accounts and Regions do not require a DataSync agent; the DataSync service handles them internally.

When moving data between accounts, make sure the destination account has an IAM role that can access the data in the source S3 location. To avoid problems, it is advisable to run the DataSync service from the destination AWS account.

Step — 1: Setup in the destination account (Part 1)

  1. Create an IAM role named CrossAccountS3FullAccess to give DataSync access to the source S3 bucket. For convenience, this example attaches the AmazonS3FullAccess policy.
  2. After creating the role, open the Trust relationships tab, click Edit trust relationship, and add the following trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "datasync.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
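If you prefer to keep the trust policy in a script rather than paste it into the console, the following is a minimal sketch that builds the same document with Python's standard library. The function name `datasync_trust_policy` and the file name `trust-policy.json` are illustrative assumptions, not part of the original walkthrough.

```python
import json


def datasync_trust_policy() -> dict:
    """Return a trust policy that lets the DataSync service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def write_trust_policy(path: str = "trust-policy.json") -> str:
    """Write the policy to a file (hypothetical file name) and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(datasync_trust_policy(), handle, indent=2)
    return path
```

The resulting file could then be supplied when creating the role from the command line, for example with `aws iam create-role --role-name CrossAccountS3FullAccess --assume-role-policy-document file://trust-policy.json`.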

Step — 2: Setup in the source account

  1. On the bucket’s Permissions tab, click Bucket Policy.
  2. Click Edit, add the following, and save:
{
  "Version": "2012-10-17",
  "Id": "Policy1616194240988",
  "Statement": [
    {
      "Sid": "Stmt1616194236908",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<destination_account_id>:user/<destination_iam_user_name>",
          "arn:aws:iam::<destination_account_id>:role/CrossAccountS3FullAccess"
        ]
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::<your_bucket_name>",
        "arn:aws:s3:::<your_bucket_name>/*"
      ]
    }
  ]
}
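Filling in the placeholders by hand is error-prone, so here is a small sketch that generates the same bucket policy from its inputs. The function name `source_bucket_policy` and its parameters are assumptions made for illustration; the optional `Id` and `Sid` fields are omitted, and the role name defaults to the CrossAccountS3FullAccess role from Step 1.

```python
import json
from typing import Optional


def source_bucket_policy(
    destination_account_id: str,
    bucket_name: str,
    role_name: str = "CrossAccountS3FullAccess",
    user_name: Optional[str] = None,
) -> dict:
    """Build a bucket policy granting the destination account's principals S3 access."""
    principals = []
    if user_name:
        # IAM user in the destination account that runs the DataSync setup.
        principals.append(f"arn:aws:iam::{destination_account_id}:user/{user_name}")
    # Role that DataSync assumes to read the source bucket.
    principals.append(f"arn:aws:iam::{destination_account_id}:role/{role_name}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",      # the bucket itself
                    f"arn:aws:s3:::{bucket_name}/*",    # every object in it
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID, bucket, and user names for illustration only.
    print(json.dumps(source_bucket_policy("111122223333", "my-source-bucket", user_name="deployer"), indent=2))
```

The printed JSON can be pasted into the bucket policy editor described above.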

Step — 3: Setup in the destination account (Part 2)

Next, create an S3 bucket with default properties in the destination account or Region. Then, from the destination account (B), use the AWS CLI command below to create the cross-account S3 location for the source bucket (in account A):

aws datasync create-location-s3 \
  --s3-bucket-arn <your_source_bucket_arn> \
  --s3-config '{"BucketAccessRoleArn": "arn:aws:iam::<your_destination_account_id>:role/CrossAccountS3FullAccess"}' \
  --region <your_region>

If the command succeeds, it prints output like the following:

{
"LocationArn": "arn:aws:datasync:<your_region>:<account_id>:location/loc-<a_random_number>"
}
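The quoting of the `--s3-config` JSON is easy to get wrong in a shell, so the sketch below assembles the same command as an argument list and pulls the `LocationArn` out of the output. The helper names `create_location_command` and `parse_location_arn` are hypothetical, and the role name defaults to the CrossAccountS3FullAccess role created in Step 1.

```python
import json
from typing import List


def create_location_command(
    source_bucket_arn: str,
    destination_account_id: str,
    region: str,
    role_name: str = "CrossAccountS3FullAccess",
) -> List[str]:
    """Build the argv for `aws datasync create-location-s3` without shell quoting."""
    s3_config = json.dumps(
        {"BucketAccessRoleArn": f"arn:aws:iam::{destination_account_id}:role/{role_name}"}
    )
    return [
        "aws", "datasync", "create-location-s3",
        "--s3-bucket-arn", source_bucket_arn,
        "--s3-config", s3_config,
        "--region", region,
    ]


def parse_location_arn(cli_output: str) -> str:
    """Extract the LocationArn field from the command's JSON output."""
    return json.loads(cli_output)["LocationArn"]
```

The list could be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)` under destination-account credentials, with the captured `stdout` then handed to `parse_location_arn`.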

The source S3 location should now appear in the DataSync view of the destination account, so we can configure both the source and destination locations for the DataSync service.

Step — 4: Create DataSync Task

  1. In the DataSync pane, click Create task and follow the instructions to set up a transfer between the source and destination locations.
  2. When you’ve completed the configuration, you’ll see details about your task in the dashboard.
  3. Select your task, go to Actions, and click Start. Your data migration will begin.
  4. You can check the execution status of your DataSync task on the Task History tab.
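The console steps above can also be scripted. The following is a minimal sketch, assuming a boto3 DataSync client is passed in (for example `boto3.client("datasync")` created with destination-account credentials). The function name `run_transfer`, the task name, the polling interval, and the choice to treat only `SUCCESS` and `ERROR` as final states are illustrative assumptions.

```python
import time

# Statuses this sketch treats as final; others (e.g. TRANSFERRING) keep polling.
FINAL_STATUSES = {"SUCCESS", "ERROR"}


def run_transfer(
    client,
    source_location_arn: str,
    destination_location_arn: str,
    name: str = "s3-cross-account-copy",  # hypothetical task name
    poll_seconds: float = 30.0,
    max_polls: int = 1000,
) -> str:
    """Create a DataSync task, start it, and poll until it finishes."""
    task = client.create_task(
        SourceLocationArn=source_location_arn,
        DestinationLocationArn=destination_location_arn,
        Name=name,
    )
    execution = client.start_task_execution(TaskArn=task["TaskArn"])
    execution_arn = execution["TaskExecutionArn"]

    for _ in range(max_polls):
        status = client.describe_task_execution(TaskExecutionArn=execution_arn)["Status"]
        if status in FINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Task execution {execution_arn} did not finish in time")
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real account.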

DataSync moves only the contents of your bucket, not the bucket settings. You have to configure those manually after the migration.
