DMS configuration for DocumentDB: don't forget about the indexes

Here is a new chapter of our series “Technical pills about Platform Engineering: beyond what you find in the documentation”, where the Codurance Platform Engineering team shares what we learn from the issues we face in our day-to-day operations. In the previous post we talked about Migrating from Amazon Linux to Bottlerocket.

This time, a new need arose in one of our dev teams: migrate part of the application to a new AWS account.

There are many moving parts to consider, and one of them was the database migration: part of the architecture is a huge Amazon DocumentDB cluster. The key constraint is its high write rate, so we took snapshots off the table for now: restoring from one would mean either losing some data or stopping the application's writes to the database, which would impact the business.

The solution? AWS Database Migration Service (AWS DMS). Following the official documentation, we configured the DMS replication instance and the source and target endpoints (tested and running!), and finally created the database migration task. As our goal was to have all the data in the new account, we selected “Migrate existing data and replicate ongoing changes”.
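For reference, the console option “Migrate existing data and replicate ongoing changes” corresponds to the `full-load-and-cdc` migration type in the DMS API. Below is a minimal sketch of how the same task could be created with boto3, assuming the replication instance and both endpoints already exist. The task identifier, the bracketed ARNs and the "include everything" selection rule are placeholders and assumptions, not the values we used.

```python
import json


def build_task_params(task_id, source_arn, target_arn, instance_arn):
    """Build keyword arguments for the DMS create_replication_task call."""
    # Assumption: one selection rule that includes every database ("schema")
    # and every collection ("table"). Narrow it down as needed.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all",
                "object-locator": {"schema-name": "%", "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Console: "Migrate existing data and replicate ongoing changes"
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


def create_task(params):
    """Create the task. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    return boto3.client("dms").create_replication_task(**params)


# Hypothetical identifier and placeholder ARNs
params = build_task_params(
    "docdb-migration",
    "[SOURCE_ENDPOINT_ARN]",
    "[TARGET_ENDPOINT_ARN]",
    "[REPLICATION_INSTANCE_ARN]",
)
print(params["MigrationType"])
```

Calling `create_task(params)` with real ARNs would submit the task; the builder is kept separate so the parameters can be reviewed first.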


Once the task ran, we connected to the new database and started reviewing: the databases were there, the collections also… but the indexes were missing!

After digging deeper, we found that AWS states:

“AWS DMS creates tables, primary keys, and in some cases unique indexes, but it doesn't create any other objects that aren't required to efficiently migrate the data from the source. For example, it doesn't create secondary indexes, non-primary key constraints, or data defaults.

To migrate secondary objects from your database, use the database's native tools if you are migrating to the same database engine as your source database.” - Official documentation talking about best practices

In that case, what’s the best approach? Fortunately there’s a tool provided for that! We can use the Amazon DocumentDB Index Tool to:

  1. Export the MongoDB source indexes
  2. Check index compatibility with the target
  3. Restore the indexes to the target Amazon DocumentDB, preferably before using AWS DMS to load data

 

To connect to both databases we need a machine that can reach them, so we deploy an AWS Cloud9 instance and take advantage of the already configured peering. First, we install the mongo shell to check the connection to each database:

## Install mongo shell

echo -e "[mongodb-org-4.0] \nname=MongoDB Repository\nbaseurl=https://repo.mongodb.org/yum/amazon/2013.03/mongodb-org/4.0/x86_64/\ngpgcheck=1 \nenabled=1 \ngpgkey=https://www.mongodb.org/static/pgp/server-4.0.asc" | sudo tee /etc/yum.repos.d/mongodb-org-4.0.repo

sudo yum install -y mongodb-org-shell

## Download CA

wget https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem

## Test connection to source database

mongo --ssl --host [SOURCE_HOST]:[SOURCE_PORT] --sslCAFile global-bundle.pem --username administrator --password '[SOURCE_PASSWORD]'

## Test connection to target database

mongo --ssl --host [TARGET_HOST]:[TARGET_PORT] --sslCAFile global-bundle.pem --username administrator --password '[TARGET_PASSWORD]'

Once we have verified that we can connect to both databases, we install the index tool and run the three steps listed above:

## Clone and install index migration tool

git clone https://github.com/awslabs/amazon-documentdb-tools.git

cd amazon-documentdb-tools/index-tool

python3 -m pip install -r requirements.txt

## Create index export folder

mkdir -p docdb_index_export

## Copy the previously downloaded CA

cp -rp /home/ec2-user/environment/global-bundle.pem .

## Dump the index info from source db

python3 migrationtools/documentdb_index_tool.py --dump-indexes --dir docdb_index_export --uri 'mongodb://administrator:[SOURCE_PASSWORD]@[SOURCE_HOST]:[SOURCE_PORT]/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&retryWrites=false'

## Verify index compatibility with target db

python3 migrationtools/documentdb_index_tool.py --uri 'mongodb://administrator:[TARGET_PASSWORD]@[TARGET_HOST]:[TARGET_PORT]/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false' --show-issues --dir docdb_index_export --support-2dsphere

## Restore index in target db

python3 migrationtools/documentdb_index_tool.py --uri 'mongodb://administrator:[TARGET_PASSWORD]@[TARGET_HOST]:[TARGET_PORT]/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false' --restore-indexes --dir docdb_index_export --support-2dsphere

## If the credentials contain any special characters, they must be percent-encoded (% escaped) in the URI
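The percent-escaping of credentials can be sketched with Python's standard library. `build_docdb_uri` is a hypothetical helper, and the host name and password in the example are made up; the query parameters simply mirror the ones used in the commands above.

```python
from urllib.parse import quote_plus


def build_docdb_uri(user, password, host, port, ca_file="global-bundle.pem"):
    """Return a connection URI with percent-encoded credentials."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/"
        f"?tls=true&tlsCAFile={ca_file}&replicaSet=rs0&retryWrites=false"
    )


# Made-up password containing characters that would otherwise break the URI
uri = build_docdb_uri("administrator", "p@ss:w/rd", "docdb.example.internal", 27017)
print(uri)
# The credentials part becomes administrator:p%40ss%3Aw%2Frd
```

The resulting string can be passed to the `--uri` option of the index tool in place of a hand-written one.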

And we're all set! We can now connect to the target database and verify that the indexes are there. The database is ready for our migration!
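That final verification can also be scripted rather than done by eye. The sketch below is an assumption on our side, not part of the index tool: `collect_indexes` reads index definitions through pymongo, and `missing_indexes` reports index names found in the source but not in the target. It compares names only, not key definitions or options.

```python
def missing_indexes(source, target):
    """Return {namespace: [index names]} present in source but not in target.

    Both arguments map "db.collection" to the dict returned by pymongo's
    Collection.index_information().
    """
    missing = {}
    for namespace, indexes in source.items():
        absent = sorted(set(indexes) - set(target.get(namespace, {})))
        if absent:
            missing[namespace] = absent
    return missing


def collect_indexes(uri):
    """Gather index information for every collection reachable through uri."""
    from pymongo import MongoClient  # lazy import: only needed for live clusters

    client = MongoClient(uri)
    found = {}
    for db_name in client.list_database_names():
        db = client[db_name]
        for coll_name in db.list_collection_names():
            found[f"{db_name}.{coll_name}"] = db[coll_name].index_information()
    return found


# Example with hand-written data instead of live clusters
source = {
    "shop.orders": {
        "_id_": {"key": [("_id", 1)]},
        "customer_1": {"key": [("customer", 1)]},
    }
}
target = {"shop.orders": {"_id_": {"key": [("_id", 1)]}}}
print(missing_indexes(source, target))  # {'shop.orders': ['customer_1']}
```

Against real clusters, `missing_indexes(collect_indexes(source_uri), collect_indexes(target_uri))` should come back empty once the restore has succeeded.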

Have you ever faced this case? How did you solve it?

 
