Amazon DynamoDB, from development to deployment


Amazon DynamoDB is a fully managed, proprietary NoSQL database service that supports key-value and document data structures and is offered by Amazon as part of Amazon Web Services. DynamoDB exposes a data model similar to that of Dynamo, the distributed data storage system it is named after, but has a different underlying implementation. DynamoDB is known for its speed and scalability. It synchronously replicates data across multiple locations, ensuring high availability and durability. Another notable point about DynamoDB is that its billing is driven mainly by the throughput you use rather than by storage alone. We will discuss throughput in detail in section 3a.

I have been working with DynamoDB and Java Spring Boot for several months. In this article, I intend to share some of the problems I came across along the way and their solutions. We will go through DynamoDB starting from development all the way to deployment.

1) Setting up development environment

Our development environment consists of:

◆     Ubuntu 16.04

◆     Java 1.8.0

◆     Gradle 4.10

◆     Spring Boot 2.2

You can learn more about Spring Boot here

1.1 Installing AWS CLI

To configure AWS credentials we are going to use the AWS CLI. AWS credentials are required to access any Amazon service that you have opted for. To install the AWS CLI we need the Python package manager pip:

sudo apt install python-pip

Once you have pip installed, you can install the AWS CLI using:

sudo pip install awscli

Although we are not going to use the AWS CLI for anything other than configuring our credentials, we can also use it to interact with our local or remote database. For example, to list all tables in the local DynamoDB server running on port 8000, run the following:

aws dynamodb list-tables --endpoint-url http://localhost:8000

1.1.1 Configuring AWS cli

To configure aws cli, run:

aws configure

The AWS CLI will ask you for an access key, a secret key, a region and an output format. Since this is the development environment, we can use dummy values:

  • AWS Access Key ID [****************]: key
  • AWS Secret Access Key [****************]: key
  • Default region name []: us-east-1
  • Default output format []: json
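
For reference, aws configure stores these answers in two plain-text files under ~/.aws (the default location on Linux). With the dummy values above, the files should look roughly like this; the two file names are shown as comments:

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = key
aws_secret_access_key = key

# ~/.aws/config
[default]
region = us-east-1
output = json
```

Editing these files by hand has the same effect as re-running aws configure.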

 

1.2 Setting up local DynamoDB

DynamoDB has a self-contained local version that can be installed on any system with Java. The local version lets you write and test applications without accessing the DynamoDB web service, which saves on provisioned throughput, data storage, and data transfer fees. Switching from the local to the remote version requires minimal changes to application properties. Despite the benefits, the local version has some limitations. Firstly, its speed depends on your storage media: an SSD gives really fast reads and writes, while on an HDD your local setup can be slower than a production EC2 instance talking to DynamoDB in the same region. Secondly, with local DynamoDB you can't experiment with read/write throughput, because changing the throughput settings has no effect.

You can download the local version of DynamoDB from:

https://s3.ap-south-1.amazonaws.com/dynamodb-local-mumbai/dynamodb_local_latest.tar.gz

After extracting the archive you will have DynamoDBLocal.jar and the DynamoDBLocal_lib directory, among other files.

In this directory, run the following command:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

This will start the DynamoDB server on port 8000 (the default; you can specify another port using “-port {port_number}”).

1.3 Installing DynamoDB admin

One thing that local dynamodb lacks is an admin panel, a place where you can visually interact with the database. As of now, there is no official admin application. However, there is a third party application that provides some of these features. To install the admin panel, you must have Node installed on your system.

To install the admin panel run:

npm install dynamodb-admin -g

Export the socket address where DynamoDB is exposed:

export DYNAMO_ENDPOINT=http://localhost:8000

Run:

dynamodb-admin

Go to http://localhost:8001 and you will see the DynamoDB admin panel:

2) Starting development

As mentioned earlier, we will use Spring Boot as our development framework.

2.1 Dependencies

To get started with DynamoDB on Spring Boot, include the following dependencies:

compile "com.github.derjust:spring-data-dynamodb:5.0.4"
compile group: 'org.springframework.data', name: 'spring-data-commons', version: '2.0.2.RELEASE'

Spring Data DynamoDB is based on Spring Data and aims to provide a familiar and consistent Spring-based programming model for data access while retaining the special traits of the underlying data store. We could use the AWS SDK for Java directly, but this library gives more flexibility when switching from DynamoDB to another DBMS.

2.2 Configuring DynamoDB in Spring Boot

Although we are using the local version of dynamodb, the configuration needed to connect to dynamodb is almost the same as in the case with remote DynamoDB.

Here is our configuration class:

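import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapperConfig;
import org.socialsignin.spring.data.dynamodb.repository.config.EnableDynamoDBRepositories;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Assumption: repositories are enabled here and live in this class's package.
// Set basePackages if yours live elsewhere, or drop the annotation if it is declared on another class.
@Configuration
@EnableDynamoDBRepositories
public class DynamoDBConfig {

    @Value("${amazon.aws.accesskey}")
    private String amazonAWSAccessKey;

    @Value("${amazon.aws.secretkey}")
    private String amazonAWSSecretKey;

    @Value("${amazon.dynamodb.endpoint}")
    private String amazonDynamoDBEndpoint;

    public AWSCredentialsProvider amazonAWSCredentialsProvider() {
        return new AWSStaticCredentialsProvider(amazonAWSCredentials());
    }

    @Bean
    public AWSCredentials amazonAWSCredentials() {
        return new BasicAWSCredentials(amazonAWSAccessKey, amazonAWSSecretKey);
    }

    @Bean
    public DynamoDBMapperConfig dynamoDBMapperConfig() {
        return DynamoDBMapperConfig.DEFAULT;
    }

    @Bean
    public DynamoDBMapper dynamoDBMapper(AmazonDynamoDB amazonDynamoDB, DynamoDBMapperConfig config) {
        return new DynamoDBMapper(amazonDynamoDB, config);
    }

    @Bean
    public AmazonDynamoDB amazonDynamoDB() {
        // Point the client at the configured endpoint (local or remote)
        AmazonDynamoDB amazonDynamoDB = new AmazonDynamoDBClient(amazonAWSCredentials());
        amazonDynamoDB.setEndpoint(amazonDynamoDBEndpoint);

        return amazonDynamoDB;
    }
}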

Please note that we are getting the AWS Keys and DynamoDb Endpoint from application properties using @Value annotation. These properties are set in file application.properties under resources directory.

amazon.dynamodb.endpoint=http://localhost:8000/

amazon.aws.accesskey=key

amazon.aws.secretkey=key

Switching from local to remote DynamoDB is just a matter of changing the properties in this file.

This means you can configure different dynamodb for different environments by setting relevant properties in:

application-dev.properties

application-debug.properties

application-prod.properties

For a production environment we would definitely want to move our AWS keys to environment variables. If we do that, we can fetch any environment variable with:

System.getenv("AWS_KEY_NAME")
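
As a minimal sketch of that idea, the helper below reads both keys from a map of environment variables and fails fast when one is missing. The class and method names are hypothetical; AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the variable names the AWS tools conventionally use, but any names would work here.

```java
import java.util.Map;

// Hypothetical helper: reads AWS keys from environment variables instead of application.properties.
final class EnvCredentials {

    static final String ACCESS_KEY_VAR = "AWS_ACCESS_KEY_ID";
    static final String SECRET_KEY_VAR = "AWS_SECRET_ACCESS_KEY";

    private EnvCredentials() {
    }

    // Returns the value of the variable, or fails with a clear message if it is unset or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    // Returns {accessKey, secretKey}; pass System.getenv() in real code.
    static String[] fromEnvironment(Map<String, String> env) {
        return new String[] { require(env, ACCESS_KEY_VAR), require(env, SECRET_KEY_VAR) };
    }
}
```

In the configuration class, the two values returned by EnvCredentials.fromEnvironment(System.getenv()) could then be passed to BasicAWSCredentials in place of the @Value fields. Taking the map as a parameter keeps the helper easy to test.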

That’s all for the configuration part.

2.3 Models

To kick things off, let's create a model that covers a wide range of use-cases. Here we will take the example from the official Amazon documentation

[https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html] with a few modifications to cover some more practical scenarios.

Use-cases for a model that I could think of are the following:

  1. Containing a composite key consisting of a hash-key and a range-key
  2. Containing a number of non-key attributes
  3. Containing a simple sub-document
  4. Containing a sub-document with its own key (representing a one to one relationship)
  5. Containing Index on a non-key attribute
  6. Containing an array of Objects (representing a one to many relationship)

So here is our model Music:

@DynamoDBTable(tableName = "Music")
@Data
@NoArgsConstructor
public class Music {
 
    // Spring Data requires a dedicated entity representing the key
    @Id
    // Suppress the Lombok getter and setter for 'id'; otherwise a mapping exception is thrown.
    // We don't want id to be a separate column in the table. Instead, artist and songTitle
    // are columns in the table, and DynamoDB understands that the key is composed of
    // both attributes.
    @Getter(AccessLevel.NONE)
    @Setter(AccessLevel.NONE)
    // ---------------------
    public MusicCompositeKey id;

    @DynamoDBAttribute
    private String genre;

    @DynamoDBAttribute
    private String albumTitle;

    @DynamoDBAttribute
    private List<Review> reviews;

    @DynamoDBAttribute
    private MiscellaneousInformation otherinfo;

    @DynamoDBAttribute
    // We need an index on year to be able to query by it
    @DynamoDBIndexHashKey(attributeName = "year", globalSecondaryIndexName = "year-index")
    private Integer year;
 
    public Music(String artist, String songTitle, String genre, String albumTitle, Integer year, List<Review> reviews, MiscellaneousInformation otherinfo) {
    	this.genre = genre;
    	this.albumTitle = albumTitle;
    	this.year = year;
    	this.reviews = reviews;
    	this.otherinfo = otherinfo;
    	this.id = new MusicCompositeKey();
        this.id.setArtist(artist);
        this.id.setSongTitle(songTitle);
	}
 
	@DynamoDBHashKey
    @DynamoDBAutoGeneratedKey
	// This allows creating of artist in the Music table instead of inside the attribute Id
	public String getArtist() {
    	return this.id==null ? null : this.id.getArtist();
	}
 
	public void setArtist(String artist) {
    	if(this.id == null) {
        	this.id = new MusicCompositeKey();
    	}
        this.id.setArtist(artist);
	}
 
	@DynamoDBRangeKey
    // This allows creating of songTitle in the Music table instead of inside the attribute Id
	public String getSongTitle() {
    	return this.id==null ? null : this.id.getSongTitle();
	}
 
	public void setSongTitle(String songTitle) {
    	if(this.id == null) {
        	this.id = new MusicCompositeKey();
    	}
        this.id.setSongTitle(songTitle);
	}
 
    // Create an index on Miscellaneous.quality to be able to query on a sub-document
    @DynamoDBIndexRangeKey(localSecondaryIndexName = "quality-index")
	public String getQuality() {
    	if(otherinfo != null) {
        	return otherinfo.getQuality();
    	}
    	return null;
	}
 
	public void setQuality(String quality) {
    	if(otherinfo != null) {
            otherinfo.quality = quality;
        	return;
    	}
    	otherinfo =  new MiscellaneousInformation(quality);
	}
	// ---------------------------------------------------------------------------------
 
}

 

The composite key has to be defined separately:

@Data
@AllArgsConstructor
@NoArgsConstructor
public class MusicCompositeKey implements Serializable {

    @DynamoDBHashKey
    @DynamoDBAutoGeneratedKey
    private String artist;

    @DynamoDBRangeKey
    private String songTitle;

}

 

And here is the sub-document class:

@DynamoDBDocument
@Data
@NoArgsConstructor
@AllArgsConstructor
public class MiscellaneousInformation {
 
	String quality;
 
}

 

At this point it may not be clear why some of the functions in the above classes are there. We will go through the reason for each of them in the upcoming sections.

With one item in it, our Music table looks like this:

It is interesting to note that there is no attribute named “id” in the table (please see the image above). The composite key in the Music class exists as two columns in the table (artist and songTitle). This is because we put @Getter(AccessLevel.NONE) and @Setter(AccessLevel.NONE) on the id attribute of the Music class. We can’t keep our keys inside an id object in the table; they have to be exposed as individual properties so that they can be queried. When we get items from the database, we manually initialize the id attribute in our Music object to keep things consistent.

2.4 Repository

In order to perform operations on a DynamoDB table we need to create a repository for it. You can also perform these operations with the DynamoDBMapper class, but using a repository is a much cleaner and more generic way to achieve this.

Our repository class looks like this:

import java.util.List;

import org.socialsignin.spring.data.dynamodb.repository.EnableScan;
import org.socialsignin.spring.data.dynamodb.repository.Query;
import org.springframework.data.repository.CrudRepository;

@EnableScan
public interface MusicRespository extends CrudRepository<Music, MusicCompositeKey> {

    List<Music> findByArtist(String artist);

    List<Music> findBySongTitle(String songTitle);

    // Note: If projections are used on Global Secondary Indexes, the index must contain the desired fields in the first place
    @Query(fields = "artist, songTitle")
    List<Music> findByYear(Integer year);

    List<Music> findByQuality(String quality);

    List<Music> findByGenre(String genre);

    // Note: Order by can be done on one of the attributes of the same index. For example, we wouldn't be able
    // to order by 'year' when finding by artist because our index only contains 'artist' and 'songTitle'
    List<Music> findByArtistOrderBySongTitleDesc(String artist);

}

 

3) Testing

To test our database, we will write JUnit tests. These tests will confirm that everything is working fine and will demonstrate how to perform different actions on the database, e.g. searching with an index, using projection, sorting, and querying a sub-document.

Here is the test class:

 

@RunWith(SpringRunner.class)
@SpringBootTest(classes = {PropertyPlaceholderAutoConfiguration.class, DemoApplication.class})
public class DemoApplicationTests {
 
	private static final Logger log = LoggerFactory.getLogger(DemoApplicationTests.class);
 
	@Autowired
	private AmazonDynamoDB amazonDynamoDB;
 
	@Autowired
	private DynamoDBMapper mapper;
 
	@Autowired
	private MusicRespository musicRespository;
 
	@Before
	public void init() throws Exception {
    	CreateTableRequest ctr = mapper.generateCreateTableRequest(Music.class);
    	final ProvisionedThroughput provisionedThroughput = new ProvisionedThroughput(5L, 5L);
    	ctr.setProvisionedThroughput(provisionedThroughput);
        ctr.getGlobalSecondaryIndexes().forEach(v -> v.setProvisionedThroughput(provisionedThroughput));
    	Boolean tableWasCreatedForTest = TableUtils.createTableIfNotExists(amazonDynamoDB, ctr);
    	if (tableWasCreatedForTest) {
            log.info("Created table {}", ctr.getTableName());
    	}
        TableUtils.waitUntilActive(amazonDynamoDB, ctr.getTableName());
    	log.info("Table {} is active", ctr.getTableName());
	}
 
	@After
	public void destroy() throws Exception {
        // Only attempt the delete when the Music table actually exists
        if (amazonDynamoDB.listTables(new ListTablesRequest()).getTableNames().contains("Music")) {
            DeleteTableRequest dtr = mapper.generateDeleteTableRequest(Music.class);
        	TableUtils.deleteTableIfExists(amazonDynamoDB, dtr);
            log.info("Deleted table {}", dtr.getTableName());
    	}
	}
 
	@Test
	public void test1_contextLoads() {
	}
 
	@Test
	public void test2_tableExists() {
        if(amazonDynamoDB.listTables(new ListTablesRequest()).getTableNames().indexOf("Music") < 0) {
        	fail();
    	}
	}
 
	@Test
	public void test3_CRUD() {
 
        // Insertion test: insert two records and check the count in the database
        List<Review> reviews = new ArrayList<>();
        reviews.add(new Review("Really good song", 4.5f));
        reviews.add(new Review("Excellent", 4.7f));

        Music song1 = new Music("No one you know", "My Dog Spot", "Country", "Hey Now", 1984, reviews, new MiscellaneousInformation("Good"));
        musicRespository.save(song1);
        Music song2 = new Music("No one you know", "Somewhere Down The Road", "Country", "Somewhat Famous", 1985, reviews, new MiscellaneousInformation("Excellent"));
        musicRespository.save(song2);
        Assert.assertEquals(2, musicRespository.count());
        log.info("Insertion test successful");
 
        // Get All
        Iterable<Music> result = musicRespository.findAll();
        Assert.assertEquals(2, Lists.newArrayList(result).size());
        Assert.assertThat(result, hasItem(song1));
        Assert.assertThat(result, hasItem(song2));
        log.info("Get All test successful");

        // Query by hash key
        List<Music> hashKeySearchResult = musicRespository.findByArtist("No one you know");
        Assert.assertEquals(2, hashKeySearchResult.size());
        log.info("Query by HashKey test successful");

        // Query by hash key and order by songTitle
        List<Music> hashKeySearchResultOrdered = musicRespository.findByArtistOrderBySongTitleDesc("No one you know");
        // When sorted in descending order, song1 will be in second position
        Assert.assertEquals(song1, hashKeySearchResultOrdered.get(1));
        log.info("Query by HashKey with ordering test successful");

        // Query by range key
        List<Music> rangeKeySearchResult = musicRespository.findBySongTitle("My Dog Spot");
        Assert.assertEquals(1, rangeKeySearchResult.size());
        Assert.assertThat(rangeKeySearchResult, hasItem(song1));
        log.info("Query by range key test successful");

        // Query by Id
        Optional<Music> queryByIdResult = musicRespository.findById(new MusicCompositeKey("No one you know", "Somewhere Down The Road"));
        Assert.assertTrue(queryByIdResult.isPresent());
        Assert.assertNotNull(queryByIdResult.get());
        Assert.assertEquals(song2, queryByIdResult.get());
        log.info("Query by ID test successful");
 
        // Query by a non-key attribute by making an index on that attribute, and apply projection
        List<Music> queryUsingIndexResult = musicRespository.findByYear(1984);
        Assert.assertEquals(1, queryUsingIndexResult.size());
        Assert.assertEquals(song1.id, queryUsingIndexResult.get(0).id);
        // Since we have used projection for getting 'artist' and 'songTitle', we expect all other attributes to be null
        Assert.assertNull(queryUsingIndexResult.get(0).getYear());
        log.info("Query using Index and Projection test successful");

        // Query by a secondary index on an attribute in a sub-document
        List<Music> queryUsingSecondaryIndexResult = musicRespository.findByQuality("Good");
        Assert.assertEquals(1, queryUsingSecondaryIndexResult.size());
        Assert.assertThat(queryUsingSecondaryIndexResult, hasItem(song1));
        log.info("Query using Secondary Index test successful");

        // Scan: searching using an attribute that is neither a partition key nor a range key and has no index on it.
        // Note: findByYear is backed by year-index; a finder on an unindexed attribute such as
        // findByGenre is what actually requires a scan.
        List<Music> scanResult = musicRespository.findByYear(1984);
        Assert.assertEquals(1, scanResult.size());
        Assert.assertEquals(song1.id, scanResult.get(0).id);
        log.info("Scan test successful");
 
        // Update: get an existing song and update one of its attributes
        Music songMyDogSpot = scanResult.get(0);
        songMyDogSpot.setAlbumTitle("Different Title");
        musicRespository.save(songMyDogSpot);
        Assert.assertTrue(musicRespository.findById(songMyDogSpot.id).isPresent());
        Assert.assertEquals("Different Title", musicRespository.findById(songMyDogSpot.id).get().getAlbumTitle());
        log.info("Update test successful");

        // Delete: delete an existing song
        musicRespository.delete(songMyDogSpot);
        Assert.assertEquals(1, musicRespository.count());
        log.info("Delete test successful");
 
    	log.info("All CRUD tests successful");
 
	}
 
}

 

Note the methods annotated with @Before and @After. These annotations are part of the JUnit testing framework. The methods create the DynamoDB table before each test and destroy it after each test, respectively.

To create a DynamoDB table we need a create-table request built from the model for our table (Music). Another very important requirement is the throughput.

When the table is created, the associated indexes are created too; however, we have to specify provisioned throughput for each of them. An index is effectively a table itself, with a number of columns and its own keys, so an index has its own cost. We can control the projected columns by specifying projection properties while creating the index.

// 'index' is one of the GlobalSecondaryIndex objects from the create-table request
index.withProjection(new Projection().withProjectionType(ProjectionType.KEYS_ONLY));
             	

The projection type can be “ALL”, “KEYS_ONLY” or “INCLUDE”. In the case of “INCLUDE”, the projection's nonKeyAttributes property specifies the list of non-key attributes to include in the index.
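
To make the three projection types concrete, here is a small sketch in plain Java that mimics which attributes of an item end up in an index. It does not use the AWS SDK; the class, enum and method names are hypothetical and exist only for illustration. For year-index on the Music table, the key attributes would be year plus the table keys artist and songTitle.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustration only: mimics which attributes each projection type copies into an index.
final class IndexProjectionSketch {

    enum ProjectionType { ALL, KEYS_ONLY, INCLUDE }

    private IndexProjectionSketch() {
    }

    static Map<String, Object> project(Map<String, Object> item,
                                       Set<String> keyAttributes,
                                       ProjectionType type,
                                       Set<String> nonKeyAttributes) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : item.entrySet()) {
            String name = entry.getKey();
            boolean keep = type == ProjectionType.ALL                // every attribute
                    || keyAttributes.contains(name)                  // keys are always projected
                    || (type == ProjectionType.INCLUDE && nonKeyAttributes.contains(name));
            if (keep) {
                projected.put(name, entry.getValue());
            }
        }
        return projected;
    }
}
```

With KEYS_ONLY, an item holding artist, songTitle, year, genre and albumTitle is reduced to its three key attributes; INCLUDE with genre adds that one attribute back. This is also why a projection in a repository query can only return fields that the index actually contains.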

3a) Throughput in DynamoDB

As discussed earlier, DynamoDB billing is based mainly on the throughput you opt for rather than on storage. Throughput can either be fixed (provisioned) or scale with your needs. DynamoDB uses read and write capacity units to compute the cost of your data transfers: the more read/write units consumed, the higher the cost. One read unit covers one strongly consistent read per second of an item up to 4 KB, and one write unit covers one write per second of an item up to 1 KB. If the item read is larger than 4 KB, additional read units are used; likewise, writing an item larger than 1 KB consumes more write units. An eventually consistent read costs half as much as a strongly consistent one, and transactional reads and writes cost twice as much as standard ones. For local development you can set read/write throughput to whatever you like, but you should pay attention to these values when going to deployment.
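
The arithmetic is easy to get wrong, so here is a small sketch of how capacity units add up for a single request. The class and method names are hypothetical, and the sketch assumes 1 KB = 1024 bytes and the usual rules: sizes are rounded up to the next 4 KB block for reads and the next 1 KB block for writes.

```java
// Sketch of DynamoDB capacity-unit arithmetic for one request (names are illustrative).
final class CapacityUnits {

    private static final long READ_BLOCK_BYTES = 4 * 1024;
    private static final long WRITE_BLOCK_BYTES = 1024;

    private CapacityUnits() {
    }

    // Rounds the item size up to whole blocks; even a tiny item costs one block.
    private static long blocks(long itemSizeBytes, long blockSizeBytes) {
        if (itemSizeBytes <= 0) {
            return 1;
        }
        return (itemSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    // One read unit per 4 KB block.
    static double stronglyConsistentRead(long itemSizeBytes) {
        return blocks(itemSizeBytes, READ_BLOCK_BYTES);
    }

    // Half the cost of a strongly consistent read.
    static double eventuallyConsistentRead(long itemSizeBytes) {
        return blocks(itemSizeBytes, READ_BLOCK_BYTES) * 0.5;
    }

    // Twice the cost of a strongly consistent read.
    static double transactionalRead(long itemSizeBytes) {
        return blocks(itemSizeBytes, READ_BLOCK_BYTES) * 2.0;
    }

    // One write unit per 1 KB block.
    static long standardWrite(long itemSizeBytes) {
        return blocks(itemSizeBytes, WRITE_BLOCK_BYTES);
    }

    // Twice the cost of a standard write.
    static long transactionalWrite(long itemSizeBytes) {
        return blocks(itemSizeBytes, WRITE_BLOCK_BYTES) * 2;
    }
}
```

For a 6 KB item this gives 2 read units for a strongly consistent read, 1 for an eventually consistent read and 4 for a transactional read. Writing a 1,500-byte item takes 2 write units, or 4 inside a transaction.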

For more info about throughput visit:

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html

Please note that MusicRespository (a bean provided by Spring Data), DynamoDBMapper (from DynamoDBConfig) and AmazonDynamoDB (from DynamoDBConfig) are @Autowired for later use.

3.1 Test1

Test1 simply checks whether the Spring Boot context has loaded properly or not.

3.2 Test2

Test2 checks the existence of the Music table

3.3 Test3

Test3 is a sequence of sub-tests executed in order.

3.3.1 Get All

CrudRepository provides a built-in findAll method that returns all records, so this one is simple.

3.3.2 Query by Hash Key

We want to query by the Music table’s hash key, but in the Music model class the hash key lives inside the composite key, which is itself an attribute. To expose this nested attribute to Spring Data, we created the getArtist function in the Music class, annotated it with @DynamoDBHashKey, and added findByArtist to the repository.

3.3.3 Sorting Data using OrderBy Clause

Query results can be sorted on the range key of the table or of the index being queried. Music.id.songTitle is already a key and part of the primary index, so we can sort our result items on songTitle. Again, we have to expose the songTitle sort key to Spring Data by creating the getSongTitle function in the main model class Music.

In our repository we need findByArtistOrderBySongTitleDesc to achieve a descending sorting of result on SongTitle.

3.3.4 Query by range-key (SongTitle)

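Querying by the range key requires the same setup as querying by the hash key (artist): a getter in the Music class annotated with @DynamoDBRangeKey and a findBySongTitle method in the repository. Note that DynamoDB's Query operation always needs the hash key, so a lookup by range key alone is served by a scan rather than a query.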

3.3.5 Query by Id

As discussed in section 2.3, id is not a table column, but the library understands that we have a composite key thanks to the @Id annotation in the Music class. Thus we are able to query by id as a whole. We don’t need to write a function for this because CrudRepository already has a findById function.

3.3.6 Query using an Index and applying projection

To query by an attribute that is not a key (for example, year), we need an index on that attribute. You can see in the Music class that we have an index on “year”.

To apply projection, we need the @Query annotation on the repository method, with its fields parameter set to a string of comma-separated names of the required attributes. Attributes required in the projection must be part of the same index or be keys of the table.

3.3.7 Query by a sub-document attribute

The subdocument MiscellaneousInformation has attribute “quality”. To be able to search on “quality” we create a getter getQuality in the main class Music and create a secondary index on it.

3.3.8 Scanning

A DynamoDB query is really fast, but it requires the attribute to be a key or to have an index on it. If neither is the case, DynamoDB must scan the whole table to find the items. To enable scans, we have to annotate the repository with @EnableScan.
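
The difference between a query and a scan can be illustrated with a toy in-memory table. This is only a sketch of the idea, not how DynamoDB works internally; the class, its methods and the itemsExamined counter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a table with a hash key (artist) and a range key (songTitle).
final class QueryVsScanSketch {

    // hash key -> items of that partition, kept sorted by range key
    private final Map<String, TreeMap<String, Map<String, String>>> partitions = new HashMap<>();

    // Number of items the last operation had to look at
    int itemsExamined;

    void put(String artist, String songTitle, String genre) {
        Map<String, String> item = new HashMap<>();
        item.put("artist", artist);
        item.put("songTitle", songTitle);
        item.put("genre", genre);
        partitions.computeIfAbsent(artist, k -> new TreeMap<>()).put(songTitle, item);
    }

    // Query: jump straight to one partition; results come back ordered by range key.
    List<Map<String, String>> queryByArtist(String artist, boolean descending) {
        TreeMap<String, Map<String, String>> partition = partitions.get(artist);
        List<Map<String, String>> result = new ArrayList<>();
        if (partition != null) {
            result.addAll(descending ? partition.descendingMap().values() : partition.values());
        }
        itemsExamined = result.size();
        return result;
    }

    // Scan: visit every item in every partition and filter afterwards.
    List<Map<String, String>> scanByGenre(String genre) {
        List<Map<String, String>> result = new ArrayList<>();
        itemsExamined = 0;
        for (TreeMap<String, Map<String, String>> partition : partitions.values()) {
            for (Map<String, String> item : partition.values()) {
                itemsExamined++;
                if (genre.equals(item.get("genre"))) {
                    result.add(item);
                }
            }
        }
        return result;
    }
}
```

A query touches only the items of one partition and gets ordering by range key for free, which mirrors findByArtistOrderBySongTitleDesc. A scan touches every item in the table regardless of how many match, which is why scans get slower and more expensive as the table grows.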

3.3.9 Update and delete

Updating and deleting are simple. The id attribute is required in both cases. CrudRepository’s save() function creates a new record if no record with that id exists; otherwise, it updates the existing record.
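
The create-or-update behaviour of save() can be pictured as a map keyed by the composite key. The sketch below is plain Java with hypothetical names, not the repository implementation; it only shows why the key class needs value-based equality, which Lombok's @Data provides for MusicCompositeKey.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustration of upsert semantics keyed by a composite (artist, songTitle) key.
final class UpsertSketch {

    static final class Key {
        final String artist;
        final String songTitle;

        Key(String artist, String songTitle) {
            this.artist = artist;
            this.songTitle = songTitle;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return Objects.equals(artist, other.artist) && Objects.equals(songTitle, other.songTitle);
        }

        @Override
        public int hashCode() {
            return Objects.hash(artist, songTitle);
        }
    }

    private final Map<Key, String> albumTitleByKey = new HashMap<>();

    // Returns true when a new record was created, false when an existing one was overwritten.
    boolean save(Key key, String albumTitle) {
        boolean created = !albumTitleByKey.containsKey(key);
        albumTitleByKey.put(key, albumTitle);
        return created;
    }

    String findAlbumTitle(Key key) {
        return albumTitleByKey.get(key);
    }

    int count() {
        return albumTitleByKey.size();
    }
}
```

Saving twice with an equal key leaves a single record holding the latest value, which matches what the update step of the test relies on.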

4) Deployment

As discussed before, to switch to deployment you just need to change the properties in the application.properties file.

When opting for the DynamoDB service from Amazon, a number of factors gain significant importance. These factors are discussed below.

  1. Every index has a billing cost associated with it, so design your table schema with the minimum number of indexes that still achieves the desired results.
  2. As discussed in section 3a, DynamoDB billing is based on throughput. There are two options:
     a. You can set provisioned throughput based on your needs, and your billed amount will be constant. This suits cases where you know the size of your user base and expect only gradual increases over time.
     b. You can opt for on-demand usage, which makes the billing vary with usage. This option is best when you expect a drastic increase in database usage over time, for example when your user base is growing exponentially.
  3. While creating tables we can also set up Amazon CloudWatch alarms, with notifications delivered through Amazon Simple Notification Service (SNS), for events on your DynamoDB tables, for example when your usage reaches a certain limit.

5) Conclusion

Now you know the essentials for building Spring Boot applications with DynamoDB. If you want to view the complete code, or if you still have questions, you can simply create an issue in the GitHub repo.

https://github.com/sferhan/spring-boot-dynamodb-examples

 

 
