Last time we created our Rust and Typescript Lambdas with basic hello-world implementations and did a quick performance comparison. Now we'll expand those Lambdas into ones that take data from SQS messages and push it into DynamoDB. While we're at it, we'll compare the performance of the Rust and Typescript versions and see which is more up to the task.
From last time, here is our template.yml with our two Lambdas:
AWSTemplateFormatVersion: "2010-09-09"
Transform:
  - "AWS::Serverless-2016-10-31"
Resources:
  BlasterLambdaTS:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: index.handler
      Runtime: nodejs14.x
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
    Metadata:
      BuildMethod: makefile
  BlasterLambdaRust:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: none
      Runtime: provided.al2
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
    Metadata:
      BuildMethod: makefile
Let's add a DynamoDB table to template.yml:
...
  DynamoDBTable:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
This creates our table with 'id' as the hash key of the primary index, with a type of string (S). This tells DynamoDB that each item will have an 'id' field that will be used to reference it. Each item will contain several other fields, but we don't need to list them in AttributeDefinitions because they will not be used as key fields. If we were to add secondary indexes, we would need to declare the key fields for those indexes there as well.
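For example, an item in this table might look like the following (illustrative only; the a-d fields are the payload we'll use later, and only 'id' needs to be declared up front):
{
  "id": "example-1",
  "a": 1.2,
  "b": 2.3,
  "c": 3.4,
  "d": 4.5
}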
We're also setting the BillingMode to PAY_PER_REQUEST because our workloads will be spiky and inconsistent, and we don't want to pay for throughput we don't use. If we had steady, predictable workloads, we would want to use PROVISIONED.
Now we'll add SQS Queues for each Lambda to receive messages from:
...
  TSQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 180
  RustQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 180
We're setting the VisibilityTimeout to 180 seconds because our Lambdas have a 30-second timeout and AWS's guidance is to set the source queue's visibility timeout to at least six times the function timeout (6 x 30 = 180). We'll start there and likely tune it later.
We now need to tie everything together. We'll add SQS event source configurations to the Lambdas:
...
  BlasterLambdaTS:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: index.handler
      Runtime: nodejs14.x
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt TSQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
  BlasterLambdaRust:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: none
      Runtime: provided.al2
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RustQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
...
This configures our Lambdas to be triggered automatically when messages arrive on their respective queues. Since we are configuring the BatchSize as 1, each time a Lambda is invoked it will receive an event with 1 message inside of it. Now we need to give Lambda permission to access our DynamoDB table. Following the security best practice of the principle of least privilege (POLP), we give our Lambdas write access to our specific table only. AWS SAM provides some nice policy templates, which are much less verbose than writing out full IAM policies.
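To get a sense of what the DynamoDBWritePolicy template saves us from writing, it expands to something roughly like the following inline statement (an illustrative sketch, not the exact expansion; the real template covers a few additional write actions):
Policies:
  - Statement:
      - Effect: Allow
        Action:
          - dynamodb:PutItem
          - dynamodb:UpdateItem
          - dynamodb:DeleteItem
          - dynamodb:BatchWriteItem
        Resource: !GetAtt DynamoDBTable.Arn
Here are our updated function definitions: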
...
  BlasterLambdaTS:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: index.handler
      Runtime: nodejs14.x
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt TSQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
  BlasterLambdaRust:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: none
      Runtime: provided.al2
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RustQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
...
Note that we already have AWSLambdaBasicExecutionRole configured. This gives our Lambdas permission to upload logs to CloudWatch, so we can view our logs there.
We'll also need to tell our Lambdas which DDB table to write to; a convenient way to do this is with environment variables:
...
  BlasterLambdaTS:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: index.handler
      Runtime: nodejs14.x
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Environment:
        Variables:
          TABLE_NAME: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt TSQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
  BlasterLambdaRust:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64
      Handler: none
      Runtime: provided.al2
      CodeUri: .
      Timeout: 30
      MemorySize: 512
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Environment:
        Variables:
          TABLE_NAME: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RustQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
...
Now our Lambda configurations are starting to get large. There's a lot of duplication there; fortunately, we can use the Globals section to refactor a lot of the common configuration into a single place. (Note that not every property can be moved to Globals: Policies and Events, for example, must stay on each function.)
...
Globals:
  Function:
    Architectures:
      - arm64
    CodeUri: .
    Timeout: 30
    MemorySize: 512
    Environment:
      Variables:
        TABLE_NAME: !Ref DynamoDBTable
Resources:
  BlasterLambdaTS:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs14.x
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt TSQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
  BlasterLambdaRust:
    Type: AWS::Serverless::Function
    Properties:
      Handler: none
      Runtime: provided.al2
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RustQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
...
To make it easy to test with the AWS CLI without having to dig around in the AWS console, I'll add Outputs for the queue URLs and the table name:
...
Outputs:
  TSQueueUrl:
    Value: !Ref TSQueue
  RustQueueUrl:
    Value: !Ref RustQueue
  TableName:
    Value: !Ref DynamoDBTable
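Once the stack is deployed, we can pull these outputs straight from the CLI instead of digging through the console:
>> aws cloudformation describe-stacks --stack-name sqs-to-ddb --query "Stacks[0].Outputs"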
Our full template now looks like this:
AWSTemplateFormatVersion: "2010-09-09"
Transform:
  - "AWS::Serverless-2016-10-31"
Globals:
  Function:
    Architectures:
      - arm64
    CodeUri: .
    Timeout: 30
    MemorySize: 512
    Environment:
      Variables:
        TABLE_NAME: !Ref DynamoDBTable
Resources:
  BlasterLambdaTS:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs14.x
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt TSQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
  BlasterLambdaRust:
    Type: AWS::Serverless::Function
    Properties:
      Handler: none
      Runtime: provided.al2
      Policies:
        - AWSLambdaBasicExecutionRole
        - DynamoDBWritePolicy:
            TableName: !Ref DynamoDBTable
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RustQueue.Arn
            BatchSize: 1
    Metadata:
      BuildMethod: makefile
  DynamoDBTable:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
  TSQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 180
  RustQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 180
Outputs:
  TSQueueUrl:
    Value: !Ref TSQueue
  RustQueueUrl:
    Value: !Ref RustQueue
  TableName:
    Value: !Ref DynamoDBTable
Now we can test deployment:
>> sam build
>> sam deploy --stack-name sqs-to-ddb --s3-bucket larrys-cool-bucket --capabilities CAPABILITY_IAM
That succeeded, so now we can try sending a message to each queue:
>> aws sqs send-message --queue-url <<rust-queue-url>> --message-body "Hello Q"
>> aws sqs send-message --queue-url <<ts-queue-url>> --message-body "Hello Q"
Logging into the AWS Lambda console, finding the TS Lambda, clicking "View Logs in CloudWatch" in the "Monitor" tab, and viewing the latest log stream, we see the event logged:
2022-01-15T17:11:52.493Z 79da49e5-701f-5348-900b-13b872940d7a INFO Hello Event:
{
  "Records": [
    {
      "messageId": "c0110c84-0130-44c2-8199-78f8912896a1",
      "receiptHandle": ...,
      "body": "Hello Q",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1642266711948",
        "SenderId": ...,
        "ApproximateFirstReceiveTimestamp": "1642266711949"
      },
      "messageAttributes": {},
      "md5OfBody": "50eba39d724e8bd654ade06019dbd7fc",
      "eventSource": "aws:sqs",
      "eventSourceARN": "...",
      "awsRegion": "us-east-1"
    }
  ]
}
We'll see a similar log for the Rust Lambda, so our SQS event source configurations appear to be working. Now we need to add code to push data into our DynamoDB table. We'll start out by simply parsing some JSON from the message body of the SQS event and pushing it into our table.
We need to add the DynamoDB SDK to each Lambda handler and write the code to put the data into our table.
Rust
Adding the SDK to our Cargo.toml, along with serde (for deriving Deserialize) and aws_lambda_events (for the SqsEvent type), both of which our handler imports:
[package]
name = "sqs_to_ddb"
version = "0.1.0"
edition = "2021"

[dependencies]
lambda_runtime = "0.4.1"
tokio = { version = "1.0", features = ["full"] }
serde = { version = "1", features = ["derive"] } # needed for #[derive(Deserialize)]
serde_json = "^1"
aws-config = "0.4.0"
aws-sdk-dynamodb = "0.4.0"
aws_lambda_events = "0.5" # provides the SqsEvent type; version assumed
Now we add a representation of our data and add code to read items from the SQS records and push them into our table (src/bin/blaster_handler.rs):
use std::env;

use lambda_runtime::{handler_fn, Context, Error as LambdaError};
use serde::Deserialize;
use aws_lambda_events::event::sqs::SqsEvent;
use aws_sdk_dynamodb::Client;
use aws_sdk_dynamodb::model::AttributeValue;

#[derive(Deserialize)]
struct Data {
    id: String,
    a: f64,
    b: f64,
    c: f64,
    d: f64
}

#[tokio::main]
async fn main() -> Result<(), LambdaError> {
    let func = handler_fn(func);
    lambda_runtime::run(func).await?;
    Ok(())
}

async fn func(event: SqsEvent, _: Context) -> Result<(), LambdaError> {
    // Parse each SQS record body into our Data struct.
    let items: Vec<Data> = event.records.iter()
        .map(|record| serde_json::from_str::<Data>(record.body.as_ref().unwrap()).unwrap())
        .collect();
    let shared_config = aws_config::load_from_env().await;
    let client = Client::new(&shared_config);
    let table_name = &env::var("TABLE_NAME")?;
    // Put each item into the table, sending numbers as the DynamoDB N type.
    for item in items {
        client.put_item()
            .table_name(table_name)
            .item("id", AttributeValue::S(item.id))
            .item("a", AttributeValue::N(item.a.to_string()))
            .item("b", AttributeValue::N(item.b.to_string()))
            .item("c", AttributeValue::N(item.c.to_string()))
            .item("d", AttributeValue::N(item.d.to_string()))
            .send().await.unwrap();
    }
    Ok(())
}
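As a quick sanity check on the parsing logic, we could drop a unit test into the same file (a minimal sketch, not part of the original handler; cargo test will run it without a deploy):

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_message_body() {
        // Same shape as the SQS message bodies we'll send later.
        let body = r#"{"id":"test-1","a":1.2,"b":2.3,"c":3.4,"d":4.5}"#;
        let data: Data = serde_json::from_str(body).unwrap();
        assert_eq!(data.id, "test-1");
        assert_eq!(data.a, 1.2);
    }
}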
Typescript
Adding the DynamoDB SDK V3:
>> npm install @aws-sdk/client-dynamodb
And adding code corresponding to our Rust Lambda:
import { SQSEvent, SQSHandler } from "aws-lambda";
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";

type Data = {
  id: string,
  a: number,
  b: number,
  c: number,
  d: number
}

function parseData(json: string): Data {
  return JSON.parse(json);
}

export const handler: SQSHandler = async (event: SQSEvent) => {
  const client = new DynamoDBClient({});
  // Parse each SQS record body into our Data type.
  const items = event.Records.map(record => record.body).map(parseData);
  const tableName = process.env.TABLE_NAME;
  // Put each item into the table, sending numbers as the DynamoDB N type.
  for (const item of items) {
    await client.send(new PutItemCommand({
      TableName: tableName,
      Item: {
        id: {S: item.id},
        a: {N: item.a.toString()},
        b: {N: item.b.toString()},
        c: {N: item.c.toString()},
        d: {N: item.d.toString()}
      }
    }));
  }
}
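One note on parity: both versions construct the DynamoDB client inside the handler, so it is re-created on every invocation. Hoisting the client out of the handler is a common optimization, but we'll keep the two versions symmetrical so the comparison stays fair.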
The Rust version seems a lot more verbose than the Typescript version; it might be interesting to compare the file sizes:
>> wc -c src/bin/blaster_handler.rs
1305 src/bin/blaster_handler.rs
>> wc -c src/ts/index.ts
762 src/ts/index.ts
So the Rust version is about 70% larger. Granted, the difference in verbosity could be a symptom of the particular DDB SDKs used, rather than the expressiveness of the languages themselves, but my experience with each suggests the difference is fairly typical.
Let's build and compare the build times.
>> sam build --debug
...
2022-01-15 14:56:07,449 | executing Make: ['make', '--makefile', '/Users/larry/Documents/code/sqs_to_ddb/Makefile', 'build-BlasterLambdaTS']
2022-01-15 14:56:12,201 | CustomMakeBuilder:MakeBuild succeeded
...
2022-01-15 14:56:18,032 | executing Make: ['make', '--makefile', '/Users/larry/Documents/code/sqs_to_ddb/Makefile', 'build-BlasterLambdaRust']
2022-01-15 14:56:49,067 | CustomMakeBuilder:MakeBuild succeeded
So it looks like the Typescript build took only about 5 seconds, including TS and Webpack compilation, while the Rust build took about 30 seconds: roughly a 6X longer build time. Let's deploy and see whether Rust's additional verbosity and much longer build time are worth it:
>> sam deploy --stack-name sqs-to-ddb --s3-bucket larrys-cool-bucket --capabilities CAPABILITY_IAM
Now clicking the "Test" in the Lambda console and testing the Typescript Lambda with the following event:
{
  "Records": [
    {
      "body": "{\"id\":\"ts-1\",\"a\":1.2,\"b\":2.3,\"c\":3.4,\"d\":4.5}"
    }
  ]
}
And the Rust Lambda with this event:
{
  "Records": [
    {
      "body": "{\"id\":\"rust-1\",\"a\":1.2,\"b\":2.3,\"c\":3.4,\"d\":4.5}"
    }
  ]
}
The durations look like this:
Rust:
REPORT RequestId: 409a4379-c316-4543-ba82-f01c79e81d8b Duration: 122.97 ms Billed Duration: 158 ms Memory Size: 512 MB Max Memory Used: 23 MB Init Duration: 34.91 ms
Typescript:
REPORT RequestId: 57ee4cd3-129c-4ff1-b4ba-a8ee5967317b Duration: 162.53 ms Billed Duration: 163 ms Memory Size: 512 MB Max Memory Used: 62 MB Init Duration: 240.25 ms
So the Typescript version took about 7X longer to initialize; after that, it took about 33% longer and used nearly 3X as much memory. Let's check our table and make sure the data made it in:
>> aws dynamodb scan --table-name <<table name>>
{
  "Items": [
    {
      "a": {
        "N": "1.2"
      },
      "b": {
        "N": "2.3"
      },
      "c": {
        "N": "3.4"
      },
      "d": {
        "N": "4.5"
      },
      "id": {
        "S": "rust-1"
      }
    },
    {
      "a": {
        "N": "1.2"
      },
      "b": {
        "N": "2.3"
      },
      "c": {
        "N": "3.4"
      },
      "d": {
        "N": "4.5"
      },
      "id": {
        "S": "ts-1"
      }
    }
  ],
  "Count": 2,
  "ScannedCount": 2,
  "ConsumedCapacity": null
}
Looks good. Now let's blast a bunch of messages into the queues and see what the average durations are. First, we'll move the model into src/bin/model/mod.rs:
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Serialize)]
pub struct Data {
    pub id: String,
    pub a: f64,
    pub b: f64,
    pub c: f64,
    pub d: f64
}
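With the model in its own module, blaster_handler.rs can drop its inline Data struct and pull in the shared definition instead. A minimal sketch of just the changed lines (the rest of the handler stays the same):

// At the top of src/bin/blaster_handler.rs, replacing the inline struct Data:
mod model;
use model::Data;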
Note that the reason we need src/bin/model/mod.rs instead of src/bin/model.rs is that Cargo will try to compile any .rs file at the top level of src/bin into its own binary, which we don't want for a shared module. We'll also need to add aws-sdk-sqs to the [dependencies] in our Cargo.toml for the blaster. Now create src/bin/sqs_blaster.rs:
use aws_sdk_sqs::Client;

mod model;

#[tokio::main]
async fn main() {
    let shared_config = aws_config::load_from_env().await;
    let client = Client::new(&shared_config);
    let queue_urls = ["<<ts-queue-url>>", "<<rust-queue-url>>"];
    // Send the same payload to both queues so the comparison is apples-to-apples.
    for i in 1..1000 {
        let data = model::Data { id: format!("id-{}", i), a: 1.2, b: 2.3, c: 3.4, d: 4.5 };
        for queue_url in queue_urls {
            let resp = client.send_message()
                .queue_url(queue_url)
                .message_body(serde_json::to_string(&data).unwrap())
                .send().await;
            match resp {
                Err(e) => println!("ERROR: {}", e.to_string()),
                Ok(v) => println!("RESULT: {}", v.message_id.unwrap())
            }
        }
    }
}
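We can run the blaster locally with cargo (assuming AWS credentials and a default region are configured in the environment):
>> cargo run --release --bin sqs_blaster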
Running that, letting the queues drain, and running the following in Log Insights:
stats avg(@initDuration), avg(@duration), count(@initDuration), count(@duration)
Gives:
| Version    | avg(@initDuration) | avg(@duration) | count(@initDuration) | count(@duration) |
|------------|--------------------|----------------|----------------------|------------------|
| Typescript | 245.4025           | 45.1185        | 12                   | 1010             |
| Rust       | 34.8244            | 26.5928        | 9                    | 1010             |
So it appears the Typescript init durations are still fairly consistently about 7X the Rust ones (although we were only able to force around 10 cold starts for each here), and the average durations were about 70% longer.
Conclusion
As we can see, the init durations and overall durations for our Rust Lambda were much lower than those of our Typescript Lambda. On the other hand, the Typescript version built much faster and required significantly less code. If these results generalize to other workloads, I'd prefer Rust for spiky workloads where minimizing processing time is important, e.g. spiky workloads that are expected to be near-real-time. I would also choose Rust for high-volume flows, where 70% faster can make a huge difference over millions or billions of invocations, both in monetary cost and in Lambda reserved concurrency contention. I would choose Typescript for low-volume workloads where consistently low latencies are not important, such as a simple API Gateway endpoint that doesn't receive a lot of volume and where latency spikes (due to cold starts) are acceptable.
Next time we'll convert the SQS messages to contain a reference to a large S3 Object containing JSON and convert the Lambdas to read the data line-by-line and shuttle it into DynamoDB.