By Sandeep Kumar P, Principal Solutions Architect – AntStack
By Chris Zheng, Sr. Partner Solutions Architect, App Modernization – AWS


Delivering swift performance is critical for a good user experience. That is difficult to achieve in web applications that handle large datasets and media content. AWS Lambda’s response streaming functionality enables progressive data delivery and improves application performance.

In this post, we discuss Lambda Response Streaming for efficient data retrieval, compare its architecture and code changes with those of a standard API, and look at the cost factor. For demonstration, we’ll build a simple application that uses AWS Lambda, Amazon DynamoDB, and Amazon API Gateway.

AntStack is an AWS Specialization Partner and AWS Marketplace Seller with service delivery designations for Lambda, DynamoDB, and Amazon API Gateway. With a team of 70+ serverless engineers, AntStack offers a comprehensive range of services including application development, data engineering and modernization, user interface (UI) engineering, and user experience (UX) design.

Customer Challenges

In this walkthrough, we consider a data-intensive web application that renders a large dataset from a database or other data source. The data source is Amazon DynamoDB, and because of the dataset’s size, transferring it to the web application takes a significant amount of time.

The time from the API request to the first byte of data received by the requester (the web application) is called Time to First Byte (TTFB). The data-intensive application has a high TTFB, which leads to a poor user experience.

TTFB can become a make-or-break metric for data-intensive applications. Having to wait for the complete dataset to arrive before anything is rendered can push users to look for alternatives.

Using Lambda Response Streaming to Reduce TTFB

Instead of waiting for the API request to complete and the entire dataset to arrive, we can use Lambda Response Streaming, which sends the data progressively in chunks. This reduces the TTFB and significantly improves the user experience by reducing the overall time it takes for the API response to complete.

The improvement in the overall time also improves the performance of the backend systems, which in this case is the AWS Lambda function.

About Lambda Response Streaming

In April 2023, AWS Lambda announced support for response payload streaming. Response streaming is a new invocation pattern that lets functions progressively stream response payloads back to clients.

Here are some of the key points about this functionality:

  • Reduced TTFB latency: Dramatically decreases Time to First Byte, ensuring a swift user experience.
  • Expanded payload limit: Raises the response payload limit to 20 MB, compared to the standard 6 MB, allowing for more data-rich applications.
  • Runtime compatibility: Specifically designed to support Node.js 14.x, its subsequent managed runtimes, and custom runtimes.
  • Direct streaming via Lambda URLs: Leverages Lambda function URLs to stream payloads directly, providing a more seamless integration.
  • Amazon CloudFront integration: Offers the capability to use Amazon CloudFront in conjunction with Lambda function URLs, enhancing content delivery.
  • StreamifyResponse Decorator: The Lambda handler should be wrapped with a streamifyResponse() decorator, which adds an additional responseStream parameter.
  • Writing to the stream: Users can consume the responseStream object directly using write() and pipeline() methods, offering deeper control over data flow.
  • Cost changes: There is a charge of $0.008 per GB for the data written to the stream. The first 6 MB of response payload is streamed at no additional cost.
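To make the cost point concrete, here is a rough sketch of the arithmetic for a single response. It assumes that only the bytes beyond the first 6 MB are charged at $0.008 per GB, and that 1 MB is 1024 × 1024 bytes and 1 GB is 1024 MB. The function name and the unit convention are assumptions made for illustration, so check the current Lambda pricing page before relying on the numbers.

```javascript
// Hypothetical helper: estimates the response streaming charge for one response.
// Assumptions (not taken from an official API): binary units, and only the bytes
// beyond the first 6 MB of a response are billed at the per-GB rate.
const MB = 1024 * 1024;
const GB = 1024 * MB;
const FREE_BYTES_PER_RESPONSE = 6 * MB;
const PRICE_PER_GB_USD = 0.008;

function estimateStreamingCostUsd(responseBytes) {
  const billableBytes = Math.max(0, responseBytes - FREE_BYTES_PER_RESPONSE);
  return (billableBytes / GB) * PRICE_PER_GB_USD;
}

// A 5 MB response stays within the first 6 MB, so nothing is billable.
console.log(estimateStreamingCostUsd(5 * MB));

// A 20 MB response has 14 MB of billable data.
console.log(estimateStreamingCostUsd(20 * MB));
```

Under these assumptions, the streaming charge for a single response is a small fraction of a cent even at the 20 MB limit, so it only becomes noticeable at high request volumes.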

Solution Overview

In this section, we compare two approaches to the problem stated earlier: the traditional API implementation and the Lambda Response Streaming API. At the end of the comparison, we’ll conclude which solution works best to enhance user experience.

Architecture – Traditional API Method


Figure 1 – Architecture of traditional API method.

In the traditional API model, the React client has to wait until the Lambda function fetches all the data from DynamoDB and routes it through the Amazon API Gateway. Only then is the entire dataset rendered on the client side, which can result in noticeable delays, especially with large datasets.

Architecture – Lambda Response Streaming API Method


Figure 2 – Architecture of Lambda Response Streaming API method.

Conversely, in the Lambda Response Streaming model the React client begins receiving data as soon as each query to DynamoDB is completed. This immediate and incremental streaming of data eliminates the lag usually associated with waiting for an entire dataset to load, thereby enhancing the user’s experience through faster TTFB and more dynamic rendering.
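On the client side, incremental delivery only helps if the response body is read as it arrives rather than awaited in full. The following is a minimal sketch of that idea using the standard fetch and stream reader APIs. The names consumeStream, onChunk, functionUrl, and appendToPage are hypothetical, and this is not the React client used in the demo.

```javascript
// Reads a streamed HTTP response chunk by chunk and passes each decoded
// text chunk to onChunk as soon as it arrives. Returns the byte count.
async function consumeStream(response, onChunk) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let receivedBytes = 0;

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    receivedBytes += value.byteLength;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    onChunk(decoder.decode(value, { stream: true }));
  }

  // Flush any bytes the decoder is still holding
  const tail = decoder.decode();
  if (tail) onChunk(tail);
  return receivedBytes;
}

// Usage sketch (functionUrl and appendToPage are placeholders):
// const response = await fetch(functionUrl);
// await consumeStream(response, (text) => appendToPage(text));
```

A chunk boundary does not necessarily line up with one DynamoDB query result, so a real client would also need to decide how to delimit and parse records before rendering them.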

Code Changes

The Lambda handler needs to be wrapped in the streamifyResponse() decorator, which provides an additional parameter called responseStream that is used to stream the data. The response can be streamed from Lambda in two ways: write() and pipeline().

write() is used when data becomes available in parts (multiple database queries, external calls), while pipeline() is used when the data is fully available (image, media content).
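The difference between the two patterns can be sketched outside Lambda with plain Node.js streams. In the example below, a PassThrough stream stands in for the responseStream object, and createFakeResponseStream, streamInParts, streamWhole, and fetchPage are hypothetical names. Separating the parts with a newline is also an assumption made here, not something the Lambda API requires.

```javascript
const { PassThrough, Readable } = require("node:stream");
const { pipeline } = require("node:stream/promises");

// Stand-in for Lambda's responseStream so the sketch runs locally.
// Collects everything written to it in `chunks`.
function createFakeResponseStream() {
  const stream = new PassThrough();
  const chunks = [];
  stream.on("data", (chunk) => chunks.push(chunk.toString()));
  return { stream, chunks };
}

// Pattern 1: write() each part as soon as it becomes available,
// for example after each database query, then end the stream.
async function streamInParts(responseStream, fetchPage, pageCount) {
  for (let page = 0; page < pageCount; page++) {
    const part = await fetchPage(page);
    // Newline-delimited JSON is an illustrative choice of framing
    responseStream.write(JSON.stringify(part) + "\n");
  }
  responseStream.end();
}

// Pattern 2: pipeline() a source whose content is already fully available,
// such as an image buffer. pipeline() ends the destination when the source ends.
async function streamWhole(responseStream, buffer) {
  await pipeline(Readable.from([buffer]), responseStream);
}
```

In a real handler, the object passed in as responseStream would be the one supplied by streamifyResponse() instead of the PassThrough stand-in.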

Sample Lambda Code to Retrieve Data from DynamoDB

In the code, we start by initializing the AWS SDK and the Amazon DynamoDB client. The table name is read from the environment variable SCAN_DB_NAME.

The Lambda handler is wrapped in the streamifyResponse() decorator. The scanDynamoDBTable function scans the table with a limit of 200 items per scan operation and sends the items to the stream. The function calls itself recursively until all items are retrieved from the table. Once the scan is completed, the stream is ended.

const AWS = require("aws-sdk");
const dynamodb = new AWS.DynamoDB();
const tableName = process.env.SCAN_DB_NAME;

// awslambda is a global provided by the Lambda Node.js runtime
exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    const httpResponseMetadata = {
      statusCode: 200,
      headers: {
        "Content-Type": "text/html",
        "Access-Control-Allow-Origin": "*",
      },
    };

    // Attach the HTTP status code and headers to the stream
    responseStream = awslambda.HttpResponseStream.from(
      responseStream,
      httpResponseMetadata
    );

    await scanDynamoDBTable(tableName);

    async function scanDynamoDBTable(tableName, startKey = null, items = []) {
      // Create the required scan params
      const params = {
        TableName: tableName,
        ExclusiveStartKey: startKey,
        Limit: 200,
      };

      // Use the DynamoDB object to scan the table with the specified parameters
      const data = await dynamodb.scan(params).promise();

      // Convert the items from DynamoDB JSON to regular JSON
      data.Items = data.Items.map((item) => {
        return AWS.DynamoDB.Converter.unmarshall(item);
      });

      // Send the scan result to the stream (serialized, since the stream takes strings or buffers)
      responseStream.write(JSON.stringify(data.Items));

      // If there are more items to scan, recursively call the scanDynamoDBTable function with the last evaluated key
      if (data.LastEvaluatedKey) {
        return scanDynamoDBTable(tableName, data.LastEvaluatedKey, items);
      }
      // End stream at the end of the iteration
      responseStream.end();
    }
  }
);

Sample Template File to Deploy Lambda and DynamoDB

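AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  BikesDDB:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
      SSESpecification:
        SSEEnabled: True
        SSEType: KMS
      PointInTimeRecoverySpecification:
        PointInTimeRecoveryEnabled: True
  # Placeholder logical ID; the function's original name is not shown in this post
  StreamingFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ./index.js
      Handler: index.handler
      Runtime: nodejs16.x
      Timeout: 60
      MemorySize: 2048
      AutoPublishAlias: live
      ReservedConcurrentExecutions: 20
      Environment:
        Variables:
          SCAN_DB_NAME:
            Ref: BikesDDB
      FunctionUrlConfig:
        AuthType: NONE
        InvokeMode: RESPONSE_STREAM
      Connectors:
        # Placeholder connector name
        BikesDDBConnector:
          Properties:
            Destination:
              Id: BikesDDB
            Permissions:
              - Read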

To configure Lambda to stream responses, we need to enable the Lambda function URL and set InvokeMode to RESPONSE_STREAM. To control who can invoke the Lambda function, add a new resource of type AWS::Lambda::Permission.


  1. Create an index.js file and template.yaml file with the sample Lambda code and sample template provided above.
  2. Add the AWS::Lambda::Permission resource to the template for accessing the Lambda function.
  3. Configure your AWS Command Line Interface (CLI) with the appropriate AWS credentials.
  4. Replace s3_bucket_name with a bucket of your choice to store the deployment package and run the following command to build the package:
aws cloudformation package \
  --template-file template.yaml \
  --s3-bucket s3_bucket_name \
  --output-template-file packaged-template.yaml

  5. Replace stack_name with a name of your choice and region with the region of the Amazon Simple Storage Service (Amazon S3) bucket holding the deployment package, then run the following command to deploy the AWS CloudFormation stack:
aws cloudformation deploy \
  --template-file packaged-template.yaml \
  --stack-name stack_name \
  --capabilities CAPABILITY_IAM \
  --region region

  6. Once the deployment is complete, add a large number of items to the newly created DynamoDB table. Check the Outputs section of the CloudFormation stack for the DynamoDB table name and Lambda URL.
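One way to add a large number of items is a small seeding script. The sketch below is only an illustration: buildSampleItems, toPutBatches, and seedTable are hypothetical names, the model attribute is invented (only id comes from the template's key schema), and the retry loop has no backoff. It relies on the BatchWriteItem limit of 25 requests per call and on the DocumentClient from the AWS SDK for JavaScript v2.

```javascript
// BatchWriteItem accepts at most 25 put or delete requests per call.
const MAX_BATCH_SIZE = 25;

// Builds `count` sample items. Only `id` comes from the template's key schema;
// the `model` attribute is invented for illustration.
function buildSampleItems(count) {
  return Array.from({ length: count }, (_, i) => ({
    id: `bike-${i + 1}`,
    model: `Sample model ${i + 1}`,
  }));
}

// Splits items into BatchWriteItem-sized groups of PutRequest entries.
function toPutBatches(items, batchSize = MAX_BATCH_SIZE) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(
      items.slice(i, i + batchSize).map((Item) => ({ PutRequest: { Item } }))
    );
  }
  return batches;
}

// Writes all batches to the table, resubmitting any unprocessed items.
// A production script would add exponential backoff between retries.
async function seedTable(tableName, count) {
  // Loaded lazily so the helpers above can run without the SDK installed
  const AWS = require("aws-sdk");
  const docClient = new AWS.DynamoDB.DocumentClient();

  for (const batch of toPutBatches(buildSampleItems(count))) {
    let requestItems = { [tableName]: batch };
    while (requestItems && Object.keys(requestItems).length > 0) {
      const result = await docClient
        .batchWrite({ RequestItems: requestItems })
        .promise();
      requestItems = result.UnprocessedItems;
    }
  }
}

// Usage sketch: seedTable("<table name from the stack outputs>", 2000)
```

With 2,000 items, toPutBatches produces 80 batches, so the script issues at least 80 BatchWriteItem calls.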


Figure 3 – Performance comparison: Lambda Response Streaming vs. regular API.

In the figure above, the left half shows the response streaming API and the right half shows the traditional API. Both APIs perform the same job: retrieving 2,000 records from Amazon DynamoDB. The TTFB for the response streaming API is 250 ms, whereas the traditional API takes 1,716 ms.
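A comparison like this can be approximated with a short script that records when the first chunk of the body arrives and when the response finishes. The following is a sketch rather than the tool used to produce the figure: measureTtfb is a hypothetical name, the fetchImpl parameter exists only to make the function easy to test, and the timings include client-side overhead, so they will not match browser developer tools exactly.

```javascript
// Measures time to first byte of the body and total download time for a URL.
// fetchImpl defaults to the global fetch (Node.js 18+ or a browser).
async function measureTtfb(url, fetchImpl = fetch) {
  const start = performance.now();
  const response = await fetchImpl(url);
  const reader = response.body.getReader();

  let ttfbMs = null;
  let bytes = 0;
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Record the elapsed time when the first chunk arrives
    if (ttfbMs === null) ttfbMs = performance.now() - start;
    bytes += value.byteLength;
  }

  return { ttfbMs, totalMs: performance.now() - start, bytes };
}

// Usage sketch (both URLs are placeholders for your own endpoints):
// console.log(await measureTtfb(streamingFunctionUrl));
// console.log(await measureTtfb(traditionalApiUrl));
```

Running it several times against each endpoint and comparing the medians gives a fairer picture than a single request, since cold starts and network jitter affect individual measurements.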

A detailed explanation of the same is available in this YouTube video.


In today’s fast-paced digital environment, how quickly and efficiently an application retrieves and displays data can make or break the user experience.

In this post, we discussed the challenges posed by applications handling large datasets, particularly with respect to latency and user engagement. We introduced Lambda Response Streaming as an innovative alternative that offers incremental data delivery and reduced Time to First Byte (TTFB) latency.

You can learn more about AntStack in AWS Marketplace.


AntStack – AWS Partner Spotlight

AntStack is an AWS Specialization Partner that offers a comprehensive range of services including application development, data engineering and modernization, UI engineering, and user experience design.
