Graphical user interfaces have been around for many years, but today's users increasingly expect to talk to their applications. Amazon Nova Sonic is a state-of-the-art foundation model in Amazon Bedrock that enables this shift by providing natural, low-latency, bidirectional voice conversations through a simple streaming API. Rather than just interacting with applications, users can collaborate with them through voice and built-in intelligence.
In this post, we show you how to add a true voice-first experience to our reference application, the Smart Todo app, turning everyday task management into smooth, hands-free conversations.
Reimagine user interactions through collaborative AI voice agents
Important usability improvements are often deprioritized, not because they aren't valuable, but because they're difficult to implement within a traditional mouse-and-keyboard interface. Features such as intelligent batch actions, custom workflows, and voice-guided assistance are frequently discussed but postponed because of UI complexity. This is about voice as an additional general-purpose interaction mode, not a replacement for device-specific controls or an accessibility-only feature. Voice enables new interaction patterns and also benefits users of assistive technologies such as screen readers by providing an additional, comprehensive way to interact with applications.
Amazon Nova Sonic goes far beyond one-shot voice commands. The model lets applications plan multi-step workflows, call backend tools, and maintain context across turns, so that they can collaborate with users.
The following table shows voice interactions from various application domains such as task management, CRM, and help desk.
| Voice interaction (example phrase) | Intent/goal | System actions/behavior | Confirmation/UX |
|---|---|---|---|
| "Mark all tasks as complete." | Complete tasks in bulk | Find the user's open tasks → mark them complete → archive if configured | "All 12 open tasks are marked as complete." |
| "Create a third-quarter budget readiness plan: break it into phases, assign owners, and set deadlines." | Create a multi-step workflow | Create plan → create tasks → assign owners → set deadlines → surface review options | "Plan created with 6 tasks. Notify the owners?" |
| "Find enterprise leads in APAC with ARR over $1 million and draft personalized outreach." | Create a targeted prospect list and draft outreach | Query the CRM → assemble a filtered list → draft personalized messages for review | "Drafted 24 personalized outreach messages. Would you like to review and send them?" |
| "Prioritize all P1 tickets opened in the past 24 hours and assign them to on-call." | Triage and assignment | Filter tickets → set priority → assign to on-call → log changes | "12 P1 tickets prioritized and assigned to the on-call team." |
Amazon Nova Sonic understands the user's intent, calls the required APIs, and confirms the results. No form is required. This improves productivity and helps create an environment where context is the interface. It isn't about replacing the traditional UI; it's about unlocking new functionality through voice.
Sample application overview
The Smart Todo reference application lets users create to-do lists and manage notes within those lists. It provides a centralized and flexible interface for tracking tasks and organizing notes. Adding voice makes the application a hands-free experience, allowing for more natural and productive interactions. In the Smart Todo app, users can say:
- "Add a note to follow up on the project charter."
- "Archive all completed tasks."
Behind each command is a focused action, such as creating a new note, organizing content, or updating task status, carried out naturally and efficiently through voice.
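As a rough illustration of how a spoken command could map to a focused action, the following Python sketch routes a tool request to a handler that works on an in-memory store. The tool names (`addNote`, `archiveCompletedTasks`), the `TodoStore` class, and the argument shapes are hypothetical and are not taken from the Smart Todo sample code; the real application would call its REST API instead.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    completed: bool = False
    archived: bool = False


@dataclass
class TodoStore:
    """Hypothetical in-memory stand-in for the application's backend."""
    tasks: list = field(default_factory=list)
    notes: list = field(default_factory=list)

    def add_note(self, text: str) -> dict:
        self.notes.append(text)
        return {"status": "ok", "noteCount": len(self.notes)}

    def archive_completed(self) -> dict:
        archived = 0
        for task in self.tasks:
            # Only completed tasks that are not yet archived are touched.
            if task.completed and not task.archived:
                task.archived = True
                archived += 1
        return {"status": "ok", "archived": archived}


def handle_tool_call(store: TodoStore, tool_name: str, arguments: dict) -> dict:
    """Route a tool request (name plus arguments) to the matching action."""
    handlers = {
        "addNote": lambda args: store.add_note(args["text"]),
        "archiveCompletedTasks": lambda args: store.archive_completed(),
    }
    handler = handlers.get(tool_name)
    if handler is None:
        return {"status": "error", "message": f"unknown tool: {tool_name}"}
    return handler(arguments)
```

Keeping each handler small and returning a structured result gives the voice agent something concrete to confirm back to the user, for example the number of tasks archived.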
How the Amazon Nova Sonic bidirectional API works
Amazon Nova Sonic implements a real-time, bidirectional streaming architecture. After a session is started with InvokeModelWithBidirectionalStream, audio input and model responses flow simultaneously over an open stream.
- Session start – The client sends a sessionStart event containing model settings, such as temperature and topP.
- Prompt and content start – The client sends structured events indicating whether the upcoming data is audio, text, or tool input.
- Audio streaming – Microphone audio is streamed as base64-encoded audio input events.
- Model response – As the model processes the input, it asynchronously streams the following responses:
  - Automatic speech recognition (ASR) results
  - Tool use calls
  - Text responses
  - Audio output for playback
- Session end – The conversation is explicitly ended by sending contentEnd, promptEnd, and sessionEnd events.
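The event sequence above can be sketched as a list of JSON payloads. This is a minimal sketch that only builds the messages; it does not open a stream or call any SDK. The nesting under an `event` key and field names such as `inferenceConfiguration`, `promptName`, `contentName`, and `audioInput` are assumptions based on the event names in the list and should be checked against the Amazon Nova Sonic documentation. Audio format, voice, and tool configuration fields are omitted.

```python
import base64
import json
import uuid


def build_session_events(pcm_chunks, temperature=0.7, top_p=0.9, max_tokens=1024):
    """Return the ordered JSON events for a single audio turn.

    Field names are assumptions for illustration, not a verified schema.
    """
    prompt_name = str(uuid.uuid4())
    content_name = str(uuid.uuid4())
    ids = {"promptName": prompt_name, "contentName": content_name}

    events = [
        # 1. Session start carries the model settings.
        {"event": {"sessionStart": {"inferenceConfiguration": {
            "maxTokens": max_tokens, "topP": top_p, "temperature": temperature}}}},
        # 2. Prompt and content start announce what kind of data follows.
        {"event": {"promptStart": {"promptName": prompt_name}}},
        {"event": {"contentStart": {**ids, "type": "AUDIO"}}},
    ]

    # 3. Each microphone chunk is sent as base64-encoded audio input.
    for chunk in pcm_chunks:
        encoded = base64.b64encode(chunk).decode("ascii")
        events.append({"event": {"audioInput": {**ids, "content": encoded}}})

    # 4. The conversation is closed explicitly, innermost scope first.
    events.append({"event": {"contentEnd": dict(ids)}})
    events.append({"event": {"promptEnd": {"promptName": prompt_name}}})
    events.append({"event": {"sessionEnd": {}}})

    return [json.dumps(event) for event in events]
```

In a real client, these payloads would be written to the open stream one at a time while responses are read concurrently, rather than built up front as a list.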
Nova Sonic architecture diagram
This event-driven approach lets users barge in on the assistant, enables multi-turn conversations, and supports real-time adaptability.
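To make the event-driven idea more concrete, here is a small sketch of a client-side handler that routes streamed response events and drops queued audio when the user barges in. The event keys `textOutput`, `audioOutput`, and `toolUse` are assumed names, and `bargeIn` is a purely hypothetical marker; how the service actually signals an interruption should be taken from the Amazon Nova Sonic documentation.

```python
import base64
from collections import deque


class ResponseHandler:
    """Routes streamed response events. Event and field names are assumptions."""

    def __init__(self):
        self.transcript = []           # text and ASR fragments, in arrival order
        self.playback_queue = deque()  # decoded audio chunks waiting to be played
        self.pending_tool_calls = []   # tool requests the application must execute

    def handle(self, message: dict) -> None:
        event = message.get("event", {})
        if "textOutput" in event:
            self.transcript.append(event["textOutput"].get("content", ""))
        elif "audioOutput" in event:
            audio = base64.b64decode(event["audioOutput"]["content"])
            self.playback_queue.append(audio)
        elif "toolUse" in event:
            self.pending_tool_calls.append(event["toolUse"])
        elif "bargeIn" in event:
            # Hypothetical interruption marker: stop speaking by discarding
            # audio that has been received but not yet played.
            self.playback_queue.clear()
```

Because audio is queued rather than played immediately, an interruption only needs to clear the queue; the transcript and pending tool calls stay intact, so the conversation context carries over to the next turn.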
Solution architecture
This solution uses a serverless application architecture pattern. The UI is a React single-page application, which is integrated with a backend web API running in a server-side container. The Smart Todo app is deployed using a scalable, security-aware AWS architecture designed to support real-time voice interactions. The following diagram provides an overview of the AWS services that work together to support the bidirectional streaming needs of voice-enabled applications.

Key AWS services include:
- Amazon Bedrock – Powers real-time, bidirectional voice interaction through the Amazon Nova Sonic foundation model.
- Amazon CloudFront – A content delivery network (CDN) that delivers the application globally with low latency. It routes / (root) traffic to the React application hosted in an Amazon S3 bucket, and /api and /novasonic traffic to the Application Load Balancer.
- AWS Fargate for Amazon Elastic Container Service (Amazon ECS) – Runs the containerized backend service for WebSocket processing and the REST API, and can support long-lived bidirectional streams.
- Application Load Balancer (ALB) – Forwards web traffic: /api (HTTPS REST API calls) goes to the backend ECS service that handles the Smart Todo app API, and /novasonic (WebSocket connections) goes to the ECS service that uses Amazon Nova Sonic to manage real-time audio streaming.
- Amazon Virtual Private Cloud (Amazon VPC) – Provides network isolation and security for backend services. The public subnet hosts the ALB, and the private subnet hosts the ECS Fargate tasks that run the WebSocket and REST APIs.
- NAT gateways – Allow Amazon ECS tasks in private subnets to connect more securely to the internet for operations such as reaching the Amazon Cognito JWT token validation endpoints.
- Amazon Simple Storage Service (Amazon S3) – Hosts the React front end for user interaction.
- AWS WAF – Protects the ALB from malicious traffic and enforces security rules at the application layer.
- Amazon Cognito – Manages authentication and issues tokens.
- Amazon DynamoDB – Stores application data such as to-do lists and notes.
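The path-based routing described in the list can be summarized in a few lines. This sketch only models the decision; it is not CloudFront or ALB configuration. The origin labels are hypothetical names chosen for illustration, and simple prefix matching is used here as an approximation of the path patterns a real distribution would define.

```python
def resolve_origin(path: str) -> str:
    """Return which origin would serve a request path under the routing above.

    The returned labels are illustrative names, not real resource identifiers.
    """
    def matches(prefix: str) -> bool:
        # Match the prefix itself or anything below it, but not "/apiary".
        return path == prefix or path.startswith(prefix + "/")

    if matches("/novasonic"):
        return "alb:websocket-service"   # real-time audio over WebSocket
    if matches("/api"):
        return "alb:rest-api-service"    # HTTPS REST calls for the todo API
    return "s3:react-app"                # everything else is the static front end
```

For example, `resolve_origin("/api/lists")` selects the REST service, while `resolve_origin("/index.html")` falls through to the static front end.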
The following diagram shows how user requests are handled with support for low-latency bidirectional streaming.
Request workflow
Deploying the answer
To evaluate this solution, we have provided sample code for the Smart Todo app, available in the GitHub repository.
The Smart Todo app consists of several independent Node.js projects, including a CDK infrastructure project, a React front-end application, and a backend API service. The deployment workflow makes sure that the components are built correctly and integrated with AWS services such as Amazon Cognito, Amazon DynamoDB, and Amazon Bedrock.
Prerequisites
Installation steps
- Clone the following repository:
- If you are deploying for the first time, use the following automated script.
This script does the following:
- Installs dependencies using npm (Node Package Manager).
- Builds the components and container images using a locally installed Docker Engine.
- Deploys the infrastructure using the AWS CDK (CDK bootstrap → CDK synth → CDK deploy).
- Updates environment variables with the Amazon Cognito settings.
- Rebuilds the UI with the updated environment variables.
- Performs the final infrastructure deployment (CDK deploy).
Verifying the deployment
After a successful deployment, perform the following steps:
- Access the Amazon CloudFront URL provided in the CDK output.
Note: The URL shown in the image is for reference only; each deployment is assigned a unique URL.
Screenshot of a successful deployment
- Create a new user by signing up in the Create an account section.
Create a user and log in
- Test the voice functionality and verify the integration with Amazon Nova Sonic. The following image shows a conversation between a signed-in user and the Amazon Bedrock agent. The AI agent can call existing APIs, and the UI updates in real time to reflect the agent's actions.
Allow the application to access your microphone
Voice interactions in the Smart Todo app
Clean up
You can delete the stack using the following command:
Next steps
Voice is becoming more than just an accessibility add-on; it's becoming the primary interface for complex workflows.
Speaking is often faster than clicking, especially when the app responds.
Try these resources to get started:
- Sample code repository – A working Amazon Nova Sonic integration that can be run locally. See how real-time voice interactions, intent processing, and multi-step flows are implemented end to end.
- Amazon Nova Sonic hands-on workshop – A guided lab for deploying Amazon Nova Sonic in your AWS account and testing voice-native functionality.
- Amazon Nova Sonic documentation – Provides the API reference, streaming examples, and best practices to help you design and implement voice-driven workflows.
To learn more about how AI-driven solutions can transform your operations, contact your AWS account team.
About the authors
Manu Mishra is a Senior Solutions Architect at AWS, specializing in artificial intelligence, data and analytics, and security. His expertise spans strategic oversight and hands-on technical leadership, reviewing and guiding the work of both internal and external clients. Manu works with AWS customers to develop technology strategies and align technology with organizational goals to drive impactful business outcomes.
AK Soni is a Senior Technical Account Manager with AWS Enterprise Support, where he helps enterprise customers achieve their business goals by providing proactive guidance on implementing innovative cloud and AI/ML-based solutions aligned with industry best practices. With over 19 years of experience in enterprise application architecture and development, he uses his expertise in generative AI technologies to enhance business operations and overcome existing technological limitations.
Raj Bagwe is a Senior Solutions Architect at Amazon Web Services based in San Francisco, California. He has worked at AWS for over six years, helping customers address complex technical challenges, and specializes in cloud architecture, security, and migration. In his spare time, he coaches a robotics team and plays volleyball. He can be reached on X at @rajesh_bagwe.

