To remain competitive, today's businesses look to AI to help them reduce costs and maximize efficiency, which makes finding the most appropriate AI model essential. For many companies, the move from OpenAI's model family to Amazon Nova represents not only a model change, but a strategic shift toward scalability, efficiency, and broader multimodal capabilities.
In this blog, we explain how to optimize prompts on Amazon Nova for the best price performance.
Why move from OpenAI to Amazon Nova?
OpenAI's models remain strong, but scaling with them can make operating costs prohibitive. Consider these numbers from Artificial Analysis:
| Model | Input token cost (per million tokens) | Output token cost (per million tokens) | Context window | Output speed (tokens per second) | Latency (seconds to first token) |
| --- | --- | --- | --- | --- | --- |
| GPT-4o | ~$2.50 | ~$10.00 | Up to 128K tokens | ~63 | ~0.49 |
| GPT-4o mini | ~$0.15 | ~$0.60 | Up to 128K tokens | ~90 | ~0.43 |
| Nova Micro | ~$0.035 | ~$0.14 | Up to 128K tokens | ~195 | ~0.29 |
| Nova Lite | ~$0.06 | ~$0.24 | Up to 300K tokens | ~146 | ~0.29 |
| Nova Pro | ~$0.80 | ~$3.20 | Up to 300K tokens | ~90 | ~0.34 |
For high-volume applications (such as global customer support or large-scale document analysis), these cost differences are game-changing. Not only is Amazon Nova Pro more than three times more cost-effective, it also lets you handle larger and more complex inputs thanks to its longer context window.
Breaking down the Amazon Nova suite
Amazon Nova is not a single model; it is a suite designed for a variety of needs.
- Amazon Nova Pro – A capable multimodal model that can handle text, images, and video. Excellent for tasks such as document analysis and deep data visualization. Benchmark comparisons show that Amazon Nova Pro matches or outperforms GPT-4o on complex inference tasks, according to the Nova Technical Report and Model Card, section 2.1.1.
- Amazon Nova Lite – Provides a balanced mix of multimodal processing and speed. Amazon Nova Lite is ideal for applications such as document summarization, translation, and basic visual search, offering low latency and cost-effective output compared to GPT-4o mini. You can find these benchmark results in the Nova Technical Report and Model Card, section 2.1.2.
- Amazon Nova Micro – A text-only model designed for ultra-low latency. With output speeds of up to 195 tokens per second, Amazon Nova Micro is ideal for real-time applications such as chat-based assistants and automated FAQs. Its token cost is dramatically lower than GPT-4o mini's, about 4.3 times cheaper per token.
Amazon Nova's lower cost and higher output speed give you the flexibility to simplify prompts for real-time applications, letting you balance quality, speed, and cost across your use cases.
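As a rough sanity check of these numbers, the table's approximate prices can be plugged into a simple cost estimate. The workload below (10M input and 2M output tokens per month) is hypothetical, chosen only to illustrate the comparison:

```python
# Approximate prices from the table above (USD per million tokens).
prices = {
    "gpt-4o":      {"in": 2.50,  "out": 10.00},
    "gpt-4o-mini": {"in": 0.15,  "out": 0.60},
    "nova-micro":  {"in": 0.035, "out": 0.14},
    "nova-lite":   {"in": 0.06,  "out": 0.24},
    "nova-pro":    {"in": 0.80,  "out": 3.20},
}

def cost(model: str, input_millions: float, output_millions: float) -> float:
    """Estimated USD cost for a workload measured in millions of tokens."""
    p = prices[model]
    return p["in"] * input_millions + p["out"] * output_millions

# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
for model in ("gpt-4o-mini", "nova-micro"):
    print(model, round(cost(model, 10, 2), 2))
# gpt-4o-mini comes to $2.70, Nova Micro to $0.63 -- roughly the 4.3x
# per-token difference cited above.
```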
Understanding the fundamentals
Understanding the differences between prompt engineering best practices on OpenAI and Amazon Nova helps you make the best decision about which model family suits your needs. Each model family has its own strengths, but several practices apply to both. In both families, quality and accuracy come from clear instructions, structured prompts, and iterative refinement. Whether you rely on strong output directives or well-defined use cases, the goal is to reduce ambiguity and improve response quality.
OpenAI approach
OpenAI uses a layered messaging system for prompt engineering, where system, developer, and user prompts work in concert to control tone, safety, and output format. Their approach highlights:
- Hierarchical message roles – Overall safety and style guidelines (set in the system prompt) are preserved when configuring model roles and behavior through system messages
- Instruction placement and delimiters – Directives come first, with context, examples, and queries clearly separated
- Selective chain-of-thought – Detailed step-by-step reasoning is used where it benefits complex tasks
- Format and structure – Strong directives such as DO and MUST are used to produce consistent output (for example, JSON)
Amazon Nova approach
- Define the prompt use case
  - Task – What exactly should the model do?
  - Role – The role the model assumes
  - Response style – The output structure or tone
  - Instructions – Guidelines the model must follow
- Ways of thinking
  - Explicit constrained response – Provide clear, strong instructions to constrain model responses
  - Structured thinking – Prompting for step-by-step reasoning encourages structured thinking
- Format and structure
  - Use delimiters to section the prompt, such as ## Task ##, ## Context ##, or ## Example ##
  - Specify the output format, such as JSON, YAML, or Markdown
  - Use strong instructions or caps, such as DO, DO NOT, or MUST
  - For example, prefill the model's response so it begins with "{" or "```json"
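As a minimal sketch of these formatting practices, a delimited prompt can be assembled like this. The section names and directive wording below are illustrative, not a prescribed Nova template; adapt them to your own use case:

```python
def build_prompt(task: str, context: str, example: str) -> str:
    """Assemble a prompt using ## delimiters to section the input,
    following the formatting practices listed above. The section names
    and the JSON directive are illustrative choices."""
    return (
        "## Task ##\n" + task.strip() + "\n\n"
        "## Context ##\n" + context.strip() + "\n\n"
        "## Example ##\n" + example.strip() + "\n\n"
        "## Model Instructions ##\n"
        "You MUST respond with valid JSON only. "
        'Your response MUST begin with "{".'
    )

prompt = build_prompt(
    task="Summarize the meeting notes into key decisions and action items.",
    context="Weekly product sync covering the 2025 roadmap.",
    example='{"decisions": ["..."], "action_items": ["..."]}',
)
print(prompt.splitlines()[0])  # -> ## Task ##
```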
Evolving your prompt strategy: migrating from OpenAI to Amazon Nova
Migrating to Amazon Nova is not just a change of API endpoint; it requires realigning your prompt engineering to match Amazon Nova's strengths. Start by redefining the use case: split your existing GPT-4o or GPT-4o mini prompt into its core components of task, role, response style, instructions, and success criteria. Make sure these components are clearly structured to provide a blueprint for the model.
To understand how to migrate existing OpenAI prompts to work best with Amazon Nova Pro, consider the following meeting notes summarizer example. This is the GPT-4o system prompt.
The user prompt is the meeting notes that need to be summarized.
GPT generates this helpful response:
Here is what the Amazon Nova Pro prompt looks like in order to meet or exceed the quality of the GPT-4o response. This prompt begins with a system prompt and uses the same best practices discussed in this post. A temperature of 0.2 and topP of 0.9 were used here:
This is the user prompt.
The following example shows that the Amazon Nova response meets or exceeds the accuracy of the OpenAI example.
With a few updates to the prompt, you can achieve comparable or better results from Amazon Nova Pro while enjoying significantly lower inference costs.
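As a hypothetical sketch (not the original prompt from this example), a meeting-notes summarizer request for Nova Pro following these practices might be assembled like this, using the temperature of 0.2 and topP of 0.9 mentioned above:

```python
# Hypothetical meeting-notes summarizer request for Amazon Nova Pro.
# The system prompt text and section names are illustrative only.
SYSTEM_PROMPT = (
    "## Task ##\n"
    "You are a meeting notes summarizer. Summarize the provided notes into "
    "Key Decisions, Action Items (with owners), and Open Questions.\n\n"
    "## Response Style ##\n"
    "Respond in Markdown with one heading per section. Keep it under 200 words."
)

def build_request(notes: str) -> dict:
    """Assemble the keyword arguments for a Bedrock Converse API call."""
    return {
        "modelId": "us.amazon.nova-pro-v1:0",
        "system": [{"text": SYSTEM_PROMPT}],
        "messages": [{"role": "user", "content": [{"text": notes}]}],
        "inferenceConfig": {"temperature": 0.2, "topP": 0.9},
    }

request = build_request("Team agreed to ship v2 on May 1. Maria owns the rollout plan.")
print(request["inferenceConfig"])

# To actually invoke the model (requires AWS credentials and model access):
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```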
Use Amazon Nova's extended context
Amazon Nova Lite and Amazon Nova Pro support up to 300,000 input tokens, which means you can include more context in the prompt when necessary. Expand background knowledge and detailed steps accordingly. If the original OpenAI prompt was optimized for a 128,000-token window, adjust it to take advantage of Amazon Nova's extended window.
Adjust output constraints
If your GPT prompt requires strict formatting (for example, "respond only in JSON"), make sure your Amazon Nova prompt includes the same directives. Additionally, if the task involves multimodal input, specify when to include image or video references.
Function calling
The rise of generative AI agents has made function calling, or tool calling, one of the most important capabilities of a large language model (LLM). A model's ability to correctly select the right tools for a job, in a low-latency fashion, is the difference between success and failure of an agentic system.
OpenAI and Amazon Nova models handle function calls similarly, notably in their support for structured API calls. Both model families support tool selection through defined tool schemas, which we explain later in this post, and both provide a mechanism for deciding when to invoke these tools.
OpenAI function calling uses a flexible JSON schema to define and configure API interactions. The models support a wide range of schema configurations, allowing developers to quickly implement external function calls via simple JSON definitions tied to API endpoints.
Here is an example function definition:
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Montevideo, Uruguay"
                }
            },
            "required": [
                "location"
            ],
            "additionalProperties": False
        },
        "strict": True
    }
}]

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather like in Punta del Este today?"}],
    tools=tools
)
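When the model decides to call the get_weather function, the tool call arrives on the response message with its arguments serialized as a JSON string. As a sketch, here is how you might extract the selected tool and its parsed arguments from a response serialized to a dict (the sample payload below is illustrative, trimmed to the relevant fields):

```python
import json

def extract_tool_call(completion: dict):
    """Pull the first tool call (name plus parsed arguments) out of a
    chat.completions response serialized to a dict (e.g. via model_dump()).
    Returns None when the model answered in plain text instead."""
    message = completion["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    if not calls:
        return None
    call = calls[0]["function"]
    return call["name"], json.loads(call["arguments"])

# A trimmed, illustrative sample of the shape returned for the
# get_weather tool defined above:
sample = {
    "choices": [{
        "message": {
            "tool_calls": [{
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location": "Punta del Este, Uruguay"}',
                }
            }]
        }
    }]
}
print(extract_tool_call(sample))
# -> ('get_weather', {'location': 'Punta del Este, Uruguay'})
```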
Similar to OpenAI's approach, Amazon Nova can invoke a tool when a configured schema is passed, as shown in the following code example. Amazon Nova works best with greedy decoding when calling tools, so we recommend setting temperature, topP, and topK to 1; this gives the model the highest accuracy in tool selection. These greedy decoding parameters and other great examples of tool use are covered in detail in Tool use (function calling) with Amazon Nova.
The following is an example of a function call without using additionalModelRequestFields:
tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "get_recipe",
            "description": "Structured recipe generation system",
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": {
                        "recipe": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "ingredients": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "item": {"type": "string"},
                                            "amount": {"type": "number"},
                                            "unit": {"type": "string"}
                                        }
                                    }
                                },
                                "instructions": {
                                    "type": "array",
                                    "items": {"type": "string"}
                                }
                            },
                            "required": ["name", "ingredients", "instructions"]
                        }
                    }
                }
            }
        }
    }]
}
# Base configuration without topK=1
input_text = "I need a recipe for chocolate lava cake"
messages = [{
    "role": "user",
    "content": [{"text": input_text}]
}]

# Inference parameters
inf_params = {"topP": 1, "temperature": 1}

response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig=inf_params
)
# Typically produces less structured or incomplete output
The following example shows how function call accuracy can be improved by using additionalModelRequestFields:
# Enhanced configuration with topK=1
response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig=inf_params,
    additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
)
# Produces a more accurate and structured function call
To maximize Amazon Nova's function calling capabilities and improve accuracy, always use additionalModelRequestFields with topK = 1. This makes the model select the single most likely token, preventing the selection of lower-probability tokens; it increases deterministic output generation and improves function call accuracy by about 30-40%.
The following code examples further illustrate how to execute tool calls successfully. The first scenario illustrates recipe generation without explicit tools. It does not use topK, which usually results in a less structured response.
input_text = """
I'm looking for a decadent chocolate dessert that's quick to prepare.
Something that looks fancy but isn't complicated to make.
"""
messages = [{
    "role": "user",
    "content": [{"text": input_text}]
}]

response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    inferenceConfig={"topP": 1, "temperature": 1}
)
# Generates a conversational recipe description
# Less structured, more narrative-driven response
The next scenario illustrates recipe generation with structured tools, adding topK set to 1. This produces a more structured output.
response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig={"topP": 1, "temperature": 1},
    additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
)
# Generates a highly structured, JSON-compliant recipe
# Includes precise ingredient measurements
# Provides step-by-step instructions
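When Nova selects a tool, the Converse API returns a toolUse block inside the response message content. A small helper, sketched here against an illustrative sample response (the toolUseId and recipe values are made up), can pull out the tool name and input:

```python
def extract_tool_use(response: dict):
    """Find the first toolUse block in a Bedrock Converse API response.
    Returns (tool_name, input_dict), or None if the model replied with
    plain text instead of calling a tool."""
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            tool = block["toolUse"]
            return tool["name"], tool["input"]
    return None

# Illustrative sample response for the get_recipe tool defined earlier;
# the toolUseId and recipe payload are placeholders, not real output.
sample_response = {
    "output": {"message": {"content": [
        {"toolUse": {
            "toolUseId": "tooluse_example",
            "name": "get_recipe",
            "input": {"recipe": {
                "name": "Chocolate Lava Cake",
                "ingredients": [],
                "instructions": [],
            }},
        }}
    ]}}
}

name, payload = extract_tool_use(sample_response)
print(name)  # -> get_recipe
```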
Overall, OpenAI offers more flexibility and broader schema support, while Amazon Nova offers more accurate and controlled output generation, making it the best choice for high-stakes structured data scenarios, as demonstrated by Amazon Nova's performance on the IFEval benchmark described in section 2.1.1 of the Nova Technical Report and Model Card. Because its function calling scheme provides excellent control and accuracy, we recommend Amazon Nova for applications that require predictable, structured responses.
Conclusion
Moving from OpenAI's models to Amazon Nova represents a major shift in how you can use AI: a transition to models that deliver similar or superior performance at a fraction of the cost, with multimodal support and extended context processing.
Whether you use the powerful, enterprise-ready Amazon Nova Pro, the agile and economical Amazon Nova Lite, or the fast and flexible Amazon Nova Micro, the benefits are clear:
- Cost reduction – Businesses can scale their applications more economically, with up to 4 times lower token costs
- Improved response performance – Higher output speed (around 190 tokens per second) makes real-time applications more practical
- Extended capabilities – From deep document analysis to integrated visual content, larger context windows and multimodal support unlock new applications
By redefining your use cases, leveraging the extended context, and iteratively refining your prompts, you can smoothly migrate existing workflows from OpenAI's GPT-4o and GPT-4o mini models to Amazon Nova.
About the authors
Claudio Mazzoni is a Sr. Specialist Solutions Architect on the Amazon Bedrock GTM team. Claudio goes above and beyond guiding customers through their generative AI journey. Outside of work, Claudio spends time with his family, works in the garden, and cooks Uruguayan cuisine.
Pat Riley is a Sr. Specialist Solutions Architect on the Amazon Bedrock Go-To-Market team. Pat has spent the past 15 years consulting on analytics and machine learning. When he's not working on AWS, you'll find him puttering around on a woodworking project.

