To remain competitive, today's businesses look to AI to help them reduce costs and maximize efficiency, which makes finding the most appropriate AI model essential. For many companies, the move from OpenAI's model family to Amazon Nova represents not only a model change, but a strategic shift toward scalability, efficiency, and broader multimodal capabilities.
In this blog, we explain how to optimize prompts on Amazon Nova for the best price performance.
Why move from OpenAI to Amazon Nova?
OpenAI's models remain strong, but scaling with them can make operating costs prohibitive. Consider these numbers from Artificial Analysis:
| Model | Input token cost (per million tokens) | Output token cost (per million tokens) | Context window | Output speed (tokens per second) | Latency (seconds to first token) |
| --- | --- | --- | --- | --- | --- |
| GPT-4o | ~$2.50 | ~$10.00 | Up to 128K tokens | ~63 | ~0.49 |
| GPT-4o mini | ~$0.15 | ~$0.60 | Up to 128K tokens | ~90 | ~0.43 |
| Nova Micro | ~$0.035 | ~$0.14 | Up to 128K tokens | ~195 | ~0.29 |
| Nova Lite | ~$0.06 | ~$0.24 | Up to 300K tokens | ~146 | ~0.29 |
| Nova Pro | ~$0.80 | ~$3.20 | Up to 300K tokens | ~90 | ~0.34 |
For high-volume applications (such as global customer support or large-scale document analysis), these cost differences are game-changing. Not only is Amazon Nova Pro more than three times more cost-effective, it also lets you handle larger and more complex inputs thanks to its longer context window.
Breaking down the Amazon Nova suite
Amazon Nova is not a single model; it is a suite designed for a variety of needs.
- Amazon Nova Pro – A capable multimodal model that can handle text, images, and video. Excellent for tasks such as document analysis and deep data visualization. Benchmark comparisons show that Amazon Nova Pro matches or outperforms GPT-4o on complex inference tasks, according to the Nova Technical Report and Model Card, section 2.1.1.
- Amazon Nova Lite – Provides a balanced mix of multimodal processing and speed. Amazon Nova Lite is ideal for applications such as document summarization, translation, and basic visual search, offering low latency and cost-effective output compared to GPT-4o mini. You can find these benchmark results in the Nova Technical Report and Model Card, section 2.1.2.
- Amazon Nova Micro – A text-only model designed for ultra-low latency. With output speeds of up to 195 tokens per second, Amazon Nova Micro is ideal for real-time applications such as chat-based assistants and automated FAQs. Its token cost is dramatically lower than GPT-4o mini's, about 4.3 times cheaper per token.
Amazon Nova's lower cost and higher output speed give you the flexibility to simplify prompts for real-time applications, letting you balance quality, speed, and cost across your use cases.
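As a rough sanity check of these numbers, the table's approximate prices can be plugged into a simple cost estimate. The workload below (10M input and 2M output tokens per month) is hypothetical, chosen only to illustrate the comparison:

```python
# Approximate prices from the table above (USD per million tokens).
prices = {
    "gpt-4o":      {"in": 2.50,  "out": 10.00},
    "gpt-4o-mini": {"in": 0.15,  "out": 0.60},
    "nova-micro":  {"in": 0.035, "out": 0.14},
    "nova-lite":   {"in": 0.06,  "out": 0.24},
    "nova-pro":    {"in": 0.80,  "out": 3.20},
}

def cost(model: str, input_millions: float, output_millions: float) -> float:
    """Estimated USD cost for a workload measured in millions of tokens."""
    p = prices[model]
    return p["in"] * input_millions + p["out"] * output_millions

# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
for model in ("gpt-4o-mini", "nova-micro"):
    print(model, round(cost(model, 10, 2), 2))
# gpt-4o-mini comes to $2.70, Nova Micro to $0.63 -- roughly the 4.3x
# per-token difference cited above.
```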
Understanding the fundamentals
Understanding the differences between prompt engineering best practices on OpenAI and Amazon Nova helps you make the best decision about which model family suits your needs. Each model family has its own strengths, but several practices apply to both. In both families, quality and accuracy come from clear instructions, structured prompts, and iterative refinement. Whether you rely on strong output directives or well-defined use cases, the goal is to reduce ambiguity and improve response quality.
OpenAI approach
OpenAI uses a layered messaging system for prompt engineering, where system, developer, and user prompts work in concert to control tone, safety, and output format. Their approach highlights:
- Hierarchical message roles – Overall safety and style guidelines (set in the system prompt) are preserved when configuring model roles and behavior through system messages
- Instruction placement and delimiters – Directives come first, with context, examples, and queries clearly separated
- Selective chain-of-thought – Detailed step-by-step reasoning is used where it benefits complex tasks
- Format and structure – Strong directives such as DO and MUST are used to produce consistent output (for example, JSON)
Amazon Nova approach
- Define the prompt use case
  - Task – What exactly should the model do?
  - Role – The role the model assumes
  - Response style – The output structure or tone
  - Instructions – Guidelines the model must follow
- Ways of thinking
  - Explicit constrained response – Provide clear, strong instructions to constrain model responses
  - Structured thinking – Prompting for step-by-step reasoning encourages structured thinking
- Format and structure
  - Use delimiters to section the prompt, such as ## Task ##, ## Context ##, or ## Example ##
  - Specify the output format, such as JSON, YAML, or Markdown
  - Use strong instructions or caps, such as DO, DO NOT, or MUST
  - For example, prefill the model's response so it begins with "{" or "```json"
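As a minimal sketch of these formatting practices, a delimited prompt can be assembled like this. The section names and directive wording below are illustrative, not a prescribed Nova template; adapt them to your own use case:

```python
def build_prompt(task: str, context: str, example: str) -> str:
    """Assemble a prompt using ## delimiters to section the input,
    following the formatting practices listed above. The section names
    and the JSON directive are illustrative choices."""
    return (
        "## Task ##\n" + task.strip() + "\n\n"
        "## Context ##\n" + context.strip() + "\n\n"
        "## Example ##\n" + example.strip() + "\n\n"
        "## Model Instructions ##\n"
        "You MUST respond with valid JSON only. "
        'Your response MUST begin with "{".'
    )

prompt = build_prompt(
    task="Summarize the meeting notes into key decisions and action items.",
    context="Weekly product sync covering the 2025 roadmap.",
    example='{"decisions": ["..."], "action_items": ["..."]}',
)
print(prompt.splitlines()[0])  # -> ## Task ##
```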
Evolving your prompt strategy: migrating from OpenAI to Amazon Nova
Migrating to Amazon Nova is not just a change of API endpoint; it requires realigning your prompt engineering to match Amazon Nova's strengths. Start by redefining the use case: split your existing GPT-4o or GPT-4o mini prompt into its core components of task, role, response style, instructions, and success criteria. Make sure these components are clearly structured to provide a blueprint for the model.
To understand how to migrate existing OpenAI prompts to work best with Amazon Nova Pro, consider the following meeting notes summarizer example. This is the GPT-4o system prompt.
The user prompt is the meeting notes that need to be summarized.
GPT generates this helpful response:
Here is what the Amazon Nova Pro prompt looks like in order to meet or exceed the quality of the GPT-4o response. This prompt begins with a system prompt and uses the same best practices discussed in this post. A temperature of 0.2 and topP of 0.9 were used here:
This is the user prompt.
The following example shows that the Amazon Nova response meets or exceeds the accuracy of the OpenAI example.
With a few updates to the prompt, you can achieve comparable or better results from Amazon Nova Pro while enjoying significantly lower inference costs.
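As a hypothetical sketch (not the original prompt from this example), a meeting-notes summarizer request for Nova Pro following these practices might be assembled like this, using the temperature of 0.2 and topP of 0.9 mentioned above:

```python
# Hypothetical meeting-notes summarizer request for Amazon Nova Pro.
# The system prompt text and section names are illustrative only.
SYSTEM_PROMPT = (
    "## Task ##\n"
    "You are a meeting notes summarizer. Summarize the provided notes into "
    "Key Decisions, Action Items (with owners), and Open Questions.\n\n"
    "## Response Style ##\n"
    "Respond in Markdown with one heading per section. Keep it under 200 words."
)

def build_request(notes: str) -> dict:
    """Assemble the keyword arguments for a Bedrock Converse API call."""
    return {
        "modelId": "us.amazon.nova-pro-v1:0",
        "system": [{"text": SYSTEM_PROMPT}],
        "messages": [{"role": "user", "content": [{"text": notes}]}],
        "inferenceConfig": {"temperature": 0.2, "topP": 0.9},
    }

request = build_request("Team agreed to ship v2 on May 1. Maria owns the rollout plan.")
print(request["inferenceConfig"])

# To actually invoke the model (requires AWS credentials and model access):
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```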
Use Amazon Nova's extended context
Amazon Nova Lite and Amazon Nova Pro support up to 300,000 input tokens, which means you can include more context in the prompt when necessary. Expand background knowledge and detailed steps accordingly. If the original OpenAI prompt was optimized for a 128,000-token window, adjust it to take advantage of Amazon Nova's extended window.
Adjust output constraints
If your GPT prompt requires strict formatting (for example, "respond only in JSON"), make sure your Amazon Nova prompt includes the same directives. Additionally, if the task involves multimodal input, specify when to include image or video references.
Function calling
The rise of generative AI agents has made function calling, or tool calling, one of the most important capabilities of a large language model (LLM). A model's ability to correctly select the right tools for a job, in a low-latency fashion, is the difference between success and failure of an agentic system.
OpenAI and Amazon Nova models handle function calls similarly, notably in their support for structured API calls. Both model families support tool selection through defined tool schemas, which we explain later in this post, and both provide a mechanism for deciding when to invoke these tools.
OpenAI function calling uses a flexible JSON schema to define and configure API interactions. The models support a wide range of schema configurations, allowing developers to quickly implement external function calls via simple JSON definitions tied to API endpoints.
Here is an example function definition:
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Montevideo, Uruguay"
                }
            },
            "required": [
                "location"
            ],
            "additionalProperties": False
        },
        "strict": True
    }
}]

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather like in Punta del Este today?"}],
    tools=tools
)
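When the model decides to call the get_weather function, the tool call arrives on the response message with its arguments serialized as a JSON string. As a sketch, here is how you might extract the selected tool and its parsed arguments from a response serialized to a dict (the sample payload below is illustrative, trimmed to the relevant fields):

```python
import json

def extract_tool_call(completion: dict):
    """Pull the first tool call (name plus parsed arguments) out of a
    chat.completions response serialized to a dict (e.g. via model_dump()).
    Returns None when the model answered in plain text instead."""
    message = completion["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    if not calls:
        return None
    call = calls[0]["function"]
    return call["name"], json.loads(call["arguments"])

# A trimmed, illustrative sample of the shape returned for the
# get_weather tool defined above:
sample = {
    "choices": [{
        "message": {
            "tool_calls": [{
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location": "Punta del Este, Uruguay"}',
                }
            }]
        }
    }]
}
print(extract_tool_call(sample))
# -> ('get_weather', {'location': 'Punta del Este, Uruguay'})
```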
Similar to OpenAI's approach, Amazon Nova can invoke a tool when a configured schema is passed, as shown in the following code example. Amazon Nova works best with greedy decoding when calling tools, so we recommend setting temperature, topP, and topK to 1; this gives the model the highest accuracy in tool selection. These greedy decoding parameters and other great examples of tool use are covered in detail in Tool use (function calling) with Amazon Nova.
The following is an example of a function call without using additionalModelRequestFields:
tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "get_recipe",
            "description": "Structured recipe generation system",
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": {
                        "recipe": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "ingredients": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "item": {"type": "string"},
                                            "amount": {"type": "number"},
                                            "unit": {"type": "string"}
                                        }
                                    }
                                },
                                "instructions": {
                                    "type": "array",
                                    "items": {"type": "string"}
                                }
                            },
                            "required": ["name", "ingredients", "instructions"]
                        }
                    }
                }
            }
        }
    }]
}
# Base configuration without topK=1
input_text = "I need a recipe for chocolate lava cake"
messages = [{
    "role": "user",
    "content": [{"text": input_text}]
}]

# Inference parameters
inf_params = {"topP": 1, "temperature": 1}

response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig=inf_params
)
# Typically produces less structured or incomplete output
The following example shows how function call accuracy can be improved by using additionalModelRequestFields:
# Enhanced configuration with topK=1
response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig=inf_params,
    additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
)
# Produces a more accurate and structured function call
To maximize Amazon Nova's function calling capabilities and improve accuracy, always use additionalModelRequestFields with topK = 1. This makes the model select the single most likely token, preventing the selection of lower-probability tokens; it increases deterministic output generation and improves function call accuracy by about 30-40%.
The following code examples further illustrate how to execute tool calls successfully. The first scenario illustrates recipe generation without explicit tools. It does not use topK, which usually results in a less structured response.
input_text = """
I'm looking for a decadent chocolate dessert that's quick to prepare.
Something that looks fancy but isn't complicated to make.
"""
messages = [{
    "role": "user",
    "content": [{"text": input_text}]
}]

response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    inferenceConfig={"topP": 1, "temperature": 1}
)
# Generates a conversational recipe description
# Less structured, more narrative-driven response
The next scenario illustrates recipe generation with structured tools, adding topK set to 1. This produces a more structured output.
response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    messages=messages,
    toolConfig=tool_config,
    inferenceConfig={"topP": 1, "temperature": 1},
    additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
)
# Generates a highly structured, JSON-compliant recipe
# Includes precise ingredient measurements
# Provides step-by-step instructions
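When Nova selects a tool, the Converse API returns a toolUse block inside the response message content. A small helper, sketched here against an illustrative sample response (the toolUseId and recipe values are made up), can pull out the tool name and input:

```python
def extract_tool_use(response: dict):
    """Find the first toolUse block in a Bedrock Converse API response.
    Returns (tool_name, input_dict), or None if the model replied with
    plain text instead of calling a tool."""
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            tool = block["toolUse"]
            return tool["name"], tool["input"]
    return None

# Illustrative sample response for the get_recipe tool defined earlier;
# the toolUseId and recipe payload are placeholders, not real output.
sample_response = {
    "output": {"message": {"content": [
        {"toolUse": {
            "toolUseId": "tooluse_example",
            "name": "get_recipe",
            "input": {"recipe": {
                "name": "Chocolate Lava Cake",
                "ingredients": [],
                "instructions": [],
            }},
        }}
    ]}}
}

name, payload = extract_tool_use(sample_response)
print(name)  # -> get_recipe
```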
Overall, OpenAI offers more flexibility and broader schema support, while Amazon Nova offers more accurate and controlled output generation, making it the best choice for high-stakes structured data scenarios, as demonstrated by Amazon Nova's performance on the IFEval benchmark described in section 2.1.1 of the Nova Technical Report and Model Card. Because its function calling scheme provides excellent control and accuracy, we recommend Amazon Nova for applications that require predictable, structured responses.
Conclusion
Moving from OpenAI's models to Amazon Nova represents a major shift in how you can use AI: a transition to models that deliver similar or superior performance at a fraction of the cost, with multimodal support and extended context processing.
Whether you use the powerful, enterprise-ready Amazon Nova Pro, the agile and economical Amazon Nova Lite, or the fast and flexible Amazon Nova Micro, the benefits are clear:
- Cost reduction – Businesses can scale their applications more economically, with up to 4 times lower token costs
- Improved response performance – Higher output speed (around 190 tokens per second) makes real-time applications more practical
- Extended capabilities – From deep document analysis to integrated visual content, larger context windows and multimodal support unlock new applications
By redefining your use cases, leveraging the extended context, and iteratively refining your prompts, you can smoothly migrate existing workflows from OpenAI's GPT-4o and GPT-4o mini models to Amazon Nova.
About the authors
Claudio Mazzoni is a Sr. Specialist Solutions Architect on the Amazon Bedrock GTM team. Claudio goes above and beyond guiding customers through their generative AI journey. Outside of work, Claudio spends time with his family, works in the garden, and cooks Uruguayan cuisine.
Pat Riley is a Sr. Specialist Solutions Architect on the Amazon Bedrock Go-To-Market team. Pat has spent the past 15 years consulting on analytics and machine learning. When he's not working on AWS, you'll find him puttering around on a woodworking project.

