Today, we’re excited to announce that Mistral-Small-24B-Instruct-2501—a twenty-four-billion-parameter large language model (LLM) from Mistral AI that’s optimized for low-latency text generation tasks—is available for customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. Amazon Bedrock Marketplace is a new capability in Amazon Bedrock that developers can use to discover, test, and use over 100 popular, emerging, and specialized foundation models (FMs) alongside the current selection of industry-leading models already available in Amazon Bedrock. You can also use this model with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms and models that can be deployed with one click for running inference. In this post, we walk through how to discover, deploy, and use Mistral-Small-24B-Instruct-2501.
Overview of Mistral Small 3 (2501)
Mistral Small 3 (2501), a latency-optimized 24B-parameter model released under Apache 2.0, maintains a balance between performance and computational efficiency. Mistral offers both the pretrained (Mistral-Small-24B-Base-2501) and instruction-tuned (Mistral-Small-24B-Instruct-2501) checkpoints of the model under Apache 2.0. Mistral Small 3 (2501) features a 32k-token context window. According to Mistral, the model demonstrates strong performance in code, math, general knowledge, and instruction following compared to its peers. Mistral Small 3 (2501) is designed for the 80% of generative AI tasks that require strong language and instruction-following performance with very low latency. The instruction-tuning process focuses on improving the model’s ability to follow complex instructions, maintain coherent conversations, and generate accurate, context-aware responses. The 2501 version follows previous iterations (Mistral-Small-2409 and Mistral-Small-2402) released in 2024, incorporating improvements in instruction following and reliability. Currently, the instruct version of this model, Mistral-Small-24B-Instruct-2501, is available for customers to deploy and use on SageMaker JumpStart and Amazon Bedrock Marketplace.
Optimized for conversational assistance
Mistral Small 3 (2501) excels in scenarios where quick, accurate responses are critical, such as virtual assistants, where users expect immediate feedback and near-real-time interactions. Mistral Small 3 (2501) can also handle rapid function execution when used as part of automated or agentic workflows. According to Mistral, the architecture is designed to typically respond in under 100 milliseconds, making it ideal for customer service automation, interactive assistance, live chat, and content moderation.
Performance metrics and benchmarks
According to Mistral, the instruction-tuned version of the model achieves over 81% accuracy on Massive Multitask Language Understanding (MMLU) at 150 tokens per second, making it currently the most efficient model in its class. In third-party evaluations conducted by Mistral, the model demonstrates competitive performance against larger models such as Llama 3.3 70B and Qwen 32B. Notably, Mistral claims that the model performs at the same level as Llama 3.3 70B Instruct while being more than three times faster on the same hardware.
SageMaker JumpStart overview
SageMaker JumpStart is a fully managed service that offers state-of-the-art foundation models for various use cases such as content writing, code generation, question answering, copywriting, summarization, classification, and information retrieval. It provides a collection of pre-trained models that you can deploy quickly, accelerating the development and deployment of ML applications. One of the key components of SageMaker JumpStart is model hubs, which offer a vast catalog of pre-trained models, such as Mistral, for a variety of tasks.
You can now discover and deploy Mistral models in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK, enabling you to derive model performance and MLOps controls with Amazon SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, or container logs. The model is deployed in a secure AWS environment and under your VPC controls, helping to support data security for enterprise needs.
Prerequisites
To try Mistral-Small-24B-Instruct-2501 in SageMaker JumpStart, you need the following prerequisites:
Amazon Bedrock Marketplace overview
To get started, in the AWS Management Console for Amazon Bedrock, choose Model catalog in the Foundation models section of the navigation pane. Here, you can search for models that help you with a specific use case or language. The search results include both serverless models and models available in Amazon Bedrock Marketplace. You can filter results by provider, modality (such as text, image, or audio), or task (such as classification or text summarization).
Deploy Mistral-Small-24B-Instruct-2501 in Amazon Bedrock Marketplace
To access Mistral-Small-24B-Instruct-2501 in Amazon Bedrock, complete the following steps:
- On the Amazon Bedrock console, choose Model catalog under Foundation models in the navigation pane.
At the time of writing this post, you can use the InvokeModel API to invoke the model. It doesn’t support the Converse API or other Amazon Bedrock tooling.
- Filter for Mistral as a provider and choose the Mistral-Small-24B-Instruct-2501 model.
The model detail page provides essential information about the model’s capabilities, pricing structure, and implementation guidelines. You can find detailed usage instructions, including sample API calls and code snippets for integration.
The page also includes deployment options and licensing information to help you get started with Mistral-Small-24B-Instruct-2501 in your applications.
- To begin using Mistral-Small-24B-Instruct-2501, choose Deploy.

- You will be prompted to configure the deployment details for Mistral-Small-24B-Instruct-2501. The model ID will be pre-populated.
- For Endpoint name, enter an endpoint name (up to 50 alphanumeric characters).
- For Number of instances, enter a number between 1 and 100.
- For Instance type, choose your instance type. For optimal performance with Mistral-Small-24B-Instruct-2501, a GPU-based instance type such as ml.g6.12xlarge is recommended.
- Optionally, you can configure advanced security and infrastructure settings, including virtual private cloud (VPC) networking, service role permissions, and encryption settings. For most use cases, the default settings will work well. However, for production deployments, you might want to review these settings to align with your organization’s security and compliance requirements.
- Choose Deploy to begin using the model.

When the deployment is complete, you can test Mistral-Small-24B-Instruct-2501’s capabilities directly in the Amazon Bedrock playground.
- Choose Open in playground to access an interactive interface where you can experiment with different prompts and adjust model parameters such as temperature and maximum length.
When using Mistral-Small-24B-Instruct-2501 with the Amazon Bedrock InvokeModel API and the Playground console, use Mistral’s instruction template for optimal results. For example, <s>[INST] content for inference [/INST].
This is an excellent way to explore the model’s reasoning and text generation abilities before integrating it into your applications. The playground provides immediate feedback, helping you understand how the model responds to various inputs and letting you fine-tune your prompts for optimal results.

You can quickly test the model in the playground through the UI. However, to invoke the deployed model programmatically with Amazon Bedrock APIs, you need to get the endpoint Amazon Resource Name (ARN).
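As a minimal sketch of what that programmatic call might look like (assuming boto3 is installed and AWS credentials are configured; the endpoint ARN, prompt, and payload field names below are illustrative placeholders, not values from this post), you pass the endpoint ARN as the modelId in an InvokeModel request:

```python
import json


def build_request_body(prompt: str, max_tokens: int = 512, temperature: float = 0.3) -> str:
    # Illustrative Mistral-style completion payload; exact field names may
    # vary by model version, so check the model detail page's sample API calls.
    return json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    })


def invoke(endpoint_arn: str, prompt: str) -> dict:
    # boto3 is imported here so the payload helper above works without it installed.
    import boto3

    client = boto3.client("bedrock-runtime")
    # For Amazon Bedrock Marketplace deployments, modelId is the endpoint ARN.
    response = client.invoke_model(
        modelId=endpoint_arn,
        body=build_request_body(prompt),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())


if __name__ == "__main__":
    # Placeholder ARN; replace with the ARN shown on your deployment's details page.
    result = invoke(
        "arn:aws:sagemaker:us-east-1:123456789012:endpoint/my-endpoint",
        "<s>[INST] What is the capital of France? [/INST]",
    )
    print(result)
```

The helper that builds the JSON body is kept separate from the network call so you can adapt the payload shape to whatever the model detail page specifies.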
Discover Mistral-Small-24B-Instruct-2501 in SageMaker JumpStart
You can access Mistral-Small-24B-Instruct-2501 through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK. In this section, we go over how to discover the models in SageMaker Studio.
SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform ML development steps, from preparing data to building, training, and deploying your ML models. For more information about how to get started and set up SageMaker Studio, see Amazon SageMaker Studio.
- In the SageMaker Studio console, access SageMaker JumpStart by choosing JumpStart in the navigation pane.

- Choose HuggingFace.
- From the SageMaker JumpStart landing page, search for Mistral-Small-24B-Instruct-2501 using the search box.
- Choose a model card to view details about the model, such as the license, the data used to train it, and how to use the model. Choose Deploy to deploy the model and create an endpoint.

Deploy Mistral-Small-24B-Instruct-2501 with the SageMaker SDK
Deployment starts when you choose Deploy. After deployment finishes, you will see that an endpoint is created. Test the endpoint by passing a sample inference request payload or by selecting the testing option using the SDK. When you select the option to use the SDK, you will see example code that you can use in the notebook editor of your choice in SageMaker Studio.
- To deploy using the SDK, start by selecting the Mistral-Small-24B-Instruct-2501 model, specified by the model_id with the value mistral-small-24B-instruct-2501. You can deploy your choice of the selected models on SageMaker using the following code. Similarly, you can deploy Mistral-Small-24B-Instruct-2501 using its model ID.
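A minimal deployment sketch follows (assuming the SageMaker Python SDK is installed and AWS credentials and quotas are in place; the model ID is the one referenced above, and the instance type matches the recommendation earlier in this post):

```python
# Model ID as referenced in this post.
MODEL_ID = "mistral-small-24B-instruct-2501"


def deploy_model(model_id: str = MODEL_ID, instance_type: str = "ml.g6.12xlarge"):
    # Imported inside the function so the constant above is usable
    # even where the SageMaker SDK is not installed.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id=model_id)
    # accept_eula=True explicitly accepts the end-user license agreement.
    return model.deploy(accept_eula=True, instance_type=instance_type)


if __name__ == "__main__":
    predictor = deploy_model()
```

The returned predictor object is what you use for inference and cleanup later in this post.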
This deploys the model on SageMaker with default configurations, including the default instance type and default VPC configurations. You can change these configurations by specifying non-default values in JumpStartModel. The EULA value must be explicitly set to True to accept the end-user license agreement (EULA). See AWS service quotas for how to request a service quota increase.
- After the model is deployed, you can run inference against the deployed endpoint through the SageMaker predictor:
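A sketch of such an inference call is shown below (assuming a predictor returned by deploy(); the payload shape follows the common Hugging Face text-generation convention and may differ for other serving containers, so treat the field names as assumptions):

```python
def build_payload(prompt: str) -> dict:
    # Illustrative payload for a text-generation endpoint; "inputs" and
    # "parameters" follow the Hugging Face TGI convention.
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": 250, "temperature": 0.3},
    }


def run_inference(predictor, prompt: str):
    # `predictor` is the object returned by model.deploy().
    return predictor.predict(build_payload(prompt))
```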
Retail math example
Here’s an example of how Mistral-Small-24B-Instruct-2501 can break down a common shopping scenario. In this case, you ask the model to calculate the final price of a shirt after applying multiple discounts—a situation many of us face while shopping. Notice how the model provides a clear, step-by-step solution.
The following is the output:
The response shows clear step-by-step reasoning without introducing incorrect information or hallucinated facts. Each mathematical step is explicitly shown, making it straightforward to verify the accuracy of the calculations.
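The discount arithmetic itself is easy to check by hand. With illustrative numbers (not the exact figures from the original prompt, which aren’t reproduced here), sequential percentage discounts multiply rather than add:

```python
# A $100 shirt with a 25% discount, then an additional 10% off the
# reduced price. Sequential discounts compound multiplicatively.
price = 100.0
price *= 1 - 0.25   # after the 25% discount
price *= 1 - 0.10   # after the additional 10% off the reduced price

# Note the total discount is 32.5%, not 35%, because the second
# discount applies to the already-reduced price.
print(round(price, 2))
```

This is the kind of multi-step calculation where a model’s explicit intermediate steps make verification straightforward.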
Clean up
To avoid unwanted charges, complete the steps in this section to clean up your resources.
Delete the Amazon Bedrock Marketplace deployment
If you deployed the model using Amazon Bedrock Marketplace, complete the following steps:
- On the Amazon Bedrock console, under Foundation models in the navigation pane, choose Marketplace deployments.
- In the Managed deployments section, locate the endpoint you want to delete.
- Select the endpoint, and on the Actions menu, choose Delete.
- Verify the endpoint details to make sure you’re deleting the correct deployment:
- Endpoint name
- Model name
- Endpoint status
- Choose Delete to delete the endpoint.
- In the deletion confirmation dialog, review the warning message, enter confirm, and choose Delete to permanently remove the endpoint.

Delete the SageMaker JumpStart predictor
After you’re done running the notebook, make sure to delete all resources that you created in the process to avoid additional billing. For more details, see Delete Endpoints and Resources.
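If you deployed through the SDK, a minimal cleanup sketch looks like the following (assuming `predictor` is the object returned by model.deploy() earlier):

```python
def cleanup(predictor):
    # `predictor` is the SageMaker Predictor returned by model.deploy().
    predictor.delete_model()     # removes the SageMaker model resource
    predictor.delete_endpoint()  # removes the endpoint and its configuration
```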
Conclusion
In this post, we showed you how to get started with Mistral-Small-24B-Instruct-2501 in SageMaker Studio and deploy the model for inference. Because foundation models are pre-trained, they can help lower training and infrastructure costs and enable customization for your use case. Visit SageMaker JumpStart in SageMaker Studio now to get started.
For more Mistral resources on AWS, check out the Mistral-on-AWS GitHub repo.
About the Authors
Niithiyn Vijeaswaran is a Generative AI Specialist Solutions Architect with the Third-Party Model Science team at AWS. His area of focus is AWS AI accelerators (AWS Neuron). He holds a Bachelor’s degree in Computer Science and Bioinformatics.
Preston Tuggle is a Sr. Specialist Solutions Architect working on generative AI.
Shane Rai is a Principal Generative AI Specialist with the AWS World Wide Specialist Organization (WWSO). He works with customers across industries to solve their most pressing and innovative business needs using the breadth of cloud-based AI/ML services offered by AWS, including model offerings from top-tier foundation model providers.
Avan Bala is a Solutions Architect at AWS. His area of focus is AI for DevOps and machine learning. He holds a bachelor’s degree in Computer Science with a minor in Mathematics and Statistics from the University of Maryland. Avan is currently working with the Enterprise Engaged East Team and likes to specialize in projects about emerging AI technologies.
Banu Nagasundaram leads product, engineering, and strategic partnerships for Amazon SageMaker JumpStart, the machine learning and generative AI hub provided by SageMaker. She is passionate about building solutions that help customers accelerate their AI journey and unlock business value.

