We sit up for announce that Amazon Bedrock customized mannequin imports are supported Qwen Mannequin. Now you can import customized weights for QWEN2, QWEN2_VL, and QWEN2_5_VL architectures, together with fashions akin to QWEN 2, 2.5 Coder, QWen 2.5 VL, and QWQ 32b. In the event you needn’t take your individual custom-made QWEN fashions to Amazon Bedrock and handle infrastructure or mannequin servings, you’ll be able to deploy them in a completely managed serverless surroundings.
This put up covers how one can deploy a QWEN 2.5 mannequin utilizing Amazon Bedrock customized mannequin imports, making it accessible to organizations wanting to make use of the newest AI capabilities inside their AWS infrastructure at an efficient value.
Qwen mannequin overview
Qwen 2 and a pair of.5 are a big household of language fashions out there in a variety of sizes and specialised variants to go well with quite a lot of wants.
- Basic language fashions: A mannequin with a spread of 0.5B to 72B parameters with each a generic job base and an educational model
- Qwen 2.5-Coder: Specializing in code technology and completion
- Qwen 2.5-math: Specializing in superior mathematical reasoning
- Qwen 2.5-VL (Imaginative and prescient Language): Allow picture and video processing features, multimodal purposes
Overview of importing Amazon Bedrock customized fashions
Amazon Bedrock Customized Mannequin imports can help you import and use custom-made fashions together with present primary fashions (FMS) through a single serverless, built-in API. You’ll be able to entry imported customized fashions on demand with out the necessity to handle the underlying infrastructure. Speed up the event of generated AI purposes by integrating supported customized fashions with native Amazon bedrock instruments and options such because the Amazon Bedrock Data Bases, Amazon Bedrock Guardrails, and Amazon Bedrock Agent. Importing Amazon Bedrock customized fashions is mostly out there within the US East (N. Virginia), US (Oregon), and Europe (Frankfurt) AWS areas. Subsequent, we’ll discover how one can use the QWEN 2.5 mannequin in two frequent use circumstances: as a coding assistant and for picture understanding. QWEN2.5-CODER is a cutting-edge code mannequin that matches the matching options of proprietary fashions just like the GPT-4O. It helps over 90 programming languages and is great at code technology, debugging and inference. QWen 2.5-VL brings superior multimodal performance. In response to Qwen, Qwen 2.5-VL is expert in not solely recognizing objects akin to flowers and animals, but in addition analyzing charts, extracting textual content from photos, deciphering doc layouts, and processing lengthy movies.
Stipulations
Earlier than importing a QWEN mannequin with Amazon Bedrock Customized Mannequin Import, be sure it exists as follows:
- Energetic AWS account
- Save QWEN mannequin information Amazon Easy Storage Service (Amazon S3) bucket
- Sufficient permissions to create an Amazon bedrock mannequin import job
- We now have confirmed that your space helps importing Amazon Bedrock customized fashions
Use Case 1: Qwen Coding Assistant
This instance exhibits how one can construct a coding assistant utilizing the QWEN2.5-Coder-7B-Instruct mannequin
- Go to Hugging my face Search and replica the mannequin ID qwen/qwen2.5-coder-7b-instruct.
I am going to use it Qwen/Qwen2.5-Coder-7B-Instruct For the remainder of the walkthrough. We now have not demonstrated the fine-tuning process, however you may as well tweak it earlier than importing.
- Use the next command to obtain a snapshot of the mannequin regionally: The Python library for hugging your face offers a utility referred to as Snapshot Obtain for this.
Relying on the mannequin measurement, this will take a couple of minutes. As soon as full, the Qwen Coder 7B mannequin folder will comprise the next information:
- Configuration File: embody
config.json,generation_config.json,tokenizer_config.json,tokenizer.jsonandvocab.json - Mannequin File:4
safetensorInformation andmannequin.safetensors.index.json - doc:
LICENSE,README.mdandmerges.txt

- Add and use the mannequin to Amazon S3
boto3Or the command line:
aws s3 cp ./extractedfolder s3://yourbucket/path/ --recursive
- Begin the import mannequin job utilizing the next API name:
You may as well do that utilizing Amazon Bedrock’s AWS Administration Console.
- Choose on the Amazon Bedrock console Imported fashions Within the navigation pane.
- select Import the mannequin.

- Enter the main points together with a Mannequin title, Import the job titleand the mannequin S3 location.

- Create a brand new service function or use an present service function. Subsequent, choose the import mannequin

- After deciding on Import The console should show the standing as an import when the mannequin is imported.


If you’re utilizing your individual function, add the next belief relationships as defined when creating the service function for mannequin import:
As soon as the mannequin is imported, watch for the mannequin inference to be prepared earlier than chatting with the mannequin through the playground or API. Within the following instance, we add Python It prompts the mannequin to output Python code immediately and lists the gadgets in an S3 bucket. Remember to make use of the suitable chat template and enter the immediate within the required format. For instance, you should use the code under to get a chat template appropriate for any mannequin that hugs your face.
Watch out when utilizing invoke_model The API requires that the imported mannequin makes use of the complete Amazon useful resource title (ARN). Yow will discover the mannequin ARN within the bedrock console by going to the imported mannequin part and viewing the mannequin particulars web page, as proven within the following picture.

When you’re able to infer the mannequin, you’ll be able to name the mannequin utilizing the bedrock console or the chat playground within the API.

Use Case 2: Understanding QWEN 2.5 VL Photos
QWEN2.5-VL-* offers multimodal performance that mixes imaginative and prescient and language understanding in a single mannequin. This part exhibits you how one can deploy QWEN2.5-VL utilizing an Amazon Bedrock customized mannequin, and imports and assessments the picture understanding function.
Import QWEN2.5-VL-7B to Amazon Bedrock
Obtain the mannequin from Huggingface Face and add it to Amazon S3.
Subsequent, import the mannequin into Amazon Bedrock (through console or API):
Check the imaginative and prescient function
As soon as the import is full, check the mannequin with picture enter. The QWEN2.5-VL-* mannequin requires the correct formatting of multimodal inputs.
As soon as photos of cat examples (akin to the next picture) are supplied, the mannequin will precisely clarify essential options such because the cat’s location, fur colour, eye colour, and basic look. This demonstrates the power to course of visible info within the QWEN2.5-VL-* mannequin and generate descriptions of associated texts.

Mannequin response:
Pricing
You should utilize Amazon Bedrock Customized Mannequin Import to host FMs together with Amazon Bedrock, utilizing the weights of customized fashions inside Amazon Bedrock for supported architectures, offering them in a completely managed method in on-demand mode. Importing a customized mannequin doesn’t cost to import a mannequin. You may be charged for inference based mostly on two elements: the variety of energetic mannequin copies and the length of their exercise. The billing happens in a 5-minute increment ranging from the primary profitable name of every mannequin copy. Pricing per minute varies based mostly on elements akin to structure, context size, area, computing unit model, and different elements, and is layered by mannequin copy measurement. The customized mannequin required for internet hosting is determined by the mannequin’s structure, parameter depend, and context size. Amazon Bedrock routinely manages scaling based mostly on utilization patterns. If there is no such thing as a 5 minute name, scale it to zero and scale as wanted, however this may occasionally embody a chilly begin latency of as much as 1 minute. If the inference quantity persistently exceeds the concurrency restrict of a single copy, a further copy is added. Most throughput and concurrency throughout import are decided throughout import based mostly on elements akin to enter/output token combine, {hardware} kind, mannequin measurement, structure, and inference optimization.
For extra info, see Amazon Bedrock Pricing.
cleansing
To keep away from steady charges after finishing the experiment:
- Use the console or API to take away imported QWEN fashions from Amazon Bedrock customized fashions.
- Optionally, when you not want an S3 bucket, take away the mannequin file from the S3 bucket.
Do not forget that importing Amazon Bedrock customized fashions shouldn’t be billed to the import course of itself, however it’s billed to make use of and storage of the mannequin’s inference.
Conclusion
Amazon Bedrock Customized Mannequin Import helps organizations profit from enterprise-grade infrastructure, whereas additionally utilizing highly effective public fashions, significantly Qwen 2.5. The serverless nature of Amazon Bedrock eliminates the complexity of mannequin deployment and operational administration, permitting groups to deal with constructing purposes somewhat than infrastructure. Amazon Bedrock presents a production-ready surroundings for AI workloads, together with auto-scaling, pay-per-user pricing, and seamless integration with AWS providers. The mixture of QWEN 2.5’s superior AI capabilities and Amazon Bedrock Managed Infrastructure offers the optimum steadiness of efficiency, value, and operational effectivity. Organizations can begin and scale up with smaller fashions when wanted, whereas nonetheless absolutely controlling the deployment of their fashions and benefiting from AWS safety and compliance capabilities.
For extra info, see the Amazon Bedrock Consumer Information.
Concerning the writer
Ajit Mahareddy It’s an skilled product with over 20 years of expertise in product administration, engineering and market. Previous to his present function, AJIT led AI/ML merchandise to main expertise corporations akin to Uber, Turing and eHealth. He’s captivated with advancing generative AI expertise and selling real-world affect with generative AI.
Shreyas Subramanian A number one knowledge scientist, serving to clients through the use of generative AI and fixing enterprise challenges utilizing AWS providers. Shrayas has a background in large-scale optimization and ML, and augmentation studying to speed up ML use and optimization duties.
Yang Yang Chang He’s a senior Generated AI Information Scientist at Amazon Net Providers, working as a Generated AI Specialist on cutting-edge AI/ML applied sciences, serving to clients use Generated AI to attain the specified outcomes. Yanyan graduated from Texas A&M College with a PhD in Electrical Engineering. Exterior of labor, she likes to journey, work out and discover new issues.
Dharinee Gupta He’s the Engineering Supervisor at AWS Bedrock and focuses on enabling clients to seamlessly make the most of open supply fashions through serverless options. Her staff focuses on optimizing these fashions to supply the very best cost-performance steadiness for his or her clients. Previous to her present function, she gained intensive expertise in authentication and authentication techniques on Amazon and developed a safe entry resolution for Amazon’s providing. Dharinee is captivated with making superior AI applied sciences accessible and environment friendly for AWS clients.
Lokeshwaran Ravi I am a senior deep studying compiler engineer at AWS and focuses on ML optimization, mannequin acceleration, and AI safety. He focuses on bettering effectivity, lowering prices, and democratizing AI expertise by making a secure ecosystem, making cutting-edge ML accessible and impactful throughout the trade.
June won He’s the main product supervisor for Amazon Sagemaker Jumpstart. He focuses on making Basis fashions straightforward to find to assist clients construct generative AI purposes. His expertise on Amazon additionally consists of cellular purchasing apps and final mile supply.

