This post was co-authored with Jingwei Zuo from TII.
We’re excited to announce the availability of the Technology Innovation Institute (TII)’s Falcon-H1 models on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, developers and data scientists can now use six instruction-tuned Falcon-H1 models (0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B) on AWS, gaining access to a comprehensive suite of hybrid architecture models that combine traditional attention mechanisms with State Space Models (SSMs) to deliver exceptional performance with unprecedented efficiency.
In this post, we present an overview of Falcon-H1 capabilities and show how to get started with TII’s Falcon-H1 models on both Amazon Bedrock Marketplace and SageMaker JumpStart.
Overview of TII and AWS collaboration
TII is a leading research institute based in Abu Dhabi. As part of the UAE’s Advanced Technology Research Council (ATRC), TII focuses on advanced technology research and development across AI, quantum computing, autonomous robotics, cryptography, and more. TII employs international teams of scientists, researchers, and engineers in an open and agile environment, aiming to drive technological innovation and position Abu Dhabi and the UAE as a global research and development hub, in alignment with the UAE National Strategy for Artificial Intelligence 2031.
TII and Amazon Web Services (AWS) are collaborating to expand access to made-in-the-UAE AI models across the globe. By combining TII’s technical expertise in building large language models (LLMs) with AWS Cloud-based AI and machine learning (ML) services, professionals worldwide can now build and scale generative AI applications using the Falcon-H1 series of models.
About Falcon-H1 models
The Falcon-H1 architecture implements a parallel hybrid design, using elements from Mamba and Transformer architectures to combine the faster inference and lower memory footprint of SSMs like Mamba with the effectiveness of the Transformer attention mechanism in understanding context and its enhanced generalization capabilities. The Falcon-H1 architecture scales across multiple configurations ranging from 0.5–34 billion parameters and provides native support for 18 languages. According to TII, the Falcon-H1 family demonstrates notable efficiency, with published metrics indicating that smaller model variants achieve performance parity with larger models. Some of the benefits of the Falcon-H1 series include:
- Performance – The hybrid attention-SSM model has optimized parameters with adjustable ratios between attention and SSM heads, leading to faster inference, lower memory usage, and strong generalization capabilities. According to TII benchmarks published in Falcon-H1’s technical blog post and technical report, Falcon-H1 models demonstrate superior performance across multiple scales against other leading Transformer models of comparable or larger scales. For example, Falcon-H1-0.5B delivers performance similar to typical 7B models from 2024, and Falcon-H1-1.5B-Deep rivals many of the current leading 7B-10B models.
- Wide range of model sizes – The Falcon-H1 series includes six sizes: 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B, with both base and instruction-tuned variants. The Instruct models are now available in Amazon Bedrock Marketplace and SageMaker JumpStart.
- Multilingual by design – The models support 18 languages natively (Arabic, Czech, German, English, Spanish, French, Hindi, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Romanian, Russian, Swedish, Urdu, and Chinese) and can scale to over 100 languages according to TII, thanks to a multilingual tokenizer trained on diverse language datasets.
- Up to 256,000 context length – The Falcon-H1 series enables applications in long-document processing, multi-turn dialogue, and long-range reasoning, showing a distinct advantage over competitors in practical long-context applications like Retrieval Augmented Generation (RAG).
- Robust data and training strategy – Training of Falcon-H1 models employs an innovative approach that introduces complex data early on, contrary to traditional curriculum learning. It also implements strategic data reuse based on careful memorization window analysis. Additionally, the training process scales smoothly across model sizes through a customized Maximal Update Parametrization (µP) recipe, specifically adapted for this novel architecture.
- Balanced performance in science and knowledge-intensive domains – Through a carefully designed data mixture and regular evaluations during training, the model achieves strong general capabilities and broad world knowledge while minimizing unintended specialization or domain-specific biases.
In line with their mission to foster AI accessibility and collaboration, TII has released the Falcon-H1 models under the Falcon LLM license. It offers the following benefits:
- Open source nature and accessibility
- Multi-language capabilities
- Cost-effectiveness compared to proprietary models
- Energy efficiency
About Amazon Bedrock Marketplace and SageMaker JumpStart
Amazon Bedrock Marketplace offers access to over 100 popular, emerging, specialized, and domain-specific models, so you can find the best proprietary and publicly available models for your use case based on factors such as accuracy, flexibility, and cost. On Amazon Bedrock Marketplace, you can discover models in a single place and access them through unified and secure Amazon Bedrock APIs. You can also select your desired number of instances and the instance type to meet the demands of your workload and optimize your costs.
SageMaker JumpStart helps you quickly get started with machine learning. It provides access to state-of-the-art model architectures, such as language models, computer vision models, and more, without having to build them from scratch. With SageMaker JumpStart, you can deploy models in a secure environment by provisioning them on SageMaker inference instances and isolating them within your virtual private cloud (VPC). You can also use Amazon SageMaker AI to further customize and fine-tune the models and streamline the entire model deployment process.
Solution overview
This post demonstrates how to deploy a Falcon-H1 model using both Amazon Bedrock Marketplace and SageMaker JumpStart. Although we use Falcon-H1-0.5B as an example, you can apply these steps to other models in the Falcon-H1 series. For help determining which deployment option (Amazon Bedrock Marketplace or SageMaker JumpStart) best suits your specific requirements, see Amazon Bedrock or Amazon SageMaker AI?
Deploy Falcon-H1-0.5B-Instruct with Amazon Bedrock Marketplace
In this section, we show how to deploy the Falcon-H1-0.5B-Instruct model in Amazon Bedrock Marketplace.
Prerequisites
To try the Falcon-H1-0.5B-Instruct model in Amazon Bedrock Marketplace, you must have access to an AWS account that will contain your AWS resources. Prior to deploying Falcon-H1-0.5B-Instruct, verify that your AWS account has sufficient quota allocation for ml.g6.xlarge instances. The default quota for endpoints using several instance types and sizes is 0, so attempting to deploy the model without a higher quota will trigger a deployment failure.
To request a quota increase, open the AWS Service Quotas console and search for Amazon SageMaker. Locate ml.g6.xlarge for endpoint usage and choose Request quota increase, then specify your desired limit value. After the request is approved, you can proceed with the deployment.
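If you prefer to check and request the quota programmatically, the following is a minimal sketch using the AWS SDK for Python (Boto3) and the Service Quotas API. The quota name string is an assumption based on how SageMaker endpoint quotas are typically labeled; confirm the exact name and quota code in your account before relying on it.

```python
import boto3

quotas = boto3.client("service-quotas")

# Scan SageMaker quotas for the ml.g6.xlarge endpoint usage entry
# (the quota name below is an assumption; verify it in your account)
paginator = quotas.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        if quota["QuotaName"] == "ml.g6.xlarge for endpoint usage":
            print(f"Current quota: {quota['Value']} (code {quota['QuotaCode']})")
            if quota["Value"] < 1:
                # Request an increase to 1 instance
                quotas.request_service_quota_increase(
                    ServiceCode="sagemaker",
                    QuotaCode=quota["QuotaCode"],
                    DesiredValue=1.0,
                )
```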
Deploy the model using the Amazon Bedrock Marketplace UI
To deploy the model using Amazon Bedrock Marketplace, complete the following steps:
- On the Amazon Bedrock console, under Discover in the navigation pane, choose Model catalog.
- Filter for Falcon-H1 as the model name and choose Falcon-H1-0.5B-Instruct.
The model overview page includes information about the model’s license terms, features, setup instructions, and links to further resources.
- Review the model license terms, and if you agree with the terms, choose Deploy.
- For Endpoint name, enter an endpoint name or leave the default pre-populated name.
- To minimize costs while experimenting, set the Number of instances to 1.
- For Instance type, choose from the list of compatible instance types. Falcon-H1-0.5B-Instruct is an efficient model, so ml.g6.xlarge is sufficient for this exercise.
Although the default configurations are generally sufficient for basic needs, you can customize advanced settings such as VPC networking, service access permissions, encryption keys, and resource tags. These advanced settings might require adjustment for production environments to maintain compliance with your organization’s security protocols.
- Choose Deploy.
- A prompt asks you to stay on the page while the AWS Identity and Access Management (IAM) role is being created. If your AWS account lacks sufficient quota for the selected instance type, you’ll receive an error message. In this case, refer to the preceding prerequisites section to increase your quota, then try the deployment again.
While deployment is in progress, you can choose Marketplace model deployments in the navigation pane to monitor the deployment progress in the Managed deployments section. When the deployment is complete, the endpoint status will change from Creating to In Service.
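If you’re scripting your workflow, you can also watch for this transition programmatically by polling the underlying SageMaker endpoint. The following is a minimal sketch using Boto3, assuming the endpoint name you entered during deployment (the name shown is hypothetical):

```python
import boto3

sagemaker_client = boto3.client("sagemaker")

endpoint_name = "falcon-h1-0-5b-instruct-endpoint"  # hypothetical; use your endpoint name

# Block until the endpoint transitions from Creating to InService
waiter = sagemaker_client.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=endpoint_name)

status = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
print(f"Endpoint status: {status}")
```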
Interact with the model in the Amazon Bedrock Marketplace playground
You can now test Falcon-H1 capabilities directly in the Amazon Bedrock playground by selecting the managed deployment and choosing Open in playground.
You can then use the Amazon Bedrock Marketplace playground to interact with Falcon-H1-0.5B-Instruct.
Invoke the model using code
In this section, we demonstrate how to invoke the model using the Amazon Bedrock Converse API. Replace the placeholder in the code with your endpoint’s Amazon Resource Name (ARN), which begins with arn:aws:sagemaker. You can find this ARN on the endpoint details page in the Managed deployments section.
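The following is a minimal sketch of a Converse API call, assuming a default Boto3 session in the endpoint’s Region; the ARN shown is a placeholder to replace with your own:

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Placeholder: replace with your endpoint ARN from the Managed deployments section
endpoint_arn = "arn:aws:sagemaker:us-east-1:111122223333:endpoint/falcon-h1-0-5b-instruct"

response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [{"text": "Explain the advantages of hybrid attention-SSM architectures."}],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.7},
)

# Print the generated text from the assistant message
print(response["output"]["message"]["content"][0]["text"])
```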
To learn more about the detailed steps and example code for invoking the model using Amazon Bedrock APIs, refer to Submit prompts and generate responses using the API.
Deploy Falcon-H1-0.5B-Instruct with SageMaker JumpStart
You can access FMs in SageMaker JumpStart through Amazon SageMaker Studio, the SageMaker SDK, and the AWS Management Console. In this walkthrough, we demonstrate how to deploy Falcon-H1-0.5B-Instruct using the SageMaker Python SDK. Refer to Deploy a model in Studio to learn how to deploy the model through SageMaker Studio.
Prerequisites
To deploy Falcon-H1-0.5B-Instruct with SageMaker JumpStart, you must have the following prerequisites:
- An AWS account that will contain your AWS resources.
- An IAM role to access SageMaker AI. To learn more about how IAM works with SageMaker AI, see Identity and Access Management for Amazon SageMaker AI.
- Access to SageMaker Studio with a JupyterLab space, or an interactive development environment (IDE) such as Visual Studio Code or PyCharm.
Deploy the model programmatically using the SageMaker Python SDK
Before deploying Falcon-H1-0.5B-Instruct using the SageMaker Python SDK, make sure you have installed the SDK and configured your AWS credentials and permissions.
The following code example demonstrates how to deploy the model:
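This is a minimal sketch using the JumpStart classes in the SageMaker Python SDK. The model ID shown is an assumption; look up the exact Falcon-H1-0.5B-Instruct identifier in the SageMaker JumpStart model catalog before running it:

```python
from sagemaker.jumpstart.model import JumpStartModel

# Hypothetical model ID; confirm the exact ID in the SageMaker JumpStart catalog
model = JumpStartModel(model_id="huggingface-llm-falcon-h1-0-5b-instruct")

# Deploy a real-time endpoint; accept_eula acknowledges the Falcon LLM license terms
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g6.xlarge",
    accept_eula=True,
)

# Note this endpoint name for the inference example that follows
print(predictor.endpoint_name)
```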
When the preceding code segment completes successfully, the Falcon-H1-0.5B-Instruct model deployment is complete and available on a SageMaker endpoint. Note the endpoint name shown in the output; you’ll replace the placeholder in the following code segment with this value.
The following code demonstrates how to prepare the input data, make the inference API call, and process the model’s response:
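The sketch below assumes the chat-style request schema commonly exposed by JumpStart LLM serving containers; the exact payload and response fields depend on the container the model ships with, so adjust them to match the model’s documentation:

```python
# Chat-style payload; field names are assumptions tied to the serving container
payload = {
    "messages": [
        {"role": "user", "content": "Summarize the benefits of hybrid attention-SSM models."}
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}

# Invoke the endpoint through the predictor returned by model.deploy()
response = predictor.predict(payload)

# Response shape is also container-dependent; this assumes an OpenAI-style schema
print(response["choices"][0]["message"]["content"])
```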
Clean up
To avoid ongoing charges for the AWS resources used while experimenting with Falcon-H1 models, make sure to delete all deployed endpoints and their associated resources when you’re finished. To do so, complete the following steps (a programmatic cleanup sketch follows the list):
- Delete Amazon Bedrock Marketplace resources:
- On the Amazon Bedrock console, choose Marketplace model deployments in the navigation pane.
- Under Managed deployments, choose the Falcon-H1 model endpoint you deployed earlier.
- Choose Delete and confirm the deletion if you no longer need to use this endpoint in Amazon Bedrock Marketplace.
- Delete SageMaker endpoints:
- On the SageMaker AI console, choose Endpoints under Inference in the navigation pane.
- Select the endpoint associated with the Falcon-H1 models.
- Choose Delete and confirm the deletion. This stops the endpoint and avoids further compute charges.
- Delete SageMaker models:
- On the SageMaker AI console, choose Models under Inference.
- Select the model associated with your endpoint and choose Delete.
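If you deployed with the SageMaker Python SDK, you can perform the same cleanup programmatically. This is a minimal sketch; the Boto3 calls assume the endpoint configuration shares the endpoint’s name, which is the SDK default but worth verifying in your account:

```python
# Preferred: let the predictor from model.deploy() clean up its own resources
predictor.delete_model()
predictor.delete_endpoint()

# Alternative Boto3 calls if you only have the endpoint name (hypothetical name shown)
import boto3

sagemaker_client = boto3.client("sagemaker")
sagemaker_client.delete_endpoint(EndpointName="falcon-h1-0-5b-instruct-endpoint")
sagemaker_client.delete_endpoint_config(EndpointConfigName="falcon-h1-0-5b-instruct-endpoint")
```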
Always verify that all endpoints are deleted after experimentation to optimize costs. Refer to the Amazon SageMaker documentation for additional guidance on managing resources.
Conclusion
The availability of Falcon-H1 models in Amazon Bedrock Marketplace and SageMaker JumpStart helps developers, researchers, and businesses build cutting-edge generative AI applications with ease. Falcon-H1 models offer multilingual support (18 languages) across a range of model sizes (from 0.5B to 34B parameters) and support up to 256K context length, thanks to their efficient hybrid attention-SSM architecture.
By using the seamless discovery and deployment capabilities of Amazon Bedrock Marketplace and SageMaker JumpStart, you can accelerate your AI innovation while benefiting from the secure, scalable, and cost-effective AWS Cloud infrastructure.
We encourage you to explore the Falcon-H1 models in Amazon Bedrock Marketplace or SageMaker JumpStart. You can use these models in AWS Regions where Amazon Bedrock or SageMaker JumpStart and the required instance types are available.
For further reading, explore the AWS Machine Learning Blog, the SageMaker JumpStart GitHub repository, and the Amazon Bedrock User Guide. Start building your next generative AI application with Falcon-H1 models and unlock new possibilities with AWS!
Special thanks to everyone who contributed to the launch: Evan Kravitz, Varun Morishetty, and Yotam Moss.
About the authors
Mehran Nikoo leads the go-to-market strategy for Amazon Bedrock and agentic AI in EMEA at AWS, where he has been driving the development of AI systems and cloud-native solutions over the last four years. Prior to joining AWS, Mehran held leadership and technical positions at Trainline, McLaren, and Microsoft. He holds an MBA from Warwick Business School and an MRes in Computer Science from Birkbeck, University of London.
Mustapha Tawbi is a Senior Partner Solutions Architect at AWS, specializing in generative AI and ML, with 25 years of enterprise technology experience across AWS, IBM, Sopra Group, and Capgemini. He has a PhD in Computer Science from Sorbonne and a Master’s degree in Data Science from Heriot-Watt University Dubai. Mustapha leads generative AI technical collaborations with AWS partners throughout the MENAT region.
Jingwei Zuo is a Lead Researcher at the Technology Innovation Institute (TII) in the UAE, where he leads the Falcon Foundational Models team. He received his PhD in 2022 from the University of Paris-Saclay, where he was awarded the Plateau de Saclay Doctoral Prize. He holds an MSc (2018) from the University of Paris-Saclay, an Engineer’s degree (2017) from Sorbonne Université, and a BSc from Huazhong University of Science & Technology.
John Liu is a Principal Product Manager for Amazon Bedrock at AWS. Previously, he served as the Head of Product for AWS Web3/Blockchain. Prior to joining AWS, John held various product leadership roles at public blockchain protocols and financial technology (fintech) companies for 14 years. He also has 9 years of portfolio management experience at several hedge funds.
Hamza MIMI is a Solutions Architect for partners and strategic deals in the MENAT region at AWS, where he bridges cutting-edge technology with impactful business outcomes. With expertise in AI and a passion for sustainability, he helps organizations architect innovative solutions that drive both digital transformation and environmental responsibility, transforming complex challenges into opportunities for growth and positive change.