Text-to-SQL generation remains a persistent challenge in enterprise AI applications, particularly when working with custom SQL dialects or domain-specific database schemas. While foundation models (FMs) demonstrate strong performance on standard SQL, achieving production-grade accuracy for specialized dialects requires fine-tuning. However, fine-tuning introduces an operational trade-off: hosting custom models on persistent infrastructure incurs continuous costs, even during periods of zero utilization.
The on-demand inference of Amazon Bedrock with fine-tuned Amazon Nova Micro models offers an alternative. By combining the efficiency of LoRA (Low-Rank Adaptation) fine-tuning with serverless, pay-per-token inference, organizations can achieve custom text-to-SQL capabilities without the overhead cost incurred by persistent model hosting. Despite the additional inference time overhead of applying LoRA adapters, testing demonstrated latency suitable for interactive text-to-SQL applications, with costs scaling by usage rather than provisioned capacity.
In this post, we demonstrate two approaches to fine-tune Amazon Nova Micro for custom SQL dialect generation to deliver both cost efficiency and production-ready performance. Our example workload maintained an inference cost of $0.80 per month with sample traffic of 22,000 queries per month, which resulted in cost savings compared to a persistently hosted model infrastructure.
Prerequisites
To deploy these solutions, you’ll need the following:
- An AWS account with billing enabled
- Standard IAM permissions and a role configured to access:
- Quota for an ml.g5.48xlarge instance for Amazon SageMaker AI training.
Solution overview
The solution consists of the following high-level steps:
- Prepare your custom SQL training dataset with I/O pairs specific to your organization’s SQL dialect and business requirements.
- Start the fine-tuning process on the Amazon Nova Micro model using your prepared dataset and selected fine-tuning approach:
  - Amazon Bedrock model customization for streamlined deployment
  - Amazon SageMaker AI for fine-grained training customization and control
- Deploy the custom model on Amazon Bedrock to use on-demand inference, removing infrastructure management while paying only for token usage.
- Validate model performance with test queries specific to your custom SQL dialect and business use cases.
To demonstrate this approach in practice, we provide two complete implementation paths that address different organizational needs. The first uses the managed model customization of Amazon Bedrock for teams prioritizing simplicity and rapid deployment. The second uses Amazon SageMaker AI training jobs for organizations requiring more granular control over hyperparameters and training infrastructure. Both implementations share the same data preparation pipeline and deploy to Amazon Bedrock for on-demand inference. The following are links to each GitHub code sample:
The following architecture diagram illustrates the end-to-end workflow, which encompasses data preparation, both fine-tuning approaches, and the Bedrock deployment path that enables serverless inference.
1. Dataset preparation
Our demonstration uses the sql-create-context dataset. This dataset is a curated combination of the WikiSQL and Spider datasets containing over 78,000 examples of natural language questions paired with SQL queries across diverse database schemas. This dataset provides an ideal foundation for text-to-SQL fine-tuning due to its variety in query complexity, from simple SELECT statements to complex multi-table joins with aggregations.
Data formatting and structure
The training data is structured as outlined in the documentation. This involves creating JSONL files that contain system prompt instructions paired with user queries and corresponding SQL responses of varying complexity. The formatted training dataset is then split into training and validation sets, stored as JSONL files, and uploaded to Amazon Simple Storage Service (Amazon S3) for the fine-tuning process.
Sample converted record
{
"schemaVersion": "bedrock-conversation-2024",
"system": [
{
"text": "You are a powerful text-to-SQL model. Your job is to answer questions about a database. You can use the following table schema for context: CREATE TABLE head (age INTEGER)"
}
],
"messages": [
{
"role": "user",
"content": [
{
"text": "Return the SQL query that answers the following question: How many heads of the departments are older than 56 ?"
}
]
},
{
"role": "assistant",
"content": [
{
"text": "SELECT COUNT(*) FROM head WHERE age > 56"
}
]
}
]
}
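The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the sample notebooks: it assumes each source row is a dictionary with `question`, `context`, and `answer` fields, and the helper names (`to_conversation_record`, `split_records`, `write_jsonl`) and the 10% validation fraction are hypothetical.

```python
import json
import random

# Prompt templates copied from the sample record above
SYSTEM_TEMPLATE = (
    "You are a powerful text-to-SQL model. Your job is to answer questions "
    "about a database. You can use the following table schema for context: {context}"
)
USER_TEMPLATE = "Return the SQL query that answers the following question: {question}"


def to_conversation_record(example):
    """Convert one row (assumed keys: question, context, answer) to the conversation schema."""
    return {
        "schemaVersion": "bedrock-conversation-2024",
        "system": [{"text": SYSTEM_TEMPLATE.format(context=example["context"])}],
        "messages": [
            {
                "role": "user",
                "content": [{"text": USER_TEMPLATE.format(question=example["question"])}],
            },
            {"role": "assistant", "content": [{"text": example["answer"]}]},
        ],
    }


def split_records(records, validation_fraction=0.1, seed=42):
    """Shuffle deterministically and return (training, validation) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def write_jsonl(records, path):
    """Write one JSON object per line, as expected for JSONL training files."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

The resulting training and validation files can then be uploaded to Amazon S3 as described above.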
Amazon Bedrock fine-tuning approach
The model customization of Amazon Bedrock provides a streamlined, fully managed approach to fine-tuning Amazon Nova models without the need to provision or manage training infrastructure. This method is ideal for teams seeking rapid iteration and minimal operational overhead while achieving custom model performance tailored to their text-to-SQL use case.
Using the customization capabilities of Amazon Bedrock, training data is uploaded to Amazon S3, and fine-tuning jobs are configured through the AWS console or API. AWS then handles the underlying training infrastructure. The resulting custom model can be deployed using on-demand inference, maintaining the same token-based pricing as the base Nova Micro model with no additional markup, making it a cost-effective solution for variable workloads. This approach is well-suited when you need to quickly customize a model for custom SQL dialects without managing ML infrastructure, want to minimize operational complexity, or need serverless inference with automatic scaling.
2a. Creating a fine-tuning job using Amazon Bedrock
Amazon Bedrock supports fine-tuning using both the AWS Console and the AWS SDK for Python (Boto3). The AWS documentation contains general guidance on how to submit a training job with both approaches. In our implementation, we used the AWS SDK for Python (Boto3). Refer to the sample notebook in our GitHub samples repository to view our step-by-step implementation.
Configure hyperparameters
After selecting the model to fine-tune, we configure the hyperparameters for our use case. For Amazon Nova Micro fine-tuning on Amazon Bedrock, the following hyperparameters can be customized to optimize our text-to-SQL model:
| Parameter | Range/Constraints | Purpose | What we used |
| Epochs | 1–5 | Number of complete passes through the training dataset | 5 epochs |
| Batch Size | Fixed at 1 | Number of samples processed before updating model weights | 1 (fixed for Nova Micro) |
| Learning Rate | 0.000001–0.0001 | Step size for gradient descent optimization | 0.00001 for stable convergence |
| Learning Rate Warmup Steps | 0–100 | Number of steps to gradually increase learning rate | 10 |
Note: These hyperparameters were optimized for our specific dataset and use case. Optimal values may vary based on dataset size and complexity. On the sample dataset, this configuration provided a good balance between model accuracy and training time, completing in approximately 2–3 hours.
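As a rough sketch of how the values in the table could be turned into a job request with Boto3, the following helpers validate the ranges listed above and assemble the arguments for `create_model_customization_job`. The hyperparameter key names (`epochCount`, `batchSize`, `learningRate`, `learningRateWarmupSteps`) and the base model identifier are assumptions that should be checked against the current Amazon Bedrock documentation; the helper names are hypothetical and not taken from the sample notebook.

```python
# Assumed base model identifier for Nova Micro customization; verify before use
BASE_MODEL_ID = "amazon.nova-micro-v1:0:128k"


def build_hyperparameters(epochs=5, learning_rate=0.00001, warmup_steps=10):
    """Validate values against the ranges in the table and return string-valued settings."""
    if not 1 <= epochs <= 5:
        raise ValueError("epochs must be between 1 and 5")
    if not 0.000001 <= learning_rate <= 0.0001:
        raise ValueError("learning_rate must be between 0.000001 and 0.0001")
    if not 0 <= warmup_steps <= 100:
        raise ValueError("warmup_steps must be between 0 and 100")
    return {
        "epochCount": str(epochs),
        "batchSize": "1",  # fixed for Nova Micro
        "learningRate": f"{learning_rate:.6f}",
        "learningRateWarmupSteps": str(warmup_steps),
    }


def build_job_request(job_name, model_name, role_arn, train_uri, val_uri, output_uri,
                      hyperparameters):
    """Assemble keyword arguments for the Bedrock create_model_customization_job call."""
    return {
        "jobName": job_name,
        "customModelName": model_name,
        "roleArn": role_arn,
        "baseModelIdentifier": BASE_MODEL_ID,
        "customizationType": "FINE_TUNING",
        "hyperParameters": hyperparameters,
        "trainingDataConfig": {"s3Uri": train_uri},
        "validationDataConfig": {"validators": [{"s3Uri": val_uri}]},
        "outputDataConfig": {"s3Uri": output_uri},
    }
```

With AWS credentials configured, the request could be submitted with `boto3.client("bedrock").create_model_customization_job(**request)`.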
Analyzing training metrics
Amazon Bedrock automatically generates training and validation metrics, which are stored in your specified S3 output location. These metrics include:
- Training loss: Measures how well the model fits the training data
- Validation loss: Indicates generalization performance on unseen data

The training and validation loss curves show successful training: both decrease consistently, follow similar patterns, and converge to similar final values.
3a. Deploy with on-demand inference
After your fine-tuning job completes successfully, you can deploy your custom Nova Micro model using on-demand inference. This deployment option provides automatic scaling and pay-per-token pricing, making it ideal for variable workloads without the need to provision dedicated compute resources.
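A minimal sketch of creating the on-demand deployment and waiting for it to become active follows. It assumes the Boto3 `bedrock` client operations `create_custom_model_deployment` and `get_custom_model_deployment` with the parameter and response names shown; confirm these against the Boto3 reference before relying on them. The function name, polling interval, and retry limit are hypothetical.

```python
import time


def deploy_custom_model(bedrock_client, custom_model_arn, deployment_name,
                        poll_seconds=30, max_polls=60):
    """Create an on-demand deployment and poll until it is active.

    bedrock_client is expected to behave like boto3.client("bedrock").
    """
    created = bedrock_client.create_custom_model_deployment(
        modelDeploymentName=deployment_name,
        modelArn=custom_model_arn,
    )
    deployment_arn = created["customModelDeploymentArn"]

    for _ in range(max_polls):
        status = bedrock_client.get_custom_model_deployment(
            customModelDeploymentIdentifier=deployment_arn
        )["status"]
        if status == "Active":
            return deployment_arn
        if status == "Failed":
            raise RuntimeError(f"Deployment failed: {deployment_arn}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"Deployment not active after {max_polls} polls: {deployment_arn}")
```

The returned ARN is the value to pass as the model ID when calling the Converse API.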
Invoking the custom Nova Micro model
After deployment, you can invoke your custom text-to-SQL model by using the deployment ARN as the model ID in the Amazon Bedrock Converse API.
import boto3

# Runtime client for inference; use the Region where the model is deployed
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Use the deployment ARN as the model ID
deployment_arn = "arn:aws:bedrock:us-east-1:<account-id>:deployment/<deployment-id>"

# Prepare the inference request
response = bedrock_runtime.converse(
    modelId=deployment_arn,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": """Database schema:
CREATE TABLE sales (
id INT,
product_name VARCHAR(100),
category VARCHAR(50),
revenue DECIMAL(10,2),
sale_date DATE
);
Question: What are the top 5 products by revenue in the Electronics category?"""
                }
            ]
        }
    ],
    inferenceConfig={
        "maxTokens": 512,
        "temperature": 0.1,  # Low temperature for more deterministic SQL generation
        "topP": 0.9
    }
)

# Extract the generated SQL query; "content" is a list of content blocks
sql_query = response['output']['message']['content'][0]['text']
print(f"Generated SQL:\n{sql_query}")
Amazon SageMaker AI fine-tuning approach
While the Amazon Bedrock approach streamlines model customization through a managed training experience, organizations seeking deeper optimization control might benefit from the SageMaker AI approach. SageMaker AI provides extensive control over training parameters that can significantly affect efficiency and model performance. You can adjust batch size for speed and memory optimization, fine-tune dropout settings across layers to prevent overfitting, and configure learning rate schedules for training stability. For LoRA fine-tuning specifically, you can use SageMaker AI to customize scaling factors and regularization parameters with different settings optimized for multimodal versus text-only datasets. Additionally, you can adjust the context window size and optimizer settings to match your specific use case requirements. See the following notebook for the complete code sample.
1b. Data preparation and upload
The data preparation and upload process for the SageMaker AI fine-tuning approach is identical to the Amazon Bedrock implementation. Both approaches convert the SQL dataset to the bedrock-conversation-2024 schema format, split the data into training and test sets, and upload the JSONL files directly to S3.
# sess is the SageMaker session created earlier in the notebook
bucket_name = sess.default_bucket()

# S3 key prefix for training data (a key prefix, not a full s3:// URI)
training_input_prefix = "datasets/nova-sql-context"

# Upload datasets to S3; upload_data returns the s3:// URI of each uploaded file
train_s3_path = sess.upload_data(
    path="data/train_dataset.jsonl",
    bucket=bucket_name,
    key_prefix=training_input_prefix
)
test_s3_path = sess.upload_data(
    path="data/test_dataset.jsonl",
    bucket=bucket_name,
    key_prefix=training_input_prefix
)
print(f'Training data uploaded to: {train_s3_path}')
print(f'Test data uploaded to: {test_s3_path}')
2b. Creating a fine-tuning job using Amazon SageMaker AI
Select the model ID, recipe, and image URI:
# Nova configuration
model_id = "nova-micro/prod"
recipe = "https://raw.githubusercontent.com/aws/sagemaker-hyperpod-recipes/refs/heads/main/recipes_collection/recipes/fine-tuning/nova/nova_1_0/nova_micro/SFT/nova_micro_1_0_g5_g6_48x_gpu_lora_sft.yaml"
instance_type = "ml.g5.48xlarge"
instance_count = 1
# Nova-specific image URI
image_uri = f"708977205387.dkr.ecr.{sess.boto_region_name}.amazonaws.com/nova-fine-tune-repo:SM-TJ-SFT-latest"
print(f'Model ID: {model_id}')
print(f'Recipe: {recipe}')
print(f'Instance type: {instance_type}')
print(f'Instance count: {instance_count}')
print(f'Image URI: {image_uri}')
Configuring custom training recipes
A key differentiator when using Amazon SageMaker AI for Nova model fine-tuning is the ability to customize a training recipe. Recipes are pre-configured training stacks provided by AWS to help you quickly start training and fine-tuning. While maintaining compatibility with the standard hyperparameter set (epochs, batch size, learning rate, and warmup steps) of Amazon Bedrock, the recipes extend hyperparameter options through:
- Regularization parameters: hidden_dropout, attention_dropout, and ffn_dropout to prevent overfitting.
- Optimizer settings: Customizable beta coefficients and weight decay settings.
- Architecture controls: Adapter rank and scaling factors for LoRA training.
- Advanced scheduling: Custom learning rate schedules and warmup strategies.
The recommended approach is to start with the default settings to create a baseline, then optimize based on your specific needs. Here’s a list of some of the additional parameters that you can optimize.
| Parameter | Range/Constraints | Purpose |
| max_length | 1024–8192 | Controls the maximum context window size for input sequences |
| global_batch_size | 16, 32, 64 | Number of samples processed before updating model weights |
| hidden_dropout | 0.0–1.0 | Regularization for hidden layer states to prevent overfitting |
| attention_dropout | 0.0–1.0 | Regularization for attention mechanism weights |
| ffn_dropout | 0.0–1.0 | Regularization for feed-forward network layers |
| weight_decay | 0.0–1.0 | L2 regularization strength for model weights |
| adapter_dropout | 0.0–1.0 | Regularization for LoRA adapter parameters |
The complete recipe that we used can be found here.
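Before launching a training job, it can help to sanity-check candidate values against the ranges in the table. The sketch below is illustrative only: it uses a flat dictionary keyed by the parameter names listed above, whereas the actual recipe YAML nests these settings, so the layout and the `validate_overrides` helper are assumptions rather than part of the SageMaker SDK.

```python
# Ranges copied from the table above
PARAMETER_RANGES = {
    "max_length": (1024, 8192),
    "hidden_dropout": (0.0, 1.0),
    "attention_dropout": (0.0, 1.0),
    "ffn_dropout": (0.0, 1.0),
    "weight_decay": (0.0, 1.0),
    "adapter_dropout": (0.0, 1.0),
}
ALLOWED_GLOBAL_BATCH_SIZES = {16, 32, 64}


def validate_overrides(overrides):
    """Return a list of human-readable problems; an empty list means all values are in range."""
    problems = []
    for name, value in overrides.items():
        if name == "global_batch_size":
            if value not in ALLOWED_GLOBAL_BATCH_SIZES:
                problems.append(f"{name}={value} is not one of 16, 32, 64")
        elif name in PARAMETER_RANGES:
            low, high = PARAMETER_RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside {low}-{high}")
        else:
            problems.append(f"{name} is not listed in the parameter table")
    return problems
```

For example, `validate_overrides({"max_length": 4096, "global_batch_size": 64})` returns an empty list, while a batch size of 48 is reported as a problem.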
Creating and executing a SageMaker AI training job
After configuring your model and recipe, initialize the ModelTrainer object and begin training:
from sagemaker.train import ModelTrainer
from sagemaker.train.configs import InputData, S3DataSource

# recipe_overrides, compute_config, stopping_condition, output_config, role,
# and job_name are defined earlier in the notebook
trainer = ModelTrainer.from_recipe(
    training_recipe=recipe,
    recipe_overrides=recipe_overrides,
    compute=compute_config,
    stopping_condition=stopping_condition,
    output_data_config=output_config,
    role=role,
    base_job_name=job_name,
    sagemaker_session=sess,
    training_image=image_uri
)

# Configure data channels
train_input = InputData(
    channel_name="train",
    data_source=S3DataSource(
        s3_uri=train_s3_path,
        s3_data_type="Converse",
        s3_data_distribution_type="FullyReplicated"
    )
)
val_input = InputData(
    channel_name="val",
    data_source=S3DataSource(
        s3_uri=test_s3_path,
        s3_data_type="Converse",
        s3_data_distribution_type="FullyReplicated"
    )
)

# Begin training without blocking the notebook
training_job = trainer.train(
    input_data_config=[train_input, val_input],
    wait=False
)
After training, we register the model with Amazon Bedrock through the create_custom_model_deployment Amazon Bedrock API, enabling on-demand inference through the Converse API using the deployed model ARN, system prompts, and user messages.
In our SageMaker AI training job, we used the default recipe parameters, including 2 epochs and a batch size of 64. Our data contained 20,000 lines, and the complete training job lasted 4 hours. With our ml.g5.48xlarge instance, the total cost for fine-tuning our Nova Micro model was $65.
4. Testing and evaluation
To evaluate our model, we conducted both operational and accuracy testing. To evaluate accuracy, we implemented an LLM-as-a-Judge approach where we collected questions and SQL responses from our fine-tuned model and used a judge model to score them against the ground truth responses.
import re

def get_score(system, user, assistant, generated):
    # Build the judge prompt from the schema, question, reference SQL, and generated SQL
    formatted_prompt = (
        "You are a data science teacher that is introducing students to SQL. "
        "Consider the following question and schema:"
        f"<question>{user}</question>"
        f"<schema>{system}</schema>"
        "Here is the correct answer:"
        f"<correct_answer>{assistant}</correct_answer>"
        "Here is the student's answer:"
        f"<student_answer>{generated}</student_answer>"
        "Please provide a numeric score from 0 to 100 on how well the student's "
        "answer matches the correct answer. Put the score in <SCORE> XML tags."
    )
    # ask_claude is the notebook's helper that sends the prompt to the judge model
    _, result = ask_claude(formatted_prompt)
    # Pull the score out of the <SCORE> tags; default to "0" if the judge omits them
    pattern = r'<SCORE>(.*?)</SCORE>'
    match = re.search(pattern, result)
    return match.group(1) if match else "0"
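Once the judge has scored every test example, the raw outputs need to be aggregated. The following sketch shows one way to do that under stated assumptions: it parses the `<SCORE>` tag as a number, clamps it to the 0–100 range, and reports the mean and the share of answers at or above a threshold. The `parse_score` and `summarize_scores` helpers and the threshold of 90 are hypothetical and were not part of our evaluation code.

```python
import re
from statistics import mean

# Matches an integer or decimal score inside <SCORE> tags, allowing surrounding whitespace
SCORE_PATTERN = re.compile(r"<SCORE>\s*(\d+(?:\.\d+)?)\s*</SCORE>")


def parse_score(judge_output):
    """Return the score as a float clamped to 0-100, or None if no score is found."""
    match = SCORE_PATTERN.search(judge_output)
    if match is None:
        return None
    return min(100.0, max(0.0, float(match.group(1))))


def summarize_scores(judge_outputs, pass_threshold=90.0):
    """Aggregate judge outputs into a count, mean score, and pass rate."""
    parsed = [parse_score(output) for output in judge_outputs]
    scores = [score for score in parsed if score is not None]
    passed = [score for score in scores if score >= pass_threshold]
    return {
        "scored": len(scores),
        "unparsed": len(parsed) - len(scores),
        "mean": mean(scores) if scores else None,
        "pass_rate": len(passed) / len(scores) if scores else None,
    }
```

Tracking unparsed outputs separately avoids silently counting a malformed judge response as a score of zero.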
For operational testing, we gathered metrics including TTFT (Time to First Token) and OTPS (Output Tokens Per Second). Compared to the base Nova Micro model, we observed a cold start time to first token averaging 639 ms across 5 runs (a 34% increase). This latency increase stems from applying LoRA adapters at inference time rather than baking them into the model weights. However, this architectural choice delivers substantial cost benefits, because the fine-tuned Nova Micro model costs the same as the base model, enabling on-demand pricing with pay-per-use flexibility and no minimum commitments. During normal operation, our time to first token averaged 380 ms across 50 calls (a 7% increase). End-to-end latency totaled approximately 477 ms for full response generation. Token generation maintained a rate of approximately 183 tokens per second, a 27% decrease from the base model, which remains suitable for interactive applications.
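A minimal sketch of how TTFT and OTPS could be measured from a streamed response follows. It assumes the event shapes of the Amazon Bedrock ConverseStream API (`contentBlockDelta` events for generated text and a final `metadata` event carrying `usage.outputTokens`), and it defines OTPS as output tokens divided by the time between the first token and the end of the stream. That definition and the `measure_stream` helper are assumptions and may differ from how the figures above were collected.

```python
import time


def measure_stream(start_stream, clock=time.perf_counter):
    """Return (ttft_seconds, output_tokens_per_second) for one streamed response.

    start_stream is a zero-argument callable that sends the request and returns
    an iterable of stream events, so the timer covers the request itself.
    """
    started_at = clock()
    first_token_at = None
    output_tokens = None

    for event in start_stream():
        # The first text delta marks the arrival of the first generated token
        if first_token_at is None and "contentBlockDelta" in event:
            first_token_at = clock()
        # The metadata event at the end of the stream reports token usage
        if "metadata" in event:
            output_tokens = event["metadata"]["usage"]["outputTokens"]
    finished_at = clock()

    if first_token_at is None:
        return None, None
    ttft = first_token_at - started_at
    generation_time = finished_at - first_token_at
    if output_tokens is None or generation_time <= 0:
        return ttft, None
    return ttft, output_tokens / generation_time
```

With a Boto3 runtime client, `start_stream` could be `lambda: bedrock_runtime.converse_stream(modelId=deployment_arn, messages=messages)["stream"]`; averaging the results over repeated calls gives figures comparable to those reported above.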

Cost summary
One-time costs:
- Amazon Bedrock model training cost: $0.001 per 1,000 tokens × number of epochs
  - For 2,000 examples of approximately 800 tokens each, trained for 5 epochs: $8.00
- SageMaker AI model training cost: We used the ml.g5.48xlarge instance, which costs $16.288/hour
  - Training lasted 4 hours with a 20,000-line dataset: $65.15
Ongoing costs:
- Storage: $1.95 per month per custom model
- On-demand inference: Same per-token pricing as base Nova Micro
  - Input tokens: $0.000035 per 1,000 tokens (Amazon Nova Micro)
  - Output tokens: $0.00014 per 1,000 tokens (Amazon Nova Micro)
Example calculation for production workload:
For 22,000 queries per month (100 users × 10 queries/day × 22 business days):
- Average 800 input tokens + 60 output tokens per query
- Input cost: (22,000 × 800 / 1,000) × $0.000035 = $0.616
- Output cost: (22,000 × 60 / 1,000) × $0.00014 = $0.185
- Total monthly inference cost: $0.80 (excluding the $1.95 per month storage charge)
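The arithmetic above can be reproduced with a short script, which also makes it easy to substitute your own traffic assumptions. The prices are the ones quoted in this post and may change over time; the function names are illustrative.

```python
# Per-1,000-token prices quoted in this post for Amazon Nova Micro
INPUT_PRICE_PER_1K = 0.000035
OUTPUT_PRICE_PER_1K = 0.00014


def monthly_inference_cost(queries, input_tokens, output_tokens):
    """Return (input_cost, output_cost, total_cost) in USD for one month of traffic."""
    input_cost = queries * input_tokens / 1000 * INPUT_PRICE_PER_1K
    output_cost = queries * output_tokens / 1000 * OUTPUT_PRICE_PER_1K
    return input_cost, output_cost, input_cost + output_cost


def bedrock_training_cost(examples, tokens_per_example, epochs, price_per_1k=0.001):
    """One-time Amazon Bedrock fine-tuning cost: tokens processed across all epochs."""
    return examples * tokens_per_example / 1000 * price_per_1k * epochs


def sagemaker_training_cost(hours, hourly_rate=16.288):
    """One-time SageMaker AI training cost for a single ml.g5.48xlarge instance."""
    return hours * hourly_rate


if __name__ == "__main__":
    queries = 100 * 10 * 22  # users x queries per day x business days
    input_cost, output_cost, total = monthly_inference_cost(queries, 800, 60)
    print(f"Monthly inference: ${total:.2f} (input ${input_cost:.3f}, output ${output_cost:.4f})")
    print(f"Bedrock training: ${bedrock_training_cost(2000, 800, 5):.2f}")
    print(f"SageMaker AI training: ${sagemaker_training_cost(4):.2f}")
```

Because inference cost scales linearly with query volume, doubling the traffic in this example roughly doubles the monthly inference cost, while the storage charge stays fixed.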
This analysis indicates that for custom dialect text-to-SQL use cases, fine-tuning a Nova model using PEFT LoRA on Amazon Bedrock is significantly more cost-effective than self-hosting custom models on persistent infrastructure. Self-hosted approaches may suit use cases requiring maximum control over infrastructure, security configurations, or integration requirements, but the Amazon Bedrock on-demand cost model offers significant cost savings for most production text-to-SQL workloads.
Conclusion
These implementation options demonstrate how Amazon Nova fine-tuning can be tailored to organizational needs and technical requirements. We explored two distinct approaches that serve different audiences and use cases. Whether you choose the managed simplicity of Amazon Bedrock or more control through SageMaker AI training, the serverless deployment model and on-demand pricing mean that you only pay for what you use, while removing infrastructure management.
The Amazon Bedrock model customization approach provides a streamlined, managed solution that eliminates infrastructure complexity. Data scientists can focus on data preparation and model evaluation without managing training infrastructure, making it ideal for rapid experimentation and development.
The SageMaker AI training approach offers increased control over every aspect of the fine-tuning process. Machine learning (ML) engineers gain granular control over training parameters, infrastructure selection, and integration with existing MLOps workflows, which enables optimization for performance, cost, and operational requirements. For example, you can adjust batch sizes and instance types to optimize training speed, or modify learning rates and LoRA parameters to balance model quality with training time based on your specific operational needs.
Choose Amazon Bedrock model customization when: You need rapid iteration, have limited ML infrastructure expertise, or want to minimize operational overhead while still achieving custom model performance.
Choose SageMaker AI training when: You require fine-grained parameter control, have specific infrastructure or compliance requirements, need integration with existing MLOps pipelines, or want to optimize every aspect of the training process.
Get started
Ready to build your own cost-effective text-to-SQL solution? Access our complete implementations:
Both approaches use the same cost-efficient deployment model, so you can choose based on your team’s expertise and requirements rather than cost constraints.