Monday, June 8, 2026

The AWS Deep Learning AMI and AWS Deep Learning Containers now support the SOCI snapshotter and SOCI index. Seekable OCI (SOCI) is a technology that enables efficient container image management through selective file downloads. It uses a layer-based indexing system to map the locations of files inside a container image, so that a container can start with only the files it needs loaded (lazy loading). This approach is particularly valuable for organizations managing large container images in cloud environments, because it reduces network bandwidth usage and speeds up container startup.

This post explains how to use SOCI with publicly available Deep Learning AMIs and containers, when to use the different SOCI modes the tool provides, and how to adopt it quickly and efficiently with your existing workloads.

Background

As organizations deploy artificial intelligence (AI) and machine learning (ML) workloads at scale, container startup time has become a bottleneck in production environments. Whether you are spinning up a training job, serving an inference endpoint, or autoscaling a GPU cluster, the time spent downloading multi-gigabyte container images has a direct impact on cost, user experience, and operational efficiency. Traditional container deployment approaches require teams to download the entire image before starting a workload. This process can take several minutes, even for images that are commonly used in production environments. During development, a few minutes of waiting is barely noticeable. In a production environment, the same delay adds up quickly.

Organizations deploying deep learning infrastructure at scale often face significant challenges, including:

  • Extended cold start times. Pulling a standard 15-20 GB Docker image can take 4-6 minutes per instance, delaying training jobs and inference endpoints during scaling events.
  • Wasted compute resources. GPU instances sit idle during image pulls, spending expensive compute time while waiting for containers to finish initialization.
  • Scaling bottlenecks. When autoscaling is triggered by a spike in demand, slow container startup prevents a rapid response, resulting in performance degradation and dropped requests.
  • Bandwidth constraints. Large deployments that pull many images simultaneously can saturate network bandwidth and cause cascading delays throughout the infrastructure.
  • Developer productivity. Data scientists and ML engineers waste valuable time waiting for containers to start during iterative development and experimentation cycles.

Container pull mechanisms

When pulling containers for your workloads, the AWS Deep Learning AMI (DLAMI) and Deep Learning Containers provide three options: standard Docker pull, SOCI parallel pull, and SOCI lazy loading with a SOCI index. Think of these as a sliding scale of trade-offs. Docker pulls are sequential and slow. SOCI parallel pull reduces startup time by downloading in chunks, at the expense of compute resources. SOCI lazy loading starts containers almost immediately, but requires fetching files on demand. You can choose the right mechanism for your workload using the following guidance.

  • The choice between lazy loading mode and parallel pull mode depends on your image, instance specifications, and storage configuration. Lazy loading requires the image to have a SOCI index. Without one, the system falls back to a standard pull.
  • Lower-spec instances should use lazy loading to conserve resources, while higher-spec instances with many vCPUs and high network bandwidth benefit from parallel pull mode. Storage performance also varies: EBS volumes are limited by provisioned IOPS and volume type, which can cause bottlenecks during unpacking. NVMe instance store, on the other hand, provides maximum I/O performance at the expense of data persistence across instance stop/start cycles.
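
The guidance above can be sketched as a small decision helper. This is only an illustration: the function name, the returned labels, and the 16-vCPU threshold are assumptions made for this example, not AWS recommendations, and a real decision should also weigh network bandwidth and storage type.

```shell
#!/bin/bash
# Hypothetical helper: suggest a pull mechanism from the guidance above.
# The 16-vCPU threshold is an illustrative assumption, not an AWS recommendation.
choose_pull_mode() {
    local has_soci_index="$1"   # "yes" if the image has a SOCI index in the registry
    local vcpus="$2"            # number of vCPUs on the instance
    local min_vcpus_for_parallel=16

    if [ "$vcpus" -ge "$min_vcpus_for_parallel" ]; then
        # Higher-spec instances can afford the extra concurrency.
        echo "parallel-pull"
    elif [ "$has_soci_index" = "yes" ]; then
        # Lower-spec instances conserve resources by loading on demand.
        echo "lazy-load"
    else
        # Without a SOCI index, lazy loading falls back to a standard pull anyway.
        echo "standard-pull"
    fi
}
```

For example, `choose_pull_mode yes "$(nproc)"` would report a suggestion for the current instance, assuming the image is known to have a SOCI index.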

The following examples demonstrate the various mechanisms using the vLLM Deep Learning Container.

Deep Learning Container pull mechanisms

Solution architecture

The following diagram shows the architecture for using SOCI with DLAMI and Deep Learning Containers.

Solution architecture demonstrating SOCI snapshot integration with DLAMI and deep learning containers on Amazon EC2

Comparison of container startup time with the SOCI snapshotter

The following benchmarks compare a standard Docker pull with the SOCI snapshotter in both lazy loading and parallel pull modes.

Lazy loading mode

In lazy loading mode, the container starts immediately by fetching only the required data on demand, and the remaining layers are loaded in the background as needed.

Prerequisites

SOCI index required

Important: Lazy loading mode requires a SOCI index for the container image to be stored in the registry. Without a SOCI index, the snapshotter falls back to standard pull behavior and there is no performance improvement. AWS Deep Learning Containers (DLCs) with the -soci tag suffix come with pre-built SOCI indexes pushed to the registry and are lazy-loadable out of the box. For custom images, you must create and push a SOCI index.
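
For custom images, the steps might look roughly like the sketch below. It is a dry run that only prints commands for review. It assumes the `soci` CLI's `create` and `push` subcommands as described in the SOCI snapshotter getting-started guide; newer releases may prefer a different workflow (for example `soci convert`), so check `soci --help` on your installed version. The function name is hypothetical, and registry authentication (for example the `--user` option of `soci push`) is left out.

```shell
#!/bin/bash
# Sketch: print the commands that would build and push a SOCI index for a custom image.
# Nothing is executed; the commands are echoed so they can be reviewed first.
# Assumption: the installed soci CLI provides "create" and "push" subcommands.
print_soci_index_commands() {
    local image="$1"   # full image reference of your custom image
    echo "sudo nerdctl pull ${image}"   # the image must be in the local containerd store
    echo "sudo soci create ${image}"    # build the SOCI index for the image
    echo "sudo soci push ${image}"      # push the index to the image's registry
}
```

Calling `print_soci_index_commands <your-image-reference>` lists the three steps in order, which you can then run manually once you have confirmed they match your SOCI version.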

Environment

  • Instance type: g5.2xlarge
  • EBS: 500 GiB, 3000 IOPS, 125 MB/s throughput
  • AMI: Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 24.04) 20260413 (ami-06abbbf2049359343)
  • Docker image: public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci
  • Image size: 9.72 GB (compressed), 32.7 GB (disk usage)
  • community: Co., Ltd

Start a container with Docker (non-SOCI)

Start the inference server directly using Docker. Because the image does not exist locally, Docker pulls and extracts the entire image before starting the container.

Total time: 6 minutes 59.099 seconds.

#!/bin/bash
time docker run \
    --gpus all \
    -d \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN" \
    -p 8000:8000 \
    --ipc=host \
    public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci \
    --model mistralai/Mistral-7B-v0.1
# Output
Unable to find image 'public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci' locally
0.19.0-gpu-py312-ec2-soci: Pulling from deep-learning-containers/vllm
340d44d2921c: Pull complete
....2001a2421bf1: Pull complete
Digest: sha256:a6344c96a33ef98a32a27f89b41b8c0529d4fbbba248eb57f811725d415f68fc
Status: Downloaded newer image for public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci
e12d969eb71517d9a6a23b9b11cfa22ddda26a95f6a0f0d8df00cd5c4fdfe912

real    6m59.099s
user    0m0.391s
sys     0m0.452s

Start a container with the SOCI snapshotter (lazy loading)

Start the inference container using nerdctl and the SOCI snapshotter. Although the image does not exist locally, a SOCI-indexed image lets nerdctl start the container with only the index and the required layers, while the remaining layers are loaded lazily. Total time: 21.125 seconds.

#!/bin/bash
time sudo nerdctl run \
    --snapshotter soci \
    --gpus all \
    -d \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN" \
    -p 8000:8000 \
    --ipc=host \
    public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci \
    --model mistralai/Mistral-7B-v0.1
# Output
public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci:           resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:a6344c96a33ef98a32a27f89b41b8c0529d4fbbba248eb57f811725d415f68fc:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:d91ad3b46204eace6de2fb27c46d9600337fa9c124b4c82fe0f335d391017daa: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:886ed36d57c44081a74a0ab052f57366d96ab2c0fe39bb3e2f8a46cc20db8ec2:   done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 10.5s                                                                    total:  48.1 K (4.6 KiB/s)
189307b7899438415f3df4288b3fbb26bcc4cd43678e88ec3b062bc6330e3e3b

real    0m21.125s
user    0m0.004s
sys     0m0.011s

Overview of lazy loading

With the SOCI snapshotter in lazy loading mode, the container started in 21.125 seconds, compared to 6 minutes 59.099 seconds with standard Docker. This improvement is achieved because SOCI pulls only the layers needed to start the container, and the remaining layers are loaded on demand.
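
As a quick check of the ratio between the two timings, the following sketch converts the `real` values reported by `time` into seconds and divides them. The helper names are made up for this example, and it relies only on `awk` for the floating-point arithmetic.

```shell
#!/bin/bash
# Convert a "XmY.YYYs" duration, as printed by the shell's time builtin, to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of the baseline time to the improved time, rounded to one decimal place.
speedup() {
    awk -v base="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", base / fast }'
}

docker_s="$(to_seconds 6m59.099s)"   # standard Docker run from the benchmark above
soci_s="$(to_seconds 0m21.125s)"     # SOCI lazy loading run from the benchmark above
echo "speedup: $(speedup "$docker_s" "$soci_s")x"
```

With the timings reported above, this works out to a speedup of about 19.8x, which is where the rounded 20x figure comes from. Keep in mind that the lazy-loaded container still fetches data on demand after it starts, so this compares time to container start, not time to a fully local image.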

Parallel pull mode

Whereas lazy loading mode starts the container immediately by fetching only the required data on demand, parallel pull mode downloads the entire image before launching, but with higher concurrency than a standard Docker pull. This mode is ideal when you want the full image available at startup or when running I/O-intensive workloads.

Environment

  • Instance type: g5.4xlarge
  • EBS: 500 GiB gp3, 16000 IOPS, 1000 MB/s throughput
  • AMI: Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 24.04) 20260413 (ami-06abbbf2049359343)
  • Docker image: 763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker
  • Image size: 19.32 GB (compressed), 60.4 GB (disk usage)
  • community: Co., Ltd

Note: We use a private ECR image for this benchmark because public ECR is fronted by Amazon CloudFront, which limits network bandwidth and affects parallel mode performance. Private ECR is served directly from Amazon Simple Storage Service (Amazon S3) and provides higher throughput.

Enable parallel pull mode

The Deep Learning AMI’s SOCI snapshotter is set to lazy loading mode by default. To enable parallel pull mode, modify the configuration file located at /etc/soci-snapshotter-grpc/config.toml:

# Parallel Pull Mode - significantly improves image pull times for large AI/ML images
# These are conservative defaults recommended by AWS for ECR
[pull_modes.parallel_pull_unpack]
enable = true # false (default): lazy loading / true: parallel mode
max_concurrent_downloads = -1 # unlimited global cap across all images
max_concurrent_downloads_per_image = 20 # per-image download connections
concurrent_download_chunk_size = "16mb"
max_concurrent_unpacks = -1 # unlimited global cap across all images
max_concurrent_unpacks_per_image = 10 # per-image parallel unpack threads
discard_unpacked_layers = true

Restart the service to apply the configuration.

sudo systemctl restart soci-snapshotter.service
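
Before restarting, it can help to confirm which mode the configuration file actually selects. The sketch below is a simple line-based check rather than a real TOML parser. It assumes the switch is a key named `enable` inside the `[pull_modes.parallel_pull_unpack]` table, and the function name and the labels it prints are hypothetical.

```shell
#!/bin/bash
# Sketch: report which pull mode a SOCI snapshotter config file selects.
# Assumption: the switch is an "enable" key inside [pull_modes.parallel_pull_unpack].
# This is a line-based check, not a full TOML parser.
soci_pull_mode() {
    local config_file="$1"
    awk '
        # Track whether the current table header is the parallel pull section.
        /^[[:space:]]*\[/ { in_section = ($0 ~ /\[pull_modes\.parallel_pull_unpack\]/) }
        # Inside that section, look for "enable = true".
        in_section && /^[[:space:]]*enable[[:space:]]*=[[:space:]]*true/ { found = 1 }
        END { print (found ? "parallel-pull" : "lazy-loading") }
    ' "$config_file"
}
```

On the DLAMI, `soci_pull_mode /etc/soci-snapshotter-grpc/config.toml` would be the natural invocation, given the configuration path mentioned above.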

Tip: You can tune max_concurrent_downloads_per_image and max_concurrent_unpacks_per_image based on instance type and network bandwidth. For detailed tuning guidance, see Introducing Seekable OCI Parallel Pull Mode for Amazon EKS.

Verify that parallel mode is active

Verify that parallel mode is enabled by monitoring the SOCI snapshotter logs while pulling the image.

journalctl -u soci-snapshotter -f

Look for log entries that indicate parallel pull/unpack.

Apr 16 23:59:08 ip-172-31-86-91 soci-snapshotter-grpc[3108]:
  {"layerDigest":"sha256:e87500e698966458d9dfc34df84602985c9821f39666619792fe6282aa6df5d4",
   "level":"info",
   "msg":"preparing snapshot with parallel pull/unpack",
   "time":"2026-04-16T23:59:08.654819383Z"}
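
If you would rather check after the fact than watch the log live, a minimal sketch is to count matching lines. The function name is hypothetical. It reads log text on standard input, so it works with `journalctl` output or a saved log file.

```shell
#!/bin/bash
# Sketch: count snapshotter log lines that mention parallel pull/unpack.
# Reads log text on stdin; prints the number of matching lines (0 if none).
count_parallel_pull_entries() {
    grep -c 'parallel pull/unpack'
}
```

For example, `journalctl -u soci-snapshotter --no-pager --since "10 minutes ago" | count_parallel_pull_entries` prints a non-zero count if recent pulls used parallel mode. Each layer typically logs its own entry, so the count roughly tracks the number of layers prepared.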

Pulling images with Docker (non-SOCI)

A standard Docker pull downloads and extracts layers with limited concurrency.

Total time: 4 minutes 44.163 seconds

time docker pull \
  763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker

Digest: sha256:fd0cf60bbb34a5d30f22595215a633e5d4a7260fc0868aabe3f04b1174b7365d
Status: Downloaded newer image for
  763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker
763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker

real    4m44.163s
user    0m0.339s
sys     0m0.423s

Pulling images in SOCI parallel mode

Using nerdctl in SOCI parallel pull mode increases concurrency for both download and decompression operations.

Total time: 2 minutes 12.846 seconds

time sudo nerdctl pull --snapshotter soci \
  763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker

763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker:
  resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:fd0cf60bbb34a5d30f22595215a633e5d4a7260fc0868aabe3f04b1174b7365d:
  done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:5e6a53b7478b0631dd3c4222ab6619dae3a3dd32a565921f10b0b03fdc316d46:
  done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 132.8s    total:  89.3 K (688.0 B/s)

real    2m12.846s
user    0m0.018s
sys     0m0.075s

Parallel pull overview

SOCI parallel pull mode reduces image pull time from 4 minutes 44 seconds to 2 minutes 12 seconds, roughly a 2.1x improvement in pull performance.
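
To put the two pull times in terms of throughput, the sketch below divides the compressed image size by each elapsed time. It assumes decimal units (1 GB = 1000 MB) and treats the whole pull, download plus unpack, as one interval, so the result is an end-to-end rate rather than raw network bandwidth. The function name is hypothetical.

```shell
#!/bin/bash
# Effective pull rate in MB/s: compressed image size (GB, decimal) over elapsed seconds.
# This is an end-to-end figure that includes unpacking, not raw network throughput.
pull_rate_mb_s() {
    awk -v gb="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", gb * 1000 / secs }'
}

image_gb=19.32   # compressed image size from the environment list above
echo "docker pull:        $(pull_rate_mb_s "$image_gb" 284.163) MB/s"
echo "SOCI parallel pull: $(pull_rate_mb_s "$image_gb" 132.846) MB/s"
echo "speedup:            $(awk 'BEGIN { printf "%.2f", 284.163 / 132.846 }')x"
```

With the timings reported above, this gives roughly 68 MB/s for the standard pull and 145 MB/s for the parallel pull, a ratio of about 2.14.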

Conclusion

The SOCI snapshotter improves both container startup and image pull operations.

  • Lazy loading mode: a roughly 20x improvement in container startup time (from more than 6 minutes down to 21 seconds)
  • Parallel pull mode: a roughly 2.1x improvement in image pull time (from 4 minutes 44 seconds to 2 minutes 12 seconds)

Choose lazy loading mode if you need the fastest container startup, and choose parallel pull mode if you need the complete image available before your workload starts.

Clean up

If you launched an EC2 instance to test the SOCI snapshotter, terminate the instance to avoid incurring ongoing charges. Delete any container images that you pushed to Amazon Elastic Container Registry (Amazon ECR) during your tests, and delete any SOCI indexes that are no longer needed.

Get started with SOCI

DLAMI and Deep Learning Containers with the SOCI snapshotter and SOCI index are now generally available. To learn more about publicly available DLAMIs, check out SOCI Index DLAMI and select an image that supports SOCI. To learn more about the images that support the SOCI index, see the Deep Learning Containers repository.

For detailed configuration guidance and best practices, see the SOCI documentation and the Deep Learning Containers SOCI documentation.

About the authors

Ohad Katz

Ohad Katz is a former systems development engineer on the AWS Deep Learning AMI (DLAMI) team.

Yadan Wei

Yadan Wei is a software development engineer on the AWS Deep Learning Containers (DLC) team, where he builds and maintains production-ready Docker container images that enable customers to train and deploy deep learning models on AWS services such as SageMaker, EC2, ECS, and EKS.

Nick Song

Nick Song is a software development engineer at AWS, working on Deep Learning AMIs that provide customers with optimized deep learning infrastructure.
