AWS Deep Learning AMIs (DLAMI) and AWS Deep Learning Containers now support the SOCI snapshotter and SOCI indexes. Seekable OCI (SOCI) is a technology that enables efficient container image management through selective file downloads. It uses a layer-based indexing system to map the locations of files within a container image, so that a container can start with only the files it needs loaded (lazy loading). This approach is particularly valuable for organizations managing large container images in cloud environments, because it reduces network bandwidth usage and speeds up container startup.
This post explains how to use SOCI with the publicly available Deep Learning AMIs and containers, when to use each of the SOCI modes the tool provides, and how to adopt it quickly and efficiently with your existing workloads.
Background
As organizations deploy artificial intelligence (AI) and machine learning (ML) workloads at scale, container startup time has become a bottleneck in production environments. Whether you're spinning up a training job, serving an inference endpoint, or autoscaling a GPU cluster, the time spent downloading multi-gigabyte container images has a direct impact on cost, user experience, and operational efficiency. Traditional container deployment requires downloading the entire image before a workload can start. This process can take several minutes, even for images that are commonly used in production. During development, a few minutes of waiting is barely noticeable. In a production environment, the same delay adds up quickly.
Organizations deploying deep learning infrastructure at scale often face significant challenges, including:
- Long cold start times. Pulling a standard 15-20 GB Docker image can take 4-6 minutes per instance, delaying training jobs and inference endpoints during scaling events.
- Wasted compute resources. GPU instances sit idle during image pulls, consuming expensive compute time while they wait for containers to finish initializing.
- Scaling bottlenecks. When a spike in demand triggers autoscaling, slow container startup prevents a rapid response, resulting in degraded performance and dropped requests.
- Bandwidth constraints. Large deployments that pull many images at the same time can saturate network bandwidth and cause cascading delays throughout the infrastructure.
- Developer productivity. Data scientists and ML engineers lose valuable time waiting for containers to start during iterative development and experimentation cycles.
Container pull mechanisms
When pulling containers for your workloads, the AWS Deep Learning AMI (DLAMI) and Deep Learning Containers provide three options: standard Docker pull, SOCI parallel pull, and SOCI lazy loading with a SOCI index. Think of these as a sliding scale of trade-offs. Standard Docker pulls are largely sequential and take the longest. SOCI parallel pull shortens startup time by splitting downloads into chunks that are fetched concurrently, at the expense of compute resources. SOCI lazy loading lets you start containers almost immediately, but files must then be fetched on demand. You can choose the right mechanism for your workload using the following guidance.
- The choice between lazy loading mode and parallel pull mode depends on your image, instance specifications, and storage configuration. Lazy loading requires the image to have a SOCI index. Without one, the system reverts to a standard pull.
- Lower-spec instances should use lazy loading to conserve resources, while higher-spec instances with many vCPUs and high network bandwidth benefit from parallel pull mode. Storage performance also matters. EBS volumes are limited by their provisioned IOPS and volume type, which can cause bottlenecks during unpacking. NVMe instance store, on the other hand, provides maximum I/O performance at the expense of data persistence across instance stop/start cycles.
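The guidance above can be sketched as a small decision helper. This is only an illustration of the trade-off, not AWS guidance: the `Host` type, the function name, and the cut-off values (16 vCPUs, 10 Gbps) are hypothetical and should be replaced with thresholds measured on your own workloads.

```python
from dataclasses import dataclass


@dataclass
class Host:
    vcpus: int
    network_gbps: float


def choose_pull_mode(host, image_has_soci_index,
                     min_vcpus=16, min_network_gbps=10.0):
    """Return 'parallel', 'lazy', or 'standard' for a host and image.

    min_vcpus and min_network_gbps are illustrative cut-offs (assumptions),
    not values published for the SOCI snapshotter.
    """
    high_spec = (host.vcpus >= min_vcpus
                 and host.network_gbps >= min_network_gbps)
    if high_spec:
        # Plenty of CPU and bandwidth: spend them on a concurrent full pull.
        return "parallel"
    if image_has_soci_index:
        # Lower-spec host and an indexed image: fetch files on demand.
        return "lazy"
    # Without a SOCI index, lazy loading reverts to a standard pull.
    return "standard"


print(choose_pull_mode(Host(vcpus=8, network_gbps=10.0), True))
print(choose_pull_mode(Host(vcpus=16, network_gbps=25.0), True))
```

Storage is deliberately left out of this sketch. If unpacking is limited by EBS IOPS or throughput, a high-spec instance may see less benefit from parallel pull than the helper suggests.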
The following examples demonstrate these mechanisms using the vLLM Deep Learning Container.
Deep Learning Container pull mechanisms
Solution architecture
The following diagram shows the architecture for using SOCI with the DLAMI and Deep Learning Containers.

Comparison of container startup time with the SOCI snapshotter
The following benchmarks compare a standard Docker pull with the SOCI snapshotter in both lazy loading and parallel pull modes.
Lazy loading mode
In lazy loading mode, the container starts immediately by fetching only the required data on demand, and the remaining layers are loaded in the background as needed.
Prerequisites
SOCI index required
Important: Lazy loading mode requires the container image's SOCI index to be stored in the registry. Without a SOCI index, the snapshotter reverts to standard pull behavior and there is no performance improvement. AWS Deep Learning Containers (DLCs) with the -soci tag suffix come with prebuilt SOCI indexes pushed to the registry and are lazy-loadable out of the box. For custom images, you must create and push a SOCI index yourself.
Environment
- Instance type: g5.2xlarge
- EBS: 500 GiB, 3000 IOPS, 125 MB/s throughput
- AMI: Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 24.04) 20260413 (ami-06abbbf2049359343)
- Docker image: public.ecr.aws/deep-learning-containers/vllm:0.19.0-gpu-py312-ec2-soci
- Image size: 9.72 GB (compressed), 32.7 GB (disk usage)
- Network: Co., Ltd
Start a container with Docker (non-SOCI)
Start the inference server directly using Docker. Because the image does not exist locally, Docker pulls and extracts the entire image before starting the container.
Total time: 6 minutes 59.099 seconds.
Start a container with the SOCI snapshotter (lazy loading)
Start the inference container using nerdctl and the SOCI snapshotter. Although the image does not exist locally, a SOCI-indexed image lets nerdctl start the container with only the index and the required layers, and load the remaining layers lazily.
Total time: 21.125 seconds.
Lazy loading summary
With the SOCI snapshotter in lazy loading mode, the container started in 21.125 seconds, compared to 6 minutes 59.099 seconds with standard Docker. This improvement is possible because SOCI pulls only the layers needed to start the container and loads the remaining layers on demand.
Parallel pull mode
Whereas lazy loading mode starts the container immediately by fetching only the required data on demand, parallel pull mode downloads the entire image before launching, but with higher concurrency than a standard Docker pull. This mode is ideal when you want the full image available at startup or when running I/O-intensive workloads.
Environment
- Instance type: g5.4xlarge
- EBS: 500 GiB gp3, 16000 IOPS, 1000 MB/s throughput
- AMI: Deep Learning Base OSS Nvidia Driver GPU AMI (Ubuntu 24.04) 20260413 (ami-06abbbf2049359343)
- Docker image: 763104351884.dkr.ecr.us-east-1.amazonaws.com/sglang:0.5.10-gpu-py312-cu129-ubuntu24.04-sagemaker
- Image size: 19.32 GB (compressed), 60.4 GB (disk usage)
- Network: Co., Ltd
Note: We use a private ECR image for this benchmark because public ECR is fronted by Amazon CloudFront, which limits network bandwidth and affects parallel mode performance. Private ECR is served directly from Amazon Simple Storage Service (Amazon S3) and provides higher throughput.
Enable parallel pull mode
The Deep Learning AMI's SOCI snapshotter is set to lazy loading mode by default. To enable parallel pull mode, modify the configuration file located at /etc/soci-snapshotter-grpc/config.toml.
Restart the service to apply the configuration.
Tip: You can tune max_concurrent_downloads_per_image and max_concurrent_unpacks_per_image based on instance type and network bandwidth. For detailed tuning guidance, see Introducing Seekable OCI Parallel Pull Mode for Amazon EKS.
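As a rough sketch of what such a configuration might contain, the Python helper below renders a TOML fragment for the snapshotter's config.toml. The two per-image keys are the tunables named in the tip above. The table name `pull_modes.parallel_pull_unpack`, the `enable` key, and the value 10 are assumptions made for illustration, so check them against the SOCI snapshotter documentation for your installed version before applying anything.

```python
def render_parallel_pull_config(downloads_per_image, unpacks_per_image):
    """Render a TOML fragment that turns on SOCI parallel pull mode.

    The table name and the `enable` key are assumptions to verify against
    the SOCI snapshotter documentation; the two per-image keys are the
    tunables mentioned in this post.
    """
    if downloads_per_image < 1 or unpacks_per_image < 1:
        raise ValueError("concurrency limits must be positive integers")
    return (
        "[pull_modes.parallel_pull_unpack]\n"
        "enable = true\n"
        f"max_concurrent_downloads_per_image = {downloads_per_image}\n"
        f"max_concurrent_unpacks_per_image = {unpacks_per_image}\n"
    )


# Illustrative values only; tune them for your instance type and bandwidth.
print(render_parallel_pull_config(10, 10))
```

Generating the fragment from code rather than editing the file by hand makes it easier to keep different values per instance type, for example in a launch script.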
Verify that parallel mode is active
Verify that parallel mode is enabled by monitoring the SOCI snapshotter logs while pulling the image.
Look for log entries that indicate parallel pull/unpack.
Pull the image with Docker (non-SOCI)
A standard Docker pull downloads and extracts layers with limited concurrency.
Total time: 4 minutes 44.163 seconds
Pull the image in SOCI parallel mode
Using nerdctl in SOCI parallel pull mode increases concurrency for both download and decompression operations.
Total time: 2 minutes 12.846 seconds
Parallel pull summary
SOCI parallel pull mode reduced the image pull time from 4 minutes 44 seconds to 2 minutes 12 seconds, roughly a 2.1x improvement in pull performance.
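The improvement factors follow directly from the reported timings. The short Python sketch below recomputes them; the `6m59.099s` duration format (the style printed by the shell's `time` builtin) and the helper names are assumptions chosen for the example.

```python
import re


def parse_duration(text):
    """Convert a duration such as '6m59.099s' or '21.125s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def speedup(baseline, improved):
    """Return how many times faster `improved` is than `baseline`."""
    return parse_duration(baseline) / parse_duration(improved)


# Timings reported in this post.
lazy = speedup("6m59.099s", "21.125s")        # Docker vs SOCI lazy loading
parallel = speedup("4m44.163s", "2m12.846s")  # Docker vs SOCI parallel pull
print(f"lazy loading:  {lazy:.1f}x")
print(f"parallel pull: {parallel:.1f}x")
```

With these inputs the script reports about 19.8x for lazy loading and about 2.1x for parallel pull. The two figures are not directly comparable, because the benchmarks used different instance types, images, and registries.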
Conclusion
The SOCI snapshotter improves both container startup and image pull operations.
- Lazy loading mode: roughly 20x faster container startup (from more than 6 minutes down to 21 seconds)
- Parallel pull mode: roughly 2.1x faster image pull (from 4 minutes 44 seconds to 2 minutes 12 seconds)
Choose lazy loading mode if you need the fastest container startup, and choose parallel pull mode if you need the complete image available before your workload starts.
Cleanup
If you launched an EC2 instance to test the SOCI snapshotter, terminate the instance to avoid incurring ongoing charges. Delete any container images that you pushed to Amazon Elastic Container Registry (Amazon ECR) during your tests, and delete any SOCI indexes that are no longer needed.
Get started with SOCI
DLAMIs and Deep Learning Containers that include the SOCI snapshotter and SOCI indexes are now generally available. To learn more about the publicly available DLAMIs and Deep Learning Containers, see SOCI Index DLAMI and select an image that supports SOCI. For more information about images that support the SOCI index, see the Deep Learning Containers repository.
For detailed configuration guidance and best practices, see the SOCI documentation and the Deep Learning Containers SOCI documentation.