Headquartered in São Paulo, Brazil, iFood is a national private company and the leader in food-tech in Latin America, processing tens of millions of orders monthly. iFood has stood out for its strategy of incorporating cutting-edge technology into its operations. With the support of AWS, iFood has developed a robust machine learning (ML) inference infrastructure, using services such as Amazon SageMaker to efficiently create and deploy ML models. This partnership has allowed iFood not only to optimize its internal processes, but also to offer innovative solutions to its delivery partners and restaurants.
iFood's ML platform comprises a set of tools, processes, and workflows developed with the following objectives:
- Accelerate the development and training of AI/ML models, making them more reliable and reproducible
- Make sure that deploying these models to production is reliable, scalable, and traceable
- Facilitate the testing, monitoring, and evaluation of models in production in a transparent, accessible, and standardized manner
To achieve these objectives, iFood uses SageMaker, which simplifies the training and deployment of models. Additionally, the integration of SageMaker features into iFood's infrastructure automates critical processes, such as generating training datasets, training models, deploying models to production, and continuously monitoring their performance.
In this post, we show how iFood uses SageMaker to revolutionize its ML operations. By harnessing the power of SageMaker, iFood streamlines the entire ML lifecycle, from model training to deployment. This integration not only simplifies complex processes but also automates critical tasks.
AI inference at iFood
iFood has harnessed the power of a robust AI/ML platform to elevate the customer experience across its diverse touchpoints. Using cutting-edge AI/ML capabilities, the company has developed a suite of transformative solutions to address a multitude of customer use cases:
- Personalized recommendations – At iFood, AI-powered recommendation models analyze a customer's past order history, preferences, and contextual factors to suggest the most relevant restaurants and menu items. This personalized approach makes sure customers discover new cuisines and dishes tailored to their tastes, enhancing satisfaction and driving increased order volumes.
- Intelligent order tracking – iFood's AI systems monitor orders in real time, predicting delivery times with a high degree of accuracy. By understanding factors like traffic patterns, restaurant preparation times, and courier locations, the AI can proactively notify customers of their order status and expected arrival, reducing uncertainty and anxiety during the delivery process.
- Automated customer service – To handle the thousands of daily customer inquiries, iFood has developed an AI-powered chatbot that can quickly resolve common issues and questions. This intelligent virtual agent understands natural language, accesses relevant data, and provides personalized responses, delivering fast and consistent support without overburdening the human customer service team.
- Grocery shopping assistance – Integrating advanced language models, iFood's app lets customers simply speak or type their recipe needs or grocery list, and the AI automatically generates a detailed shopping list. This voice-enabled grocery planning feature saves customers time and effort, enhancing their overall shopping experience.
Through these diverse AI-powered initiatives, iFood is able to anticipate customer needs, streamline key processes, and deliver a consistently exceptional experience, further strengthening its position as the leading food-tech platform in Latin America.
Resolution overview
The following diagram illustrates iFood's legacy architecture, which had separate workflows for data science and engineering teams, creating challenges in efficiently deploying accurate, real-time machine learning models into production systems.

Previously, the data science and engineering teams at iFood operated independently. Data scientists would build models using notebooks, adjust weights, and publish them to services. Engineering teams would then struggle to integrate these models into production systems. This disconnect between the two teams made it challenging to deploy accurate real-time ML models.
To overcome this challenge, iFood built an internal ML platform that helped bridge this gap. The platform has streamlined the workflow, providing a seamless experience for creating, training, and delivering models for inference. It offers a centralized environment where data scientists can build, train, and deploy models from an integrated approach that reflects the teams' development workflow. Engineering teams, in turn, can consume these models and integrate them into applications for both online and offline use, enabling a more efficient and streamlined workflow.
By breaking down the barriers between data science and engineering, AWS AI platforms empowered iFood to use the full potential of their data and accelerate the development of AI applications. The automated deployment and scalable inference capabilities provided by SageMaker made sure that models were readily available to power intelligent applications and provide accurate predictions on demand. This centralization of ML services as a product has been a game changer for iFood, allowing them to focus on building high-performing models rather than the intricate details of inference.
One of the core capabilities of iFood's ML platform is the ability to provide the infrastructure to serve predictions. Several use cases are supported by the inference made available through ML Go!, which is responsible for deploying SageMaker pipelines and endpoints. The former are used to schedule offline prediction jobs, and the latter are employed to create model services to be consumed by the application services. The following diagram illustrates iFood's updated architecture, which incorporates an internal ML platform built to streamline workflows between data science and engineering teams, enabling efficient deployment of machine learning models into production systems.

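The two serving paths ML Go! manages (scheduled offline prediction jobs through SageMaker pipelines, and hosted model services through endpoints) can be sketched as a small routing layer. This is our illustrative reconstruction, not iFood's actual code; the class name and naming conventions are assumptions, and the client is injected so the logic can be exercised without AWS (a real caller would pass `boto3.client("sagemaker")`).

```python
# Hypothetical sketch of a platform routing layer in the spirit of ML Go!:
# a single abstraction decides whether a model is served offline (a scheduled
# SageMaker pipeline that runs batch predictions) or online (a hosted endpoint
# consumed by application services).

class MLGoRouter:
    """Routes a model to offline or online serving per use-case SLA."""

    def __init__(self, sagemaker_client):
        # Injected client; in production this would be boto3.client("sagemaker").
        self.client = sagemaker_client

    def deploy(self, model_name, mode):
        if mode == "offline":
            # Offline SLA: kick off the pipeline that runs batch predictions.
            return self.client.start_pipeline_execution(
                PipelineName=f"{model_name}-batch-predictions"
            )
        elif mode == "online":
            # Online SLA: create the hosted endpoint behind the model service.
            return self.client.create_endpoint(
                EndpointName=f"{model_name}-endpoint",
                EndpointConfigName=f"{model_name}-config",
            )
        raise ValueError(f"unknown serving mode: {mode}")
```

Injecting the client keeps the routing decision testable and lets the platform swap infrastructure details without touching callers.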
Integrating model deployment into the service development process was a key initiative to enable data scientists and ML engineers to deploy and maintain those models. The ML platform empowers the building and evolution of ML systems. Several integrations with other important platforms, like the feature platform and the data platform, were also delivered to improve the experience for users as a whole. The process of consuming ML-based decisions was streamlined, but it doesn't end there. iFood's ML platform, ML Go!, is now focusing on new inference capabilities, supported by recent features whose ideation and development the iFood team helped drive. The following diagram illustrates the final architecture of iFood's ML platform, showcasing how model deployment is integrated into the service development process, the platform's connections with the feature and data platforms, and its focus on new inference capabilities.

One of the biggest changes is the creation of a single abstraction for connecting with SageMaker endpoints and jobs, called the ML Go! Gateway, along with the separation of concerns within endpoints through the Inference Components feature, which makes serving faster and more efficient. In this new inference structure, the endpoints are also managed by the ML Go! CI/CD, leaving the pipelines to deal only with model promotions, not the infrastructure itself. This reduces the lead time for changes and the change failure rate of deployments.
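The separation of concerns that inference components provide can be illustrated with the request the SageMaker `CreateInferenceComponent` API expects: each model gets its own compute slice and copy count on a shared endpoint, so a pipeline can promote a model without touching the endpoint itself. The names and resource sizes below are illustrative, not iFood's.

```python
# Build a boto3 create_inference_component request for a shared endpoint.
# Component, endpoint, and model names here are placeholders.

def inference_component_request(component_name, endpoint_name, model_name,
                                accelerators=1, memory_mb=4096, copies=1):
    """Request body for sagemaker.create_inference_component(**request)."""
    return {
        "InferenceComponentName": component_name,
        "EndpointName": endpoint_name,
        "VariantName": "AllTraffic",
        "Specification": {
            "ModelName": model_name,
            # Resource slice reserved for this model on the shared endpoint.
            "ComputeResourceRequirements": {
                "NumberOfAcceleratorDevicesRequired": accelerators,
                "MinMemoryRequiredInMb": memory_mb,
            },
        },
        # How many copies of the model to place behind the endpoint.
        "RuntimeConfig": {"CopyCount": copies},
    }
```

A deployment step would pass this dict to `boto3.client("sagemaker").create_inference_component(**request)`; promoting a new model version means creating or updating a component, never recreating the endpoint.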
Using SageMaker Inference model serving containers
One of the key features of modern machine learning platforms is the standardization of machine learning and AI services. By encapsulating models and dependencies as Docker containers, these platforms ensure consistency and portability across different environments and stages of ML. Using SageMaker, data scientists and developers can use pre-built Docker containers, making it straightforward to deploy and manage ML services. As a project progresses, they can spin up new instances and configure them according to their specific requirements. SageMaker provides Docker containers designed to work seamlessly with the platform, offering a standardized and scalable environment for running ML workloads.
SageMaker provides a set of pre-built containers for popular ML frameworks and algorithms, such as TensorFlow, PyTorch, XGBoost, and many others. These containers are optimized for performance, with all the necessary dependencies and libraries pre-installed, making it straightforward to get started with your ML projects. In addition to the pre-built containers, SageMaker offers the option to bring your own custom containers, which include your specific ML code, dependencies, and libraries. This can be particularly useful if you're using a less common framework or have specific requirements that aren't met by the pre-built containers.
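A custom container only has to honor SageMaker's serving contract: answer `GET /ping` (health check) and `POST /invocations` (inference) on port 8080. The following stdlib-only sketch shows that contract; the placeholder model (a sum of features) stands in for real model code, which would load artifacts from `/opt/ml/model`.

```python
# Minimal sketch of the SageMaker bring-your-own-container serving contract.
# The handler logic below is illustrative, not iFood's code.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Placeholder model: a real handler would run the trained model here.
    return {"prediction": sum(features)}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # SageMaker probes /ping to decide whether the container is healthy.
        self.send_response(200 if self.path == "/ping" else 404)
        self.end_headers()

    def do_POST(self):
        # SageMaker forwards inference requests to /invocations.
        if self.path != "/invocations":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep request logging quiet

def serve(port=8080):
    # Inside the container, SageMaker expects the server on port 8080.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Because the contract is just HTTP, the same image can be exercised locally before it is pushed and attached to a SageMaker model.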
iFood was highly focused on using custom containers for the training and deployment of ML workloads, providing a consistent and reproducible environment for ML experiments and making it straightforward to track and replicate results. The first step in this journey was to standardize the ML custom code, which is the piece of code the data scientists should focus on. Without notebooks, and with BruceML, the way code is created to train and serve models has changed: it is encapsulated from the start as container images. BruceML was responsible for creating the scaffolding required to seamlessly integrate with the SageMaker platform, allowing the teams to take advantage of its various features, such as hyperparameter tuning, model deployment, and monitoring. By standardizing ML services and using containerization, modern platforms democratize ML, enabling iFood to rapidly build, deploy, and scale intelligent applications.
Automating mannequin deployment and ML system retraining
When running ML models in production, it's critical to have a robust and automated process for deploying and recalibrating those models across different use cases. This helps make sure the models remain accurate and performant over time. The team at iFood understood this challenge well: it's not only the model that gets deployed. Instead, they rely on another concept to keep things running well: ML pipelines.
Using Amazon SageMaker Pipelines, they were able to build a CI/CD system for ML that delivers automated retraining and model deployment. They also integrated this entire system with the company's existing CI/CD pipeline, making it efficient while maintaining the good DevOps practices used at iFood. It starts with the ML Go! CI/CD pipeline pushing the latest code artifacts containing the model training and deployment logic. It includes the training process, which uses different containers to implement the entire pipeline. When training is complete, the inference pipeline can be executed to begin the model deployment. It can be an entirely new model, or the promotion of a new version to improve the performance of an existing one. Every model available for deployment is also secured and automatically registered by ML Go! in Amazon SageMaker Model Registry, providing versioning and tracking capabilities.
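The registration step at the end of the training pipeline maps to the SageMaker `CreateModelPackage` API: each trained model version is recorded in a model package group with an approval status that gates promotion. A hedged sketch of the request a pipeline step might build (group name, image URI, and content types are placeholders, not iFood's values):

```python
# Build a boto3 create_model_package request that registers a trained model
# version in SageMaker Model Registry. All identifiers are illustrative.

def model_package_request(group_name, image_uri, model_data_url,
                          approved=False):
    """Request body for sagemaker.create_model_package(**request)."""
    return {
        # All versions of a model live in one model package group.
        "ModelPackageGroupName": group_name,
        # Promotion gate: only approved packages get deployed.
        "ModelApprovalStatus": "Approved" if approved else "PendingManualApproval",
        "InferenceSpecification": {
            "Containers": [{
                "Image": image_uri,              # serving container
                "ModelDataUrl": model_data_url,  # trained artifacts in S3
            }],
            "SupportedContentTypes": ["application/json"],
            "SupportedResponseMIMETypes": ["application/json"],
        },
    }
```

Keeping approval separate from registration is what lets a CI/CD pipeline register every candidate automatically while still controlling which version is promoted to production.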
The final step depends on the intended inference requirements. For batch prediction use cases, the pipeline creates a SageMaker batch transform job to run large-scale predictions. For real-time inference, the pipeline deploys the model to a SageMaker endpoint, carefully selecting the appropriate container variant and instance type to handle the expected production traffic and latency needs. This end-to-end automation has been a game changer for iFood, allowing them to rapidly iterate on their ML models and deploy updates and recalibrations quickly and confidently across their various use cases. SageMaker Pipelines has provided a streamlined way to orchestrate these complex workflows, making sure model operationalization is efficient and reliable.
Running inference in different SLA formats
iFood uses the inference capabilities of SageMaker to power its intelligent applications and deliver accurate predictions to its customers. By integrating the robust inference options available in SageMaker, iFood has been able to seamlessly deploy ML models and make them available for real-time and batch predictions. For iFood's online, real-time prediction use cases, the company uses SageMaker hosted endpoints to deploy its models. These endpoints are integrated into iFood's customer-facing applications, allowing for immediate inference on incoming data from users. SageMaker handles the scaling and management of these endpoints, making sure that iFood's models are readily available to provide accurate predictions and enhance the user experience.
In addition to real-time predictions, iFood also uses SageMaker batch transform to perform large-scale, asynchronous inference on datasets. This is particularly useful for iFood's data preprocessing and batch prediction requirements, such as generating recommendations or insights for its restaurant partners. SageMaker batch transform jobs enable iFood to efficiently process vast amounts of data, further enhancing their data-driven decision-making.
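The batch path maps to the SageMaker `CreateTransformJob` API: point a registered model at an S3 prefix of input records and an output location, and SageMaker provisions the fleet, runs the predictions, and tears it down. A sketch of the request a scheduler might build (bucket paths, job names, and instance sizing are placeholders):

```python
# Build a boto3 create_transform_job request for offline, large-scale
# predictions. All S3 URIs and names are illustrative.

def batch_transform_request(job_name, model_name, input_s3, output_s3,
                            instance_type="ml.m5.xlarge", instance_count=1):
    """Request body for sagemaker.create_transform_job(**request)."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": input_s3,  # prefix holding the input records
                }
            },
            "ContentType": "application/json",
            "SplitType": "Line",  # one JSON record per line
        },
        "TransformOutput": {"S3OutputPath": output_s3},
        # Fleet exists only for the duration of the job.
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }
```

Because the job is fully described by this request, a nightly scheduler only has to stamp a fresh job name and submit it with `boto3.client("sagemaker").create_transform_job(**request)`.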
Building upon the success of standardizing on SageMaker Inference, iFood has been instrumental in partnering with the SageMaker Inference team to build and enhance key AI inference capabilities within the SageMaker platform. Since the early days of ML, iFood has provided the SageMaker Inference team with valuable input and expertise, enabling the introduction of several new features and optimizations:
- Cost and performance optimizations for generative AI inference – iFood helped the SageMaker Inference team develop innovative techniques to optimize the use of accelerators, enabling SageMaker Inference to reduce foundation model (FM) deployment costs by 50% on average and latency by 20% on average with inference components. This breakthrough delivers significant cost savings and performance improvements for customers running generative AI workloads on SageMaker.
- Scaling enhancements for AI inference – iFood's expertise in distributed systems and auto scaling has also helped the SageMaker team develop advanced capabilities to better handle the scaling requirements of generative AI models. These improvements reduce auto scaling times by up to 40% and improve auto scaling detection by six times, making sure that customers can rapidly scale their inference workloads on SageMaker to meet spikes in demand without compromising performance.
- Streamlined generative AI model deployment for inference – Recognizing the need for simplified model deployment, iFood collaborated with AWS to introduce the ability to deploy open source large language models (LLMs) and FMs with just a few clicks. This user-friendly functionality removes the complexity traditionally associated with deploying these advanced models, empowering more customers to harness the power of AI.
- Scale-to-zero for inference endpoints – iFood played a crucial role in collaborating with SageMaker Inference to develop and launch the scale-to-zero feature for SageMaker inference endpoints. This innovative capability allows inference endpoints to automatically shut down when not in use and rapidly spin up on demand when new requests arrive. This feature is particularly beneficial for dev/test environments, low-traffic applications, and inference use cases with varying demand, because it eliminates idle resource costs while maintaining the ability to quickly serve requests when needed. The scale-to-zero functionality represents a major advancement in cost-efficiency for AI inference, making it more accessible and economically viable for a wider range of use cases.
- Packaging AI model inference more efficiently – To further simplify the AI model lifecycle, iFood worked with AWS to enhance SageMaker's capabilities for packaging LLMs and models for deployment. These improvements make it straightforward to prepare and deploy these AI models, accelerating their adoption and integration.
- Multi-model endpoints for GPU – iFood collaborated with the SageMaker Inference team to launch multi-model endpoints for GPU-based instances. This enhancement lets you deploy multiple AI models on a single GPU-enabled endpoint, significantly improving resource utilization and cost-efficiency. By taking advantage of iFood's expertise in GPU optimization and model serving, SageMaker now offers a solution that can dynamically load and unload models on GPUs, reducing infrastructure costs by up to 75% for customers with multiple models and varying traffic patterns.
- Asynchronous inference – Recognizing the need to handle long-running inference requests, the team at iFood worked closely with the SageMaker Inference team to develop and launch asynchronous inference in SageMaker. This feature enables you to process large payloads or time-consuming inference requests without the constraints of real-time API calls. iFood's experience with large-scale distributed systems helped shape this solution, which now allows for better management of resource-intensive inference tasks and the ability to handle inference requests that might take several minutes to complete. This capability has opened up new use cases for AI inference, particularly in industries dealing with complex data processing tasks such as genomics, video analysis, and financial modeling.
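As one concrete example from the list above, the scale-to-zero capability can be expressed at endpoint-configuration time. To the best of our understanding (this is a sketch under stated assumptions, not iFood's configuration), it relies on managed instance scaling with a minimum instance count of zero on an inference-component-enabled endpoint; the role ARN and instance sizing below are placeholders.

```python
# Sketch of a boto3 create_endpoint_config request whose instances can scale
# down to zero when idle. The role ARN and names are placeholders; verify the
# exact fields against the current SageMaker API before relying on this.

def scale_to_zero_endpoint_config(config_name, role_arn,
                                  instance_type="ml.g5.xlarge",
                                  max_instances=2):
    """Request body for sagemaker.create_endpoint_config(**request)."""
    return {
        "EndpointConfigName": config_name,
        "ExecutionRoleArn": role_arn,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "InstanceType": instance_type,
            # A floor of zero lets the endpoint release all instances
            # between requests and reacquire them on demand.
            "ManagedInstanceScaling": {
                "Status": "ENABLED",
                "MinInstanceCount": 0,
                "MaxInstanceCount": max_instances,
            },
            "RoutingConfig": {
                "RoutingStrategy": "LEAST_OUTSTANDING_REQUESTS",
            },
        }],
    }
```

The trade-off is a cold-start delay on the first request after an idle period, which is why the feature fits dev/test and low-traffic workloads better than latency-critical paths.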
By closely partnering with the SageMaker Inference team, iFood has played a pivotal role in driving the rapid evolution of AI inference and generative AI inference capabilities in SageMaker. The features and optimizations introduced through this collaboration are empowering AWS customers to unlock the transformative potential of inference with greater ease, cost-effectiveness, and performance.
“At iFood, we have been at the forefront of adopting transformative machine learning and AI technologies, and our partnership with the SageMaker Inference product team has been instrumental in shaping the future of AI applications. Together, we’ve developed strategies to efficiently manage inference workloads, allowing us to run models with speed and price-performance. The lessons we’ve learned supported us in the creation of our internal platform, which can serve as a blueprint for other organizations looking to harness the power of AI inference. We believe the features we have built in collaboration will broadly help other enterprises that run inference workloads on SageMaker, unlocking new frontiers of innovation and business transformation by solving recurring and important problems in the universe of machine learning engineering.”
– Daniel Vieira, ML platform manager at iFood
Conclusion
Using the capabilities of SageMaker, iFood transformed its approach to ML and AI, unlocking new possibilities for enhancing the customer experience. By building a robust and centralized ML platform, iFood has bridged the gap between its data science and engineering teams, streamlining the model lifecycle from development to deployment. The integration of SageMaker features has enabled iFood to deploy ML models for both real-time and batch-oriented use cases. For real-time, customer-facing applications, iFood uses SageMaker hosted endpoints to provide immediate predictions and enhance the user experience. Additionally, the company uses SageMaker batch transform to efficiently process large datasets and generate insights for its restaurant partners. This flexibility in inference options has been key to iFood's ability to power a diverse range of intelligent applications.
The automation of deployment and retraining through ML Go!, supported by SageMaker Pipelines and SageMaker Inference, has been a game changer for iFood. It has enabled the company to rapidly iterate on its ML models, deploy updates with confidence, and maintain the ongoing performance and reliability of its intelligent applications. Moreover, iFood's strategic partnership with the SageMaker Inference team has been instrumental in driving the evolution of AI inference capabilities within the platform. Through this collaboration, iFood has helped shape cost and performance optimizations, scaling enhancements, and simplified model deployment features, all of which now benefit a wider range of AWS customers.
By taking advantage of the capabilities SageMaker offers, iFood has been able to unlock the transformative potential of AI and ML, delivering innovative solutions that enhance the customer experience and strengthen its position as the leading food-tech platform in Latin America. This journey serves as a testament to the power of cloud-based AI infrastructure and the value of strategic partnerships in driving technology-driven business transformation.
By following iFood's example, you can unlock the full potential of SageMaker for your business, driving innovation and staying ahead in your industry.
About the Authors
Daniel Vieira is a seasoned machine learning engineering manager at iFood, with a strong academic background in computer science, holding both a bachelor's and a master's degree from the Federal University of Minas Gerais (UFMG). With over a decade of experience in software engineering and platform development, Daniel leads iFood's ML platform, building a robust, scalable ecosystem that drives impactful ML solutions across the company. In his spare time, Daniel enjoys music, philosophy, and learning about new things while drinking a good cup of coffee.
Debora Fanin serves as a Senior Customer Solutions Manager at AWS for the Digital Native Business segment in Brazil. In this role, Debora manages customer transformations, creating cloud adoption strategies to support cost-effective, timely deployments. Her responsibilities include designing change management plans, guiding solution-focused decisions, and addressing potential risks to align with customer objectives. Debora's academic path includes a master's degree in Administration at FEI and certifications such as AWS Solutions Architect Associate and Agile credentials. Her professional history spans IT and project management roles across diverse sectors, where she developed expertise in cloud technologies, data science, and customer relations.
Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and Amazon SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.
Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the financial services industry with their operations on AWS. As a machine learning enthusiast, Gopi works to help customers succeed in their ML journey. In his spare time, he likes to play badminton, spend time with family, and travel.

