Written by
Stijn Van den Enden
AWS machine learning models
Reading time 7 min
8 MAY 2025

In my previous blog post, I emphasized the importance of gathering data. However, in some cases you might not have any suitable data available. You might have raw data, but perhaps it’s unlabeled and unfit for machine learning. If you don’t have the financial resources to label this data, or you don’t have a product like reCAPTCHA to do it for free, there’s another option. Since Amazon launched its Amazon Web Services cloud platform as a side business over a decade ago, it has grown at a tremendous pace. AWS now offers more than 165 services, giving anyone, from startups to multinational corporations, access to a dependable and scalable technical infrastructure. Some of these services offer what we call pre-trained machine learning models. Amazon’s pre-trained machine learning models can recognize images or objects, process text, give recommendations and more. Best of all, you can use these Deep Learning-based services without having to know anything about machine learning at all. They are trained by Amazon, using data from its websites, its massive product catalog and its warehouses.

The information on the AWS websites might be a bit overwhelming at first. That’s why in this blog post I’d like to give an overview of a few services built on Amazon’s machine learning models, which I think can easily be introduced into your applications.

Computer vision with Rekognition

Amazon Rekognition is a service that analyzes images and videos. You can use it to identify people’s faces, everyday objects or even celebrities. Practical uses include adding labels to videos, for instance following the ball during a football match, or picking out celebrities in an audience. Since Rekognition also has an API to compare faces across multiple images, you can use it to verify someone’s identity or automatically tag friends on social media.
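As a sketch of what calling Rekognition looks like through the boto3 SDK: the bucket and object keys below are hypothetical, and actually running `compare_faces` requires AWS credentials. The filtering helper is plain Python.

```python
def strong_matches(face_matches, threshold=90.0):
    """Pure helper: keep only face matches above a similarity threshold.

    `face_matches` mimics the FaceMatches list Rekognition returns,
    e.g. [{'Similarity': 97.3, ...}, ...].
    """
    return [m for m in face_matches if m['Similarity'] >= threshold]

def tag_candidates(bucket, source_key, target_key):
    """Compare a source face against the faces in a target image.

    Hypothetical bucket/keys; requires boto3 and AWS credentials to run.
    """
    import boto3  # imported here so the helper above stays dependency-free
    client = boto3.client('rekognition')
    response = client.compare_faces(
        SourceImage={'S3Object': {'Bucket': bucket, 'Name': source_key}},
        TargetImage={'S3Object': {'Bucket': bucket, 'Name': target_key}},
    )
    return strong_matches(response['FaceMatches'])
```

A social media backend could run something like this on each upload and only suggest tags for matches above the threshold.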
Speaking of social media: depending on the context of a platform, some user contributions might not be deemed acceptable. Through Rekognition, a social media platform can semi-automatically moderate suggestive or explicit content, making it possible to blur or reject uploaded media when certain labels are associated with it.

Digitize archives with Textract

Amazon Textract allows you to extract text from a scanned document. It uses Optical Character Recognition (OCR) and goes a step further by taking context into account. If your company receives a lot of printed forms instead of their digital counterparts, you might have a few thousand papers to digitize manually. With regular OCR, it’s challenging to detect where a form label ends and a form field begins. Likewise, it would be difficult for OCR to read newspapers, where text is placed in two or more columns. Textract is able to identify which groups of words belong together, whether it’s a paragraph, a form field or a data table, helping you reduce the time and effort needed to digitize those archives.

Analyze text with Comprehend

Amazon Comprehend is a Natural Language Processing (NLP) service. It helps you discover the subject of a document, key phrases, important locations, people mentioned and more. One of its features is sentiment analysis. This can give you quick insight into interactions with customers: are they happy, angry, satisfied? Amazon Comprehend can even highlight similar frustrations around a certain topic. If reviews of a certain product are automatically found to be mostly positive, you could easily incorporate this into a promotional campaign. Similarly, if reviews are mostly negative, that might be something to forward to the manufacturer. A subservice of Comprehend, called Comprehend Medical, is used to mine patient records and extract patient data and treatment information.
Its goal is to help healthcare providers quickly get an overview of previous interactions with a patient. By identifying key information in medical notes and adding some structure to it, Comprehend Medical helps medical customers process a ton of documents in a short period of time.

Take notes with Transcribe

Amazon Transcribe is a general-purpose service to convert speech to text, with support for 14 languages. It automatically adds punctuation and formatting, making the text easier to read and search through. A great application is creating a transcript from an audio file and sending it to Comprehend for further analysis. A call center could use real-time streaming transcription to detect the name of a customer and present their information to the operator. Alternatively, the call center could label conversations with keywords to analyze which issues arise frequently. One of Transcribe’s features is identifying multiple speakers. This is useful for transcribing interviews or creating meeting minutes without having one of the meeting participants spend extra time jotting everything down.

Multilingual with Translate

When you get reactions from customers on your products, you can translate them into your preferred language, so you can grasp the subtle implications of certain words. Or you can extend your reach by translating your social media posts. You can even combine Transcribe and Translate to automatically generate subtitles for live events in multiple languages.

Express yourself with Polly

The Polly service can be considered the inverse of Transcribe. With Polly, you can convert text to speech, making the voice sound as close to natural speech as possible. With support for over 30 languages and many more lifelike voices, nothing is stopping you from making your applications talk back to you. Polly supports Speech Synthesis Markup Language (SSML), which gives you more control over how certain parts of the text are pronounced.
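To make the SSML support concrete, here is a hedged sketch of a Polly call. The greeting, pause length, voice and output file are all illustrative choices, and `speak` requires boto3 and AWS credentials to actually run.

```python
def build_ssml(text, pause="1s"):
    """Pure helper: wrap text in SSML, inserting a pause after a greeting."""
    return f'<speak>Hello!<break time="{pause}"/> {text}</speak>'

def speak(text, voice="Joanna"):
    """Send SSML to Polly and write the resulting MP3 stream to disk.

    Requires boto3 and AWS credentials; 'Joanna' is just one of
    Polly's many lifelike voices.
    """
    import boto3
    polly = boto3.client('polly')
    response = polly.synthesize_speech(
        Text=build_ssml(text),
        TextType='ssml',
        OutputFormat='mp3',
        VoiceId=voice,
    )
    with open('speech.mp3', 'wb') as f:
        f.write(response['AudioStream'].read())
```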
Besides adding pauses, you can put emphasis on words, expand acronyms to their unabbreviated form and even add breathing sounds. This amount of customization makes it possible to synthesize voice samples that sound very natural. Generating realistic speech has been a key factor in the success of apps like Duolingo, where pronunciation is of great significance. You can read about this particular use case in this blog post. Bonus: if you don’t feel like reading, you can have it read to you by Polly!

Make suggestions with Personalize

When you look for any product on Amazon’s website, you immediately get suggestions for similar products or products that other customers have bought in combination. It’s mind-blowing that out of the millions of items offered by Amazon, you get an accurate list of related products the moment the page loads. This powerful tool is available to you through Amazon Personalize. You provide an item inventory (products, documents, videos, …) and some demographic information about your users, and Personalize combines this with an activity stream from your application to generate recommendations, either in real time or in bulk. This can easily be applied to a multitude of applications. You can present a list of similar items to customers of a webshop. A course provider could suggest courses similar to a topic of interest. Found a restaurant you liked? Here’s a list of similar restaurants in your area. If you can provide the data, Personalize can provide the recommendations.

Create conversations with Lex

Amazon Lex is a service that provides conversational AI. It uses the same Natural Language Understanding technology as Amazon’s virtual assistant Alexa. Users can chat with your application instead of clicking through it. Everything starts with an intent. This defines the intention of the user, the goal we want to achieve for our user.
It can be as simple as scheduling an appointment, providing directions to a location or getting a recipe that matches a list of ingredients. Intents are triggered by utterances. An utterance is something you say that has meaning. “I need an appointment with Dr. Smith”, “When can I see Dr. Smith?” and “Is Dr. Smith available next Wednesday?” are all utterances for the same intent: making an appointment. Lex is powerful enough to generalize these utterances so that slight variations also trigger the correct intent. Finally, in the case of registering an appointment, you need to specify a few slots: pieces of data the user has to provide in order to fulfill the intent. In the example above, these would be the name of the person you want to see, the time period and perhaps the reason for your visit. Even though the requirements are pretty simple, everything depends on the quality of the utterances and the chaining of intents. If you don’t have enough sample sentences, or the conversation keeps asking for information the user has already given, your user will end up frustrated and overwhelmed.

Predict demand with Forecast

A fairly new service provided by AWS is called Forecast. This service also emerged from Amazon’s own need to estimate the demand for its immense product inventory. With Forecast, you can gain insight into historical time series data. For instance, you could analyze the energy consumption of a region to project it into the near future. This gives you a probabilistic estimate of tomorrow’s electricity demand. Likewise, you might be able to predict that a component of your production facility needs maintenance before it wears out. Forecast can leverage Automated Machine Learning (AutoML) to find the optimal learning parameters for your use case. The quality of this service depends on the amount and quality of the data you can provide.
Until very recently, this service was only available to a select group, but it is now available to everyone. You can sign up for Forecast here.

🚀 Takeaway

If you want to bring machine learning to your customers but are held back by a lack of understanding, Amazon offers out-of-the-box services to add intelligence to your applications. These services, trained and used by Amazon itself, can help your business grow and give a personal experience to your customers, without requiring any prior knowledge of machine learning.

machine learning
Reading time 4 min
6 MAY 2025

Whether we unlock our phones with facial recognition, shout voice commands to our smart devices from across the room or get served a list of movies we might like… machine learning has in many cases changed our lives for the better. However, as with many great technologies, it has its dark side as well. A major one is the massive, often unregulated, collection and processing of personal data. Sometimes it seems that for every positive story, there’s a negative one about our privacy being at risk. It’s clear that we’re forced to give privacy the attention it deserves. Today I’d like to talk about how we can use machine learning applications without privacy concerns, or worries that private information might become public.

Machine learning with edge devices

By placing the intelligence on edge devices on premise, we can ensure that certain information never leaves the sensor that captures it. An edge device is a piece of hardware used to process data close to its source. Instead of sending videos or sound to a centralized processor, the data is dealt with on the machine itself. In other words, you avoid transferring all this data to an external application or a cloud-based service. Edge devices are often used to reduce latency: instead of waiting for the data to travel across a network, you get an immediate result. Another reason to employ an edge device is to reduce the cost of bandwidth. Devices that use a mobile network might not operate well in rural areas. Self-driving cars, for example, take full advantage of both of these properties. Sending each video capture to a central server would be too time-consuming, and the total latency would interfere with the quick reactions we expect from an autonomous vehicle. Even though these are important aspects to consider, the focus of this blog post is privacy.
With the General Data Protection Regulation (GDPR) put into effect by the European Parliament in 2018, people have become more aware of how their personal information is used. Companies have to ask for consent to store and process this information. Moreover, violations of this regulation, for instance by not taking adequate security measures to protect personal data, can result in large fines. This is where edge devices excel. They can immediately process an image or a sound clip without the need for external storage or processing. Since they don’t store the raw data, this information becomes volatile. For instance, an edge device could use camera images to count the number of people in a room. If the camera image is processed on the device itself and only the size of the crowd is forwarded, everybody’s privacy remains guaranteed.

Prototyping with Edge TPU

Coral, a sub-brand of Google, is a platform that offers software and hardware tools to use machine learning. One of the hardware components it offers is the Coral Dev Board, which has been announced as “Google’s answer to Raspberry Pi”. The Coral Dev Board runs a Linux distribution based on Debian and has everything on board to prototype machine learning products. Central to the board is a Tensor Processing Unit (TPU), created to run TensorFlow (Lite) operations in a power-efficient way. You can read about TensorFlow and how it enables fast machine learning in one of our previous blog posts. If you look closely at a machine learning process, you can identify two stages. The first stage is training a model from examples so that it can learn certain patterns. The second stage is applying the model’s capabilities to new data. With the dev board above, the idea is that you train your model on cloud infrastructure. That makes sense, since this step usually requires a lot of computing power.
Once all the elements of your model have been learned, they can be downloaded to the device using a dedicated compiler. The result is a little machine that can run a powerful artificial intelligence algorithm while disconnected from the cloud.

Keeping data local with Federated Learning

The process above might make you wonder which data is used to train the machine learning model. There are a lot of publicly available datasets you can use for this step. In general, these datasets are stored on a central server. To avoid this, you can use a technique called Federated Learning. Instead of having the central server train the entire model, several nodes or edge devices do this individually. Each node sends updates on the parameters it has learned, either to a central server (single party) or to each other in a peer-to-peer setup (multi party). All of these changes are then combined to create one global model. The biggest benefit of this setup is that the recorded (sensitive) data never leaves the local node. This has been used, for example, in Apple’s QuickType keyboard for predicting emojis, based on the usage of a large number of users. Earlier this year, Google released TensorFlow Federated to create applications that learn from decentralized data.

Takeaway

At ACA we highly value privacy, and so do our customers. Keeping your personal data and sensitive information private is (y)our priority. With techniques like federated learning, we can help you unleash your AI potential without compromising on data security. Curious how exactly that would work in your organization? Send us an email through our contact form and we’ll soon be in touch.
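The parameter averaging at the heart of federated learning (often called federated averaging) can be sketched in a few lines of plain Python. This is a conceptual illustration under simplified assumptions, not TensorFlow Federated’s actual API: each node only shares its learned parameters, weighted by how many local examples it trained on.

```python
def federated_average(client_params, client_sizes):
    """Combine per-node parameter lists into one global model.

    Each node trains locally and shares only its parameters; the raw
    data never leaves the node. Contributions are weighted by the
    number of local training examples.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(p[i] * size for p, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]

# Two nodes with two parameters each; the second node saw three times
# as much data, so its parameters weigh three times as heavily.
global_model = federated_average([[0.0, 2.0], [4.0, 6.0]], [1, 3])
```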

ai
Reading time 5 min
6 MAY 2025

In the near future, Artificial Intelligence (AI) will bring your company to the next level: increased productivity, better use of resources, improved maintainability, staffing efficiency and much more. But before that can happen, you need to collect data and provide enough examples to train your AI algorithms. Whether your company is active in the financial sector or the medical sector, whether you’re focused on warehousing or garbage disposal, every company has one thing in common: data already flows through the organization. This blog post aims to make you aware of the importance of data collection as a stepping stone to Artificial Intelligence. Only when your data is visible, adequate, complemented with external data and representative of your demographic can you profit from the opportunities that present themselves in today’s world and make better business decisions.

What is Artificial Intelligence?

Artificial Intelligence (AI) in its simplest form is the imitation of human intelligence by a machine. In other words, it enables programs to make human-like decisions and follow human-like reasoning. A popular subdomain of Artificial Intelligence is Machine Learning. Instead of explicitly programming a set of rules, Machine Learning applications deduce patterns from examples and ‘learn’ how things work.

Unhide your data

Accessible data can be put to good use. Surely somebody knows how many people work for your company, how much inventory you keep, how much stock you’ve been moving over the last couple of months, and how your factory scores on efficiency and productivity. But what happens with this data once it has been acquired? A nice presentation to the board? Are these numbers stored somewhere in the cloud? Perhaps they are available in a centralized database? Or, worst of all, perhaps they’re in an Excel file on a private drive, collecting dust? In many companies, only a limited number of people have access to certain assets.
Since this implies that data is isolated from the rest of the organization, we call them information silos. Not only does this imply distrust within the organization, it also limits the team or application processing the data. Different teams might interpret the same data differently, or a correlation between features might remain hidden because the data is distributed over different silos. There’s a big advantage when data is generally available in a standardized way: you can rely on the trustworthiness of the source, and you can guarantee a minimum of quality and completeness. If you build a company culture centered around data and start collecting that data in a uniform way today, it will fuel your artificial intelligence tomorrow.

Keep more than just YOUR data

Although predicting the future is never certain, you can avoid surprises by incorporating external factors. For instance, when you’re selling electric cars, an increasing oil price might have a positive influence on your sales, while a change in government policy might have a negative one. A heat wave might require that your employees get more breaks to prevent exhaustion, which influences productivity. Even annotating data with company initiatives can be beneficial: marketing campaigns (hopefully) result in increased visibility of your organization and solutions, which leads to more sales. That’s why the numbers of your organization should be stored together with external facts and figures that impact the processes that are valuable to your business. A machine learning algorithm can easily consider these extra parameters to extract a connection between multiple sets of data. It’s able to make a distinction between seasonal effects, the effect of climatic conditions and a general trend of increasing sales. Centralizing decision-making around company data is important, but so is external data: the world around us changes constantly.
Be prepared to collect a LOT of data.

Be wary of biased data

There are many examples where data mining has wrongly concluded that a certain input feature is significant. Having a complete representation of your inventory or customer base is vital to the impact of data analysis. Besides that, normalization of your input can prevent your model from ever becoming aware of unwanted features. A neural network designed to detect skin cancer, for example, learned to associate the presence of a ruler next to a tumor with malignancy when analyzing pictures. In an attempt to classify wolves and huskies, scientists deliberately selected images with a specific background to train their algorithm, proving that biased data leads to an inaccurate machine learning model. This is a difficulty that even experienced data scientists face. No wonder experts say they spend more time preparing the data than designing and training models…

“It makes more sense to worry about the data and be less picky about what algorithm to apply.” – Artificial Intelligence: A Modern Approach (S. Russell and P. Norvig)

Even though collected data is very valuable for your company, you probably didn’t collect it with AI applications in mind. It therefore likely contains disruptive features that will influence the learning process. It’s vital to reflect on and assess your data collection from here on out if you want to prepare it for use in AI applications.

Takeaway

More and more companies are becoming data-driven in order to gain a competitive advantage. To understand how certain aspects influence your productivity, it’s important to collect high-quality data. When your sources are reliable and you have a suitable application to present insightful patterns, you can use this to support business decisions. Today, the hard part is not collecting the data. There are enough tools that will help you do just that.
The real challenge lies in capturing and structuring the right data. Finding a solution that fits your specific case isn’t easy, but you can start by setting up a database or data warehouse, thinking about how you’ll structure your data, and then applying it. If you need help or have questions, click here to contact us and shoot us a message! Take action today, because knowing how to realize this takes time and practice. Prepare your company for a data-driven culture and start building knowledge on machine learning to leverage the potential benefit you gain from your data.

How we built an intelligent stock management system
Reading time 5 min
5 MAY 2025

What’s the ideal store supply level for a product? How can we determine the number of future sales? Is it possible to reduce the number of product deliveries without going out of stock? ACA is building an intelligent stock management system for a customer with about 30 retail locations. Between these stores, that customer sells several thousand products. Who is able to answer the questions above for all these products, for every store, at all times? Let me tell you the story of how we took a dive into historical data, combined some data science with machine learning, and got some answers to the questions above.

Gathering Data

For each individual product in the catalogue, a shop manager has to determine the desired store supply. However, a human shop manager simply can’t take as many variables into account as an AI model. This results in more anticipation (a larger buffer of stock) and therefore higher storage costs. Our goal is to help the shop manager figure out the right amount of stock with an intelligent stock management system powered by machine learning. By looking at the evolution of the past, we can give an indication of how many goods will most likely be sold in the coming period.

“If one wants to define the future, they must study the past.” – Confucius

Before we started to think about a model for predicting product demand, we explored the sales data. From the application we are building, we had about 9 months of product history. We were able to consult legacy systems to supplement our data. These two combined gave us 21 months of sales data, which is still less than ideal: when you want to detect seasonal effects, you need multiple years’ worth of data. We decided to give it a go anyway. Our goal was to assist people in their process, not to automate the current system based on the model’s predictions. Some products are more popular depending on the weather. For instance, de-icing salt sells more on days with freezing conditions.
The popularity of pumpkin lanterns peaks right before October 31st. This was also the case for some of the items in our client’s product catalogue. Depending on the type of product, the temperature, precipitation or hours of sunshine might be a factor in customer demand, so we gathered historical weather data for the location of the shop. Needless to say, store opening times, promotions for the product itself or for similar products, price variations and product unavailability all influence customer behavior as well. Public holidays or sports events might also affect your business. By adding all these predictor variables, you can further improve the accuracy of a time series model.

Building the model

Now that we had found the necessary data, it was time to start working on a model. Because a daily prediction was too fine-grained for our client, we decided to target a weekly prediction, for a few reasons:

- While uncertainty increases the further you look into the future, the running costs of a daily prediction are too high.
- There is not enough variation for articles that sell only 0 to 3 times a week.
- A weekly restock is ideal for most retailers and/or suppliers.

However, a weekly prediction posed an additional challenge. On average, there are 52.18 weeks in a year, which means that seasonal effects might take place somewhat later each year. There are advantages, too: a weekly prediction gave us the ability to include less popular products, which are not sold on a daily basis. We considered a few techniques for predicting time series. Because of the limited timeframe of the data, we went for a model based on structural time series. To implement the model, we selected the STS module from TensorFlow Probability. Below is the result of a prediction from our model. The red line represents the number of articles in the store for a particular product. The blue line is the weekly prediction of our model, reduced by the items sold each day for that product.
Even though at some points we’re going out of stock, this gives a pretty good estimate of how much supply the store needs in the coming week.

Putting it all together

It’s difficult to put an exact value on the cost of oversupply. By looking at the total number of articles stored per week, we gathered that our model would reduce the inventory carrying cost by almost 75%. Clearly, empty shelves in a store are not appealing from a customer’s point of view, but this information gives our client the opportunity to reduce the size and frequency of deliveries to an optimal point. In addition to the fact that the model gives a good prediction, we can get information on how much influence a feature has on the model’s prediction. A structural time series is represented as the sum of simpler components, which means we can actually see what effect the temperature has on sales. Furthermore, if we were to start a marketing campaign for this product, we can infer the causal impact: we can estimate how many products would have been sold had we not run a promotion. There’s often a big challenge in explaining how exactly a machine learning model produces its output. With structural time series, we are able to point out which features have the biggest influence on the prediction. The graph above shows the influence of a season (13 weeks) on product sales. Even in this short time period, there’s a clear increase in sales in July.

Takeaway

There’s no easy way to predict the future. But by looking back in time, we can discover patterns which we can project forward. We used this technique to give one of our clients an idea of how many sales they might generate in the coming week(s). Going further, we can assume our model becomes more and more reliable as the historical data grows. I started this blog post by asking who would be able to determine the store supply of thousands of products in multiple stores.
With a little nudge in the right direction from an intelligent stock management system, anybody can be that person.
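For the curious, the structural time series setup described above can be sketched with TensorFlow Probability’s STS module. The component choices and the feature names are illustrative, and `build_demand_model` requires tensorflow and tensorflow_probability to run; the weekly aggregation helper is plain Python.

```python
def weekly_totals(daily_sales):
    """Aggregate daily unit sales into weekly totals (assumes whole weeks)."""
    return [sum(daily_sales[i:i + 7]) for i in range(0, len(daily_sales), 7)]

def build_demand_model(weekly_sales, design_matrix):
    """Sum of simple components: trend + yearly seasonality + regressors.

    `design_matrix` holds predictor variables (temperature, promotions, ...)
    as one column per feature. Requires tensorflow_probability.
    """
    import tensorflow_probability as tfp
    sts = tfp.sts
    trend = sts.LocalLinearTrend(observed_time_series=weekly_sales)
    seasonal = sts.Seasonal(num_seasons=52, observed_time_series=weekly_sales)
    regression = sts.LinearRegression(design_matrix=design_matrix)
    return sts.Sum([trend, seasonal, regression],
                   observed_time_series=weekly_sales)
```

Because the model is an explicit sum of components, the contribution of each one (trend, seasonality, each regressor) can be inspected separately, which is exactly what makes the feature-influence analysis above possible.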

Reading time 6 min
25 SEP 2019

Training machine learning models can take up a lot of time if you don’t have the hardware to support it. For instance, with neural networks you need to calculate the contribution of every neuron to the total error during each training step. This can result in thousands of calculations per step. Complex models and large datasets make for a long training process, and evaluating such models at scale can potentially slow down your application’s performance. Not to mention the hyperparameters you need to tune, restarting the process a few times over. In this blog post I want to talk about how you can tackle these issues by making maximum use of your resources. In particular, I want to talk about TensorFlow, a framework designed for parallel computing, and Kubernetes, a platform able to scale up or down with application usage.

TensorFlow

TensorFlow is an open-source library for building and training machine learning models. Originally a Google project, it has had many successes in the field of AI. It is available in multiple layers of abstraction, which allows you to quickly set up predefined machine learning models. TensorFlow was designed to run on distributed systems. The computations it requires can be run in parallel across multiple devices, using data flow graphs underneath. These represent a series of mathematical equations, with multidimensional array representations (tensors) at their edges. DeepMind used this power to create AlphaGo Zero, using 64 GPU workers and 19 CPU parameter servers to play 4.9 million games of Go against itself in just 3 days.

Kubernetes

Kubernetes is a Google project as well: an open-source platform for managing containerized applications at scale. With Kubernetes, you can easily add more instance nodes and get more out of your available hardware. You can compare Kubernetes to cash registers at the supermarket.
Whenever there’s a long queue of customers waiting, the store quickly opens up a new register to handle a few of those customers. In this analogy, each cash register is a virtual machine running a service, and the customers are consumers of that service. The power of Kubernetes is in its ease of use. You don’t need to add newly created instances to the load balancer; it’s done automatically. You don’t need to connect the new instance with file storage or networks; Kubernetes does it for you. And if an instance doesn’t behave like it should, Kubernetes kills it off and immediately spins up a new one.

Distributed Training

Like I mentioned before, you can reduce the time it takes to train a model by doing computations in parallel over different hardware units. Even with a limited configuration, you can reduce your training time to a minimum by distributing it over multiple devices. TensorFlow allows you to use CPUs, GPUs and even TPUs (Tensor Processing Units), chips designed to run TensorFlow operations. You need to define a Strategy and make sure you create and compile your model within the scope of that strategy.

```python
mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    model.compile(loss='mse', optimizer='sgd')
```

The MirroredStrategy above allows you to distribute training over multiple GPUs on the same machine. The model is replicated for every GPU and variable updates are executed for every replica. A more interesting variant of this strategy is the MultiWorkerMirroredStrategy. It gives you the opportunity to distribute the training over multiple machines (workers), each of which may use multiple GPUs. This is where Kubernetes can help fast-track your machine learning. You can create multiple service nodes with Kubernetes according to the need for parameter servers and workers.
Parameter servers keep track of the model parameters, while workers calculate the updates to those parameters. In general, you can reduce the bandwidth between the members of the cluster by adding more parameter servers. To make the setup run, you need to set an environment variable TF_CONFIG which defines the role of each node and the layout of the rest of the cluster.

```python
os.environ["TF_CONFIG"] = json.dumps({
    'cluster': {
        'worker': ['worker-0:5000', 'worker-1:5000'],
        'ps': ['ps-0:5000']
    },
    'task': {'type': 'ps', 'index': 0}
})
```

To make the setup easier, there’s a GitHub repository with a template for Kubernetes. Note that it doesn’t set TF_CONFIG itself, but passes its content as parameters to the script. These parameters are used to define which devices can be used in a distributed training.

```python
cluster = tf.train.ClusterSpec({
    "worker": ['worker-0:5000', 'worker-1:5000'],
    "ps": ['ps-0:5000']})

server = tf.train.Server(
    cluster, job_name='ps', task_index=0)
```

The ClusterSpec specifies the worker and parameter servers in the cluster; it has the same value on every node. The Server contains the definition of the current node’s task, and hence a different value per node.

TensorFlow Serving

For distributed inference, TensorFlow contains a package for hosting machine learning models, called TensorFlow Serving. It has been designed to quickly set up and manage machine learning models. All it needs is a SavedModel representation. SavedModel is a format for saving trained TensorFlow models so that they can easily be loaded and restored. A SavedModel can be serialized into a directory, making it portable and easy to share. You can quickly create one by using the built-in function save.
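Since every node needs a TF_CONFIG value with the same cluster layout but its own task description, it can be convenient to generate all of them from a single definition. A small sketch of that idea — the helper name and host addresses are made up for illustration:

```python
import json

def make_tf_config(cluster, task_type, task_index):
    """Build the TF_CONFIG value for one node: the shared cluster
    layout plus that node's own task description."""
    return json.dumps({
        "cluster": cluster,
        "task": {"type": task_type, "index": task_index},
    })

# The cluster layout, identical for every node
cluster = {
    "worker": ["worker-0:5000", "worker-1:5000"],
    "ps": ["ps-0:5000"],
}

# One TF_CONFIG string per node in the cluster
configs = {
    (role, i): make_tf_config(cluster, role, i)
    for role, hosts in cluster.items()
    for i in range(len(hosts))
}

print(configs[("ps", 0)])
```

A Kubernetes template can inject the right string into each pod as an environment variable, so no node has to be configured by hand.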
```python
model_version = '1'
model_name = 'my_model'
model_path = os.path.join('/path/to/save/dir/', model_name, model_version)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
model.compile(loss='mse', optimizer='sgd')
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid))

tf.saved_model.save(model, model_path)
```

You can use the SavedModel CLI to inspect SavedModel files. Once you have these files in place, TensorFlow Serving can turn them into a gRPC or RESTful interface. The Docker image tensorflow/serving provides the easiest path towards a running server. There are multiple versions of this image, including one for GPU usage. Besides choosing the right image, you only need to mount the directory you just created and name your model.

```shell
$ docker run -t --rm -p 8500:8500 -p 8501:8501 \
    -v "/path/to/save/dir/my_model:/models/my_model" \
    -e MODEL_NAME=my_model \
    tensorflow/serving
```

With Kubernetes you can now create a deployment for this image and scale the number of replicas up or down automatically. Put a LoadBalancer Service in front of it, and your users will be redirected to the right node without anyone noticing. Because inference requires much less computation than training, you don’t have to distribute it across multiple nodes.

Note that the save path also contains a version directory. This is a convention TensorFlow Serving relies on: it watches the directory for new versions of a SavedModel, and when it detects one, it loads it automatically, ready to be served. With TensorFlow Serving and Kubernetes, you can handle heavy load for your classification, regression or prediction models. 🚀

Takeaway

You can save a lot of time by distributing the computations your machine learning project needs. By combining a highly scalable library like TensorFlow with a flexible platform like Kubernetes, you can make optimal use of your resources and your time.
Of course, you can speed things up even more with a knowledgeable Kubernetes team at your side, or somebody to help tune your machine learning models. If you’re ready to ramp up your machine learning, we can do exactly that! Interested or have questions? Shoot me an email at stijn.vandenenden@acagroup.be
