We learn & share

ACA Group Blog

Read more about our thoughts, views, and opinions on various topics, important announcements, useful insights, and advice from our experts.

Featured

8 MAY 2025
Reading time 5 min

In the ever-evolving landscape of data management, investing in platforms and navigating migrations between them is a recurring theme in many data strategies. How can we ensure that these investments remain relevant and can evolve over time, avoiding endless migration projects? The answer lies in embracing 'composability', a key principle for designing robust, future-proof data (mesh) platforms.

Is there a silver bullet we can buy off-the-shelf?

The data-solution market is flooded with vendor tools positioning themselves as the platform for everything, the all-in-one silver bullet. It's important to know that there is no silver bullet. While opting for a single off-the-shelf platform might seem like a quick and easy solution at first, it can lead to problems down the line. These monolithic off-the-shelf platforms often turn out to be too inflexible to support all use cases, insufficiently customizable, and eventually outdated. The result: big, complicated migration projects to the next silver-bullet platform, and organizations ending up with multiple all-in-one platforms, causing disruptions in day-to-day operations and hindering overall progress.

Flexibility is key to your data mesh platform architecture

A complete data platform must address numerous aspects: data storage, query engines, security, data access, discovery, observability, governance, developer experience, automation, a marketplace, data quality, and so on. Some vendors claim their all-in-one data solution can tackle all of these. Typically, however, such a platform excels in certain aspects but falls short in others. For example, a platform might offer a high-end query engine but lack depth in the data marketplace included in the solution. To future-proof your platform, it must incorporate the best tools for each aspect and evolve as new technologies emerge. Today's cutting-edge solutions can be outdated tomorrow, so flexibility and evolvability are essential for your data mesh platform architecture.

Embrace composability: engineer your future

Rather than locking into one single tool, aim to build a platform with composability at its core. Picture a platform where different technologies and tools can be seamlessly integrated, replaced, or evolved, with an integrated and automated self-service experience on top. A platform that is both generic at its core and flexible enough to accommodate the ever-changing landscape of data solutions and requirements. A platform with a long-term return on investment, because it allows you to expand capabilities incrementally and avoid costly, large-scale migrations. Composability enables you to continually adapt your platform's capabilities by adding new technologies under the umbrella of one stable core platform layer.

Two key ingredients of composability

- Building blocks: the individual components that make up your platform.
- Interoperability: all building blocks must work together seamlessly to create a cohesive system.

An ecosystem of building blocks

When building composable data platforms, the key lies in sourcing the right building blocks. But where do we get these? Traditional monolithic data platforms aim to solve all problems in one package, but this stifles the flexibility that composability demands. Instead, vendors should focus on decomposing these platforms into specialized, cost-effective components that excel at addressing specific challenges.
By offering targeted solutions as building blocks, they empower organizations to assemble a data platform tailored to their unique needs. In addition to vendor solutions, open-source data technologies offer a wealth of building blocks. It should be possible to combine vendor-specific and open-source tools into a data platform tailored to your needs. This approach enhances agility, fosters innovation, and allows for continuous evolution by integrating the latest and most relevant technologies.

Standardization as glue between building blocks

To create a truly composable ecosystem, the building blocks must be able to work together: interoperability. This is where standards come into play, enabling seamless integration between data platform building blocks. Standardization ensures that different tools can operate in harmony, offering a flexible, interoperable platform.

Imagine a standard for data access management that allows seamless integration across various components. It would enable an access management building block to list data products and grant access uniformly. Simultaneously, it would allow data storage and serving building blocks to integrate their data and permission models, ensuring that any access management solution can be effortlessly composed with them. This creates a flexible ecosystem where data access is consistently managed across different systems.

The discovery of data products in a catalog or marketplace can be greatly enhanced by adopting a standard specification for data products. With such a standard, each data product can be made discoverable in a generic way. When data catalogs or marketplaces adopt the standard, you gain the flexibility to choose and integrate any catalog or marketplace building block into your platform, fostering a more adaptable and interoperable data ecosystem.

A data contract standard allows data products to specify their quality checks, SLOs, and SLAs in a generic format, enabling smooth integration of data quality tools with any data product. It lets you combine the best solutions for ensuring data reliability across different platforms. Widely accepted standards are key to ensuring interoperability through agreed-upon APIs, SPIs, contracts, and plugin mechanisms. In essence, standards act as the glue that binds a composable data ecosystem.

A strong belief in evolutionary architectures

At ACA Group, we firmly believe in evolutionary architectures and platform engineering, principles that seamlessly extend to data mesh platforms. It's not about locking yourself into a rigid structure but about creating an ecosystem that can evolve, staying at the forefront of innovation. That's where composability comes in. Do you want a data platform that not only meets your current needs but also paves the way for the challenges and opportunities of tomorrow?

Let's engineer it together

Ready to learn more about composability in data mesh solutions?
Contact us now!

Read more

All blog posts

Let's talk!

We'd love to talk to you!

Contact us and we'll get you connected with the expert you deserve!


AWS machine learning models
Reading time 7 min
8 MAY 2025

In my previous blog post, I emphasized the importance of gathering data. However, in some cases you might not have any suitable data available. You might have raw data, but perhaps it's unlabeled and unfit for machine learning. If you don't have the financial resources to label this data, or you don't have a product like reCAPTCHA to do it for free, there's another option.

Since Amazon launched its Amazon Web Services cloud platform as a side business over a decade ago, it has grown at a tremendous pace. AWS now offers more than 165 services, giving anyone, from startups to multinational corporations, access to a dependable and scalable technical infrastructure. Some of these services offer what we call pre-trained machine learning models. Amazon's pre-trained machine learning models can recognize images or objects, process text, give recommendations and more. The best part of it all is that you can use services based on deep learning without having to know anything about machine learning at all. These services are trained by Amazon, using data from its websites, its massive product catalog and its warehouses. The information on the AWS websites might be a bit overwhelming at first. That's why in this blog post I'd like to give an overview of a few services using Amazon's machine learning models that I think can easily be introduced into your applications.

Computer vision with Rekognition

Amazon Rekognition is a service that analyzes images and videos. You can use this service to identify people's faces, everyday objects or even different celebrities. Practical uses include adding labels to videos, for instance following the ball during a football match, or picking out celebrities in an audience. Since Rekognition also has an API to compare similarities between persons in multiple images, you can use it to verify someone's identity or automatically tag friends on social media. Speaking of social media: depending on the context of a platform, some user contributions might not be deemed acceptable. Through Rekognition, a social media platform can semi-automatically control suggestive or explicit content, giving it the opportunity to blur or deny uploaded media when certain labels are associated with it.

Digitalize archives with Textract

Amazon Textract allows you to extract text from a scanned document. It uses Optical Character Recognition (OCR) and goes a step further by taking context into account. If your company receives a lot of printed forms instead of their digital counterparts, you might have a few thousand papers you need to digitize manually. With regular OCR, it's challenging to detect where a form label ends and a form field begins. Likewise, it would be difficult for OCR to read newspapers, where text is placed in two or more columns. Textract is able to identify which groups of words belong together, whether it's a paragraph, a form field or a data table, helping you reduce the time and effort needed to digitize those archives.
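Both services are available through the AWS CLI as well as the SDKs. As a rough sketch of how little code is involved (the bucket and file names below are illustrative placeholders, not from the original post), detecting labels in an image and analyzing a scanned form could look like this:

# Label everyday objects in an image stored in S3 (hypothetical bucket/key)
aws rekognition detect-labels \
  --image '{"S3Object":{"Bucket":"my-media-bucket","Name":"match-photo.jpg"}}' \
  --max-labels 10

# Flag suggestive or explicit content before publishing an upload
aws rekognition detect-moderation-labels \
  --image '{"S3Object":{"Bucket":"my-media-bucket","Name":"user-upload.jpg"}}'

# Extract form fields and tables, not just raw text, from a scanned document
aws textract analyze-document \
  --document '{"S3Object":{"Bucket":"my-scans-bucket","Name":"intake-form.png"}}' \
  --feature-types FORMS TABLES

Each call returns JSON with the detected labels or key-value pairs plus confidence scores, which you can filter against a threshold that suits your use case.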
Analyze text with Comprehend

Amazon Comprehend is a Natural Language Processing (NLP) service. It helps you discover the subject of a document, key phrases, important locations, people mentioned and more. One of its features is analyzing the sentiment in a text. This can give you quick insight into interactions with customers: are they happy, angry, satisfied? Amazon Comprehend can even highlight similar frustrations around a certain topic. If reviews around a certain product are automatically found to be mostly positive, you could easily incorporate this in a promotional campaign. Similarly, if reviews are mostly negative, that might be something to forward to the manufacturer.

A subservice of Comprehend, called Comprehend Medical, is used to mine patient records and extract patient data and treatment information. Its goal is to help health care providers quickly get an overview of previous interactions with a patient. By identifying key information from medical notes and adding some structure to it, Comprehend Medical helps medical customers process a ton of documents in a short period of time.

Take notes with Transcribe

Amazon Transcribe is a general-purpose service to convert speech to text, with support for 14 languages. It automatically adds punctuation and formatting, making the text easier to read and search through. A great application is creating a transcript from an audio file and sending it to Comprehend for further analysis. A call center could use real-time streaming transcription to detect the name of a customer and present their information to the operator. Alternatively, the call center could label conversations with keywords to analyze which issues arise frequently. One of Transcribe's features is identifying multiple speakers. This is useful for transcribing interviews or creating meeting minutes without having one of the meeting participants spend extra time jotting everything down.

Multilingual with Translate

When you're getting reactions from customers on your products, you can translate them into your preferred language, so you can grasp subtle implications of certain words. Or you can extend your reach by translating your social media posts. You can even combine Transcribe and Translate to automatically generate subtitles for live events in multiple languages.

Express yourself with Polly

The Polly service can be considered the inverse of Transcribe. With Polly, you can convert text to speech, making the voice sound as close to natural speech as possible. With support for over 30 languages and many lifelike voices, nothing is stopping you from making your applications talk back to you. Polly has support for Speech Synthesis Markup Language (SSML), which gives you more control over how certain parts of the text are pronounced. Besides adding pauses, you can put emphasis on words, exchange acronyms with their unabbreviated forms and even add breathing sounds. This amount of customization makes it possible to synthesize voice samples that sound very natural. Generating realistic speech has been a key factor in the success of apps like Duolingo, where pronunciation is of great significance. You can read about this particular use case in this blog post. Bonus: if you don't feel like reading, you can have it read to you by Polly!
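To give an idea of the entry barrier, here is a hedged sketch of both directions, text analysis and speech synthesis, via the AWS CLI (the sample text and output file name are illustrative):

# Detect the overall sentiment (POSITIVE, NEGATIVE, NEUTRAL or MIXED) of a review
aws comprehend detect-sentiment \
  --language-code en \
  --text "The delivery was quick and the product works great!"

# Turn a sentence into a lifelike MP3 voice sample
aws polly synthesize-speech \
  --output-format mp3 \
  --voice-id Joanna \
  --text "Thank you for reading this blog post." \
  hello.mp3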
Make suggestions with Personalize

When you look for any product on Amazon's website, you immediately get suggestions for similar products or products that other customers have bought in combination. It's mind-blowing that out of the millions of items offered by Amazon, you get an accurate list of related products the moment the page loads. This powerful tool is available to you through Amazon Personalize. You need to provide an item inventory (products, documents, videos, …) and some demographic information about your users, and Personalize will combine this with an activity stream from your application to generate recommendations, either in real time or in bulk. This can easily be applied to a multitude of applications. You can present a list of similar items to customers of a webshop. A course provider would be able to suggest courses similar to a topic of interest. Found a restaurant you liked? Here's a list of similar restaurants in your area. If you can provide the data, Personalize can provide the recommendations.

Create conversations with Lex

Amazon Lex is a service that provides conversational AI. It uses the same Natural Language Understanding technology as Amazon's virtual assistant Alexa. Users can chat with your application instead of clicking through it. Everything starts with an intent. This defines the intention of the user, the goal we want to achieve for our user. It can be as simple as scheduling an appointment, providing directions to a location or getting a recipe that matches a list of ingredients. Intents are triggered by utterances. An utterance is something you say that has meaning. "I need an appointment with Dr. Smith", "When can I see Dr. Smith?" and "Is Dr. Smith available next week Wednesday?" are all utterances for the same intent: making an appointment. Lex is powerful enough to generalize these utterances so that slight variations can also trigger the correct intent. Finally, in the case of registering an appointment, you need to specify a few slots: pieces of data the user is required to provide in order to fulfill the intent. In the example above, that would be the name of the person you want to see, the time period and perhaps the reason for your visit. Even though the requirements are pretty simple, everything depends on the quality of the utterances and the chaining of intents. If you don't have enough sample sentences, or the conversation keeps asking for information the user already provided, your user will end up frustrated and overwhelmed.

Predict demand with Forecast

A fairly new service provided by AWS is called Forecast. This service also emerged from Amazon's own necessity to estimate the demand for its immense product inventory. With Forecast, you can get insight into historical time series data. For instance, you could analyze the energy consumption of a region to project it into the near future. This gives you a probability of what the electricity demand would be tomorrow. Likewise, you might be able to predict that a component of your production facility needs maintenance before it wears out. Forecast can leverage Automated Machine Learning (AutoML) to find the optimal learning parameters to fit your use case. The quality of this service depends on the amount and quality of the data you can provide. This service used to be available only to a select group until very recently, but is now available to everyone. You can sign up for Forecast here.

🚀 Takeaway

If you want to bring machine learning to your customers but are held back by a lack of understanding, Amazon offers out-of-the-box services to add intelligence to your applications. These services, trained and used by Amazon, can help your business grow and can give a personal experience to your customers, without any prior knowledge of machine learning.

Read more
AWS re:Invent 2021
Reading time 5 min
6 MAY 2025

Like every year, Amazon held its AWS re:Invent 2021 in Las Vegas. While we weren't able to attend in person due to the pandemic, as an AWS Partner we were eager to follow the digital event. Below is a quick rundown of our highlights of the event to give you a summary in case you missed it!

AWS closer to home

AWS will build 30 new 'Local Zones' in 2022, including one in our home base: Belgium. AWS Local Zones are a type of infrastructure deployment that places compute, storage, database, and other select AWS services close to large population and industry centers. The Belgian Local Zone should be operational by 2023.

Additionally, the possibilities of AWS Outposts have increased. The most important change is that you can now run far more services on your own server delivered by AWS. Quick recap: AWS Outposts is a family of fully managed solutions delivering AWS infrastructure and services to virtually any on-premises or edge location for a consistent hybrid experience. Outposts was previously only available in a 42U Outposts rack configuration. From now on, AWS offers a variety of form factors, including 1U and 2U Outposts servers for when there's less space available. We're very tempted to get one for the office…

AWS EKS Anywhere was previously announced, but is now a reality! With this service, it's possible to set up a Kubernetes cluster on your own infrastructure or infrastructure from your favorite cloud provider, while still managing it through AWS EKS. All the benefits of freedom of choice combined with the unified overview and dashboard of AWS EKS. Who said you can't have your cake and eat it too?

Low-code to regain primary focus

With Amplify Studio, AWS takes the next step in low-code development. Amplify Studio is a fully-fledged low-code generator platform that builds upon the existing Amplify framework. The platform allows users to build applications through drag and drop, with the possibility of adding custom code wherever necessary. Definitely something we'll be looking at on our next Ship-IT Day!

Machine learning going strong(er)

Ever wanted to start with machine learning, but not quite ready to invest some of your hard-earned money? With SageMaker Studio Lab, AWS announced a free platform that lets users start exploring AI/ML tools without having to register for an AWS account or leave credit card details behind. You can try it yourself for free in your browser through Jupyter notebooks!

Additionally, AWS announced SageMaker Canvas: a visual, no-code machine learning capability for business analysts. This allows them to get started with ML without extensive experience and to get more insights from data. The third chapter in the SageMaker saga consists of SageMaker Ground Truth Plus. With this new service, you hire a team of experts to train and label your data, a traditionally very labor-intensive process. According to Amazon, customers can expect to save up to 40% through SageMaker Ground Truth Plus. There were two more minor announcements: the AI & ML Scholarship Program, a free program for students to get to know ML tools, and the Lex Automated Chatbot Designer, which lets you quickly develop a smart chatbot with advanced natural language processing support.

Networking for everyone

Tired of less-than-optimal reception or a slow connection? Why not build your own private 5G network? Yep: with AWS Private 5G, Amazon delivers the hardware, management and SIM cards for you to set up your very own 5G network.
Use cases (besides being fed up with your current cellular network) include warehouses or large sites (e.g. a football stadium) that require low latency, excellent coverage and a large bandwidth. The best part? Customers only pay for the end users' usage of the network.

Continuing the network theme, there's now AWS Cloud WAN. This service allows users to build a managed WAN (Wide Area Network) to connect cloud and on-premise environments, with a central management UI on the network component level as well as the service level. Lastly, there's also AWS WorkSpaces Web. Through this service, customers can grant employees safe access to internal websites and SaaS applications. The big advantage here is that information critical to the company never leaves the environment and doesn't leave any traces on workstations, thanks to a non-persistent web browser.

Kubernetes, anyone?

No AWS event goes without mentioning Kubernetes, and AWS re:Invent 2021 is no different. Amazon announced two new services in the Kubernetes space: AWS Karpenter and AWS Marketplace for Containers Anywhere. With Karpenter, managing autoscaling Kubernetes infrastructure becomes both simpler and less restrictive. It takes care of automatically starting compute when the load of an application changes. Interestingly, Karpenter is fully open-source, a trend we'll see more and more of according to Amazon. AWS Marketplace for Containers Anywhere is primarily useful for customers who've already fully committed to container-managed platforms. It allows users to search, subscribe and deploy third-party Kubernetes apps from the AWS Marketplace in any Kubernetes cluster, no matter the environment.

IoT updates

There have been numerous smaller updates to AWS's IoT services, most notably:

- GreenGrass SSM, which now allows you to securely manage your devices using AWS Systems Manager
- Amazon Monitron, to predict when maintenance is required for rotating parts in machines
- AWS IoT TwinMaker, to simply make digital twins of real-world systems
- AWS IoT FleetWise, which helps users collect vehicle data in the cloud in near real time

Upping the serverless game

In the serverless landscape, AWS announced serverless Redshift, EMR, MSK and Kinesis. This enables you to set up these services while the right instance type is automatically linked. If a service is not in use, the instance automatically stops. This way, customers only pay for when a service is actually being used. This is particularly interesting for experimental services and integrations in environments that do not get used very often.

Sustainability

Just like ACA Group's commitment to sustainability, AWS is serious about its ambition towards net-zero carbon by 2040. They've developed the AWS Customer Carbon Footprint tool, which lets users calculate carbon emissions through their website. Other announcements included AWS Mainframe Modernization, a collection of tools and guides to take over existing mainframes with AWS, and a sustainability addition to the AWS Well-Architected Framework, a set of design principles, guidelines, best practices and improvements to validate sustainability goals and create reports.

We can't wait to start experimenting with all the new additions and improvements announced at AWS re:Invent 2021. Thanks for reading! Discover our cloud hosting services

Read more
wind mills carbon footprint
Reading time 3 min
6 MAY 2025

The world is rapidly changing, both from a technological and an environmental point of view. Often, these challenges go hand in hand, for example through the push towards electric vehicles, smart homes and sustainable energy. But while there has been a longstanding focus on the automotive, manufacturing and agricultural industries, there is no pathway to a cleaner environment without addressing the sizable energy consumption of data centers and cloud computing.

The carbon footprint of cloud computing

According to the International Energy Agency's (IEA) latest report, data centers around the world used 220 to 320 TWh of electricity in 2021, around 0.9 to 1.3% of global electricity demand. In addition, global data transmission networks consumed 260 to 340 TWh, or 1.1 to 1.4% of electricity. Combined, data centers and transmission networks contribute 0.9% of energy-related emissions. While these may seem like fairly low numbers, the demand for data services is rising exponentially. Global internet traffic surged over the past decade, an evolution that accelerated during the pandemic. Since 2010, the number of internet users across the world has more than doubled and global internet traffic has increased 15-fold, or 30% per year. This means that the carbon footprint of cloud computing is something all companies, large or small, must consider. But what can you do without sacrificing the computing power needed to support innovation and deliver goods and services as promised?

Amazon Web Services (AWS)

While cloud computing also comes with a footprint, it offers a much more eco-friendly way to operate your IT systems than local servers. That's why we believe a cloud-first approach is key to making your business more sustainable, especially when cloud-based technologies are powered by renewable energy. That's why ACA Group carefully chooses its partnerships and evaluates the environmental impact of those partners. In this context, we have selected AWS as a cloud provider. Combined with our flexible Kubernetes setups, it allows us to choose the least amount of carbon emissions while still meeting (and even exceeding) the expectations of our customers. It shows that cloud computing needs do not have to come at the planet's expense.

But why AWS?

As the world's most prominent cloud provider, Amazon Web Services is focused on efficiency and continuous innovation across its global infrastructure. In fact, they are well on their way to powering their operations with 100% renewable energy by 2025. Amazon recently became the world's largest corporate purchaser of renewable energy; their investments supply enough electricity to power 3 million US households for a year.

Efficient computing

Creating clean energy sources is essential, but no less important is rethinking how computing resources are allocated. In a cloud efficiency report, 451 Research showed that AWS's infrastructure is 3.6 times more energy efficient than the median of the U.S. enterprise data centers they surveyed. Amazon attributes this greater efficiency to, among other things, removing the central uninterruptible power supply from their data center design and integrating small battery packs and custom power supplies into the server racks. These changes combined reduce energy conversion loss by about 35%. The servers themselves are more efficient as well: their Graviton2 CPUs are extremely power-efficient and offer better performance per watt than any other processor currently in use in Amazon data centers.
AWS offers unlimited access to cloud computing and services. While this comes at a price, efficient use of resources not only reduces costs, but also indirectly reduces carbon emissions. How can you achieve this?

- Build applications that are resource-efficient.
- Consume resources with the lowest possible footprint.
- Maximize the output of the resources you use.
- Reduce the amount of data and the distance it travels across the network.
- Use resources just-in-time.

➡️ Curious how we at ACA Group set up our cloud stacks for maximum sustainability without giving up power, availability and flexibility? Talk to us here!

Read more
Navigating the Cloud-Native Landscape with Harbor Registry
Reading time 8 min
23 APR 2023

In the fast-moving world of IT, ACA Group constantly takes the time to explore innovative solutions and tools to provide the best services to our customers. Recently, we shared our experience with Flux, a cloud-native continuous deployment tool that implements GitOps.

What is Harbor?

Harbor is a cloud-native tool designed to leverage the flexibility, scalability and resilience of the cloud. It is a containerized solution that provides advanced features such as vulnerability scanning and an artifact registry. Harbor can run on any Kubernetes-based solution in a public or private cloud, but also on your local Kubernetes cluster. It is a self-managed solution that needs to be deployed on your Kubernetes cluster. You can choose the components to deploy based on your needs. For some features, for example vulnerability scanning, a selection between different tools can be made.

The benefits of Harbor Registry

We are excited to use Harbor Registry for our projects in the coming months because of the following benefits:

- Harbor is easy to scale: all components can be set up with multiple replicas, preventing unexpected downtime and providing fail-over. This ensures that your container images are always available when you need them.
- Harbor is composable: it allows you to deploy only the workloads for the features you use.
- Harbor is not only a container registry: it can, for example, also be used as a ChartMuseum to store Helm charts.
- Harbor is multi-tenant: project-specific configuration is possible, with specific quotas and policies.

It also has the most complete set of features of any container registry we have ever worked with, such as:

- Connection with OpenID Connect.
- Audit logging for container pull and push actions.
- Policies for container images, like tag retention and tag immutability.
- Replication to other registries (for example ECR, ACR, …).
- Webhooks that can be triggered when specific actions occur.
- A project can serve as a cache proxy for public images.

Additionally, Harbor has a well-documented API, and we use a Terraform provider to set up resources like projects and users within Harbor using Terraform code. As you can read, Harbor has a lot to offer. ;-) In the next section, we take a deeper look into some of the features we haven't covered yet, but that are definitely worth mentioning.

Container Image Vulnerability Scanning

One of the most interesting features is the built-in vulnerability scanning. Harbor has a built-in vulnerability scanning tool that automatically scans container images for vulnerabilities when they are pushed to the registry. A job is scheduled when a container image is pushed, and once the result is available, it is visible next to the container image. You can also get more details on the specific vulnerabilities that are found. For us, having these insights into the quality of the containers is a huge improvement compared to the container registry we are currently using. With some additional configuration, we can make the vulnerability scanning experience even better. Some examples:

- Block pulling container images that have critical CVE vulnerabilities.
- Schedule frequent scans of images already stored in the Harbor registry.
- Allow specific CVEs that can't be fixed at the moment.
- Set up a webhook to take action when a CVE is detected.

Harbor uses Trivy as its default vulnerability scanning tool, but it's easy to switch to another tool by installing it on your cluster and registering it in the Harbor interface.
With these simple steps, you can take container security to the next level.

Container Image Signing

Harbor also offers a container image signing feature that allows users to verify the trust of container images. When Notary or Cosign is used to secure container images, Harbor can validate their signatures, ensuring that the images have not been tampered with by any unauthorized sources other than your build tools. The container image signing feature is signified by a green check mark in Harbor's interface, indicating that the image has been correctly signed. While this blog post won't cover how this feature works, you can find detailed documentation on the process via this link.

Robot Accounts

In addition to regular users, Harbor also allows for the creation of robot accounts. These system users are not associated with personal accounts and are often utilized by scripts and processes to authenticate with the Harbor registry. For instance, when building a container image, scripts may use a robot account to push the container image to the Harbor registry. To increase security, it's possible to set an expiration time for robot users. Moreover, access rights can be restricted to a particular project, and even the level of permissions within that project can be customized. The audit log records all activities performed by a robot account, just like for regular users.

What does the Harbor registry setup look like?

We run Harbor registry on EKS, the Kubernetes service provided by AWS. Since we are running on AWS, we can use some of the AWS services to provide some of the dependencies. We have three layers of configuration in this setup:

1. AWS resources.
2. Kubernetes resources.
3. Resources within Harbor registry.

⬇️ In the next sections, we will take a deeper dive into these layers of configuration.

1. Setting up the AWS resources

Within the ACA Group, we try to manage all our infrastructure as code. We use Terraform to set up all the dependencies for our Harbor registry:

- Route53 to provide DNS.
- RDS to provide a multi-AZ Postgres database.
- ElastiCache to provide a highly available Redis cluster for session management.
- EFS to provide multi-zone shared storage for our containers.
- EKS to provide the Kubernetes cluster master layer.
- Node groups deployed over multiple availability zones that serve as compute capacity for our Kubernetes cluster.

Once these resources are available, we can generate the Kubernetes configuration and deploy it to our Kubernetes cluster.

2. Setting up the Kubernetes resources

We use a Helm chart to generate the Kubernetes configuration files that are required to set up Harbor. These are YAML files that are stored in Git repositories. Ultimately, Flux deploys these YAML files to the Kubernetes cluster. ℹ️ You can read about deployments with Flux in another blog post here. As a result, the following workloads are created on your Kubernetes cluster:

- Harbor Core.
- Harbor Portal, which serves the UI.
- Harbor Registry, which manages the container registry.
- Harbor Jobservice, which schedules background jobs.
- Harbor Trivy, for CVE / vulnerability scanning.
- Notary resources, for image signing.

Additionally, various Kubernetes objects are generated, including the Ingress, which exposes the user interface (UI) on your specified URL. If you want more details, you can directly install Harbor on your local Kubernetes cluster by running the Helm install command:

helm install my-release harbor/harbor
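The helm install command above assumes the Harbor chart repository is already known to your Helm client. As a minimal sketch of the full sequence (the release and namespace names are just examples), you could run:

# Register the public Harbor chart repository and refresh the index
helm repo add harbor https://helm.goharbor.io
helm repo update

# Install Harbor into a dedicated namespace
helm install my-release harbor/harbor --namespace harbor --create-namespace

# Watch the core, portal, registry, jobservice and trivy pods come up
kubectl get pods --namespace harbor --watch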
3. Setting up resources within Harbor

Now that we have the Harbor registry up and running, we can efficiently create various resources within it, such as projects, robot accounts and retention policies. Once again, we want to manage these resources in code instead of creating them via the UI. This not only helps us track changes effectively, but also prevents potential misconfigurations through pull request mechanisms. With its comprehensive and well-documented API, Harbor allows many tools to develop custom add-ons. Leveraging our expertise in Terraform code, we prefer the Terraform Harbor provider to efficiently manage the resources within the Harbor registry. The following example creates a project within Harbor:

resource "harbor_project" "myproject" {
  name                   = "myproject"
  public                 = false
  vulnerability_scanning = true
  enable_content_trust   = true
  deployment_security    = ""
}

Using the Harbor registry

After deploying the Harbor registry and creating projects, it becomes a functional container registry that operates similarly to any other container registry. To push container images to the Harbor registry, conventional build jobs can be used. However, the build job requires authentication credentials, usually from a robot account. Following that, you have to update the configuration of your Jenkins, Tekton, Bitbucket Pipelines, GitHub Actions or similar job to specify the correct project and the Harbor URL, such as registry.example.be/myproject. The push command can also be found in the Harbor interface. After pushing the container image to the Harbor registry, it can be pulled from other locations. To pull an image to your local machine using Docker, you can use the following commands:

docker login registry.example.be
docker pull registry.example.be/myproject/image:version

To utilize the container image in a Kubernetes environment, start by creating a Secret with the "docker-registry" type that includes the necessary credentials for deploying the container image. As a Secret is specific to a namespace, you need to run this command for each namespace that uses a container from the Harbor registry.

kubectl -n NAMESPACE create secret docker-registry registry.example.be --docker-server=registry.example.be --docker-username='firstname.lastname' --docker-password='mysupersecurepassword' --docker-email=me@company.be

Now you can point to the container image within your Deployment, StatefulSet, Job, … The imagePullSecrets section points to the Secret created in the step above.

image: registry.example.be/myproject/image:version
…
imagePullSecrets:
  - name: registry.example.be

Conclusion

This blog post provided an overview of the numerous advantages and features of the Harbor registry. We also shared our approach to setting up and utilizing the container registry. At ACA, we use Harbor as the container registry for one of our most important projects and are currently in the process of adopting it as our default registry for new projects. Once it has been set up, we will create a plan to migrate additional active projects. Our goal is to enhance stability, availability and security for our clients. If you would like to learn more about Harbor registry, feel free to contact us!
Talk to us here!

Read more
How to secure your cloud with AWS Config
Reading time 6 min
26 FEB 2020

AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. This can be used for:

- Security: validate security best practices on your AWS account.
- Compliance: report on deviations in the configuration of AWS resources based on best practices or architectural principles and guidelines.
- Efficiency: report on lost or unused resources in your AWS account.

In this blog post, I'd like to detail how to monitor your cloud resources with this tool. This first part discusses AWS Config account setup, enabling notifications when resources are not compliant, and deployment.

Why use AWS Config?

AWS is the main cloud platform we use at ACA. We manage multiple accounts in AWS to host all sorts of applications for ourselves and for our customers. Over the years, we set up more and more projects in AWS. This led to a lot of accounts being created, which in turn use a lot of cloud resources. Naturally, this means that keeping track of all these resources becomes increasingly challenging as well. AWS Config helped us deal with this challenge. We use it to inventorize and monitor all the resources in our entire AWS organization. It also allows us to set compliance rules for resources that need to conform in every account. For example: an Elastic IP should not be left unused, or an EC2 security group should not allow all incoming traffic without restrictions. This way, we're able to create a standard for all our AWS accounts. Having AWS Config enabled in our organization gives us a couple of advantages:

- We always have an up-to-date inventory of all the resources in our accounts.
- It allows us to inspect the change history of all our resources 24/7.
- It gives us the possibility to create organization rules and continuously check if our resources are compliant. If that's not the case, we instantly get a notification.

Setting up AWS Config for a single account

In this first part of my AWS Config blog, I want to show how to set up AWS Config in a single account. In a future blog post, I'll explain more about how you can do this for an entire AWS organization. The image below shows an overview of the setup in a single account, containing the AWS Config recorder, the AWS Config rules, and the S3 bucket.

The AWS Config recorder is the main component of the setup. You can turn on the default recorder in the AWS console. By default, it will record all resource types. You can find more information about all the available resource types on this page. When you start recording, all the AWS resources are stored in the S3 bucket as configuration items. Recording these configuration items is not free: at the time of writing, it costs $0.003 per recorded configuration item. This cost is incurred when the configuration item is first recorded or when something changes to it or to one of its relationships. In the settings of the AWS Config recorder, you can also specify how long these configuration items should be stored in the S3 bucket.

The AWS Config rules are the most important part of your setup. These rules can be used as compliance checks to make sure the resources in your account are configured as intended. It's possible to create custom rules or choose from a large set of AWS managed rules. In our setup at ACA, we chose to only use AWS managed rules, since they fitted all our needs. In the image below, you can see one of the rules we deployed.
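Once the recorder and a few rules are active, you can also check compliance outside the console. As a hedged sketch, the AWS CLI lets you list every rule that currently has non-compliant resources (the rule name in the second command is just an example):

# List all Config rules that currently report non-compliant resources
aws configservice describe-compliance-by-config-rule \
  --compliance-types NON_COMPLIANT

# Show which resources violate one specific rule
aws configservice get-compliance-details-by-config-rule \
  --config-rule-name eip-attached \
  --compliance-types NON_COMPLIANT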
Just like recording configuration items, running rule evaluations costs money. At the moment of writing, this is $0.001 for the first 100,000 rule evaluations per region, $0.0008 from 100,000 to 500,000 and $0.0005 after that. There are a lot of rules available with different benefits for your AWS account. These are some of the AWS managed rules we configured:

Rules that improve security

- AccessKeysRotated: checks if the access keys of an IAM user are rotated within a specified number of days.
- IamRootAccessKeyCheck: checks if a root account has access keys assigned to it, which isn't recommended.
- S3BucketServerSideEncryptionEnabled: checks if default encryption for an S3 bucket is enabled.

Rules that detect unused resources (cost reduction)

- Ec2VolumeInuseCheck: checks if an EBS volume is being used.
- EipAttached: checks if an Elastic IP is being used.

Rules that detect resource optimizations

- VpcVpn2TunnelsUp: checks if a VPN connection has two tunnels available.

Setting up notifications when resources are not compliant

AWS Config rules check configuration items. If a configuration item doesn't pass the rule requirements, it is marked as 'non-compliant'. Whenever this happens, you want to be notified so you can take the appropriate actions to fix it. In the image below, you can see the way we implemented the notifications for our AWS Config rules. To start with notifications, CloudTrail should be enabled and there should be a trail that logs all activity in the account. Now CloudWatch is able to pick up the CloudTrail events. In our setup, we created 5 CloudWatch event rules that send notifications according to priority. This makes it possible for us to decide what the priority level of the alert for each AWS Config rule should be. The image below shows an example of this. In the 'Targets' section, you can see the SNS topic which receives the messages of the CloudWatch event rule. Opsgenie has a separate subscription for each of the SNS topics (P1, P2, P3, P4, P5). This way, we receive notifications when compliance changes happen and also see the severity by looking at the priority level of our Opsgenie alert.
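For readers who prefer the CLI over the console, here is a hedged sketch of what one of those event rules could look like (the rule name, region, account ID and topic ARN are illustrative, not our production values):

# Match compliance-change events emitted by AWS Config
aws events put-rule \
  --name config-compliance-p1 \
  --event-pattern '{"source":["aws.config"],"detail-type":["Config Rules Compliance Change"]}'

# Send matching events to the SNS topic that Opsgenie subscribes to
aws events put-targets \
  --rule config-compliance-p1 \
  --targets 'Id=sns-p1,Arn=arn:aws:sns:eu-west-1:123456789012:config-alerts-p1'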
Deploying your AWS Config

At ACA, we always try to manage our AWS infrastructure with Terraform. This is no different for AWS Config. This is our deployment workflow:

- We manage everything AWS Config related in Terraform. Below is an example of one of the AWS Config rules in Terraform, in which the rule_identifier attribute value can be found in the documentation of the AWS Config managed rules:

```hcl
resource "aws_config_config_rule" "mfa_enabled_for_iam_console_access" {
  name                        = "MfaEnabledForIamConsoleAccess"
  description                 = "Checks whether AWS Multi-Factor Authentication (MFA) is enabled for all AWS Identity and Access Management (IAM) users that use a console password. The rule is compliant if MFA is enabled."
  rule_identifier             = "MFA_ENABLED_FOR_IAM_CONSOLE_ACCESS"
  maximum_execution_frequency = "One_Hour"
  excluded_accounts           = "${var.aws_config_organization_rules_excluded_accounts}"
}
```

- The Terraform code is version controlled with Git.
- When the code needs to be deployed, Jenkins checks out the Git repository and deploys it to AWS with Terraform.

Takeaway

With AWS Config, we're able to get more insight into our AWS cloud resources. AWS Config improves our security, avoids keeping resources around that are not being used, and makes sure our resources are configured in an optimal way. Besides these advantages, it also provides us with an inventory of all our resources and their configuration history, which we can inspect at any time.

This concludes this blog post on the AWS Config topic. In a future part, I want to explain how to set it up for an entire AWS organization. If you found this topic interesting, have a question, or would like to know more about our AWS Config setup, please reach out to us at cloud@aca-it.be.

Read more
How to monitor your Kubernetes cluster with Datadog
Reading time 9 min
16 JAN 2019

Over the past few months, Kubernetes has become a more mature product and setting up a cluster has become a lot easier. Especially with the official release of Amazon Elastic Container Service for Kubernetes (EKS) on Amazon Web Services, another major cloud provider is able to provide a Kubernetes cluster with a few clicks. While the complexity of creating a Kubernetes cluster has decreased drastically, there are still some challenging tasks when setting up the resources within the cluster. The biggest challenge for us has always been providing reliable monitoring and logging for the components within the cluster. Since we've migrated to Datadog, things have changed for the better. In this blog post, we'll teach you how to monitor your Kubernetes cluster with Datadog.

Setting up Datadog monitoring and logging

For this blog post, we'll assume you have an active Kubernetes setup and kubectl configured. Our cloud services team prefers the following Kubernetes setup:

- Amazon Web Services (AWS) as the cloud provider
- Amazon Elastic Container Service for Kubernetes (EKS), which offers managed Kubernetes
- Terraform to automate the process of creating the required resources within the AWS account: VPC and networking requirements, the EKS cluster, and the Kubernetes worker nodes
- Datadog for monitoring and log collection, and OpsGenie for alert and incident management

Of course, you're free to choose your own tools. One requirement, however, is that you must use Datadog (else this whole blog post won't make a lot of sense). If you're new to Datadog, you need to create a Datadog account. You can try it out for 14 days for free by clicking here and pressing the "Get started" button. Complete the form and log in to your newly created organization. Time to add some hosts!

Kubernetes DaemonSet for creating Datadog agents

A Kubernetes DaemonSet makes sure that a Docker container running the Datadog agent is created on every worker node (host) that has joined the Kubernetes cluster. This way, you can monitor the resources for all active worker nodes within the cluster.
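The manifest below assumes that the tools namespace and the datadog-agent ServiceAccount already exist. A minimal sketch to create them (note that the agent also needs RBAC permissions to collect pod and event data; Datadog's documentation provides ready-made ClusterRole manifests for this):

```sh
# Create the namespace and ServiceAccount referenced by the DaemonSet
kubectl create namespace tools
kubectl create serviceaccount datadog-agent --namespace tools
```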
The YAML file specifies the configuration for all Datadog components we want to enable: the Datadog Process Agent, the Datadog Log Agent, and Datadog JMX. If you wonder what the file looks like, this is it:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog-agent
  namespace: tools
  labels:
    k8s-app: datadog-agent
spec:
  selector:
    matchLabels:
      name: datadog-agent
  template:
    metadata:
      labels:
        name: datadog-agent
    spec:
      #tolerations:
      #- key: node-role.kubernetes.io/master
      #  operator: Exists
      #  effect: NoSchedule
      serviceAccountName: datadog-agent
      containers:
      - image: datadog/agent:latest-jmx
        imagePullPolicy: Always
        name: datadog-agent
        ports:
        - containerPort: 8125
          # hostPort: 8125
          name: dogstatsdport
          protocol: UDP
        - containerPort: 8126
          # hostPort: 8126
          name: traceport
          protocol: TCP
        env:
        - name: DD_API_KEY
          valueFrom:
            secretKeyRef:
              name: datadog
              key: DATADOG_API_KEY
        - name: DD_COLLECT_KUBERNETES_EVENTS
          value: "true"
        - name: DD_LEADER_ELECTION
          value: "true"
        - name: KUBERNETES
          value: "yes"
        - name: DD_PROCESS_AGENT_ENABLED
          value: "true"
        - name: DD_LOGS_ENABLED
          value: "true"
        - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
          value: "true"
        - name: SD_BACKEND
          value: "docker"
        - name: SD_JMX_ENABLE
          value: "yes"
        - name: DD_KUBERNETES_KUBELET_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        resources:
          requests:
            memory: "400Mi"
            cpu: "200m"
          limits:
            memory: "400Mi"
            cpu: "200m"
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
        - name: procdir
          mountPath: /host/proc
          readOnly: true
        - name: sys-fs
          mountPath: /host/sys
          readOnly: true
        - name: root-fs
          mountPath: /rootfs
          readOnly: true
        - name: cgroups
          mountPath: /host/sys/fs/cgroup
          readOnly: true
        - name: pointerdir
          mountPath: /opt/datadog-agent/run
        - name: dd-agent-config
          mountPath: /conf.d
        - name: datadog-yaml
          mountPath: /etc/datadog-agent/datadog.yaml
          subPath: datadog.yaml
        livenessProbe:
          exec:
            command:
            - ./probe.sh
          initialDelaySeconds: 60
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
          timeoutSeconds: 3
      volumes:
      - hostPath:
          path: /var/run/docker.sock
        name: dockersocket
      - hostPath:
          path: /proc
        name: procdir
      - hostPath:
          path: /sys/fs/cgroup
        name: cgroups
      - hostPath:
          path: /opt/datadog-agent/run
        name: pointerdir
      - name: sys-fs
        hostPath:
          path: /sys
      - name: root-fs
        hostPath:
          path: /
      # backing volume for the /conf.d mount above
      - name: dd-agent-config
        configMap:
          name: dd-agent-config
      - name: datadog-yaml
        configMap:
          name: dd-agent-config
          items:
          - key: datadog-yaml
            path: datadog.yaml
```

As a whole, the file looks a bit overwhelming, so let's zoom in on some aspects.

```yaml
#tolerations:
#- key: node-role.kubernetes.io/master
#  operator: Exists
#  effect: NoSchedule
```

Since we use EKS, the master plane is maintained by AWS. Therefore, we don't want any Datadog agent pods to run on the master nodes. Uncomment this if you do want to monitor your master nodes, for example when you are running Kops.

```yaml
containers:
- image: datadog/agent:latest-jmx
  imagePullPolicy: Always
  name: datadog-agent
```

We use the JMX-enabled version of the Datadog agent image, which is required for the Kafka and Zookeeper integrations. If you don't need JMX, you should use datadog/agent:latest, as this image is less resource-intensive. We specify imagePullPolicy: Always so we are sure that on startup, the image labelled "latest" is pulled again. Otherwise, when a new "latest" release becomes available, it won't get pulled because an image tagged "latest" is already present on the node.

```yaml
env:
- name: DD_API_KEY
  valueFrom:
    secretKeyRef:
      name: datadog
      key: DATADOG_API_KEY
```

We use SealedSecrets to store the Datadog API key. This sets the environment variable to the value of the Secret. If you don't know how to get an API key from Datadog, you can do that here: enter a useful name and press the "Create API key" button.
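If you don't use SealedSecrets, a plain Kubernetes Secret works too. A minimal sketch (the placeholder value is obviously yours to fill in):

```sh
# Create the 'datadog' Secret the DaemonSet references, in the tools namespace
kubectl create secret generic datadog \
  --namespace tools \
  --from-literal=DATADOG_API_KEY=<your-datadog-api-key>
```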
```yaml
- name: DD_LOGS_ENABLED
  value: "true"
```

This ensures the Datadog logs agent is enabled.

```yaml
- name: SD_BACKEND
  value: "docker"
- name: SD_JMX_ENABLE
  value: "yes"
```

This enables autodiscovery and JMX, which we need for our Zookeeper and Kafka integrations to work, as they use JMX to collect data. For more information on autodiscovery, you can read the Datadog docs here.

```yaml
resources:
  requests:
    memory: "400Mi"
    cpu: "200m"
  limits:
    memory: "400Mi"
    cpu: "200m"
```

After enabling JMX, the memory usage of the container increases drastically. If you are not using the JMX version of the image, half of these limits should be fine.

```yaml
- name: datadog-yaml
  mountPath: /etc/datadog-agent/datadog.yaml
  subPath: datadog.yaml
…
- name: datadog-yaml
  configMap:
    name: dd-agent-config
    items:
    - key: datadog-yaml
      path: datadog.yaml
```

To add some custom configuration, we need to override the default datadog.yaml configuration file. The ConfigMap has the following content:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: datadogtoken
  namespace: tools
data:
  event.tokenKey: "0"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dd-agent-config
  namespace: tools
data:
  datadog-yaml: |-
    check_runners: 1
    listeners:
      - name: kubelet
    config_providers:
      - name: kubelet
        polling: true
    tags: tst, kubelet, kubernetes, worker, env:tst, environment:tst, application:kubernetes, location:aws
```

The first ConfigMap, called datadogtoken, is required to keep a persistent state when a new leader is elected. The content of the dd-agent-config ConfigMap is used to create the datadog.yaml configuration file. We specify and add some extra tags to the resources collected by the agent, which is useful for creating filters later on.

```yaml
livenessProbe:
  exec:
    command:
    - ./probe.sh
  initialDelaySeconds: 60
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
  timeoutSeconds: 3
```

When running a Kubernetes cluster with a lot of nodes, we've seen containers get stuck in a CrashLoopBackOff status. It's therefore a good idea to do a more advanced health check to verify that your containers have actually booted. Make sure the health checks only start polling after 60 seconds, which seems to be the best value.

Once you have gathered all required configuration in your ConfigMap and DaemonSet files, you can create the resources using the Kubernetes CLI:

```sh
kubectl create -f ConfigMap.yaml
kubectl create -f DaemonSet.yaml
```

After a few seconds, you should start seeing logs and metrics in the Datadog GUI.

Taking a look at the collected data

Datadog has a range of powerful monitoring features. The host map gives you a visualization of your nodes over the AWS availability zones. The colours in the map represent the relative CPU utilization for each node: green displays a low level of CPU utilization, orange a busier CPU. Each node is visible in the infrastructure list, and selecting one of the nodes reveals its details. You can monitor containers in the container view and see more details (e.g. graphs which visualize a trend) by selecting a specific container. Last but not least, processes can be monitored separately from the process list, with trends visible for every process. These fine-grained viewing levels make it easy to quickly pinpoint problems and generally lead to faster response times.

All data is available to create beautiful dashboards and good monitors to alert on failures. The creation of these monitors can be scripted, making it fairly easy to set up additional accounts and setups. Easy to see why Datadog is indispensable in our solutions… 😉
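For example, with the Datadog Terraform provider, a scripted monitor could look like the sketch below; the monitor name, query, and thresholds are illustrative, not our production values:

```hcl
# Sketch: a scripted Datadog monitor via the Datadog Terraform provider.
# Name, query, and thresholds are illustrative values.
resource "datadog_monitor" "node_cpu_high" {
  name    = "High CPU on Kubernetes node {{host.name}}"
  type    = "metric alert"
  message = "CPU is high on {{host.name}}. @opsgenie"
  query   = "avg(last_5m):avg:system.cpu.user{*} by {host} > 90"

  monitor_thresholds {
    critical = 90
    warning  = 80
  }
}
```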
Logging with Datadog Logs

Datadog Logs is a little bit less mature than the monitoring part, but it's still one of our favourite logging solutions. It's relatively cheap, and the same agent can be used for both monitoring and logging. Monitors – which are used to trigger alerts – can be created from the log data, and log data can also be visualized in dashboards. You can see the logs by navigating here and filtering them by container, namespace or pod name. It's also possible to filter your logs by label, which you can add to your Deployment, StatefulSet, etc.

Setting up additional Datadog integrations

As you've noticed, Datadog already provides a lot of data by default. However, extra metric collection and dashboards can easily be added through integrations. Datadog claims to have more than 200 integrations you can enable. Here's a list of integrations we usually enable on our clusters:

- AWS
- Docker
- Kubernetes
- Kafka
- Zookeeper
- ElasticSearch
- OpsGenie

Installing integrations is usually a very straightforward process. Some of them can be enabled with one click, others require some extra configuration. Let's take a deeper look at setting up some of the above integrations.

AWS Integration Setup

This integration should be configured on both the Datadog and the AWS side. First, in AWS, we need to create an IAM policy and an assume-role policy to allow access from Datadog to our AWS account.

```json
{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::464622532012:root"
  },
  "Action": "sts:AssumeRole",
  "Condition": {
    "StringEquals": {
      "sts:ExternalId": "${var.datadog_aws_external_id}"
    }
  }
}
```

The content for the IAM policy can be found here. Attach both policies to an IAM role called DatadogAWSIntegrationRole. Go to your Datadog account settings and press the "+Available" button under the AWS integration. Go to the configuration tab and replace the variable ${var.datadog_aws_external_id} in the policy above with the value of AWS External ID. Add the AWS account number, and for the role, use the DatadogAWSIntegrationRole created above. Optionally, you can add tags, which will be added to all metrics gathered by this integration. On the left, limit the selection to the AWS services you use. Lastly, save the integration, and your AWS integration (and the integrations for the enabled AWS services) will be shown under "Installed".
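Since we manage AWS with Terraform anyway, the role and its policies can be kept in code as well. A hedged sketch, following the assume-role policy above (the permissions policy file path is a placeholder for the Datadog-provided policy document):

```hcl
# Sketch: IAM role that Datadog assumes for the AWS integration.
resource "aws_iam_role" "datadog_integration" {
  name = "DatadogAWSIntegrationRole"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::464622532012:root" }
      Action    = "sts:AssumeRole"
      Condition = {
        StringEquals = { "sts:ExternalId" = var.datadog_aws_external_id }
      }
    }]
  })
}

# Attach the Datadog-provided permissions policy (content linked above)
resource "aws_iam_role_policy" "datadog_integration" {
  name   = "DatadogAWSIntegrationPolicy"
  role   = aws_iam_role.datadog_integration.id
  policy = file("datadog-aws-integration-policy.json") # placeholder path
}
```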
Integration in action

When you go to your dashboard list, you'll see some new interesting dashboards with new metrics you can use to create new monitors, such as:

- Database (RDS) memory usage, load, CPU, disk usage, connections
- Number of available VPN tunnels for a VPN connection
- Number of healthy hosts behind a load balancer
- ...

Docker Integration

Enabling the Docker integration is as easy as pressing the "+Available" button. A "Docker – Overview" dashboard is available as soon as you enable the integration.

Kubernetes Integration

Just like the Docker integration above, enabling the Kubernetes integration is as easy as pressing the "+Available" button, with a "Kubernetes – Overview" dashboard available as soon as you enable the integration. If you want all data for this integration, you should make sure kube-state-metrics is running within your Kubernetes cluster. More information here.

🚀 Takeaway

The goal of this article was to show you how Datadog can become one of the most indispensable tools in your monitoring and logging infrastructure. Setup is pretty easy, and a wealth of information can be collected and visualized effectively. If you create a good set of monitors so that Datadog alerts on degradation or increased error rates, most incidents can be solved before they become actual problems. You can script the creation of these monitors using the Datadog API, drastically reducing the setup time of your monitoring and alerting framework. Do you want more information, or could you use some help setting up your own EKS cluster with Datadog monitoring? Don't hesitate to contact us!

Read more