AWS Fargate is a serverless compute engine for containers running on Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate automatically provisions the compute capacity your container clusters need, removing the need to configure and manage instances manually.
As with all cloud services, monitoring is crucial to keep them running smoothly. So in this article, we will go over the vital metrics you should keep an eye on when running a cluster with Fargate.
With EKS, a managed Kubernetes hosting service, AWS manages the control plane, ensuring availability and scaling according to the demands of your workload. Worker nodes can be run through “managed node groups,” or you can configure EKS to schedule your pods on Fargate instead.
For ECS, a container orchestrator created by AWS, you usually tell ECS how many EC2 instances of which type it should use. However, Fargate automates this process, freeing you from handling a fleet of nodes.
Fargate capacity is billed on demand by default, based on the vCPU and memory your workloads request. This can make Fargate more expensive per hour than instances you manage yourself, but since it removes the management overhead and scales capacity up and down automatically, it can still lead to cost savings.
This section will explore the most critical metrics you need to keep an eye on when running your clusters with Fargate.
To ensure the health and security of your Fargate-backed cluster, it’s essential to monitor the state of your ECS or EKS cluster. By monitoring the number of running services, containers, and deployments, as well as the status of your ECS tasks, you can quickly identify potential issues and optimize cluster performance.
Also, tracking your cluster’s state can help you identify any potential security risks or compliance issues, for example, when you have unknown workloads running.
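As a minimal sketch of the check described above, the following Python helper counts tasks by status and flags tasks whose task definition family is not on an allowlist. It assumes task records shaped like the entries returned by the ECS DescribeTasks API (`taskArn`, `taskDefinitionArn`, `lastStatus`); the sample data, the account ID, and the allowlist are invented for illustration.

```python
from collections import Counter


def summarize_tasks(tasks, allowed_families):
    """Count tasks by status and list tasks from unexpected task definition families."""
    status_counts = Counter(task["lastStatus"] for task in tasks)
    unknown = []
    for task in tasks:
        # A task definition ARN ends in "task-definition/<family>:<revision>".
        family = task["taskDefinitionArn"].rsplit("/", 1)[-1].split(":")[0]
        if family not in allowed_families:
            unknown.append(task["taskArn"])
    return dict(status_counts), unknown


# Hypothetical sample data; in practice this would come from DescribeTasks.
PREFIX = "arn:aws:ecs:us-east-1:111122223333"
sample_tasks = [
    {"taskArn": f"{PREFIX}:task/demo/aaa",
     "taskDefinitionArn": f"{PREFIX}:task-definition/web:7",
     "lastStatus": "RUNNING"},
    {"taskArn": f"{PREFIX}:task/demo/bbb",
     "taskDefinitionArn": f"{PREFIX}:task-definition/worker:3",
     "lastStatus": "PENDING"},
    {"taskArn": f"{PREFIX}:task/demo/ccc",
     "taskDefinitionArn": f"{PREFIX}:task-definition/miner:1",
     "lastStatus": "RUNNING"},
]

counts, unknown = summarize_tasks(sample_tasks, allowed_families={"web", "worker"})
print(counts)
print(unknown)
```

Any task ARN that ends up in `unknown` is a candidate for a closer look, since it was started from a task definition nobody put on the list.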
Monitoring the memory usage of your containers is crucial for the health and security of your Fargate-backed cluster. Containers must have enough resources to run in a dependable way. Keeping an eye on whether they do or not allows you to identify any potential issues and optimize cluster performance.
Properly monitoring this metric can also help you reduce the provisioned memory and save costs without affecting performance. If your containers consistently use well below the amount of memory you provisioned, you can safely reduce the memory allocation.
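The right-sizing idea can be sketched in a few lines of Python. The helper below takes memory usage samples in MiB (for example, the `MemoryUtilized` metric that Container Insights reports next to `MemoryReserved`) and suggests a smaller reservation based on peak usage plus headroom. The 20% headroom and the 1024 MiB rounding step are assumptions chosen for illustration; Fargate only accepts specific CPU and memory combinations, so a suggested value still needs to be checked against those.

```python
import math


def peak_utilization_pct(utilized_mib_samples, reserved_mib):
    """Peak memory usage as a percentage of the reserved amount."""
    return 100 * max(utilized_mib_samples) / reserved_mib


def suggest_memory_mib(utilized_mib_samples, reserved_mib, headroom=0.2, step_mib=1024):
    """Suggest a reservation: peak usage plus headroom, rounded up to step_mib."""
    if not utilized_mib_samples:
        raise ValueError("need at least one usage sample")
    target = max(utilized_mib_samples) * (1 + headroom)
    suggestion = math.ceil(target / step_mib) * step_mib
    # This helper only looks for savings, so never suggest more than is reserved.
    return min(suggestion, reserved_mib)


samples = [900, 1100, 1250]  # hypothetical MiB readings
print(round(peak_utilization_pct(samples, 4096), 1))
print(suggest_memory_mib(samples, reserved_mib=4096))
```

Using the peak rather than the average is a deliberately conservative choice: a container that exceeds its memory limit is killed, so the suggestion should cover the worst observed case.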
Proper CPU utilization monitoring for containers is paramount for the efficient and secure functioning of your cluster. The CPU, or central processing unit, executes commands and processes any incoming data. When a container’s CPU usage reaches its configured limit, it can be throttled, which slows the workload down and can even cause it to crash.
By monitoring CPU utilization, you can identify when your compute resources are nearing their limit and adjust them to scale your deployments and optimize your cluster’s performance. This can help you avoid downtime or performance issues and ensure your cluster runs reliably.
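One way to get notified before CPU becomes a problem is a CloudWatch alarm on the `CPUUtilization` metric that ECS publishes per service in the `AWS/ECS` namespace. The sketch below builds the parameters for the CloudWatch `put_metric_alarm` call; the alarm name pattern, the 80% threshold, and the five one-minute evaluation periods are assumptions for illustration, and the function that actually calls AWS needs `boto3` and valid credentials.

```python
def cpu_alarm_params(cluster, service, threshold_pct=80.0, period_s=60, periods=5):
    """Build keyword arguments for CloudWatch put_metric_alarm (ECS service CPU)."""
    return {
        "AlarmName": f"{cluster}-{service}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/ECS",
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "Statistic": "Average",
        "Period": period_s,
        "EvaluationPeriods": periods,
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_cpu_alarm(cluster, service):
    """Create the alarm in AWS. Not called here; requires boto3 and credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params(cluster, service))


params = cpu_alarm_params("demo-cluster", "web")
print(params["AlarmName"], params["Threshold"])
```

Requiring several consecutive periods above the threshold keeps short spikes from triggering the alarm, at the cost of reacting a few minutes later.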
Amazon Elastic File System (EFS) can provide persistent, shared storage for the files your containers use. You should keep an eye on your containers’ storage activity to ensure everything is running smoothly. Thankfully, both ECS and EKS allow you to collect metrics that provide greater visibility into your storage usage.
By monitoring EFS usage, you can quickly detect any potential issues or I/O bottlenecks and address them before your applications are impacted. Additionally, you can use this data to optimize your EFS configuration and ensure you get the best possible performance from your storage solution.
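As one example of spotting a bottleneck early, EFS file systems that use bursting throughput report a `BurstCreditBalance` metric (in bytes) to CloudWatch. The sketch below extrapolates two readings linearly to estimate how long the credits would last at the current drain rate. The linear model and the sample readings are simplifying assumptions; real workloads rarely drain credits at a constant rate.

```python
def hours_until_credits_exhausted(earlier, later):
    """Estimate hours until the burst credit balance reaches zero.

    earlier and later are (unix_seconds, credit_balance_bytes) readings.
    Returns None when the balance is steady or growing.
    """
    (t0, b0), (t1, b1) = earlier, later
    if t1 <= t0:
        raise ValueError("'later' must have a larger timestamp than 'earlier'")
    rate = (b1 - b0) / (t1 - t0)  # bytes per second, negative while draining
    if rate >= 0:
        return None
    return (b1 / -rate) / 3600


# Hypothetical readings taken one hour apart.
print(hours_until_credits_exhausted((0, 2_000_000), (3600, 1_000_000)))
print(hours_until_credits_exhausted((0, 1_000_000), (3600, 1_500_000)))
```

A result of only a few hours is a signal to look at the workload's I/O pattern or the file system's throughput configuration before throughput drops to the baseline rate.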
Fargate automatically attaches an elastic network interface (ENI) to handle your workload’s network traffic within your VPC. This means network metrics, such as bandwidth, latency, and packet loss, are also worth monitoring. They give you visibility into the traffic flowing in and out of your clusters, helping you identify potential security threats and optimize network performance.
If you enable VPC Flow Logs, you can make even more informed network security and troubleshooting decisions.
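To illustrate what flow log data can tell you, here is a small Python sketch that parses records in the default (version 2) flow log format and counts rejected connections per source address. The sample records are fabricated, and the sketch assumes the default field order; flow logs configured with a custom format would need a different field list.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_record(line):
    """Split one space-separated record into a dict, or return None if malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def rejected_by_source(lines):
    """Count REJECT records per source address."""
    counts = Counter()
    for line in lines:
        record = parse_flow_record(line)
        if record and record["action"] == "REJECT":
            counts[record["srcaddr"]] += 1
    return counts


# Fabricated sample records for one ENI.
ENI = "2 111122223333 eni-0123456789abcdef0"
sample_lines = [
    f"{ENI} 203.0.113.5 10.0.1.20 49761 22 6 3 180 1700000000 1700000060 REJECT OK",
    f"{ENI} 10.0.1.20 10.0.2.15 44321 443 6 20 4249 1700000000 1700000060 ACCEPT OK",
    f"{ENI} 203.0.113.5 10.0.1.20 49762 3389 6 1 60 1700000060 1700000120 REJECT OK",
    f"{ENI} - - - - - - - 1700000120 1700000180 - NODATA",
]

print(rejected_by_source(sample_lines).most_common())
```

A source address that shows up repeatedly with `REJECT` on ports such as 22 or 3389 is the kind of pattern that usually deserves a look at security group rules.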
If the metrics that Fargate provides out of the box aren’t enough, there are ways to get a more detailed look into your clusters.
In the case of ECS, AWS offers an easy option to add more metrics. When creating an ECS cluster on the AWS console, select “Enable Container Insights,” as shown in figure 1.
Fig. 1: Enabling Container Insights
After activating it, you can see these new metrics for the running containers in your cluster. Just head to the CloudWatch console; then, in the left pane, expand the “Insights” menu item and click “Container Insights.” In the main panel, select your ECS cluster, and you will see the metrics, as shown in figure 2.
Fig. 2: Container Insights
AWS also allows you to enable Container Insights for your EKS deployments. Here, AWS requires you to run the CloudWatch Agent and AWS Distro for OpenTelemetry Collector in a container inside your cluster. The process involves more than checking a box; refer to this walk-through in the AWS documentation for guidance.
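For an ECS cluster that already exists, the console checkbox has an API counterpart: the ECS `UpdateClusterSettings` operation with the `containerInsights` setting. The following sketch builds that request and, in a separate function, sends it with `boto3`. The cluster name is a placeholder, and the function that calls AWS assumes `boto3` is installed and credentials are configured.

```python
def container_insights_request(cluster_name, enabled=True):
    """Build keyword arguments for the ECS update_cluster_settings call."""
    return {
        "cluster": cluster_name,
        "settings": [
            {"name": "containerInsights",
             "value": "enabled" if enabled else "disabled"},
        ],
    }


def enable_container_insights(cluster_name):
    """Send the request to AWS. Not called here; requires boto3 and credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    ecs = boto3.client("ecs")
    return ecs.update_cluster_settings(**container_insights_request(cluster_name))


print(container_insights_request("demo-cluster"))
```

Keeping the request builder separate from the AWS call makes the setting easy to review or test without touching a real cluster.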
While EKS and Fargate remove much of the work required to run and maintain a Kubernetes cluster, you can always use existing open-source solutions for Kubernetes as well. For example, Istio lets you deploy Envoy proxy sidecars to each pod, then use Prometheus to collect the metrics and Grafana to visualize them.
AWS Fargate gives you the power of containers without the burden of having to manage the underlying infrastructure. This gives you a middle way between full serverless solutions like AWS Lambda and provisioning your own instances manually.
As with all services, it’s a good idea to monitor the key metrics. After all, deploying in a data center thousands of miles away comes with some risks, and you always want to stay on top of what’s going on with your workloads.
Monitoring how many services and instances are part of your cluster gives you a good overview. Some may no longer be necessary, or you may want to ensure no one is running unauthorized workloads. Diving more into the details of CPU and memory usage might also allow you to save some money or improve performance by updating the limits you set.
And then there are I/O resources such as EFS, ENIs, and VPCs. Data in the cloud isn’t free, and if you have to move it between regions or to your users, things can get costly quickly. So keeping an eye on these resources is important to avoid a surprisingly high bill at the end of the month.
With Container Insights, you get metrics for CPU, memory, disk, and network usage in your clusters, and with ECS, it’s just one click away.