How to Monitor EC2 Instances Using CloudWatch Alarms

AWS
EmpowerCodes
Oct 30, 2025

Monitoring EC2 instances is a critical part of maintaining application performance, reliability, and cost efficiency in AWS. Without proper monitoring, resource exhaustion, degraded performance, or downtime can occur without your team knowing until it affects end-users. Amazon CloudWatch provides a comprehensive monitoring service that allows you to collect metrics, set automated alarms, and receive notifications so you can respond promptly to issues.

This guide explains the fundamentals of monitoring EC2 instances using CloudWatch Alarms, the key metrics to track, and step-by-step instructions to set up effective monitoring in AWS.

Why Monitor EC2 Instances?

Applications running on EC2 depend on CPU, memory, disk, and network performance. Monitoring helps you:

  • Detect performance issues early

  • Maintain application availability

  • Optimize resource utilization and cost

  • Trigger auto scaling actions based on load

  • Improve operational visibility and response time

CloudWatch provides near-real-time metrics and lets you create alarms that notify you when a metric crosses a threshold or a specific condition occurs.

Understanding CloudWatch for EC2

CloudWatch collects both default and custom metrics for EC2 instances:

Default Metrics (Enabled Automatically):

  • CPU Utilization

  • Network In / Out

  • Disk Read / Write (instance store volumes only)

  • Status Check Failed (Instance and System)

These metrics are available at 5-minute intervals by default or 1-minute intervals when detailed monitoring is enabled.

Custom Metrics (Require CloudWatch Agent):

  • Memory Utilization

  • Disk Space Usage

  • Process-level Metrics

  • Swap Usage

  • Application-specific metrics

Default EC2 metrics do not include memory or disk space usage, because these are only visible from inside the operating system. The CloudWatch Agent is required if you need these system-level insights.
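
As a rough illustration, the agent reads a JSON configuration file that lists what to collect. The sketch below generates a minimal one with Python. The 60-second interval and the choice of the root filesystem are assumptions for the example, and the helper name build_agent_config is hypothetical.

```python
import json


def build_agent_config(interval_seconds=60):
    """Return a CloudWatch Agent config dict that collects memory, disk, and swap."""
    return {
        "metrics": {
            "namespace": "CWAgent",
            # Attach the instance ID to every metric so alarms can target one instance.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": ["/"],  # assumption: only the root filesystem matters
                    "metrics_collection_interval": interval_seconds,
                },
                "swap": {
                    "measurement": ["swap_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
            },
        }
    }


# Serialize the config so it can be saved as the agent's JSON configuration file.
config_json = json.dumps(build_agent_config(), indent=2)
print(config_json)
```

The printed JSON can be saved to a file and passed to the agent when it starts. Check the agent documentation for the file location on your operating system.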

Key Metrics to Monitor for EC2 Performance

Here are the most important EC2 metrics used for health and performance monitoring:

Metric | Reason to Monitor
CPU Utilization | Detect over- or under-provisioning, performance bottlenecks
Status Check Failed | Identifies hardware or OS failure
Network In/Out | Detect traffic spikes, DDoS impact, or connectivity issues
Disk Read/Write Ops | Measures storage performance for I/O-intensive workloads
Memory Utilization* | Tracks app memory usage to avoid crashes
Disk Space Usage* | Prevents storage overflows that can halt applications

(*) Requires CloudWatch Agent.

How CloudWatch Alarms Work

A CloudWatch alarm watches a single metric against a threshold over a number of evaluation periods. Depending on the result, the alarm is in one of three states:

  • OK – Metric is within defined threshold

  • ALARM – Threshold crossed

  • INSUFFICIENT_DATA – The alarm has just started, or not enough data is available to evaluate it

You can configure actions for each state, such as:

  • Send email or SMS using SNS

  • Trigger Auto Scaling policies

  • Reboot, stop, terminate, or recover the EC2 instance

  • Execute Systems Manager Automation actions
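
To make the state logic concrete, here is a deliberately simplified model of how a "greater than threshold" alarm could pick its state from recent data points. It is a sketch for intuition only: the real service has additional rules, for example configurable handling of missing data, and the function name evaluate_alarm is hypothetical.

```python
def evaluate_alarm(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Simplified model of a 'greater than threshold' alarm.

    datapoints: recent metric values, oldest first; None marks a period with no data.
    """
    # Only the most recent N periods take part in the evaluation.
    window = datapoints[-evaluation_periods:]
    present = [value for value in window if value is not None]
    if not present:
        return "INSUFFICIENT_DATA"
    breaching = sum(1 for value in present if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


print(evaluate_alarm([70, 85, 90, 95, 88, 91], 80, 5, 5))
print(evaluate_alarm([85, 90, 60, 95, 88], 80, 5, 5))
print(evaluate_alarm([None, None, None], 80, 3, 3))
```

In the first call all five of the latest values exceed 80, so the model reports ALARM. In the second call one value dips to 60, so a strict "5 out of 5" rule stays OK.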

Step-by-Step: Create a CloudWatch Alarm for EC2 Monitoring

Step 1: Open the CloudWatch Console

  1. Log in to your AWS Management Console

  2. Go to Services > CloudWatch

  3. Select Alarms > All Alarms

  4. Click Create Alarm

Step 2: Choose a Metric

  1. Click Select Metric

  2. Choose EC2 Metrics

  3. Select Per-Instance Metrics

  4. Choose the instance and metric you want to monitor, for example CPUUtilization

  5. Click Select Metric

Step 3: Configure the Alarm

  1. Define the threshold condition. Example:

    • Whenever CPUUtilization is greater than 80% for 5 consecutive minutes

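  2. Set the period and evaluation window (e.g., 5 out of 5 data points with a 1-minute period). A 1-minute period requires detailed monitoring; with basic monitoring the instance reports every 5 minutes, so use a single 5-minute period instead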

  3. Choose the statistic type: Average, Maximum, or Minimum

    • CPU: Use Average

    • Network spikes: Use Maximum

Step 4: Configure Notification (SNS)

  1. Select In Alarm state

  2. Choose Create new SNS topic

  3. Enter the email addresses that should receive alerts

  4. Confirm the subscription using the link in the email that SNS sends to each address

Once the subscription is confirmed, subscribers are notified each time the alarm enters the ALARM state.
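
The same notification setup can be scripted. The sketch below assumes the boto3 SDK and takes the SNS client as a parameter. The topic name and email address are placeholders, and create_alert_topic is a hypothetical helper name.

```python
def create_alert_topic(sns, topic_name, email):
    """Create (or reuse) an SNS topic and subscribe one email address to it."""
    # create_topic returns the existing topic's ARN if the name is already in use.
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    # Email subscriptions stay pending until the recipient confirms via the emailed link.
    sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn


# Usage with real AWS credentials (not executed here):
#   import boto3
#   arn = create_alert_topic(boto3.client("sns"), "ec2-alerts", "ops@example.com")
```

The returned topic ARN is what you later pass to the alarm as its notification target.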

Step 5: Name and Create the Alarm

  1. Provide a meaningful name such as High-CPU-Alarm-EC2-Prod

  2. Add a description for clarity

  3. Review and click Create Alarm

Your CloudWatch Alarm is now live.
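
The console steps above correspond roughly to one put_metric_alarm call in the boto3 SDK. The sketch below only builds the arguments so the choices are easy to inspect. The instance ID and topic ARN are placeholders, and build_cpu_alarm is a hypothetical helper.

```python
def build_cpu_alarm(instance_id, topic_arn, threshold=80.0, detailed_monitoring=False):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    # Basic monitoring publishes one data point every 5 minutes, so a 60-second
    # period only makes sense when detailed monitoring is enabled.
    period, periods = (60, 5) if detailed_monitoring else (300, 1)
    return {
        "AlarmName": "High-CPU-Alarm-EC2-Prod",
        "AlarmDescription": f"CPU above {threshold}% for 5 minutes on {instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period,
        "EvaluationPeriods": periods,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic notified when the state becomes ALARM
    }


params = build_cpu_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ec2-alerts",
    detailed_monitoring=True,
)

# Usage with real AWS credentials (not executed here):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Either way the alarm covers a 5-minute window: five 1-minute data points with detailed monitoring, or one 5-minute data point without it.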

Setting a Memory Usage Alarm (Requires Agent)

Since memory metrics are not available by default, follow these steps:

  • Install the CloudWatch Agent on EC2

  • Configure the agent to collect memory metrics

  • Create a CloudWatch alarm for “mem_used_percent”, which the agent publishes in the CWAgent namespace by default

Alerting on high memory usage gives you time to act before the application runs out of memory and crashes.
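
A memory alarm differs from the CPU alarm mainly in its namespace and metric name. The sketch below assumes the agent is configured to publish InstanceId as the only dimension; an alarm's dimensions must match the published metric exactly, so adjust them to your agent configuration. The 85% threshold and the build_memory_alarm name are illustrative.

```python
def build_memory_alarm(instance_id, topic_arn, threshold=85.0):
    """Build put_metric_alarm arguments for the agent's memory metric."""
    return {
        "AlarmName": f"High-Memory-{instance_id}",
        "Namespace": "CWAgent",  # default namespace used by the CloudWatch Agent
        "MetricName": "mem_used_percent",
        # Assumption: the agent appends only InstanceId as a dimension.
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # If the agent stops reporting, treat the gap as a problem, not as healthy.
        "TreatMissingData": "breaching",
        "AlarmActions": [topic_arn],
    }


memory_params = build_memory_alarm(
    "i-0123456789abcdef0", "arn:aws:sns:us-east-1:123456789012:ec2-alerts"
)

# Usage (not executed here):
#   boto3.client("cloudwatch").put_metric_alarm(**memory_params)
```

Treating missing data as breaching is a design choice: a silent agent then raises the alarm instead of hiding a problem.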

Automating Responses with Alarm Actions

CloudWatch alarms can trigger automated remediation actions to reduce manual intervention.

Examples:

Scenario | Automatic Action
High CPU Load | Trigger Auto Scaling to add more instances
Low CPU Load | Scale down to reduce cost
Failed Instance Status Check | Automatically reboot the instance
Low Disk Space | Run SSM automation to clean up storage

These actions reduce downtime and improve operational efficiency.
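
For the status check scenario, EC2 alarm actions are expressed as special ARNs attached to the alarm. The sketch below pairs an instance-level failure with a reboot and a system-level failure with a recover action. The evaluation settings are illustrative assumptions, and build_status_check_alarm is a hypothetical helper.

```python
def build_status_check_alarm(instance_id, region, check="instance"):
    """Build put_metric_alarm arguments that reboot or recover a failing instance."""
    # Instance-level failures: reboot. System-level (host) failures: recover.
    metric, action = {
        "instance": ("StatusCheckFailed_Instance", "reboot"),
        "system": ("StatusCheckFailed_System", "recover"),
    }[check]
    return {
        "AlarmName": f"{metric}-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": metric,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 3,  # assumption: tolerate up to 3 minutes of failed checks
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:{action}"],
    }


reboot_alarm = build_status_check_alarm("i-0123456789abcdef0", "us-east-1")
recover_alarm = build_status_check_alarm("i-0123456789abcdef0", "us-east-1", check="system")
```

Both dictionaries can be passed to put_metric_alarm in the same way as a notification alarm.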

Best Practices for EC2 Monitoring

To get the most value from CloudWatch monitoring, follow these recommended practices:

Monitor Both System and Application Metrics

Default metrics are not enough for deep troubleshooting. Install CloudWatch Agent to track memory, disk, and application metrics.

Tag Your Resources for Organized Monitoring

Use tags such as Environment, Application, and Owner so you can group and filter instances when creating alarms and reports.
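
One way to put tags to work is to look up instances by tag and create the same alarm for each of them. The sketch below assumes boto3's describe_instances call. The tag key and value are examples, and both helper names are hypothetical.

```python
def tag_filters(key, value):
    """Build describe_instances filters for running instances with a given tag."""
    return [
        {"Name": f"tag:{key}", "Values": [value]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]


def instance_ids(response):
    """Extract instance IDs from a describe_instances response."""
    return [
        instance["InstanceId"]
        for reservation in response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
    ]


# Usage (not executed here; large fleets would also need pagination):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ids = instance_ids(ec2.describe_instances(Filters=tag_filters("Environment", "prod")))
```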

Use Composite Alarms for Accurate Alerts

Composite alarms reduce noise by combining the states of several existing alarms into a single rule. Example:

Trigger an alert only if:

  • CPU > 80% and

  • Memory > 85% and

  • Disk I/O high

This reduces false alarms, because a short-lived spike in a single metric no longer triggers a notification on its own.
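
In the API, a composite alarm refers to its child alarms by name in a rule expression. The sketch below builds such a rule for boto3's put_composite_alarm call. The child alarm names are placeholders for alarms that would already have to exist, and build_composite_alarm is a hypothetical helper.

```python
def build_composite_alarm(name, child_alarm_names, topic_arn):
    """Build put_composite_alarm arguments that fire only if every child is in ALARM."""
    # ALARM("x") is true while the named child alarm is in the ALARM state.
    rule = " AND ".join(f'ALARM("{child}")' for child in child_alarm_names)
    return {"AlarmName": name, "AlarmRule": rule, "AlarmActions": [topic_arn]}


composite = build_composite_alarm(
    "prod-web-overloaded",
    ["cpu-high", "mem-high", "disk-io-high"],
    "arn:aws:sns:us-east-1:123456789012:ec2-alerts",
)
print(composite["AlarmRule"])

# Usage (not executed here):
#   boto3.client("cloudwatch").put_composite_alarm(**composite)
```

To keep notifications from doubling up, the child alarms would typically carry no notification action of their own.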

Enable Detailed Monitoring for Production Instances

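Enable 1-minute interval monitoring for near-real-time performance tracking. Detailed monitoring incurs additional charges, so reserve it for instances where faster detection matters.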
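
Detailed monitoring can also be switched on through the EC2 API. The sketch below wraps boto3's monitor_instances call and takes the client as a parameter. The helper name enable_detailed_monitoring is hypothetical.

```python
def enable_detailed_monitoring(ec2, instance_ids):
    """Turn on 1-minute metrics and return {instance_id: monitoring_state}."""
    response = ec2.monitor_instances(InstanceIds=list(instance_ids))
    return {
        item["InstanceId"]: item["Monitoring"]["State"]
        for item in response["InstanceMonitorings"]
    }


# Usage (not executed here):
#   import boto3
#   states = enable_detailed_monitoring(boto3.client("ec2"), ["i-0123456789abcdef0"])
```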

Configure Separate Alarms Based on Environment

  • Development: Less strict thresholds

  • Production: Tight thresholds and automated actions
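
One simple way to keep environment differences explicit is a small lookup table that alarm-creation scripts consult. The values below are illustrative assumptions, not recommendations, and all names are hypothetical.

```python
ALARM_PROFILES = {
    # Development: higher threshold, longer window, notifications only.
    "development": {"cpu_threshold": 90.0, "evaluation_periods": 3, "auto_remediate": False},
    # Production: tighter threshold, shorter window, automated actions enabled.
    "production": {"cpu_threshold": 80.0, "evaluation_periods": 1, "auto_remediate": True},
}


def alarm_settings(environment):
    """Look up alarm settings, falling back to the strict production profile."""
    return ALARM_PROFILES.get(environment, ALARM_PROFILES["production"])


print(alarm_settings("development"))
```

Falling back to the production profile means an unknown environment name gets the stricter settings rather than none.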

Set Up Dashboards

Create CloudWatch dashboards to monitor multiple EC2 instances from a single panel.
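
Dashboards are defined by a JSON body, so they can be generated for any list of instances. The sketch below builds a single CPU widget for boto3's put_dashboard call. The layout values, title, and dashboard name are assumptions for the example, and build_dashboard_body is a hypothetical helper.

```python
import json


def build_dashboard_body(instance_ids, region):
    """Return a dashboard JSON body with one CPU graph covering all given instances."""
    widget = {
        "type": "metric",
        "x": 0, "y": 0, "width": 12, "height": 6,
        "properties": {
            "title": "EC2 CPU utilization",
            "view": "timeSeries",
            # One line per instance: [namespace, metric, dimension name, dimension value]
            "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", i] for i in instance_ids],
            "stat": "Average",
            "period": 300,
            "region": region,
        },
    }
    return json.dumps({"widgets": [widget]})


body = build_dashboard_body(["i-0123456789abcdef0", "i-0fedcba9876543210"], "us-east-1")

# Usage (not executed here):
#   boto3.client("cloudwatch").put_dashboard(DashboardName="ec2-overview", DashboardBody=body)
```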

Common Alarm Use Cases

Here are practical use cases to apply CloudWatch alarms effectively:

Application Server Monitoring
Monitor CPU, memory, and latency to prevent traffic-related service slowdowns.

Database Server Monitoring
Monitor disk I/O, read/write latency, and free storage space for databases running on EC2.

Autoscaling Policies
Trigger scale-out and scale-in events based on load thresholds to optimize costs and performance.
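
A threshold-based scale-out can be sketched as a simple scaling policy plus an alarm whose action is that policy's ARN. The group name, thresholds, and cooldown below are illustrative assumptions, and both helper names are hypothetical. Target tracking policies are an alternative that manages the alarms for you.

```python
def build_scaling_policy(asg_name, adjustment):
    """Build put_scaling_policy arguments that add or remove instances."""
    direction = "out" if adjustment > 0 else "in"
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-scale-{direction}",
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": adjustment,
        "Cooldown": 300,
    }


def build_group_cpu_alarm(asg_name, policy_arn, threshold, operator):
    """Build put_metric_alarm arguments on the group's average CPU."""
    return {
        "AlarmName": f"{asg_name}-cpu-{operator}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold,
        "ComparisonOperator": operator,
        "AlarmActions": [policy_arn],  # the alarm invokes the scaling policy
    }


scale_out = build_scaling_policy("web-asg", 1)
scale_in = build_scaling_policy("web-asg", -1)

# Usage (not executed here):
#   arn = autoscaling.put_scaling_policy(**scale_out)["PolicyARN"]
#   cloudwatch.put_metric_alarm(
#       **build_group_cpu_alarm("web-asg", arn, 70.0, "GreaterThanThreshold"))
```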

Health Monitoring and Self-Healing
Reboot or replace unhealthy instances automatically.

Conclusion

Monitoring EC2 instances using CloudWatch Alarms is essential for maintaining performance, reliability, and high availability of your applications. By tracking key metrics and setting intelligent alarms, teams can detect issues early, automate responses, and ensure seamless operations.

Whether you are running a single instance or a fleet of servers, CloudWatch provides the capabilities needed to monitor system health, improve performance, and optimize costs. Establishing well-configured alarms, dashboards, and automated remediation workflows ensures proactive management of your cloud infrastructure.