Here’s a solution that monitors a process on your EC2 instance and alerts you if it stops running. It uses a Python Lambda function together with a few other AWS services:

1. Overview

  • We’ll use AWS Systems Manager Run Command, which relies on the SSM Agent on your EC2 instance, to run a shell command that checks for the process.
  • An Amazon EventBridge (formerly CloudWatch Events) schedule rule will trigger the Lambda function periodically.
  • The Lambda function will send the command through SSM and inspect its output.
  • If the process isn’t running, the function will publish a notification to an SNS topic.
  • You’ll receive alerts via email or other methods subscribed to the SNS topic.

2. Prerequisites

  • An AWS account with necessary permissions.
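  • An EC2 instance with SSM Agent installed and running, and an instance profile that lets it register with Systems Manager (for example, the AmazonSSMManagedInstanceCore managed policy).
  • An SNS topic for receiving notifications, with at least one confirmed subscription (such as your email address).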
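
To confirm the second prerequisite before wiring up anything else, you can ask Systems Manager whether it sees the instance. This is a minimal sketch: the helper name is_managed_and_online is made up for illustration, and the SSM client is passed in as an argument so the function can be exercised without AWS access.

```python
def is_managed_and_online(ssm_client, instance_id):
    """Return True if Systems Manager reports the instance with PingStatus 'Online'."""
    response = ssm_client.describe_instance_information(
        Filters=[{'Key': 'InstanceIds', 'Values': [instance_id]}]
    )
    # An empty list means the instance is not registered with Systems Manager
    # (agent not running, no instance profile, or no network path to the service).
    return any(
        info.get('PingStatus') == 'Online'
        for info in response.get('InstanceInformationList', [])
    )
```

With credentials configured, calling is_managed_and_online(boto3.client('ssm'), 'your-instance-id') should return True before you go further; if it returns False, the Lambda function’s send_command call is unlikely to succeed either.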

3. Code and Implementation

Python

import boto3
import json

ssm_client = boto3.client('ssm')
sns_client = boto3.client('sns')

def lambda_handler(event, context):
    # Replace with your instance ID and process name
    instance_id = 'your-instance-id'
    process_name = 'your-process-name'

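    # Command to check if the process is running.
    # grep exits non-zero when nothing matches, which would mark the SSM
    # command as Failed and make the waiter below raise instead of reaching
    # the alert branch. '|| true' keeps the exit status at 0, so an empty
    # output simply means "not running".
    command = f'ps aux | grep {process_name} | grep -v grep || true'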

    response = ssm_client.send_command(
        InstanceIds=[instance_id],
        DocumentName='AWS-RunShellScript',
        Parameters={'commands': [command]}
    )

    command_id = response['Command']['CommandId']

    # Wait for the command to complete
    waiter = ssm_client.get_waiter('command_executed')
    waiter.wait(CommandId=command_id, InstanceId=instance_id)

    # Get command output
    output = ssm_client.get_command_invocation(
        CommandId=command_id,
        InstanceId=instance_id
    )

    # Check if the process is running
    if output['StandardOutputContent']:
        print(f'Process {process_name} is running on instance {instance_id}')
    else:
        print(f'Process {process_name} is NOT running on instance {instance_id}')
        # Publish notification to SNS topic
        sns_client.publish(
            TopicArn='your-sns-topic-arn',
            Message=f'Process {process_name} is NOT running on instance {instance_id}'
        )

    return {
        'statusCode': 200,
        'body': json.dumps('Process monitoring completed.')
    }
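
Hard-coding the instance ID, process name, and topic ARN is fine for a first test, but Lambda environment variables let you change them without editing the code. Here is a small sketch of that idea; the variable names INSTANCE_ID, PROCESS_NAME, and SNS_TOPIC_ARN are assumptions you would define yourself in the function configuration, not something Lambda provides.

```python
import os


def load_config(env=os.environ):
    """Read monitor settings from environment variables (names are illustrative)."""
    required = ('INSTANCE_ID', 'PROCESS_NAME', 'SNS_TOPIC_ARN')
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail fast with a clear message instead of sending a malformed command
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        'instance_id': env['INSTANCE_ID'],
        'process_name': env['PROCESS_NAME'],
        'topic_arn': env['SNS_TOPIC_ARN'],
    }
```

Inside lambda_handler you could then call load_config() once and use the returned values in place of the three placeholder strings.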

4. Implementation Instructions

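  1. Create an IAM role for the Lambda function that allows ssm:SendCommand, ssm:GetCommandInvocation, and sns:Publish, plus the usual CloudWatch Logs permissions.
  2. Create a Lambda function with the code above and raise its timeout well above the default 3 seconds, since waiting for the SSM command to finish typically takes several seconds.
  3. Configure an Amazon EventBridge (formerly CloudWatch Events) rule to trigger the Lambda function periodically (e.g., every 5 minutes).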
  4. Replace placeholders like your-instance-id, your-process-name, and your-sns-topic-arn with your actual values.
  5. Test the function by invoking it manually, once while the process is running and once after stopping it, and confirm that the alert arrives.
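
Step 3 can also be scripted instead of clicked through in the console. The sketch below uses the boto3 events and lambda clients. The rule name, statement ID, and target ID are hypothetical, and both clients are passed in as arguments so nothing here assumes a particular account or region.

```python
def schedule_lambda(events_client, lambda_client, function_arn,
                    rule_name='process-monitor-every-5-min'):
    """Create a rate-based rule and point it at the Lambda function."""
    # rate(5 minutes) matches the "every 5 minutes" example in step 3
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression='rate(5 minutes)',
        State='ENABLED',
    )
    # Allow the rule to invoke the function; without this permission the
    # rule fires but the invocation is rejected.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f'{rule_name}-invoke',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule['RuleArn'],
    )
    # Register the function as the rule's only target
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{'Id': 'process-monitor-lambda', 'Arn': function_arn}],
    )
    return rule['RuleArn']
```

You would call it as schedule_lambda(boto3.client('events'), boto3.client('lambda'), function_arn) with the ARN of the function from step 2. Running it a second time with the same rule name is likely to fail at add_permission because the statement ID already exists.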

5. Additional Notes

  • You can customize the command to check for specific process states or conditions.
  • Consider adding error handling and logging to the Lambda function for better monitoring.
  • Explore CloudWatch Agent for more advanced process monitoring and metric collection.
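
As one way to act on the error-handling note, the sketch below polls get_command_invocation directly and reports three outcomes instead of two, so that a failed or timed-out command is not mistaken for a stopped process. The function name, the return labels, and the polling defaults are all illustrative. The SSM client is injected, and the check command is assumed to exit 0 whether or not the process is found.

```python
import logging
import time

logger = logging.getLogger(__name__)

# Statuses after which the invocation will not change any more
TERMINAL = {'Success', 'Failed', 'Cancelled', 'TimedOut'}


def check_process(ssm_client, command_id, instance_id,
                  attempts=20, delay=5, sleep=time.sleep):
    """Return 'running', 'stopped', or 'unknown' for an SSM check command."""
    for _ in range(attempts):
        try:
            result = ssm_client.get_command_invocation(
                CommandId=command_id, InstanceId=instance_id
            )
        except Exception as exc:
            # Right after send_command the invocation may not be visible yet;
            # a broad except keeps this sketch free of botocore imports.
            logger.warning('Could not fetch invocation %s: %s', command_id, exc)
            sleep(delay)
            continue
        status = result['Status']
        if status not in TERMINAL:
            sleep(delay)  # Pending, InProgress, Delayed, Cancelling
            continue
        if status != 'Success':
            logger.error('Command %s ended with status %s', command_id, status)
            return 'unknown'
        # Empty output from a successful check means no matching process
        return 'running' if result['StandardOutputContent'].strip() else 'stopped'
    logger.error('Command %s did not finish in time', command_id)
    return 'unknown'
```

In the handler you could publish the existing alert only for 'stopped' and send a differently worded message for 'unknown', since that outcome points at the monitoring itself rather than at the process.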

Let me know if you have any questions or need further assistance!

By DSD