Transforming a monolithic application to a microservices-oriented architecture – Part 3 – AWS CloudFormation Stack

In this article, I am going to explain how to set up a base stack for the AWS CloudFormation service. This stack creates several shared resources needed for deploying Docker containers on Windows Server 2016 instances.

Network

Since we already have an Amazon VPC (Virtual Private Cloud) and subnets defined, we decided to deploy our applications under the same network configuration.

If you want the base stack to also create the VPC and subnets, that is doable through the same stack; we chose to skip it and simply reference the existing VPC and subnet IDs instead. A minimal sketch of what the network resources could look like is shown below.
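This is only a sketch, assuming illustrative CIDR ranges and resource names; public subnets would also need an Internet Gateway, a route table and routes, which are omitted here:

# A minimal sketch of creating the network in the same stack (not used in our setup)
VPC:
  Type: AWS::EC2::VPC
  Properties:
    CidrBlock: 10.0.0.0/16
    EnableDnsSupport: true
    EnableDnsHostnames: true

PublicSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref VPC
    CidrBlock: 10.0.0.0/24
    AvailabilityZone: !Select [0, !GetAZs '']

PublicSubnet2:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref VPC
    CidrBlock: 10.0.1.0/24
    AvailabilityZone: !Select [1, !GetAZs '']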

Parameters

Our AWS CloudFormation stack asks for three parameters when we launch it:

1. VpcId, since we use an existing VPC:

VpcId:
  Type: 'AWS::EC2::VPC::Id'
  Description: Select a default VPC ID.
  Default: vpc-4745d02c

2. SubnetIDs, since we use existing subnets:

SubnetIDs:
  Type: 'List<AWS::EC2::Subnet::Id>'
  Description: Select at least 2 default subnet IDs in your selected VPC.
  Default: subnet-4645d02d,subnet-4445d02f

3. InstanceType, to specify what type of instances will be launched by this stack:

InstanceType:
  Description: EC2 instance type
  Type: String
  Default: t3.xlarge
  AllowedValues:  
    - t3.medium
    - t3.large
    - t3.xlarge
  ConstraintDescription: Please choose a valid instance type.

Mappings

Based on the region we deploy to, we map the image (AMI) to use for launching instances of the type provided in the parameters.

An AMI (Amazon Machine Image) is a fully configured template for launching an instance, which is a virtual server in the cloud. It contains everything the instance needs, and you must specify an AMI whenever you launch an instance on AWS.

At the moment, we launch our stack in the us-west-2 region and map it to the latest Windows Server 2016 ECS-Optimized AMI from the AWS Marketplace.

When you launch an instance from this AMI, it has everything it needs to work as a node in our cluster, on which Docker containers can run and to which requests can be routed.

# These are the latest ECS optimized AMIs:
#
#   Windows_Server-2016-English-Full-ECS_Optimized-2018.10.23 
# - ami-0fded406f9181f23e
#   ECS agent: 1.21.0
#   Docker: 18.03.1-ee-3      
#   ecs-init:     
#
# (note the AMI identifier is region specific) 

AWSRegionToAMI:
  us-west-2:
    AMI: ami-0fded406f9181f23e
    #AMI: ami-61522f19

Security Groups

The next step in our base stack is to create the actual resources. First things first, we create the two security groups to assign to the ECS instances and to the load balancer:

1. Security group for the ECS cluster, with port 3389 open so we can connect remotely (RDP) to the instances:

# This security group defines who/where is allowed to access
# the ECS hosts directly.
# By default we're just allowing access from the load balancer.
# If you want to RDP into the hosts, or expose non-load-balanced
# services, you can open their ports here.
ECSHostSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties: 
    VpcId: !Ref VpcId
    GroupDescription: Access to the ECS
    SecurityGroupIngress:
      # Only allow inbound access to ECS from the ELB
      - SourceSecurityGroupId: !Ref LoadBalancerSecurityGroup 
        IpProtocol: -1
      # Allow RDP to the machines in case we want to inspect
      # or manually do something on any of them
      - CidrIp: 0.0.0.0/0
        IpProtocol: tcp
        FromPort: '3389'
        ToPort: '3389'
    Tags:
      - Key: Name
        Value: !Sub ${AWS::StackName}-ECS-Hosts

2. Security group to assign to the Application Load Balancer so that traffic is allowed to reach it:

# This security group defines who/where is allowed to access
# the Application Load Balancer.
# By default, we've opened this up to the public internet (0.0.0.0/0),
# but you can restrict it further if you want.
LoadBalancerSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties: 
    VpcId: !Ref VpcId
    GroupDescription: Access to the load balancer that sits in front of ECS
    SecurityGroupIngress:
      # Allow access from anywhere to our ECS services
      - CidrIp: 0.0.0.0/0
        IpProtocol: -1
    Tags: 
      - Key: Name
        Value: !Sub ${AWS::StackName}-LoadBalancers

Application Load Balancer

The next things to create are the Application Load Balancer and its HTTP and HTTPS listeners.

1. Load Balancer:

LoadBalancer:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    Name: !Sub ${AWS::StackName}-alb
    Subnets: !Ref SubnetIDs
      #- !Join
      #    - ','
      #    - !Ref SubnetIDs
      # !Join [ ",", !Ref SubnetIDs ]
    SecurityGroups: 
      - !Ref LoadBalancerSecurityGroup
    Tags: 
      - Key: Name
        Value: !Sub ${AWS::StackName}-alb
    LoadBalancerAttributes:
      - Key: access_logs.s3.enabled
        Value: true
      - Key: access_logs.s3.bucket
        Value: 'my-bucket-logs'
      # The number of seconds to wait before an idle connection is closed.
      - Key: idle_timeout.timeout_seconds 
        Value: 3600
      # - Key: deletion_protection.enabled
      #  Value: true

2. Listener for port 80:

LoadBalancerListener80:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref LoadBalancer
    Port: 80
    Protocol: HTTP 
    DefaultActions: 
      - Type: forward
        TargetGroupArn: !Ref DefaultTargetGroup

3. Listener for port 443:

LoadBalancerListener443:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref LoadBalancer
    Port: 443
    Protocol: HTTPS 
    Certificates:
      - CertificateArn: arn:aws:acm:REGION:ACCOUNT_ID:certificate/CERTIFICATE_ID
    SslPolicy: ELBSecurityPolicy-2016-08
    DefaultActions: 
      - Type: forward
        TargetGroupArn: !Ref DefaultTargetGroup

4. If you want to add another certificate to the above listener, you must create a separate ListenerCertificate resource; this is the only way, since the Certificates property of the Listener resource accepts only one certificate ARN.

# Uncomment the following if you want to add another certificate to the listener;
# it is the only way you can do it.
# Listener443Certificate:
#   Type: 'AWS::ElasticLoadBalancingV2::ListenerCertificate'
#   Properties:
#     Certificates:
#       - CertificateArn: arn:aws:acm:REGION:ACCOUNT_ID:certificate/CERTIFICATE_ID
#     ListenerArn: !Ref LoadBalancerListener443

The default target group

Since it is mandatory to specify a target group when creating listeners for an Application Load Balancer, we create a default target group here, which will never actually be used.

# We define a default target group here, as this is a mandatory parameter
# when creating an Application Load Balancer listener. It is not actually used;
# instead, a target group is created per service in each service template (../services/*)
DefaultTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    # Name: !Sub ${AWS::StackName}-default
    VpcId: !Ref VpcId
    Port: 80
    Protocol: HTTP

Alarms for the ELB and for Target Groups

The next part of the stack creates alarms which track responses from the ALB and from the target groups respectively, and inform us when something unusual is happening.

1. Alarms for the Application Load Balancer, triggered when 4XX and 5XX responses from it are far more frequent than usual. As an action, we select an already created SNS topic, which sends us email and SMS notifications. Of course, you can also create the SNS topic in the stack, but we skip that since we already have one, and simply fill in its ARN in the AlarmActions section:

ELB4XXResponseCountIsHigh:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: 'ELB responded with 4XX 50 or more times in the last 5 min'
    ActionsEnabled: true
    ComparisonOperator: GreaterThanOrEqualToThreshold
    EvaluationPeriods: 1
    MetricName: HTTPCode_ELB_4XX_Count
    Namespace: AWS/ApplicationELB
    Period: 300
    Statistic: Sum
    Threshold: 50
    AlarmActions:
      - 'arn:aws:sns:REGION:ACCOUNT_ID:SNS_NAME'
    Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt 'LoadBalancer.LoadBalancerFullName'

ELB5XXResponseCountIsHigh:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: 'ELB responded with 5XX 10 or more times in the last 5 min'
    ActionsEnabled: true
    ComparisonOperator: GreaterThanOrEqualToThreshold
    EvaluationPeriods: 1
    MetricName: HTTPCode_ELB_5XX_Count
    Namespace: AWS/ApplicationELB
    Period: 300
    Statistic: Sum
    Threshold: 10
    AlarmActions:
      - 'arn:aws:sns:REGION:ACCOUNT_ID:SNS_NAME'
    Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt 'LoadBalancer.LoadBalancerFullName'

2. Alarms for the target groups, the same as for the load balancer, but this time we count responses that came directly from the targets in all associated target groups:

Target4XXResponseCountIsHigh:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: 'Targets responded with 4XX 100 or more times in the last 5 min'
    ActionsEnabled: true
    ComparisonOperator: GreaterThanOrEqualToThreshold
    EvaluationPeriods: 1
    MetricName: HTTPCode_Target_4XX_Count
    Namespace: AWS/ApplicationELB
    Period: 300
    Statistic: Sum
    Threshold: 100
    AlarmActions:
      - 'arn:aws:sns:REGION:ACCOUNT_ID:SNS_NAME'
    Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt 'LoadBalancer.LoadBalancerFullName'

Target5XXResponseCountIsHigh:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: 'Targets responded with 5XX 2 or more times in the last 5 min'
    ActionsEnabled: true
    ComparisonOperator: GreaterThanOrEqualToThreshold
    EvaluationPeriods: 1
    MetricName: HTTPCode_Target_5XX_Count
    Namespace: AWS/ApplicationELB
    Period: 300
    Statistic: Sum
    Threshold: 2
    AlarmActions:
      - 'arn:aws:sns:REGION:ACCOUNT_ID:SNS_NAME'
    Dimensions:
      - Name: LoadBalancer
        Value: !GetAtt 'LoadBalancer.LoadBalancerFullName'

The ECS cluster

Creating an ECS cluster through an AWS CloudFormation stack is as simple as this:

ECSCluster:
  Type: AWS::ECS::Cluster
  Properties:
    ClusterName: !Ref AWS::StackName

The Auto Scaling Group

The next resource we create through the base stack is the Auto Scaling Group, together with all the other resources it needs to launch instances in our ECS cluster.

We have to specify the minimum, maximum, and desired number of instances to keep running at any moment in time.

Also, it is important to give the Auto Scaling Group a long enough timeout in its creation and update policies so that each instance has time to boot and become ready to launch containers (i.e. to register with the ECS cluster). Once an instance is ready, it must send an OK signal to AWS CloudFormation BEFORE the timeout expires. If the OK signal is not sent by all the instances (the desired count), the stack fails to launch and its resources are deleted/rolled back.

The timeout period depends on how long the instance takes to boot and how long our bootstrapping script takes to finish. All of this will be explained in more detail later, when we define the launch configuration.

ECSAutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties: 
    VPCZoneIdentifier: !Ref SubnetIDs
      #- !Join
      #    - ','
      #    - !Ref SubnetIDs
      # !Join [ ",", !Ref SubnetIDs ]
    LaunchConfigurationName: !Ref ECSLaunchConfiguration
    MinSize: 1
    MaxSize: 5
    DesiredCapacity: 1
    NotificationConfigurations:
    - TopicARN: !Ref ASGSNSTopic
      NotificationTypes:
      - autoscaling:EC2_INSTANCE_LAUNCH
      - autoscaling:EC2_INSTANCE_LAUNCH_ERROR
      - autoscaling:EC2_INSTANCE_TERMINATE
      - autoscaling:EC2_INSTANCE_TERMINATE_ERROR
    Tags: 
      - Key: Name
        Value: !Sub ${AWS::StackName} ECS host
        PropagateAtLaunch: 'true'
      - Key: Description
        Value: >
          This instance is part of the Auto Scaling Group which was created
          for ECS through a CloudFormation stack
        PropagateAtLaunch: 'true'
  CreationPolicy:
    ResourceSignal: 
      # if this is not enough, 
      # the create instance will fail and the stack will be rolled back
      Timeout: PT40M 
  UpdatePolicy:
    AutoScalingRollingUpdate:
      MinInstancesInService: 1
      MaxBatchSize: '1'
      # if this is not enough, 
      # the update instance will fail and the stack will be rolled back
      PauseTime: PT40M
      SuspendProcesses:
        - AlarmNotification
        - ReplaceUnhealthy 
      WaitOnResourceSignals: 'true'

Alarms and Scale policies

In the past, we used to scale instances based on the traffic and CPU load on them. Now we scale containers based on that instead. So the natural question is: what do we scale instances on? The answer is CPU reservation.

That means we scale instances so that there are always enough CPU resources available for launching new containers on them.

1. First we create the policies:

ScaleUpWhenCPUReservationIsHighPolicy:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AdjustmentType: ChangeInCapacity
    PolicyType: StepScaling
    StepAdjustments:
      - MetricIntervalLowerBound: 0
        ScalingAdjustment: 1
    EstimatedInstanceWarmup: 1200
    AutoScalingGroupName: !Ref ECSAutoScalingGroup
        
ScaleDownWhenCPUReservationIsLowPolicy:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AdjustmentType: ChangeInCapacity
    PolicyType: StepScaling
    StepAdjustments:
      - MetricIntervalUpperBound: 0
        ScalingAdjustment: -1
    AutoScalingGroupName: !Ref ECSAutoScalingGroup

2. Second, we create the alarms which trigger the policies. If CPU reservation goes above 75%, we launch a new instance and also notify the same SNS topic that sends us emails and SMS messages; if CPU reservation drops below 25%, we terminate one instance. That easy!

CPUReservationIsHighAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    ActionsEnabled: true
    ComparisonOperator: GreaterThanOrEqualToThreshold
    EvaluationPeriods: 1
    MetricName: CPUReservation
    Namespace: AWS/ECS
    Period: 300
    Statistic: Average
    Threshold: 75
    AlarmActions:
      - !Ref ScaleUpWhenCPUReservationIsHighPolicy
      - "arn:aws:sns:REGION:ACCOUNT_ID:SNS_TOPIC_NAME"
    Dimensions:
      - Name: ClusterName
        Value: !Ref ECSCluster

CPUReservationIsLowAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    ActionsEnabled: true
    ComparisonOperator: LessThanOrEqualToThreshold
    EvaluationPeriods: 3
    MetricName: CPUReservation
    Namespace: AWS/ECS
    Period: 300
    Statistic: Average
    Threshold: 25
    AlarmActions:
      - !Ref ScaleDownWhenCPUReservationIsLowPolicy
    Dimensions:
      - Name: ClusterName
        Value: !Ref ECSCluster

The Launch Configuration

One of the most important parts of our base stack is the launch configuration. The Auto Scaling Group uses it to launch new instances, and it describes exactly what type of instances to launch, from which AMI, and how to bootstrap them.

The ECS Launch Configuration looks like this:

ECSLaunchConfiguration:
  Type: AWS::AutoScaling::LaunchConfiguration
  Properties:
    ImageId:  !FindInMap [AWSRegionToAMI, !Ref "AWS::Region", AMI]
    InstanceType: !Ref InstanceType
    SecurityGroups: 
      - !Ref ECSHostSecurityGroup
    IamInstanceProfile: !Ref ECSInstanceProfile
    KeyName: billing
    BlockDeviceMappings:
      - DeviceName: /dev/sda1
        Ebs:
          VolumeSize: '100'
          VolumeType: gp2
    AssociatePublicIpAddress: 'true'
    UserData: !Base64 
      Fn::Join:
        - ''
        - - '<script>
            
            '
          - 'cfn-init.exe -v -s '
          - !Ref 'AWS::StackId'
          - ' -r ECSLaunchConfiguration'
          - ' --region '
          - !Ref 'AWS::Region'
          - '

            '
          - 'cfn-signal.exe -e %ERRORLEVEL% --stack '
          - !Ref 'AWS::StackName'
          - ' --resource ECSAutoScalingGroup '
          - ' --region '
          - !Ref 'AWS::Region'
          - '
            
            '
          - 'Write-Output ECS_CLUSTER='
          - !Ref ECSCluster
          - ' | Out-File -FilePath c:\ecs.config'
          - '

            '
          - </script>
  Metadata:
    AWS::CloudFormation::Init:
      config:
        commands:
          01_import_powershell_module:
            command: !Sub powershell.exe -Command Import-Module ECSTools
          02_add_instance_to_cluster:
            command: !Sub powershell.exe -Command Initialize-ECSAgent -Cluster ${ECSCluster} -EnableTaskIAMRole
          03_create_custom_folder:
            command: !Sub powershell.exe -Command New-Item -ItemType directory -Path c:\\custom
          04_docker_pull_base_image_latest:
            command: !Sub powershell.exe -Command docker pull BASE_IMAGE:latest
          05_docker_pull_dotnet-framework:
            command: !Sub powershell.exe -Command docker pull microsoft/dotnet-framework:4.7.2-runtime-windowsservercore-ltsc2016
        files:
          c:\cfn\cfn-hup.conf:
            content: !Join ['', ['[main]
                    ', stack=, !Ref 'AWS::StackId', '
                    ', region=, !Ref 'AWS::Region', '
                    ']]
          c:\cfn\hooks.d\cfn-auto-reloader.conf:
            content: !Join ['', ['[cfn-auto-reloader-hook]
                    ', 'triggers=post.update
                    ', 'path=Resources.ECSLaunchConfiguration.Metadata.AWS::CloudFormation::Init
                    ', 'action=cfn-init.exe -v -s ', !Ref 'AWS::StackId', ' -r ECSLaunchConfiguration',
                    ' --region ', !Ref 'AWS::Region', '
                    ']]
        services: 
          windows:
            cfn-hup: 
                enabled: 'true'
                ensureRunning: 'true'
                files: 
                    - c:\cfn\cfn-hup.conf
                    - c:\cfn\hooks.d\cfn-auto-reloader.conf

As you can see, in the Metadata we define the settings for the cfn-init helper script (commands to run, services to start up and so on). The actual bootstrapping happens in UserData, where we first run cfn-init.exe and only after that send the ready signal (cfn-signal.exe) to the AWS CloudFormation stack. The stack waits for this signal for the specified timeout; otherwise, it fails.

Of course, we specify the InstanceType we passed as a parameter and the AMI automatically resolved from the Mappings section.

IAM Role and InstanceProfile

The next resources to create are the IAM role and the ECS instance profile. As you have seen, in the launch configuration we referenced an instance profile which has a role associated with it. With this role, any launched instance can register with the ECS cluster, download images from the ECR repository, and so on.

1. The Role:

# This IAM Role is attached to all of the ECS hosts. It is based on the default role
# published here:
# http://docs.aws.amazon.com/AmazonECS/latest/developerguide/instance_IAM_role.html
#
# You can add other IAM policy statements here to allow access from your ECS hosts
# to other AWS services. Please note that this role will be used by ALL containers
# running on the ECS host.

ECSRole:
  Type: AWS::IAM::Role
  Properties: 
    Path: /
    # RoleName: !Sub ${AWS::StackName}-ECSRole-${AWS::Region}
    AssumeRolePolicyDocument: |
      {
        "Statement": [{
          "Action": "sts:AssumeRole",
          "Effect": "Allow",
          "Principal": { 
            "Service": "ec2.amazonaws.com" 
          }
        }]
      }
    Policies: 
      - PolicyName: ecs-service
        PolicyDocument: |
          {
            "Statement": [{
              "Effect": "Allow",
              "Action": [
                "ecs:CreateCluster",
                "ecs:DeregisterContainerInstance",
                "ecs:DiscoverPollEndpoint",
                "ecs:Poll",
                "ecs:RegisterContainerInstance",
                "ecs:StartTelemetrySession",
                "ecs:Submit*",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "ecr:BatchCheckLayerAvailability",
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
                "ecr:GetAuthorizationToken"
              ],
              "Resource": "*"
            }]
          }

2. The Instance Profile:

ECSInstanceProfile: 
  Type: AWS::IAM::InstanceProfile
  Properties:
    Path: /
    Roles: 
      - !Ref ECSRole

Auto Scaling Group and Draining Instance

One of the main problems we came across is this: making sure the Auto Scaling Group DOES NOT terminate any instance UNLESS there are no containers left running on it.

Since the scale-down policy associated with the Auto Scaling Group knows nothing about ECS and the containers running on an instance, we have been in situations where the scaling group simply terminated an instance by force while it still had containers we needed in production running on it.

If we want to replace the production instances with new ones from a new AMI, the same thing happens: old instances are terminated, and containers and in-flight requests are lost.

While ECS knows how to react and replace the lost containers on other instances, we can still have running requests cut off by force or, even worse, services with no containers in service (if all of them were on the terminated instance), in which case the application returns 503 Service Unavailable until a new container comes into service.

ECS is quick to start replacement containers, but we still want 100% uptime, so we had to find a way to put an instance into DRAINING state first, wait until no containers are left and no requests are routed to it anymore, and only then let the Auto Scaling Group terminate it.

To solve this problem, we use an Auto Scaling Group lifecycle hook and an AWS Lambda function (defined as a Python script) which checks all instances in the TERMINATING state in the Auto Scaling Group and looks at the tasks still running on them.

This will be explained in more depth in another article, but for the moment I will paste the script and the resources here:

1. Role for the SNS Topic

SNSLambdaRole:
  Type: "AWS::IAM::Role"
  Properties:
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
      -
        Effect: "Allow"
        Principal:
          Service:
            - "autoscaling.amazonaws.com"
        Action:
          - "sts:AssumeRole"
    ManagedPolicyArns:
    - arn:aws:iam::aws:policy/service-role/AutoScalingNotificationAccessRole
    Path: "/"

2. The execution role for the Lambda function

LambdaExecutionRole:
  Type: "AWS::IAM::Role"
  Properties:
    Policies:
      -
        PolicyName: "lambda-inline"
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
          -
            Effect: "Allow"
            Action:
            - autoscaling:CompleteLifecycleAction
            - logs:CreateLogGroup
            - logs:CreateLogStream
            - logs:PutLogEvents
            - ec2:DescribeInstances
            - ec2:DescribeInstanceAttribute
            - ec2:DescribeInstanceStatus
            - ec2:DescribeHosts
            - ecs:ListContainerInstances
            - ecs:SubmitContainerStateChange
            - ecs:SubmitTaskStateChange
            - ecs:DescribeContainerInstances
            - ecs:UpdateContainerInstancesState
            - ecs:ListTasks
            - ecs:DescribeTasks
            - sns:Publish
            - sns:ListSubscriptions
            Resource: "*"
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
      -
        Effect: "Allow"
        Principal:
          Service:
            - "lambda.amazonaws.com"
        Action:
          - "sts:AssumeRole"
    ManagedPolicyArns:
    - arn:aws:iam::aws:policy/service-role/AutoScalingNotificationAccessRole
    Path: "/"

3. The SNS Topic

ASGSNSTopic:
  Type: "AWS::SNS::Topic"
  Properties:
    Subscription:
      -
        Endpoint:
          Fn::GetAtt:
            - "LambdaFunctionForASG"
            - "Arn"
        Protocol: "lambda"
  DependsOn: "LambdaFunctionForASG"

4. The Lambda function (with the Python script zipped and uploaded to an S3 bucket)

LambdaFunctionForASG:
  Type: "AWS::Lambda::Function"
  Properties:
    Code:
      S3Bucket: "MY_BUCKET_NAME"
      S3Key: "index.zip"
    Description: Lambda code for the autoscaling hook triggers invoked 
                 when autoscaling events of launching and 
                 terminating instance occur
    Handler: "index.lambda_handler"
    Role:
      Fn::GetAtt:
        - "LambdaExecutionRole"
        - "Arn"
    Runtime: "python2.7"
    Timeout: "300"

You can check the Python script at this link: index.zip

5. Permission for SNS to invoke the Lambda function

LambdaInvokePermission:
  Type: "AWS::Lambda::Permission"
  Properties:
    FunctionName: !Ref LambdaFunctionForASG
    Action: lambda:InvokeFunction
    Principal: "sns.amazonaws.com"
    SourceArn: !Ref ASGSNSTopic

6. Subscribe the Lambda function to the SNS topic

LambdaSubscriptionToSNSTopic:
  Type: AWS::SNS::Subscription
  Properties:
    Endpoint:
      Fn::GetAtt:
        - "LambdaFunctionForASG"
        - "Arn"
    Protocol: 'lambda'
    TopicArn: !Ref ASGSNSTopic

7. Create the Lifecycle Hook

ASGTerminateHook:
  Type: "AWS::AutoScaling::LifecycleHook"
  Properties:
    AutoScalingGroupName: !Ref ECSAutoScalingGroup
    DefaultResult: "ABANDON"
    HeartbeatTimeout: "900"
    LifecycleTransition: "autoscaling:EC2_INSTANCE_TERMINATING"
    NotificationTargetARN: !Ref ASGSNSTopic
    RoleARN:
      Fn::GetAtt:
      - "SNSLambdaRole"
      - "Arn"
  DependsOn: "ASGSNSTopic"

Outputs

Finally, our base stack outputs several values which can be used in other stacks. While an output with an identifier and a value is enough for nested stacks, for completely separate stacks we also need to define the Export property, with a name that is unique within our account and region.

Our current outputs are:

Outputs:

VpcId:
  Description: The VPC Id this stack used to create its resources
  Value: !Ref VpcId
  Export:
    Name: !Sub "${AWS::StackName}-VpcId"

Cluster:
  Description: A reference to the ECS cluster
  Value: !Ref ECSCluster
  Export:
    Name: !Sub "${AWS::StackName}-Cluster"

ECSHostSecurityGroup: 
  Description: A reference to the security group for ECS hosts
  Value: !Ref ECSHostSecurityGroup
  Export:
    Name: !Sub "${AWS::StackName}-ECSHostSecurityGroup"

LoadBalancerSecurityGroup:
  Description: A reference to the security group for load balancers
  Value: !Ref LoadBalancerSecurityGroup
  Export:
    Name: !Sub "${AWS::StackName}-LoadBalancerSecurityGroup"

LoadBalancer:
  Description: A reference to the Application Load Balancer
  Value: !Ref LoadBalancer
  Export:
    Name: !Sub "${AWS::StackName}-LoadBalancer"

LoadBalancerUrl:
  Description: The URL of the ALB
  Value: !GetAtt LoadBalancer.DNSName
  Export:
    Name: !Sub "${AWS::StackName}-LoadBalancerUrl"

Listener80:
  Description: A reference to a port 80 listener
  Value: !Ref LoadBalancerListener80
  Export:
    Name: !Sub "${AWS::StackName}-Listener80"

Listener443:
  Description: A reference to a port 443 listener
  Value: !Ref LoadBalancerListener443
  Export:
    Name: !Sub "${AWS::StackName}-Listener443"

If we want to use the cluster name exported by this stack in another stack, it is enough to write something like this in the new stack:

Fn::ImportValue: !Sub "${BaseStackName}-Cluster"
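For example, a separate (non-nested) service stack could take the base stack name as a parameter and import the exported values where needed. This is only an illustrative sketch; the parameter and resource names below are hypothetical:

Parameters:
  BaseStackName:
    Type: String
    Description: Name of the base stack that exports the shared resources

Resources:
  # Example of a per-service target group created in the shared VPC
  ServiceTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      VpcId:
        Fn::ImportValue: !Sub "${BaseStackName}-VpcId"
      Port: 80
      Protocol: HTTP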

For more information you can read this article: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-stack-exports.html

Finally, we launch the stack from the AWS Console > CloudFormation and watch the resources being created automatically.
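If you prefer the command line, launching it could look roughly like the following (the stack and template file names are placeholders; note the escaped commas in the subnet list and the CAPABILITY_IAM flag, which is required because the stack creates IAM roles):

aws cloudformation create-stack \
  --stack-name base-ecs-stack \
  --template-body file://base-stack.yaml \
  --parameters \
      ParameterKey=VpcId,ParameterValue=vpc-4745d02c \
      ParameterKey=SubnetIDs,ParameterValue='subnet-4645d02d\,subnet-4445d02f' \
      ParameterKey=InstanceType,ParameterValue=t3.xlarge \
  --capabilities CAPABILITY_IAM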