How to reduce AWS costs

We’ve decided to write a more in-depth article on additional cost-cutting measures, beyond those described in a previous article. Please keep in mind this isn’t the way to do it™: in a perfect world everyone would have cost tags in place on every resource created, preferably via your-favourite-Infrastructure-as-Code strategy, which would make everyone’s life easier.

AWS service cost reductions

In this example, we’ve divided the analysis into seven different areas, which mirror the AWS services that usually rank highest on the billing data (in no specific order):

  • RDS
  • EC2
  • S3
  • Elastic Load Balancer
  • ElastiCache
  • Route 53
  • ACM

In the Reduce AWS costs – example report we can see some of the details, along with the technical explanation of the cost-reduction activities.

VPC Endpoint for Amazon S3

A VPC Endpoint is a service that enables you to reach selected AWS services from within your VPC. It has several advantages: it allows finer-grained access control to your resources and avoids routing traffic through the Internet, which you would otherwise pay for.

In a typical web application, Amazon S3 is used to store static assets, such as images and CSS, to improve your site’s performance and modularity. It also lets you store your assets on a highly durable and available object store (99.999999999% durability).

Types of VPC Endpoints

AWS provides two types of VPC Endpoints:

  • Interface Endpoints – create an ENI in your VPC, with a private IP. The endpoint integrates with your VPC’s internal DNS resolution, which allows you to reach the service from your subnets (see the sketch after this list);
  • Gateway Endpoints – add a gateway target that you reference in your Route Tables. You add Gateway Endpoints per VPC; this is the type used for Amazon S3 (and DynamoDB).
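
For reference, here is a minimal sketch of an Interface Endpoint in the same Terraform 0.11 style (the SQS service and the variable names are placeholders, not part of the setup below). Note that Interface Endpoints carry an hourly and per-GB charge, while Gateway Endpoints for S3 and DynamoDB are free of charge, which is why this article focuses on the latter.

resource "aws_vpc_endpoint" "sqs" {
  vpc_id              = "${var.vpc_id}"
  vpc_endpoint_type   = "Interface"
  service_name        = "com.amazonaws.eu-west-1.sqs"

  # Interface Endpoints live inside your subnets as ENIs,
  # so they take subnet and security group IDs
  subnet_ids          = ["${var.private_subnet_ids}"]
  security_group_ids  = ["${var.endpoint_sg_id}"]

  # resolve the public service hostname to the private ENI IPs
  private_dns_enabled = true
}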

Test the new Setup

Before going live, you should add a new subnet to your current setup/VPC, to make sure your scenario is tested before going into Production. (We are using Terraform version 0.11 in these samples.)

We will start by creating a new subnet, prv-subnet-1, in an existing VPC (vpc_id):

resource "aws_subnet" "prv-subnet-1" {
  vpc_id                  = "${var.vpc_id}"
  cidr_block              = "172.31.60.0/24"
  availability_zone       = "eu-west-1a"
  map_public_ip_on_launch = false
  
  tags = {
    Name        = "prv-subnet-1"
    Terraform   = "true"
  }
}

Now, let’s add a new Route Table, using an existing NAT GW (natgw_id):

# create the route table
resource "aws_route_table" "test" {
  vpc_id = "${var.vpc_id}"

  # add default gw
  route {
    cidr_block      = "0.0.0.0/0"
    nat_gateway_id  = "${var.natgw_id}"
  }

  tags = {
    Name = "prv-eu-west-1a-rtb"
  }
}

# associate with subnet
resource "aws_route_table_association" "assoc" {
  subnet_id      = "${aws_subnet.prv-subnet-1.id}"
  route_table_id = "${aws_route_table.test.id}"
}

You may now add a Gateway VPC Endpoint (vpce) to the new Route Table, which is associated with prv-subnet-1.

resource "aws_vpc_endpoint" "vpce-s3" {
  vpc_id              = "${var.vpc_id}"
  vpc_endpoint_type   = "Gateway"
  # the service region must match your VPC's region (eu-west-1 in this example)
  service_name        = "com.amazonaws.eu-west-1.s3"
  route_table_ids     = ["${aws_route_table.test.id}"]
}

After a few minutes you will see a new route entry whose destination is a prefix list (pl-123456ab) and whose target is the Gateway VPC Endpoint (vpce).
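
To make that route entry easier to verify, you can expose the endpoint’s prefix list ID as a Terraform output; a small sketch (the output name is ours):

# the prefix list ID shows up as the destination of the new route entry
output "s3_vpce_prefix_list_id" {
  value = "${aws_vpc_endpoint.vpce-s3.prefix_list_id}"
}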

At this point, you should check that your access to S3 still works.
The first issue you might encounter is that aws:SourceIp bucket policies, based on public IPs, will no longer work. This is because you’ll now be accessing S3 objects directly from your VPC, rather than through a public IP address.

Let’s take a look at a sample bucket policy based on https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html, which uses an aws:SourceIp condition:

{
  "Version": "2012-10-17",
  "Id": "S3PolicyId1",
  "Statement": [
    {
      "Sid": "IPAllow",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::examplebucket/*",
      "Condition": {
        "IpAddress": {"aws:SourceIp": "54.240.143.188/32"}
      }
    }
  ]
}

The expected change here would be to point aws:SourceIp at your VPC’s CIDR block. However, that will not work! If you review the AWS documentation, it states:

You cannot use an IAM policy or bucket policy to allow access from a VPC IPv4 CIDR range (the private IPv4 address range). VPC CIDR blocks can be overlapping or identical, which may lead to unexpected results. Therefore, you cannot use the aws:SourceIp condition in your IAM policies for requests to Amazon S3 through a VPC endpoint.

So, you’re left with two options to allow/restrict access:

  • Restrict your policy to a VPC (using aws:sourceVpc) or to a specific Gateway Endpoint (using aws:sourceVpce);
  • On the VPC side, only associate the Gateway VPC Endpoint with the route tables of the subnets that need access.

Heads up: even after raising the limits, you cannot have more than 255 Gateway Endpoints per VPC (https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-endpoints).


Here are the changes you should make to the previous example:

{
  "Version": "2012-10-17",
  "Id": "S3PolicyId1",
  "Statement": [
    {
      "Sid": "IPAllow",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::examplebucket/*",
      "Condition": {
        "StringEquals": {
          "aws:sourceVpc": [
            "vpc-aabbccddeeff"
          ]
        }
      }
    }
  ]
}
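
If you prefer the stricter variant of the first option, restricting the bucket to one specific Gateway Endpoint instead of the whole VPC, the same policy can use an aws:sourceVpce condition. Here is a rough Terraform sketch (the bucket name and endpoint ID are placeholders):

resource "aws_s3_bucket_policy" "assets" {
  bucket = "examplebucket"

  # allow access only through one specific Gateway VPC Endpoint
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Id": "S3PolicyId1",
  "Statement": [
    {
      "Sid": "VpceAllow",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::examplebucket/*",
      "Condition": {
        "StringEquals": {
          "aws:sourceVpce": "vpce-0123456789abcdef0"
        }
      }
    }
  ]
}
POLICY
}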


Going into Production

Before going into Production, you should review all your bucket policies, to make sure they follow the approach explained above. To keep your configuration clean and reusable, we’ve created a Terraform module:

# main.tf

resource "aws_vpc_endpoint" "s3" {
  vpc_id              = "${data.aws_vpc.selected.id}"
  vpc_endpoint_type   = "Gateway"
  service_name        = "com.amazonaws.${var.aws_region}.s3"
  route_table_ids     = ["${distinct(data.aws_route_table.selected.*.route_table_id)}"]
}
# data.tf

data "aws_subnet" "selected" {
  count      = "${length(var.subnets)}"
  cidr_block = "${var.subnets[count.index]}"
}

data "aws_route_table" "selected" {
  count     = "${length(var.subnets)}"
  subnet_id = "${data.aws_subnet.selected.*.id[count.index]}"
}

data "aws_vpc" "selected" {
  tags = "${map("Name",var.vpc_name)}"
}
# variables.tf

variable "aws_region" {
  description = "Region to attach S3 VPC Endpoint"
}

variable "tags" {
  description = "Tags to the resources"
  type = "map"
  default = {}
}

variable "vpc_name" {
  description = "VPC Name where to attach the S3 VPC Endpoint"
}

variable "subnets" {
  description = "List of Subnets to add the VPC Endpoint"
  type = "list"
}

Here is a sample usage of the module:

# s3-vpc-endpoint.tf

module "vpc-attach" {
  source        = "modules/terraform-aws-s3-vpc-gateway"
  aws_region    = "${var.aws_region}"
  vpc_name      = "${var.vpc_name}"
  subnets       = ["172.31.50.0/24", "172.31.51.0/24", "172.31.52.0/24"]
  tags = {
    Terraform   = "true"
    Environment = "production"
  }
}

Select a Maintenance Window to apply this module, as the new endpoint will switch the network routes and, consequently, open TCP connections will be closed.

From now on, S3 traffic from your VPC to buckets in the same region will no longer go through the Internet, which reduces cost. And that is, apparently, important.

Reduce costs on AWS by not spending

This looks like something out of Captain Obvious’s journal, but it is in fact one of the ways we’ve been helping customers cut costs on AWS: stopping them from spending in the first place.

Usually we have access to an invoice which looks somewhat like the following:

AWS Service Charges:

  • CloudFront – $2400
  • CloudTrail – $901
  • CloudWatch – $124
  • Data Transfer – $4901
  • DynamoDB – $0
  • Elastic Compute Cloud – $28432
  • Simple Storage Service – $5326
  • Kinesis – $1143
There’s that big Elastic Compute Cloud line, which you can drill down into. However, to do that efficiently (and possibly allocate the costs internally) you need to be able to identify each of the billing components. That’s where tagging comes to your rescue: deploy your infrastructure with the corresponding cost tags (Prod/Dev; Marketing/Finance/etc.) on each resource, and benefit from the results at the end of the following month. To make things really easy, invest some time in Terraform-deploying your resources with the tags in place, which ensures you’re measuring costs right from the start. Use whichever tool you like to collect and analyse costs (Cost Explorer is a good choice).
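
As a rough illustration of the idea (the resource, AMI ID and tag values below are placeholders), a shared tag map merged into every resource keeps the cost tags consistent across your Terraform code:

variable "common_tags" {
  description = "Cost-allocation tags applied to every resource"
  type        = "map"

  default = {
    Terraform   = "true"
    Environment = "production"
    CostCenter  = "marketing"
  }
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0"
  instance_type = "t3.micro"

  # per-resource tags merged on top of the shared cost tags
  tags = "${merge(var.common_tags, map("Name", "app-server-1"))}"
}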

Lastly, a very frequent mistake is responsible for unusually high Data Transfer expenses: go through all existing VPCs and make sure you have Gateway Endpoints for S3 and DynamoDB; otherwise you’ll be needlessly paying for traffic to those AWS services.
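
A DynamoDB Gateway Endpoint follows exactly the same pattern as the S3 one above; a minimal sketch (the variables are placeholders for your own VPC and route table):

resource "aws_vpc_endpoint" "dynamodb" {
  vpc_id            = "${var.vpc_id}"
  vpc_endpoint_type = "Gateway"
  service_name      = "com.amazonaws.${var.aws_region}.dynamodb"
  route_table_ids   = ["${var.route_table_id}"]
}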

For more in-depth cost reduction measures, call us.