Using the Application Load Balancer and WAF to replace CloudFront Security Groups

09/12/2016
Posted in AWS Blog
Ben Bridts

If you’ve been using a Lambda function to update the security groups that grant CloudFront access to your resources, you may have seen problems starting to appear in the last few days. CloudFront now uses 32 IP ranges, and a security group can hold only 50 rules. That seems fine, but if you want to allow both HTTP and HTTPS you need 64 rules (32 ranges × 2 ports), which you’ll have to split over two groups. This may limit you in other ways, as you can attach only 5 security groups to a resource.
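The arithmetic behind that limit can be sketched in a few lines of Python. The figures are the ones quoted in this post (32 ranges, 50 rules per group); the function name is hypothetical and the limits may have changed since the time of writing.

```python
import math

def security_groups_needed(ip_ranges, ports, rules_per_group=50):
    """Return (total_rules, groups) when each range needs one rule per port."""
    total_rules = ip_ranges * len(ports)
    groups = math.ceil(total_rules / rules_per_group)
    return total_rules, groups

# Figures from the post: 32 CloudFront ranges, HTTP (80) and HTTPS (443).
rules, groups = security_groups_needed(32, ports=[80, 443])
print(rules, groups)
```

With HTTP alone the 32 rules still fit in a single group; it is the second port that pushes the total past 50.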

You can replace this Lambda function with the recently launched WAF (Web Application Firewall) support for ALBs (Application Load Balancers).

Here is how to do that (assuming you already have a CloudFront distribution and an Application Load Balancer set up).

CloudFront configuration

  1. Go to the “Origins” tab of the distribution you want to use and edit the origin that points to your ALB.
  2. Add a new Origin Custom Header. You can use any header name and value you like; I opted for “X-Origin-Verify” with a random value.
    [Screenshot: edit origin]
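If you would rather generate the random value than invent one, a minimal sketch could look like this. The function name is hypothetical, and the dictionary shape is my understanding of the CustomHeaders field of an origin in the CloudFront API, so check it against the current documentation before using it in a distribution config.

```python
import secrets

HEADER_NAME = "X-Origin-Verify"  # the name used in this post; any name works

def make_origin_custom_headers(header_name=HEADER_NAME, nbytes=32):
    """Build a CustomHeaders-style structure holding one random secret."""
    secret = secrets.token_hex(nbytes)  # yields 2 * nbytes hex characters
    return {
        "Quantity": 1,
        "Items": [{"HeaderName": header_name, "HeaderValue": secret}],
    }

custom_headers = make_origin_custom_headers()
print(custom_headers["Items"][0]["HeaderName"])
```

Keep the generated value somewhere safe: the same string has to be entered in the WAF condition later on.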

WAF/ALB Configuration

  1. Go to the WAF service page and create a new Web ACL.
  2. Give the ACL a name and select the region and name of your ALB.
    [Screenshot: ACL config]
  3. Create a new “String matching condition”. We’ll create one called “cloudfront-origin-header” that matches when our custom header has the same random value.
    [Screenshot: header-rule]
  4. (Optional) If you want to allow your own IP without the secret header for testing purposes, add an “IP match condition” that matches the IPs you trust. We have named that condition “trusted-ips”.
    [Screenshot: IP condition]
  5. Now we can create a rule to allow requests that match the conditions we created. Click on “Create rule” to create a rule for all requests with our custom header.
    [Screenshot: header rule]
  6. (Optional) Do the same for a rule with the IP condition.
    [Screenshot: trusted IP rule]
  7. Configure the ACL to allow the rules we just created and block all requests that don’t match any rule.
    [Screenshot: ACL create]
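The decision the Web ACL makes after these steps can be modelled in plain Python. This is only a sketch of the logic, not the WAF API: the secret, the trusted network (a documentation range) and all function names are hypothetical placeholders.

```python
import ipaddress

HEADER_NAME = "x-origin-verify"             # header added in the CloudFront step
SECRET = "replace-with-your-random-value"   # hypothetical placeholder
TRUSTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation range

def header_matches(headers):
    """String match condition: the custom header equals the secret exactly."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get(HEADER_NAME) == SECRET

def ip_is_trusted(source_ip):
    """IP match condition: the source address is inside a trusted network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in TRUSTED_NETWORKS)

def evaluate(source_ip, headers):
    """Return 'ALLOW' if either rule matches, otherwise the default 'BLOCK'."""
    if header_matches(headers):
        return "ALLOW"
    if ip_is_trusted(source_ip):
        return "ALLOW"
    return "BLOCK"

print(evaluate("198.51.100.7", {}))                            # BLOCK
print(evaluate("198.51.100.7", {"X-Origin-Verify": SECRET}))   # ALLOW
print(evaluate("203.0.113.10", {}))                            # ALLOW
```

The important part is the default action: anything that matches neither rule is blocked, which is what keeps direct traffic away from the ALB.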

Result

If you browse directly to the ALB from an untrusted IP address, you should now see a 403 page:

[Screenshot: 403 page returned by the ALB]

However, when you add the custom header, or go through CloudFront, you are allowed to visit the website:

[Screenshots: curl to the ALB allowed; curl through CloudFront]
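The same check can be scripted with the Python standard library. This is a rough sketch: the ALB URL and the secret in the usage comment are hypothetical placeholders, and the helper names are my own.

```python
import urllib.error
import urllib.request

def build_request(url, header_value=None, header_name="X-Origin-Verify"):
    """Create a GET request, optionally carrying the origin-verify header."""
    headers = {header_name: header_value} if header_value is not None else {}
    return urllib.request.Request(url, headers=headers)

def status_of(request, timeout=10):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code

def compare_direct_access(alb_url, secret):
    """Return (status without header, status with header) for the ALB."""
    without_header = status_of(build_request(alb_url))
    with_header = status_of(build_request(alb_url, secret))
    return without_header, with_header

# Usage (hypothetical values, run from an untrusted IP):
#   compare_direct_access("http://my-alb.example.com/", "replace-with-your-random-value")
# Per the result described in this post, the first status should be 403.
```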

Caveats

This service is very new, so while setting this up, we ran into some rough edges. We’ve opened a support request so that AWS can look into fixing those.

  • You can’t see the ACLs you created inside a region (WAF for CloudFront is a global service) if you use the CLI. According to the documentation, you should be able to do this if you override the endpoint URL. At the time of writing this gives errors. If you want to check whether this has been fixed, you can use this command: aws waf list-web-acls --endpoint-url https://waf-regional.us-east-1.amazonaws.com
  • Currently there are no metrics available for the WAF inside a region (even though you have to specify a metric name for the rules and conditions you create).
  • If there are no healthy hosts in the target group of your ALB, you will always get a 503 error response, even if the request gets blocked by the WAF.

Comments (4)

  1. Narendra

    Hi All,

    I am currently implementing AWS WAF to block bad requests (4xx) automatically, but the Lambda function does not read the ELB logs and paths in S3, because the Python function is written for CloudFront logs.

    I also tried following the AWS tutorial, but it is integrated with CloudFront, and my site is not compatible with CloudFront. So I want to move the Lambda function (Bad Requests) from CloudFront to ALB, but the function is unable to read the ALB logs and paths.

    https://docs.aws.amazon.com/waf/latest/developerguide/tutorials-4xx-blocking.html

    Can someone help me adapt the Lambda function (Block Bad Requests) from CloudFront access logs to ALB access logs? (I mean the Lambda function should read all access logs from the ALB logs instead of the CloudFront logs.)

    ================lambda function (Block Bad Requests)===================

    '''
    Copyright 2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
    Licensed under the Amazon Software License (the "License"). You may not use this file except in compliance with the License.
    A copy of the License is located at http://aws.amazon.com/asl/ or in the "license" file accompanying this file.
    This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied.
    See the License for the specific language governing permissions and limitations under the License.
    '''

    import json
    import urllib
    import boto3
    import gzip
    import datetime
    import time
    import math

    print('Loading function')

    #======================================================================================================================
    # Constants
    #======================================================================================================================
    # Configurables
    OUTPUT_BUCKET = None
    IP_SET_ID_MANUAL_BLOCK = None
    IP_SET_ID_AUTO_BLOCK = None

    BLACKLIST_BLOCK_PERIOD = None # in seconds
    REQUEST_PER_MINUTE_LIMIT = None
    BLOCK_ERROR_CODES = ['400','403','404','405'] # Error codes to parse logs for

    LIMIT_IP_ADDRESS_RANGES_PER_IP_MATCH_CONDITION = 1000
    API_CALL_NUM_RETRIES = 3

    OUTPUT_FILE_NAME = 'current_outstanding_requesters.json'

    LINE_FORMAT = {
        'date': 0,
        'time' : 1,
        'source_ip' : 4,
        'code' : 8
    }

    #======================================================================================================================
    # Auxiliary Functions
    #======================================================================================================================
    def get_outstanding_requesters(bucket_name, key_name):
        print '[get_outstanding_requesters] Start'

        outstanding_requesters = {}
        outstanding_requesters['block'] = {}
        result = {}
        num_requests = 0
        try:
            #--------------------------------------------------------------------------------------------------------------
            print '[get_outstanding_requesters] \tDownload file from S3'
            #--------------------------------------------------------------------------------------------------------------
            local_file_path = '/tmp/' + key_name.split('/')[-1]
            s3 = boto3.client('s3')
            s3.download_file(bucket_name, key_name, local_file_path)

            #--------------------------------------------------------------------------------------------------------------
            print '[get_outstanding_requesters] \tRead file content'
            #--------------------------------------------------------------------------------------------------------------
            with gzip.open(local_file_path,'r') as content:
                for line in content:
                    try:
                        if line.startswith('#'):
                            continue

                        line_data = line.split('\t')
                        if line_data[LINE_FORMAT['code']] in BLOCK_ERROR_CODES:
                            request_key = line_data[LINE_FORMAT['date']]
                            request_key += '-' + line_data[LINE_FORMAT['time']][:-3]
                            request_key += '-' + line_data[LINE_FORMAT['source_ip']]
                            if request_key in result.keys():
                                result[request_key] += 1
                            else:
                                result[request_key] = 1

                        num_requests += 1

                    except Exception, e:
                        print ("[get_outstanding_requesters] \t\tError to process line: %s"%line)

            #--------------------------------------------------------------------------------------------------------------
            print '[get_outstanding_requesters] \tKeep only outstanding requesters'
            #--------------------------------------------------------------------------------------------------------------
            now_timestamp_str = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            for k, v in result.iteritems():
                k = k.split('-')[-1]
                if v > REQUEST_PER_MINUTE_LIMIT:
                    if k not in outstanding_requesters['block'].keys() or outstanding_requesters['block'][k] [...]

                [...] REQUEST_PER_MINUTE_LIMIT:
                    if k in outstanding_requesters['block'].keys():
                        print "[merge_current_blocked_requesters] \t\tUpdating data of BLOCK %s rule"%k
                        max_v = v['max_req_per_min']
                        if outstanding_requesters['block'][k]['max_req_per_min'] > max_v:
                            max_v = outstanding_requesters['block'][k]['max_req_per_min']
                        outstanding_requesters['block'][k] = { 'max_req_per_min': max_v, 'updated_at': now_timestamp_str }
                    else:
                        prev_updated_at = datetime.datetime.strptime(v['updated_at'], "%Y-%m-%d %H:%M:%S")
                        total_diff_sec = (now_timestamp - prev_updated_at).total_seconds()
                        if total_diff_sec > (BLACKLIST_BLOCK_PERIOD):
                            print "[merge_current_blocked_requesters] \t\tExpired BLOCK %s rule"%k
                        outstanding_requesters['block'][k] = v

        except Exception, e:
            print "[merge_current_blocked_requesters] \tError merging data"

        print "[merge_current_blocked_requesters] End"
        return outstanding_requesters

    def write_output(key_name, outstanding_requesters):
        print "[write_output] Start"

        try:
            current_data = '/tmp/' + key_name.split('/')[-1] + '_LOCAL.json'
            with open(current_data, 'w') as outfile:
                json.dump(outstanding_requesters, outfile)

            s3 = boto3.client('s3')
            s3.upload_file(current_data, OUTPUT_BUCKET, OUTPUT_FILE_NAME, ExtraArgs={'ContentType': "application/json"})

        except Exception, e:
            print "[write_output] \tError to write output file"

        print "[write_output] End"

    def waf_get_ip_set(ip_set_id):
        response = None
        waf = boto3.client('waf')

        for attempt in range(API_CALL_NUM_RETRIES):
            try:
                response = waf.get_ip_set(IPSetId=ip_set_id)
            except Exception, e:
                print e
                delay = math.pow(2, attempt)
                print "[waf_get_ip_set] Retrying in %d seconds..." % (delay)
                time.sleep(delay)
            else:
                break
        else:
            print "[waf_get_ip_set] Failed ALL attempts to call API"

        return response

    def waf_update_ip_set(ip_set_id, updates_list):
        response = None

        if updates_list != []:
            waf = boto3.client('waf')
            for attempt in range(API_CALL_NUM_RETRIES):
                try:
                    response = waf.update_ip_set(IPSetId=ip_set_id,
                        ChangeToken=waf.get_change_token()['ChangeToken'],
                        Updates=updates_list)
                except Exception, e:
                    delay = math.pow(2, attempt)
                    print "[waf_update_ip_set] Retrying in %d seconds..." % (delay)
                    time.sleep(delay)
                else:
                    break
            else:
                print "[waf_update_ip_set] Failed ALL attempts to call API"

        return response

    def get_ip_set_already_blocked():
        print "[get_ip_set_already_blocked] Start"
        ip_set_already_blocked = []
        try:
            if IP_SET_ID_MANUAL_BLOCK != None:
                response = waf_get_ip_set(IP_SET_ID_MANUAL_BLOCK)
                if response != None:
                    for k in response['IPSet']['IPSetDescriptors']:
                        ip_set_already_blocked.append(k['Value'])
        except Exception, e:
            print "[get_ip_set_already_blocked] Error getting WAF IP Set"
            print e

        print "[get_ip_set_already_blocked] End"
        return ip_set_already_blocked

    def is_already_blocked(ip, ip_set):
        result = False

        try:
            for net in ip_set:
                ipaddr = int(''.join([ '%02x' % int(x) for x in ip.split('.') ]), 16)
                netstr, bits = net.split('/')
                netaddr = int(''.join([ '%02x' % int(x) for x in netstr.split('.') ]), 16)
                mask = (0xffffffff << (32 - int(bits))) & 0xffffffff

                if (ipaddr & mask) == (netaddr & mask):
                    result = True
                    break
        except Exception, e:
            pass

        return result

    def update_waf_ip_set(outstanding_requesters, ip_set_id, ip_set_already_blocked):
        print "[update_waf_ip_set] Start"

        counter = 0
        try:
            if ip_set_id == None:
                print "[update_waf_ip_set] Ignore process when ip_set_id is None"
                return

            updates_list = []
            waf = boto3.client('waf')

            #--------------------------------------------------------------------------------------------------------------
            print "[update_waf_ip_set] \tTruncate [if necessary] list to respect WAF limit"
            #--------------------------------------------------------------------------------------------------------------
            top_outstanding_requesters = {}
            for key, value in sorted(outstanding_requesters.items(), key=lambda kv: kv[1]['max_req_per_min'], reverse=True):
                if counter < LIMIT_IP_ADDRESS_RANGES_PER_IP_MATCH_CONDITION:
                    if not is_already_blocked(key, ip_set_already_blocked):
                        top_outstanding_requesters[key] = value
                        counter += 1
                else:
                    break

            #--------------------------------------------------------------------------------------------------------------
            print "[update_waf_ip_set] \tRemove IPs that are not in current outstanding requesters list"
            #--------------------------------------------------------------------------------------------------------------
            response = waf_get_ip_set(ip_set_id)
            if response != None:
                for k in response['IPSet']['IPSetDescriptors']:
                    ip_value = k['Value'].split('/')[0]
                    if ip_value not in top_outstanding_requesters.keys():
                        updates_list.append({
                            'Action': 'DELETE',
                            'IPSetDescriptor': {
                                'Type': 'IPV4',
                                'Value': k['Value']
                            }
                        })
                    else:
                        # Dont block an already blocked IP
                        top_outstanding_requesters.pop(ip_value, None)

            #--------------------------------------------------------------------------------------------------------------
            print "[update_waf_ip_set] \tBlock remaining outstanding requesters"
            #--------------------------------------------------------------------------------------------------------------
            for k in top_outstanding_requesters.keys():
                updates_list.append({
                    'Action': 'INSERT',
                    'IPSetDescriptor': {
                        'Type': 'IPV4',
                        'Value': "%s/32"%k
                    }
                })

            #--------------------------------------------------------------------------------------------------------------
            print "[update_waf_ip_set] \tCommit changes in WAF IP set"
            #--------------------------------------------------------------------------------------------------------------
            response = waf_update_ip_set(ip_set_id, updates_list)

        except Exception, e:
            print "[update_waf_ip_set] Error to update waf ip set"
            print e

        print "[update_waf_ip_set] End"
        return counter

    #======================================================================================================================
    # Lambda Entry Point
    #======================================================================================================================
    def lambda_handler(event, context):
        print '[lambda_handler] Start'
        bucket_name = event['Records'][0]['s3']['bucket']['name']
        key_name = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')

        try:
            if key_name == OUTPUT_FILE_NAME:
                print '[lambda_handler] \tIgnore processing output file'
                return

            #--------------------------------------------------------------------------------------------------------------
            print "[lambda_handler] \tReading (if necessary) CloudFormation output values"
            #--------------------------------------------------------------------------------------------------------------
            global OUTPUT_BUCKET
            global IP_SET_ID_MANUAL_BLOCK
            global IP_SET_ID_AUTO_BLOCK
            global BLACKLIST_BLOCK_PERIOD
            global REQUEST_PER_MINUTE_LIMIT

            if (OUTPUT_BUCKET == None or IP_SET_ID_MANUAL_BLOCK == None or
                IP_SET_ID_AUTO_BLOCK == None or BLACKLIST_BLOCK_PERIOD == None or
                REQUEST_PER_MINUTE_LIMIT == None):

                outputs = {}
                cf = boto3.client('cloudformation')
                stack_name = context.invoked_function_arn.split(':')[6].rsplit('-', 2)[0]
                response = cf.describe_stacks(StackName=stack_name)
                for e in response['Stacks'][0]['Outputs']:
                    outputs[e['OutputKey']] = e['OutputValue']

                if OUTPUT_BUCKET == None:
                    OUTPUT_BUCKET = outputs['CloudFrontAccessLogBucket']
                if IP_SET_ID_MANUAL_BLOCK == None:
                    IP_SET_ID_MANUAL_BLOCK = outputs['ManualBlockIPSetID']
                if IP_SET_ID_AUTO_BLOCK == None:
                    IP_SET_ID_AUTO_BLOCK = outputs['AutoBlockIPSetID']
                if BLACKLIST_BLOCK_PERIOD == None:
                    BLACKLIST_BLOCK_PERIOD = int(outputs['WAFBlockPeriod']) # in seconds
                if REQUEST_PER_MINUTE_LIMIT == None:
                    REQUEST_PER_MINUTE_LIMIT = int(outputs['RequestThreshold'])

            print "[lambda_handler] \t\tOUTPUT_BUCKET = %s"%OUTPUT_BUCKET
            print "[lambda_handler] \t\tIP_SET_ID_MANUAL_BLOCK = %s"%IP_SET_ID_MANUAL_BLOCK
            print "[lambda_handler] \t\tIP_SET_ID_AUTO_BLOCK = %s"%IP_SET_ID_AUTO_BLOCK
            print "[lambda_handler] \t\tBLACKLIST_BLOCK_PERIOD = %d"%BLACKLIST_BLOCK_PERIOD
            print "[lambda_handler] \t\tREQUEST_PER_MINUTE_LIMIT = %d"%REQUEST_PER_MINUTE_LIMIT

            #--------------------------------------------------------------------------------------------------------------
            print "[lambda_handler] \tReading input data and get outstanding requesters"
            #--------------------------------------------------------------------------------------------------------------
            outstanding_requesters, num_requests = get_outstanding_requesters(bucket_name, key_name)

            #--------------------------------------------------------------------------------------------------------------
            print "[lambda_handler] \tMerge with current blocked requesters"
            #--------------------------------------------------------------------------------------------------------------
            outstanding_requesters = merge_current_blocked_requesters(key_name, outstanding_requesters)

            #--------------------------------------------------------------------------------------------------------------
            print "[lambda_handler] \tUpdate new blocked requesters list to S3"
            #--------------------------------------------------------------------------------------------------------------
            write_output(key_name, outstanding_requesters)

            #--------------------------------------------------------------------------------------------------------------
            print "[lambda_handler] \tUpdate WAF IP Set"
            #--------------------------------------------------------------------------------------------------------------
            ip_set_already_blocked = get_ip_set_already_blocked()
            num_blocked = update_waf_ip_set(outstanding_requesters['block'], IP_SET_ID_AUTO_BLOCK, ip_set_already_blocked)

            cw = boto3.client('cloudwatch')
            response = cw.put_metric_data(
                Namespace='WAFReactiveBlacklist-%s'%OUTPUT_BUCKET,
                MetricData=[
                    {
                        'MetricName': 'IPBlocked',
                        'Timestamp': datetime.datetime.now(),
                        'Value': num_blocked,
                        'Unit': 'Count'
                    },
                    {
                        'MetricName': 'NumRequests',
                        'Timestamp': datetime.datetime.now(),
                        'Value': num_requests,
                        'Unit': 'Count'
                    }
                ]
            )

            return outstanding_requesters
        except Exception as e:
            raise e
        print '[main] End'

    ========================================================================

    Thanks
    Narendra

  2. Pritesh

    Thanks for the post.

    I tried this approach for fronting a regional API Gateway with a CloudFront distribution. However, if I have different header values at the CloudFront distribution and the API Gateway WAF ACL, it still allows the requests forwarded through CloudFront.

    Could you confirm if you’re able to reproduce this issue?

    Thanks!

    • Ben Bridts

      Hi Pritesh,

      I just tested adding a regional WAF to a regional API Gateway and that works as expected:

      Calling it without a header gives me a 403 Forbidden, and calling it with the header lets my request through.

      Are you sure you have the right default action configured in WAF?

      Kind regards,
      Ben
