Dashboard for Cloud Custodian

An alternate method to get the visuals and build your own dashboards

After using Cloud Custodian for over four years, we can all agree that it is missing one prominent feature: a dashboard. Cloud Custodian has no front end or GUI where you can easily navigate the findings, get a single-pane-of-glass view of all your accounts across all public cloud providers, check policy health, and display charts and guidance that tell the story to management. Because of this shortcoming, users have to integrate with native or third-party tools. We know how powerful Cloud Custodian is, with all its execution modes, filters, and actions. Because it is serverless, running Cloud Custodian is also very cheap. Every organization's environment and configuration is different, but as a rough figure, the monthly cost to run approximately 200 policies is less than $100, depending on how frequently you run them.

In this story, I will go through the high-level architecture of the Cloud Custodian and Sumo Logic setup, which enables us to ingest the Custodian logs and write queries that look for non-compliant items, check policy health, and draw pretty dashboards.

Example: Identify Publicly Accessible AWS Redshift Clusters

policies:
  - name: redshift-cluster-publicly-accessible
    resource: aws.redshift
    comments: |
      Find Redshift clusters that are publicly accessible. This is a
      notify only policy. The policy runs once every 24 hours.
    filters:
      - "tag:redshift-publicly-accessible-exempt": absent
      - PubliclyAccessible: true
    mode:
      type: periodic
      schedule: "rate(24 hours)"
      execution-options:
        output_dir: s3://s3bucket/cclogs/{account_id}/
      runtime: python3.8
    actions:
      # Caution: delete is destructive and does not match the notify-only
      # description above; use a notify action for a notify-only policy.
      - type: delete

Different Components

Depending on your implementation, the basic components of Cloud Custodian are a Lambda function, CloudWatch log groups, and CloudWatch event rules. First, you write a policy in YAML, like the example above that identifies publicly accessible Redshift clusters. The real magic happens when you deploy the policy to the AWS account. Custodian creates the Lambda function that contains the policy. It then creates the CloudWatch log group, where you can check the log streams. Every time the policy runs, it creates a new log stream containing timestamps and debugging messages. You can also see whether any resources matched the filters and were identified as non-compliant. Lastly, Custodian creates the CloudWatch event rule, where you can check how often the policy will run. The rule includes the event rule name, status, event schedule, and target. I have a separate story that discusses how to solve the quota problem for CloudWatch event rules while deploying Cloud Custodian policies.
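To make the three components easier to find in the AWS console, the small Python sketch below derives the names you would look for from a policy name. It assumes the default `custodian-` function prefix and the standard `/aws/lambda/` log group path; treating the event rule name as identical to the function name is also an assumption, and `deployed_names` is a hypothetical helper, not part of Cloud Custodian.

```python
DEFAULT_PREFIX = "custodian-"  # assumed default function prefix


def deployed_names(policy_name: str, prefix: str = DEFAULT_PREFIX) -> dict:
    """Return the resource names to look for after deploying a policy."""
    function_name = f"{prefix}{policy_name}"
    return {
        "lambda_function": function_name,
        # Lambda writes its logs to /aws/lambda/<function name>.
        "log_group": f"/aws/lambda/{function_name}",
        # Assumption: the event rule shares the function's name.
        "event_rule": function_name,
    }


names = deployed_names("redshift-cluster-publicly-accessible")
for component, name in names.items():
    print(f"{component}: {name}")
```

If your deployment overrides the function prefix, pass it as the second argument.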

Architecture

At a high level, the architecture includes the Lambda function where Custodian and the policy reside. The CloudWatch event rule triggers the policy to execute. Custodian looks for items that match the filters and produces its output in GZ format. This output is sent to the S3 bucket defined in the policy. The IAM role used by Custodian must have access to that S3 bucket in order to drop those files. You must also have set up a hosted collector for that AWS account to ingest the Custodian output logs (3 GZ files) from S3 into Sumo Logic (a SIEM solution).

Cloud Custodian output is ingested from s3 into Sumo Logic
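Before building queries, it can help to look at what one of those GZ files contains. The sketch below decompresses a `resources.json.gz` payload and parses it as a JSON array of matched resources. The sample payload is made up for illustration, and `load_resources` is a hypothetical helper; in practice you would first download the object from your own bucket.

```python
import gzip
import json


def load_resources(gz_bytes: bytes) -> list:
    """Decompress a resources.json.gz payload and return the matched resources."""
    return json.loads(gzip.decompress(gz_bytes).decode("utf-8"))


# Stand-in for a downloaded file: one made-up Redshift cluster record.
sample = gzip.compress(
    json.dumps(
        [{"ClusterIdentifier": "example-cluster", "PubliclyAccessible": True}]
    ).encode("utf-8")
)

resources = load_resources(sample)
for resource in resources:
    print(resource["ClusterIdentifier"], resource["PubliclyAccessible"])
```

An empty array means the policy ran but no resource matched its filters.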

Sumo Logic

Sumo Logic is a cloud-based SIEM (Security Information and Event Management) solution. A hosted collector must be configured with an S3 source. This means the hosted collector takes the data from the S3 bucket and ingests it into Sumo Logic. Refer to the Sumo Logic support page for instructions on how to create the collector and source.

Sumo Logic — S3 Source for Hosted Collector

We have separate stories that explain the required components and the corresponding configurations: Ingesting Cloud Custodian Logs into Sumo Logic (Part 1) and Ingesting Cloud Custodian Logs into Sumo Logic (Part 2). Another story covers Cloud Custodian Policy Health Checks.

Dashboard

We created the dashboard below to give high-level counts of various things: 1) total number of AWS accounts, 2) count of low- and high-tier accounts, 3) count of active and suspended accounts, 4) total number of CIS Benchmark policies, 5) total number of cost-saving policies (separated into action vs. notify), 6) total number of security-related policies, etc.

Sumo Logic Dashboard — Illustration Purposes Only

To get these counts, it is very important that you have a policy that counts the resources. In this scenario, we are using a policy that counts Lambda functions. We have also adopted a simplified naming convention that allows us to identify (i) whether the policy is CSP (cost-saving policy), Sec (security-related), or misc (miscellaneous), (ii) whether the policy is notify only (indicated as -n-) or action (indicated as -na-), and (iii) whether it acts on existing or newly created resources. The policy structure is shown below.

Policy Structure
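As a rough illustration of the naming convention, the sketch below splits a policy name into its category and its notify/action marker. The lowercase prefixes `csp`, `sec`, and `misc` and the function `classify` are assumptions made for this example; the marker for existing versus newly created resources is not parsed because its exact form is not spelled out here.

```python
CATEGORIES = {"csp": "cost-saving", "sec": "security", "misc": "miscellaneous"}
MODES = {"n": "notify", "na": "action"}


def classify(policy_name: str):
    """Return category, mode, and subject for a conforming name, else None."""
    parts = policy_name.split("-", 2)
    if len(parts) < 3:
        return None
    category, mode, subject = parts
    category = category.lower()
    if category not in CATEGORIES or mode not in MODES:
        return None
    return {
        "category": CATEGORIES[category],
        "mode": MODES[mode],
        "subject": subject,
    }


print(classify("sec-n-lambda-function-count"))
print(classify("redshift-cluster-publicly-accessible"))  # not in the convention
```

Names that do not follow the convention return `None`, which makes it easy to spot policies that will be missed by name-based dashboard queries.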

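In the query below, you have to enter your own _sourceCategory and _sourceName. The policy that counts the Lambda functions is named “sec-n-lambda-function-count”. We use a regex to extract the FunctionName and then keep only the names that match “cis-”, because the CIS benchmark policy names start with “cis-”. Since each deployed policy has its own Lambda function, counting the matching function names gives the number of CIS policies.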

_sourceCategory="aws/cc/sourcecategory" AND _sourceName=*CustodianLogs/*/policyname/*/*/*/*/resources.json.gz
| parse field=_sourceName "*/*/*/*/*/*/*/*" as clogs, account_id, policies_name, year, month, date, _min, crunlog nodrop
| parse regex "\"FunctionName\":\s\"(?<FunctionName>.+?)\"" multi nodrop
| where FunctionName matches "*cis-*"
| count(FunctionName) group by FunctionName
| fields -_count
| count
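For readers who want to check the logic outside Sumo Logic, here is a minimal Python sketch of what the query above does: split the source name into the same eight fields, extract every FunctionName with the same regex, keep the names containing "cis-", and count the distinct values. The sample path and function names are made up, and both helper functions are hypothetical.

```python
import re

FIELDS = ("clogs", "account_id", "policies_name", "year",
          "month", "date", "_min", "crunlog")
FUNCTION_NAME = re.compile(r'"FunctionName":\s"(?P<FunctionName>.+?)"')


def parse_source_name(source_name: str):
    """Split a source name into the eight fields used by the query."""
    parts = source_name.split("/", len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def count_cis_functions(raw_text: str) -> int:
    """Count distinct FunctionName values that contain 'cis-'."""
    names = set(FUNCTION_NAME.findall(raw_text))
    return len({name for name in names if "cis-" in name})


# Made-up sample data for illustration only.
path = ("CustodianLogs/123456789012/sec-n-lambda-function-count/"
        "2023/01/15/10/resources.json.gz")
raw = ('[{"FunctionName": "custodian-cis-a"}, '
       '{"FunctionName": "custodian-cis-a"}, '
       '{"FunctionName": "custodian-sec-n-x"}, '
       '{"FunctionName": "custodian-cis-b"}]')

print(parse_source_name(path)["account_id"])
print(count_cis_functions(raw))
```

The set plays the role of the `group by` followed by the final `count`: duplicates collapse, so each function is counted once.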

The screenshot below from the Sumo Logic dashboard shows 1) the total number of policies related to missing tags for existing resources (covers all existing resources), 2) the total number of policies related to missing tags for newly created resources (in the past 30 days), and 3) the count of policies related to encryption: for example, the number of encryption policies that have guard rails, the number that are notify only, and the number that cover CIS benchmarks (related to encryption). It is important to note that you have to write an individual policy for each resource type in order to count the resources.

Sumo Logic Dashboard — Illustration Purposes Only

Encryption Related Policies — Dashboard

This dashboard contains all the resources covered by encryption-related policies. For example, some policies look for resources where encryption is not enabled and send a notification, while others have guard rails and take action. This gives you a quick way to identify all non-compliant items.

Sumo Logic Dashboard — Illustration Purposes Only

Below is a sample Sumo Logic query for drawing a dashboard like the one above. Replace the source category, source, source name, collector, and policy name in the query with your own values.


_sourceCategory="source-category" and
_source="resources_file_sourcename" and _collector="collectorname"
AND _sourceName=*cclogs/*/sec-n-redshift-cluster-not-encrypted/*/*/*/*/resources.json.gz
| parse field=_sourceName "*/*/*/*/*/*/*/*" as clogs, account_id, policies_name, year, month, date, _min, crunlog nodrop
| parse regex "\"ClusterIdentifier\":\s\"(?<ClusterIdentifier>.+?)\"" multi nodrop
| count (ClusterIdentifier) group by ClusterIdentifier, account_id
| fields -_count

Publicly Accessible Resources — Dashboard

The below screenshot shows the dashboard for resources that are exposed to the world. You have to write each individual query in Sumo Logic and then add it to the dashboard.

Sumo Logic Dashboard — Illustration Purposes Only

Comparing Historical Data

Example #1: In the example below, we compare historical data to understand how many AMIs existed and were created in the last 4 weeks across all of your AWS accounts (hundreds).


_sourceCategory="YourSourceCategory" and
_source="cloudcustodianresourcefilename" and _collector="YourCollector" AND _sourceName=*CustodianLogs/*/policyname/*/*/*/*/resources.json.gz
| parse field=_sourceName "*/*/*/*/*/*/*/*" as clogs, account_id, policies_name, year, month, date, _min, crunlog nodrop
| parse regex "\"ImageId\":\s\"(?<ImageId>.+?)\"" multi nodrop
| count (ImageId) group by ImageId
| count| compare timeshift 1w 4

The screenshot below shows the count of AMIs every week for the last 4 weeks. The data shown is for illustration purposes only; we manually edited it to show the differences (historical values).

Historical comparison of data (last 30 days)

Example #2: In the example below, we compare historical data to understand how many old EBS volume snapshots were deleted in the last 4 weeks across all of your AWS accounts (hundreds).


_sourceCategory="YourSourceCategory" and
_source="cloudcustodianresourcefilename" and _collector="YourCollector" AND _sourceName=*CustodianLogs/*/policyname/*/*/*/*/resources.json.gz
| parse field=_sourceName "*/*/*/*/*/*/*/*" as clogs, account_id, policies_name, year, month, date, _min, crunlog nodrop
| parse regex "\"SnapshotId\":\s\"(?<SnapshotId>.+?)\"" multi nodrop
| count (SnapshotId) group by SnapshotId
| count| compare timeshift 1w 4

The screenshot below shows the count of old EBS volume snapshots that were deleted every week for the last 4 weeks. The data shown is for illustration purposes only; we manually edited it to show the differences (historical values).

Historical comparison of data (last 30 days)
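Both examples above rely on comparing the current count with the same count shifted back one week at a time, four times. The Python sketch below illustrates that idea on made-up observations: it counts distinct resource IDs in the current window and in each of the four earlier windows. It is an approximation of the concept, not a reimplementation of the Sumo Logic operator, and the function names are hypothetical.

```python
from datetime import date, timedelta


def distinct_in_window(observations, start, end):
    """Count distinct resource IDs seen on days in [start, end)."""
    return len({rid for day, rid in observations if start <= day < end})


def compare_timeshift(observations, start, end,
                      shift=timedelta(weeks=1), periods=4):
    """Count the current window and each window shifted back by `shift`."""
    results = {"current": distinct_in_window(observations, start, end)}
    for i in range(1, periods + 1):
        delta = shift * i
        results[f"shift_{i}"] = distinct_in_window(
            observations, start - delta, end - delta
        )
    return results


# Made-up (day, ImageId) observations for illustration only.
observations = [
    (date(2024, 1, 29), "ami-a"),
    (date(2024, 1, 29), "ami-b"),
    (date(2024, 1, 22), "ami-a"),
    (date(2024, 1, 8), "ami-a"),
    (date(2024, 1, 8), "ami-c"),
    (date(2024, 1, 8), "ami-c"),
]

weekly = compare_timeshift(observations, date(2024, 1, 29), date(2024, 2, 5))
print(weekly)
```

Each `shift_N` entry corresponds to one of the historical series drawn next to the current value on the dashboard panel.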

AWS Resources Inventory — Dashboard

We have a separate story, How to tag at resource and account level in AWS?, that discusses the problem and the solution. The screenshot below from Sumo Logic shows the count of all AWS resources. You can draw a dashboard for each account or for all AWS accounts together (hundreds of accounts). You just need to adjust your query in Sumo Logic.

Screenshot from Sumo Logic — Inventory Dashboard

Other Stories

Cloud Custodian Policy Health Checks

Ingesting Cloud Custodian Logs into Sumo Logic

Cloud Custodian [GZ] Output Files

Upgrade your Cloud Custodian to the latest version

Aakif Shaikh, CISSP, CEH, CHFI, CISA, GWAPT

Over 18 years of experience in a wide variety of technical domains within information security including information assurance, compliance, and risk management.