#LetsData is now a trusted AWS Partner
#LetsData is now a trusted APN (AWS Partner Network) Partner, and we are now listed in the AWS Partner Solution Finder.
We've also earned this shiny new AWS Partner badge!
The AWS Partner Network membership has many advantages, and the validation criteria are thorough. A company has to follow AWS best practices and undergo a Foundational Technical Review (FTR), which is designed to verify best practices across categories such as auditing, logging, access controls, security, etc.
While we've built #LetsData following software best practices and have taken care to eliminate anything that customers and developers would find nagging, the FTR uncovered a range of things that take our offering from a solid startup service to an enterprise-grade one.
The value add of the simple FTR process is immense IMHO. In addition, the process emphasizes tools and services (Management, Governance, Security, Identity and Compliance) that are not commonly used when developing a product but are lifesavers when things go wrong. The FTR becomes a great ramp-up on the operational side of things.
Let's look at the tools and services that we discovered as part of the FTR process and are now using continuously to shore up our compliance and governance infrastructure.
Security Hub
AWS Security Hub automates AWS security checks and centralizes security alerts. It's fairly easy to get started with and use. It comes with a few default benchmarks that one can enable, and it finds all sorts of violations and deviations from best practices that make for nice fixes to the product. An extremely useful service IMHO!
AWS Config
AWS Config assesses, audits, and evaluates the configurations of your AWS resources. The way it works is that you enable the AWS Config recorder, which starts recording resource configurations and any changes to them (for example, an S3 bucket ACL change).
Interestingly, it allows users (and services such as Security Hub) to define the configuration rules they are interested in. These rules are continuously evaluated, and the results are forwarded to the default bus on Amazon EventBridge. EventBridge lets you build custom processing logic for each rule and then forward the results to the whole gamut of AWS foundational services (SNS, Lambda, CloudWatch, etc.).
This is a powerful construct, because the service now becomes a foundational building block: you can detect any configuration change you are interested in and invoke custom actions or pipelines. I was really amazed at the end to end here (a large part of which is the richness of EventBridge), but the overall experience is very complete. A job really well done!
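To make the Config-to-EventBridge flow concrete, here is a minimal sketch: an EventBridge-style event pattern for a Config compliance change, plus a naive matcher to illustrate how filtering works. The rule name is an illustrative example, and the matcher implements only a small subset of EventBridge's real semantics.

```python
# Sketch: an EventBridge event pattern that matches AWS Config compliance
# change events for one example rule. The rule name is illustrative.
event_pattern = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {
        "configRuleName": ["s3-bucket-public-read-prohibited"],
        "newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]},
    },
}

def matches(pattern, event):
    """Naive subset of EventBridge matching: every pattern key must exist in
    the event; dict values recurse, list values mean 'value is one of these'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True

# A trimmed sample of the event shape AWS Config publishes to the default bus.
sample_event = {
    "source": "aws.config",
    "detail-type": "Config Rules Compliance Change",
    "detail": {
        "configRuleName": "s3-bucket-public-read-prohibited",
        "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
    },
}

print(matches(event_pattern, sample_event))  # → True
```

In the real service, a rule with this pattern would forward matching events to any configured target (SNS, Lambda, CloudWatch Logs) with no polling code on your side.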
AWS Trusted Advisor
AWS Trusted Advisor provides recommendations that help you follow AWS best practices, reduce costs, improve performance, and improve security. This service looks at the more practical side of operations: cost optimization, performance, fault tolerance, service limits, and security as well.
We didn't find many issues when we ran this; some of that could be because:
Trusted Advisor has tiers that are tied to the AWS account's customer support plan. We were on the basic plan, so we didn't get to run the full gamut of checks.
we don't have real use cases running at this time, so cost, performance, and service-limit issues probably wouldn't surface yet
there are comprehensive checks for AWS services such as EC2, EBS, and RDS that we are not currently using.
But overall it looks like a great service to keep continuously running, and once we have some customer traction, I might just upgrade to premium support so that I can get the full gamut of Trusted Advisor checks!
Amazon GuardDuty
Intelligent threat detection to protect your AWS accounts and workloads. This service looks at a bunch of different logs to flag issues that could possibly be security threats. In our case, it flagged IAM root credential usage, bucket policy changes, and changes to the CloudTrail logging configuration. Additional controls for EKS, RDS, and malware probably didn't run (since we are not on those services yet). We're keeping this running; let's see what threats we discover over time.
AWS CloudTrail
Track user activity and API usage. This again is a foundational service that integrates with every other service in AWS and audits what actions are being done and by whom. I've used it to debug an issue or two every once in a while. However, the FTR's emphasis on CloudTrail enablement, CloudTrail log security, and paranoia over having these logs delivered to a different account underscores the importance of the data this service logs. It's your complete central AWS audit system that can help trace every action that took place in the account. Here are some notable observations:
the data CloudTrail captures is very important and unlocks newer use cases in compliance, security, and more.
CloudTrail auto-generates insights from this data, such as "the CloudWatch Logs service is seeing a 465% increase in CreateLogStream API calls at this time", etc.
it integrates with data lakes and event data stores, and allows SQL query semantics over this data.
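As a small illustration of tracing an action, here is a hedged sketch of the request one would hand to CloudTrail's LookupEvents API (via boto3) to find who changed a bucket ACL. The boto3 call itself is commented out so the snippet runs without AWS credentials; the event name used is one example of many.

```python
from datetime import datetime, timedelta

# Sketch: parameters for CloudTrail LookupEvents to trace recent
# PutBucketAcl calls (who changed a bucket ACL, and when).
lookup_kwargs = {
    "LookupAttributes": [
        {"AttributeKey": "EventName", "AttributeValue": "PutBucketAcl"},
    ],
    "StartTime": datetime.utcnow() - timedelta(days=7),
    "EndTime": datetime.utcnow(),
}

# The actual call, commented out so this runs offline:
# import boto3
# cloudtrail = boto3.client("cloudtrail")
# for event in cloudtrail.lookup_events(**lookup_kwargs)["Events"]:
#     print(event["Username"], event["EventTime"], event["EventName"])

print(lookup_kwargs["LookupAttributes"][0]["AttributeValue"])  # → PutBucketAcl
```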
Amazon EventBridge
Although not in the FTR services toolset, I did want to mention this service since it amazed me when I used it in #LetsData use cases. The concept of event buses, with custom rules triggered via events on the buses or by cron, is very powerful.
The filtering mechanisms are simple to implement, and the complexity from the large number of different event types is beautifully abstracted in the console, which makes all the sample events available for testing.
And the auto-delivery integrations to SNS, Lambda, and CloudWatch complete the end to end really well.
And the fact that services publish automatically on the default event bus, without you having to configure different services, grant permissions, or take other similar actions, makes it simple to use.
For example, to set up the "alarm when an S3 bucket becomes public", all one had to do was copy the name of the Security Hub config rule for a public S3 bucket and create an event filter on the default bus that triggers when the rule matches and is in alarm. Delivering these events to CloudWatch Logs and configuring a CloudWatch alarm on them completed the end to end without much fuss. Great vision, folks!
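The wiring just described could be sketched roughly as below. The rule name and log group ARN are placeholders of our choosing, and the finding filter is illustrative; the actual Security Hub finding fields you match on would come from the sample events in the console.

```python
import json

# Sketch: an EventBridge rule on the default bus that fires when Security Hub
# reports a failed (e.g. public-bucket) finding. Filter fields are illustrative.
pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {
        "findings": {
            "Compliance": {"Status": ["FAILED"]},
        }
    },
}

rule_kwargs = {
    "Name": "alarm-on-public-s3-bucket",  # name is our choice
    "EventPattern": json.dumps(pattern),
    "State": "ENABLED",
}

# The actual calls, commented out so this runs offline. The target ARN is a
# placeholder CloudWatch Logs log group; a metric filter + alarm on that log
# group completes the end to end.
# import boto3
# events = boto3.client("events")
# events.put_rule(**rule_kwargs)
# events.put_targets(Rule=rule_kwargs["Name"], Targets=[{
#     "Id": "to-cloudwatch-logs",
#     "Arn": "arn:aws:logs:us-east-1:123456789012:log-group:/security/alerts",
# }])

print(json.loads(rule_kwargs["EventPattern"])["source"])  # → ['aws.securityhub']
```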
AWS Backup
AWS Backup is a cost-effective, fully managed, policy-based service that simplifies data protection at scale and centralizes backup and restore. We enabled backups for all S3 buckets in the account. This included buckets with customer data as well as #LetsData buckets. Great centralized backup service; it did require some finagling to get working. Ideally I'd want this to be as easy and facile as DynamoDB point-in-time recovery (that feature works like a charm out of the box!)
What we fixed
Okay, now that we know what was run, here are the issues that we found and fixed:
CloudFront and S3 Bucket interactions
CloudFront and S3 integrations are easy to configure, and developers can get started in a few clicks. However, this also easily leaves security bugs if not configured properly. Following Security Hub's findings, we disabled public access on the S3 buckets behind CloudFront (a remnant from when we didn't have CloudFront and used public objects), configured default objects, and added controls that disallow discovering unintended content in the S3 bucket via path-based attacks. Benign but important. Great set of checks!
S3 Buckets
S3, being central to almost every use case on AWS, gets a larger number of security checks to make sure bucket configurations are correct. Here is what we enforced:
we locked our S3 buckets to SSL-only transport (no man-in-the-middle attacks)
enabled block public access on all buckets so that nothing leaks, even accidentally
enabled versioning on all our buckets
we even added alerts that alarm when a bucket becomes public! Interestingly, the way the alert was built uncovered a very powerful foundational pattern on AWS IMHO (see Amazon EventBridge above)
Again, great set of standardizations!
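The first three controls above can be sketched as the request payloads boto3 would send; the bucket name is a placeholder, and the actual calls are commented out so the snippet runs offline. The SSL-only lock is the standard deny-statement on the `aws:SecureTransport` condition key.

```python
import json

BUCKET = "example-letsdata-bucket"  # placeholder name

# Deny any request that does not arrive over TLS.
ssl_only_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            f"arn:aws:s3:::{BUCKET}",
            f"arn:aws:s3:::{BUCKET}/*",
        ],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}

# Block all four flavors of public access at the bucket level.
public_access_block = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# The actual calls, commented out so this runs offline:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(ssl_only_policy))
# s3.put_public_access_block(
#     Bucket=BUCKET, PublicAccessBlockConfiguration=public_access_block)
# s3.put_bucket_versioning(
#     Bucket=BUCKET, VersioningConfiguration={"Status": "Enabled"})

print(all(public_access_block.values()))  # → True
```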
Identity and Access Management
Security Hub flags issues in the AWS account's identity, access, and credentials usage. Things such as:
disabling root security credentials - I had resisted this for the longest time, but if the system is built using security best practices, such as IAM roles granting temporary access credentials, then disabling the root security credentials is a validation that your system is secure!
adding hardware MFA to the root account, software MFA to IAM accounts with console access, enforcing strong passwords, and so on. While these controls do hamper mobility, I believe that when it comes to running secure enterprise services, such controls are a must. They also prevent the operator errors that occur from time to time, since we are only human. Recommendations such as rotating keys and credentials every 90 days, auditing access frequently, and disabling employee access on departure push you to make sure you have processes around these.
standardizing cross-account access via IAM roles (we were already doing this) and a randomized, unguessable externalId when granting temporary credentials, as an additional security layer against the confused deputy problem: https://aws.amazon.com/blogs/apn/securely-accessing-customer-aws-accounts-with-cross-account-iam-roles (we added this)
secret and credential storage in Secrets Manager (we were already doing this)
we created an incident response runbook following the whitepaper: https://docs.aws.amazon.com/whitepapers/latest/aws-security-incident-response-guide/runbooks.html - we normally know what to do when an incident occurs, and these processes do formalize over time IMO as the issue collateral builds up. However, having a process defined now was useful in making sure we had everything in place to respond to any incidents.
We are using IAM according to the best practices! Great validation to have!
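The externalId mitigation mentioned above could be sketched as below: generate a random, unguessable externalId per customer and require it when assuming the customer's cross-account role. The role ARN is a placeholder, and the actual STS call is commented out so the snippet runs offline.

```python
import secrets

# Sketch: a random externalId, generated once per customer and stored
# alongside the customer record; the customer's role trust policy requires it.
external_id = secrets.token_urlsafe(32)

assume_role_kwargs = {
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleAccessRole",  # placeholder
    "RoleSessionName": "letsdata-task",
    "ExternalId": external_id,       # must match the trust policy condition
    "DurationSeconds": 3600,         # temporary credentials only
}

# The actual call, commented out so this runs offline:
# import boto3
# sts = boto3.client("sts")
# creds = sts.assume_role(**assume_role_kwargs)["Credentials"]

print(len(external_id) >= 32)  # → True
```

Because the externalId is random and known only to the two parties, a third party cannot trick the service into assuming a role it was not given the secret for, which is the essence of the confused deputy mitigation.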
Data Security
We had already classified the different types of data in #LetsData and published the classification as part of our privacy policy, and data at rest and data in transit were being encrypted. We audited this again as part of the FTR and found no violations. We are encrypted at rest and in transit!
CloudTrail
We audited which events we had enabled for CloudTrail. Additionally, we enabled log security and have the logs delivered to a different AWS account. This gives us a complete auditing solution in case we need to trace any issues!
Backup and Restore
The FTR asks you to do your Recovery Point Objective (RPO) and Recovery Time Objective (RTO) calculations to define the service's recovery objectives (SLAs).
Since the durability of DynamoDB and S3 has so many 9s, we had taken it for granted that data loss would be only an academic concern. However, an important component of data loss is functional bugs in our code or in partner code.
For example, if someone pushes a code bug that updates database rows to incorrect values, your data durability is toast. Granted, it's a bug and the fault of the code's author, but finger pointing and blame are pointless in this scenario. As a resilient service, if this happens, what will we do about it?
This logic had us enabling AWS Backup on all the S3 buckets and DynamoDB point-in-time recovery (PITR) on all the DynamoDB tables. We defined backup schedules and retention durations and came up with approximate RPOs and RTOs. We also ran resiliency tests (simulate data loss, then restore from backups)! And because we've built this into the service, #LetsData customers get this for free and can be assured that we have their data covered according to enterprise data best practices. Thanks FTR - we had completely missed this one!
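The PITR half of the above could be sketched as below. The table names are placeholders, the boto3 call is commented out so the snippet runs offline, and the RTO figure is an assumed drill result, not a measurement.

```python
# Sketch: enable DynamoDB point-in-time recovery on every table.
tables = ["example-datasets", "example-tasks"]  # placeholder table names

pitr_kwargs = [
    {
        "TableName": name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }
    for name in tables
]

# The actual calls, commented out so this runs offline:
# import boto3
# dynamodb = boto3.client("dynamodb")
# for kwargs in pitr_kwargs:
#     dynamodb.update_continuous_backups(**kwargs)

# With PITR, restores can target any second in the recovery window, so the
# RPO is effectively seconds; the RTO is dominated by the restore itself.
rpo_seconds = 1
rto_minutes_estimate = 60  # assumed from a restore drill, not measured

print(len(pitr_kwargs))  # → 2
```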
So that is what we have been up to in the last week or so! We are a trusted AWS Partner and have been validated for following the AWS foundational best practices for software running on AWS! You can engage with us either directly or through our partner page, knowing that AWS vouches somewhat for what we have to offer!
Originally posted as LinkedIn Article on the #LetsData page - moved to this substack to backfill the blog