Sensitive Data Protection on Amazon Web Services
Automate the sensitive data discovery process and manage data assets on a single platform
Overview
Sensitive Data Protection on Amazon Web Services is a solution that helps enterprises automate sensitive data discovery and manage data assets on a single platform. It provides a web application that discovers and manages sensitive data across multiple Amazon Web Services accounts.
By using this solution, customers can improve data security, compliance, and data management. The solution allows customers to create data catalogs, define sensitive data using built-in or customized data identification rules, and scan data sources (Amazon S3, Amazon RDS, data lakes with Amazon Glue Data Catalog as the metadata catalog, self-built databases on Amazon EC2, and databases on other clouds or IDCs) using a classification template. After the scan, customers get an overview of their data assets and can browse the data at the column level to understand where the data is, what it is, and how sensitive it is. With this information, customers can take appropriate actions to secure sensitive data and comply with regulations such as GDPR, HIPAA, and PIPL.
Sensitive Data Protection on Amazon Web Services uses Amazon Glue Data Catalog, Amazon Glue Crawler, and the Amazon Glue PII Detection feature as key components to produce data catalogs and perform sensitive data discovery jobs for each Amazon Web Services account. It centralizes information such as data catalogs and data discovery results in a single place and provides a dashboard and detailed reports, making it easier for customers to manage their sensitive data protection efforts.
Benefits
The solution automatically discovers data assets across multiple accounts and generates a data catalog.
The solution utilizes machine learning and pattern matching technologies to automatically identify sensitive data. The solution offers built-in sensitive data types for you to choose from. In addition, you can define custom sensitive data types based on business needs.
The solution provides regular scanning capabilities and offers a visualization dashboard and downloadable reports to assist customers in achieving ongoing compliance and providing technical justification.
Built on Amazon Web Services, the solution seamlessly integrates with other services. The open-source code of the solution makes it easy for clients to integrate or customize it.
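To make the pattern-matching idea behind custom sensitive data types more concrete, here is a minimal Python sketch. It is not the solution's implementation. The identifier names, the regular expressions, the `classify_column` helper, and the 50% match threshold are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A hypothetical custom data identifier: a label plus a regex."""
    name: str
    pattern: re.Pattern


# Illustrative rules only; a real deployment would define its own.
IDENTIFIERS = [
    Identifier("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    Identifier("EMPLOYEE_ID", re.compile(r"EMP-\d{6}")),
]


def classify_column(values, identifiers=IDENTIFIERS, threshold=0.5):
    """Return the identifier names that fully match at least `threshold`
    of the non-empty sampled values in a column."""
    sample = [v for v in values if v]
    if not sample:
        return []
    labels = []
    for ident in identifiers:
        hits = sum(1 for v in sample if ident.pattern.fullmatch(v))
        if hits / len(sample) >= threshold:
            labels.append(ident.name)
    return labels
```

For example, `classify_column(["a@example.com", "b@example.org", "n/a"])` labels the column as `EMAIL` because two of the three sampled values match. Sampling with a threshold, rather than requiring every value to match, tolerates placeholder or malformed entries in otherwise sensitive columns.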
Technical details
The Application Load Balancer distributes the solution's frontend web UI assets hosted in Amazon Lambda.
An identity provider is used for user authentication.
The Amazon Lambda functions are packaged as Docker images and stored in Amazon ECR (Elastic Container Registry).
The backend Lambda function is a target for the Application Load Balancer.
The backend Lambda function invokes Amazon Step Functions in monitored accounts for sensitive data detection.
In the Amazon Step Functions workflow, an Amazon Glue Crawler takes inventory of the structured data sources and stores the results in the Glue database as metadata tables. An Amazon SageMaker Processing Job preprocesses unstructured files in S3 buckets and stores their metadata in the Glue database. An Amazon Glue Job then detects sensitive data.
Amazon Step Functions sends an Amazon SQS message to the detection job queue after the Glue job has run.
A Lambda function processes the messages from Amazon SQS.
The detection results are queried with Amazon Athena and saved to a MySQL instance in Amazon RDS.
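The message-processing step above can be sketched as a small Lambda handler. This is a hedged illustration rather than the solution's actual code. The outer `Records` / `body` shape is the standard event that Lambda receives from an SQS trigger, but the message fields (`account_id`, `job_id`, `status`) and the function names are assumptions made for this example.

```python
import json


def parse_detection_messages(event):
    """Extract job results from an SQS-triggered Lambda event.

    The body fields used here are hypothetical; the real message
    schema of the solution is not described in this document.
    """
    results = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        results.append({
            "account_id": body["account_id"],
            "job_id": body["job_id"],
            "status": body.get("status", "UNKNOWN"),
        })
    return results


def handler(event, context):
    """Lambda entry point: summarize the detection jobs in this batch.

    Querying the results and persisting them to the database would
    follow here; that part is omitted from the sketch.
    """
    results = parse_detection_messages(event)
    succeeded = sum(1 for r in results if r["status"] == "SUCCEEDED")
    return {"processed": len(results), "succeeded": succeeded}
```

Keeping the parsing in a pure function separate from the handler makes it possible to test the logic locally with a hand-built event, without any Amazon Web Services credentials.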
Related content
Amazon Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.