General
Q: What is Amazon S3?
Amazon S3 is storage for the Internet. It’s a simple storage service that offers software developers a highly-scalable, reliable, and low-latency data storage infrastructure at very low costs.
Q: What can I do with Amazon S3?
Amazon S3 provides a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. Using this web service, developers can easily build applications that make use of Internet storage. Since Amazon S3 is highly scalable and you only pay for what you use, developers can start small and grow their application as they wish, with no compromise on performance or reliability. It is designed to be highly flexible: Store any type and amount of data that you want; read the same piece of data a million times or only for emergency disaster recovery; build a simple FTP application, or a sophisticated web application such as the Amazon.com retail web site. Amazon S3 frees developers to focus on innovation, not figuring out how to store their data.
Q: What are the technical benefits of Amazon S3?
Amazon S3 was carefully engineered to meet the requirements for scalability, reliability, speed, low cost, and simplicity that Amazon’s own internal developers require. Amazon S3 passes these same benefits on to any external developer. More information about the Amazon S3 design requirements is available on the Amazon S3 features page.
Q: What can developers do now that they could not before?
Until now, a sophisticated and scalable data storage infrastructure like Amazon’s has been beyond the reach of small developers. Amazon S3 enables any developer to leverage Amazon’s own benefits of massive scale with no up-front investment or performance compromises. Developers are now free to innovate knowing that no matter how successful their businesses become, it will be inexpensive and simple to ensure their data is quickly accessible, always available, and secure.
Q: How much data can I store?
The total volume of data and number of objects you can store are unlimited. Individual Amazon S3 objects can range in size from 1 byte to 5 terabytes. The largest object that can be uploaded in a single PUT is 5 gigabytes. For objects larger than 100 megabytes, customers should consider using the Multipart Upload capability.
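For example, here is a minimal sketch, using the AWS SDK for Python (boto3), of uploading a large file with Multipart Upload applied automatically above a 100 MB threshold; the bucket name, key, and file name are placeholders.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload for any file larger than 100 MB,
# splitting it into 100 MB parts (bucket and file names are illustrative).
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
)

s3.upload_file("backup.tar", "example-bucket", "backups/backup.tar", Config=config)
```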
Q: How can I delete large numbers of objects?
You can use Multi-Object Delete to delete large numbers of objects from Amazon S3. This feature allows you to send multiple object keys in a single request to speed up your deletes. Amazon does not charge you for using Multi-Object Delete.
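As an illustration, a single Multi-Object Delete request can remove up to 1,000 keys; the following boto3 sketch assumes a placeholder bucket and key list.

```python
import boto3

s3 = boto3.client("s3")

keys_to_delete = ["logs/2024-01-01.gz", "logs/2024-01-02.gz"]  # illustrative keys

# One request deletes up to 1,000 objects; Quiet suppresses per-key results.
response = s3.delete_objects(
    Bucket="example-bucket",
    Delete={"Objects": [{"Key": k} for k in keys_to_delete], "Quiet": True},
)
print(response.get("Errors", []))
```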
Q: Does Amazon store its own data in Amazon S3?
Yes. Developers within Amazon use Amazon S3 for a wide variety of projects. Many of these projects use Amazon S3 as their authoritative data store, and rely on it for business-critical operations.
Q: How is Amazon S3 data organized?
Amazon S3 is a simple key-based object store. When you store data, you assign a unique object key that can later be used to retrieve the data. Keys can be any string, and can be constructed to mimic hierarchical attributes.
Q: How do I interface with Amazon S3?
Amazon S3 provides a simple, standards-based REST web services interface that is designed to work with any Internet-development toolkit. The operations are intentionally made simple to make it easy to add new distribution protocols and functional layers.
Q: How reliable is Amazon S3?
Amazon S3 gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. S3 Standard is designed for 99.99% availability and Standard - IA is designed for 99.9% availability. Both are backed by the Amazon S3 Service Level Agreement.
Q: What happens if traffic from my application suddenly spikes?
Amazon S3 was designed from the ground up to handle traffic for any Internet application. Pay-as-you-go pricing and unlimited capacity ensure that your incremental costs don’t change and that your service is not interrupted. Amazon S3’s massive scale enables us to spread load evenly, so that no individual application is affected by traffic spikes.
Q: What is the consistency model for Amazon S3?
Amazon S3 delivers strong read-after-write consistency automatically for any storage request, without changes to performance or availability, without sacrificing regional isolation for applications, and at no additional cost.
Any request for S3 storage is now strongly consistent. After a successful write of a new object or an overwrite of an existing object, any subsequent read request immediately receives the latest version of the object. S3 also provides strong consistency for list operations, so after a write, you can immediately perform a listing of the objects in a bucket with any changes reflected.
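For instance, the following boto3 sketch (with a placeholder bucket and key) relies on strong read-after-write consistency: the GET and LIST issued immediately after the PUT return the newly written object.

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-bucket", "reports/latest.csv"  # placeholders

s3.put_object(Bucket=bucket, Key=key, Body=b"id,value\n1,42\n")

# Read-after-write: the GET immediately returns the bytes just written.
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# List-after-write: the new key appears immediately in a listing.
listing = s3.list_objects_v2(Bucket=bucket, Prefix="reports/")
print(body, [o["Key"] for o in listing.get("Contents", [])])
```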
Q: Why does strong read-after-write consistency help me?
Strong read-after-write consistency helps you when you need to immediately read an object after a write. For example, strong read-after-write consistency helps for workloads like Apache Spark and Apache Hadoop where you often read and list immediately after writing objects. High-performance computing workloads also benefit in that when an object is overwritten and then read many times simultaneously, strong read-after-write consistency provides assurance that the latest write is read across all reads. These applications automatically and immediately benefit from strong read-after-write consistency. S3 strong consistency also reduces costs by removing the need for extra infrastructure to provide strong consistency.
Q: How secure is my data in Amazon S3?
Amazon S3 is secure by default. Upon creation, only you have access to Amazon S3 buckets that you create, and you have complete control over who has access to your data. Amazon S3 supports user authentication to control access to data. You can use access control mechanisms such as bucket policies to selectively grant permissions to users and groups of users. The Amazon S3 console highlights your publicly accessible buckets, indicates the source of public accessibility, and also warns you if changes to your bucket policies or bucket ACLs would make your bucket publicly accessible. You should enable Block Public Access for all accounts and buckets that you do not want publicly accessible.
You can securely upload/download your data to Amazon S3 via SSL endpoints using the HTTPS protocol. Amazon S3 automatically encrypts all object uploads (as of January 5, 2023) to your bucket. Alternatively, you can use your own encryption libraries to encrypt data before storing it in Amazon S3.
For more information, visit the S3 User Guide.
Q: What options do I have for encrypting data stored on Amazon S3?
Amazon S3 encrypts all new data uploads to any bucket. Amazon S3 applies S3-managed server-side encryption (SSE-S3) as the base level of encryption to all object uploads (as of January 5, 2023). SSE-S3 provides a fully managed solution where Amazon Web Services handles key management and key protection using multiple layers of security. You should continue to use SSE-S3 if you prefer to have Amazon Web Services manage your keys. Additionally, you can choose to encrypt data using SSE-C, SSE-KMS, DSSE-KMS, or a client library such as the Amazon S3 Encryption Client. Each of these options enables you to store sensitive data encrypted at rest in Amazon S3.
SSE-C lets Amazon S3 perform the encryption and decryption of your objects while you retain control of the keys used to encrypt objects. With SSE-C, you don’t need to implement or use a client-side library to perform the encryption and decryption of objects you store in Amazon S3, but you do need to manage the keys that you send to Amazon S3 to encrypt and decrypt objects. Use SSE-C if you want to maintain your own encryption keys, but don’t want to implement or leverage a client-side encryption library.
SSE-KMS lets Amazon Key Management Service (KMS) manage your encryption keys. Using Amazon KMS to manage your keys provides several additional benefits. With Amazon KMS, there are separate permissions for the use of the KMS key, providing an additional layer of control as well as protection against unauthorized access to your objects stored in Amazon S3. Amazon KMS provides an audit trail so you can see who used your key to access which object and when, as well as view failed attempts to access data from users without permission to decrypt the data. Also, Amazon KMS provides additional security controls to support customer efforts to comply with PCI-DSS, HIPAA/HITECH, and FedRAMP industry requirements.
DSSE-KMS simplifies the process of applying two layers of encryption to your data, without having to invest in infrastructure required for client-side encryption. Each layer of encryption uses a different implementation of the 256-bit Advanced Encryption Standard with Galois Counter Mode (AES-GCM) algorithm and is vetted and accepted for use on top-secret workloads. DSSE-KMS uses Amazon KMS to generate data keys, and lets Amazon KMS manage your encryption keys. With Amazon KMS, there are separate permissions for the use of the KMS key, providing an additional layer of control and protection against unauthorized access to your objects stored in Amazon S3. Amazon KMS provides an audit trail so you can see who used your key to access which object and when, as well as view failed attempts to access data from users without permission to decrypt the data. Also, Amazon KMS provides additional security controls to support customer efforts to comply with industry requirements like the PCI-DSS.
Using an encryption client library, such as the Amazon S3 Encryption Client, you retain control of the keys and complete the encryption and decryption of objects client-side using an encryption library of your choice. Some customers prefer full end-to-end control of the encryption and decryption of objects; that way, only encrypted objects are transmitted over the internet to Amazon S3. Use a client-side library if you want to maintain control of your encryption keys, are able to implement or use a client-side encryption library, and need to have your objects encrypted before they are sent to Amazon S3 for storage.
For more information on using Amazon S3 SSE-S3, SSE-C, or SSE-KMS, please refer to the topic on Using Encryption in the Amazon S3 User Guide.
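As an illustrative sketch of how these options are selected per request with boto3 (the bucket name and KMS key ID below are placeholders), you set the server-side encryption type on the PUT; omitting it uses the SSE-S3 default.

```python
import boto3

s3 = boto3.client("s3")

# Default: SSE-S3 is applied automatically, no extra parameters required.
s3.put_object(Bucket="example-bucket", Key="doc1.txt", Body=b"...")

# SSE-KMS with a customer managed key (placeholder key ID).
s3.put_object(
    Bucket="example-bucket",
    Key="doc2.txt",
    Body=b"...",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="1234abcd-12ab-34cd-56ef-1234567890ab",
)

# DSSE-KMS (two layers of encryption) uses the "aws:kms:dsse" value.
s3.put_object(
    Bucket="example-bucket",
    Key="doc3.txt",
    Body=b"...",
    ServerSideEncryption="aws:kms:dsse",
    SSEKMSKeyId="1234abcd-12ab-34cd-56ef-1234567890ab",
)
```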
Q: How can I control access to my data stored on Amazon S3?
Customers can use a number of mechanisms for controlling access to Amazon S3 resources, including Identity and Access Management (IAM) policies, bucket policies, access point policies, access control lists (ACLs), Query String Authentication, Virtual Private Cloud (VPC) endpoint policies, and Amazon S3 Block Public Access.
IAM
IAM enables organizations with multiple employees to create and manage multiple users under a single Amazon Web Services account. With IAM policies, customers can grant IAM users fine-grained control to their Amazon S3 bucket or objects while also retaining full control over everything the users do.
Bucket and access point policies
With bucket policies and access point policies, customers can define rules which apply broadly across all requests to their Amazon S3 resources, such as granting write privileges to a subset of Amazon S3 resources. Customers can also restrict access based on an aspect of the request, such as HTTP referrer and IP address.
ACLs
Amazon S3 supports S3's original access control method, access control lists (ACLs). With ACLs, customers can grant specific permissions (i.e., READ, WRITE, FULL_CONTROL) to specific users for an individual bucket or object. For customers who prefer to use policies exclusively for access control, Amazon S3 offers the S3 Object Ownership feature to disable ACLs. You can use S3 Inventory to review ACL usage in your buckets before enabling S3 Object Ownership when migrating to IAM-based bucket policies.
Query String Authentication
With Query String Authentication, customers can create a URL to an Amazon S3 object which is only valid for a limited time. For more information on the various access control policies available in Amazon S3, refer to the Access Control topic in the Amazon S3 User Guide.
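For example, a presigned (query-string authenticated) URL can be generated with boto3; the bucket, key, and expiry below are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# URL is valid for one hour; anyone holding it can GET the object until it expires.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-bucket", "Key": "reports/latest.csv"},
    ExpiresIn=3600,
)
print(url)
```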
Amazon VPC
When customers create an Amazon VPC endpoint, they can attach an endpoint policy to it that controls access to the Amazon S3 resources to which they are connecting. Customers can also use Amazon S3 bucket policies to control access to buckets from specific endpoints or specific VPCs.
S3 Block Public Access
Amazon S3 Block Public Access provides settings for access points, buckets, and accounts to help customers manage public access to Amazon S3 resources. With S3 Block Public Access, account administrators and bucket owners can easily set up centralized controls to limit public access to their Amazon S3 resources that are enforced regardless of how the resources are created. All new buckets have Block Public Access turned on by default.
Learn more about policies and permissions in the Amazon IAM User Guide.
Service Level Agreement (SLA)
Q: Does Amazon S3 offer a Service Level Agreement (SLA)?
Yes. The Amazon S3 SLA provides for a service credit if a customer's monthly uptime percentage is below our service commitment in any billing cycle. More information can be found in the Service Level Agreement.
Billing
Q: How much does Amazon S3 cost?
With Amazon S3, you pay only for what you use. There is no minimum fee.
We charge less where our costs are less. There is no Data Transfer charge for data transferred within the Amazon S3 Amazon Web Services China (Beijing) Region or Amazon Web Services China (Ningxia) Region via a COPY request. There is no Data Transfer charge for data transferred between Amazon EC2 and Amazon S3 within the Amazon Web Services China (Beijing) Region or within the Amazon Web Services China (Ningxia) Region. Data transferred between Amazon EC2 and Amazon S3 across two Amazon Web Services Regions (for example, between the Amazon EC2 Amazon Web Services China (Ningxia) Region and the Amazon S3 Amazon Web Services China (Beijing) Region) is charged at the Internet transfer rate specified on the pricing section of the billing console.
Q: How will I be charged and billed for my use of Amazon S3?
There are no set-up fees or commitments to begin using the service. At the end of the month, you will be billed for that month's usage. You can view your charges for the current billing period at any time on the Amazon Web Services Management Console, by logging into your Amazon Web Services account, and clicking “Account Activity” under “Your Web Services Account”.
Q: How am I charged for accessing Amazon S3 through the Amazon Web Services Management Console?
Normal Amazon S3 pricing applies when accessing the service through the Amazon Web Services Management Console. To provide an optimized experience, the Amazon Web Services Management Console may proactively execute requests. Also, some interactive operations result in more than one request to the service.
Q: Do your prices include taxes?
Our prices are exclusive of applicable taxes and duties, including VAT and applicable sales tax.
Security
S3 Access Grants
Q: What are Amazon S3 Access Grants?
Amazon S3 Access Grants map identities in directories such as Active Directory, or Amazon Identity and Access Management (IAM) principals, to datasets in S3. This helps you manage data permissions at scale by automatically granting S3 access to end-users based on their corporate identity. Additionally, S3 Access Grants log end-user identity and the application used to access S3 data in Amazon CloudTrail. This helps to provide a detailed audit history down to the end-user identity for all access to the data in your S3 buckets.
Q: Why should I use S3 Access Grants?
You should use S3 Access Grants if your S3 data is shared and accessed by many users and applications, where some of their identities are in your corporate directory such as Okta or Entra ID, and you need a scalable, simple, and auditable way to grant access to these S3 datasets at scale.
Q: How do I get started with S3 Access Grants?
You can get started with S3 Access Grants in four steps. First, configure an S3 Access Grants instance. In this step, if you want to use S3 Access Grants with users and groups in your corporate directory, enable IAM Identity Center and connect S3 Access Grants to your Identity Center instance. Second, register a location with S3 Access Grants. During this process, you give S3 Access Grants an IAM role that is used to create temporary S3 credentials that users and applications can use to access S3. Third, define permission grants that specify who can access what. Finally, at the time of access, have your application request temporary credentials from S3 Access Grants and use Access Grants-vended credentials to access S3.
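The following boto3 sketch outlines those steps against the S3 Access Grants (s3control) API; the account ID, role ARN, bucket, prefixes, and grantee are placeholders, and the operation and parameter names should be verified against the current API reference.

```python
import boto3

ACCOUNT_ID = "111122223333"          # placeholder account ID
s3control = boto3.client("s3control")

# Step 1: create the S3 Access Grants instance for the account.
s3control.create_access_grants_instance(AccountId=ACCOUNT_ID)

# Step 2: register a location and the IAM role used to vend temporary credentials.
# (Location scope format per the S3 Access Grants documentation; verify locally.)
location = s3control.create_access_grants_location(
    AccountId=ACCOUNT_ID,
    LocationScope="s3://example-bucket/projects/",
    IAMRoleArn="arn:aws-cn:iam::111122223333:role/access-grants-location-role",
)

# Step 3: grant READ access on a sub-prefix to an IAM principal.
s3control.create_access_grant(
    AccountId=ACCOUNT_ID,
    AccessGrantsLocationId=location["AccessGrantsLocationId"],
    AccessGrantsLocationConfiguration={"S3SubPrefix": "team-a/*"},
    Grantee={
        "GranteeType": "IAM",
        "GranteeIdentifier": "arn:aws-cn:iam::111122223333:role/team-a-app",
    },
    Permission="READ",
)
```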
Q: What types of identity are supported for S3 Access Grants permission grants?
S3 Access Grants supports two kinds of identities: enterprise user or group identities from Amazon IAM Identity Center, and Amazon IAM principals including IAM users and roles. When you use S3 Access Grants with IAM Identity Center, you can define data permissions on the basis of directory group memberships. IAM Identity Center is an Amazon service that connects to commonly used identity providers, including Entra ID, Okta, Ping, and others. In addition to supporting directory identities via IAM Identity Center, S3 Access Grants also supports permission rules for IAM principals, including IAM users and roles. This is for use cases where you either manage a custom identity federation not through IAM Identity Center but via IAM and SAML assertions (example implementation), or manage application identities based on IAM principals, and still want to use S3 Access Grants for its scalability and auditability.
Q: What are the different access levels that S3 Access Grants offers?
Access Grants offers three access levels: READ, WRITE, and READWRITE. READ allows you to view and retrieve objects from S3. WRITE allows you to write to and delete from S3. READWRITE allows you to do both READ and WRITE.
Q: Can I customize my access levels?
No. You can only use the three pre-defined access levels (READ/WRITE/READWRITE) that S3 Access Grants offers.
Q: Are there any limits on S3 Access Grants?
Yes. You can create up to 100,000 grants per S3 Access Grants instance, and up to 1,000 locations per S3 Access Grants instance.
Q: Is there any performance impact for data access when I use S3 Access Grants?
No. Once you have obtained the credentials from S3 Access Grants, you can reuse unexpired credentials for subsequent requests. For these subsequent requests, there is no additional latency for requests authenticated via S3 Access Grants credentials compared to other methods.
Q: What other Amazon Web Services services are required to use S3 Access Grants?
If you intend to use S3 Access Grants for directory identities, you will need to set up Amazon IAM Identity Center first. Amazon IAM Identity Center helps you create or connect your workforce identities, whether the identities are created and stored in Identity Center, or in an external third-party Identity Provider. Refer to the Identity Center documentation for the setup process. Once you have set up the Identity Center instance, you can connect the instance to S3 Access Grants. Thereafter, S3 Access Grants relies on Identity Center to retrieve user attributes such as group membership to evaluate requests and make authorization decisions.
Q: Does S3 Access Grants require client-side modifications?
Yes. Whereas today, you initialize your S3 client with IAM credentials associated with your application (for example, IAM role credentials for EC2 or IAM Roles Anywhere; or using long-term IAM user credentials), your application will need to instead obtain S3 Access Grants credentials first before initializing the S3 client. These S3 Access Grants credentials will be specific to the authenticated user in your application. Once the S3 client is initialized with these S3 Access Grants credentials, it can make requests for S3 data as usual using the credentials.
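A minimal sketch of that flow with boto3 is shown below; the account ID, target, bucket, and key are placeholders, and the GetDataAccess operation name should be verified against the current API reference.

```python
import boto3

s3control = boto3.client("s3control")

# Ask S3 Access Grants for temporary credentials scoped to the requested data.
grant = s3control.get_data_access(
    AccountId="111122223333",                      # placeholder account ID
    Target="s3://example-bucket/projects/team-a/*",
    Permission="READ",
)
creds = grant["Credentials"]

# Initialize the S3 client with the vended credentials and read data as usual.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
obj = s3.get_object(Bucket="example-bucket", Key="projects/team-a/report.csv")
```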
Q: Since client-side modifications are necessary, what Amazon Web Services services and third-party applications are integrated with S3 Access Grants out-of-box today?
S3 Access Grants today already integrates with EMR and open-source Spark via the S3A connector. In addition, S3 Access Grants integrates with third-party software including Immuta and Informatica so that you can centralize permission management. And finally, S3 Access Grants supports Terraform and CloudFormation for you to programmatically provision S3 Access Grants.
Q: Is S3 Access Grants a replacement for Amazon IAM?
No. S3 Access Grants does not replace IAM and in fact works well with your existing IAM-based data protection strategies (encryption, network, data-perimeter rules). S3 Access Grants is built on IAM primitives and enables you to express finer-grained S3 permissions at scale.
Q: Does S3 Access Grants work with KMS?
Yes. To utilize S3 Access Grants for objects encrypted with KMS, bucket owners include the necessary KMS permissions in the IAM role that they grant to S3 Access Grants as part of the location registration. S3 Access Grants can then subsequently utilize that IAM role to access the KMS-encrypted objects in the buckets.
Q: How do I view and manage my S3 Access Grants permission grants?
You can use either the S3 Access Grants console in the Amazon Web Services Management Console or the SDK and CLI APIs to view and manage your S3 Access Grants permissions.
Q: Can you grant public access to data with S3 Access Grants?
No, you cannot grant public access to data with S3 Access Grants.
Q: How can I audit requests that were authorized via S3 Access Grants?
The request by the application to initiate a data access session with S3 Access Grants will be recorded in CloudTrail. CloudTrail will distinguish the identity of the user making the request and the application identity accessing the data on the user’s behalf. This helps you audit end-user identity of who accessed what data at what time.
Q: How is S3 Access Grants priced?
S3 Access Grants is charged based on the number of requests to S3 Access Grants. See the pricing page for details.
Q: What is the relationship between S3 Access Grants and Lake Formation?
Amazon Lake Formation is for use cases where you need to manage access for tabular data (e.g., Glue tables), where you might want to enforce row- and column-level access. S3 Access Grants is for managing direct S3 permissions to data such as unstructured data, including videos, images, and logs.
Q: Is S3 Access Grants integrated with IAM Access Analyzer?
No. S3 Access Grants is not integrated with IAM Access Analyzer at this time. You can’t yet use IAM Access Analyzer to analyze S3 Access Grants permission grants. Customers can audit S3 Access Grants directly by going to the S3 Access Grants page in the S3 console, or programmatically using the ListAccessGrants API.
S3 Access Points
Q: What are Amazon S3 Access Points?
Today, customers manage access to their S3 buckets using a single bucket policy that controls access for hundreds of applications with different permission levels.
Amazon S3 Access Points simplifies managing data access at scale for applications using shared datasets on S3. With S3 Access Points, you can now easily create hundreds of access points per bucket, representing a new way of provisioning access to shared datasets. Access points provide a customized path into a bucket, with a unique hostname and access policy that enforces the specific permissions and network controls for any request made through the access point. S3 Access Points can be associated with buckets in the same account or in another trusted account. Learn more by visiting the user guide.
Q: Why should I use an access point?
S3 Access Points simplify how you manage data access to your shared datasets on S3. You no longer have to manage a single, complex bucket policy with hundreds of different permission rules that need to be written, read, tracked, and audited. With S3 Access Points, you can create access points or delegate permissions to trusted accounts to create cross-account access points on your bucket. This permits access to shared datasets with policies tailored to the specific application.
Using S3 Access Points, you can decompose one large bucket policy into separate, discrete access point policies for each application that needs to access the shared dataset. This makes it simpler to focus on building the right access policy for an application, while not having to worry about disrupting what any other application is doing within the shared dataset. You can also create a Service Control Policy (SCP) and require that all access points be restricted to a Virtual Private Cloud (VPC), firewalling your data to within your private networks.
Q: How do S3 Access Points work?
Each S3 Access Point is configured with an access policy specific to a use case or application, and a bucket can have thousands of access points. For example, you can create an access point for your S3 bucket that grants access for groups of users or applications for your data lake. An access point can support a single user or application, or groups of users or applications within and across accounts, allowing separate management of each access point.
Additionally, you can delegate permissions to trusted accounts to create cross-account access points on your bucket. The cross-account access points don’t grant access to data until you are granted permissions from the bucket owner. The bucket owner always retains ultimate control on the data and must update the bucket policy to authorize requests from the cross-account access point. Visit the user guide for a sample bucket policy.
Each access point is associated with a single bucket and contains a network origin control, and a Block Public Access control. You can create an access point with a network origin control that only permits storage access from your Virtual Private Cloud, a logically isolated section of the Amazon Web Services cloud. You can also create an access point with the access point policy configured to only allow access to objects with defined prefixes or to objects with specific tags.
You can access data in shared buckets through an access point in one of two ways. For S3 object operations, you can use the access point ARN in place of a bucket name. For requests requiring a bucket name in the standard S3 bucket name format, you can use an access point alias instead. Aliases for S3 Access Points are automatically generated and are interchangeable with S3 bucket names anywhere you use a bucket name for data access. Every time you create an access point for a bucket, S3 automatically generates a new Access Point Alias. For the full set of compatible operations and Amazon Web Services services, visit the S3 Documentation.
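As an illustration with boto3 (the account ID, VPC ID, Region, and names are placeholders), an access point restricted to a VPC can be created and then addressed by its ARN in place of the bucket name.

```python
import boto3

s3control = boto3.client("s3control")
s3 = boto3.client("s3")

# Create an access point whose network origin is restricted to a VPC.
s3control.create_access_point(
    AccountId="111122223333",                 # placeholder account ID
    Name="analytics-ap",
    Bucket="example-bucket",
    VpcConfiguration={"VpcId": "vpc-0abc1234"},
)

# Requests can then use the access point ARN where a bucket name is expected.
ap_arn = "arn:aws-cn:s3:cn-north-1:111122223333:accesspoint/analytics-ap"
obj = s3.get_object(Bucket=ap_arn, Key="datalake/part-0001.parquet")
```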
Q: Is there a quota on how many access points I can create?
By default, you can create 10,000 access points per Region per account on buckets in your account and cross-account. There is no hard limit on the number of access points per Amazon Web Services account. Visit Service Quotas to request an increase in this quota.
Q: How do I write access point policies?
You can write an access point policy just like a bucket policy, using IAM rules to govern permissions and the access point ARN in the policy document.
Q: How is restricting access to specific VPCs using network origin controls on access points different from restricting access to VPCs using the bucket policy?
You can continue to use bucket policies to limit bucket access to specified VPCs. Access points provide an easier, auditable way to lock down all or a subset of data in a shared dataset to VPC-only traffic for all applications in your organization using API controls. You can use an Amazon Organizations Service Control Policy (SCP) to mandate that any access point created in your organization set the “network origin control” API parameter value to “vpc”. Then, any new access point created automatically restricts data access to VPC-only traffic. No additional access policy is required to make sure that data requests are processed only from specified VPCs.
Q: Can I enforce a “No internet data access” policy for all access points in my organization?
Yes. To enforce a “No internet data access” policy for access points in your organization, make sure all access points enforce VPC-only access. To do so, write an Amazon Web Services Organizations Service Control Policy (SCP) that only supports the value “vpc” for the “network origin control” parameter in the create_access_point() API. If you previously created any internet-facing access points, they can be removed. You will also need to modify the bucket policy in each of your buckets to further restrict internet access directly to your bucket through the bucket hostname. Since other Amazon Web Services services may be directly accessing your bucket, make sure you set up access to allow the Amazon Web Services services you want by modifying the policy to permit these Amazon Web Services services. Refer to the S3 documentation for examples of how to do this.
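A hedged sketch of such an SCP, expressed as a Python dictionary to be attached through Amazon Web Services Organizations, is shown below; it denies CreateAccessPoint unless the network origin is VPC, and the condition key should be validated against the current documentation.

```python
import json

# Deny creation of any access point whose network origin is not "VPC".
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyNonVpcAccessPoints",
            "Effect": "Deny",
            "Action": "s3:CreateAccessPoint",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"s3:AccessPointNetworkOrigin": "VPC"}
            },
        }
    ],
}
print(json.dumps(scp, indent=2))
```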
Q: Can I completely disable direct access to a bucket using the bucket hostname?
Not currently, but you can attach a bucket policy that rejects requests not made using an access point. Refer to the S3 Documentation for more details.
Q: Can I replace or remove an access point from a bucket?
Yes. When you remove an access point, any access to the associated bucket through other access points, and through the bucket hostname, will not be disrupted.
Q: What is the cost of Amazon S3 Access Points?
There is no additional charge for access points or buckets that use access points. Usual Amazon S3 request rates apply.
Q: How do I get started with S3 Access Points?
You can start creating S3 Access Points on new buckets as well as existing buckets through the Amazon Web Services Management Console, the Amazon Web Services Command Line Interface (CLI), the Application Programming Interface (API), and the Amazon Web Services Software Development Kit (SDK) client. To learn more about S3 Access Points, visit the user guide.
Data Protection
Q: How durable is Amazon S3?
Amazon S3 is designed to provide 99.999999999% durability of objects over a given year. This durability level corresponds to an average annual expected loss of 0.000000001% of objects. For example, if you store 10,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000,000 years.
Q: How is Amazon S3 designed to achieve 99.999999999% durability?
Amazon S3 redundantly stores your objects on multiple devices across multiple facilities in the Amazon S3 Region you designate. The service is designed to sustain concurrent device failures by quickly detecting and repairing any lost redundancy. When processing a request to store data, the service will redundantly store your object across multiple facilities before returning SUCCESS. Amazon S3 also regularly verifies the integrity of your data using checksums.
Q: What checksums does Amazon S3 support for data integrity checking?
Amazon S3 uses a combination of Content-MD5 checksums, secure hash algorithms (SHAs), and cyclic redundancy checks (CRCs) to verify data integrity. S3 performs these checksums on data at rest and repairs any disparity using redundant data. In addition, S3 calculates checksums on all network traffic to detect alterations of data packets when storing or retrieving data.
Choose from four supported checksum algorithms to check data integrity on your upload and download requests: SHA-1, SHA-256, CRC32, or CRC32C, depending on your application needs. Automatically calculate and verify checksums as you store or retrieve data from S3, and access the checksum information at any time using the GetObjectAttributes S3 API or an S3 Inventory report. Calculating checksums as you stream data into S3 saves you time since you can verify and transmit your data in a single pass. Using checksums for data validation is also considered a best practice for data durability, and these capabilities increase performance and reduce costs.
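For example, with boto3 you can ask S3 to compute and store a checksum on upload and retrieve it later without downloading the object; the bucket and key below are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# S3 computes and stores a SHA-256 checksum as part of the upload.
s3.put_object(
    Bucket="example-bucket",
    Key="data/part-0001.parquet",
    Body=b"...",
    ChecksumAlgorithm="SHA256",
)

# Retrieve the stored checksum at any time via GetObjectAttributes.
attrs = s3.get_object_attributes(
    Bucket="example-bucket",
    Key="data/part-0001.parquet",
    ObjectAttributes=["Checksum"],
)
print(attrs["Checksum"]["ChecksumSHA256"])
```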
Q: What is Versioning?
Versioning allows you to preserve, retrieve, and restore every version of every object stored in an Amazon S3 bucket. Once you enable Versioning for a bucket, Amazon S3 preserves existing objects anytime you perform a PUT, POST, COPY, or DELETE operation on them. By default, GET requests will retrieve the most recently written version. Older versions of an overwritten or deleted object can be retrieved by specifying a version in the request.
Q: Why should I use Versioning?
Amazon S3 provides customers with a highly durable storage infrastructure. Versioning offers an additional level of protection by providing a means of recovery when customers accidentally overwrite or delete objects. This allows you to easily recover from unintended user actions and application failures. You can also use Versioning for data retention and archiving.
Q: How do I start using Versioning?
You can start using Versioning by enabling a setting on your Amazon S3 bucket. For more information on how to enable Versioning, please refer to the Amazon S3 Technical Documentation.
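A minimal boto3 sketch (placeholder bucket and key) that enables Versioning and then retrieves an older version might look like this.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-bucket"  # placeholder

# Enable Versioning on the bucket.
s3.put_bucket_versioning(
    Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
)

# Overwrite an object twice; each PUT creates a new version.
v1 = s3.put_object(Bucket=bucket, Key="config.json", Body=b'{"v": 1}')
v2 = s3.put_object(Bucket=bucket, Key="config.json", Body=b'{"v": 2}')

# A plain GET returns the latest version; pass VersionId to read an older one.
old = s3.get_object(Bucket=bucket, Key="config.json", VersionId=v1["VersionId"])
print(old["Body"].read())  # b'{"v": 1}'
```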
Q: How does Versioning protect me from accidental deletion of my objects?
When a user performs a DELETE operation on an object, subsequent default requests will no longer retrieve the object. However, all versions of that object will continue to be preserved in your Amazon S3 bucket and can be retrieved or restored. Only the owner of an Amazon S3 bucket can permanently delete a version.
Q: Can I set up a trash, recycle bin, or rollback window on my Amazon S3 objects to recover from deletes and overwrites?
You can use Amazon S3 Lifecycle rules along with S3 Versioning to implement a rollback window for your S3 objects. For example, with your versioning-enabled bucket, you can set up a rule that archives all of your previous versions to the lower-cost S3 Glacier storage class and deletes them after 100 days, giving you a 100-day window to roll back any changes on your data while lowering your storage costs. Additionally, you can save costs by deleting old (noncurrent) versions of an object after 5 days and when there are at least 2 newer versions of the object. You can change the number of days or the number of newer versions based on your cost optimization needs. This allows you to retain additional versions of your objects when needed, but saves you cost by transitioning or removing them after a period of time.
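A sketch of such a rule with boto3 follows (the bucket name, prefix, and day counts are placeholders); noncurrent versions transition to a Glacier storage class and are removed after 100 days.

```python
import boto3

s3 = boto3.client("s3")

# Noncurrent (overwritten or deleted) versions move to Glacier after 30 days
# and are permanently removed after 100 days, giving a 100-day rollback window.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "rollback-window",
                "Filter": {"Prefix": ""},          # applies to the whole bucket
                "Status": "Enabled",
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
                ],
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": 100
                    # Optionally add "NewerNoncurrentVersions": 2 to always keep
                    # the two most recent noncurrent versions.
                },
            }
        ]
    },
)
```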
Q: How am I charged for using Versioning?
Normal Amazon S3 rates apply for every version of an object stored or requested.
Q: What is Amazon S3 Block Public Access?
Amazon S3 Block Public Access is a new set of security controls that allows customers to enforce that S3 buckets and objects do not have public access. With a few clicks, administrators can apply the Amazon S3 Block Public Access settings to all buckets within an account, or to specific buckets. Once the settings are applied to an account, any existing or new buckets and objects associated with that account inherit the settings that prevent public access. All new buckets have Block Public Access turned on by default. The Amazon S3 Block Public Access settings override other S3 permissions that allow public access, making it easy for the account administrator to enforce a “no public access” policy regardless of existing permissions, how an object is added or a bucket is created.
Q: Why should I use the Amazon S3 Block Public Access settings?
The Amazon S3 Block Public Access settings let you make sure that, regardless of the existing policies set on buckets or objects, you can apply a control that specifies that S3 resources won’t ever have public access, now or in the future. With just a few clicks on the S3 console, you can prevent public policies and ACLs from being set on S3 buckets and objects, now and in the future. Please visit the Amazon S3 Developer Guide to learn more about the Amazon S3 Block Public Access settings.
Q: How do I block public access for all the buckets within my account?
You can configure the Amazon S3 Block Public Access settings either through the “Public access settings for this account” side navigation bar on the S3 console or through the API. Once you set these at the account level, all buckets and objects within the entire account inherit the properties. If you want to change these settings, you can go back to the S3 console and uncheck the checkboxes, or manage it programmatically through the API.
Q: How do I block public access for a specific bucket?
You can configure the Amazon S3 Block Public Access settings through the “Permissions” tab on the S3 console or through the API. Once you set these at the bucket level, public access to the bucket and the objects within it will be blocked.
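For reference, both levels can be set with boto3; the account ID and bucket below are placeholders.

```python
import boto3

block_all = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# Account level: applies to every bucket in the account.
boto3.client("s3control").put_public_access_block(
    AccountId="111122223333",                      # placeholder account ID
    PublicAccessBlockConfiguration=block_all,
)

# Bucket level: applies to a single bucket.
boto3.client("s3").put_public_access_block(
    Bucket="example-bucket",
    PublicAccessBlockConfiguration=block_all,
)
```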
Q: What is an Amazon VPC Endpoint for Amazon S3?
An Amazon VPC Endpoint for Amazon S3 is a logical entity within a VPC that allows connectivity to S3 over the Amazon Web Services China network. There are two types of VPC endpoints for S3 – gateway VPC endpoints and interface VPC endpoints. Gateway endpoints are a gateway that you specify in your route table to access S3 from your VPC over the Amazon Web Services China network. Interface endpoints extend the functionality of gateway endpoints by using private IPs to route requests to S3 from within your VPC, on-premises, or from a different Amazon Web Services China Region. For more information, read the S3 documentation.
Q: Can I allow a specific Amazon VPC Endpoint access to my Amazon S3 bucket?
You can limit access to your bucket from a specific Amazon VPC Endpoint or a set of endpoints using Amazon S3 bucket policies. S3 bucket policies now support a condition, aws:sourceVpce, that you can use to restrict access. For more details and example policies, read the S3 documentation.
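A sketch of such a bucket policy, applied with boto3, is shown below; the VPC endpoint ID and bucket name are placeholders.

```python
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOnlyFromVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws-cn:s3:::example-bucket",
                "arn:aws-cn:s3:::example-bucket/*",
            ],
            # Deny any request that does not arrive through the named endpoint.
            "Condition": {"StringNotEquals": {"aws:sourceVpce": "vpce-0abc1234"}},
        }
    ],
}

boto3.client("s3").put_bucket_policy(
    Bucket="example-bucket", Policy=json.dumps(policy)
)
```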
Q: What is Amazon PrivateLink for Amazon S3?
Amazon PrivateLink for S3 provides private connectivity between Amazon S3 and on-premises. You can provision interface VPC endpoints for S3 in your VPC to connect your on-premises applications directly with S3 over Amazon Direct Connect. You no longer need to use public IPs, change firewall rules, or configure an internet gateway to access S3 from on-premises. To learn more read the S3 documentation.
Q: How do interface VPC endpoints for Amazon S3 work?
Interface VPC endpoints provision Elastic Network Interface (ENI) in your VPC. An ENI is a logical networking component through which you can route requests to S3 over the Amazon Web Services China network. You can create an interface VPC endpoint in one or more Availability Zones, spanning one or more subnets. In each subnet that you specify, an ENI will be set up with an IP address from your private IP address pool. Requests to S3 are resolved to the private IPs assigned to the ENIs. Addressing S3 through private IP addresses in this way makes S3 directly reachable from on-premises hosts that are connected to Amazon Web Services over Amazon Direct Connect. For more information on interface VPC endpoints, read the S3 documentation.
Q: How do I get started with interface VPC endpoints for S3?
You can create an interface VPC endpoint using the Amazon VPC Management Console, Amazon CLI, Amazon SDK or API. To learn more read the S3 documentation.
Q: When should I choose gateway VPC endpoints instead of Amazon PrivateLink-based interface VPC endpoints?
We recommend that you use interface VPC endpoints to access S3 from on-premises or from a VPC in another Amazon Web Services China Region. For resources that are accessing S3 from a VPC in the same Amazon Web Services China Region as S3, we recommend using gateway VPC endpoints as they are not billed. To learn more read the S3 documentation.
Amazon S3 Storage Classes
Q: What are the Amazon S3 storage classes?
Amazon S3 offers a range of storage classes that you can choose from based on the data access, resiliency, and cost requirements of your workloads. S3 storage classes are purpose-built to provide low-cost storage for different access patterns. S3 storage classes are ideal for virtually any use case, including those with demanding performance needs, data residency requirements, unknown or changing access patterns, or archival storage. Each S3 storage class charges a fee to store data and fees to access data. In deciding which S3 storage class best fits your workload, consider the access patterns and retention time of your data to optimize for the lowest total cost over the lifetime of your data.
Q: How do I decide which S3 storage class to use?
In deciding which S3 storage class best fits your workload, consider the access patterns and retention time of your data to optimize for the lowest total cost over the lifetime of your data. Most workloads have changing (user-generated content), unpredictable (analytics, data lakes), or unknown (new applications) access patterns, and that is why S3 Intelligent-Tiering should be the default storage class to automatically save on storage costs. If you know the access patterns of your data, you can follow this guidance. The S3 Standard storage class is ideal for frequently accessed data; this is the best choice if you access data more than once a month. S3 Standard-Infrequent Access is ideal for data retained for at least a month and accessed once every month or two.
For rarely accessed archive data, you can choose from three archive storage classes optimized for different access patterns and storage duration. For archive data that needs immediate access, you should use S3 Glacier Instant Retrieval, the first archive storage class that delivers milliseconds retrieval for data that is accessed 2-3 times per year. For archive data that does not require immediate access, you can reduce storage costs by using S3 Glacier Flexible Retrieval with data retrieval within minutes to hours, and free bulk retrieval. To save even more on archive storage, you can use S3 Glacier Deep Archive, the lowest cost storage in Amazon S3 with data retrieval within hours. All these storage classes are regional storage classes that provide multi-Availability Zone (AZ) resiliency by redundantly storing data on multiple devices and physically separated Amazon Web Services Availability Zones in an Amazon Web Services China Region.
For data that has a lower resiliency requirement, you can reduce costs by selecting a single-AZ storage class, like S3 One Zone-Infrequent Access.
S3 Intelligent-Tiering
Q: What is S3 Intelligent-Tiering?
Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering) is an S3 storage class for data with unknown access patterns or changing access patterns that are difficult to learn. It is the only cloud storage that delivers automatic cost savings by moving objects between four access tiers when access patterns change. There are two low latency access tiers optimized for frequent and infrequent access, and two optional archive access tiers designed for asynchronous access that are optimized for rare access.
Objects uploaded or transitioned to S3 Intelligent-Tiering are automatically stored in the frequent access tier. S3 Intelligent-Tiering works by monitoring access patterns and then moving the objects that have not been accessed in 30 consecutive days to the infrequent access tier. You can activate one or both archive access tiers to automatically move objects that haven’t been accessed for 90 days to the archive access tier and then after 180 days to the deep archive access tier. If the objects are accessed later, S3 Intelligent-Tiering moves the objects back to the frequent access tier. There are no retrieval fees, so you won’t see unexpected increases in storage bills when access patterns change.
Q: Why would I choose to use S3 Intelligent-Tiering?
S3 Intelligent-Tiering is for data with unknown access patterns or changing access patterns that are difficult to learn. It is ideal for datasets where you may not be able to anticipate access patterns. For datasets with changing access patterns where sub-sets of objects may become rarely accessed over long periods of time, the archive access tiers further reduce your storage cost. S3 Intelligent-Tiering can be used to store new datasets where, shortly after upload, access is frequent, but decreases as the data set ages.
Q: What performance does S3 Intelligent-Tiering offer?
S3 Intelligent-Tiering Frequent, Infrequent, and Archive Instant access tiers provide the same performance as the S3 Standard and S3 Standard-Infrequent Access storage classes. The optional Archive Access tier has the same performance as S3 Glacier Flexible Retrieval, and the Deep Archive Access tier has the same performance as the S3 Glacier Deep Archive storage class.
Q: How durable and available is S3 Intelligent-Tiering?
S3 Intelligent-Tiering is designed for the same 99.999999999% durability as the S3 Standard storage class. S3 Intelligent-Tiering is designed for 99.9% availability, and carries a service level agreement providing service credits if availability is less than our service commitment in any billing cycle.
Q. How am I charged for S3 Intelligent-Tiering?
S3 Intelligent-Tiering charges you for monthly storage, requests, and bandwidth, and charges a small monthly fee for monitoring and automation per object. The S3 Intelligent-Tiering storage class stores objects in three automatic access tiers: a Frequent Access tier priced at S3 Standard storage rates, an Infrequent Access tier priced at S3 Standard-Infrequent Access storage rates, and an Archive Instant Access tier priced at S3 Glacier Instant Retrieval storage rates. S3 Intelligent-Tiering also offers two optional asynchronous access tiers: an Archive Access tier priced at S3 Glacier Flexible Retrieval storage rates, and a Deep Archive Access tier priced at S3 Glacier Deep Archive storage rates.
There are no retrieval fees for S3 Intelligent-Tiering. For a small monitoring and automation fee, S3 Intelligent-Tiering monitors access patterns and automatically moves objects between four access tiers to optimize your storage cost and performance.
There is no minimum billable object size in S3 Intelligent-Tiering, but objects smaller than 128KB are not eligible for auto-tiering. These smaller objects will not be monitored and will always be charged at the Frequent Access tier rates, with no monitoring and automation fee. For each object archived to the archive access tier or deep archive access tier in S3 Intelligent-Tiering, Amazon S3 uses 8 KB of storage for the name of the object and other metadata (billed at S3 Standard storage rates) and 32 KB of storage for index and related metadata (billed at S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage rates).
Q: How do I activate S3 Intelligent-Tiering archive access tiers?
You can activate the archive access and deep archive access tiers by creating a bucket, prefix, or object tag level configuration using the Amazon S3 API, CLI, or S3 management console. You should only activate one or both archive access tiers if your objects can be accessed asynchronously by your application.
Q: Can I extend the time before objects get archived within S3 Intelligent-Tiering storage class?
Yes. In the bucket, prefix, or object tag level configuration, you can extend the last access time for archiving objects in S3 Intelligent-Tiering to up to two years for the Archive and Deep Archive Access tiers. The minimum last access time to move objects into the archive access tier is 90 days and the minimum last access time to move objects into the deep archive access tier is 180 days.
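A boto3 sketch of such a configuration follows (the bucket, configuration ID, prefix, and day values are placeholders).

```python
import boto3

s3 = boto3.client("s3")

# Archive objects under the "logs/" prefix after 90 days without access,
# and move them to the Deep Archive Access tier after 180 days.
s3.put_bucket_intelligent_tiering_configuration(
    Bucket="example-bucket",
    Id="archive-logs",
    IntelligentTieringConfiguration={
        "Id": "archive-logs",
        "Filter": {"Prefix": "logs/"},
        "Status": "Enabled",
        "Tierings": [
            {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
            {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    },
)
```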
Q: How do I get an object from the archive access or deep archive access tiers in the S3 Intelligent-Tiering storage class?
You can issue a Restore request and the object will automatically begin moving back to the Frequent Access tier, all within the S3 Intelligent-Tiering storage class. Objects in the archive access tier are moved to the Frequent Access tier in 3-5 hours and within 12 hours if they are in the deep archive access tier. Once the object is in the Frequent Access tier, you can issue a GET request to retrieve the object.
Q: Are my S3 Intelligent-Tiering objects backed by the Amazon S3 Service Level Agreement?
Yes, S3 Intelligent-Tiering is backed with the Amazon S3 Service Level Agreement, and customers are eligible for service credits if availability is less than our service commitment in any billing cycle.
Q: How will my latency and throughput performance be impacted as a result of using S3 Intelligent-Tiering?
You should expect the same latency and throughput performance as S3 Standard when using S3 Intelligent-Tiering Frequent, Infrequent, and Archive Instant Access tiers. You should only activate the Archive and Deep Archive Access tiers if your objects can be accessed asynchronously by your application. Objects in the archive access tier are moved to the frequent access tier in 3-5 hours and within 12 hours if they are in the deep archive access tier. If you need faster access to an object in the archive or deep archive access tier, you can pay for faster retrieval by using the console to select expedited retrieval speed.
Q: Is there a minimum duration for S3 Intelligent-Tiering?
No, S3 Intelligent-Tiering has no minimum storage duration.
Q: Is there a minimum object size for S3 Intelligent-Tiering?
S3 Intelligent-Tiering has no minimum billable object size, but objects smaller than 128KB are not eligible for auto-tiering. These smaller objects will not be monitored and will always be charged at the Frequent Access tier rates, with no monitoring and automation fee. For each object archived to the archive access tier or deep archive access tier in S3 Intelligent-Tiering, Amazon S3 uses 8 KB of storage for the name of the object and other metadata (billed at S3 Standard storage rates) and 32 KB of storage for index and related metadata (billed at S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage rates). This enables you to get a real-time list of all of your S3 objects using the S3 LIST API or the S3 Inventory report. For more details, please visit https://www.amazonaws.cn/en/s3/pricing/.
S3 Standard-Infrequent Access
Q: What is S3 Standard - Infrequent Access?
Amazon S3 Standard - Infrequent Access (Standard - IA) is an Amazon S3 storage class for data that is accessed less frequently, but requires rapid access when needed. Standard - IA offers the high durability, throughput, and low latency of Amazon S3 Standard, with a low per GB storage price and per GB retrieval fee. This combination of low cost and high performance makes Standard - IA ideal for long-term storage, backups, and as a data store for disaster recovery. The Standard - IA storage class is set at the object level and can exist in the same bucket as Standard, allowing you to use lifecycle policies to automatically transition objects between storage classes without any application changes.
Q: Why would I choose to use Standard - IA?
Standard - IA is ideal for data that is accessed less frequently, but requires rapid access when needed. Standard - IA is ideally suited for long-term file storage, older data from sync and share, backup data, and disaster recovery files.
Q: What performance does S3 Standard - Infrequent Access offer?
S3 Standard - Infrequent Access provides the same performance as S3 Standard storage.
Q: How durable and available is Standard - IA?
S3 Standard - IA is designed for the same 99.999999999% durability as Standard and S3 Glacier Flexible Retrieval. Standard - IA is designed for 99.9% availability, and carries a service level agreement providing service credits if availability is less than our service commitment in any billing cycle.
Q: How do I get my data into Standard - IA?
There are two ways to get data into Standard - IA. You can directly PUT into Standard - IA by specifying STANDARD_IA in the x-amz-storage-class header. You can also set lifecycle policies to transition objects from Standard to Standard - IA.
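For example, with boto3 (placeholder bucket, keys, and prefix) the storage class can be set directly on the PUT, or a lifecycle rule can transition objects after 30 days.

```python
import boto3

s3 = boto3.client("s3")

# Direct PUT into Standard - IA (equivalent to the x-amz-storage-class header).
s3.put_object(
    Bucket="example-bucket",
    Key="backups/2024-06.tar",
    Body=b"...",
    StorageClass="STANDARD_IA",
)

# Or transition existing objects from Standard to Standard - IA after 30 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "to-standard-ia",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            }
        ]
    },
)
```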
Q: Are my Standard - IA objects backed with the Amazon S3 Service Level Agreement?
Yes, Standard - IA is backed with the Amazon S3 Service Level Agreement, and customers are eligible for service credits if availability is less than our service commitment in any billing cycle.
Q: How will my latency and throughput performance be impacted as a result of using Standard - IA?
You should expect the same latency and throughput performance as Amazon S3 Standard when using Standard - IA.
Q: Is there a minimum duration for Standard - IA?
Standard - IA is designed for long-lived, but infrequently accessed data that is retained for months or years. Data that is deleted from Standard - IA within 30 days will be charged for a full 30 days.
Q: Is there a minimum object size for Standard - IA?
Standard - IA is designed for larger objects and has a minimum object size of 128KB. Objects smaller than 128KB in size will incur storage charges as if the object were 128KB. For example, a 6KB object in S3 Standard - IA will incur S3 Standard - IA storage charges for 6KB and an additional minimum object size fee equivalent to 122KB at the S3 Standard - IA storage price.
Q: Can I tier objects from Standard - IA to Amazon S3 Glacier Flexible Retrieval?
Yes. In addition to using lifecycle policies to migrate objects from Standard to Standard - IA, you can also set up lifecycle policies to tier objects from Standard - IA to Amazon S3 Glacier Flexible Retrieval.
S3 One Zone-Infrequent Access
Q: What is S3 One Zone-IA storage class?
The S3 One Zone-IA storage class is an Amazon S3 storage class that lets customers choose to store objects in a single Availability Zone. S3 One Zone-IA storage redundantly stores data within that single Availability Zone to deliver storage at 20% less cost than geographically redundant S3 Standard-IA storage, which stores data redundantly across multiple geographically separate Availability Zones.
S3 One Zone-IA offers a 99% availability SLA and is also designed for eleven 9s of durability within the Availability Zone. However, unlike the S3 Standard storage classes, the S3 One Zone-IA storage class is not resilient to the physical loss of the Availability Zone from a major event such as an earthquake or flood.
S3 One Zone-IA storage offers the same Amazon S3 features as S3 Standard and S3 Standard-IA and is used through the Amazon S3 API, CLI and console. S3 One Zone-IA storage class is set at the object level and can exist in the same bucket as S3 Standard and S3 Standard-IA storage classes. You can use S3 Lifecycle policies to automatically transition objects between storage classes without any application changes.
Q: What use cases are best suited for S3 One Zone-IA storage class?
Customers can use S3 One Zone-IA for infrequently-accessed storage, like backup copies, disaster recovery copies, or other easily re-creatable data.
Q: What performance does S3 One Zone-IA storage offer?
S3 One Zone-IA storage class offers similar performance to S3 Standard and S3 Standard-Infrequent Access storage.
Q: How durable is the S3 One Zone-IA storage class?
S3 One Zone-IA storage class is designed for 99.999999999% of durability within an Availability Zone. However, S3 One Zone-IA storage is not designed to withstand the loss of availability or total destruction of an Availability Zone. In contrast, S3 Standard and S3 Standard-Infrequent Access storage are designed to withstand loss of availability or the destruction of an Availability Zone. S3 One Zone-IA delivers the same or better durability and availability than most modern, physical data centers, while providing the added benefit of elasticity of storage and the Amazon S3 feature set.
Q: What is the availability SLA for S3 One Zone-IA storage class?
S3 One Zone-IA offers a 99% availability SLA. For comparison, S3 Standard offers a 99.9% availability SLA and S3 Standard-Infrequent Access offers a 99% availability SLA. As with all S3 storage classes, S3 One Zone-IA storage class carries a service level agreement providing service credits if availability is less than our service commitment in any billing cycle. See the Amazon S3 Service Level Agreement.
Q: How will using S3 One Zone-IA storage affect my latency and throughput?
You should expect similar latency and throughput in S3 One Zone-IA storage class to Amazon S3 Standard and S3 Standard-IA storage classes.
Q: How am I charged for using S3 One Zone-IA storage class?
Like S3 Standard-IA, S3 One Zone-IA charges for the amount of storage per month, bandwidth, requests, early delete and small object fees, and a data retrieval fee. Amazon S3 One Zone-IA storage is 20% cheaper than Amazon S3 Standard-IA for storage by month, and shares the same pricing for bandwidth, requests, early delete and small object fees, and the data retrieval fee.
As with S3 Standard-Infrequent Access, if you delete an S3 One Zone-IA object within 30 days of creating it, you will incur an early delete charge. For example, if you PUT an object and then delete it 10 days later, you are still charged for 30 days of storage.
Like S3 Standard-IA, S3 One Zone-IA storage class has a minimum object size of 128KB. Objects smaller than 128KB in size will incur storage charges as if the object were 128KB. For example, a 6KB object in the S3 One Zone-IA storage class will incur storage charges for 6KB and an additional minimum object size fee equivalent to 122KB at the S3 One Zone-IA storage price. Please see the pricing page for information about S3 One Zone-IA pricing.
Q: Is an S3 One Zone-IA “Zone” the same thing as an Amazon Web Services Availability Zone?
Yes. Each Amazon Web Services Region is a separate geographic area. Each region has multiple, isolated locations known as Availability Zones. The Amazon S3 One Zone-IA storage class uses an individual Amazon Web Services Availability Zone within the region.
Q: Are there differences between how Amazon EC2 and Amazon S3 work with Availability Zone-specific resources?
Yes. Amazon EC2 provides you the ability to pick the AZ to place resources, such as compute instances, within a region. When you use S3 One Zone-IA, S3 One Zone-IA assigns an Amazon Web Services Availability Zone in the region according to available capacity.
Q: Can I have a bucket that has different objects in different storage classes and Availability Zones?
Yes, you can have a bucket that has different objects stored in S3 Standard, S3 Standard-IA and S3 One Zone-IA.
Q: Is S3 One Zone-IA available in all Amazon Web Services Regions in which S3 operates?
Yes.
Q: How much disaster recovery protection do I forego by using S3 One Zone-IA?
Each Availability Zone uses redundant power and networking. Within an Amazon Web Services Region, Availability Zones are on different flood plains and earthquake fault zones, and are geographically separated for fire protection. S3 Standard and S3 Standard-IA storage classes offer protection against these sorts of disasters by storing your data redundantly in multiple Availability Zones. S3 One Zone-IA offers protection against equipment failure within an Availability Zone, but it does not protect against the loss of the Availability Zone. Using S3 One Zone-IA, S3 Standard, and S3 Standard-IA options, you can choose the storage class that best fits the durability and availability needs of your storage.
S3 Object Lambda
Q: What is S3 Object Lambda?
S3 Object Lambda allows you to add your own code to process data retrieved from S3 before returning it to an application. With S3 Object Lambda, you can use custom code to modify the data returned by standard S3 GET, HEAD, and LIST requests. This can be used to filter rows, dynamically resize an image, redact or mask confidential information, create a custom view of objects, or to otherwise modify data returned by S3. Powered by Amazon Lambda functions, all request and data processing runs on infrastructure that is fully managed by Amazon Web Services. Your custom code executes on-demand, eliminates the need to create and store derivative copies of your data, and requires no changes to applications.
S3 Object Lambda helps you to easily meet the unique data format requirements of any application without having to build and operate additional infrastructure, such as a proxy layer, or having to create and maintain multiple derivative copies of your data. S3 Object Lambda uses Amazon Lambda functions to automatically process the output of a standard S3 GET, HEAD, and LIST request. Amazon Lambda is a serverless compute service that runs customer-defined code without requiring management of underlying compute resources. With just a few clicks in the Amazon S3 Management Console, you can configure a Lambda function and attach it to an S3 Object Lambda service endpoint. From that point forward, S3 will automatically call that Lambda function to process any data retrieved through the S3 Object Lambda endpoint, returning a transformed result back to the application. You can author and execute your own custom Lambda functions, tailoring S3 Object Lambda’s data transformation to your specific use case.
Q: Why should I use S3 Object Lambda?
You should use S3 Object Lambda to share a single copy of your data across many applications, avoiding the need to build and operate custom processing infrastructure or to store derivative copies of your data. For example, by using S3 Object Lambda to process normal S3 GET requests, you can mask sensitive data for compliance purposes, restructure raw data for the purpose of making it compatible with machine learning applications, filter data to restrict access to specific content within an S3 object, or to address a wide range of additional use cases. You can also use S3 Object Lambda to modify the output of S3 LIST requests to enrich your object lists by querying an external index that contains additional object metadata, filter your object lists to only include objects with a specific object tag, and add a file extension to all the object names in your list. S3 Object Lambda processing can be set up with just a few clicks in the Amazon Web Services Management Console.
Q: How does S3 Object Lambda work?
S3 Object Lambda uses Lambda functions specified by you to transform the output of a standard GET, HEAD, and LIST request. Once you have defined a Lambda function to process the request data, you can attach that function to an S3 Object Lambda endpoint. GET, HEAD, and LIST requests made through an S3 Object Lambda endpoint will now invoke the specified Lambda function. Lambda will then fetch the S3 object requested by the client and process that object. Once processing has completed, Lambda will stream the processed object back to the calling client.
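For illustration, here is a minimal sketch of a transforming Lambda function, assuming a Python runtime; the uppercase transformation is purely illustrative, and the event fields used (getObjectContext, inputS3Url, outputRoute, outputToken) are the ones S3 Object Lambda passes to the function for GET requests.

import boto3
import urllib.request

s3 = boto3.client("s3")

def handler(event, context):
    # S3 Object Lambda supplies a presigned URL for the original object plus
    # routing tokens used to return the transformed result to the caller.
    ctx = event["getObjectContext"]
    original = urllib.request.urlopen(ctx["inputS3Url"]).read()

    # Illustrative transformation: uppercase the object contents.
    transformed = original.decode("utf-8").upper()

    s3.write_get_object_response(
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
        Body=transformed.encode("utf-8"),
    )
    return {"statusCode": 200}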
Q: How do I get started with S3 Object Lambda?
S3 Object Lambda can be set up in multiple ways. You can set up S3 Object Lambda in the S3 console by navigating to the Object Lambda Access Point tab. Next, create an S3 Object Lambda Access Point, the Lambda function that you would like S3 to execute against your GET, LIST, and HEAD requests, and a supporting S3 Access Point. Grant permissions to all resources to interact with Object Lambda. Lastly, update your SDK and application to use the new S3 Object Lambda Access Point to retrieve data from S3 using the language SDK of your choice. You can use an S3 Object Lambda Access Point alias when making requests. Aliases for S3 Object Lambda Access Points are automatically generated and are interchangeable with S3 bucket names for data accessed through S3 Object Lambda. For existing S3 Object Lambda Access Points, aliases are automatically assigned and ready for use. There are example Lambda function implementations in the documentation to help you get started.
You can also use Amazon CloudFormation to automate your S3 Object Lambda configuration. When you use the Amazon CloudFormation template, the Lambda function that is deployed in your account will pass S3 objects back to your requesting client or application without any changes. You can add custom code to modify and process data as it is returned to an application. To learn more, visit the S3 Object Lambda User Guide.
Amazon S3 and IPv6
Q: What is IPv6?
Every server and device connected to the Internet must have a unique address. Internet Protocol Version 4 (IPv4) was the original 32-bit addressing scheme. However, the continued growth of the Internet means that all available IPv4 addresses will be exhausted over time. Internet Protocol Version 6 (IPv6) is the new addressing mechanism designed to overcome the global address limitation on IPv4.
Q: What can I do with IPv6?
Using IPv6 support for Amazon S3, applications can connect to Amazon S3 without the need for any IPv6 to IPv4 translation software or systems. You can meet compliance requirements, more easily integrate with existing IPv6-based on-premises applications, and remove the need for expensive networking equipment to handle the address translation. You can also now utilize the existing source address filtering features in IAM policies and bucket policies with IPv6 addresses, expanding your options to secure applications interacting with Amazon S3.
Q: How do I get started with IPv6 on Amazon S3?
You can get started by pointing your application to Amazon S3’s new “dual-stack” endpoint, which supports access over both IPv4 and IPv6. In most cases, no further configuration is required for access over IPv6, because most network clients prefer IPv6 addresses by default.
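As a sketch, this is how a client could opt into the dual-stack endpoint with the Python SDK (boto3); the Region name is hypothetical, and the same option can be set through SDK or CLI configuration files instead of code.

import boto3
from botocore.config import Config

# Direct the S3 client at the Region's dual-stack (IPv4 + IPv6) endpoint.
s3 = boto3.client(
    "s3",
    region_name="cn-north-1",
    config=Config(s3={"use_dualstack_endpoint": True}),
)

# Subsequent requests resolve against the dual-stack endpoint.
print(s3.list_buckets()["Buckets"])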
Q: Should I expect a change in Amazon S3 performance when using IPv6?
No, you will see the same performance when using either IPv4 or IPv6 with Amazon S3.
Q: What can I do if my clients are impacted by policy, network, or other restrictions in using IPv6 for Amazon S3?
Applications that are impacted by using IPv6 can switch back to the standard IPv4-only endpoints at any time.
Q: Can I use IPv6 with all Amazon S3 features?
No, IPv6 support is not currently available when using Website Hosting and access via BitTorrent. All other features should work as expected when accessing Amazon S3 using IPv6.
Q: Is IPv6 supported in all Amazon Web Services Regions?
Yes, you can use IPv6 with Amazon S3 in all Amazon Web Services Regions, including Amazon Web Services China (Beijing) Region, operated by Sinnet and Amazon Web Services China (Ningxia) Region, operated by NWCD.
Amazon S3 Glacier Instant Retrieval storage class
Q: What is the S3 Glacier Instant Retrieval storage class?
The S3 Glacier Instant Retrieval storage class delivers low-cost storage for long-lived data that is rarely accessed and requires milliseconds retrieval, such as medical images or news media assets. S3 Glacier Instant Retrieval delivers fast access to archive storage, with the same throughput and milliseconds access as the S3 Standard storage classes. S3 Glacier Instant Retrieval is designed for 99.999999999% (11 9s) of data durability and 99.9% availability by redundantly storing data across a minimum of three physically separated Amazon Web Services Availability Zones.
Q: Why would I choose to use S3 Glacier Instant Retrieval?
S3 Glacier Instant Retrieval is ideal if you have data that is rarely accessed (once a quarter, on average) and requires milliseconds retrieval times. It is purpose-built storage for data that needs the same low latency and high throughput performance as S3 Standard-IA but is accessed less frequently, at a lower storage price and with slightly higher data access fees.
Q: How available and durable is S3 Glacier Instant Retrieval?
S3 Glacier Instant Retrieval is designed for 99.999999999% (11 9s) of durability and 99.9% availability, the same as S3 Standard-IA, and carries a service level agreement providing service credits if availability is less than 99% in any billing cycle.
Q: What performance can I expect from S3 Glacier Instant Retrieval?
S3 Glacier Instant Retrieval provides the same milliseconds latency and high throughput performance as the S3 Standard and S3 Standard-IA storage classes. Unlike the S3 Glacier and S3 Glacier Deep Archive storage classes which are designed for asynchronous access, you do not need to issue a Restore request before accessing an object stored in S3 Glacier Instant Retrieval.
Q: How do I get my data into S3 Glacier Instant Retrieval?
There are two ways to get data into S3 Glacier Instant Retrieval. You can directly PUT into S3 Glacier Instant Retrieval by specifying GLACIER_IR in the x-amz-storage-class header, or set S3 Lifecycle policies to transition objects from S3 Standard or S3 Standard-IA to S3 Glacier Instant Retrieval. Use the Amazon S3 console, the Amazon Web Services SDKs, or the Amazon S3 APIs to directly PUT into Amazon S3 Glacier Instant Retrieval or define rules for archival.
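For example, a direct PUT into S3 Glacier Instant Retrieval with the Python SDK (boto3) might look like the following; the bucket, key, and payload are hypothetical, and GLACIER_IR is the storage class value the SDK sends in the x-amz-storage-class header.

import boto3

s3 = boto3.client("s3")

# Write the object directly into S3 Glacier Instant Retrieval.
s3.put_object(
    Bucket="example-bucket",
    Key="news-media/clip-0001.mp4",
    Body=b"example object data",  # hypothetical payload
    StorageClass="GLACIER_IR",
)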
Q: How am I charged for S3 Glacier Instant Retrieval?
S3 Glacier Instant Retrieval charges you for monthly storage, requests based on the request type, and data retrievals. The volume of storage billed in a month is based on average storage used throughout the month, measured in gigabyte-months (GB-Month). You are charged for requests based on the request type—such as PUTs, COPYs, and GETs. You also pay a per GB fee for every gigabyte of data returned to you.
Q: Is there a minimum object size charge for Amazon S3 Glacier Instant Retrieval?
S3 Glacier Instant Retrieval is designed for larger objects and has a minimum object storage charge of 128KB. Objects smaller than 128KB in size will incur storage charges as if the object were 128KB. For example, a 6KB object in S3 Glacier Instant Retrieval will incur S3 Glacier Instant Retrieval storage charges for 6KB and an additional minimum object size charge equivalent to 122KB at the S3 Glacier Instant Retrieval storage price. View the Amazon S3 pricing page for information about Amazon S3 Glacier Instant Retrieval pricing.
Amazon S3 Glacier Flexible Retrieval (Formerly Amazon S3 Glacier storage class)
Q: What is the S3 Glacier storage class?
The S3 Glacier storage class is secure, durable, and low-cost storage for data archiving. You can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. To keep costs low yet suitable for varying needs, the S3 Glacier storage class provides three retrieval options that range from a few minutes to several hours. You can upload objects directly to Amazon S3 Glacier Flexible Retrieval, or use S3 Lifecycle policies to transfer data from any of the Amazon S3 storage classes for active data (S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, S3 One Zone-IA, and S3 Glacier Instant Retrieval) to the S3 Glacier storage class.
Q: Does Amazon S3 provide capabilities for archiving objects to lower cost storage options?
Yes, Amazon S3 enables you to utilize Amazon S3 Glacier Flexible Retrieval’s extremely low-cost storage class for data archival. Amazon S3 Glacier Flexible Retrieval is optimized for data that is infrequently accessed and for which retrieval times of minutes are suitable. Examples include digital media archives, financial and healthcare records, raw genomic sequence data, long-term database backups, and data that must be retained for regulatory compliance.
Q: How can I store my data in Amazon S3 Glacier Flexible Retrieval?
You can use a lifecycle policy to automatically archive sets of Amazon S3 objects to Amazon S3 Glacier Flexible Retrieval based on object age. Use the Amazon S3 Management Console, the Amazon SDKs, or the Amazon S3 APIs to define rules for archival. Rules specify a prefix and time period. The prefix (e.g. “logs/”) identifies the object(s) subject to the rule. The time period specifies either the number of days from object creation date (e.g. 180 days) or the specified date after which the object(s) should be archived. Any Amazon S3 Standard or S3 Standard-IA storage objects which have names beginning with the specified prefix and which have aged past the specified time period are archived to Amazon S3 Glacier Flexible Retrieval. To retrieve Amazon S3 data stored in Amazon S3 Glacier Flexible Retrieval, initiate a restore job via the Amazon S3 APIs or Management Console. Restore jobs typically complete in 3 to 5 hours. Once the job is complete, you can access your data through an Amazon S3 GET object request.
Q: Can I use the Amazon S3 APIs or Management Console to list objects that I’ve archived to Amazon S3 Glacier Flexible Retrieval?
Yes, like Amazon S3’s other storage classes (S3 Standard or S3 Standard-IA), Amazon S3 Glacier Flexible Retrieval objects stored using Amazon S3’s APIs or Management Console have an associated user-defined name. You can get a real-time list of all of your Amazon S3 object names, including those stored using the Amazon S3 Glacier Flexible Retrieval option, using the Amazon S3 LIST API.
Q: Can I use Amazon S3 Glacier Direct APIs to access objects that I’ve archived to Amazon S3 Glacier?
No. Because Amazon S3 maintains the mapping between your user-defined object name and Amazon S3 Glacier’s system-defined identifier, Amazon S3 objects that are stored using the Amazon S3 Glacier storage class are only accessible through the Amazon S3 APIs or the Amazon S3 Management Console.
Q: How can I restore my objects that are archived in Amazon S3 Glacier Flexible Retrieval?
To restore Amazon S3 data stored in Amazon S3 Glacier Flexible Retrieval, initiate a restore request using the Amazon S3 APIs or the Amazon S3 Management Console. Restore requests typically complete in 3 to 5 hours. The restore request creates a temporary copy of your data in S3 Standard while leaving the archived data intact in Amazon S3 Glacier Flexible Retrieval. You can specify the amount of time in days for which the temporary copy is stored in S3 Standard. You can then access your temporary copy from S3 Standard through an Amazon S3 GET request on the archived object.
Q: How long will it take to restore my objects archived in Amazon S3 Glacier Flexible Retrieval?
When processing a restore job, Amazon S3 first retrieves the requested data from Amazon S3 Glacier Flexible Retrieval (which typically takes 3-5 hours), and then creates a temporary copy of the requested data in S3 Standard (which typically takes on the order of a few minutes). You can expect most restore jobs initiated via the Amazon S3 APIs or Management Console to complete in 3-5 hours.
Q: How much data can I restore for free?
You can restore up to 5% of the Amazon S3 data stored in Amazon S3 Glacier Flexible Retrieval for free each month. Typically, this will be sufficient for backup and archival needs. Your 5% monthly free restore allowance is calculated and metered on a daily prorated basis. For example, if on a given day you have 12 terabytes of Amazon S3 data archived to Amazon S3 Glacier Flexible Retrieval, you can restore up to 20.5 gigabytes of data for free that day (12 terabytes x 5% / 30 days = 20.5 gigabytes, assuming it is a 30 day month). You can also use the Bulk Restore for free retrievals within 5-12 hours.
Q: How am I charged for deleting objects from Amazon S3 Glacier Flexible Retrieval that are less than 3 months old?
Amazon S3 Glacier Flexible Retrieval is designed for use cases where data is retained for months, or years. Deleting data that is archived to Amazon S3 Glacier Flexible Retrieval is free if the objects being deleted have been archived in Amazon S3 Glacier Flexible Retrieval for three months or longer. If an object archived in Amazon S3 Glacier Flexible Retrieval is deleted or overwritten within three months of being archived then there will be an early deletion fee. This fee is prorated. If you delete 1 GB of data 1 month after uploading it, you will be charged an early deletion fee for 2 months of Amazon S3 Glacier Flexible Retrieval storage. If you delete 1 GB after 2 months, you will be charged for 1 month of Amazon S3 Glacier Flexible Retrieval storage.
Q: How is my storage charge calculated for Amazon S3 objects archived to S3 Glacier Flexible Retrieval?
The volume of storage billed in a month is based on average storage used throughout the month, measured in gigabyte-months (GB-Months). Amazon S3 calculates the object size as the amount of data you stored, plus an additional 32 KB of S3 Glacier data, plus an additional 8 KB of Amazon S3 Standard storage class data. S3 Glacier Flexible Retrieval requires an additional 32 KB of data per object for S3 Glacier’s index and metadata so you can identify and retrieve your data. Amazon S3 requires 8 KB to store and maintain the user-defined name and metadata for objects archived to S3 Glacier Flexible Retrieval. This enables you to get a real-time list of all of your Amazon S3 objects, including those stored using S3 Glacier Flexible Retrieval, using the Amazon S3 LIST API, or the S3 inventory report.
For example, if you have archived 100,000 objects that are 1 GB each, your billable storage would be: 1.000032 gigabytes for each object x 100,000 objects = 100,003.2 gigabytes of S3 Glacier storage. 0.000008 gigabytes for each object x 100,000 objects = 0.8 gigabytes of S3 Standard storage. The fee is calculated based on the current rates for your Amazon Web Services Region on the Amazon S3 pricing page. For additional Amazon S3 pricing examples, go to the S3 billing FAQs or use the Amazon Web Services pricing calculator.
Q: Are there minimum storage duration and minimum object storage charges for Amazon S3 Glacier Flexible Retrieval?
Objects archived to S3 Glacier Flexible Retrieval have a minimum of 90 days of storage. If an object is deleted, overwritten, or transitioned before 90 days, a pro-rated charge equal to the storage charge for the remaining days will be incurred. S3 Glacier Flexible Retrieval also requires 40 KB of additional metadata for each archived object. This includes 32 KB of metadata charged at the S3 Glacier Flexible Retrieval rate required to identify and retrieve your data. And, an additional 8 KB data charged at the S3 Standard rate which is required to maintain the user-defined name and metadata for objects archived to S3 Glacier Flexible Retrieval. This allows you to get a real-time list of all of your S3 objects using the S3 LIST API or the S3 Inventory report. View the Amazon S3 pricing page for information about Amazon S3 Glacier Flexible Retrieval pricing.
Amazon S3 Glacier Deep Archive
Q: What is Amazon S3 Glacier Deep Archive?
Amazon S3 Glacier Deep Archive is a new Amazon S3 storage class that provides secure and durable object storage for long-term retention of data that is accessed once or twice a year. From just ¥ 0.012 per GB-month, Amazon S3 Glacier Deep Archive offers the lowest cost storage in the cloud, at prices significantly lower than storing and maintaining data in on-premises magnetic tape libraries or archiving data off-site.
Q: What use cases are best suited for Amazon S3 Glacier Deep Archive?
Amazon S3 Glacier Deep Archive is an ideal storage class to provide offline protection of your company’s most important data assets, or when long-term data retention is required for corporate policy, contractual, or regulatory compliance requirements. Customers find Amazon S3 Glacier Deep Archive to be a compelling choice to protect core intellectual property, financial and medical records, research results, legal documents, seismic exploration studies, and long-term backups, especially in highly regulated industries, such as Financial Services, Healthcare, Oil & Gas, and Public Sectors. In addition, there are organizations, such as media and entertainment companies, that want to keep a backup copy of core intellectual property. Frequently, customers using Amazon S3 Glacier Deep Archive are able to reduce or discontinue the use of on-premises magnetic tape libraries and off-premises tape archival services.
Q: How does Amazon S3 Glacier Deep Archive differ from Amazon S3 Glacier Flexible Retrieval?
Amazon S3 Glacier Deep Archive expands our data archiving offerings, enabling you to select the optimal storage class based on storage and retrieval costs, and retrieval times. Choose Amazon S3 Glacier Flexible Retrieval when you want retrieval options in as little as 1-5 minutes using Expedited retrievals for archived data. Amazon S3 Glacier Deep Archive, in contrast, is designed for colder data that is very unlikely to be accessed, but still requires long-term, durable storage. Amazon S3 Glacier Deep Archive is up to 75% less expensive than Amazon S3 Glacier Flexible Retrieval and provides retrieval within 12 hours using the Standard retrieval speed. You may also reduce retrieval costs by selecting Bulk retrieval, which will return data within 48 hours.
Q: How durable and available is Amazon S3 Glacier Deep Archive?
Amazon S3 Glacier Deep Archive is designed for the same 99.999999999% durability as the Amazon S3 Standard and Amazon S3 Glacier storage classes. Amazon S3 Glacier Deep Archive is designed for 99.9% availability, and carries a service level agreement providing service credits if availability is less than our service commitment in any billing cycle.
Q: Are my Amazon S3 Glacier Deep Archive objects backed by Amazon S3 Service Level Agreement?
Yes, Amazon S3 Glacier Deep Archive is backed with the Amazon S3 Service Level Agreement, and customers are eligible for service credits if availability is less than our service commitment in any billing cycle.
Q: How do I get started using Amazon S3 Glacier Deep Archive?
The easiest way to store data in Amazon S3 Glacier Deep Archive is to use the S3 API to upload data directly. Just specify “Glacier Deep Archive” as the storage class. You can accomplish this using the Amazon Web Services Management Console, S3 REST API, Amazon SDKs, or Amazon Command Line Interface.
You can also begin using Amazon S3 Glacier Deep Archive by creating policies to migrate data using S3 Lifecycle, which provides the ability to define the lifecycle of your object and reduce your cost of storage. These policies can be set to migrate objects to Amazon S3 Glacier Deep Archive based on the age of the object. You can specify the policy for an S3 bucket, or for specific prefixes. Lifecycle transitions are billed at the Amazon S3 Glacier Deep Archive Upload price.
Amazon Web Services Tape Gateway, a cloud-based virtual tape library feature of Amazon Storage Gateway, now integrates with Amazon S3 Glacier Deep Archive, enabling you to store your virtual tape-based, long-term backups and archives in Amazon S3 Glacier Deep Archive, thereby providing the lowest cost storage for this data in the cloud. To get started, create a new virtual tape using Amazon Storage Gateway Console or API, and set the archival storage target either to Amazon S3 Glacier Flexible Retrieval or Amazon S3 Glacier Deep Archive. When your backup application ejects the tape, the tape will be archived to your selected storage target.
Q: How do you recommend migrating data from my existing tape archives to Amazon S3 Glacier Deep Archive?
There are multiple ways to migrate data from existing tape archives to Amazon S3 Glacier Deep Archive. You can use the Amazon Web Services Tape Gateway to integrate with existing backup applications using a virtual tape library (VTL) interface. This interface presents virtual tapes to the backup application. These can be immediately used to store data in Amazon S3, Amazon S3 Glacier Flexible Retrieval, and Amazon S3 Glacier Deep Archive.
You can also use Amazon Snowball to migrate data. Snowball accelerates moving terabytes to petabytes of data into and out of Amazon Web Services using physical storage devices designed to be secure for transport. Using Snowball helps to eliminate challenges that can be encountered with large-scale data transfers including high network costs, long transfer times, and security concerns.
Finally, you can use Amazon Direct Connect to establish dedicated network connections from your premises to Amazon Direct Connect locations. In many cases, Direct Connect can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.
Q: How can I retrieve my objects stored in Amazon S3 Glacier Deep Archive?
To retrieve data stored in Amazon S3 Glacier Deep Archive, initiate a “Restore” request using the Amazon S3 APIs or the Amazon S3 Management Console. The Restore creates a temporary copy of your data in the S3 Standard storage class while leaving the archived data intact in Amazon S3 Glacier Deep Archive. You can specify the amount of time in days for which the temporary copy is stored in S3. You can then access your temporary copy from S3 through an Amazon S3 GET request on the archived object.
When restoring an archived object, you can specify one of the following options in the Tier element of the request body: Standard is the default tier and lets you access any of your archived objects within 12 hours; Bulk lets you retrieve large amounts of data, even petabytes, inexpensively and typically completes within 48 hours.
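A restore request of this kind could be issued with the Python SDK (boto3) roughly as follows; the bucket, key, and retention period are hypothetical.

import boto3

s3 = boto3.client("s3")

# Create a temporary copy of the archived object for 7 days using the
# Standard retrieval tier; use "Bulk" for the lower-cost 48-hour option.
s3.restore_object(
    Bucket="example-bucket",
    Key="archive/2019-backup.tar",
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)

# The Restore header on HEAD Object reports whether the restore is still in
# progress and when the temporary copy expires.
print(s3.head_object(Bucket="example-bucket", Key="archive/2019-backup.tar").get("Restore"))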
Q: How am I charged for using Amazon S3 Glacier Deep Archive?
Amazon S3 Glacier Deep Archive storage is priced based on the amount of data you store in GBs, the number of PUT/lifecycle transition requests, retrievals in GBs, and number of restore requests. This pricing model is similar to Amazon S3 Glacier Flexible Retrieval. Please see the Amazon S3 pricing page for information about Amazon S3 Glacier Deep Archive pricing.
Q: Are there minimum storage duration and minimum object storage charges for Amazon S3 Glacier Deep Archive?
Amazon S3 Glacier Deep Archive is designed for long-lived but rarely accessed data that is retained for 7-10 years or more. Objects that are archived to Amazon S3 Glacier Deep Archive have a minimum of 180 days of storage, and objects deleted before 180 days incur a pro-rated charge equal to the storage charge for the remaining days. Please see the Amazon S3 pricing page for information about Amazon S3 Glacier Deep Archive pricing.
Amazon S3 Glacier Deep Archive has a minimum billable object storage size of 40KB. Objects smaller than 40KB in size may be stored but will be charged for 40KB of storage. Please see the Amazon S3 pricing page for information about Amazon S3 Glacier Deep Archive pricing.
Q: How does Amazon S3 Glacier Deep Archive integrate with other Amazon Web Services Services?
Amazon S3 Glacier Deep Archive is integrated with Amazon S3 features including S3 Storage Class Analysis, S3 Object Tagging, S3 Lifecycle policies, and S3 Object Lock. With S3 storage management features, you can use a single Amazon S3 bucket to store a mixture of Amazon S3 Glacier Deep Archive, S3 Standard, S3 Standard-IA, S3 One Zone-IA, and Amazon S3 Glacier Flexible Retrieval data. This allows storage administrators to make decisions based on the nature of the data and data access patterns. Customers can use Amazon S3 Lifecycle policies to automatically migrate data to lower-cost storage classes as the data ages.
Amazon Storage Gateway service integrates Tape Gateway with Amazon S3 Glacier Deep Archive storage class, allowing you to store virtual tapes in the lowest-cost Amazon S3 storage class, reducing the monthly cost to store your long-term data in the cloud up to 75%. With this feature, Tape Gateway supports archiving your new virtual tapes directly to Amazon S3 Glacier Flexible Retrieval and Amazon S3 Glacier Deep Archive, helping you meet your backup, archive, and recovery requirements. Tape Gateway helps you move tape-based backups to Amazon Web Services without making any changes to your existing backup workflows. Tape Gateway supports most of the leading backup applications such as Veritas, Veeam, Commvault, Dell EMC NetWorker, IBM Spectrum Protect (on Windows OS), and Microsoft Data Protection Manager.
Event Notification
Q: What are Amazon S3 Event Notifications?
You can enable Amazon S3 Event Notifications and receive them in response to specific events in your S3 bucket, such as PUT, POST, COPY, and DELETE events. You can publish notifications to Amazon EventBridge, Amazon SNS, Amazon SQS, or directly to Amazon Lambda.
Q: What can I do with Amazon S3 Event Notifications?
Amazon S3 Event Notifications enable you to run workflows, send alerts, or perform other actions in response to changes in your objects stored in Amazon S3. You can use Amazon S3 Event Notifications to set up triggers to perform actions including transcoding media files when they are uploaded, processing data files when they become available, and synchronizing Amazon S3 objects with other data stores. You can also set up event notifications based on object name prefixes and suffixes. For example, you can choose to receive notifications on object names that start with “images/”.
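As a sketch, a prefix-filtered notification could be configured with the Python SDK (boto3) along these lines; the bucket name, queue ARN, and account ID are hypothetical.

import boto3

s3 = boto3.client("s3")

# Notify a hypothetical SQS queue whenever an object whose key starts with
# "images/" is created in the bucket.
s3.put_bucket_notification_configuration(
    Bucket="example-bucket",
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws-cn:sqs:cn-north-1:111122223333:image-events",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "prefix", "Value": "images/"}]
                    }
                },
            }
        ]
    },
)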
Q: What is included in an Amazon S3 Event Notification?
For a detailed description of the information included in Amazon S3 Event Notification messages, please refer to the Configuring Amazon S3 Event Notifications topic in the Amazon S3 Developer Guide.
Q: How do I set up Amazon S3 Event Notifications?
For a detailed description of how to configure event notifications, please refer to the Configuring Amazon S3 Event Notifications topic in the Amazon S3 Developer Guide.
Q: What does it cost to use Amazon S3 Event Notifications?
There are no additional charges from Amazon S3 for event notifications. You pay only for use of Amazon SNS or Amazon SQS to deliver event notifications. Visit the Amazon SNS or Amazon SQS pricing pages to view the pricing details for these services.
Storage Management
S3 CloudWatch Metrics | S3 Object Tagging | Lifecycle Management Policies | Replication | S3 Replication Time Control
S3 CloudWatch Metrics
Q: How do I get started with S3 CloudWatch Metrics?
You can use the Amazon Web Services Management Console to enable the generation of 1-minute CloudWatch metrics for your S3 bucket or configure filters for the metrics using a prefix or object tag, or access point. Alternately, you can call the S3 PUT Bucket Metrics API to enable and configure publication of S3 storage metrics. Storage metrics will be available in CloudWatch within 15 minutes of being enabled.
Q: Can I align storage metrics to my applications or business organizations?
Yes, you can configure S3 CloudWatch metrics to generate metrics for your S3 bucket or configure filters for the metrics using a prefix or object tag. For example, you can monitor a Spark application that accesses data under the prefix “/Bucket01/BigData/SparkCluster” as metrics filter 1 and define a second metrics filter with the tag “Dept, 1234” as metrics filter 2. An object can be a member of multiple filters; for example, an object within the prefix “/Bucket01/BigData/SparkCluster” and with the tag “Dept,1234” will be in both metrics filter 1 and 2. In this way, metrics filters can be aligned to business applications, team structures, or organizational budgets, allowing you to monitor and alert on multiple workloads separately within the same S3 bucket.
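For illustration, the two filters described above could be created with the Python SDK (boto3) roughly as follows; the bucket name, filter IDs, and key prefix are hypothetical.

import boto3

s3 = boto3.client("s3")

# Metrics filter 1: request metrics scoped to a key prefix.
s3.put_bucket_metrics_configuration(
    Bucket="Bucket01",
    Id="spark-cluster-metrics",
    MetricsConfiguration={
        "Id": "spark-cluster-metrics",
        "Filter": {"Prefix": "BigData/SparkCluster"},
    },
)

# Metrics filter 2: request metrics scoped to an object tag.
s3.put_bucket_metrics_configuration(
    Bucket="Bucket01",
    Id="dept-1234-metrics",
    MetricsConfiguration={
        "Id": "dept-1234-metrics",
        "Filter": {"Tag": {"Key": "Dept", "Value": "1234"}},
    },
)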
Q: What alarms can I set on my storage metrics?
You can use CloudWatch to set thresholds on any of the storage metrics counts, timers, or rates and trigger an action when the threshold is breached. For example, you can set a threshold on the percentage of 4xx Error Responses and, when at least 3 data points are above the threshold, fire a CloudWatch alarm to alert a DevOps engineer.
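An alarm of that kind could be defined with the Python SDK (boto3) roughly as shown below; the bucket name, metrics filter ID, threshold value, and SNS topic ARN are hypothetical.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Fire an alarm when the 4xxErrors request metric breaches the threshold for
# 3 consecutive 1-minute periods, notifying a hypothetical SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="s3-4xx-errors",
    Namespace="AWS/S3",
    MetricName="4xxErrors",
    Dimensions=[
        {"Name": "BucketName", "Value": "example-bucket"},
        {"Name": "FilterId", "Value": "spark-cluster-metrics"},
    ],
    Statistic="Sum",
    Period=60,
    EvaluationPeriods=3,
    Threshold=10,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws-cn:sns:cn-north-1:111122223333:devops-alerts"],
)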
Q: How am I charged for using S3 CloudWatch Metrics?
S3 CloudWatch Metrics are priced as custom metrics for Amazon CloudWatch. Please see Amazon CloudWatch pricing page for general information about S3 CloudWatch metrics pricing.
S3 Object Tagging
Q: What are Object Tags?
S3 Object Tags are key-value pairs applied to S3 objects which can be created, updated, or deleted at any time during the lifetime of the object. With these, you have the ability to create Identity and Access Management (IAM) policies, set up S3 Lifecycle policies, and customize storage metrics. These object-level tags can then manage transitions between storage classes and expire objects in the background.
Q: How do I apply Object Tags to my objects?
You can add tags to new objects when you upload them or you can add them to existing objects. Up to ten tags can be added to each S3 object and you can use either the Amazon Web Services Management Console, the REST API, the Amazon CLI, or the Amazon SDKs to add object tags.
Q: Why should I use Object Tags?
Object Tags are a new tool you can use to enable simple management of your S3 storage. With the ability to create, update, and delete tags at any time during the lifetime of your object, your storage can adapt to the needs of your business. These tags allow you to control access to objects tagged with specific key-value pairs, allowing you to further secure confidential data for only a select group or user. Object tags can also be used to label objects that belong to a specific project or business unit, which could be used in conjunction with lifecycle policies to manage transitions to the S3 Standard – Infrequent Access and Amazon S3 Glacier storage classes.
Q: How can I update the Object Tags on my objects?
Object Tags can be changed at any time during the lifetime of your S3 object. You can use either the Amazon Web Services Management Console, the REST API, the Amazon CLI, or the Amazon SDKs to change your object tags. Note that all changes to tags outside of the Amazon Web Services Management Console are made to the full tag set. If you have five tags attached to a particular object and want to add a sixth, you need to include the original five tags in that request.
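Because the PutObjectTagging API replaces the full tag set, a request that adds a sixth tag must resend the existing five, as in this sketch with the Python SDK (boto3); the bucket, key, and tag values are hypothetical.

import boto3

s3 = boto3.client("s3")

# Resend the five existing tags together with the new sixth tag.
s3.put_object_tagging(
    Bucket="example-bucket",
    Key="reports/q1.csv",
    Tagging={
        "TagSet": [
            {"Key": "project", "Value": "phoenix"},
            {"Key": "classification", "Value": "internal"},
            {"Key": "Dept", "Value": "1234"},
            {"Key": "retention", "Value": "7y"},
            {"Key": "owner", "Value": "analytics"},
            {"Key": "reviewed", "Value": "true"},  # newly added sixth tag
        ]
    },
)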
Q: Will my Object Tags be replicated if I use Cross-Region Replication?
Object Tags can be replicated across regions using Cross-Region Replication. For more information about setting up Cross-Region Replication, please visit How to Set Up Cross-Region Replication in the Amazon S3 Developer Guide.
For customers with Cross-Region Replication already enabled, new permissions are required in order for tags to replicate. For more information on the policies required, please visit "How to Set Up Cross-Region Replication" in the Amazon S3 Developer Guide.
Q: How much do Object Tags cost?
Please see the Amazon S3 pricing page for more information.
Lifecycle Management Policies
Q: What is Lifecycle Management?
S3 Lifecycle management provides the ability to define the lifecycle of your object with a predefined policy and reduce your cost of storage. You can set lifecycle transition policy to automatically migrate Amazon S3 objects to Standard - Infrequent Access (Standard - IA), Amazon S3 Glacier Flexible Retrieval, and/or Amazon S3 Glacier Deep Archive based on the age of the data. You can also set lifecycle expiration policies to automatically remove objects based on the age of the object. You can set a policy for multipart upload expiration, which expires incomplete multipart uploads based on the age of the upload.
Q: How do I set up a lifecycle management policy?
You can set up and manage lifecycle policies in the S3 Console, S3 REST API, Amazon SDKs, or Amazon Command Line Interface (CLI). You can specify the policy at the prefix or at the bucket level.
Q: How much does it cost to use lifecycle management?
There is no additional cost to set up and apply lifecycle policies. A transition request is charged per object when an object becomes eligible for transition according to the lifecycle rule.
Q: What can I do with Lifecycle Management Policies?
As data matures, it can become less critical, less valuable, and subject to compliance requirements. Amazon S3 includes an extensive library of policies that help you automate data migration processes. For example, you can set infrequently accessed objects to move into a lower-cost storage class (like Standard - Infrequent Access) after a period of time. After another period, objects can be moved into Amazon S3 Glacier Flexible Retrieval for archive and compliance, and eventually deleted. These rules can invisibly lower storage costs and simplify management efforts, and may be leveraged across the Amazon family of storage services. These policies also include good stewardship practices to remove objects and attributes that are no longer needed, helping you manage cost and optimize performance.
Q: How can I use Amazon S3’s lifecycle policy to lower my Amazon S3 storage costs?
With Amazon S3’s lifecycle policies, you can configure your objects to be migrated to Standard - Infrequent Access (Standard - IA), archived to Amazon S3 Glacier Flexible Retrieval or Amazon S3 Glacier Deep Archive, or deleted after a specific period of time. You can use this policy-driven automation to quickly and easily reduce storage costs as well as save time. In each rule you can specify a prefix, a time period, a transition to Standard - IA or Amazon S3 Glacier Flexible Retrieval, and/or an expiration. For example, you could create a rule that archives all objects with the common prefix “logs/” into Amazon S3 Glacier Flexible Retrieval 30 days after creation, and expires these objects 365 days after creation. You can also create a separate rule that expires all objects with the prefix “backups/” 90 days after creation. Lifecycle policies apply to both existing and new S3 objects, ensuring that you can optimize storage and maximize cost savings for all current data and any new data placed in S3 without time-consuming manual data review and migration. Within a lifecycle rule, the prefix field identifies the objects subject to the rule. To apply the rule to an individual object, specify the key name. To apply the rule to a set of objects, specify their common prefix (e.g. “logs/”). You can specify a transition action to have your objects archived and an expiration action to have your objects removed. For the time period, provide the creation date (e.g. January 31, 2015) or the number of days from creation date (e.g. 30 days) after which you want your objects to be archived or removed. You may create multiple rules for different prefixes. Finally, you may use lifecycle policies to automatically expire incomplete uploads, preventing billing on partial file uploads.
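The rules described above could be written with the Python SDK (boto3) roughly as follows; the bucket name and rule IDs are hypothetical, and GLACIER is the API value for S3 Glacier Flexible Retrieval.

import boto3

s3 = boto3.client("s3")

# Archive "logs/" objects after 30 days and expire them after 365 days;
# expire "backups/" objects after 90 days; abort incomplete multipart
# uploads that are more than 7 days old.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-then-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            },
            {
                "ID": "expire-backups",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Expiration": {"Days": 90},
            },
            {
                "ID": "abort-stale-multipart-uploads",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
        ]
    },
)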
Q: How can I configure my objects to be deleted after a specific time period?
You can set a lifecycle expiration policy to remove objects from your buckets after a specified number of days. You can define the expiration rules for a set of objects in your bucket through the Lifecycle Configuration policy that you apply to the bucket. Each Object Expiration rule allows you to specify a prefix and an expiration period. The prefix field identifies the objects subject to the rule. To apply the rule to an individual object, specify the key name. To apply the rule to a set of objects, specify their common prefix (e.g. “logs/”). For expiration period, provide the number of days from creation date (i.e. age) after which you want your objects removed. You may create multiple rules for different prefixes. For example, you could create a rule that removes all objects with the prefix “logs/” 30 days from creation, and a separate rule that removes all objects with the prefix “backups/” 90 days from creation.
After an Object Expiration rule is added, the rule is applied to objects that already exist in the bucket as well as new objects added to the bucket. Once objects are past their expiration date, they are identified and queued for removal. You will not be billed for storage for objects on or after their expiration date, though you may still be able to access those objects while they are in queue before they are removed. As with standard delete requests, Amazon S3 doesn’t charge you for removing objects using Object Expiration. You can set Expiration rules for your versioning-enabled or versioning-suspended buckets as well.
Q: Why would I use a lifecycle policy to expire incomplete multipart uploads?
The lifecycle policy that expires incomplete multipart uploads allows you to save on costs by limiting the time incomplete multipart uploads are stored. For example, if your application uploads several multipart object parts but never commits them, you will still be charged for that storage. This policy lowers your S3 storage bill by automatically removing incomplete multipart uploads and the associated storage after a predefined number of days.
Q: Can I set up Amazon S3 Event Notifications to send notifications when S3 Lifecycle transitions or expires objects?
Yes, you can set up Amazon S3 Event Notifications to notify you when S3 Lifecycle transitions or expires objects. For example, you can send S3 Event Notifications to an Amazon SNS topic, Amazon SQS queue, or Amazon Lambda function when S3 Lifecycle moves objects to a different S3 storage class or expires objects.
Replication
Q: What is Amazon S3 Replication?
Amazon S3 Replication enables automatic, asynchronous copying of objects across Amazon S3 buckets. Buckets that are configured for object replication can be owned by the same Amazon Web Services account or by different accounts. You can replicate new objects written into the bucket to one or more destination buckets between different Amazon Web Services China Regions (S3 Cross-Region Replication), or within the same Amazon Web Services Region (S3 Same-Region Replication). You can also replicate existing bucket contents (S3 Batch Replication), including existing objects, objects that previously failed to replicate, and objects replicated from another source.
Q: What is Amazon S3 Cross-Region Replication (CRR)?
CRR is an Amazon S3 feature that automatically replicates data between buckets across different Amazon Web Services China Regions. With CRR, you can set up replication at a bucket level, a shared prefix level, or an object level using S3 object tags. You can use CRR to provide lower-latency data access to users within the Amazon Web Services China Regions. CRR can also help if you have a compliance requirement to store copies of data hundreds of miles apart. You can use CRR to change account ownership for the replicated objects to protect data from accidental deletion. To learn more about CRR, please visit the replication developer guide.
Q: What is Amazon S3 Same-Region Replication (SRR)?
SRR is an Amazon S3 feature that automatically replicates data between buckets within the same Amazon Web Services Region. With SRR, you can set up replication at a bucket level, a shared prefix level, or an object level using S3 object tags. You can use SRR to create one or more copies of your data in the same Amazon Web Services Region. SRR helps you address data sovereignty and compliance requirements by keeping a copy of your data in a separate Amazon Web Services account in the same region as the original. You can use SRR to change account ownership for the replicated objects to protect data from accidental deletion. You can also use SRR to easily aggregate logs from different S3 buckets for in-region processing, or to configure live replication between test and development environments. To learn more about SRR, please visit the replication developer guide.
Q: What is Amazon S3 Batch Replication?
Amazon S3 Batch Replication replicates existing objects between buckets. You can use S3 Batch Replication to backfill a newly created bucket with existing objects, retry objects that were previously unable to replicate, migrate data across accounts, or add new buckets to your data lake. You can get started with S3 Batch Replication with just a few clicks in the S3 console or a single API request.
Q: How do I enable Amazon S3 Replication (Cross-Region Replication and Same-Region Replication)?
Amazon S3 Replication (CRR and SRR) is configured at the S3 bucket level, a shared prefix level, or an object level using S3 object tags. You add a replication configuration on your source bucket by specifying a destination bucket in the same or different Amazon Web Services China Regions for replication.
You can use the S3 Management Console, API, Amazon CLI, Amazon SDKs, or Amazon CloudFormation to enable replication. Versioning must be enabled for both the source and destination buckets to enable replication.
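As a minimal sketch, a replication rule could be added with the Python SDK (boto3) as follows; the bucket names, IAM role ARN, and account ID are hypothetical, and both buckets are assumed to already have versioning enabled.

import boto3

s3 = boto3.client("s3")

# Replicate objects under "logs/" to a destination bucket, storing the
# replicas in S3 Standard-IA.
s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws-cn:iam::111122223333:role/s3-replication-role",
        "Rules": [
            {
                "ID": "replicate-logs",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": "arn:aws-cn:s3:::destination-bucket",
                    "StorageClass": "STANDARD_IA",
                },
            }
        ],
    },
)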
Q: How do I use S3 Batch Replication?
You would first need to enable S3 Replication at the bucket level; see the previous question for how to do so. You may then initiate an S3 Batch Replication job in the S3 console after creating a new replication configuration, changing a replication destination in a replication rule from the replication configuration page, or from the S3 Batch Operations Create Job page. Alternatively, you can initiate an S3 Batch Replication job via the Amazon CLI or SDKs.
Q: Can I use S3 Replication with S3 Lifecycle rules?
With S3 Replication, you can establish replication rules to make copies of your objects into another storage class, in the same or a different region within China. Lifecycle actions are not replicated, and if you want the same lifecycle configuration applied to both source and destination buckets, enable the same lifecycle configuration on both.
For example, you can configure a lifecycle rule to migrate data from the S3 Standard storage class to the S3 Standard-IA on the destination bucket.
With S3 Batch Replication, in addition to Lifecycle actions not being replicated from the source, we recommend that you pause Lifecycle in the destination bucket while the Batch Replication job is active if the destination has active Lifecycle rules. This is because certain Lifecycle policies depend on the version stack state to transition objects. While Batch Replication is still replicating objects, the version stack in the destination bucket will differ from the one in the source bucket, and Lifecycle could incorrectly rely on the incomplete version stack to transition objects.
You can find more information about lifecycle configuration and replication on the S3 Replication developer guide.
Q: Can I use S3 Replication to replicate to more than one destination bucket?
Yes. S3 Replication allows customers to replicate their data to multiple destination buckets in the same or different Amazon Web Services China Regions. When setting up, you simply specify the new destination bucket in your existing replication configuration or create a new replication configuration with multiple destination buckets. For each new destination you specify, you have the flexibility to choose the storage class of the destination bucket, encryption type, replication metrics and notifications, and other properties.
Q: Can I use S3 Replication to setup two-way replication between S3 buckets?
Yes. To set up two-way replication, you create a replication rule from S3 bucket A to S3 bucket B and set up another replication rule from S3 bucket B to S3 bucket A. When setting up the replication rule from S3 bucket B to S3 bucket A, please enable Sync Replica Modifications to replicate replica metadata changes. With replica modification sync, you can easily replicate metadata changes like object access control lists (ACLs), object tags, or object locks on the replicated objects.
Q: Are objects securely transferred and encrypted throughout replication process?
Yes, objects remain encrypted throughout the replication process. The encrypted objects are transmitted securely via SSL from the source region to the destination region (CRR) or within the same region (SRR).
Q: Can I use replication across Amazon Web Services China accounts to protect against malicious or accidental deletion?
Yes, for CRR and SRR, you can set up replication across Amazon Web Services China accounts to store your replicated data in a different account in the target region. You can use Ownership Overwrite in your replication configuration to maintain a distinct ownership stack between source and destination, and grant destination account ownership to the replicated storage.
Q: Can I replicate delete markers from one bucket to another?
Yes, you can replicate delete markers from source to destination if you have delete marker replication enabled in your replication configuration. When you replicate delete markers, Amazon S3 will behave as if the object was deleted in both buckets. You can enable delete marker replication for a new or existing replication rule. You can apply delete marker replication to the entire bucket or to Amazon S3 objects that have a specific prefix, with prefix based replication rules. Amazon S3 Replication does not support delete marker replication for object tag based replication rules. To learn more about enabling delete marker replication see Replicating delete markers from one bucket to another.
Q: Can I replicate data from other Amazon Web Services Regions to China? Can a customer replicate from a China Region bucket to a bucket outside of the China Regions?
No, Amazon S3 Replication is not available between Amazon Web Services China Regions and Amazon Web Services Regions outside of China. You are only able to replicate within the Amazon Web Services China regions.
Q: Can I replicate existing objects?
Yes, you can use S3 Batch Replication to replicate existing objects between buckets.
Q: Can I re-try replication if objects fail to replicate initially?
Yes, you can use S3 Batch Replication to re-try objects that fail to replicate initially.
Q: What encryption types does S3 Replication support?
S3 Replication supports all encryption types that S3 offers. S3 offers both server-side encryption and client-side encryption – the former requests S3 to encrypt the objects for you, and the latter is for you to encrypt data on the client-side before uploading it to S3. For server-side encryption, S3 offers server-side encryption with Amazon S3-managed keys (SSE-S3), server-side encryption with KMS keys stored in Amazon Key Management Service (SSE-KMS), and server-side encryption with customer-provided keys (SSE-C). For further details on these encryption types and how they work, visit the S3 documentation on using encryption.
Q: What is the pricing for S3 Replication (CRR and SRR)?
You pay the Amazon S3 charges for storage, copy requests, and for CRR you pay the inter-region data transfer OUT for the replicated copy of data to the destination region. Copy requests and inter-region data transfer are charged based on the source region. Storage for replicated data is charged based on the target region. If the source object is uploaded using the multipart upload feature, then it is replicated using the same number of parts and part size. For example, a 100 GB object uploaded using the multipart upload feature (800 parts of 128 MB each) will incur request cost associated with 802 requests (800 Upload Part requests + 1 Initiate Multipart Upload request + 1 Complete Multipart Upload request) when replicated. After replication, the 100 GB will incur storage charges based on the destination region. Please visit the S3 pricing page for pricing.
If you are using S3 Batch Replication to replicate objects across accounts, you will incur the S3 Batch Operations charges, in addition to the replication PUT requests and Data Transfer OUT charges (note that S3 RTC is not applicable to Batch Replication). The Batch Operations charges include the Job and Object charges, which are respectively based on the number of jobs and the number of objects processed.
S3 Replication Time Control
Q: What is Amazon S3 Replication Time Control?
Amazon S3 Replication Time Control provides predictable replication performance and helps you meet compliance or business requirements. S3 Replication Time Control is designed to replicate most objects in seconds, and 99.99% of objects within 15 minutes. S3 Replication Time Control is backed by a Service Level Agreement (SLA) commitment that 99.9% of objects will be replicated in 15 minutes for each replication region pair during any billing month. Replication Time Control works with all S3 Replication features. To learn more, please visit the replication developer guide.
Q: What are Amazon S3 Replication metrics and events?
Amazon S3 Replication provides four detailed metrics in the Amazon S3 console and in Amazon CloudWatch: operations pending, bytes pending, replication latency, and operations failed replication. You can use these metrics to monitor the total number of operations and size of objects that are pending to replicate, the replication latency between source and destination buckets, and the number of operations that did not replicate successfully for each replication rule. Additionally, you can set up Amazon S3 Event Notifications of s3:Replication type to get more information about objects that failed to replicate and the reason behind the failures. We recommend using Amazon S3 replication failure reasons to diagnose the errors quickly and fix them before re-replicating the failed objects with S3 Batch Replication. Finally, if you have S3 Replication Time Control (S3 RTC) enabled you will receive an S3 Event Notification when an object takes more than 15 minutes to replicate, and another when that object replicates successfully to the destination.
Q: How do I enable Amazon S3 Replication Time Control?
You can enable S3 Replication Time Control as an option for each replication rule. You can create a new S3 Replication policy with S3 Replication Time Control, or enable the feature on an existing policy.
You can use the S3 console, API, Amazon Web Services CLI, Amazon Web Services SDKs, or Amazon CloudFormation to configure replication. To learn more, please visit overview of setting up S3 Replication in the Amazon S3 Developer Guide.
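For illustration, the following is a hedged boto3 sketch of a replication rule with S3 Replication Time Control and replication metrics enabled. The bucket names, IAM role ARN, and rule ID are placeholder values; consult the replication developer guide for the authoritative configuration schema.

```python
# Hedged sketch (boto3): one replication rule with S3 Replication Time Control (RTC)
# and replication metrics. Bucket names, role ARN, and rule ID are placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws-cn:iam::111122223333:role/replication-role",  # assumed role
        "Rules": [{
            "ID": "rtc-rule",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {"Prefix": ""},                      # replicate the whole bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                "Bucket": "arn:aws-cn:s3:::destination-bucket",
                # S3 Replication Time Control: 15-minute replication commitment
                "ReplicationTime": {"Status": "Enabled", "Time": {"Minutes": 15}},
                # Replication metrics: pending bytes/operations, latency, failures
                "Metrics": {"Status": "Enabled", "EventThreshold": {"Minutes": 15}},
            },
        }],
    },
)
```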
Q: What information does the operations failed replication metric show?
The operations failed replication metric shows the total number of operations that failed to replicate per minute for a specific replication rule. The metric refreshes every minute, emitting +1 for each failed operation, 0 when all operations succeed, and no data point when no replication operations are carried out in that minute. This metric is emitted every time an operation does not replicate successfully.
Q: Can I use Amazon S3 Replication metrics and events to track S3 Batch Replication?
You cannot use metrics like bytes pending, operations pending, and replication latency to track S3 Batch Replication progress. However, you can use the operations failed replication metric to monitor existing objects that do not replicate successfully with S3 Batch Replication. Additionally, you can also use S3 Batch Operations completion reports to keep track of objects replicating with S3 Batch Replication.
Q: Where are Amazon S3 Replication metrics published?
The bytes pending, operations pending, and replication latency metrics are published in the source Amazon Web Services account and the destination Amazon Web Services Region. However, the operations failed replication metric is published in the source Amazon Web Services account and source Amazon Web Services Region, rather than the destination Region. This is because, if the operations failed replication metric were published in the destination Region, you would not see the metric when the destination bucket is misconfigured. For example, if you mistype the destination bucket name in the replication configuration and replication fails because the destination bucket is not found, you would not see any value for this metric, because the destination Region cannot be determined when the destination bucket is not found.
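As a rough illustration, the sketch below reads the operations failed replication metric from CloudWatch in the source Region using boto3. The metric and dimension names follow the S3 replication metrics documentation; the bucket names and rule ID are placeholders.

```python
# Hedged sketch (boto3): reading an S3 Replication metric from CloudWatch.
# Metric/dimension names are assumed from the S3 replication metrics docs;
# bucket names and rule ID are placeholders.
import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/S3",
    MetricName="OperationsFailedReplication",
    Dimensions=[
        {"Name": "SourceBucket", "Value": "source-bucket"},
        {"Name": "DestinationBucket", "Value": "destination-bucket"},
        {"Name": "RuleId", "Value": "rtc-rule"},
    ],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=1),
    EndTime=datetime.datetime.utcnow(),
    Period=60,               # the metric is emitted at one-minute granularity
    Statistics=["Sum"],
)
print(resp["Datapoints"])
```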
Q: How do I enable Amazon S3 Replication metrics and events?
Amazon S3 Replication metrics and events can be enabled for each new or existing replication rule. You can access S3 Replication metrics through the Amazon S3 console and Amazon CloudWatch. Like other Amazon S3 events, S3 Replication events are available through Amazon Simple Queue Service (Amazon SQS), Amazon Simple Notification Service (Amazon SNS), or Amazon Lambda. To learn more, please visit Monitoring progress with replication metrics and Amazon S3 event notifications in the Amazon S3 Developer Guide.
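The following hedged boto3 sketch shows one way to subscribe an Amazon SQS queue to s3:Replication event notifications on the source bucket. The queue ARN and bucket name are placeholders.

```python
# Hedged sketch (boto3): routing s3:Replication events to an SQS queue.
# The queue ARN and bucket name are placeholders; the queue must already permit S3 to publish.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_notification_configuration(
    Bucket="source-bucket",
    NotificationConfiguration={
        "QueueConfigurations": [{
            "QueueArn": "arn:aws-cn:sqs:cn-north-1:111122223333:replication-events",
            "Events": [
                "s3:Replication:OperationFailedReplication",
                "s3:Replication:OperationMissedThreshold",          # RTC: took >15 minutes
                "s3:Replication:OperationReplicatedAfterThreshold",  # RTC: later success
            ],
        }],
    },
)
```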
Q: What is the Amazon S3 Replication Time Control Service Level Agreement (SLA)?
Amazon S3 Replication Time Control is designed to replicate 99.99% of your objects within 15 minutes, and is backed by a Service Level Agreement. If fewer than 99.9% of your objects are replicated within 15 minutes for a replication region pair during a monthly billing cycle, the S3 RTC SLA provides a service credit on any object that takes longer than 15 minutes to replicate. The service credit is divided into a Source Region Service Credit and a Destination Region Service Credit. The Source Region Service Credit covers a percentage of the charges specific to inter-region data transfer and the RTC feature fee associated with the objects affected in the affected monthly billing cycle. The Destination Region Service Credit covers a percentage of the replication bandwidth and request charges, and the cost of storing your replica in the destination region, in the affected monthly billing cycle. To learn more, read the S3 Replication Time Control SLA.
Q: What is the pricing for S3 Replication and S3 Replication Time Control?
For S3 Replication, Cross-Region Replication (CRR) and Same-Region Replication (SRR), you pay the S3 charges for storage in the selected destination S3 storage classes, the storage charges for the primary copy, replication PUT requests, and applicable infrequent access storage retrieval charges. For CRR, you also pay for inter-Region Data Transfer OUT from S3 to each destination Region. When you use S3 Replication Time Control, you also pay a Replication Time Control Data Transfer charge and S3 Replication Metrics charges that are billed at the same rate as Amazon CloudWatch custom metrics. For more information, please visit the S3 pricing page.
If the source object is uploaded using the multipart upload feature, then it is replicated using the same number of parts and part size. For example, a 100-GB object uploaded using the multipart upload feature (800 parts of 128 MB each) will incur request costs associated with 802 requests (800 Upload Part requests + 1 Initiate Multipart Upload request + 1 Complete Multipart Upload request) when replicated. You will incur a request charge of approximately ¥ 0.0032 (802 requests x ¥ 0.00405 per 1,000 requests) and (if the replication was between different Amazon Web Services Regions) a charge of ¥ 60.03 (¥ 0.6003 per GB transferred x 100 GB) for inter-region data transfer. After replication, the 100 GB will incur storage charges based on the destination Region.
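As a quick check of the arithmetic in this example (using only the rates quoted above; see the S3 pricing page for current rates):

```python
# Reproducing the worked replication-cost example above with the rates it quotes.
requests = 800 + 1 + 1                  # Upload Part + Initiate + Complete
request_rate = 0.00405 / 1000           # ¥ per request
transfer_rate = 0.6003                  # ¥ per GB transferred between Regions
object_size_gb = 100

request_charge = requests * request_rate           # ≈ ¥0.0032
transfer_charge = object_size_gb * transfer_rate   # ¥60.03
print(round(request_charge, 4), round(transfer_charge, 2))
```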
Q: How am I charged for S3 Replication metrics on Amazon CloudWatch?
All S3 Replication metrics, including bytes pending, operations pending, replication latency, and operations failed replication, are billed at the same rate as Amazon CloudWatch Custom metrics: ¥ 2.0000 per metric per month for the first 10K metrics, ¥ 0.6673 per metric per month for the next 240K metrics, ¥ 0.3337 per metric per month for the next 750K metrics, and ¥ 0.1335 per metric per month for over 1M metrics.
For example, if your S3 bucket has 100 replication rules with Replication metrics enabled for each rule, you will see a monthly Amazon CloudWatch charge for 400 replication metrics (100 replication rules x 4 metrics per replication rule). The monthly prorated charge for these 400 metrics will be ¥ 800.0000 (400 replication metrics x ¥ 2.0000 per metric (for the first 10K metrics)). For information on Amazon CloudWatch billing, see the Amazon CloudWatch pricing page.
Storage Analytics & Insights
S3 Storage Lens
Q: What features are available to analyze my storage usage on Amazon S3?
S3 Storage Lens delivers organization-wide visibility into object storage usage and activity trends, and makes actionable recommendations to optimize costs and apply data protection best practices. S3 Storage Class Analysis enables you to monitor access patterns across objects to help you decide when to transition data to the right storage class to optimize costs. You can then use this information to configure an S3 Lifecycle policy that makes the data transition. Amazon S3 Inventory provides a report of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or prefix. This report can be used to help meet business, compliance, and regulatory needs by verifying the encryption and replication status of your objects.
Q: What is Amazon S3 Storage Lens?
Amazon S3 Storage Lens provides organization-wide visibility into object storage usage and activity trends, as well as actionable recommendations to optimize costs and apply data protection best practices. Storage Lens offers an interactive dashboard containing a single view of your object storage usage and activity across tens or hundreds of accounts in your organization, with drill-downs to generate insights at multiple aggregation levels. This includes metrics like bytes, object counts, and requests, as well as metrics detailing S3 feature utilization, such as encrypted object counts and S3 Lifecycle rule counts. S3 Storage Lens also delivers contextual recommendations to find ways for you to reduce storage costs and apply best practices on data protection across tens or hundreds of accounts and buckets. S3 Storage Lens free metrics are enabled by default for all Amazon S3 users. If you want to get more out of S3 Storage Lens, you can activate advanced metrics and recommendations. Learn more by visiting the S3 Storage Lens user guide.
Q: How does S3 Storage Lens work?
S3 Storage Lens aggregates your storage usage and activity metrics on a daily basis to be visualized in the S3 Storage Lens interactive dashboard, or made available as a metrics export in CSV or Parquet file format. A default dashboard is created for you automatically at the account level, and you have the option to create additional custom dashboards. S3 Storage Lens dashboards can be scoped to your Amazon Web Services organization or to specific accounts, Regions, buckets, or even the prefix level (available with S3 Storage Lens advanced metrics). You can also use S3 Storage Lens groups to aggregate metrics using custom filters based on object metadata such as S3 Object Tags, size, and age. When configuring your dashboard, you can use the default metrics selection, or upgrade to receive 35 additional metrics and recommendations for an additional cost. Also, S3 Storage Lens provides recommendations contextually with storage metrics in the dashboard, so you can take action to optimize your storage based on the metrics.
Q: What are the key questions that can be answered using S3 Storage Lens metrics?
The S3 Storage Lens dashboard is organized around four main types of questions that can be answered about your storage. With the Summary filter, you can explore top-level questions related to overall storage usage and activity trends, for example, “How rapidly is my overall byte count and request count increasing over time?” With the Cost Optimization filter, you can explore questions related to storage cost reduction, for example, “Is it possible for me to save money by retaining fewer non-current versions?” With the Data Protection and Access Management filters you can answer questions about securing your data, for example, “Is my storage protected from accidental or intentional deletion?” Finally, with the Performance and Events filters you can explore ways to improve the performance of workflows. Each of these questions represents a first layer of inquiry that would likely lead to drill-down analysis.
Q: What metrics are available in S3 Storage Lens?
S3 Storage Lens contains more than 60 metrics, grouped into free metrics and advanced metrics (available for an additional cost). Within free metrics, you receive metrics to analyze usage (based on a daily snapshot of your objects), which are organized into the categories of cost optimization, data protection, access management, performance, and events. Within advanced metrics, you receive metrics related to activity (such as request counts), deeper cost optimization (such as S3 Lifecycle rule counts), additional data protection (such as S3 Replication rule counts), and detailed status codes (such as 403 authorization errors). In addition, derived metrics are also provided by combining any base metrics. For example, “Retrieval Rate” is a metric calculated by dividing the “Bytes Downloaded Count” by the “Total Storage.” To view the complete list of metrics, please visit the S3 documentation.
Q: How do I get started with S3 Storage Lens?
S3 Storage Lens can be configured via the S3 console, Amazon Web Services CLI, or Amazon Web Services SDKs. If you are a member of an Amazon Organizations master account, you can create configurations for all or a subset of the accounts participating in your organization. Otherwise, you will configure it at the account level. Your metrics will be available within 24-48 hours of configuration.
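For illustration, a minimal boto3 sketch of creating an account-level S3 Storage Lens configuration with free metrics follows. The account ID and configuration ID are placeholders.

```python
# Hedged sketch (boto3): an account-level S3 Storage Lens configuration with free metrics.
# The account ID and configuration ID are placeholders.
import boto3

s3control = boto3.client("s3control")
s3control.put_storage_lens_configuration(
    ConfigId="example-dashboard",
    AccountId="111122223333",
    StorageLensConfiguration={
        "Id": "example-dashboard",
        "AccountLevel": {"BucketLevel": {}},   # aggregate metrics down to the bucket level
        "IsEnabled": True,
    },
)
```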
Q: What are my dashboard configuration options?
A default dashboard is provided automatically for your entire account, and you have the option to create additional custom dashboards that can be scoped to your Amazon Web Services organization, or to specific Regions or buckets within an account. You can set up multiple custom dashboards, which can be useful if you require some logical separation in your storage analysis, such as segmenting on buckets to represent various internal teams. By default, your dashboard will receive the S3 Storage Lens free metrics, but you have the option to upgrade to receive S3 Storage Lens advanced metrics and recommendations (for an additional cost). S3 Storage Lens advanced metrics have 7 distinct options: Activity metrics, Advanced Cost Optimization metrics, Advanced Data Protection metrics, Detailed Status Code metrics, Prefix aggregation, CloudWatch publishing, and Storage Lens groups aggregation. Additionally, for each dashboard you can enable metrics export, with additional options to specify the destination bucket and encryption type.
Q: How much historical data is available in S3 Storage Lens?
For metrics displayed in the interactive dashboard, Storage Lens free metrics retains 14 days of historical data, and Storage Lens advanced metrics (for an additional cost) retains 15 months of historical data. For the optional metrics export, you can configure any retention period you wish, and standard S3 storage charges will apply.
Q: Can I configure S3 Storage Lens to automatically track new buckets and prefixes?
Yes, S3 Storage Lens provides an option to configure dashboards with a scope of “all buckets,” which means that any newly created buckets, or prefixes within a bucket, would automatically be tracked under this configuration.
Q: Who will have permissions to access metrics from S3 Storage Lens?
S3 Storage Lens supports new permissions in IAM policy to authorize access to S3 Storage Lens APIs. You can attach the policy to IAM Users, IAM Groups or Roles to grant them permissions to enable/disable S3 Storage Lens, or to access any dashboard in the console. You can also use the Lens tagging APIs to attach tag pairs (up to 50) to the dashboard configurations and use resource tags in IAM policy to manage permissions. For metrics exports, which are stored in a bucket in your account, permissions are granted using existing s3:GetObject permission in the IAM policy. Similarly, for an Amazon Organization entity, the org master or delegate admin account can use IAM policies to manage access permissions for org-level configurations.
Q: How will I be charged for S3 Storage Lens?
S3 Storage Lens is available in two tiers of metrics. The free metrics are enabled by default and available at no additional charge to all S3 customers. The S3 Storage Lens advanced metrics and recommendations pricing details are available on the S3 pricing page. With S3 Storage Lens free metrics you receive 28 metrics at the bucket level, and can access 14 days of historical data in the dashboard. With S3 Storage Lens advanced metrics and recommendations you receive 35 additional metrics, prefix-level aggregation, CloudWatch metrics support, custom object metadata filtering with S3 Storage Lens groups, and can access 15 months of historical data in the dashboard.
Q: What is the difference between S3 Storage Lens and S3 Inventory?
S3 Inventory provides a list of your objects and their corresponding metadata for an S3 bucket or a shared prefix, which can be used to perform object-level analysis of your storage. S3 Storage Lens provides metrics that can be aggregated at the organization, account, Region, storage class, bucket, prefix, and S3 Storage Lens group levels, which improves organization-wide visibility of your storage.
Q: What is the difference between S3 Storage Lens and S3 Storage Class Analysis (SCA)?
S3 Storage Class Analysis provides recommendations for an optimal storage class by creating object age groups based on object-level access patterns within an individual bucket/prefix/tag for the previous 30-90 days. S3 Storage Lens provides daily organization level recommendations on ways to improve cost efficiency and apply data protection best practices, with additional granular recommendations by account, region, storage class, bucket, S3 Storage Lens group, or prefix (available with S3 Storage Lens advanced metrics). You can also use custom filters with S3 Storage Lens groups to create an object age distribution of your buckets and optimize your retention and archival strategy.
Storage Class Analysis
Q: How do I get started with S3 Analytics – Storage Class Analysis?
You can use the Amazon Web Services Management Console or the S3 PUT Bucket Analytics API to configure a Storage Class Analysis policy to identify infrequently accessed storage that can be transitioned to Standard-IA or archived to Glacier. You can navigate to the “Management” tab in the S3 console to manage S3 Analytics, S3 Inventory, and S3 CloudWatch metrics.
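For illustration, the hedged boto3 sketch below configures Storage Class Analysis on a bucket and exports the daily analysis to a second bucket as CSV. The bucket names, ARN, and IDs are placeholders.

```python
# Hedged sketch (boto3): Storage Class Analysis on a whole bucket with a CSV export.
# Bucket names, the destination ARN, and configuration IDs are placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_analytics_configuration(
    Bucket="example-bucket",
    Id="whole-bucket-analysis",
    AnalyticsConfiguration={
        "Id": "whole-bucket-analysis",
        # Omit "Filter" to analyze the entire bucket; a prefix or tag filter can be added.
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": "arn:aws-cn:s3:::analysis-export-bucket",
                        "Prefix": "storage-class-analysis/",
                    },
                },
            },
        },
    },
)
```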
Q: What is S3 Analytics - Storage Class Analysis?
With Storage Class Analysis, you can analyze storage access patterns and transition the right data to the right storage class. This S3 Analytics feature automatically identifies infrequent access patterns to help you transition storage to S3 Standard-IA, S3 One Zone-IA, Amazon S3 Glacier Flexible Retrieval, or Amazon S3 Glacier Deep Archive. You can configure a Storage Class Analysis policy to monitor an entire bucket, a prefix, or an object tag. Once an infrequent access pattern is observed, you can easily create a new Lifecycle age policy based on the results. Storage Class Analysis also provides daily visualizations of your storage usage in the Amazon Web Services Management Console that you can export to an S3 bucket to analyze using the business intelligence tools of your choice.
Q: How often is the Storage Class Analysis updated?
Storage Class Analysis is updated on a daily basis in the S3 Management Console. Additionally, you can configure S3 Analytics to export your daily Storage Class Analysis report to an S3 bucket of your choice.
Q: How am I charged for using S3 Analytics – Storage Class Analysis?
Please see the Amazon S3 pricing page for general information about S3 Analytics – Storage Class Analysis pricing.
S3 Inventory
Q: What is S3 Inventory?
S3 Inventory provides a CSV or ORC file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or prefix. You can simplify and speed up business workflows and big data jobs with S3 Inventory. You can also use S3 Inventory to verify the encryption and replication status of your objects to meet business, compliance, and regulatory needs.
Q: How do I get started with S3 Inventory?
You can use the Amazon Web Services Management Console or the PUT Bucket Inventory API to configure a daily or weekly inventory for all the objects within your S3 bucket, or for a subset of the objects under a shared prefix. As part of the configuration, you can specify a destination S3 bucket for your inventory, the output file format (CSV or ORC), and the specific object metadata necessary for your business application, such as object name, size, last modified date, storage class, version ID, delete marker, noncurrent version flag, multipart upload flag, replication status, or encryption status.
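As an illustration of such a configuration, the hedged boto3 sketch below sets up a daily ORC-format inventory with SSE-S3 encryption of the report files. The bucket names, account ID, and prefix are placeholders.

```python
# Hedged sketch (boto3): a daily S3 Inventory report in ORC format, encrypted with SSE-S3.
# Bucket names, account ID, and the report prefix are placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_inventory_configuration(
    Bucket="example-bucket",
    Id="daily-orc-inventory",
    InventoryConfiguration={
        "Id": "daily-orc-inventory",
        "IsEnabled": True,
        "IncludedObjectVersions": "All",
        "Schedule": {"Frequency": "Daily"},
        "OptionalFields": ["Size", "LastModifiedDate", "StorageClass",
                           "ReplicationStatus", "EncryptionStatus"],
        "Destination": {
            "S3BucketDestination": {
                "AccountId": "111122223333",
                "Bucket": "arn:aws-cn:s3:::inventory-report-bucket",
                "Format": "ORC",
                "Prefix": "inventory/",
                "Encryption": {"SSES3": {}},   # encrypt the report files with SSE-S3
            },
        },
    },
)
```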
Q: Will S3 Inventory improve the performance for my big data jobs and business workflow applications?
Yes, S3 Inventory can be used as a ready-made input into a big data job or workflow application instead of the synchronous S3 LIST API, saving the time and compute resources it takes to call and process the LIST API response.
Q: Can files written by S3 Inventory be encrypted?
Yes, you can configure all files written by S3 Inventory to be encrypted with SSE-S3. For more information, refer to the user guide.
Q: How do I use S3 Inventory?
You can use S3 Inventory as a direct input into your application workflows or big data jobs. You can also query S3 Inventory using standard SQL with tools such as Presto, Hive, and Spark.
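For illustration, the hedged boto3 sketch below submits a standard SQL query over an inventory report through Amazon Athena (one of the query in place options described below). It assumes an Athena table has already been defined over the inventory files; the database, table, column names, and output location are placeholders.

```python
# Hedged sketch (boto3): querying an S3 Inventory report with standard SQL via Amazon Athena.
# Assumes a table named example_inventory_table already exists over the inventory files;
# the database, table, columns, and output location are hypothetical.
import boto3

athena = boto3.client("athena")
athena.start_query_execution(
    QueryString="""
        SELECT key, size, storage_class
        FROM example_inventory_table
        WHERE encryption_status = 'NOT-SSE'
    """,
    QueryExecutionContext={"Database": "example_db"},
    ResultConfiguration={"OutputLocation": "s3://athena-query-results-bucket/"},
)
```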
Q: How am I charged for using S3 Inventory?
Please see the Amazon S3 pricing page for general information about S3 Inventory pricing.
Query in Place
Q: What is "Query in Place" functionality?
Amazon S3 allows customers to run sophisticated queries against data stored in S3 without the need to move it into a separate analytics platform. The ability to query this data in place on Amazon S3 can significantly increase performance and reduce cost for analytics solutions that use S3 as a data lake. S3 offers multiple query in place options, including Amazon Athena and Amazon Redshift Spectrum, allowing you to choose the one that best fits your use case.