AWS S3 Object Key and Metadata
This article provides a detailed overview of S3 object keys and metadata and also highlights a few of the use cases in general.
AWS S3 Highlights
- S3 charges not only for storage but also for requests, data retrievals, data transfer, and replication.
- It’s advisable to use S3 object tagging, which costs around $0.01 per 10,000 tags per month.
- We suggest using the S3 pricing calculator to calculate and estimate your AWS S3 spending.
CloudySave AWS Savings Report provides the complete cost visibility of S3 service, which includes all details about storage, requests & data-transfer costs. This report greatly assists in understanding the overhead/wasteful costs.
When reviewing data-transfer costs, please consider taking the following actions:
- Using CloudFront or redesigning object locations.
- Applying lifecycle management rules to all S3 objects.
- Reviewing and cleaning S3 Objects that are never accessed.
An S3 object includes the following:
- Data: the content itself, which can be anything (files, zip archives, images, etc.).
- A key (key name): a unique identifier for the object within its bucket.
- Metadata: a set of name-value pairs that can be set when uploading an object. After a successful upload, metadata can no longer be modified in place. To change it, AWS suggests making a copy of the object and setting the metadata again.
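The copy-to-change-metadata approach can be sketched as a request that copies an object onto itself while replacing its metadata. This is a minimal sketch, not a definitive implementation: the helper name `build_metadata_replace_request` is hypothetical, the bucket and key are placeholders, and the parameter names are assumed to be passed to the `copy_object` call of the boto3 S3 client.

```python
def build_metadata_replace_request(bucket: str, key: str, new_metadata: dict) -> dict:
    """Build keyword arguments for an in-place copy that replaces user metadata."""
    return {
        "Bucket": bucket,
        "Key": key,
        # Source and destination are the same object, so the copy overwrites it.
        "CopySource": {"Bucket": bucket, "Key": key},
        "Metadata": dict(new_metadata),
        # REPLACE asks S3 to use the metadata in this request
        # instead of carrying over the metadata of the source object.
        "MetadataDirective": "REPLACE",
    }


# Placeholder bucket, key, and metadata for illustration only.
request = build_metadata_replace_request("example-bucket", "visit.pdf", {"reviewed": "yes"})
print(request["MetadataDirective"])

# With boto3 installed and credentials configured, the call would look like:
#   boto3.client("s3").copy_object(**request)
```

Building the arguments separately from the network call keeps the sketch runnable without AWS credentials.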
What are S3 Object Keys?
- When you create an object in S3, you must give it a unique key name that identifies the object within the bucket.
- For example, when a bucket is selected in the S3 console, the list of items shown represents the object keys. A key name is a sequence of Unicode characters whose UTF-8 encoding is at most 1,024 bytes long.
S3 data model:
- Flat structure.
- Create a bucket.
- The bucket stores the objects.
- No hierarchy (no sub-buckets or subfolders).
Nonetheless, key name prefixes and delimiters can be used to infer a logical hierarchy, which is how the S3 console supports the concept of folders.
For example, consider a bucket (creator-admin) that contains four objects with these object keys:
- FirstFile/assignment.rar
- SecondFile/DAL.xlsx
- ThirdFile/challenges.pdf
- visit.pdf
- The key name prefixes (FirstFile/, SecondFile/, and ThirdFile/) represent the folder structure within the S3 bucket.
- Because the visit.pdf key has no prefix, the S3 console presents it as an object at the root level of the bucket. Navigating into a folder shows that folder’s contents.
- S3 itself stores buckets and objects with no hierarchy. However, the prefixes and delimiters in object key names allow the S3 console and the SDKs to infer a hierarchy and present folders.
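To make the prefix-and-delimiter idea concrete, here is a small sketch that groups a flat list of keys into folder-like prefixes and direct objects, similar in spirit to what a listing request with a prefix and a delimiter returns. The function name `list_level` is hypothetical, and the logic is a simplification that runs locally without calling S3.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split flat keys under `prefix` into folder-like prefixes and direct objects."""
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to and including the first delimiter acts as a folder.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(folders), objects


keys = [
    "FirstFile/assignment.rar",
    "SecondFile/DAL.xlsx",
    "ThirdFile/challenges.pdf",
    "visit.pdf",
]
print(list_level(keys))
# (['FirstFile/', 'SecondFile/', 'ThirdFile/'], ['visit.pdf'])
print(list_level(keys, prefix="FirstFile/"))
# ([], ['FirstFile/assignment.rar'])
```

The bucket still holds only four flat keys; the folders exist only in how the keys are grouped for display.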
All you need to know about S3 Object Key Naming:
- Any UTF-8 character can be used in a key name.
- A few characters may cause problems with particular protocols and applications.
The guidelines below help maximize compatibility with DNS, web-safe characters, XML parsers, and other APIs.
What are the Safe Characters?
The following characters are commonly used in key names:
- Alphanumeric characters: “0-9”, “a-z”, “A-Z”.
- Special characters: “!”, “-”, “_”, “.”, “*”, “'”, “(”, “)”.
Here are a few examples of S3 object key names which are accepted:
- 2my-company
- our.nice_pictures-2020/feb/ourholiday.jpg
- clips/2020/party/clip1.wmv
If an object’s key name consists of a single period “.” or two periods “..”, the object can’t be downloaded through the S3 console, but it can still be managed through the AWS CLI, SDKs, or REST API.
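A key can be checked against these guidelines before upload. The sketch below assumes the safe set is the alphanumerics plus `! - _ . * ' ( )`, and it additionally allows `/` because it is the usual prefix delimiter (an assumption of this sketch). The name `is_safe_key` is hypothetical.

```python
import string

# Alphanumerics plus the safe special characters.
# "/" is allowed here only because it serves as the prefix delimiter (assumption).
SAFE_CHARS = set(string.ascii_letters + string.digits + "!-_.*'()" + "/")
MAX_KEY_BYTES = 1024  # key length limit, measured in UTF-8 bytes


def is_safe_key(key: str) -> bool:
    """Return True if the key is non-empty, within the limit, and uses only safe characters."""
    if not key or len(key.encode("utf-8")) > MAX_KEY_BYTES:
        return False
    return all(ch in SAFE_CHARS for ch in key)


for candidate in [
    "2my-company",
    "our.nice_pictures-2020/feb/ourholiday.jpg",
    "clips/2020/party/clip1.wmv",
    "reports/q1 summary.pdf",  # contains a space, so it is reported as not safe
]:
    print(candidate, is_safe_key(candidate))
```

A key that fails this check is not necessarily invalid in S3; it only falls outside the set that is least likely to need special handling.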
Characters That Might Require Special Handling
The following characters in a key name might require:
- Extra code handling
- URL encoding
- Referencing as HEX
Some of them are non-printable characters that a browser may not handle, which also calls for special handling. The characters include:
- “&”
- “$”
- ASCII control characters (0–31 decimal and 127 decimal)
- “@”
- “=”
- “:” and “;”
- “+”
- Space and particularly multiple spaces
- “,”
- “?”
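When a key containing such characters has to appear in a URL, percent-encoding is one way to handle it. The sketch below uses Python's standard `urllib.parse.quote`; the helper name `encode_key_for_url` and the sample key are hypothetical, and keeping `/` unencoded is a choice made here so that the prefix structure stays readable.

```python
from urllib.parse import quote, unquote


def encode_key_for_url(key: str) -> str:
    """Percent-encode a key for use in a URL path, keeping "/" as the delimiter."""
    return quote(key, safe="/")


encoded = encode_key_for_url("reports/2020/q1 summary & notes.pdf")
print(encoded)
# reports/2020/q1%20summary%20%26%20notes.pdf

# Decoding restores the original key name.
print(unquote(encoded))
# reports/2020/q1 summary & notes.pdf
```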
Which characters should you avoid in an S3 object key?
Avoid the following characters in a key name, because they require significant special handling to behave consistently across all applications.
- “\”
- “{” and “}”
- Non-printable ASCII characters (128–255 decimal)
- “^”
- “%”
- “`”
- “[” and “]”
- Quotation marks
- “>” and “<”
- “~”
- “#”
- “|”
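A quick scan can flag these characters before a key is used. This is a sketch under stated assumptions: the backslash is included in the set following the AWS naming guidelines, the 128–255 range is interpreted as Unicode code points, and the name `chars_to_avoid` is hypothetical.

```python
# Backslash, braces, caret, percent, grave accent, square brackets,
# double quotation mark, angle brackets, tilde, pound sign, vertical bar.
AVOID_CHARS = set('\\{}^%`[]"<>~#|')


def chars_to_avoid(key: str) -> list:
    """Return the distinct avoid-list characters found in `key`, in order of appearance."""
    found = []
    for ch in key:
        # Code points 128-255 stand in for the "128-255 decimal" range (assumption).
        if (ch in AVOID_CHARS or 128 <= ord(ch) <= 255) and ch not in found:
            found.append(ch)
    return found


print(chars_to_avoid("logs/{app}#1.txt"))
# ['{', '}', '#']
print(chars_to_avoid("clips/2020/party/clip1.wmv"))
# []
```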
What are the types of S3 Object Metadata?
System-Defined:
Every object in a bucket has a set of system metadata that S3 maintains and processes.
System metadata falls into 2 categories:
- System-controlled metadata: for example, the object creation date, which only Amazon S3 can update.
- User-controlled system metadata: for example, the storage class configured for an object, or whether server-side encryption is enabled. You control these values.
When you create an object, you can configure the values of these system metadata items and update them later when necessary.
The following table lists the system-defined metadata and whether you can modify each value:
Name | Description | Can the Value be Modified? |
---|---|---|
Date | Current date and time. | No |
Content-Length | Size of the object in bytes. | No |
Last-Modified | Date the object was created or last modified, whichever is later. | No |
Content-MD5 | Base64-encoded 128-bit MD5 digest of the object. | No |
x-amz-server-side-encryption | Whether server-side encryption is enabled for the object, and whether it uses AWS Key Management Service (KMS) or S3-managed encryption. | Yes |
x-amz-version-id | Object version. When versioning is enabled on a bucket, S3 assigns a version number to each object added to the bucket. | No |
x-amz-delete-marker | In a versioning-enabled bucket, a Boolean marker that indicates whether the object is a delete marker. | No |
x-amz-storage-class | Storage class used for storing the object. | Yes |
x-amz-website-redirect-location | Redirects requests for the object to another object in the same bucket or to an external URL. | Yes |
x-amz-server-side-encryption-aws-kms-key-id | When x-amz-server-side-encryption is present with the value aws:kms, the ID of the KMS symmetric CMK used for the object. | Yes |
x-amz-server-side-encryption-customer-algorithm | Whether server-side encryption with customer-provided encryption keys (SSE-C) is enabled. | Yes |
User-Defined Values:
- Metadata can be assigned to an object when you upload it.
- It is supplied as optional name-value pairs when sending a PUT or POST request to create the object.
- User-defined metadata names set through the REST API must start with “x-amz-meta-” to distinguish them from other HTTP headers.
- When you retrieve the object through the REST API, the prefix (x-amz-meta-) is returned. The prefix is not needed when uploading through the SOAP API.
- Retrieving the object through the SOAP API removes the prefix, regardless of which API was used to upload it.
- SOAP support over HTTP is deprecated, but SOAP is still available over HTTPS.
- New S3 features are not supported for SOAP, so use either the REST API or the AWS SDKs instead.
- When metadata is retrieved through the REST API, headers with the same name are combined into a comma-delimited list.
- Metadata that contains unprintable characters is not returned. Instead, the x-amz-missing-meta header is returned with the number of unprintable metadata entries.
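The role of the “x-amz-meta-” prefix can be illustrated by separating user-defined metadata from the other headers of a REST response. This is a minimal sketch: the function name `split_user_metadata` is hypothetical, and the header values below are made-up sample data rather than a real response.

```python
META_PREFIX = "x-amz-meta-"


def split_user_metadata(headers: dict) -> tuple:
    """Separate user-defined metadata (prefix stripped) from the remaining headers."""
    user, other = {}, {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered.startswith(META_PREFIX):
            user[lowered[len(META_PREFIX):]] = value
        else:
            other[lowered] = value
    return user, other


# Sample headers for illustration only.
headers = {
    "Content-Length": "1024",
    "x-amz-version-id": "example-version-id",
    "x-amz-meta-department": "finance",
    "X-Amz-Meta-Reviewed": "yes",
}
user, other = split_user_metadata(headers)
print(user)
# {'department': 'finance', 'reviewed': 'yes'}
```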
User-defined metadata:
- It is a set of key-value pairs.
- Amazon S3 stores the keys in lowercase.
- Key-value pairs must conform to US-ASCII when using REST, and to UTF-8 when using SOAP or browser-based uploads through POST.
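These rules can be applied on the client side before sending a REST request. The sketch below lowercases the keys, mirroring how S3 stores them, and rejects any pair that is not US-ASCII. The name `normalize_rest_metadata` is hypothetical, and raising `ValueError` is a design choice of this sketch, not S3 behavior.

```python
def normalize_rest_metadata(metadata: dict) -> dict:
    """Lowercase keys and reject pairs that are not US-ASCII (the REST constraint)."""
    normalized = {}
    for key, value in metadata.items():
        if not (key.isascii() and value.isascii()):
            raise ValueError(f"metadata pair {key!r} is not US-ASCII")
        normalized[key.lower()] = value
    return normalized


print(normalize_rest_metadata({"Author": "Ana", "Project-ID": "42"}))
# {'author': 'Ana', 'project-id': '42'}
```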
PUT request header:
- The header is limited to a maximum of 8 KB in size.
- Within it, the user-defined metadata is limited to a maximum of 2 KB in size.
- The size of user-defined metadata is the sum of the number of bytes in the UTF-8 encoding of each key and value.
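The size rule translates directly into a short calculation. This sketch assumes that 2 KB means 2,048 bytes and counts the keys as given, without the “x-amz-meta-” prefix; both are assumptions, and the function names are hypothetical.

```python
MAX_USER_METADATA_BYTES = 2 * 1024  # assumes 2 KB means 2,048 bytes


def user_metadata_size(metadata: dict) -> int:
    """Sum the UTF-8 byte lengths of every key and value."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in metadata.items()
    )


def fits_metadata_limit(metadata: dict) -> bool:
    """Return True if the metadata stays within the assumed 2 KB limit."""
    return user_metadata_size(metadata) <= MAX_USER_METADATA_BYTES


# "caf\u00e9" is 4 characters but 5 bytes in UTF-8, so bytes and characters differ.
sample = {"author": "ana", "title": "caf\u00e9"}
print(user_metadata_size(sample))
# 19
print(fits_metadata_limit(sample))
# True
```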