Skip to content

Instantly share code, notes, and snippets.

@projected1
Last active January 30, 2025 13:37
Show Gist options
  • Save projected1/4665f8a67fabaddd0ca4fc112efae741 to your computer and use it in GitHub Desktop.
Save projected1/4665f8a67fabaddd0ca4fc112efae741 to your computer and use it in GitHub Desktop.
DynamoDB reference.

DynamoDB

Table of Contents

AWS Documentation

  • Reference
  • Terminology:
    SQL NoSQL
    Table Collection
    Record Item
    Column Property
  • Billing: The AWS Free Tier includes 25 WCUs and 25 RCUs (reducing your monthly bill by 14.04).
  • CPU optimized (CPU is expensive) - not storage optimized (storage is cheap)
  • There are no joins because joins are expensive and thus relational databases don't scale well
  • Supports nested attributes up to 32 levels deep
  • Data types:
    • Scalars: Number (N), String (S), Boolean (BOOL), null (NULL), Binary (B)
    • Binary = Buffer, File, Blob, ArrayBuffer, DataView, and JavaScript typed arrays
    • Documents: Array (L), Object (M)
    • Sets: String Set (SS), Number Set (NS), Binary Set (BS)
  • Key types:
    • Partition key, aka "hash key" - used do distribute data across DynamoDB partitions
    • Sort key, aka "range key":
      • Gathers related information together where it can be queried efficiently using range queries (see bellow)
      • Lets you define hierarchical (one-to-many) relationships. e.g. [country]#[region]#[state]#[county]#[city]
  • Consistency
    • Eventual consistency - read data from any replica partition
    • Strong consistency - read data only from the master partition (x2 as expensive, and may be throttled)
  • Secondary index:
    • Global (GSI):
      • Has partition key and sort key different from those on the table
      • GSIs are eventually consistent
    • Local (LSI):
      • Has the same partition key as the table, but a different sort key
      • LSIs are strongly consistent
      • Created when the table is created and cannot be removed
  • Limits:
    • 1,000 WCUs/sec per-partition
    • 3,000 RCUs/sec per-partition
    • 20 GSIs/table (default - can be increased)
    • 5 LSIs/table
    • 100 projected secondary index attributes
    • Transaction:
      • 25 items per-transaction
      • 4mb per-transaction
    • Max partition key length: 2048 bytes
    • Max sort key length: 1024 bytes
    • 10gb partition size
    • 400kb string size (strings are utf-8 encoded)
    • 400kb binary size
    • 400kb item size
    • 32 levels nested attributes
    • 4kb expression length
    • 300 operators or functions in an "UpdateExpression"
    • Batch:
      • 100 items/16mb retrieved in a single "BatchGetItem" operation
      • 25 "PutItem" or "DeleteItem" requests in a single "BatchWriteItem" operation
    • Query: 1mb result set. (Use "LastEvaluatedKey" from the query response to retrieve more results)
    • Scan: 1mb result set. (Use "LastEvaluatedKey" from the query response to retrieve more results)
  • Stream is the changelog of the table. Good for stored-procedures type operations and also for performing computed aggregations.
  • Stream records have a lifetime of 24 hours; after that, they are removed automatically
  • Transactions: ACID transactions - Atomicity, Consistency, Isolation, Durability
  • To fully use all the throughput capacity that is provisioned for a table, you must distribute your workload across your partition key values
  • Relationships:
    • 1:1, one-to-one or key-value
      • Model using a table or GSI with a partition key
      • User "GetItem" or "BatchGetItem" API
    • 1:N, one-to-many
      • Model using ta table or GSI with partition and sort key
      • Use "Query" API to get multiple items
    • M:N, many-to-many
      • Model using a table and inverted GSI (with partition and sort key elements switched)
      • Use "Query" API to get multiple items
  • Design patterns:
    • Hierarchical (composite) sort key: [country]#[region]#[state]#[county]#[city]
    • Sparse index: Conditionally populating the SK, i.e. not all items have a value for a specific attribute (e.g. "STATUS_WARN")
    • Tables & indexes partitioning
    • GSI overloading: Storing different types of data in the same attribute
    • GSI write sharding:
      • Add a random number to the partition key values (e.g. GUID)
        • Pro: Improves write throughput
        • Con: Difficult to read a specific item
      • Add a calculated hash suffix, based on something that you are querying on
    • Adjacency list: Representing many-to-many relationships
    • Materialized aggregations
    • Write sharding: Adding "salt" to the partition key for better data distribution
      • Random prefix/suffix:
        • _RAND(0..N) where N is the number of shards (partitions)
        • When you don't need per-item access
        • Query all shards and merge results
      • Calculated prefix/suffix:
        • _HASH(OrderStatus) % N
        • When you need per-item access
        • Calculate salt from a known value
  • Query & Scan filters:
    • BEGINS_WITH
    • BETWEEN
    • CONTAINS
    • NOT_CONTAINS
    • EQ
    • GE
    • GT
    • IN
    • LE
    • LT
    • NE
    • NULL
    • NOT_NULL
  • Partition key operators:
    • =
  • Sort key operators:
    • =
    • <
    • <=
    • >
    • >=
    • a BETWEEN b AND c
    • BEGINS_WITH(str, substr)
  • Save operations ("ReturnValues" key options):
    • NONE
    • ALL_OLD
    • UPDATED_OLD
    • ALL_NEW
    • UPDATED_NEW

The DynamoDB Book

  • A Read Capacity Unit (RCU) gives you a single strongly-consistent read per second or two eventually-consistent read per second, up to 4KB is size. A Write Capacity Unit (WCU) allows you to write a single item per second, up to 1KB in size.

  • DynamoDB items have a 400KB size limit.

  • Online Transactional Processing (OLTP) - For high speed, high velocity data access where you're operating on multiple records at once.

  • Online Analytical Processing (OLAP) - For analysis of entire data sets to get insights.

  • No text indexing for searching.

  • No geospatial indexing for location-based queries.

  • No multi-key indexes for searching within arrays.

  • Item collections are all the items in a table or secondary index that share the same partition key.

  • Strong consistency means you will get the same answer from different nodes when querying them.

  • Streams are immutable sequence of records that can be processed by multiple, independent consumers.

  • With adaptive capacity, throughput is automatically spread around your table to the items that need it.

  • Rather than requiring that all read and writes go through the primary, we can have all writes go through the primary and then have the reads shared across the secondary nodes.

  • You can opt into a strongly-consistent read by passing ConsistentRead=True in your API call.

  • Global secondary index (GSI) will only allow you to make eventually-consistent reads.

  • A single partition can have a maximum of 3,000 Read Capacity Units (RCU) or 1,000 Write Capacity Units (WCU).

  • If you have a local secondary index, a single item collection cannot be larger than 10GB.

  • If the items in a global secondary index, for a partition key exceed 10GB in total storage, they will be split across multiple partitions.

  • In a batch API request, your reads or writes can succeed or fail independently. The failure of one write won't affect the other writes in the batch.

  • With the transactional API actions, all of your reads or writes will succeed or fail together.

  • You can do operations like >=, <=, BEGINS_WITH, or BETWEEN, but you can't do contains or ends_with. This is because an item collection is ordered and stored as a B-tree.

  • Every condition on the sort key can be expressed with the BETWEEN operator.

  • The time complexity of a B-tree search is O(log n).

  • All Query and Scan operations are limited to 1MB of data in total.

  • DynamoDB operations time complexity:

    Operation Data structure Notes
    Find node for partition key Hash table Time complexity of O(1)
    Find starting value for sort key B-tree Time complexity of O(log n) *
    Read values until end of sort key match N/A Sequential read. Limited to 1MB data

    * Where n is the size of the item collection, not the entire table

  • Optional properties on individual requests:

    • ConsistentRead
    • ScanIndexForward
    • ReturnValues
    • ReturnConsumedCpacity
    • ReturnItemCollectionMetrics
  • Use the ReturnValues attribute to affect the payload that is returned to your client. This property is available on the following write-based API actions:

  • PutItem

  • UpdateItem

  • DeleteItem

  • TransactWriteItem (here, the property is referred to as ReturnValuesOnConditionCheckFailure)

  • By default, DynamoDB will not return any additional information about the item. You can use the ReturnValues attribute to change these defaults:

    • NONE - Return no attributes from the item (default).
    • ALL_OLD - Return all the attributes from the item as it looked before the operation was applied.
    • UPDATED_OLD - For any attributes updated in the operation, return the attributes before the operation was applied.
    • ALL_NEW - Return all the attributes from the item as it loos after the operation is applied.
    • UPDATED_NEW - For any attributes updated in the operation, return the attributes after the operation is applied.
  • ReturnConsumedCapacity property is an optional property that will return data about the capacity units that were used by the request. Specify ReturnConsumedCapacity=TOTAL to receive an overall summary of the capacity consumed in the operation.

  • DynamoDB is priced based on read and write capacity units, and the amount of capacity used depends on a few factors:

    • The size of the item(s) you're reading or writing.
    • For read operations, whether you're doing an eventually or strongly consistent read.
    • Whether you're making a transactional request.
  • There are five types of expressions in DynamoDB:

    • Key Condition Expression - Used in the Query API call to describe which items you want to retrieve in your query.
    • Filter Expression - Used in Query and Scan operations to describe which items should be returned to the client after finding items that match your key condition expression.
    • Projection Expression - Used in all read operations to describe which attributes you want to return on items that were read.
    • Condition Expression - Used in write operations to assert the existing condition (or non-condition) of an item before writing to it.
    • Update Expression - Used in the UpdateItem call to describe the desired updates to an existing item.
  • Key condition expression can be applied only to attributes of a primary key.

  • Filter expression can be applied to any attribute in the table. It works on an item-by-item basis. It can save you a bit of data sent over the wire, but it won't help you find data more quickly as they are applied after items are read. Filter expressions are useful for a few limited contexts:

    • Reducing response payload size. DynamoDB can return up to 1MB in a singe response.
    • Easier application filtering. If you'll retrieve some results from DynamoDB and immediately run a filter() method to throw some away, it can be easier to handle that in your API request to DynamoDB.
    • Better validation around time-to-live (TTL) expiry. When using TTL, items are generally deleted within 48 hours of their TTL expiry. To help guard against this, your could write a filter expression that removes all items that should have been expired by now.
  • Projection expression works on an attribute-by-attribute basis within an item, and can be used to remove attributes within items.

  • Condition expressions are available on every operation where you will alter an item: PutItem, UpdateItem, DeleteItem, and their batch and transactional equivalents. They can operate on any attribute on your item, not just those in primary key. Available comparison operators - >=, <=, =, BETWEEN. Available functions:

    • attribute_exists() - Used to assert that a given attribute exists.
    • attribute_not_exists() - Used to assert that a given attribute does not exist. Commonly used to prevent overwrites by using it on the partition key of the item you're writing.
    • attribute_type() - Used to assert that an attribute is of a particular type.
    • attribute_with() - Assert that an attribute value begins with a particular substring.
    • contains() - Assert that a string contains a particular substring, or that a set contains a particular value.
    • size() - Allows you to assert various properties about the size of an attribute value. For strings or binary values, it's the length of the string or number of bytes in the binary value. For lists, maps, or sets, it returns the number of elements in a set.
  • Prevent overwrites or check for uniqueness when inserting items with the PutItem API, by using the attributes_not_exists() operator in your ConditionEpression.

  • Limit in-progress items for a workflow by using a set attribute that acts as a guard state, and check for its size using ConditionExpression on every UpdateItem API call. e.g. ConditionExpression: "size(#inprogress) <= 10", UpdateExpression="Add #inprogress :id", ExpressionAttributeNames={"#inprogress": "InProgress"}, ExpressionAttributeValues={":id": <JOB_ID>}. Asserting user permissions on an item

  • Use condition expressions to assert user permissions on an item.

  • Use condition expressions to check across multiple items using transactions.

  • TransactWriteItem API allows you to use up to 10 items in a single query.

  • Update expressions require to state the changes you want to make. There are four verbs for stating these changes:

    • SET - Add or overwrite an attribute on an item. Also, add or subtract from a number attribute.
    • REMOVE - Delete an attribute from an item or delete nested properties from a list of map.
    • ADD - Add to a number attribute or insert an element into a set attribute.
    • DELETE - Remove an element from a set attribute.
  • Multiple update expressions in a single query:

    UpdateExpression=" \
      SET Name=:name, UpdatedAt=:updatedAt \
      REMOVE InProgress"
  • The first three forms of normalization:

    Form Definition Plain English
    First normal form (1NF) Each column value is atomic Don't include multiple values in a single attribute
    Second normal form (2NF) No partial dependencies All non-key attributes must depend on the entirety of the primary key
    Third normal form (3NF) No transitive dependencies All non-key attributes depend only on the primary key
  • Strategies for model one-to-many relationships in DynamoDB:

    • Denormalizing the data and storing the nested objects as a document attribute.
    • Denormalizing the data by duplicating it across multiple items.
    • Using a composite primary key.
    • Creating a secondary index.
    • Using a composite sort key to handle hierarchical data.
  • Strategies for model many-to-many relationships in DynamoDB:

    • Shallow duplication.
    • Adjacency list (inverted index) - Flip the PK and SK for our secondary index.
    • Materialized graph.
    • Normalization and multiple requests.
  • If you have highly-mutable many-to-many relationships in DynamoDB, you'll likely need to make multiple requests at read time.

  • Strategies for filtering:

    • Filtering with the partition key.
    • Filtering with the sort key.
    • Composite sort key.
    • Sparse indexes.
    • Filter expressions.
    • Client-side filtering.
  • Use sparse indexes to project a single type of entity.

  • Lexicographical sorting - Standardize your sort key in all uppercase or all lowercase values.

  • K-Sortable Unique Identifier (KSUID), is a unique identifier that is prefixed with a timestamp but also contains enough randomness to make collisions very unlikely. Node.js libraries:

    • KSUID - 1 sec precision (similar to Twitter).
    • ULID - 1 msc precision.
  • When updating an item in DynamoDB, you may not change any elements of the primary key.

  • When you are co-locating items for one-to-many or many-to-many relationships, be sure to consider the order in wheich you want the related items returned so that your parent itself is located accordingly.

  • You can do the reverse to fetch the Org item and all User items by looking for all items greater than or equal to our Org item sort key, then reading forward to pick up all the User items.

  • In order to ensure uniqueness on two or more attributes across items, use transaction with ConditionExpression.

  • In order to handle sequential Ids, use UpdateItem to increment an Id, and set the ReturnValues parameter to UPDATED_NEW; this will return the current value of the updated Id.

  • In order to limit the amount of items to return, set the Limit parameter (e.g. set it to 5, to return up to 5 items).

  • Pagination (within a single item collection):

    # Fist page. Note that "$" is lexicographically larger than "#", thus "SK < ORDER$" yield the last "ORDER#" item.
    client.query(
      TableName="Demo",
      KeyConditionExpression="#pk=:pk, #sk<:sk",
      ExpressionAttributeNames={"#pk": "PK", "#sk": "SK"},
      ExpressionAttributeValues={":pk": "USER#johndoe", ":sk": "ORDER$"},
      ScanIndexForward=False,
      Limit=5)
    
    # Consecutive pages
    client.query(
      TableName="Demo",
      KeyConditionExpression="#pk=:pk, #sk<:sk",
      ExpressionAttributeNames={"#pk": "PK", "#sk": "SK"},
      ExpressionAttributeValues={":pk": "USER#johndoe", ":sk": "ORDER#3KrZPCKs7h6L3pGz0LWKS8Tv"},
      ScanIndexForward=False,
      Limit=5)
  • Singleton items - TODO

  • Reference counts - TODO

  • When denormalizing by using a complex attribute, we need to ask two things:

    • Do we have any access patterns that fetch related entity directly by values of the related entity, outside the context of the parent?
    • Is the amount of data in the complex attribute unbounded?
  • Update a Message entity to mark it as read: UpdateExpression="SET #unread=:false, REMOVE #gsi1pk, #gsi1sk"

    • Change the Unread property to "False".
    • Remove the GSI1PK and GSI1SK attributes so that it will be removed from the sparse index.
  • Retrieve all Messages, read or unread:

    args = {...}
    if unread_only:
      args["IndexName"] = "GSI1"
    client.query(**args)
  • Patterns:

    • Arbitrary partitioning of data to prevent hot partition.
    • Singleton items.
    • Transactional operations to maintain reference counts.
    • DynamoDB Streams to react to incoming changes.
    • Sparse indexes to project a single entity type.
    • Sparse indexes to filter within an entity type.
  • We can usually model two one-to-many relationships in a single item collection by locating the parent object in the middle of the collection. The trick is that one of the access patterns needs to be accessed in ascending order, while the other needs to be accessed in descending order.

  • We can get reverse ordering when scanning the index forward by subtracting our issue number from the maximum allowable issue number (e.g. 99999999). The sort key pattern: ISSUE#OPEN#<IssueNumberDifference> e.g. For issue number 15 ISSUE#OPEN#99999984

  • Data modeling patterns (from easy to hard):

    • Adding new attribute.
    • Adding a new entity with no relationship.
    • Adding a new entity and relationship into an existing item collection.
    • Migrating existing patterns.
  • Expression examples:

    • UpdateExpression="SET #gsi1pk=:gsi1pk, #gsi1sk=:gsi1sk"
    • FilterExpression="attribute_not_exists(#status) OR #status=:status"
    • ConditionExpression="attribute_not_exists(#reactions) OR NOT contains(#reactions, :reaction)"

PartiQL

Reference

Data Types

Insert data types:

INSERT INTO TypesTable value {
  'pk': '1',
  'NumberType': 1,
  'MapType': {'entryname1': 'value', 'entryname2': 4},
  'ListType': [1, 'stringval'],
  'NumberSetType': <<1, 34, 32, 4.5>>,
  'StringSetType': <<'stringval', 'stringval2'>>
}

Update data types:

UPDATE TypesTable
SET NumberType = NumberType + 100
SET MapType.NewMapEntry = [2020, 'stringvalue', 2.4]
SET ListType = LIST_APPEND(ListType, [4, <<'string1', 'string2'>>])
SET NumberSetType = SET_ADD(NumberSetType, <<345, 48.4>>)
SET StringSetType = SET_ADD(StringSetType, <<'stringsetvalue1', 'stringsetvalue2'>>)
WHERE pk = '1'

Delete data types:

UPDATE TypesTable
SET NumberType = NumberType - 1
REMOVE ListType[1]
REMOVE MapType.NewMapEntry
SET NumberSetType = SET_DELETE(NumberSetType, <<345>>)
SET StringSetType = SET_DELETE(StringSetType, <<'stringsetvalue1'>>)
WHERE pk = '1'

Statements

Select Statement

Retrieves data from a table in Amazon DynamoDB.

Syntax:

SELECT expression [, ...]
FROM table[.index]
[WHERE condition] [ [ORDER BY key [DESC|ASC] , ...]

expression

  • A projection formed from the * wildcard or a projection list of one or more attribute names or document paths from the result set. An expression can consist of calls to Use PartiQL functions with amazon DynamoDB or fields that are modified by PartiQL arithmetic, comparison, and logical operators for DynamoDB.

Note: You must add double quotation marks to the table name and index name when querying an index.

SELECT *
FROM "TableName"."IndexName"

Note: To ensure that a SELECT statement does not result in a full table scan, the WHERE clause condition must specify a partition key. Use the equality or IN operator.

Example 1: Full table scan. Given OrderID (a partition key) and Address (a non-key attribute), the following statements WOULD NOT result in a full table scan.

SELECT *
FROM "Orders"
WHERE OrderID = 100

SELECT *
FROM "Orders"
WHERE OrderID = 100 AND Address = 'some address'

SELECT *
FROM "Orders"
WHERE OrderID IN [100, 300, 234] ORDER BY OrderID DESC

Example 2: Not a full table scan. Given Address (a non-key attribute) and Total (a non-key attribute) the following statements WILL result in a full table scan:

SELECT *
FROM "Orders"
WHERE OrderID > 1

SELECT *
FROM "Orders"
WHERE Address = 'some address'

SELECT *
FROM "Orders"
WHERE OrderID = 100 OR Address='some address'

SELECT OrderID, Total
FROM "Orders"
WHERE Total BETWEEN 500 AND 600

SELECT OrderID
FROM "Orders"
WHERE A.B.C.Dates[0] >= '01/01/23'

Update Statement

Modifies the value of one or more attributes within an item in an Amazon DynamoDB table.

Syntax:

UPDATE table
[SET | REMOVE] path [= data] [...]
WHERE condition [RETURNING returnvalues]
<returnvalues> ::= [ALL OLD | MODIFIED OLD | ALL NEW | MODIFIED NEW] *

condition

  • Must resolve to a single primary key value.

returnvalues

  • ALL OLD * - Returns all of the attributes of the item, as they appeared before the update operation.
  • MODIFIED OLD * - Returns only the updated attributes, as they appeared before the update operation.
  • ALL NEW * - Returns all of the attributes of the item, as they appear after the update operation.
  • MODIFIED NEW * - Returns only the updated attributes, as they appear after the UpdateItem operation.

Note: If the WHERE clause of the UPDATE statement does not evaluate to true for any item in the DynamoDB table, ConditionalCheckFailedException is returned.

Updates an item in the Music table by adding an attribute of type number AwardsWon and an attribute of type map AwardDetail:

UPDATE "Music"
SET AwardsWon = 1
SET AwardDetail = {'Grammys': [2020, 2018]}
WHERE Artist = 'Acme Band' AND SongTitle = 'PartiQL Rocks'

Updates an item in the Music table by appending to a list AwardDetail.Grammys:

UPDATE "Music"
SET AwardDetail.Grammys = list_append(AwardDetail.Grammys,[2016])
WHERE Artist = 'Acme Band' AND SongTitle = 'PartiQL Rocks'

Updates an item in the Music table by removing from a list AwardDetail.Grammys:

UPDATE "Music"
REMOVE AwardDetail.Grammys[2]
WHERE Artist = 'Acme Band' AND SongTitle = 'PartiQL Rocks'

Updates an item in the Music table by adding BillBoard to the map AwardDetail:

UPDATE "Music"
SET AwardDetail.BillBoard = [2020]
WHERE Artist = 'Acme Band' AND SongTitle = 'PartiQL Rocks'

Updates an item in the Music table by adding the string set attribute BandMembers:

UPDATE "Music"
SET BandMembers = <<'member1', 'member2'>>
WHERE Artist = 'Acme Band' AND SongTitle = 'PartiQL Rocks'

Updates an item in the Music table by adding newbandmember to the string set BandMembers:

UPDATE "Music"
SET BandMembers = set_add(BandMembers, <<'newbandmember'>>)
WHERE Artist = 'Acme Band' AND SongTitle = 'PartiQL Rocks'

Delete Statement

Deletes an existing item from your Amazon DynamoDB table.

Syntax:

DELETE FROM table
WHERE condition [RETURNING returnvalues]
<returnvalues>  ::= ALL OLD *

condition

  • Must resolve to a single primary key value.

returnvalues

  • ALL OLD * - Returns all of the attributes of the item, as they appeared before the update operation.

Deletes an item in the Music table:

DELETE FROM "Music"
WHERE "Artist" = 'Acme Band' AND "SongTitle" = 'PartiQL Rocks'

Insert Statement

Inserts a single item.

Syntax:

INSERT INTO table VALUE item

Note: If the DynamoDB table already has an item with the same primary key as the primary key of the item being inserted, DuplicateItemException is returned.

INSERT INTO "Music" value {'Artist': 'Acme Band', 'SongTitle': 'PartiQL Rocks'}

EXISTS Function

Returns true if the value is a non-empty collection.

Note: EXISTS can only be used in transactions write requests.

Syntax:

EXISTS (statement)

Example:

EXISTS(
    SELECT * FROM "Music"
    WHERE "Artist" = 'Acme Band' AND "SongTitle" = 'PartiQL Rocks')

BEGINS_WITH Function

Returns true if an attribute begins with a particular substring.

Syntax:

BEGINS_WITH(path, value)

Example:

SELECT * FROM "Orders" WHERE "OrderID" = 1 AND BEGINS_WITH("Address", '7834 24th')

MISSING Function

Returns true if the item does not contain the specified attribute. Only equality and inequality operators can be used with this function.

Syntax:

attributename IS | IS NOT MISSING

Example:

SELECT * FROM Music WHERE "Awards" is MISSING

ATTRIBUTE_TYPE Function

Returns true if the attribute at the specified path is of a particular data type.

Syntax:

ATTRIBUTE_TYPE(attributename, type)

Example:

SELECT * FROM "Music" WHERE ATTRIBUTE_TYPE("Artist", 'S')

CONTAINS Function

Returns true if the attribute specified by the path is one of the following:

  • A String that contains a particular substring.
  • A Set that contains a particular element within the set.

Syntax:

CONTAINS(path, substring)

Example:

SELECT * FROM "Orders" WHERE "OrderID" = 1 AND CONTAINS("Address", 'Kirkland')

SIZE Function

Returns a number representing an attribute's size in bytes.

Syntax:

SIZE(path)

Example:

SELECT * FROM "Orders" WHERE "OrderID" = 1 AND SIZE("Image") > 300

Arithmetic Operators

Operator Description
+ Add
- Subtract

Comparison Operators

Operator Description
= Equal to
<> Not Equal to
> Greater than
< Less than
>= Greater than or equal to
<= Less than or equal to

Logical Operators

Operator Description
AND TRUE if all the conditions separated by AND are TRUE
BETWEEN TRUE if the operand is within the range of comparisons
IN TRUE if the operand is equal to one of a list of expressions *
IS TRUE if the operand is a given, PartiQL data type, including NULL or MISSING
NOT Reverses the value of a given Boolean expression
OR TRUE if any of the conditions separated by OR are TRUE

* At max 50 hash attribute values or at max 100 non-key attribute values.

Full Table Scans

PartiQL allows to structure queries that might result in unwanted full table scan operations. In the following example, where Artist attribute is the partition key, PartiQL allows to call BEGINS_WITH function on the partition key. This operation is not allowed by DynamoDB on a partition key, however PartiQL translates it into a full table scan followed by a filter of the results.

SELECT * FROM "Music" WHERE BEGINS_WITH(Artist, 'Acme')`

In order to prevent PartiQL from executing full table scans, we can use the following policy:

DenyPartiQLScanPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: deny-pql-table-scan
    PolicyDocument:
      Statement:
        - Effect: Deny
          Action: dynamodb:PartiQLSelect
          Resource: !GetAtt MyTable.Arn
          Condition:
            Bool:
              dynamodb:FullTableScan: true
      Roles:
        - !Ref MyFunctionRole

If we don't use PartiQL, but still want to prevent full table scans, we should use the following policy instead:

DenyDynamoDBScanPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: deny-ddb-table-scan
    PolicyDocument:
      Statement:
        - Effect: Deny
          Action: dynamodb:Scan
          Resource: !GetAtt MyTable.Arn
      Roles:
        - !Ref MyFunctionRole

Or, combine both to support both PartiQL and native DynamoDB queries:

DenyPartiQLScanPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: deny-table-scan
    PolicyDocument:
      Statement:
        - Effect: Deny
          Action: dynamodb:PartiQLSelect
          Resource: !GetAtt MyTable.Arn
          Condition:
            Bool:
              dynamodb:FullTableScan: true
        - Effect: Deny
          Action: dynamodb:Scan
          Resource: !GetAtt MyTable.Arn
      Roles:
        - !Ref MyFunctionRole
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment