Learning Apache Cassandra(Second Edition)
上QQ阅读APP看书,第一时间看更新

Composite partition key table

So far, we have only worked with tables that have a single column partition key. Let's try working with multiple columns in the partition key. For instance, in our status application, we might want to allow users to reply to other users' status updates. In this case, each status update would have a stream of replies; replies would be partitioned by the full primary key of the original status update, and each reply would get its own timestamped UUID:

CREATE TABLE "status_update_replies" ( 
"status_update_username" text,
"status_update_id" timeuuid,
"id" timeuuid,
"author_username" text,
"body" text,
PRIMARY KEY (
("status_update_username", "status_update_id"),
"id"
)
);

Note the extra set of parentheses around the status_update_username and status_update_id columns in the PRIMARY KEY declaration. This tells Cassandra that we want those two columns together to form the partition key. Without the extra parentheses, Cassandra assumes, by default, that only the first column in the primary key is a partition key, and the remaining columns are clustering columns.

Let's try inserting some data into the table:

INSERT INTO "status_update_replies" ("status_update_username",  
"status_update_id", "id", "author_username", "body")
VALUES ('alice', 76e7a4d0-e796-11e3-90ce-5f98e903bf02, NOW(), 'bob',
'Reply1');

INSERT INTO "status_update_replies" ("status_update_username",
"status_update_id", "id", "author_username", "body")
VALUES ('alice', 97719c50-e797-11e3-90ce-5f98e903bf02, NOW(), 'bob',
'Reply2');

Now, let's try fetching the replies first by just the status_update_username column and second by the status_update_username and status_update_id columns:

SELECT * FROM "status_update_replies" 
WHERE "status_update_username" = 'alice';

The response is as follows:

SELECT * FROM "status_update_replies" 
WHERE "status_update_username" = 'alice' AND
"status_update_id" = 97719c50-e797-11e3-90ce-5f98e903bf02;

You will notice that the first query fails because you failed to specify the equality relation for both parts of the partition key in the where clause. Since you are trying to fetch all the replies associated with the status updates made by the username alice and this could be spread across multiple partitions, Cassandra doesn't support it as it will have a poor performance.