Learning Apache Cassandra（Second Edition）

上QQ阅读APP看书，第一时间看更新

Paginating through results

In most situations, we will want to avoid displaying an arbitrary number of results to a user in a single response. Instead, we will display a fixed number of results, and give the user an interface to load additional pages of the same size. In an SQL database, pages are typically specified using the OFFSET keyword, but CQL does not have this capability. Instead, we'll use the natural ordering of primary keys to put a lower bound on the key values displayed on a given page.

Let's insert a couple more rows into our table to get a sense of how pagination works using the following INSERT statements:

    INSERT INTO "users"
    ("username", "email", "encrypted_password")
    VALUES (
      'carol',
      'carol@gmail.com',
      0xed3d8299b191b59b7008759a104c10af3db6e63a
    );
    
    INSERT INTO "users"
    ("username", "email", "encrypted_password")
    VALUES (
      'dave',
      'dave@gmail.com',
      0x6d1d90d92bbab0012270536f286d243729690a5b
    );

Now that we've got four users in our system, we can paginate over them. To save ourselves the trouble of adding numerous additional entries by hand, we'll just use a page size of two to demonstrate the process. We'll start by retrieving the first page as follows:

    SELECT * FROM users
    LIMIT 2;

The LIMIT part of the query simply tells Cassandra to return no more than two results:

Now that we have our first page, we want to get the second. It would be nice if we could simply ask for the next primary key in order after dave using the following SELECT statement:

    SELECT * FROM "users"
    WHERE "username" > 'dave'
    LIMIT 2;

Unfortunately, this will give an error:

The message is a bit cryptic, and we don't know what exactly a partition key is yet, but it does contain a helpful hint: we can use the token() function to do what we want. The reason that our attempted query doesn't work is that, as we noticed before, the primary keys in our users table are not stored in lexical order; Cassandra can only return rows in the order in which they are stored.

The actual ordering is determined by the token of the primary key—the way the token is calculated is opaque to us, but Cassandra lets us use the token() function to retrieve the token for a given value:

    SELECT "username", token("username")
    FROM "users";

Now we can see why the rows are returned in the order they are; they ascend by token:

Armed with this function, we can retrieve the next page of results as follows:

    SELECT * FROM "users"
    WHERE token("username") > token('dave')
    LIMIT 2;

And just as we'd hoped, the next two rows are returned as shown next:

Using this technique, we can paginate over arbitrarily large tables using multiple queries.

It bears emphasizing that retrieving large result sets from tables structured like users is a relatively expensive operation for Cassandra. In Chapter 3, Organizing Related Data, we'll begin to develop a more advanced table structure that will allow us to retrieve batches of rows very efficiently.