
TokuMX Fast Updates

John Esmet edited this page Dec 3, 2013 · 20 revisions

Fast updates

Introduction

TokuMX is built on TokuKV, the fractal tree indexing library from Tokutek. One of the unique features of TokuKV is the ability to send an "update" message to a key-value pair, which modifies the "value" without looking up the "key". TokuMX utilizes the update message feature to implement "fastupdates", a seamless optimization to the common update() path that has very positive performance implications but affects certain update and upsert semantics. The benefits and "gotchas" are described below.

Why are fast updates fast?

By sending an update message directly to the affected row without performing a query, we avoid:

  1. The CPU cycles involved in retrieving the row from the fractal tree. This means higher in-memory throughput. In common cases, this is worth a factor of 2 in throughput.
  2. The I/O required to fetch the row from storage, if it's not in memory. This means much less I/O when operating on a large collection, as well as a smaller working set. Because a regular update must perform an I/O lookup in this case and a fast update does not, the speedup can be orders of magnitude.

When can an update operation be made "fast"?

Any update operation that will not affect secondary indexes (note: a clustering secondary index is always affected by any update operation) can be made fast.
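The rule above can be sketched as a small check. This is only an illustration, not TokuMX's implementation: the helper names are hypothetical, and the index description shape ({ key: ..., clustering: ... }) is an assumption made for the example.

```javascript
// Hypothetical helper: collect every field named by an update object
// such as { $inc: { c: 1 } }.
function touchedFields(updateObj) {
  const fields = [];
  for (const op of Object.keys(updateObj)) {
    fields.push(...Object.keys(updateObj[op]));
  }
  return fields;
}

// Hypothetical helper: true if the update would change any secondary index.
// Each index is described as { key: { field: 1 }, clustering: bool } (assumed shape).
function affectsSecondaryIndex(updateObj, secondaryIndexes) {
  const fields = touchedFields(updateObj);
  return secondaryIndexes.some((index) => {
    // A clustering index stores the whole document, so any update affects it.
    if (index.clustering) return true;
    // Otherwise the index is affected when an updated field is an indexed
    // field, or a parent or child of one in dotted notation.
    return fields.some((f) =>
      Object.keys(index.key).some(
        (k) => k === f || k.startsWith(f + ".") || f.startsWith(k + ".")
      )
    );
  });
}

// The update can be made fast only when no secondary index is affected.
const canBeFast = !affectsSecondaryIndex({ $inc: { c: 1 } }, [{ key: { a: 1 } }]);
console.log(canBeFast);
```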

Fast updates sometimes have different update and upsert semantics. What does that mean?

Fast updates do not read the rows they modify. Therefore, any decision that is usually made on the spot (such as whether an update is valid, or whether an upsert is required) is not made until the message is applied, which may not happen until an unspecified time in the future (we will not describe fractal tree internals in this document). The result is that fast updates always perform upserts if necessary, whether the user requested an upsert or not. Further, if a malformed fast update is sent to a row, the result of that operation will not be known until the message is applied, which may not be until the next client reads the row. Such malformed updates are not applied, and the row is left unaffected. This may not be desirable during development stages where bad updates need to throw errors immediately. It may be acceptable in production when errors are never supposed to happen and may be dealt with asynchronously, where the benefit is performance.
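A toy model may make these semantics easier to picture. It is not how the fractal tree works internally: the ToyStore class and its methods are hypothetical, and it only handles $inc. It shows the three visible effects described above: the update is queued without a read, a missing row is upserted, and a malformed update is silently dropped when the message is finally applied.

```javascript
// Toy model only: a Map of rows plus a per-key queue of pending messages.
class ToyStore {
  constructor() {
    this.rows = new Map();    // key -> document
    this.pending = new Map(); // key -> array of unapplied update objects
  }

  // "Fast" path: enqueue the update without reading the row.
  fastUpdate(key, updateObj) {
    if (!this.pending.has(key)) this.pending.set(key, []);
    this.pending.get(key).push(updateObj);
  }

  // Reading a row forces its pending messages to be applied first.
  read(key) {
    for (const msg of this.pending.get(key) || []) this.apply(key, msg);
    this.pending.delete(key);
    return this.rows.get(key);
  }

  apply(key, msg) {
    // No row yet: the message behaves like an upsert.
    const doc = { ...(this.rows.get(key) || { _id: key }) };
    for (const [field, delta] of Object.entries(msg.$inc || {})) {
      const current = doc[field] === undefined ? 0 : doc[field];
      // Malformed update (for example $inc on a string): drop the whole
      // message and leave the stored row untouched.
      if (typeof current !== "number" || typeof delta !== "number") return;
      doc[field] = current + delta;
    }
    this.rows.set(key, doc);
  }
}

const store = new ToyStore();
store.fastUpdate(1, { $inc: { c: 1 } });    // no row with key 1 exists yet
store.rows.set(3, { _id: 3, name: "x" });
store.fastUpdate(3, { $inc: { name: 1 } }); // malformed: name is a string

console.log(store.read(1)); // the row was upserted when the message was applied
console.log(store.read(3)); // the malformed update was dropped
```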

How are fast updates exposed to the user?

Because fast updates have modified update semantics, they are not enabled by default. To enable fast updates, the server parameter "fastupdates" must be set to "true" (via command line or setParameter command). The parameter is global to a single mongod and is respected when that mongod is acting as a primary. A secondary node will perform fast updates as instructed by the primary even if fastupdates is false. Although fast updates are not the default behavior, they are transparently applied for all "eligible" operations once enabled (more on eligibility below). We believe this strikes a good point on the tradeoff curve of "no code change" and "change the semantics of an operation behind your back".
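As a sketch, enabling the parameter at runtime from the mongo shell might look like the following. It assumes TokuMX accepts the standard MongoDB setParameter command document for this parameter; the variable name is illustrative. The command-line form would presumably follow MongoDB's --setParameter fastupdates=true convention, which is also an assumption here.

```javascript
// Command document for enabling fast updates at runtime (assumed form).
var enableFastUpdates = { setParameter: 1, fastupdates: true };

// `db` only exists inside the mongo shell, so guard the call.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(enableFastUpdates));
}
```

Since the parameter is global to a single mongod, this would need to be run against each node that may act as a primary.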

Impact on replication

Oplog processing (AKA background sync)

Normal updates are logged as a full before and after image of the document. Fast updates are logged as a primary key and an update object (e.g. { $inc: { c: 1 } }). Because secondaries may have different indexes than their primary, secondaries use the same high-level code path to perform the update as the primary, which will detect when an update must go down the regular path because an index is affected. This means a secondary with different indexes than the primary may lag behind, doing the more expensive, traditional update operation.
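The difference between the two kinds of log entry, and the choice a secondary makes, can be sketched as follows. The entry shapes and field names (op, before, after, pk, update) are hypothetical and are not TokuMX's actual oplog format; choosePath is likewise an illustrative helper.

```javascript
// Hypothetical entry shapes; field names are illustrative only.
function logNormalUpdate(before, after) {
  return { op: "update", before, after };          // full images of the document
}

function logFastUpdate(pk, updateObj) {
  return { op: "fastUpdate", pk, update: updateObj }; // key plus update object
}

// Hypothetical helper: a secondary checks the update against its own indexed
// fields and falls back to the regular path if any of them is touched.
function choosePath(entry, localIndexedFields) {
  const touched = Object.values(entry.update).flatMap((spec) => Object.keys(spec));
  return touched.some((f) => localIndexedFields.includes(f)) ? "regular" : "fast";
}

const entry = logFastUpdate({ _id: 1 }, { $inc: { c: 1 } });
console.log(choosePath(entry, []));    // secondary has no index on c
console.log(choosePath(entry, ["c"])); // secondary has an extra index on c
```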

Failover + rollback

Normal updates are trivial to roll back in the case of replication failover - if a row A was replaced by row B, the rollback operation simply looks up row B and replaces it with row A. Fast updates can only be rolled back if they are "invertible", which we define as an update program that has an "agreed-upon" inverse. We will talk more about what "agreed-upon" means later. First, consider the following operations: { $set: { a: 1 } }, { $inc: { z: 5 } }.

  • For the $set operation, it is not obvious what the inverse would look like. $unset? Not quite - an $unset may unset a value that already existed, which leaves the rolled-back state different than the state of the document with no update applied.
  • For the $inc operation, it is intuitive that inc(X) is inverted by inc(-X). That is, to roll back an increment of 5 we should decrement by 5. This almost always leaves the original object in the same state as if the update was not applied - except for one case - when the original object did not have the field to be modified in the first place. If we update an object { x: 1 } with { $inc: { y: 1 } } we are left with { x: 1, y: 1 }. If we performed a "fast update" and rolled it back, we would apply the inverted operation { $inc: { y: -1 } } and be left with { x: 1, y: 0 }, which is not the same as { x: 1 }. We felt that this difference was trivial enough to be acceptable, so $inc operations have this modified rollback semantic and are therefore eligible for the "fast update" optimization! At the time of this writing, only $inc is considered "invertible" by TokuMX, and therefore is the only optimized operation.
  • In the above two examples, the implementation must make a choice about what happens when an operation rolls back. TokuMX has an "agreed-upon" rollback behavior for increments that says if the field did not exist before the increment, the rollback will leave the object in a state such that the field does exist. It does not have an "agreed-upon" behavior for rolling back a $set (or $addToSet, $push, and so on) and therefore does not utilize the fast update optimization in these cases.
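The $inc rollback case can be reproduced with a few lines of plain JavaScript. This is a minimal sketch of the arithmetic only; applyInc and invertInc are hypothetical helpers that mimic $inc on flat documents, treating a missing field as 0.

```javascript
// Hypothetical helper: apply the body of a $inc (e.g. { y: 1 }) to a copy
// of the document, treating a missing field as 0.
function applyInc(doc, incSpec) {
  const result = { ...doc };
  for (const [field, delta] of Object.entries(incSpec)) {
    result[field] = (result[field] === undefined ? 0 : result[field]) + delta;
  }
  return result;
}

// Hypothetical helper: inc(X) is inverted by inc(-X).
function invertInc(incSpec) {
  const inverse = {};
  for (const [field, delta] of Object.entries(incSpec)) inverse[field] = -delta;
  return inverse;
}

const original = { x: 1 };
const update = { y: 1 };

const updated = applyInc(original, update);              // { x: 1, y: 1 }
const rolledBack = applyInc(updated, invertInc(update)); // { x: 1, y: 0 }

console.log(rolledBack); // y remains with value 0, so this differs from { x: 1 }
```

When the field already exists, the round trip is exact: incrementing { z: 10 } by 5 and then by -5 gives back { z: 10 }.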

Impact on sharding

Fast updates may only be applied to documents when the query portion of the update specifies the full primary key. For sharded collections, the primary key is usually the shard key + _id. The only thing special about fast updates with sharding is that the implementation is responsible for detecting that an update is happening over the shard key and must be logged during a migration. This case is tested under jstests/sharding/fastupdates_during_migration.js.
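The "full primary key" requirement can be illustrated with a small check. This is a sketch under assumptions, not the server's logic: specifiesFullPrimaryKey is a hypothetical helper, and the field name shardKey stands in for whatever the collection's shard key actually is.

```javascript
// Hypothetical check: the query must give an exact value for every field
// of the primary key.
function specifiesFullPrimaryKey(query, pkFields) {
  return pkFields.every((field) => {
    const value = query[field];
    if (value === undefined) return false;
    // An operator document such as { $gt: 5 } describes a range,
    // not an exact key value.
    const isOperatorDoc =
      value !== null &&
      typeof value === "object" &&
      Object.keys(value).some((k) => k.startsWith("$"));
    return !isOperatorDoc;
  });
}

// Assumed primary key for a sharded collection: shard key + _id.
const pk = ["shardKey", "_id"];

console.log(specifiesFullPrimaryKey({ shardKey: 7, _id: 1 }, pk));           // full key given
console.log(specifiesFullPrimaryKey({ _id: 1 }, pk));                        // shard key missing
console.log(specifiesFullPrimaryKey({ shardKey: { $gt: 5 }, _id: 1 }, pk));  // range, not a key
```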

Caveats

  • Fast updates are not performed on capped collections because we do not know if the object will grow in size. Knowing the new object's size is a requirement for updates to a capped collection.