Indexing Overview

This document provides an overview of indexes in MongoDB, including index types and creation options. For operational guidelines and procedures, see the Indexing Operations document. For strategies and practical approaches, see the Indexing Strategies document.

Synopsis

An index is a data structure that allows you to quickly locate documents based on the values stored in certain specified fields. Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB supports indexes on any field or sub-field contained in documents within a MongoDB collection.

MongoDB indexes have the following core features:

  • MongoDB defines indexes on a per-collection level.

  • You can create indexes on a single field or on multiple fields using a compound index.

  • Indexes enhance query performance, often dramatically. However, each index also incurs some overhead for every write operation. Consider the queries, the frequency of these queries, the size of your working set, the insert load, and your application’s requirements as you create indexes in your MongoDB environment.

  • All MongoDB indexes use a B-tree data structure. MongoDB can use this representation of the data to optimize query responses.

  • Every query, including update operations, uses one and only one index. The query optimizer selects the index empirically by occasionally running alternate query plans and by selecting the plan with the best response time for each query type. You can override the query optimizer using the cursor.hint() method.

  • An index “covers” a query if:

    • all the fields in the query are part of that index, and
    • all the fields returned in the documents that match the query are in the same index.

    When an index covers a query, the server can both match the query conditions and return the results using only the index; MongoDB does not need to look at the documents, only the index, to fulfill the query. Querying the index can be faster than querying the documents outside of the index.

    See Create Indexes that Support Covered Queries for more information.

  • Using queries with good index coverage reduces the number of full documents that MongoDB needs to store in memory, thus maximizing database performance and throughput.

  • If an update does not change the size of a document or cause the document to outgrow its allocated area, then MongoDB will update an index only if the indexed fields have changed. This improves performance. Note that if the document has grown and must move, all index keys must then update.
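The two conditions for a covered query listed above can be sketched as a small check. This is only an illustrative model in plain JavaScript, not the logic of MongoDB's query optimizer; the function name isCovered and the sample index are hypothetical.

```javascript
// Illustrative model only: applies the two "covered query" conditions
// from the list above. This is not MongoDB's actual planner logic.
function isCovered(indexSpec, queryFields, returnedFields) {
  const indexed = new Set(Object.keys(indexSpec));
  const inIndex = (field) => indexed.has(field);
  // Condition 1: every queried field is in the index.
  // Condition 2: every returned field is in the same index.
  return queryFields.every(inIndex) && returnedFields.every(inIndex);
}

const index = { item: 1, stock: 1 }; // hypothetical index specification

// Query on item, returning only stock: both fields live in the index.
const covered = isCovered(index, ["item"], ["stock"]);

// Returning location as well forces a look at the document itself.
const notCovered = isCovered(index, ["item"], ["stock", "location"]);

console.log(covered, notCovered);
```

In the shell, the _id field counts as a returned field unless the projection excludes it, so a covered query on an index that lacks _id would also need to exclude _id from the results.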

Index Types

This section enumerates the types of indexes available in MongoDB. For all collections, MongoDB creates the default _id index. You can create additional indexes with the ensureIndex() method on any single field or sequence of fields within any document or sub-document. MongoDB also supports indexes of arrays, called multi-key indexes.

_id Index

The _id index is a unique index [1] on the _id field, and MongoDB creates this index by default on all collections. [2] You cannot delete the index on _id.

The _id field is the primary key for the collection, and every document must have a unique _id field. You may store any unique value in the _id field. The default value of _id is an ObjectId on every insert() operation. An ObjectId is a 12-byte unique identifier suitable for use as the value of an _id field.

Note

In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the uniqueness of the values in the _id field to prevent errors. This is most often done by using a standard auto-generated ObjectId.

[1] Although the index on _id is unique, the getIndexes() method will not print unique: true in the mongo shell.
[2] Before version 2.2, capped collections did not have an _id field. In 2.2, all capped collections have an _id field, except those in the local database. See the release notes for more information.

Secondary Indexes

All indexes in MongoDB are secondary indexes. You can create indexes on any field within any document or sub-document. Additionally, you can create compound indexes with multiple fields, so that a single query can match multiple components using the index while scanning fewer whole documents.

In general, you should create indexes that support your primary, common, and user-facing queries. Doing so requires MongoDB to scan the fewest number of documents possible.

In the mongo shell, you can create an index by calling the ensureIndex() method. Arguments to ensureIndex() resemble the following:

{ "field": 1 }
{ "product.quantity": 1 }
{ "product": 1, "quantity": 1 }

For each field in the index specify either 1 for an ascending order or -1 for a descending order, which represents the order of the keys in the index. For indexes with more than one key (i.e. compound indexes) the sequence of fields is important.

Indexes on Sub-documents

You can create indexes on fields that hold sub-documents as in the following example:

Example

Given the following document in the factories collection:

{ "_id": ObjectId(...), metro: { city: "New York", state: "NY" } }

You can create an index on the metro key. The following queries would then use that index, and both would return the above document:

db.factories.find( { metro: { city: "New York", state: "NY" } } );

db.factories.find( { metro: { $gte : { city: "New York" } } } );

The second query returns the document because { city: "New York" } is less than { city: "New York", state: "NY" }. MongoDB compares sub-documents in ascending key order, in the order that the keys occur in the BSON document.

Indexes on Embedded Fields

You can create indexes on fields in sub-documents, just as you can index top-level fields in documents. [3] These indexes allow you to use “dot notation” to introspect into sub-documents.

Consider a collection named people that holds documents that resemble the following example document:

{"_id": ObjectId(...),
 "name": "John Doe",
 "address": {
        "street": "Main",
        "zipcode": 53511,
        "state": "WI"
        }
}

You can create an index on the address.zipcode field, using the following specification:

db.people.ensureIndex( { "address.zipcode": 1 } )

[3] Indexes on Sub-documents, by contrast, allow you to index fields that hold documents, including the full content, up to the maximum Index Size, of the sub-document in the index.

Compound Indexes

MongoDB supports “compound indexes,” where a single index structure holds references to multiple fields within a collection’s documents. Consider a collection named products that holds documents that resemble the following document:

{
 "_id": ObjectId(...),
 "item": "Banana",
 "category": ["food", "produce", "grocery"],
 "location": "4th Street Store",
 "stock": 4,
 "type": "cases",
 "arrival": Date(...)
}

If most of your application’s queries include the item field and a significant number of queries will also check the stock field, you can specify a single compound index to support both of these queries:

db.products.ensureIndex( { "item": 1, "location": 1, "stock": 1 } )

Compound indexes support queries on any prefix of the fields in the index. [4] For example, MongoDB can use the above index to support queries that select the item field and to support queries that select the item field and the location field. The index, however, would not support queries that select the following:

  • only the location field
  • only the stock field
  • only the location and stock fields

A query that selects only the item and stock fields can use the item prefix of the index, but the index supports that query less efficiently than an index on only item and stock would.

Important

You may not create compound indexes that have hashed index fields. You will receive an error if you attempt to create a compound index that includes a hashed index.

When creating an index, the number associated with a key specifies the direction of the index. The options are 1 (ascending) and -1 (descending). Direction doesn’t matter for single key indexes or for random access retrieval but is important if you are doing sort queries on compound indexes.

The order of fields in a compound index is very important. In the previous example, the index will contain references to documents sorted first by the values of the item field and, within each value of the item field, sorted by the values of location, and then sorted by values of the stock field.

[4] Index prefixes are the beginning subset of fields. For example, given the index { a: 1, b: 1, c: 1 }, both { a: 1 } and { a: 1, b: 1 } are prefixes of the index.
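The prefix rule can be expressed as a short check. The following is a minimal sketch in plain JavaScript, assuming the index specification lists its keys in index order; the helper name isIndexPrefix is hypothetical and is not part of the mongo shell.

```javascript
// Illustrative helper: true when the queried fields are exactly the
// leading keys of the index, in any query order.
function isIndexPrefix(indexSpec, fields) {
  const wanted = new Set(fields);
  if (wanted.size === 0) return false;
  // Take as many leading index keys as there are distinct queried fields.
  const leading = Object.keys(indexSpec).slice(0, wanted.size);
  return leading.length === wanted.size && leading.every((key) => wanted.has(key));
}

const productIndex = { item: 1, location: 1, stock: 1 };

console.log(isIndexPrefix(productIndex, ["item"]));              // prefix
console.log(isIndexPrefix(productIndex, ["item", "location"])); // prefix
console.log(isIndexPrefix(productIndex, ["location"]));         // not a prefix
console.log(isIndexPrefix(productIndex, ["location", "stock"])); // not a prefix
```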

Indexes with Ascending and Descending Keys

Indexes store references to fields in either ascending or descending order. For single-field indexes, the order of keys doesn’t matter, because MongoDB can traverse the index in either direction. However, for compound indexes, if you need to order results against two fields, sometimes you need the index fields running in opposite order relative to each other.

To specify an index with a descending order, use the following form:

db.products.ensureIndex( { "field": -1 } )

More typically in the context of a compound index, the specification would resemble the following prototype:

db.products.ensureIndex( { "fieldA": 1, "fieldB": -1 } )

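Consider a collection of event data that includes both usernames and a timestamp. If you want to return a list of events sorted by username and then with the most recent events first, create the index using the following command: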

db.events.ensureIndex( { "username" : 1, "timestamp" : -1 } )
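To see the ordering that such an index describes, the following sketch sorts a few in-memory events the same way: username ascending, then timestamp descending. It is plain JavaScript with made-up sample data and numeric timestamps, and it only illustrates the key order, not how MongoDB stores the index.

```javascript
// Hypothetical sample events; timestamps are plain numbers for brevity.
const events = [
  { username: "bob", timestamp: 10 },
  { username: "alice", timestamp: 20 },
  { username: "alice", timestamp: 30 },
  { username: "bob", timestamp: 40 },
];

// Compare by username ascending (1), then by timestamp descending (-1).
function byUsernameAscTimestampDesc(a, b) {
  if (a.username !== b.username) return a.username < b.username ? -1 : 1;
  return b.timestamp - a.timestamp;
}

// Sort a copy so the original array stays untouched.
const ordered = events.slice().sort(byUsernameAscTimestampDesc);
const summary = ordered.map((e) => `${e.username}:${e.timestamp}`);

console.log(summary.join(" ")); // alice:30 alice:20 bob:40 bob:10
```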

Multikey Indexes

If you index a field that contains an array, MongoDB indexes each value in the array separately, in a “multikey index.”

Example

Given the following document:

{ "_id" : ObjectId("..."),
  "name" : "Warm Weather",
  "author" : "Steve",
  "tags" : [ "weather", "hot", "record", "april" ] }

Then an index on the tags field would be a multikey index and would include these separate entries:

{ tags: "weather" }
{ tags: "hot" }
{ tags: "record" }
{ tags: "april" }

Queries could use the multikey index to return documents that match any of the above values.
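The expansion of one array field into separate index entries can be modeled as follows. This is a plain JavaScript sketch of the idea only; multikeyEntries is a hypothetical helper, and the real index stores keys in a B-tree rather than as small documents.

```javascript
// Illustrative model: produce one entry per array element, or a single
// entry when the field holds a non-array value.
function multikeyEntries(doc, field) {
  const value = doc[field];
  const values = Array.isArray(value) ? value : [value];
  return values.map((v) => ({ [field]: v }));
}

const post = {
  name: "Warm Weather",
  author: "Steve",
  tags: ["weather", "hot", "record", "april"],
};

const entries = multikeyEntries(post, "tags");
console.log(entries.length); // 4
```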

Note

For hashed indexes, MongoDB collapses sub-documents and computes the hash for the entire value, but does not support multi-key (i.e. arrays) indexes. For fields that hold sub-documents, you cannot use the index to support queries that introspect the sub-document.

You can use multikey indexes to index fields within objects embedded in arrays, as in the following example:

Example

Consider a feedback collection with documents in the following form:

{
 "_id": ObjectId(...),
 "title": "Grocery Quality",
 "comments": [
    { author_id: ObjectId(...),
      date: Date(...),
      text: "Please expand the cheddar selection." },
    { author_id: ObjectId(...),
      date: Date(...),
      text: "Please expand the mustard selection." },
    { author_id: ObjectId(...),
      date: Date(...),
      text: "Please expand the olive selection." }
 ]
}

An index on the comments.text field would be a multikey index and would add items to the index for all of the sub-documents in the array.

With an index such as { "comments.text": 1 }, consider the following query:

db.feedback.find( { "comments.text": "Please expand the olive selection." } )

This would select the document that contains the following sub-document in the comments array:

{ author_id: ObjectId(...),
  date: Date(...),
  text: "Please expand the olive selection." }

Compound Multikey Indexes May Only Include One Array Field

While you can create multikey compound indexes, at most one field in a compound index may hold an array. For example, given an index on { a: 1, b: 1 }, the following documents are permissible:

{a: [1, 2], b: 1}

{a: 1, b: [1, 2]}

However, the following document is impermissible, and MongoDB cannot insert such a document into a collection with the {a: 1, b: 1 } index:

{a: [1, 2], b: [1, 2]}

If you attempt to insert such a document, MongoDB will reject the insertion and produce an error that says cannot index parallel arrays. MongoDB does not index parallel arrays because they require the index to include each value in the Cartesian product of the compound keys, which could quickly result in incredibly large and difficult to maintain indexes.
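The growth that the Cartesian product implies can be seen in a small model. The sketch below, in plain JavaScript, enumerates the key combinations a compound index would need for one document; compoundKeyCombinations is a hypothetical helper, and MongoDB itself rejects the parallel-array case rather than computing these entries.

```javascript
// Illustrative model: list every combination of values across the
// indexed fields, treating a non-array value as a one-element array.
function compoundKeyCombinations(doc, fields) {
  return fields.reduce((combos, field) => {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    return combos.flatMap((combo) => values.map((v) => [...combo, v]));
  }, [[]]);
}

// One array field: the number of entries equals the array length.
const oneArray = compoundKeyCombinations({ a: [1, 2], b: 1 }, ["a", "b"]);

// Two array fields: the entries multiply (2 x 2 here).
const parallel = compoundKeyCombinations({ a: [1, 2], b: [1, 2] }, ["a", "b"]);

console.log(oneArray.length, parallel.length); // 2 4
```

With two arrays of 100 elements each, the same model yields 10,000 combinations for a single document.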

Unique Indexes

A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. To create a unique index on the user_id field of the members collection, use the following operation in the mongo shell:

db.members.ensureIndex( { "user_id": 1 }, { unique: true } )

By default, unique is false on MongoDB indexes.

If you use the unique constraint on a compound index then MongoDB will enforce uniqueness on the combination of values, rather than the individual value for any or all values of the key.

If a document does not have a value for the indexed field in a unique index, the index will store a null value for this document. Because of the unique constraint, MongoDB will only permit one document that lacks the indexed field. You can combine the unique constraint with the sparse index option to filter these null values from the unique index.

You may not specify a unique constraint on a hashed index.

Sparse Indexes

Sparse indexes only contain entries for documents that have the indexed field. [5] Any document that is missing the field is not indexed. The index is “sparse” because it does not include every document in the collection.

By contrast, non-sparse indexes contain all documents in a collection, and store null values for documents that do not contain the indexed field. Create a sparse index on the xmpp_id field of the members collection using the following operation in the mongo shell:

db.members.ensureIndex( { "xmpp_id": 1 }, { sparse: true } )

By default, sparse is false on MongoDB indexes.

Warning

Using these indexes to filter or sort results can sometimes produce incomplete results, because a sparse index does not contain every document in the collection.

Note

Do not confuse sparse indexes in MongoDB with block-level indexes in other databases. Think of them as dense indexes with a specific filter.

You can combine the sparse index option with the unique indexes option so that mongod will reject documents that have duplicate values for a field, but ignore documents that do not have the key.

[5] All documents that have the indexed field are indexed in a sparse index, even if that field stores a null value in some documents.
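The interaction between the unique and sparse options can be illustrated with a toy model. The class below is a plain JavaScript sketch for scalar key values, written under the assumptions described in this section (a missing field is stored as null unless the index is sparse); ToyIndex is hypothetical and does not reflect how mongod implements indexes.

```javascript
// Toy model of a single-field index with the unique and sparse options.
class ToyIndex {
  constructor(field, { unique = false, sparse = false } = {}) {
    this.field = field;
    this.unique = unique;
    this.sparse = sparse;
    this.entries = []; // indexed key values, in insertion order
  }

  // Returns true if the document is accepted, false if it is rejected
  // as a duplicate under the unique constraint.
  insert(doc) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, this.field);
    if (!hasField && this.sparse) return true; // accepted, but not indexed
    const key = hasField ? doc[this.field] : null; // missing field stored as null
    if (this.unique && this.entries.includes(key)) return false;
    this.entries.push(key);
    return true;
  }
}

// Unique only: the second document without xmpp_id collides on null.
const uniqueOnly = new ToyIndex("xmpp_id", { unique: true });
const uniqueResults = [{ name: "a" }, { name: "b" }].map((d) => uniqueOnly.insert(d));

// Unique and sparse: documents without xmpp_id are simply skipped.
const uniqueSparse = new ToyIndex("xmpp_id", { unique: true, sparse: true });
const sparseResults = [{ name: "a" }, { name: "b" }].map((d) => uniqueSparse.insert(d));

console.log(uniqueResults, sparseResults);
```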

Hashed Index

New in version 2.4.

Hashed indexes maintain entries with hashes of the values of the indexed field. The hashing function collapses sub-documents and computes the hash for the entire value but does not support multi-key (i.e. arrays) indexes.

MongoDB can use the hashed index to support equality queries, but hashed indexes do not support range queries.

You may not create compound indexes that have hashed index fields or specify a unique constraint on a hashed index; however, you can create both a hashed index and an ascending/descending (i.e. non-hashed) index on the same field: MongoDB will use the scalar index for range queries.

Warning

Hashed indexes truncate floating point numbers to 64-bit integers before hashing. For example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, and 2.9. To prevent collisions, do not use a hashed index for floating point numbers that cannot be consistently converted to 64-bit integers (and then back to floating point). Hashed indexes do not support floating point values larger than 2^53.
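The effect of the truncation can be reproduced in plain JavaScript, whose numbers are double-precision floating point values. The sketch below uses Math.trunc as a stand-in for the truncation step (an assumption about the exact conversion, made only for illustration) and shows why 2^53 is the point beyond which doubles stop representing every integer exactly.

```javascript
// Assumption: Math.trunc stands in for the float-to-integer truncation
// described in the warning; the hashing step itself is not modeled.
const truncated = [2.3, 2.2, 2.9].map(Math.trunc);

// All three values collapse to the same integer, so they would hash alike.
const allCollide = truncated.every((v) => v === truncated[0]);

// Above 2^53, adjacent integers are no longer distinguishable as doubles.
const limit = 2 ** 53;
const indistinguishable = limit + 1 === limit;

console.log(truncated, allCollide, indistinguishable);
```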

Create a hashed index using an operation that resembles the following:

db.active.ensureIndex( { a: "hashed" } )

This operation creates a hashed index for the active collection on the a field.

[6] The hash stored in the hashed index is 64 bits of the 128 bit md5 hash.

Index Names

The default name for an index is the concatenation of the indexed keys and each key’s direction in the index (1 or -1).

Example

Issue the following command to create an index on item and quantity:

db.products.ensureIndex( { item: 1, quantity: -1 } )

The resulting index is named: item_1_quantity_-1.
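The naming rule can be written out as a short function. This is a plain JavaScript sketch of the concatenation described above; defaultIndexName is a hypothetical helper for illustration, not a shell method.

```javascript
// Join each key and its direction with underscores, in index order.
function defaultIndexName(indexSpec) {
  return Object.entries(indexSpec)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

const generatedName = defaultIndexName({ item: 1, quantity: -1 });
console.log(generatedName); // item_1_quantity_-1
```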

Optionally, you can specify a name for an index instead of using the default name.

Example

Issue the following command to create an index on item and quantity and specify inventory as the index name:

db.products.ensureIndex( { item: 1, quantity: -1 } , {name: "inventory"} )

The resulting index is named: inventory.

To view the name of an index, use the getIndexes() method.

Index Creation Options

You specify index creation options in the second argument in ensureIndex().

The options sparse, unique, and TTL affect the kind of index that MongoDB creates. This section addresses background construction and duplicate dropping, which affect how MongoDB builds the indexes.

Background Construction

By default, creating an index is a blocking operation. Building an index on a large collection of data can take a long time to complete. To resolve this issue, the background option can allow you to continue to use your mongod instance during the index build.

For example, to create an index in the background of the zipcode field of the people collection you would issue the following:

db.people.ensureIndex( { zipcode: 1}, {background: true} )

By default, background is false for building MongoDB indexes.

You can combine the background option with other options, as in the following:

db.people.ensureIndex( { zipcode: 1}, {background: true, sparse: true } )

Be aware of the following behaviors with background index construction:

  • A mongod instance can build more than one index in the background concurrently.

    Changed in version 2.4: Before 2.4, a mongod instance could only build one background index per database at a time.

    Changed in version 2.2: Before 2.2, a single mongod instance could only build one index at a time.

  • The indexing operation runs in the background so that other database operations can run while creating the index. However, the mongo shell session or connection where you are creating the index will block until the index build is complete. Open another connection or mongo instance to continue using commands to the database.

  • The background index operation uses an incremental approach that is slower than the normal “foreground” index builds. If the index is larger than the available RAM, then the incremental process can take much longer than the foreground build.

  • If your application includes ensureIndex() operations, and an index doesn’t exist for other operational concerns, building the index can have a severe impact on the performance of the database.

    Make sure that your application checks for the indexes at start up using the getIndexes() method or the equivalent method for your driver and terminates if the proper indexes do not exist. Always build indexes in production instances using separate application code, during designated maintenance windows.
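The start-up check recommended above might look like the following sketch. It is plain JavaScript that operates on an array shaped like the output of getIndexes(), where each entry carries a key specification and a name; the sample data, the required list, and the helper missingIndexes are all hypothetical.

```javascript
// Return the required index specifications that have no match among the
// existing indexes. Key order matters, so specs are compared as JSON text.
function missingIndexes(existing, required) {
  const sameSpec = (a, b) => JSON.stringify(a) === JSON.stringify(b);
  return required.filter(
    (spec) => !existing.some((index) => sameSpec(index.key, spec))
  );
}

// Sample data in the shape that getIndexes() returns (fields trimmed).
const existing = [
  { key: { _id: 1 }, name: "_id_" },
  { key: { zipcode: 1 }, name: "zipcode_1" },
];

// Indexes this hypothetical application expects to find at start up.
const required = [{ zipcode: 1 }, { username: 1, timestamp: -1 }];

const missing = missingIndexes(existing, required);
if (missing.length > 0) {
  // A real application would log this and terminate instead of building
  // the index itself.
  console.log("Missing indexes:", JSON.stringify(missing));
}
```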

Building Indexes on Secondaries

Background index operations on a replica set primary become foreground indexing operations on secondary members of the set. All indexing operations on secondaries block replication.

To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step down the primary, restart it as a standalone, and build the index on the former primary.

Remember, the amount of time required to build the index on a secondary node must be within the window of the oplog, so that the secondary can catch up with the primary.

See Build Indexes on Replica Sets for more information on this process.

Indexes on secondary members in “recovering” mode are always built in the foreground so that they can catch up as quickly as possible.

See Build Indexes on Replica Sets for a complete procedure for rebuilding indexes on secondaries.

Note

If MongoDB is building an index in the background, you cannot perform other administrative operations involving that collection, including repairDatabase, dropping the collection (i.e. db.collection.drop()), and compact. These operations will return an error during background index builds.

Queries will not use these indexes until the index build is complete.

Drop Duplicates

MongoDB cannot create a unique index on a field that has duplicate values. To force the creation of a unique index, you can specify the dropDups option, which will only index the first occurrence of a value for the key, and delete all subsequent documents that hold the same value.

Warning

As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the index with a “null” value.

If subsequent documents do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove these documents from the collection when creating the index. If you combine dropDups with the sparse option, this index will only include documents in the index that have the value, and the documents without the field will remain in the database.

To create a unique index that drops duplicates on the username field of the accounts collection, use a command in the following form:

db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } )

Warning

Specifying { dropDups: true } will delete data from your database. Use with extreme caution.

By default, dropDups is false.

Index Features

TTL Indexes

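TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time. This is ideal for some types of information, such as machine-generated event data, logs, and session information, that only need to persist in a database for a limited amount of time.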

These indexes have the following limitations:

  • Compound indexes are not supported.
  • The indexed field must be a date type.
  • If the field holds an array, and there are multiple date-typed data in the index, the document will expire when the lowest (i.e. earliest) matches the expiration threshold.

Note

TTL indexes expire data by removing documents in a background task that runs every 60 seconds. As a result, the TTL index provides no guarantees that expired documents will not exist in the collection. Consider that:

  • Documents may remain in a collection after they expire and before the background process runs.
  • The duration of the removal operations depends on the workload of your mongod instance.

In all other respects, TTL indexes are normal indexes, and if appropriate, MongoDB can use these indexes to fulfill arbitrary queries.
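The expiration rule, including the array case from the limitations above, can be sketched as a predicate. This plain JavaScript model assumes the expiration threshold is given in seconds, as with the expireAfterSeconds index option (which this section does not name); isExpired is a hypothetical helper and does not model the 60-second background task.

```javascript
// Illustrative model: a document is due for removal once its earliest
// date value plus the threshold is at or before the current time.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter(
    (v) => v instanceof Date
  );
  if (dates.length === 0) return false; // no date value, nothing to expire
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const now = new Date("2013-01-01T01:00:00Z");
const recent = { createdAt: new Date("2013-01-01T00:30:00Z") };
const mixed = {
  createdAt: [new Date("2013-01-01T00:30:00Z"), new Date("2013-01-01T00:00:00Z")],
};

// With a one-hour threshold, only the document holding the earlier date
// has reached its expiration time.
console.log(isExpired(recent, "createdAt", 3600, now)); // false
console.log(isExpired(mixed, "createdAt", 3600, now));  // true
```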

Geospatial Indexes

MongoDB provides “geospatial indexes” to support location-based and other similar queries in a two-dimensional coordinate system. For example, use geospatial indexes when you need to take a collection of documents that have coordinates, and return a number of options that are “near” a given coordinate pair.

To create a geospatial index, your documents must have a coordinate pair. For maximum compatibility, these coordinate pairs should be in the form of a two element array, such as [ x , y ]. Given a loc field that holds a coordinate pair, in the places collection, you would create a geospatial index as follows:

db.places.ensureIndex( { loc : "2d" } )

MongoDB will reject documents that have values in the loc field beyond the minimum and maximum values.

Note

MongoDB permits only one geospatial index per collection. Although MongoDB will allow clients to create multiple geospatial indexes, a single query can use only one index.

See the $near operator and the geoNear database command for more information on accessing geospatial data.

Geohaystack Indexes

In addition to conventional geospatial indexes, MongoDB also provides a bucket-based geospatial index, called “geospatial haystack indexes.” These indexes support high performance queries for locations within a small area, when the query must filter along another dimension.

Example

If you need to return all documents that have coordinates within 25 miles of a given point and have a type field value of “museum,” a haystack index would provide the best support for these queries.

Haystack indexes allow you to tune the bucket size to the distribution of your data, so that in general you search only very small regions of 2d space for a particular kind of document. These indexes are not suited for finding the closest documents to a particular location when the closest documents are far away compared to the bucket size.

text Indexes

New in version 2.4.

MongoDB provides text indexes to support the search of string content in documents of a collection. text indexes are case-insensitive and can include any field that contains string data. text indexes drop language-specific stop words (e.g. in English, “the,” “an,” “a,” “and,” etc.) and use simple language-specific suffix stemming. See Text Search Languages for the supported languages.

You can only access the text index with the text command.

See Text Search for more information.

Index Behaviors

Limitations

  • A collection may have no more than 64 indexes.

  • Index keys can be no larger than 1024 bytes.

    Documents with fields that have values greater than this size cannot be indexed.

    To query for documents that were too large to index, you can use a command similar to the following:

    db.records.find({<key>: <value too large to index>}).hint({$natural: 1})
    
  • The name of an index, including the namespace, must be shorter than 128 characters.

  • Indexes have storage requirements, and impact insert/update speed to some degree.

  • Create indexes to support queries and other operations, but do not maintain indexes that your MongoDB instance cannot or will not use.

  • For queries with the $or operator, each clause of an $or query executes in parallel, and can each use a different index.

  • For queries that use the sort() method and use the $or operator, the query cannot use the indexes on the $or fields.

  • 2d geospatial queries do not support queries that use the $or operator.

Consider Insert Throughput

If your application is write-heavy, then be careful when creating new indexes, since each additional index will impose a write-performance penalty. In general, don’t be careless about adding indexes. Add indexes to complement your queries. Always have a good reason for adding a new index, and be sure to benchmark alternative strategies.

MongoDB must update all indexes associated with a collection after every insert, update, or delete operation. For update operations, if the updated document does not move to a new location, then MongoDB only modifies the updated fields in the index. Therefore, every index on a collection adds some amount of overhead to these write operations. In almost every case, the performance gains that indexes realize for read operations are worth the insertion penalty. However, in some cases:

  • An index to support an infrequent query might incur more insert-related costs than savings in read-time.
  • If you have many related indexes on a collection that receives a high volume of write operations, you may find better overall performance with a smaller number of indexes, even if some queries are less optimally supported by an index.
  • If your indexes and queries are not sufficiently selective, the speed improvements for query operations may not offset the costs of maintaining an index. For more information see Create Queries that Ensure Selectivity.