In this article we are going to discuss about indexing and query optimization using output of explain plan in MongoDB.
First check if any indexes is already present, we can default index in _id column is already present, we will create new indexes, and then again we will check the count of indexes.
- explain() is important in debugging or optimizing queries. When database receives query, it must plan out how to execute it; this is called query plan,
- EXPLAIN describes query paths and allows developers to diagnose slow operations by determining which indexes query has used
- MongoDB has its own version of EXPLAIN that provides the same service. We will try to understand this in details
- Apply this the to, the query we have executed in previous article (MongoDB: Creating large document)
i.e. finding documents with num
value greater than 1991 from techno_col collection, below is the output
>
db.techno_col.find({num: {"$gt" :1991}});
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd3c"), "num" : 1992 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd3d"), "num" : 1993 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd3e"), "num" : 1994 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd3f"), "num" : 1995 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd40"), "num" : 1996 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd41"), "num" : 1997 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd42"), "num" : 1998 }
{ "_id" :
ObjectId("5c5d66a510f3108c3c83dd43"), "num" : 1999 }
>
Now get the explain plan for
this query
>
db.techno_col.find({num: {"$gt"
:1991}}).explain("executionStats");
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" :
"techno_db.techno_col",
"indexFilterSet" : false,
"parsedQuery" : {
"num" : {
"$gt" : 1991
}
},
"winningPlan" : {
"stage" :
"COLLSCAN",
"filter" : {
"num" : {
"$gt" : 1991
}
},
"direction" :
"forward"
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 8,
"executionTimeMillis" : 2,
"totalKeysExamined" : 0,
"totalDocsExamined" : 2000,
"executionStages" : {
"stage" :
"COLLSCAN",
"filter" : {
"num" : {
"$gt" : 1991
}
},
"nReturned" :
8,
"executionTimeMillisEstimate" : 0,
"works" :
2002,
"advanced" :
8,
"needTime" :
1993,
"needYield" :
0,
"saveState" :
15,
"restoreState" : 15,
"isEOF" : 1,
"invalidates"
: 0,
"direction" :
"forward",
"docsExamined" : 2000
}
},
"serverInfo" : {
"host" : "test_mongo_srvr",
"port" : 27017,
"version" : "4.0.4",
"gitVersion" : " nogitversion"
},
"ok" : 1,
"operationTime" : Timestamp(1549631830, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1549631830, 1),
"signature" : {
"hash" :
BinData(0,"hKf4Ck7scgSFsRaM8qv8+hoJm4g="),
"keyId" :
NumberLong("6636938165653340162")
}
}
}
>
We will try to understand the
terms used in the output and understand the explain
we can see query engine has to
scan the entire collection, all 2000 documents (docsExamined), to return only 8
rows as output (nReturned) of the query.
The value of totalKeysExamined
field shows the number of indexes entries scanned, which is zero, these all stats
show that this is not an efficient query
This collection techno_col
needs an index to be created in order to make query faster and more efficient. you
can use CreateIndex() method to create an index on the num key
>
db.techno_col.getIndexes();
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "techno_db.techno_col"
}
]
>
>
db.techno_col.createIndex({num : 1});
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
>
>
db.techno_col.getIndexes();
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "techno_db.techno_col"
},
{
"v" : 2,
"key" : {
"num" : 1
},
"name" : "num_1",
"ns" : "techno_db.techno_col"
}
]
>
The collection techno_col has
now two indexes one is the standard _id_ index that is automatically built for every
collection. The second one is we created on num. If we do not provide a name, MongoDB
automatically assigns one.
Now we will try to run
explain() on the same query as we did earlier, we will be surprised to see the
difference.
> db.techno_col.find({num:
{"$gt" :1991}}).explain("executionStats");
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" :
"techno_db.techno_col",
"indexFilterSet" : false,
"parsedQuery" : {
"num" : {
"$gt" : 1991
}
},
"winningPlan" : {
"stage" :
"FETCH",
"inputStage"
: {
"stage" : "IXSCAN",
"keyPattern" : {
"num" : 1
},
"indexName" : "num_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"num" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"num" : [
"(1991.0,
inf.0]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 8,
"executionTimeMillis" : 1,
"totalKeysExamined" : 8,
"totalDocsExamined" : 8,
"executionStages" : {
"stage" :
"FETCH",
"nReturned" :
8,
"executionTimeMillisEstimate" : 0,
"works" : 9,
"advanced" :
8,
"needTime" :
0,
"needYield" :
0,
"saveState" :
0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates"
: 0,
"docsExamined" : 8,
"alreadyHasObj" : 0,
"inputStage"
: {
"stage" : "IXSCAN",
"nReturned" : 8,
"executionTimeMillisEstimate" : 0,
"works" : 9,
"advanced" : 8,
"needTime" : 0,
"needYield" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 1,
"invalidates" : 0,
"keyPattern" : {
"num" : 1
},
"indexName" : "num_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"num" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"num" : [
"(1991.0,
inf.0]"
]
},
"keysExamined" : 8,
"seeks" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0
}
}
},
"serverInfo" : {
"host" : " test_mongo_srvr ",
"port" : 27017,
"version" : "4.0.4",
"gitVersion" : "nogitversion"
},
"ok" : 1
}
>
from the above renewed explain plan, we can observe below values for different parameters
docsExamined - 8
nReturned - 8
totalKeysExamined - 8
With these stats, we can say the query
has been executed very quickly and efficiently. But we have to remember, that indexes are not free they consume some space from storage.
In this way we have learned,
to generate query explain plan and create indexes and their use in real-time.
0 comments:
Post a Comment