Indexing is a must to have part when it comes to the area of performance optimization from database side. However, sometimes may of us create indexes that might not be best and could have other side effects instead. From last few days of my work, I did get chance to digg a little deeper into this area and found that, there are some common techniques which we should all remember while creating/using MySQL indexing. I am still exploring this area. So, this tutorial has good chance to get updated with few more tips in near future.
Also, Though I am sharing this as part of MySQL database, most of these should apply to all major RDBMS. OK, now, here you go:
1. Why Not Index All Columns?
The answer to this question is probably known by many who knows indexing well. Still I am including this here just for the record so that, if any of the readers doesn’t aware of this yet such as beginner level developers.
There is already an interesting thread on stack-overflow which answers it very well. I just want to summarize the answers in two major points. We shouldn’t index every column, because of:
- Memory Space: Indexing take up memory space, thus indexing without any need will harm to your disk/memory thus overall DB server’s performance throughput.
- Slower CRUD Operations: If you create an index, for any kind of create/update/delete commands, index need to be updated for the new entry and thus these operations gets an overhead. The more you create index, the more such overhead increases. For many columns based table, it could become a major problem.
2. Multiple Index Or MultiColumn Index?
Developers may have a misconception about how to create an effecting index. Many thinks that we can just create each indexes based on the columns used in query. But that’s not true. But, in reality, nature of query should originally direct to the proper way of construction. this is the original rule of thumb:
- If the columns in ‘Where’ each condition are constructed using ‘AND’ conditions, it’s better to create multi-column composite index.
- If the columns in ‘Where’ each condition are constructed using ‘OR’ conditions, it’s better to create multiple index based on each column.
3. Columns Ordering In Composite Indexes?
Similar to the above situation, many people have misconception about index that, we can just create an index by selecting the columns in question in any order. But that’s not true either. MySQL optimizer may discard/not select those indexes which aren’t in same order as in ‘WHERE’ condition. So, it is very much important to verify whether your index is being used or not.
Also, as a best practice, you should try to put higher cardinality columns first(Cardinality = number of different values a column can have). Because that will minimize the amount of entries need to be searched by later columns.
4. How To Know That MySQL Will Use A Index?
You can know so by running an EXPLAIN command. In case you don’t know about ‘EXPLAIN’ command, it is very much useful to check the detailed plan about how MySQL is going to execute a query. Whether indexes are to be used by the MySQL optimizer in a query. Then you can take action accordingly. An example of such SQL can be as below:
EXPLAIN EXTENDED {your original query}
Which will in turn return a table with details info that mysql optimizer performs as like the below screenshot:
5. Table Column Nullable Or Not Nullable?
Though seems silly, but if you think deeper, you will know that, for nullable column index, with every checking during search, null check also occurs. As a broader sense, this is also a performance killer. So, if you have a nullable column for no reason, you better change it to ‘not null’ type.
6. When To Use covering Index:
If only limited specific columns to be retrieved in ‘SELECT’ query, then its best to use covering index as it will boost your query performance. If your where condition columns (and group columns by if exist) and selectable columns are specific, then you can try by creating a composite index on all those columns together. Which will facilitate to have all necessary data to be in memory instead of disk, thus making your query blazing fast as MySQL won’t look up the original table, rather will serve all data from its index. Here, you still have to consider the orders of the columns. Place columns in same order for where conditions columns first, then group by columns(if any) and then columns on select section.
7. MySQL Indexing is not necessary always:
That’s right. You heard it correct. in a situation, MySQL indexing might have shown you an incredible performance gain, but you can’t become excited to use it everywhere. In some cases, there might be need to look up very few records compared to creating an index that will consume a large amount of space/memory, resulting overall performance degradation. So, experiment before using an index and make sure you are having noticeable performance gain, otherwise, just don’t.
8. Force To Use Index In Query
Do you know you can suggest the MySQL optimizer to use a certain index you have in mind for it? Yes, besides suggestion, you can even force it to use an index. But, these aren’t needed in usual regular cases we use day-to-day basis. But, in case you created an index, but MySQL optimizer isn’t using that index for some reason, you can use this command and observe the performance by yourself. Though, there is a good chance that, the performance could be worse. Example of such SQL command is as below:
//suggest mysql to use an index SELECT a, b, c FROM my_table USE INDEX(index_name) WHERE a = {value}; //force mysql to use an index SELECT a, b, c FROM my_table FORCE INDEX(index_name) WHERE a = {value};
9. Finally, Experiment To Find The Best Tradeoff For Your Need:
Yes, it’s very important when you are dealing with complex JOIN query. Just creating indexes on each table may just don’t work straight forward. So, based on data in the tables, find which table is most important for index based searching. Then design those index in that way. Because, in JOIN query, all indexes from different tables may not be used. So, try, experiment, evaluate performance, change index construction/query construction, experiment again, that would be the way to get best possible performance.
Additional Resources:
Few weeks earlier, I did got a chance to attend an local meetup group’s event about mysql indexing, in my current city, Montreal. The presenter did an excellent job in explaining MySQL indexing from A to Z. You can read the slides on slide-share, which might help you get a brief overview on this area. Happy MySQL Indexing!
Discover more from CodeSamplez.com
Subscribe to get the latest posts sent to your email.
kenn says
“MySQL optimizer may discard/not select those indexes which aren’t in same order as in ‘WHERE’ condition.” This is not true. Also, “you should try to put higher cardinality columns first” this isn’t true either – it doesn’t matter. If you have INDEX(a,b,c) and WHERE a=1 AND b=2 the index-prefix is used but that shouldn’t be confused with order of columns in an index.
Md Ali Ahsan Rana says
I said there ‘MAY BE’, based on my own experience. I have experienced issue without with ordering, could be because of large complex join queries. Also, it is clearly mentioned on mysql multi-column indexing documentation that if we have an index(a,b,c), trying to use condition in “b =” AND c=”” won’t work, where “a=” AND b= ”” will work.
About your second complain, May be you don’t know about this yet then, read this stackoverflow thread. Also see the last section on this article, which I updated recently with link to a presentation I got chance to attend. See slide 12. Both evidence clearly support this concept. It makes sense as total amount of search is reducing down.
kenn says
That’s exactly what I mentioned – thus my “index-prefix” or “leftmost” wording. index(a,b,c) will work for (a), (a,b) and (a,b,c) to be used in WHERE clause. You should tweak this sentence “MySQL optimizer may discard/not select those indexes which aren’t in same order as in ‘WHERE’ condition.” because there’s no such thing as “in same order as in WHERE condition” – “WHERE a=1 AND b=2” and “WHERE b=2 AND a=1” are identical.
Also, however there’s a chance that high-cardinality-column-first strategy brings better selectivity, it’s more important to consider if low cardinality columns should be in index in the first place. In general, you shouldn’t index too low cardinality columns like (male, female) unless you’re doing covering index, and column ordering in index is negligible compared to impact of to-index-or-not decision.
There’s one very important thing that column ordering in index matters – ORDER BY optimization. If you have index(a,b,c) and a query WHERE a=1 AND b=2 ORDER BY c, it uses index to scan in this order, but if the query was WHERE a=1 AND b=2 ORDER BY a, it cannot use index for sorting and end up with filesort. If I were to mention anything about ordering of compound-index ordering, this is it.
It’s also worth noting that primary keys are always included in indexes on InnoDB – index(a) is actually index(a,pk), so WHERE a=1 ORDER BY pk is always fast, and SELECT pk FROM tbl WHERE a=1 is a covering index query.