
Indexing is a must-have part of performance optimization from the database side. However, sometimes, many of us create indexes that might not be best and could have other side effects instead. From the last few days of my work, I did get a chance to dig a little deeper into this area and found that there are some common techniques which we should all remember while creating/using MySQL indexing. I am still exploring this area. So, this tutorial has a good chance of being updated with a few more tips in the near future.
Also, Though I am sharing this as part of MySQL database, most of these should apply to all major RDBMS. OK, now, here you go:
1. Why Not Index All Columns?
The answer to this question is probably known by many who know indexing well. Still, I am including this here just for the record so that, if any of the readers aren’t aware of this yet, such as beginner-level developers.
There is already an interesting thread on stack-overflow, which answers it very well. I want to summarize the answers in two major points. We shouldn’t index every column because of the following:
- Memory Space: Indexing takes up memory space, and indexing without any need will harm your disk/memory and the overall performance throughput of your DB server.
- Slower CRUD Operations: If you create an index for any kind of create/update/delete command, the index needs to be updated for the new entry, and thus, these operations incur overhead. The more you create an index, the more such overhead increases. For many columns-based tables, this could become a major problem.
2. Multiple Index Or MultiColumn Index?
Developers may have a misconception about how to create an effecting index. Many think that we can just create each index based on the columns used in the query. But that’s not true. But, in reality, the nature of the query should originally direct to the proper way of construction. This is the original rule of thumb:
- If the columns in ‘Where’ each condition are constructed using ‘AND’ conditions, it’s better to create a multi-column composite index.
- If the columns in ‘Where’ each condition are constructed using ‘OR’ conditions, it’s better to create multiple indexes based on each column.
3. Columns Ordering In Composite Indexes?
Similar to the above situation, many people have misconceptions about indexes. They think we can just create an index by selecting the columns in question in any order. But that’s not true either. MySQL optimizer may discard/not select those indexes that aren’t in the same order as in the ‘WHERE’ condition. So, it is very important to verify whether your index is being used or not.
Also, as a best practice, you should try to put higher cardinality columns first(Cardinality = number of different values a column can have). This will minimize the number of entries that need to be searched by later columns.
4. How To Know That MySQL Will Use An Index?
You can learn this by running an EXPLAIN command. If you don’t know about the ‘EXPLAIN’ command, it is very useful to check the detailed plan for how MySQL is going to execute a query. Whether indexes are to be used by the MySQL optimizer in a query, you can take action accordingly. An example of such SQL can be as below:
[sql] EXPLAIN EXTENDED {your original query} [/sql]
This will in turn, return a table with detailed info that MySQL optimizer performs, as in the below screenshot:
5. Is the table column nullable or not nullable?
Though it seems silly, if you think deeper, you will know that, for the nullable column index, with every checking during the search, a null check also occurs. In a broader sense, this is also a performance killer. So, if you have a nullable column for no reason, you had better change it to a ‘not null’ type.
6. When To Use Covering Index:
If only limited specific columns are to be retrieved in the ‘SELECT’ query, then it’s best to use a covering index, as it will boost your query performance. If you have condition columns (and group columns if they exist) and selectable columns are specific, then you can try creating a composite index on all those columns together. This will facilitate having all necessary data in memory instead of disk, thus making your query blazing fast, as MySQL won’t look up the original table but will serve all data from its index. Here, you still have to consider the order of the columns. Place columns in same order for where conditions columns first, then group by columns(if any) and then columns on select section.
7. MySQL Indexing is not always necessary:
That’s right. You heard it correctly. MySQL indexing might have shown you an incredible performance gain, but you can’t become excited to use it everywhere. In some cases, you might need to look up very few records compared to creating an index that will consume a large amount of space/memory, resulting in overall performance degradation. So, experiment before using an index and make sure you are having a noticeable performance gain; otherwise, just don’t.
8. Force To Use Index In Query
Do you know if you can suggest that the MySQL optimizer use a certain index you have in mind for it? Yes, besides suggestions, you can even force it to use an index. But these aren’t needed in the usual regular cases we use on a day-to-day basis. But, if you created an index but MySQL optimizer isn’t using that index for some reason, you can use this command and observe the performance yourself. However, there is a good chance that the performance could be worse. An example of such an SQL command is as follows:
//suggest mysql to use an index
SELECT a, b, c FROM my_table USE INDEX(index_name) WHERE a = {value};
//force mysql to use an index
SELECT a, b, c FROM my_table FORCE INDEX(index_name) WHERE a = {value};
Code language: PHP (php)
9. Finally, Experiment To Find The Best Tradeoff For Your Need:
Yes, it’s very important when you are dealing with a complex JOIN query. Just creating indexes on each table may not work as straightforwardly. So, based on data in the tables, find which table is most important for index-based searching. Then, design those indexes in that way. Because, in JOIN query, all indexes from different tables may not be used. So, try, experiment, evaluate performance, change index construction/query construction, experiment again, that would be the way to get best possible performance.
Additional Resources:
A few weeks earlier, I got a chance to attend a local meetup group’s event about MySQL indexing in my current city, Montreal. The presenter did an excellent job explaining MySQL indexing from A to Z. You can read the slides on SlideShare, which might help you get a brief overview of this area. Happy MySQL Indexing!
Discover more from CODESAMPLEZ.COM
Subscribe to get the latest posts sent to your email.
“MySQL optimizer may discard/not select those indexes which aren’t in same order as in ‘WHERE’ condition.” This is not true. Also, “you should try to put higher cardinality columns first” this isn’t true either – it doesn’t matter. If you have INDEX(a,b,c) and WHERE a=1 AND b=2 the index-prefix is used but that shouldn’t be confused with order of columns in an index.
I said there ‘MAY BE’, based on my own experience. I have experienced issue without with ordering, could be because of large complex join queries. Also, it is clearly mentioned on mysql multi-column indexing documentation that if we have an index(a,b,c), trying to use condition in “b =” AND c=”” won’t work, where “a=” AND b= ”” will work.
About your second complain, May be you don’t know about this yet then, read this stackoverflow thread. Also see the last section on this article, which I updated recently with link to a presentation I got chance to attend. See slide 12. Both evidence clearly support this concept. It makes sense as total amount of search is reducing down.
That’s exactly what I mentioned – thus my “index-prefix” or “leftmost” wording. index(a,b,c) will work for (a), (a,b) and (a,b,c) to be used in WHERE clause. You should tweak this sentence “MySQL optimizer may discard/not select those indexes which aren’t in same order as in ‘WHERE’ condition.” because there’s no such thing as “in same order as in WHERE condition” – “WHERE a=1 AND b=2” and “WHERE b=2 AND a=1” are identical.
Also, however there’s a chance that high-cardinality-column-first strategy brings better selectivity, it’s more important to consider if low cardinality columns should be in index in the first place. In general, you shouldn’t index too low cardinality columns like (male, female) unless you’re doing covering index, and column ordering in index is negligible compared to impact of to-index-or-not decision.
There’s one very important thing that column ordering in index matters – ORDER BY optimization. If you have index(a,b,c) and a query WHERE a=1 AND b=2 ORDER BY c, it uses index to scan in this order, but if the query was WHERE a=1 AND b=2 ORDER BY a, it cannot use index for sorting and end up with filesort. If I were to mention anything about ordering of compound-index ordering, this is it.
It’s also worth noting that primary keys are always included in indexes on InnoDB – index(a) is actually index(a,pk), so WHERE a=1 ORDER BY pk is always fast, and SELECT pk FROM tbl WHERE a=1 is a covering index query.