MySQL index optimization is enough to read this article!

MySQL index optimization is enough to read this article!

It takes about 5 minutes to read this article.

Source: cnblogs.com/songwenjie/p/9410009.html

This article mainly discusses part of the knowledge of MySQL index. It will start from the basics of MySQL index, the actual combat of index optimization, and the data structure behind the database index.

1. MySQL index basics

1. we will start with the basics of indexing to introduce what an index is, analyze several types of indexes, and discuss how to create an index and the basic principles of index design.

The structure of the user table created by the test index in this part is as follows:

1. What is an index?

"Index (also called "key" in MySQL) is a data structure for the storage engine to quickly find records."

"High Performance MySQL"

We need to know that an index is actually a data structure. Its function is to help us quickly match and find the required data rows. It is one of the most commonly used tools for database performance optimization. Its function is equivalent to a shopping guide in a supermarket and a catalog in a book.

2. Index Type

You can use SHOW INDEX FROM table_name; to view index details:

Primary key index PRIMARY KEY: It is a special unique index that does not allow null values. Generally, the primary key index is created at the same time as the table is built. Note: A table can only have one primary key.

Unique index UNIQUE: The value of the unique index column must be unique, but null values are allowed. If it is a composite index, the combination of column values must be unique. You can create a unique index through ALTER TABLE table_name ADD UNIQUE (column);:

You can create a unique composite index through ALTER TABLE table_name ADD UNIQUE (column1,column2);:

Ordinary index INDEX: This is the most basic index, it has no restrictions. You can create a common index through ALTER TABLE table_name ADD INDEX index_name (column);:

Composite index INDEX: that is, an index contains multiple columns, mostly used to avoid back to the table query. You can create a composite index through ALTER TABLE table_name ADD INDEX index_name(column1,column2, column3);:

Full text index FULLTEXT: also known as full text retrieval, is a key technology currently used by search engines. You can create a full-text index through ALTER TABLE table_name ADD FULLTEXT (column);:

Once an index is created, it cannot be modified. If you want to modify the index, you can only delete it and rebuild it. can use

DROP INDEX index_name ON table_name; delete the index.

3. The principle of index design

1) The column suitable for the index is the column that appears in the where clause, or the column specified in the join clause;

2) For classes with a small cardinality, the indexing effect is poor, and there is no need to create an index in this column;

3) Use a short index. If you index a long string column, you should specify a prefix length, which can save a lot of index space;

4) Don't over-index. Indexing requires additional disk space and reduces the performance of write operations. When the table content is modified, the index will be updated or even reconstructed. The more index columns, the longer this time will be. So only maintain the required index to facilitate the query.

2. MySQL index optimization actual combat

Above we introduced the basic content of the index, in this part we introduce the actual combat of index optimization. Before introducing the actual combat of index optimization, we must first introduce two important concepts related to indexing. These two concepts are essential for index optimization.

The user table structure used for testing in this part:

1. Important concepts related to indexing

Cardinality: The number of unique keys (distict_keys) in a single column is called the cardinality.

SELECT COUNT(DISTINCT name),COUNT(DISTINCT gender) FROM user;

The total number of rows in the user table is 5, and the cardinality of the gender column is 2, indicating that there are a large number of duplicate values in the gender column. The cardinality of the name column is equal to the total number of rows, indicating that the name column has no duplicate values, which is equivalent to the primary key.

Proportion of returned data: There are 5 pieces of data in the user table:

SELECT * FROM user;

Query the number of records satisfying the gender of 0 (male):

Then the proportion of records returned is:

Similarly, query the number of records whose name is'swj':

The proportion of records returned is:

Now comes the question. Assuming that the name and gender columns have indexes, can SELECT * FROM user WHERE gender = 0; SELECT * FROM user WHERE name ='swj'; all hit the index?

Index details of the user table:

SELECT * FROM user WHERE gender = 0; There is no hit index. Note that the value of filtered is the proportion of returned records calculated above.

SELECT * FROM user WHERE name ='swj'; hits the index index_name, because the record to be queried can be found directly through the index, so the filtered value is 100.

Therefore, 30% of the data in the returned table will be indexed, and full table scans will be used to return more than 30% of the data. Of course, this conclusion is too absolute, and it is not an absolute 30%, just a rough range.

Back to the table: When an index is created on a column, the index will contain the key value of the column and the rowid of the corresponding row of the key value. Accessing the data in the table through the rowid recorded in the index is called back to the table. Too many times to return to the table will seriously affect SQL performance, if there are too many times to return to the table, you should not go through an index scan, and you should go directly to a full table scan.

The Using Index in the result of the EXPLAIN command means that the main data can be obtained through the index without returning to the table. Using Where means the need to return to the table to fetch data.

2. Index optimization practice

Sometimes although the database has an index, it is not selected by the optimizer. We can use SHOW STATUS LIKE'Handler_read%'; to view the usage of the index:

Handler_read_key : If the index is working, the value of Handler_read_key will be very high.

Handler_read_rnd_next : The number of requests to read the next row in the data file. If a large number of table scans are in progress, the value will be higher, indicating that the index utilization is not ideal.

Index optimization rules:

1) If MySQL estimates that using an index is slower than a full table scan, it will not use the index.

The ratio of returned data is an important indicator. The lower the ratio, the easier it is to hit the index. Remember this range value-30%, the content mentioned later is based on the proportion of returned data within 30%.

2) The leading fuzzy query cannot hit the index.

Create a common index on the name column:

The leading fuzzy query cannot hit the index:

EXPLAIN SELECT * FROM user WHERE name LIKE'%s%';

Non-leading fuzzy queries can use indexes, which can be optimized to use non-leading fuzzy queries:

EXPLAIN SELECT * FROM user WHERE name LIKE's%';

3) The index will not be hit when the data type is implicitly converted, especially when the column type is a string, the character constant value must be enclosed in quotation marks.

EXPLAIN SELECT * FROM user WHERE name=1;

EXPLAIN SELECT * FROM user WHERE name='1';

4) In the case of a composite index, the query condition does not include the leftmost part of the index column (the leftmost principle is not satisfied), and the index will not be hit.

Create a composite index on the name, age, and status columns:

ALTER TABLE user ADD INDEX index_name (name,age,status);

User table index details:

SHOW INDEX FROM user;

According to the leftmost principle, the composite index index_name can be hit:

EXPLAIN SELECT * FROM user WHERE name='swj' AND status=1;

Note that the leftmost principle does not mean the order of query conditions:

EXPLAIN SELECT * FROM user WHERE status=1 AND name='swj';

It is whether the leftmost column of the index is included in the query condition:

EXPLAIN SELECT * FROM user WHERE status=2;

5) Union, in, or can all hit the index, it is recommended to use in.

union:

EXPLAIN SELECT*FROM user WHERE status=1

UNION ALL

SELECT*FROM user WHERE status = 2;

in:

EXPLAIN SELECT * FROM user WHERE status IN (1,2);

or:

EXPLAIN SELECT*FROM user WHERE status=1OR status=2;

The CPU consumption of the query: or>in>union.

6) Separate the condition with or. If there is an index in the condition before or, but there is no index in the following column, then the index involved will not be used.

EXPLAIN SELECT * FROM payment WHERE customer_id = 203 OR amount = 3.96;

Because there is no index in the conditional column behind or, then the following query must go through a full table scan. In the case of a full table scan, there is no need for one more index scan to increase IO access.

7) Negative conditional queries cannot use indexes, and can be optimized as in queries.

Negative conditions include: !=, <>, not in, not exists, not like, etc.

Create an index on the status column:

ALTER TABLE user ADD INDEX index_status (status);

User table index details:

SHOW INDEX FROM user;

Negative conditions cannot hit the cache:

EXPLAIN SELECT * FROM user WHERE status !=1 AND status != 2;

It can be optimized as an in query, but the premise is that the discrimination is high, and the proportion of returned data is within 30%:

EXPLAIN SELECT * FROM user WHERE status IN (0,3,4);

8) Range condition query can hit the index. Range conditions are: <, <=, >, >=, between, etc.

Create indexes for status and age columns:

ALTER TABLE user ADD INDEX index_status (status);

ALTER TABLE user ADD INDEX index_age (age);

User table index details:

SHOW INDEX FROM user;

Range condition query can hit the index:

EXPLAIN SELECT * FROM user WHERE status>5;

The range column can use the index (the joint index must be the leftmost prefix), but the column after the range column cannot use the index. The index can be used for at most one range column. If there are two range columns in the query condition, the index cannot be used for all :

EXPLAIN SELECT * FROM user WHERE status>5 AND age<24;

If the range query and the equivalent query exist at the same time, the index of the equivalent query column is matched first:

EXPLAIN SELECT * FROM user WHERE status>5 AND age=24;

8) The database execution calculation will not hit the index.

EXPLAIN SELECT * FROM user WHERE age>24;

EXPLAIN SELECT * FROM user WHERE age+1>24;

The calculation logic should be placed in the business layer as much as possible to save the CPU of the database while hitting the index to the maximum.

9) Use the covering index to query to avoid returning to the table.

The data of the queried column can be obtained from the index instead of the row locator row-locator and then the row is obtained, that is, "the column to be queried should be covered by the index built", which can speed up the query.

Index details of the user table:

Because the status field is an index column, you can get the value directly from the index without going back to the table to query:

Using Index means to query from the index:

EXPLAIN SELECT status FROM user where status=1;

When querying other columns, you need to query back to the table, which is one of the reasons why SELECT* should be avoided:

EXPLAIN SELECT * FROM user where status=1;

10) The column to be indexed cannot be null.

Single-column indexes do not store null values, and composite indexes do not store all-null values. If the column is allowed to be null, you may get "unexpected" result sets. Therefore, please use the not null constraint and the default value.

Remark column to create an index:

ALTER TABLE user ADD INDEX index_remark (remark);

IS NULL can hit the index:

EXPLAIN SELECT * FROM user WHERE remark IS NULL;

IS NOT NULL cannot hit the index:

EXPLAIN SELECT * FROM user WHERE remark IS NOT NULL;

Although IS NULL can hit the index, NULL itself is not a good database design, you should use NOT NULL constraints and default values.

a. It is not appropriate to create indexes on fields that are updated frequently: because the update operation will change the B+ tree and rebuild the index. This process is very consuming database performance.

b. It is not suitable to build an index on a field with a small degree of distinction: similar to a field with a small degree of distinction, such as gender, it is of little significance to establish an index. Because the data cannot be filtered effectively, the performance is equivalent to that of a full table scan. In addition, when the proportion of returned data is outside 30%, the optimizer will not choose to use the index.

c. Fields with unique characteristics in the business, even if it is a combination of multiple fields, must be unique indexes. Although the unique index will affect the insert speed, the speed increase for the query is very obvious. In addition, even if a very complete verification control is done at the application layer, as long as there is no unique index, dirty data will still be generated in the case of concurrency.

d. When multiple tables are associated, ensure that there must be an index on the associated field.

e. Avoid the following misconceptions when creating an index: The more indexes the better, and that one query needs to be built; Ning Que, don t overdo it, believing that the index will consume space and seriously slow down updates and new additions; Resist the unique index and believe that the business is unique Sexuality needs to be solved at the application layer through the "check before insertion" method; if you optimize too early, you start to optimize without understanding the system.

3. Summary

For the SQL query statements written by yourself, try to use the EXPLAIN command to analyze it, and be a programmer who pursues SQL performance. To measure whether a programmer is reliable, SQL ability is an important indicator. As a back-end programmer, I think so.



END

The programmer's growth path

Although the road is far away, you must go


WeChat ID: cxydczzl