Back-end performance optimization

1. What is system optimization

One aspect of system optimization is the systematic analysis and optimization of every link in an IT system or transaction chain; the other is the analysis and optimization of the bottleneck points of a single system. The goals are roughly the same: improve the system's response speed and throughput, and reduce coupling between layers so the business can respond flexibly to a changing market.

Three levels of system optimization: the IT architecture governance layer, the system layer, and the infrastructure layer.

  • IT system governance layer: The purpose of optimization is not only performance optimization, but also application architecture optimization (such as: application layering, service governance, etc.) to adapt to changes in business architecture.

  • System layer: The purposes of optimization include business process optimization and data process optimization (for example: increasing the load the system can carry, reducing system overhead, etc.).

  • Infrastructure layer: The purpose of optimization is mainly to improve the capabilities of the IaaS platform (for example: building elastic clusters with horizontal scaling capability, supporting rapid provisioning and reallocation of resources, etc.).

2. Methodology and ideas for system optimization

What is methodology? My personal understanding: it sounds impressive, and those who have done the work may dismiss it as empty talk, but it points the direction for action and for continuous improvement.

2.1 Common methodology

(1) Do not access unnecessary data: remove unnecessary links from the transaction line to reduce failure points and maintenance points.

(2) Loading/caching close to the consumer is king: reduce unnecessary access.

(3) Fault isolation: do not let a bottleneck in one system overwhelm the entire trading platform.

(4) Good scalability: use resources rationally, improve processing efficiency, and avoid single points of failure.

(5) Optimize the transaction chain to increase throughput: process asynchronously / reduce serialization, split reasonably (vertically/horizontally), and move rule checks forward.

(6) Performance is as important as functionality: if each of 5 links on the transaction chain drops to 90% of its design-stage performance, overall performance falls to 59% of the design stage (0.9^5 ≈ 0.59).
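The compounding effect in point (6) is easy to verify:

```java
public class ChainPerformance {
    public static void main(String[] args) {
        double perLink = 0.90;   // each link achieves 90% of designed performance
        int links = 5;           // five links on the transaction chain
        double overall = Math.pow(perLink, links);
        // Overall chain performance: 0.9^5 = 0.59049, i.e. about 59% of design
        System.out.printf("Overall: %.1f%% of design%n", overall * 100);
    }
}
```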

2.2 The general idea of optimization


2.3 Principles of optimization

  • In the design and development of application systems, performance should always be considered.

  • Determining clear performance goals is key.

  • Performance tuning accompanies the entire project cycle. It is best to set goals in stages; once the expected performance goal of a stage is reached, the work of that stage can be summarized and the knowledge transferred to the next stage of tuning.

  • You must ensure that the tuned program runs correctly.

  • Performance depends more on good design; tuning techniques are only an auxiliary means.

  • The tuning process is an iterative and gradual process, and the results of each tuning must be fed back to subsequent code development.

  • Performance tuning cannot be at the expense of code readability and maintainability.

3. Performance tuning

3.1 Common performance problems

3.1.1 Common client performance issues

  • Slow loading: Slow startup or slow reload for the first time;

  • No response: the page hangs after an event is triggered;

  • Severely affected by network bandwidth: pages load slowly in areas with poor network conditions because a large number of resource files must be downloaded;

  • JS memory overflow: frequent operations on object properties consume a large amount of memory and eventually cause an overflow.

3.1.2 Common J2EE system performance problems

  • Memory leak: during operation, memory is continuously occupied and cannot be reclaimed. Memory usage grows linearly with time or load, and processing efficiency falls as time or concurrency increases, until the memory allocated to the JVM is exhausted and the system goes down, or returns to normal only for a short time after a restart.

  • Resource leakage: resources are opened but never closed, or closing fails. Such resources include data source connections, file streams, etc.; when they are frequently opened without being successfully closed, resources leak. Database connection leakage is a common case.

  • Overload: the system is used beyond the load it can bear.

  • Internal resource bottleneck: bottlenecks caused by over-use or under-allocation of resources.

  • Thread blocking and thread deadlock: threads wait at a synchronization point that can never complete, blocking communication.

  • Slow application response: long response times caused by the application itself or by unreasonable SQL.

  • Unstable application performance, alternating between fast and slow.

  • Various exceptions in the application system: some thrown by the middleware server, some by the database side.

3.1.3 Common database problems

  • Deadlock: locks are held too long or executed inefficiently so they cannot be released in time, or circular waiting causes table deadlocks;

  • IO busy: A lot of IO waits due to bad SQL or unreasonable business logic design;

  • High CPU usage: High concurrency or cache penetration causes the database CPU to remain high or fluctuate.

3.2 The specific work of tuning

As the martial-arts saying goes, only speed is unbeatable. The first task is to improve the system's response time (response time = service time + queuing time) and to raise the system's throughput by reducing queuing time.

Response time curve (from "Oracle Performance Forecast")

The vertical axis is response time, the sum of service time and queuing time. The horizontal axis is the arrival rate. As the number of transactions entering the system per unit time increases, we move to the right along the curve. As the arrival rate continues to increase, at some point queuing time rises sharply; response time rises sharply with it, performance drops, and users become very frustrated.
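This hockey-stick behavior can be sketched with the textbook M/M/1 queueing formula, response time = S / (1 - lambda * S), where S is the service time and lambda the arrival rate. This is a simplified illustration, not the book's exact derivation; the 10 ms service time is an assumed value:

```java
public class ResponseTimeCurve {
    // M/M/1 approximation: response time = S / (1 - lambda * S),
    // valid while utilization (lambda * S) is below 1.
    static double responseTime(double serviceTime, double arrivalRate) {
        double utilization = arrivalRate * serviceTime;
        if (utilization >= 1.0) return Double.POSITIVE_INFINITY; // saturated
        return serviceTime / (1.0 - utilization);
    }

    public static void main(String[] args) {
        double s = 0.010; // assumed 10 ms service time
        for (double rate : new double[]{10, 50, 90, 99}) { // transactions per second
            System.out.printf("arrival=%.0f/s  response=%.1f ms%n",
                    rate, responseTime(s, rate) * 1000);
        }
    }
}
```

At 10 tx/s the response time is barely above the service time; at 99 tx/s (99% utilization) it is a hundred times the service time, which is exactly the sharp knee in the curve.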

Let's analyze the specific work of performance optimization through cases in previous projects.

3.2.1 Optimization of trading lines

The transaction line starts from the service consumer and examines the functions the transaction must complete at each level, and the relationships between function points. The relationship between function points is represented as a directed path:


Principles of trading line optimization:

  • The shortest path: reduce unnecessary links and avoid points of failure;

  • Transaction integrity: ensure the consistency of transactions in all links of the transaction line through offset or compensation transactions;

  • Fault isolation and fast location: shield normal transactions from the impact of abnormal conditions, and locate problems quickly through transaction codes or error codes;

  • Flow control principle: the flow of each service channel can be controlled and combined with priority settings so that high-priority services are served first;

  • The timeout-control funnel principle: keep the timeout setting of each front-end system on the transaction line greater than that of the back-end system behind it.
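The funnel principle can be expressed as a simple invariant: timeouts should strictly decrease from the front of the chain to the back. A minimal check (method name and timeout values are illustrative):

```java
public class TimeoutFunnel {
    /** Returns true if timeouts strictly decrease from front end to back end. */
    static boolean isValidFunnel(int[] timeoutsMsFrontToBack) {
        for (int i = 1; i < timeoutsMsFrontToBack.length; i++) {
            // A downstream timeout equal to or larger than its caller's breaks the funnel:
            // the caller would give up while the callee is still working.
            if (timeoutsMsFrontToBack[i] >= timeoutsMsFrontToBack[i - 1]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidFunnel(new int[]{30000, 20000, 10000})); // valid funnel
        System.out.println(isValidFunnel(new int[]{10000, 20000}));        // inverted: invalid
    }
}
```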

[Case] As the architecture evolved, what were previously siloed systems gradually developed into independent units that can be flexibly composed, with the service as the basic unit:


In the course of service governance, the original core business system was broken into independent business components. Middle-tier platform systems gradually built business services and process services on top of these components, becoming the support for rapid construction of front-end applications. In this process, service identification and construction are the foundation, and the transaction line specification is the guarantee: through the specification it can be determined what service governance has done and has not done. As software versions iterate, few individuals can keep all the details of the system in mind, so governance must be by rules, not by individuals.

To develop an order query function A: services B and C of the service integration platform can both perform the required function, but B adds some unnecessary extra checks on top of C. By the shortest-path principle, A should call service C directly.


When service provider D has insufficient processing capacity, it should notify service consumer C in time, or discard some channel requests according to priority. The front-end consumer receives the back-end flow-control error code and informs the user promptly. This prevents all users from being denied service once the system reaches its capacity limit. One purpose of flow control is to keep every system healthy and stable. Generally, counters detect the number of concurrent transactions by transaction type: different transaction types use different counters; when a transaction request arrives the counter increments by 1, and when the request gets a response or times out it decrements by 1.
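A minimal sketch of such a per-transaction-type counter; the class name, transaction-type keys, and limit are invented, and a real implementation would also return the agreed flow-control error code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class FlowController {
    // One counter per transaction type, as described in the text.
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();
    private final int limitPerType;

    public FlowController(int limitPerType) {
        this.limitPerType = limitPerType;
    }

    /** Called when a request arrives; returns false if this type is over its limit. */
    public boolean tryAcquire(String txType) {
        AtomicInteger c = counters.computeIfAbsent(txType, k -> new AtomicInteger());
        if (c.incrementAndGet() > limitPerType) {
            c.decrementAndGet(); // over limit: roll the increment back and reject
            return false;        // caller should return a flow-control error code
        }
        return true;
    }

    /** Called when the request gets a response or times out. */
    public void release(String txType) {
        AtomicInteger c = counters.get(txType);
        if (c != null) c.decrementAndGet();
    }
}
```

A consumer receiving `false` would send the user the agreed flow-control error code instead of queuing the request indefinitely.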

3.2.2 Client optimization

The primary goal of client optimization is to speed up page presentation, and secondly to reduce calls to the server.

Common solutions:

  • Analyze the bottleneck points and make targeted optimization;

  • Caching is king, improving page response time by caching static data on the client;

  • Reduce client network download traffic through GZIP compression;

  • Use compression tools to compress js and reduce the size of js files;

  • Delete and merge scripts, style sheets and pictures to reduce get requests;

  • Load JS without blocking;

  • Pre-loading (pictures, css styles, js scripts);

  • Load js scripts on demand;

  • Optimize the js processing method to improve the page processing speed.

WEB request sequence diagram:


[Case] The following is a client HTTP request monitoring record of an enterprise's internal application system:


From the figure above, you can see that a total of 25 requests were sent (21 hits in the cache, 4 interactions with the server).


From the statistics, the total request time is 5.645 seconds, with 4 network interactions; 5.9KB of data was received and 110.25KB sent, and GZIP compression saved 8KB of data.

Later, the page's response time was brought down to about 2 seconds by optimizing back-end requests and by merging and compressing JS/JSP files.

PS: For front-end optimization, it is best to understand how browsers and HTTP work.

3.2.3 Server-side optimization


[Case] A resource leak whose specific manifestation was that ResultSet objects were not closed:

ResultSet not-closed statistics

Examining the application according to the stack trace log, we found that the program code closed only the Connection, not the Statement and ResultSet.

Regarding the question of whether closing the connection will automatically close the Statement and ResultSet, and whether the resources occupied by the Statement and ResultSet will be automatically released, the JDBC processing specification or the JDK specification makes the following description:

JDBC processing specification

JDBC. 3.0 Specification 13.1.3 Closing Statement Objects

An application calls the method Statement.close to indicate that it has finished processing a statement. All Statement objects will be closed when the connection that created them is closed. However, it is good coding practice for applications to close statements as soon as they have finished processing them. This allows any external resources that the statement is using to be released immediately.

Closing a Statement object will close and invalidate any instances of ResultSet produced by that Statement object. The resources held by the ResultSet object may not be released until garbage collection runs again, so it is a good practice to explicitly close ResultSet objects when they are no longer needed.

These comments about closing Statement objects apply to PreparedStatement and CallableStatement objects as well.

JDBC. 4.0 Specification 13.1.4 Closing Statement Objects

An application calls the method Statement.close to indicate that it has finished processing a statement. All Statement objects will be closed when the connection that created them is closed. However, it is good coding practice for applications to close statements as soon as they have finished processing them. This allows any external resources that the statement is using to be released immediately.

Closing a Statement object will close and invalidate any instances of ResultSet produced by that Statement object. The resources held by the ResultSet object may not be released until garbage collection runs again, so it is a good practice to explicitly close ResultSet objects when they are no longer needed.

Once a Statement has been closed, any attempt to access any of its methods with the exception of the isClosed or close methods will result in a SQLException being thrown.

These comments about closing Statement objects apply to PreparedStatement and CallableStatement objects as well.

Specification summary: connection.close automatically closes the Statement and invalidates the ResultSet object (note: the ResultSet object becomes invalid, but the resources it occupies may not be released). Therefore the close methods of Connection, Statement, and ResultSet should still be called explicitly. Especially when a connection pool is used, connection.close does not close the physical connection, and failing to close the ResultSet can leak even more resources.

JDK processing specifications:

JDK1.4

Note: A ResultSet object is automatically closed by the Statement object that generated it when that Statement object is closed, re-executed, or is used to retrieve the next result from a sequence of multiple results. A ResultSet object is also automatically closed when it is garbage collected.

Note: A Statement object is automatically closed when it is garbage collected. When a Statement object is closed, its current ResultSet object, if one exists, is also closed.

Note: A Connection object is automatically closed when it is garbage collected. Certain fatal errors also close a Connection object.

JDK1.5 

Releases this ResultSet object's database and JDBC resources immediately instead of waiting for this to happen when it is automatically closed.

Note: A ResultSet object is automatically closed by the Statement object that generated it when that Statement object is closed, re-executed, or is used to retrieve the next result from a sequence of multiple results. A ResultSet object is also automatically closed when it is garbage collected.

Specification description:

1. The garbage collection mechanism can automatically close them;

2. Statement closure will cause the ResultSet to be closed;

3. Connection closure does not necessarily cause Statement to be closed.

Today's application systems all use database connection pools. Closing a connection is not a physical close; it merely returns the connection to the pool, so Statements and ResultSets may still be held and actually occupy cursor resources in the database. In this case, after the system runs long enough it can report a "cursor exceeds the maximum allowed by the database" error, leaving the program unable to access the database normally.

Suggestions for this type of problem:

(1) Explicitly close database resources, especially when using Connection Pool;

(2) Best practice is to call close in the order ResultSet, Statement, Connection;

(3) To avoid memory leaks caused by Java code, set rs = null and stmt = null after rs.close() and stmt.close(), and handle exceptions properly;

(4) If a result set must be passed around, use a RowSet; a RowSet does not depend on a Connection or Statement.
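Connection, Statement, and ResultSet all implement AutoCloseable, so try-with-resources enforces suggestion (2) automatically: resources close in the reverse order of acquisition, even when an exception is thrown. The demo below verifies that order with stand-in resources rather than a live database:

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {
    // Stand-ins for Connection / Statement / ResultSet. The real JDBC types
    // are AutoCloseable too, so the same pattern applies verbatim to them.
    static class Resource implements AutoCloseable {
        final String name;
        final List<String> log;
        Resource(String name, List<String> log) { this.name = name; this.log = log; }
        @Override public void close() { log.add(name); } // record close order
    }

    public static List<String> run() {
        List<String> closeOrder = new ArrayList<>();
        try (Resource conn = new Resource("Connection", closeOrder);
             Resource stmt = new Resource("Statement", closeOrder);
             Resource rs   = new Resource("ResultSet", closeOrder)) {
            // ... use conn/stmt/rs here ...
        } // closed automatically in reverse order: ResultSet, Statement, Connection
        return closeOrder;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With try-with-resources, suggestion (3)'s manual `rs = null` bookkeeping becomes unnecessary, since the variables go out of scope as soon as the block exits.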

3.2.4 JVM optimization

The parameter adjustment for JVM needs to be handled carefully. Common JVM parameters:

Heap parameter settings

-server -Xmx1G -Xms1G -Xmn512M -XX:PermSize=512M -XX:MaxPermSize=512M -XX:+UseCompressedOops

-server: selects the "server" VM and must be used as the first parameter; its opposite is -client, the "client" VM. Adding -server changes the default values of other JVM parameters. HotSpot contains an interpreter and two compilers (client and server; one of the two is chosen) in a mixed interpretation-and-compilation execution mode, with interpretation enabled by default. The server VM starts slowly and occupies more memory but executes more efficiently, so it suits server-side applications; since JDK 1.6 it is enabled by default on 64-bit-capable JDK environments. The client VM starts fast and occupies little memory, but executes less efficiently than the server VM; it does not perform dynamic compilation by default and is usually used for client applications or for development and debugging.

PS: Some versions of HotSpot server mode have reportedly had stability problems, so whether the JVM should use server or client mode needs to be evaluated through long-term system monitoring.

Garbage collection parameter settings

-XX:+DisableExplicitGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSClassUnloadingEnabled

-XX:+DisableExplicitGC disables System.gc(), preventing programmers from calling gc by mistake and hurting performance;

PS: Historical experience suggests that when garbage collection takes less than 2% of total time, its impact on performance is considered small.

Log parameters

-XX:+PrintClassHistogram -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:log/gc.log

-XX:+ShowMessageBoxOnError -XX:+HeapDumpOnOutOfMemoryError -XX:+HeapDumpOnCtrlBreak

Set logging parameters when debugging, such as -XX:+PrintClassHistogram -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:log/gc.log, so that GC frequency can be checked in gc.log and its performance impact evaluated.

When debugging, also configure heap dumps on abnormal termination: -XX:+ShowMessageBoxOnError -XX:+HeapDumpOnOutOfMemoryError -XX:+HeapDumpOnCtrlBreak. This makes it possible to see what the system was doing when it went down.

Performance monitoring parameter settings

-Djava.rmi.server.hostname=<server IP> -Dcom.sun.management.jmxremote.port=7091 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false

By adding the above parameters, you can monitor the execution of the remote JVM through visualVM or jconsole.

JVM parameter adjustment

Adjusting heap parameters and garbage collection parameters requires stress testing plus comprehensive analysis of the monitoring records. The most effective approach is to record each combination and compare:

(Table: for each parameter combination of heap parameters and GC parameters, record the Trans Response Time, Throughput, and Passed Transactions measured under load.)
[Case] An application server was holding Object instances numbering in the millions to tens of millions. To analyze the memory overflow, a heap dump file was generated and examined with IBM HeapAnalyzer; 89.1% of the space was occupied by basic objects (caused by loading a large number of records from the database):


Monitoring with JProfiler then revealed a large number of unreleased VchBaseVo objects:


Examining the project code showed that the query used Hibernate's list() method. list() first checks the cache; if the data is not there it fetches from the database, after which Hibernate fills the first- and second-level caches accordingly. That is how millions of objects came to occupy application-server memory. Hibernate's caching is an effective mechanism in general, but here it caused a performance problem: clear() must be called to release the memory held by the first-level cache.
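The fix pattern (process records in batches and clear the session-level cache after each batch) can be illustrated with a plain map standing in for Hibernate's first-level cache; this is a schematic, not the Hibernate API, and the batch size is an assumed value:

```java
import java.util.HashMap;
import java.util.Map;

public class BatchClearDemo {
    // Schematic stand-in for Hibernate's first-level cache: every loaded
    // entity stays referenced by the session until clear() is called.
    static final int BATCH_SIZE = 1_000;

    /** Processes totalRecords entities; returns the peak number retained in memory. */
    public static int processAll(int totalRecords) {
        Map<Integer, Object> sessionCache = new HashMap<>();
        int peak = 0;
        for (int id = 0; id < totalRecords; id++) {
            sessionCache.put(id, new Object()); // entity loaded into the session cache
            // ... process the entity ...
            if (sessionCache.size() >= BATCH_SIZE) {
                peak = Math.max(peak, sessionCache.size());
                sessionCache.clear();           // analogous to session.clear()
            }
        }
        return Math.max(peak, sessionCache.size());
    }

    public static void main(String[] args) {
        // One million records, but at most BATCH_SIZE entities retained at a time.
        System.out.println(processAll(1_000_000));
    }
}
```

In real Hibernate code the analogue would be paging the query (setFirstResult/setMaxResults) or scrolling through the results, and calling session.clear() after each batch so the first-level cache never holds more than one batch of entities.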

3.2.5 Database optimization

[Case] The database of an enterprise's internal core business system showed high CPU usage during business peaks: large data queries, multi-table joins degrading query performance, and unreasonable table indexes. The following methods eventually brought CPU usage during peak periods under 30%:

Execute the following statement under SQL*PLUS: 

SQL> set line 1000           -- display up to 1000 characters per line

SQL> set autotrace traceonly -- show the execution plan and statistics without the query output

Inefficient execution of SQL statements:

select variablein0_.TOKENVARIABLEMAP_ as TOKENVAR7_1_
  from JBPM_VARIABLEINSTANCE variablein0_
 where variablein0_.TOKENVARIABLEMAP_ = '4888804'

View the execution plan before optimization:

Execution plan

----------------------------------------------------------

Plan hash value: 3971367966

-------------------------------------------------------------------------------------------
| Id | Operation          | Name                  | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |                       |   12 |   612 |  12408 (2)| 00:02:29 |
|* 1 | TABLE ACCESS FULL  | JBPM_VARIABLEINSTANCE |   12 |   612 |  12408 (2)| 00:02:29 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("VARIABLEIN0_"."TOKENVARIABLEMAP_"=4888804)


Statistics

-------------------------------------------------- --------

          1 recursive calls

          1 db block gets

      48995 consistent gets

      48982 physical reads

          0 redo size

       1531 bytes sent via SQL*Net to client

        248 bytes received via SQL*Net from client

          2 SQL*Net roundtrips to/from client

          0 sorts (memory)

          0 sorts (disk)

          9 rows processed

The execution plan shows that a missing index forces this statement into a full table scan: 48995 total consistent gets, an average of 48995/9 ≈ 5444 consistent gets per row returned, and 48982 physical reads, which does not meet normal performance requirements. After creating the index, the statistics of the execution plan are:

Statistics

-------------------------------------------------- --------

           1 recursive calls

           0 db block gets

           6 consistent gets

           4 physical reads

           0 redo size

        1530 bytes sent via SQL*Net to client

         248 bytes received via SQL*Net from client

           2 SQL*Net roundtrips to/from client

           0 sorts (memory)

           0 sorts (disk)

           9 rows processed

Now the statement consumes 6 consistent gets in total, an average of 6/9 ≈ 0.67 consistent gets per row, and 4 physical reads: an efficient piece of SQL.

As a rule of thumb, SQL that averages more than 100 consistent gets per row returned is relatively inefficient, while SQL within 10 consistent gets per row is relatively efficient.

According to previous optimization practice, the causes of inefficient SQL are mainly concentrated in the following areas:

(1) Wrong access path: mostly, a missing index, or an index invalidated by data migration, prevents an index scan during SQL execution and forces a full table scan. The solution is to create the missing index or rebuild the invalid one.

(2) Excessive use of subqueries: when joining several large tables, business logic often leads to subqueries. If the statement logic is too complex for Oracle to convert the subquery into a multi-table join automatically, Oracle may choose the wrong execution path, and statement performance drops sharply. Therefore, replace subqueries with join queries wherever possible; this helps the Oracle query optimizer choose a reasonable join order, join technique, and table access technique based on the data distribution and index design, that is, the most efficient execution plan.

(3) Bind variables: their advantage is avoiding hard parsing (not discussed further here); their disadvantage is that the wrong execution plan may be chosen, causing a sharp drop in performance. Oracle 10g introduced a bind variable classification mechanism to deal with this, and 11g maintains new execution plans by creating new child cursors, so under 11g bind variables can be used boldly.
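The parsing benefit of bind variables can be seen by counting distinct statement texts the database would have to parse; the orders table and the :id placeholder below are purely illustrative:

```java
import java.util.HashSet;
import java.util.Set;

public class BindVariableDemo {
    // Literal SQL: every id produces a distinct statement text,
    // each of which would need its own hard parse.
    static int distinctLiteralTexts(int n) {
        Set<String> texts = new HashSet<>();
        for (int id = 0; id < n; id++)
            texts.add("select * from orders where id = " + id);
        return texts.size();
    }

    // Bind variable: one shared statement text, parsed once, executed n times.
    static int distinctBindTexts(int n) {
        Set<String> texts = new HashSet<>();
        for (int id = 0; id < n; id++)
            texts.add("select * from orders where id = :id");
        return texts.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctLiteralTexts(1000)); // one text per literal value
        System.out.println(distinctBindTexts(1000));    // a single shared text
    }
}
```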

3.2.6 Load balancing optimization

Load balancing distributes access traffic and improves the system's horizontal scalability, avoiding single points of failure. The following is one project team's analysis of load balancing problems and its optimization ideas:


Load balancing algorithm:

  1. Random: randomly select one address from the pool. Benefits: simple algorithm, high performance; when request durations vary little, the back end stays roughly balanced. Disadvantages: when request durations vary widely, back-end machines easily become unbalanced.

  2. Round-robin: choose hosts in the order of the pool address list. Benefits: simple algorithm, high performance. Disadvantages: same as random; when request durations vary widely, back-end machines easily become unbalanced.

  3. By weight: hosts in the pool can be assigned weights, and requests are distributed according to those weights. Benefits: useful especially when hosts of different configurations have accumulated in a production environment over many years; with virtualization, however, that problem is solved at the IaaS layer.

  4. Hash: hash the request information, then dispatch to a machine in the pool (usually used for serving static resources). Benefits: increases the cache hit rate. Disadvantages: reading and hashing the request information consumes extra CPU resources.

  5. By response time: allocate according to measured response time. Benefits: requests go to the hosts that are performing well. Disadvantages: when request durations vary widely, back-end machines easily become unbalanced.

  6. By least connections: allocate according to each host's connection count. Benefits: balances request resources. Disadvantages: a newly added or just-restarted server can suffer performance problems from the instantaneous flood of requests.
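A minimal sketch of two of these algorithms, round-robin and weighted random, over a static pool; the class name and host addresses are illustrative:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadBalancer {
    private final List<String> pool;
    private final int[] weights;
    private final AtomicInteger next = new AtomicInteger();
    private final Random random = new Random();

    public LoadBalancer(List<String> pool, int[] weights) {
        this.pool = pool;
        this.weights = weights;
    }

    /** Round-robin: cycle through the pool in list order. */
    public String roundRobin() {
        int i = Math.floorMod(next.getAndIncrement(), pool.size());
        return pool.get(i);
    }

    /** Weighted random: hosts with higher weight receive proportionally more requests. */
    public String weightedRandom() {
        int total = 0;
        for (int w : weights) total += w;
        int r = random.nextInt(total); // pick a point on the cumulative weight line
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) return pool.get(i);
        }
        return pool.get(pool.size() - 1); // unreachable when weights sum to total
    }

    public static void main(String[] args) {
        LoadBalancer lb = new LoadBalancer(
                Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"),
                new int[]{5, 3, 2});
        System.out.println(lb.roundRobin());      // first host in list order
        System.out.println(lb.weightedRandom());  // biased toward higher-weight hosts
    }
}
```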

Session retention:

  1. No session retention: every request is treated as new and re-allocated to a back-end host by the load balancing algorithm. Benefits: simple, high performance. Disadvantages: back-end services must be stateless;

  2. Source-IP retention: after the first allocation by the load balancing algorithm, subsequent requests from the same IP go to the same host. Benefits: sessions stay relatively stable. Disadvantages: users sharing the same egress IP all land on one server;

  3. Cookie-based retention: on the first request, the load balancer inserts a cookie into the HTTP response; subsequent requests are routed to the same host based on the cookie in the request header. Benefits: relatively stable, and hosts can be switched flexibly. Disadvantages: sessions are occasionally lost when users clear cookies.

Health checks:

  1. Based on TCP port: check whether the listening port is open; if not, remove the host from the pool. Benefits: simple. Disadvantages: the container may be up while the application is not, yet requests are still distributed to the host;

  2. Based on HTTP GET / TCP requests: periodically send a request to the server and check whether the returned string matches the agreed value; if not, remove the host from the pool. Benefits: accurately determines whether the application started normally, and allows a service to be taken online or offline dynamically. Disadvantages: scripts must be written.
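The TCP-port check in item 1 can be sketched with a plain socket connect. In the demo below an ephemeral ServerSocket stands in for a live backend; the class name and timeout are illustrative:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpHealthCheck {
    /** Returns true if something is listening on host:port within timeoutMs. */
    public static boolean isUp(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;  // port open: keep the host in the pool
        } catch (IOException e) {
            return false; // connect failed or timed out: remove the host from the pool
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) { // stands in for a backend host
            System.out.println(isUp("127.0.0.1", server.getLocalPort(), 500));
        }
    }
}
```

As the text notes, this only proves the port is open; the container may accept the connection while the application behind it is not yet serving requests, which is why the HTTP GET check in item 2 is more accurate.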

Author: Kong Qinglong

This article is reproduced from the WeChat public account Mesozoic technology freshmanTechnology