Things to remember when launching a product, or the first steps toward handling system load properly
In the days of dial-up modems, we waited minutes for a web page to load and accepted it. That has changed: now we are not willing to wait even 10 seconds. As soon as we see an error on a page, we find a similar resource that works faster and more reliably, and we are unlikely to return to the first one.
If you cannot answer these questions with confidence:
- Will the application handle the planned flow of users?
- Will it still work properly if the number of users grows?
- How many users can the server handle without crashing?
- What happens if the load exceeds what the server can handle?
- If a surge of visitors crashes the system, will the servers recover, and how long will that take?
- Will the service still run stably after a week, a month, a year?
then it is high time to consider load testing your system or service.
Users expect a site to respond instantly, without failures or errors. According to research by Akamai:
- 47% of users expect a web page to load in 2 seconds or less;
- 40% of visitors leave a site if it takes more than 3 seconds to load;
- 52% say that fast loading increases their loyalty.
Slow loading undermines customer loyalty to the company. This is confirmed by research from Gomez:
- at peak hours, 75% of visitors left for competitors' sites rather than wait for the page to load;
- 88% of visitors are unlikely to return to a site after an unsuccessful attempt to open it;
- 55% of users formed a less positive opinion of the company as a whole when the site loaded slowly;
- 33% shared their negative impression with friends.
So, if a site has loading problems, the business is already losing money: the number of users falls, there are fewer orders and more negative reviews. The right thing to do is to calculate the load on the service and its hardware correctly in advance, before the project is launched.
System load. First steps
So, you have decided that your system or service needs load testing. Below are some basic steps that will help you take care of the load in advance.
Step 1. Goals and requirements
We define them before starting work. They determine how we will load the system and which aspects we need to consider. The analysis differs depending on the goals we set.
For example, the objectives may be as follows:
- determine the maximum system performance in the existing configuration;
- check system reliability: analyze possible memory leaks and the impact of regular third-party tasks on the system, for instance when a database backup is created;
- identify potential bottlenecks in the system.
The main requirements usually include:
- Response time: 90% of queries must be serviced within 10 seconds. Demanding that all 100% of queries be serviced on time is meaningless: the remaining 10% is a reserve for emergencies and outliers. Service time requirements may differ between queries: opening an internal page should take a fraction of a second, while downloading the annual report may take a few minutes;
- Capacity: the system should serve 100 simultaneous users while still meeting the response time requirement above.
In general, such a list of requirements is enough to start testing. If the system uses non-standard protocols or functionality, such as video or streaming, the requirements are defined individually.
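The response time requirement can be checked mechanically once response times have been collected. The following is a minimal sketch, assuming the nearest-rank definition of a percentile; the function names and the sample values are hypothetical and only illustrate the "90% within 10 seconds" rule.

```python
def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = -(-pct * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


def meets_requirement(response_times_s, limit_s=10.0, pct=90):
    """True if pct percent of the queries were served within limit_s seconds."""
    return percentile(response_times_s, pct) <= limit_s


# Hypothetical response times in seconds; one outlier is tolerated.
samples = [0.4, 0.6, 0.8, 1.1, 1.5, 2.0, 3.2, 4.8, 9.5, 42.0]
p90 = percentile(samples, 90)
ok = meets_requirement(samples)
print(p90, ok)
```

The single 42-second outlier does not fail the check, which is exactly the reserve that the 90% threshold is meant to provide.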
Step 2. User scenarios
Use case of ordering goods in the online store
When preparing load testing scripts, there is no need to cover all existing functionality, so we pick the most important and most frequently used user actions: search, selection and purchase, payment, ordering. We leave out, for example, changing a password, an avatar or passport data: this functionality does not affect the overall performance assessment of the system. This saves time on developing scripts.
Step 3. Testing Tools
The main load testing tool that we use is JMeter. It is not the only tool, but it is the most popular one. It suits most projects because it is flexible, cross-platform and supports a large number of protocols. In addition, JMeter is free and is supported by a large community of developers who keep improving it.
For JMeter we develop scripts: sets of commands that are sent to the server while a user is active. Each script simulates a separate user action: logging in with a login and password, searching, adding items to the shopping cart, etc. Together they form the user scenario from step 2. To test how many users the system can withstand, we run a group of virtual users (their number depends on the goals and requirements from step 1).
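The idea of a group of virtual users can be illustrated outside JMeter. The sketch below is not a JMeter test plan: it is a plain Python stand-in in which each virtual user is a thread that walks through the same scenario. The step names, the `fake_request` stub and the group size are assumptions made for the illustration; a real test would send actual requests to the server.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical scenario: the actions one user performs, in order.
SCENARIO = ["login", "search", "add_to_cart", "checkout"]


def fake_request(step, user_id):
    """Stand-in for a real call to the server; always succeeds."""
    time.sleep(0.001)
    return "ok"


def run_virtual_user(user_id, send=fake_request):
    """Run the whole scenario for one user and time every step."""
    results = []
    for step in SCENARIO:
        started = time.perf_counter()
        status = send(step, user_id)
        elapsed = time.perf_counter() - started
        results.append((user_id, step, status, elapsed))
    return results


def run_group(user_count, send=fake_request):
    """Run user_count virtual users concurrently, one thread per user."""
    with ThreadPoolExecutor(max_workers=user_count) as pool:
        per_user = list(
            pool.map(lambda uid: run_virtual_user(uid, send), range(user_count))
        )
    # Flatten the per-user lists into one list of samples.
    return [row for rows in per_user for row in rows]


samples = run_group(user_count=5)
print(len(samples), "samples collected")
```

Each sample records the user, the step, the status and the elapsed time, which is roughly the kind of data a load testing tool collects for later analysis.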
Step 4. Test data and scripts
Take load testing of an online store as an example. To run it, we need a sufficient number of unique users, documents or products.
Let's say the estimated number of customers is 1000. In this case, we check how the system behaves if all 1000 people order goods at the same time, so we create 1000 users. If the number of users is expected to grow, we create more users in advance to find the limit of visitors that the system can withstand.
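Test users are usually generated rather than typed in by hand. Here is a minimal sketch that produces 1000 unique accounts and writes them as CSV, a format that JMeter can read through its CSV Data Set Config element. The field names, the login prefix and the `example.test` domain are assumptions for the illustration; the real fields depend on what the store's registration requires.

```python
import csv
import io

FIELDS = ["login", "password", "email"]


def generate_users(count, prefix="loadtest_user"):
    """Build `count` unique test accounts with predictable credentials."""
    return [
        {
            "login": f"{prefix}_{i:04d}",
            "password": f"pw_{i:04d}",
            "email": f"{prefix}_{i:04d}@example.test",
        }
        for i in range(1, count + 1)
    ]


def write_users_csv(users, stream):
    """Write the accounts as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(users)


users = generate_users(1000)
buffer = io.StringIO()  # replace with open("users.csv", "w", newline="") for a file
write_users_csv(users, buffer)
print(buffer.getvalue().splitlines()[1])
```

The same accounts must also exist on the system under test, so in practice a script like this is paired with a bulk import into the test database.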
Right before testing, we check that:
- all scripts have been debugged;
- the test data has been prepared;
- the load that will be applied to the system and servers has been agreed upon;
- the agents that monitor metrics such as memory consumption, CPU time, paging file usage and network activity have been installed.
Step 5. Actual loading
We build the load profile based on the goal. For example, to determine the capacity of a system, we configure the test so that the number of active users keeps increasing. We monitor the server, the response time and the error rate, and note where the first inflection point occurs: the saturation point. This is where degradation begins: the system still seems to meet the requirements under the load, but the limit of the machine's resources is already close.
Saturation point: degradation begins
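A load profile with a growing number of users is often built as a staircase: add a fixed number of users, hold the load for a while, then add more. The sketch below generates such a profile; the function name, the step size and the hold time are assumptions chosen for the illustration, not values from a real project.

```python
def step_profile(start_users, step_users, max_users, hold_seconds):
    """Return (start_time_s, active_users) pairs for a stepwise ramp-up."""
    if start_users <= 0 or step_users <= 0 or hold_seconds <= 0:
        raise ValueError("all parameters must be positive")
    profile = []
    users = start_users
    start_time = 0
    while users <= max_users:
        profile.append((start_time, users))
        users += step_users
        start_time += hold_seconds
    return profile


# Hypothetical profile: +50 users every 2 minutes, up to 300 users.
profile = step_profile(start_users=50, step_users=50, max_users=300, hold_seconds=120)
for start_time, users in profile:
    print(f"t={start_time:>4}s  users={users}")
```

Holding each step for some time lets the metrics settle, so a change in response time can be attributed to a specific number of users.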
The next point is where the requirements are violated: we go beyond normal service. The system still works, but this is the limit of normal service.
System at the limit of normal service
We go further to see what happens. The next point is the point of failure: memory or processor time runs out, and the service hangs. Here we stop the test and record the maximum capacity of the system.
Point of failure: memory or processor time runs out
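The three points described above can be located in the measurements taken at each load step. The sketch below is one possible way to do it, under loudly stated assumptions: saturation is taken as the first step where the 90th-percentile response time exceeds 1.5 times the value at the lightest load, violation as the first step over the 10-second limit, and failure as the first step where at least half of the requests fail. These thresholds and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    users: int         # simultaneous users at this load step
    p90_s: float       # 90th-percentile response time, seconds
    error_rate: float  # share of failed requests, 0..1


def first_where(steps, predicate):
    """Number of users at the first step matching the predicate, or None."""
    for step in steps:
        if predicate(step):
            return step.users
    return None


def find_points(steps, limit_s=10.0, growth_factor=1.5, failure_error_rate=0.5):
    """Locate saturation, requirement violation and failure in a ramp-up run."""
    baseline = steps[0].p90_s  # response time under the lightest load
    return {
        "saturation": first_where(steps, lambda s: s.p90_s > baseline * growth_factor),
        "violation": first_where(steps, lambda s: s.p90_s > limit_s),
        "failure": first_where(steps, lambda s: s.error_rate >= failure_error_rate),
    }


# Hypothetical measurements from a stepwise ramp-up.
steps = [
    Step(20, 1.0, 0.0), Step(40, 1.1, 0.0), Step(60, 1.2, 0.0),
    Step(80, 2.4, 0.0), Step(100, 6.0, 0.01), Step(120, 12.5, 0.05),
    Step(140, 30.0, 0.8),
]
points = find_points(steps)
print(points)
```

In practice the saturation point is usually confirmed against the resource graphs as well, since response time alone can hide a server that is already close to its limit.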
Step 6. Analyzing the results
We analyze the data obtained during testing (metrics, application logs, resource utilization data for each server in the system, database logs) and identify problem areas and bottlenecks. The issues vary: from an incorrectly configured server that is part of the system to a deadlock in the database or errors in the application code. Such problems often cannot be identified during functional testing, when 2-3 people use the system. Without load testing, the problem will be discovered in production, when hundreds or thousands of visitors start using the application.
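Part of this analysis can be automated. When JMeter saves results as CSV, each row is one request, with columns such as `label`, `elapsed` (in milliseconds) and `success`. The sketch below groups such rows by label and reports the request count, the 90th-percentile time and the error rate. The inline sample data is made up, and a real results file contains more columns than the ones used here.

```python
import csv
import io
from collections import defaultdict


def summarize(stream):
    """Aggregate JMeter-style CSV rows into per-label statistics."""
    by_label = defaultdict(lambda: {"elapsed": [], "errors": 0})
    for row in csv.DictReader(stream):
        bucket = by_label[row["label"]]
        bucket["elapsed"].append(int(row["elapsed"]))
        if row["success"].strip().lower() != "true":
            bucket["errors"] += 1

    summary = {}
    for label, bucket in by_label.items():
        ordered = sorted(bucket["elapsed"])
        rank = -(-90 * len(ordered) // 100)  # nearest-rank 90th percentile
        summary[label] = {
            "count": len(ordered),
            "p90_ms": ordered[rank - 1],
            "error_rate": bucket["errors"] / len(ordered),
        }
    return summary


# Made-up sample with only the columns this sketch needs.
SAMPLE = """timeStamp,elapsed,label,responseCode,success
1700000000000,120,search,200,true
1700000000100,180,search,200,true
1700000000200,950,checkout,200,true
1700000000300,4000,checkout,500,false
"""

report = summarize(io.StringIO(SAMPLE))
for label, stats in report.items():
    print(label, stats)
```

A per-label breakdown like this points to the action that degrades first, which narrows down where to look in the server and database logs.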
One such bottleneck is visible in the graph of CPU utilization on one of the servers against the number of users working with the system simultaneously. When the number of visitors exceeded 200, the CPU utilization of this server reached 100%, and the entire service stopped responding. Note that this load was lower than the estimated number of real users.
Graph of CPU utilization against the number of users
After the processor on that server was upgraded, the number of simultaneous users the system could handle increased 2.5 times, which exceeded the planned load on the service.
After the load testing, a test results report is prepared. It includes a list of errors, suggestions for optimizing the operation of the system and general recommendations.
Many people ignore load testing, hoping that problems will not occur. As the saying goes, trust, but verify. Checking the load before launching the service protects the product from slow or incorrect operation and determines the limit of its performance on the selected equipment. Practice shows that simple configuration adjustments are often enough to speed up a project by 5-10 times and make it resistant to stress loads.