The pandemic and the subsequent changes in the business environment have moved many organisations from bricks-and-mortar outlets to e-commerce as their prime source of income. As a result, an organisation’s Web presence is now key to its business strategy.
In turn, the management of their web presence and the associated IT infrastructure becomes vital. E-commerce systems need to be available to internal and external users at all times, meaning that the IT infrastructure supporting the corporate network must be up and running 24/7/365.
E-commerce websites, and websites in general, need to respond quickly. Customers will click away if they feel that a site is slow or unresponsive.
Website performance is not only dependent on its design and programming, but also on the webserver hosting it.
Monitoring of the Web server is vital. A CPU load that is too high can cause problems, as can excessive memory usage and low levels of storage space. One time to be especially vigilant is when adding a new application to the server.
Web Server Monitoring
Monitoring of webserver performance is essential, but it is all too easy to become bogged down in excessive information. Here are 10 general areas to monitor.
Uptime
Uptime is the most important metric. Downtime means lost sales and disruption to the company's operations, and long or frequent downtime can result in reputational damage. Aim for “Five Nines”: 99.999% uptime.
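To make the “Five Nines” target concrete, the arithmetic below converts an availability percentage into a downtime budget. This is a minimal sketch: the function name `allowed_downtime_minutes` is illustrative, and it assumes a 365-day year.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the downtime budget, in minutes, for an availability target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - availability_pct / 100)


# Compare common availability targets over one (365-day) year.
for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows {minutes:.2f} minutes of downtime per year")
```

At 99.999%, the budget works out to roughly 5.26 minutes of downtime per year, which shows how little margin the target leaves for unplanned outages.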
CPU Usage
An active web server uses CPU resources to service requests. A high level of CPU usage can mean that there are too many active connections, that a process is hogging resources, or that you need more CPU power because usage is constantly close to capacity.
A graph of CPU usage over time will show any resource usage patterns and whether the CPU is being over-utilised.
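A graph makes sustained load easy to spot by eye; the same check can be approximated in code. The sketch below assumes CPU usage percentages have already been collected at regular intervals by whatever monitoring tool is in use. The function name, the 90% threshold, and the three-sample window are illustrative assumptions, not recommended values.

```python
def sustained_high_cpu(samples, threshold=90.0, min_consecutive=3):
    """Return True if usage stays at or above `threshold` for
    `min_consecutive` samples in a row."""
    run = 0
    for pct in samples:
        # Extend the current run of high readings, or reset it.
        run = run + 1 if pct >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


# A brief spike is ignored; a sustained plateau is flagged.
spike = [35, 92, 40, 45, 50]
plateau = [35, 92, 40, 95, 96, 97, 50]
print(sustained_high_cpu(spike))    # False
print(sustained_high_cpu(plateau))  # True
```

Requiring several consecutive high readings separates a momentary spike from a CPU that is constantly close to capacity.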
Time to First Byte
One item that is often overlooked is the time it takes for a page to start to load. Customers won’t look at a blank or static screen for more than a couple of seconds before moving on. The “Time to First Byte”, or TTFB, is how long the browser waits after sending a request before the first byte of the response arrives, that is, before the page starts to load.
A slow TTFB can stem from factors such as server configuration or traffic. Once TTFB has been sorted out, the focus moves to the operation of the website itself.
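TTFB can be approximated with Python's standard library. The sketch below is one way to do it, under stated assumptions: `measure_ttfb` is a hypothetical helper name, it uses plain HTTP (`http.client.HTTPSConnection` would be needed for HTTPS sites), and the timer includes connection setup as well as the server's processing time.

```python
import http.client
import time


def measure_ttfb(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> float:
    """Return the seconds elapsed between starting a GET request and
    receiving the response status line and headers."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)      # opens the connection and sends the request
        response = conn.getresponse()  # returns once the status line and headers arrive
        elapsed = time.perf_counter() - start
        response.read()                # drain the body before closing
        return elapsed
    finally:
        conn.close()
```

Calling `measure_ttfb("www.example.com")` at regular intervals and plotting the results gives a TTFB trend that can be compared against server configuration changes and traffic levels.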
Connection Requests
The first indication of a problem could be that new connection requests are being denied. A graph of connection requests over time will show when the problem first arose, and the number and frequency of denied connections.
The reason for the denials will need further analysis.
Memory Usage
As with CPU utilisation, RAM usage is key to web server performance. Each active process needs RAM, and web server performance will drop if processes have to wait for currently running processes to terminate and release memory.
Again, a graph of memory utilisation over time will indicate whether additional memory is needed.
Disk Storage Performance
If a process needs to store data in, or retrieve data from, storage, then storage speed can affect web server performance. A typical implementation could have several databases and servers in the infrastructure, for example for authentication, payment, and inventory. Optimising storage performance can improve web server performance.
Once more, a graph showing data transfer volumes and speeds over time will show if data transfer speeds are a bottleneck. You may need to consider replacing slow HDDs with faster HDDs or even SSD storage.
Request Processing Time
Request processing time is how long the webserver takes to service a request. This will be a composite of the time taken by any subsidiary processes plus the webserver process itself. For example, it may include database access and request times.
One point to note: if you see long request times, the cause may not be the webserver itself. A subordinate process could be causing the delay.
Error Rates
Error rates show the number of service requests that are not met. Again, these may not arise from web server problems but could be a function of badly designed applications or server misconfiguration.
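As a rough illustration, an error rate can be derived from the HTTP status codes recorded in the server's access log. The sketch below assumes the status codes have already been extracted into a list, and it treats only 5xx responses as unmet requests. That is a simplifying assumption; some teams also count selected 4xx codes. The function name `error_rate` is illustrative.

```python
def error_rate(status_codes):
    """Return the fraction of responses with a 5xx (server error) status."""
    codes = list(status_codes)
    if not codes:
        return 0.0  # no traffic, so no errors to report
    errors = sum(1 for code in codes if 500 <= code <= 599)
    return errors / len(codes)


# Two of these five responses are server errors (500 and 503).
recent = [200, 200, 500, 404, 503]
print(f"Error rate: {error_rate(recent):.0%}")  # Error rate: 40%
```

Tracking this ratio per application or per URL path can help to show whether the errors come from the web server or from a badly behaved application behind it.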
Thread Count
Overall server load can be estimated by the number of active threads at any one time. Configuration of the server will limit the number of threads a process can use or spawn, setting a maximum threshold. Setting the threshold too low can prevent processes from operating at maximum efficiency, while setting it too high can result in resource over-utilisation.
Monitor thread count to see how often it hits the threshold.
Average Response Time (“ART”)
Measuring request/response times gives an average response time. The lower the better. Bear in mind that it can be affected by unusual circumstances, for example, a sharp increase in connection requests caused by a marketing promotion.
Peak Response Time (“PRT”)
In addition to average response time, measuring peak response times gives a good indication of any issues that might need attention. For example, if the ART is fine, but the PRT is very much higher, that could indicate that a particular request generated by an application process or systems operation is an anomaly that must be investigated.
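The relationship between ART and PRT can be sketched in a few lines. The example below assumes response times have been collected in milliseconds; the function name and the sample figures are illustrative, not measurements from a real server.

```python
from statistics import mean


def response_time_summary(times_ms):
    """Return average (ART), peak (PRT), and the peak-to-average ratio."""
    times = list(times_ms)
    if not times:
        raise ValueError("at least one response time is required")
    art = mean(times)
    prt = max(times)
    # A ratio well above 1 suggests that a few requests are outliers.
    ratio = prt / art if art else float("nan")
    return {"art_ms": art, "prt_ms": prt, "peak_to_average": ratio}


# Five typical requests and one slow outlier.
summary = response_time_summary([120, 130, 110, 125, 115, 900])
print(summary["art_ms"], summary["prt_ms"], round(summary["peak_to_average"], 1))
# 250 900 3.6
```

Here the average of 250 ms looks acceptable on its own, but the 900 ms peak is 3.6 times higher, which is the kind of gap that points to a specific request worth investigating.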
If your business is an e-commerce business, or you operate a website providing customer services, a high level of web server performance is essential. Monitoring and remediation form a continuous process.