MySQL Performance: How To Leverage MySQL Database Indexing

A Mysql Indexing Logo

Throughout this tutorial, we will cover some of the fundamentals of indexing. As part of the MySQL series, we will introduce capabilities of MySQL indexing and the role it plays in optimizing database performance. Liquid Web recommends consulting with a DBA before making any changes to your production level application.

What is Indexing?

Indexing is a powerful structure in MySQL which can be leveraged to get the fastest response times from common queries. MySQL queries achieve efficiency by generating a smaller table, called an index, from a specified column or set of columns. These columns, called a key, can be used to enforce uniqueness. Below is a simple visualization of an example index using two columns as a key.

+------+----------+----------+
| ROW | COLUMN_1 | COLUMN_2 |
+------+----------+----------+
| 1 | data1 | data2 |
+------+----------+----------+
| 2 | data1 | data1 |
+------+----------+----------+
| 3 | data1 | data1 |
+------+----------+----------+
| 4 | data1 | data1 |
+------+----------+----------+
| 5 | data1 | data1 |
+------+----------+----------+

Queries utilize indexes to identify and retrieve the targeted data, even if they are a combination of keys. Without an index, running that same query results in an inspection of every row for the needed data. Indexing produces a shortcut, with much faster query times on expansive tables. A textbooks analogy may provide another common way to visualize how indexes function.
This analogy compares MySQL indexing to indexing in the back of a book.

When to Enable Indexing?

Indexing is only advantageous for huge tables with regularly accessed information. For instance, to continue with our textbook analogy, it makes little sense to index a children’s storybook with only a dozen pages. It’s more efficient to simply read the book to find each occurrence of the word “turtle” than it would be to set up and maintain indexes, query for those indexes, and then review each page provided. In the computing world, those extra tasks surrounding indexing represent wasted resources which would be better purposed by not indexing.

Without indexes, when tables grow to enormous proportions, response times suffer from queries targeting those obtuse tables. Inefficient queries manifest into latency within application or website performance. We commonly identify this latency by using the MySQL slow query log feature. You can find more details about using the slow query log feature in the first article in this series: MySQL Performance: Identifying Long Queries.
Once a colossal table hits its tipping point, it reaches the potential for downtime for applications and websites. Conducting routine evaluations for growing database establishes optimal database performance and sidesteps long queries’ inherent interruptions.

MySQL Indexing Pros vs. Cons

There are benefits and downsides to using MySQL indexing, and we’ll discuss the significant pros and cons for your consideration. These aspects will guide you to decide whether indexing is an appropriate choice for your situation.

quick data transmissions and ideal for OLAP.

What Information Does One Index?

Selecting what to index is probably the most challenging part to indexing your databases. Determining what is important enough to index and what is benign enough to not index. Generally speaking, indexing works best on those columns that are the subject of the WHERE clauses in your commonly executed queries. Consider the following simplified table:

ID, TITLE, LAST_NAME, FIRST_NAME, MAIDEN_NAME, DOB, GENDER, AGE, DESCRIPTION, HISTORY, ETC...

If your queries rely on testing the WHERE clause using LAST_NAME and FIRST_NAME then indexing by these two columns would significantly increase query response times. Alternately, if your queries rely on a simple ID lookup, indexing by ID would be the better choice.

These examples are merely a rudimentary example, and there are several types of indexing structures built-in to MySQL. The following MySQL page discusses these types of indexes in greater detail, and a recommended read for anyone considering indexing: How MySQL Uses Indexes

What is a Unique Index?

Another point for consideration when evaluating which columns to serve as the key in your index is whether to use the UNIQUE constraint. Setting the UNIQUE constraint will enforce uniqueness based on the configured indexing key. As with any key, this can be a single column or a concatenation of multiple columns. The function of this constraint ensures that there are no duplicate entries in the table based on the configured key.

UNIQUE constraints increase write speeds, a taxation of implementation.

What is a Primary Key Index?

As commonly invoked as the UNIQUE constraint the PRIMARY KEY is used to optimize indexes. This constraint ensures that the designated PRIMARY KEY cannot be of a null value. As a result, a performance boost occurs when running on an InnoDB storage engine for the table in question. This boost is due to how InnoDB physically stores data, placing null valued rows in the key out of contiguous sequence with rows that have values. Enabling this constraint ensures the rows of the table are kept in contiguous order for quicker responses.

Primary Key Index is absolutely necessary for large tables.

Managing Indexes

Now we will cover some of the basics of manipulating indexes using MySQL syntax. In examples, we will include the creation, deletion, and listing of indexes.Keywords for Managing Indexes: dbName, tableName, indexName Keep in mind, these examples have placeholder entries for the specific keywords. These keywords are self-explanatory by nature for easy reading, and below is an outline of them.

Instead of tableName you can use dbName.tableName.

Listing/Showing Indexes

Tables can have multiple indexes. Managing indexes will inevitably require being able to list the existing indexes on a table. The syntax for viewing an index is below.

SHOW INDEX FROM tableName;

SHOW INDEX FROM tableName; shows all indexes.

Indexing are present on 3 different columns.

Creating Indexes

Index creation has a simple syntax. The difficulty is in determining what columns need indexing and whether enforcing uniqueness is necessary. Below we will illustrate how to create indexes with and without a PRIMARY KEY and UNIQUE constraints.

As previously mentioned, tables can have multiple indexes. Multiple indexing is useful for creating indexes attuned to the queries required by your application or website. The default settings allow for up to 16 indexes per table, increase this number but is generally more than is necessary. Indexes can be created during a table’s creation or added on to the table as additional indexes later on. We will go over both methods below.

Creating too many indexes can add latency, but if you must then increase buffers in MySQL config.

Example: Create a Table with a Standard Index

CREATE TABLE tableName (
ID int,
LName varchar(255),
FName varchar(255),
DOB varchar(255),
LOC varchar(255),
INDEX ( ID )
);
You can create an index for several columns, using ID as the index.

Example: Create a Table with Unique Index & Primary Key

CREATE TABLE tableName (
ID int,
LName varchar(255),
FName varchar(255),
DOB varchar(255),
LOC varchar(255),
PRIMARY KEY (ID),
UNIQUE INDEX ( ID )
);
You can create an Primary Key and UNIQUE constraint over several columns.

Example: Add an Index to Existing Table

CREATE INDEX indexName ON tableName (ID, LName, FName, LOC);CREATE INDEX statement creates an index and names it.

Example: Add an Index to Existing Table with Primary Key

CREATE UNIQUE INDEX indexName ON tableName (ID, LName, FName, LOC);the CREATE UNIQUE command can add an index to a table ensuring no duplicate data.

Deleting Indexes

While managing indexes, you may find it necessary to remove some. Deleting indexes is also a very simple process, see the example below:

DROP INDEX indexName ON tableName;The DROP INDEX command lets us drop indexes on particular column.

There are many ways to optimize your database for true efficiency. If you would like to learn more or convert the search engines types available in MySQL read through our MyISAM vs. InnoDB tutorial.  Or if you are need of high functioning databases check out our MySQL product page to view different options.

Apache Performance Tuning: MPM Directives

How directives behave and which directives are mainly available hinges on the loaded MPM. As discussed in our previous series, MPM is short for MultiProcess Modules, and they determine the basis for how Apache addresses multiprocessing. Using our last article on Apache MPM Modules as a springboard, we will use this section to cover the following subsections:

Each part will focus on how the directives affect performance for their respective MPM and some common considerations that should be assessed when optimizing Apache with those specific MPMs.

Note:
Be sure to review this article in its entirety as the universal directives operate in the same manner regardless of the MPM chosen.

 

General Optimization

IfModule

An important directive to learn when working with Apache servers is the IfModule conditional statement. There are two parts to the IfModule statement. A beginning, which also accepts a module name or module source file name, as well as a closing statement. When the provided module is loaded into Apache, then all directives between the beginning IfModule statement and the closing IfModule statement are also read into the Apache running configuration. Please review the provided example below for further clarification:

 

<ifModule mpm_prefork_module>
MaxSpareServers 16
</ifModule>
Timeout 60
The above example defines the MaxSpareServers directive only when loaded by mpm_prefork_module. The Timeout directive is applied every time due to it being outside of the IfModule closing statement.

 

IfModule statements are used to maintain compatibility within Apache configuration between module changes. Maintaining compatibility is done by grouping directives into IfModule statements, so they are only used when the required module is loaded. Ensuring a syntactically correct configuration file even when swapping modules.

Rule of Thumb:
Appropriately wrapping everything in an IfModule statement is a best practice standard with Apache and should be adhered to for superior compatibility in config files.

Timeout

The numerical value of seconds Apache waits for all common I/O events. Apache will abandon requests fail to complete before the provided Timeout value.

Determining the right Timeout depends on both traffic habits and hosted applications. Ideally, Timeout should be as low as possible while still allowing the vast majority of regular traffic to operate without issue. Large timeouts, those above 1 minute, open the server to SlowLoris style DOS attacks and foster a long wait in the browser when it encounters a problem. Lower timeouts allow Apache to recover from errant stuck connections quickly. It becomes necessary to strike a balance between the two extremes.

Tip:
Avoid increasing the global Timeout when addressing issues with a single script, or user, that requires a long Timeout. Problems can usually be resolved by a .htaccess file or include file to increase the Timeout directive for that specific script.

KeepAlive

KeepAliveA simple on|off toggle enables the KeepAlive protocols with supported browsers. The KeepAlive feature can provide as much as a 50% reductions in latency, significantly boosting the performance of Apache. KeepAlive accomplishes this by reusing the same initial connections a browser creates when connecting to Apache for all follow-up requests which occur within a short period.

KeepAlive is a powerful feature and in general, should be enabled in most situations. It works great for reducing some of the CPU and Network overhead with modern element heavy websites. For example, an easy way to visualize KeepAlive is with the “hold the door” phrase. Imagine a queue of people entering a building through a single doorway. Each person is required to open the door, walk through it, then close the door before the next person does the same process. Mostly, that’s how Apache works without KeepAlive. When enabled, the door stays open until all the people in line are through the door before it closes again.

Two additional related directives also govern KeepAlive. MaxKeepAliveRequests and KeepAliveTimeout. Discussed in the next section, each one plays a vital role in fine-tuning of the KeepAlive directive.

 

MaxKeepAliveRequests

Sets a limit on the number of requests an individual KeepAlive connection is permitted to handle. Once reached, Apache forces the connection to terminate, and creates a new one for any additional requests.

Determining an ideal setting here is open to interpretation. Generally, you want this value to be at least as high as the largest count of elements (HTML, Text, CSS, Images, Etc..) served by the most heavily trafficked pages on the server.

Rule-of-Thumb:
MaxKeepAliveRequestsSet MaxKeepAliveRequests to double that of the largest count of elements on common pages. (Services like webpagetest.org or gtmetrix.com can count elements on a page).

KeepAliveTimeout

This directive is measured in seconds and will remain idle waiting for additional requests from its initiator. Since these types of connections are only accessible to their initiator, we want to keep KeepAliveTimeout very low. A low value prevents too many KeepAlive connections from locking out new visitors due to connection priority.

Tip:
KeepAliveTimoutA large MaxKeepAliveRequests directive with a very low KeepAliveTimeout allows active visitors to reuse connections while also quickly recovering threads from idle visitors.
Configuration: Set MaxKeepAliveRequests to 500+, Set KeepAliveTimeout to 2
Requirements: Works best on MPM Event.

MPM Event/Worker Optimization

This section details the use and performance considerations that are essential when running Worker based MPMs, including both MPM Event and MPM Worker. These MPMs are considered multi-threaded solutions and some directives behave differently based on the loaded MPM. The information provided in this section is only a portion about Worker based MPMs.

Note:
In Worker based MPMs: ServerLimit, ThreadsPerChild, and MaxRequestWorkers are intrinsically linked with each other. It is essential to understand the role of each one and how changing one affects the others. The following directives govern the fine-tuning of the thread handling capabilities of Apache web servers.
MPM Worker and MPM Event

The two modules, MPM Event, and MPM Worker for most intents and purposes operate identically. The difference is apparent in the way each handles KeepAlive requests. The MPM Worker locks threads for the duration of the KeepAlive process and directly affects the number of available threads able to handle new requests. The MPM Event uses a Listener thread for each child. These Listener threads handle standard requests, and KeepAlive requests alike meaning thread locking will not reduce the capacity of the server. Without thread locking, MPM Event is the superior choice but only in Apache 2.4. Before Apache 2.4 the MPM Event was unstable and prone to problems.

ServerLimit

ServerLimit represents the upper limit of children Apache is allowed. The practical usage for ServerLimit is creating a hard ceiling in Apache to protect against input errors with MaxRequestWorkers. The cap prevents spawning vastly more children than a system can handle, resulting in downtime, revenue loss, reputation loss or even data loss.ServerLimit

ServerLimit ties in directly with the thrashing point discussed earlier in this article. The thrashing point is the maximum number of children Apache can run before memory usage tips the scale into perpetual swap. Match the ServerLimit to the calculated thrashing point to safeguard the server.

 

ThreadsPerChild

Used to define the limit of threads that each Apache child can manage. Every thread running can handle a single request. The default of 25 works well for most cases and is a fair balance between children and threads.ThreadsPerChild

There is an upper limit on this directive as well, and the limit is controlled by the ThreadLimit directive, which defaults to 64 threads. The adjustments to increase ThreadsPerChild past 64 threads also need to be made to ThreadLimit.

Increasing this value allows each child to handle more requests keeping memory consumption down while allowing a larger MaxRequestWorkers directive. A key benefit of running more threads within each child is shared memory cache access. Threads from one child cannot access caches from another child. Boosting the number of threads per child squeezes out more performance due to this sharing of cache data. The major downside for increased threads per child occurs during child recycling. The capacity of the server is diminished by the number of threads configured for each child when that child process is eventually recycled (graceful restart).

MPM Event/Worker

Inversely the opposite reaction is achieved by lowering ThreadsPerChild. Fewer threads per child require more children to run an equal amount of MaxRequestWorkers. Since children are full copies of Apache, this increases Apache’s overall memory footprint but reduces the impact when recycling children. Fewer threads mean fewer potential “stuck” threads during the recycle procedure, keeping the higher capacity of requests available overall children. Having fewer threads per child provides increased shared memory isolation. For instance, dropping ThreadsPerChild to 1 gives the same request isolation of MPM Prefork but also inherits its massive performance tax as well, requiring one child per one request.

Tip:
When setting ThreadsPerChild always consider the server environment and hardware.

  • A memory-heavy shared server hosting numerous independent accounts might opt for a lower ThreadsPerChild, reducing the potential impact of one user affecting another.
  • A dedicated Apache server in a high capacity load balanced configuration can choose to increase ThreadsPerChild significantly for a better overall performance of each thread.

ThreadLimit

Used to set the maximum value of ThreadsPerChild. This directive is a hard ceiling for ThreadsPerChild. It helps protect against typographical errors with the ThreadLimit
ThreadsPerChild directive which could quickly spin a server out of control if too many threads are allowed due to an input error. This setting need to be adjusted in some high-end servers when the system needs more than the default of 64 threads per child.

MaxRequestWorkers / MaxClients

The directive sets the limit for active worker threads across all running children and acts as a soft ceiling with ServerLimit taking control as the hard limit. When the number of total running threads has reached or exceeded MaxRequestWorkers, Apache no longer spawns new children.MaxRequestWorkers/MaxClients

Determining the MaxRequestWorkers is a critical part of server optimization. An optimal setting is based on several changing variables. This means its configuration needs to be reevaluated and tailored periodically over time, changed by watching traffic habits and system resource usage. The Apache status Scoreboard is an effective tool for analysis of Apache performance.

 

It is typical of Worker based MPM systems to run an isolated third-party PHP handler like Mod_fcgid, PHP-FPM, and mod_lsapi. These modules are responsibleMPM Event/Worker2

for processing PHP code outside of Apache and frees up Apache to handle all other non-PHP requests such as HTML, TEXT, CSS, Images, etc… These requests are far less taxing on server resources which allows Apache to handle larger volumes of requests, such as those beyond 400 MaxRequestWorkers.

MinSpareThreads

The least number of Threads that should remain open, waiting for new requests. MinSpareThreads is a multiple of ThreadsPerChild and cannot exceed MaxSpareThreads, though it can match it.

Rule-of-Thumb:
Set MinSpareThreads to equal 50% of MaxRequestWorkers.

Spare threads are idle workers threads. These threads are merely waiting for new incoming requests and are governed by the Apache child process that spawned them. If there are less available threads than MinSpareThreads, The Apache parent will generate a new child with another ThreadsPerChild worth of threads.

MinSpareThreads

MaxSpareThreads

This directive governs the total number of idle threads allowed on the server across all children. Any threads above this limit direct their parent to shut down to reduce memory consumption during off-peak hours.MaxSpareThreads

Having a limit to the number of idle open threads is excellent for smaller servers with hardware constraints. However, it mostly unneeded on today’s modernizing hardware.

Tip:
Configuring Apache as an open throttle is a high-performance configuration for servers with significant RAM and multiple CPU cores. When running the open throttle configuration, all available threads become available at all time. Apache’s memory usage will stay near its peak at all times, a side effect due to running all the configured children into memory preemptively. This configuration will produce the best possible response times from Apache by maintaining persistent open connections ready to do work and removing the overhead of spawning new processes in response to traffic surges.

Configuration: Match both MinSpareThreads and MaxSpareThreads to MaxRequestWorkers.

Requirements: Make sure there is enough server RAM to run all MaxRequestWorkers at once.

StartServers

This directive governs the initial amount of children the Apache Parent process spawns when the Apache service is started or restarted. This is commonly left unchanged since Apache continuously checks the current running children in conjunction with ThreadsPerChild and compare it to MinSpareThreads to determine if more children get forked. This process is repeated perpetually, with a doubling of new children on each iteration, until MinSpareThreads is satisfied.

Rule-of-Thumb:
StartServerManually calculating StartServers is done by dividing MaxRequestWorkers by ThreadsPerChild, rounding down to the nearest whole number. This process forces all children to be created without delay at startup and begins handling requests immediately. This aspect is especially useful in modern Apache servers which require periodic restarts to load in directive changes.

MaxConnectionsPerChild / MaxRequestsPerChild

The number of requests a single Apache child process can handle equals a cumulative total on the child server across all threads it controls. Each request handled by a thread counts toward this limit to its parent. Once the child server has reached its limit, the child is then recycled.
This directive is a stop-gap for accidental memory leaks. Some code executed through Apache threads may contain memory leaks. Leaked memory are portions of memory that subprocess failed to release properly, so they are inaccessible to any outside processes. The longer a leaking program is left running, the more memory it will leak. Setting a MaxConnectionsPerChild limit is a specific method for assuring Apache is periodically recycling programs to reduce the impact of leaked memory on the system. When using external code handlers like Mod_fcgid, PHP-FPM or mod_lsapi, it becomes necessary to set MaxConnectionsPerChild to 0 (unlimited), doing so prevents periodic error pages caused by Apache terminating threads prematurely.

Rule-of-Thumb:
MaxConnectionsPerChild/MaxRequestsPerChild If the server encounters a memory leak never set the MaxConnectionsPerChild / MaxRequestsPerChild too low, instead start with 10,000 and reduce it incrementally.

 

MPM Prefork Optimization

This MPM Prefork section details the use and performance considerations for various directives when running this module. This MPM is a non-threaded multi-processor designed for compatibility. It consists of a single Apache parent process, which is used to govern all new Apache processes also known as children. The following directives show how Apache is capable of performance tuning when using MPM Prefork. Unlike Worker based MPMs, optimizing MPM Prefork is generally simple and straightforward. There is a 1:1 ratio of Apache processes to incoming requests. However, MPM Prefork does not scale well with hardware and the more traffic it encounters, the more hardware it will need to keep up with the pace. It should be noted that some directives behave differently based on which MPM is loaded. The information provided in this section is only the portion about MPM Prefork.

MaxRequestWorkers / MaxClients

Used to control the upper limit of children that the Apache parent server is allowed to have in memory at one time. These children (also called workers) handle requests on a 1:1 ratio. This translates into the maximum number of simultaneous requests the server can handle.

MaxRequestWorkers / MaxClientsIf this directive is too low, Apache under-utilizes the available hardware which translates to wasted money and long delays in page load times during peak hours. Alternatively, if this directive is too high, Apache outpaces the underlying hardware sending the system into thrashing (link to thrashing article) scenario which can lead to server crashes and potential data loss.

 

MinSpareServers

This directive defines a minimum number of spare children the Apache parent process can maintain in its memory. An additional server is a preforked idle Apache child that is ready to respond to a new incoming request. Having idle children waiting for new requests is essential for providing the fastest server response times. When the total idle children on the server drop below this value, a new child is preforked at the rate of one per second until this directive is satisfied. The “one per second” rule is in place to prevent surges of the creation process that overload the server, however, this failsafe comes at a cost. The one per second spawn rate is particularly slow when it comes to handling page requests. So it’s highly beneficial to make sure enough children are preforked and ready to handle incoming requests.

 

Rule of Thumb:
MinSpareServers Default settingNever set this to zero. Setting this to 25% of MaxRequestWorkers ensures plenty resources are ready and waiting for requests.

MaxSpareServers

MasSpareServers controls the maximum number of idle Apache child servers running at one time. An idle child is one which is not currently handling a request but waiting for a new request. When there are more than MaxSpareServers idle children, Apache kills off the excess.

If the MaxSpareServers value is less than MinSpareServers, Apache will automatically adjust MaxSpareServers to equal MinSpareServers plus one.

Like with MinSpareServers, this value should always be altered with available server resources in mind.

Rule of Thumb:
MaxSpareServersSet this to double the value of MinSpareServers.
Tip:
Configuring Apache as an open throttle is a high-performance configuration for servers with significant RAM and multiple CPU cores. When running the open throttle configuration, all available Apache children become available at all times. As a side effect of running open throttle, the Apache memory usage will stay near its peak at all times, due to running all the configured children into memory preemptively. This configuration will produce the best possible response times by maintaining persistent open connections. Furthermore, in response to traffic surges, it removes the overhead that comes from spawning new processes.
Configuration: Match both MinSpareServers and MaxSpareServers to MaxRequestWorkers.
Requirements: Make sure there is enough server RAM to run all MaxRequestWorkers at once.

StartServers

Created at startup, are the initial amount of Apache child servers.

This seldom changed directive only impacts Apache startup and restart processes. Generally not altered because Apache uses internal logic to work out how many child servers should be running.

Many modern servers periodically restart Apache to address configuration changes, rotate log files or other internal processes. When this occurs during a high load traffic surge, every bit of downtime matters. You can manually set the StartServers directive to mirror that of your MinSpareServers to shave off time from the Apache startup.

Rule of Thumb:
StartServersThe StartServers directive should mirror that of MinSpareServers.

 

ServerLimit

The ServerLimit directive represents the upper limit of MaxRequestWorkers. This setting is generally used as a safeguard or ceiling against input errors when modifying MaxRequestWorkers.ServerLimit Default Setting

It becomes necessary to adjusted ServerLimit when the server is expected to handle more than the default of 256 requests simultaneously.

ServerLimit ties in directly with the thrashing point. The thrashing point is the maximum number of children Apache can run before memory usage tips the scale into perpetual swap. Match the ServerLimit to the calculated thrashing point to safeguard the server.

Note:
Increasing ServerLimit is not recommended with MPM Prefork. Running more than 256 simultaneous requests is hardware intensive when using the MPM Prefork module.

ThreadsPerChild Default Settings

MaxConnectionsPerChild / MaxRequestsPerChild

This directive equals the number of requests a single Apache child server can handle.

This directive is a stop-gap for accidental memory leaks. Code executed through Apache may contain faults which leak memory. These leaks add up over time making less and less of the shared memory pool of the child usable. The way to recover from leaked memory is to recycle the affected Apache child process. Setting a MaxConnectionsPerChild limit will protect from this type of memory leakage.

Note:
MaxConnectionsPerChild/MaxRequestsPerChild Rule-of-Thumb: Never set this too low. If the server encounters memory leak issues start with 10,000 and reduce incrementally.

 

Apache Performance Tuning: Swap Memory

Before we get into the nitty-gritty of Apache tuning, we need to understand what happens when servers go unresponsive due to a poorly optimized configuration. An over-tuned server is one that is configured to allow more simultaneous requests (ServerLimit) than the server’s hardware can manage. Servers set in this manner have a tipping point, once reached, the server will become stuck in a perpetual swapping scenario. Meaning the Kernel is stuck rapidly reading and writing data to and from the system swap file. Swap files have read/write access speeds vastly slower than standard memory space. The swap files’ latency causes a bottleneck on the server as the Kernel attempts to read and write data faster than is physically possible or more commonly known as thrashing. If not caught immediately, thrashing spirals the system out of control leading to a system crash. If trashing is left running for too long, it has the potential of physically harming the hard drive itself by simulating decades of read/write activity over a short period. When optimizing Apache, we must be cautious not to create a thrashing scenario. We can accomplish this by calculating the thrashing point of the server based on several factors.

 

Pre-Flight Check

This article covers all Apache-based servers including but is not limited to, both traditional Dedicated servers and Cloud VPS servers running a variety of Linux distributions. We will include the primary locations where stored Apache configurations on the following Liquid Web system types:

Liquid Web Server Types

Calculating the estimated thrashing point or ServerLimit of a server uses a simple equation:

Caculating the Thrashing Point

  • buff/cache: The total memory used by the Kernel for buffers and cache.
  • Reserved: The amount of memory reserved for non-Apache processes.
  • Available: The difference between buff/cache and Reserved memory.
  • Avg.Apache: The average of all running Apache children during peak operational hours.

 

Important
Calculating the thrashing point/ServerLimit should be done during peak operational hours and periodically reassessed for optimal performance.

The thrashing point value is equal to the number Apache children the server can run; this applies to either threaded or non-thread children. When the number of children running in memory meets the calculated thrashing point, the server will begin to topple. The following example walks through a standard Liquid Web Fully Managed cPanel server to illustrate gathering the necessary details to calculate a system’s estimated thrashing point.

 

Buff/Cache Memory

On modern Linux systems, the buffer/cache can be derived using the /proc/meminfo file by adding the Buffers, Cached and Slab statistics. Using the free command, we can quickly grab this information, as in the example below:

free

Output:

Use the free command to get the buff/cache info

Don’t be fooled by the column labeled “available.” We are solely looking at the memory we can reappropriate, which is the buff/cache column (708436).

 

Reserved Memory

Reserved memory is a portion of memory held for other services aside from Apache. Some of the biggest contenders for additional memory outside of Apache are MySQL, Tomcat, Memcache, Varnish, and Nginx. It is necessary to examine these services configs to determine a valid reserved memory. These configs are outside the purview of this article. However, MySQL is the most commonly encountered service running with Apache. You can find tools online to help analyze and configure MySQL separate from this article.

 

Rule-of-Thumb:
Save 25% of the total buff/cache memory for any extra services ran on the server. Examples:

  • A standard cPanel server runs several services along with Apache and MySQL. A server with these services runs on the heavier side, needing 25% reserved for non-Apache services.
  • A pure Apache web node in a high capacity load balanced configuration does not need to reserve any additional ram for other services.

 

Average Apache Memory

Finding the average size of Apache processes is relatively simple using the ps command to list the RSS (Resident Set Size) of all running httpd processes. Note: some distributions use “apache” instead of “httpd” for process name.

This example uses a short awk script to print out the average instead of listing the sizes.

ps o rss= -C httpd|awk '{n+=$1}END{print n/NR}'

Output:

22200.6

 

This example is easy to average by hand, but larger servers will require more calculation.

ps o rss -C httpd

Output:

RSS
5092
27940
28196
27572

 

Calculate the Thrashing Point

Once collected divided the Available memory by Avg. Apache, rounding down to the nearest whole number. Available memory is the buff/cache memory minus the Reserved memory. Below is a summary of the calculation process in table form.

Thrashing point calculations

Conservative estimates are provided below for various memory configurations. These estimates can be used as a starting configuration but will require additional follow up performance assessments during peak hours to adjust directives by the servers.

General Thrashing Estimates

Install Memcached on Ubuntu 16.04

Memcached works to enhance performance by keeping a copy of commonly used script elements within the server’s memory in a form that is more easily read by the server thus reducing time. A bonus feature of this object cacher is its ability to decrease the number of connections to your database. In this tutorial, we instruct how to install Memcached, but it’s important to note that when using Memcache in an application, the application must be specially coded or configured to store and retrieve data this cached data.

Learn more about caching from our dedicated article or visit our series for database optimization.

Pre-flight

  • We are logged in as root on an Ubuntu 16.04 VPS powered by Liquid Web!
  • Installed and running Apache and PHP 7.

Installation of Memcached

Step 1:
Following best practices, we will do a quick package update by using the following command:
apt-get update
Step 2:
Install the Memcached daemon using
apt-get install memcached -y
Step 3:
Install the Memcache module for PHP fuctionality:
apt-get install php-memcached -y

Verify installation of Memcached

Use the php -m flag to show compiled modules while sorting through specifically looking for memcached.

php -m | grep memcached
memcached

Optional Configurations

At some point, you may find that you need to change the default settings of Memcached. These include adjusting the port number, memory for your cache, and the listening IP address.
vim /etc/memcached.conf

Adjust these configurations by keeping the same flags (-m, -p, -u, -l), adjusting the letter or number after the flag and save the file by typing :wq .
# Start with a cap of 64 megs of memory. It's reasonable, and the daemon default
# Note that the daemon will grow to this size, but does not start out holding this much
# memory
-m 64
 
# Default connection port is 11211
-p 11211
 
# Run the daemon as root. The start-memcached will default to running as root if no
# -u command is present in this config file
-u memcache
 
# Specify which IP address to listen on. The default is to listen on all IP addresses
# This parameter is one of the only security measures that memcached has, so make sure
# it's listening on a firewalled interface.
-l 127.0.0.1

 

Restart your Memcached service to recognize the changes to this file:
systemctl restart memcached

MySQL Performance: MyISAM vs InnoDB

 

A major factor in database performance is the storage engine used by the database, and more specifically, its tables. Different storage engines provide better performance in one situation over another. For general use, there are two contenders to be considered. These are MyISAM, which is the default MySQL storage engine, or InnoDB, which is an alternative engine built-in to MySQL intended for high-performance databases. Before we can understand the difference between the two storage engines, we need to understand the term “locking.”

To protect the integrity of the data stored within databases, MySQL employs locking. Locking, simply put, means protecting data from being accessed. When a lock is applied, the data cannot be modified except by the query that initiated the lock. Locking is a necessary component to ensure the accuracy of the stored information.  Each storage engine has a different method of locking used. Depending on your data and query practices, one engine can outperform another. In this series, we will look at the two most common types of locking employed by our two storage engines.

 

Table locking:  The technique of locking an entire table when one or more cells within the table need to be updated or deleted. Table locking is the default method employed by the default storage engine, MyISAM.

Example: MyISAM Table LockingColumn AColumn BColumn C
Query 1 UPDATERow 1Writingdatadata
Query 2 SELECT (Wait)Row 2datadatadata
Query 3 UPDATE (Wait)Row 3datadatadata
Query 4 SELECT (Wait)Row 4datadatadata
Query 5 SELECT (Wait)Row 5datadatadata
The example illustrates how a single write operation locks the entire table causing other queries to wait for the UPDATE query finish.

 

Row-level locking: The act of locking an effective range of rows in a table while one or more cells within the range are modified or deleted. Row-level locking is the method used by the InnoDB storage engine and is intended for high-performance databases.

Example: InnoDB Row-Level LockingColumn AColumn AColumn A
Query 1 UPDATERow 1Writingdatadata
Query 2 SELECTRow 2Readingdatadata
Query 3 UPDATERow 3dataWritingdata
Query 4 SELECTRow 4ReadingReadingReading
Query 5 SELECTRow 5ReadingdataReading
The example shows how using row-level locking allows for multiple queries to run on individual rows by locking only the rows being updated instead of the entire table.

 

By comparing the two storage engines, we get to the crux of the argument between using InnoDB over MyISAM. An application or website that has a frequently used table works exceptionally well using the InnoDB storage engine by resolving table-locking bottlenecks. However, the question of using one over the other is a subjective as neither of them is perfect in all situations. There are strengths and limitations to both storage engines. Intimate knowledge of the database structure and query practices is critical for selecting the best storage engine for your tables.

MyISAM will out-perform InnoDB on large tables that require vastly more read activity versus write activity. MyISAM’s readabilities outshine InnoDB because locking the entire table is quicker than figuring out which rows are locked in the table. The more information in the table, the more time it takes InnoDB to figure out which ones are not accessible. If your application relies on huge tables that do not change data frequently, then MyISAM will out-perform InnoDB.  Conversely, InnoDB outperforms MyISAM when data within the table changes frequently. Table changes write data more than reading data per second. In these situations, InnoDB can keep up with large amounts of requests easier than locking the entire table for each one.

 

Should I use InnoDB with WordPress, Magento or Joomla Sites?

The short answer here is yes, in most cases. Liquid Web’s Most Helpful Humans in Hosting Support Teams have encountered several table-locking bottlenecks when clients are using some of the standard web applications of today. Most users of popular third-party applications like WordPress, Magento, and Joomla have limited knowledge of the underlying database components or code involved to make an informed decision on storage engines. Most table-locking bottlenecks from these content management systems (CMS) are generally resolved by changing all the tables for the site over to  InnoDB instead of the default MyISAM.  If you are hosting many of these types of CMS on your server, it would be beneficial to change the default storage engine in MySQL to use InnoDB for all new tables so that any new table installations start off with InnoDB.

 

Set your default storage engine to InnoDB by adding default_storage_engine=InnoDB to the [mysqld] section of the system config file located at:  /etc/my.cnf . Restarting the MySQL service is necessary for the server to detect changes to the file.

~ $ cat /etc/my.cnf
[mysqld]
log-error=/var/lib/mysql/mysql.err
innodb_file_per_table=1
default-storage-engine=innodb
innodb_buffer_pool_size=128M

 

Unfortunately, MySQL does not inherently have an option to convert tables, leaving each table to be changed individually. Liquid Web’s support team has put together an easy to follow maintenance plan for this process. The script, which you can run on the necessary server via shell access (SSH) will convert all tables between storage engines.

Note
Plan accordingly when performing batch operations of this nature just in case downtime occurs. Best practice is to backup all your MySQL Databases before implementing a change of this magnitude, doing so provides an easy recovery point to prevent any data loss.

Step 1: Prep

Plan to start at a time of day where downtime would have minimal consequences. This process itself does not require any downtime, however, downtime may be necessary to recover from unforeseen circumstances.  

 

Step 2: Backup All Databases To A File

The command below creates a single file backup of all databases named all-databases-backup.sqld and can be deleted once the conversion has succeeded and there are no apparent problems.
mysqldump --all-databases > all-databases-backup.sql

 

Step 3: Record Existing Table Engines To A File

Run the following script to record the existing table engines to a file named table-engine-backup.sql. You can then “import” or “run” this file later to convert back to their original engines if necessary.

mysql -Bse 'SELECT CONCAT("ALTER TABLE ",table_schema,".",table_name," ENGINE=",Engine,";") FROM information_schema.tables WHERE table_schema NOT IN("mysql","information_schema","performance_schema");' | tee table-engine-backup.sql

If you need to revert the table engines back for any reason, run:
mysql < table-engine-backup.sql

 

Step 4a: Convert MyISAM Tables To InnoDB

The below command will proceed even if a table fails and lets you know which tables failed to convert. The output is saved to the file named convert-to-innodb.log for later review.
mysql -Bse 'SELECT CONCAT("ALTER TABLE ",table_schema,".",table_name," ENGINE=InnoDB;") FROM information_schema.tables WHERE table_schema NOT IN ("mysql","information_schema","performance_schema") AND Engine = "MyISAM";' | while read -r i; do echo $i; mysql -e "$i"; done | tee convert-to-innodb.log

 

Step 4b: Convert All InnoDB Tables To MyISAM

This command will proceed even if a table fails and lets you know which tables failed to convert. The output is also saved to the file named convert-to-myisam.log for later review.

mysql -Bse 'SELECT CONCAT("ALTER TABLE ",table_schema,".",table_name," ENGINE=MyISAM;") FROM information_schema.tables WHERE table_schema NOT IN ("mysql","information_schema","performance_schema") AND Engine = "InnoDB";' | while read -r i; do echo $i; mysql -e "$i"; done | tee convert-to-myisam.log

 

The following commands illustrate how converting a single table is accomplished.

Note
Replace database_name with the proper database name and table_name with the correct table name. Make sure you have a valid backup of the table in question before proceeding. 

Backup A Single Table To A File
mysqldump database_name table_name > backup-table_name.sql

 

Convert A Single Table To InnoDB

mysql -Bse ‘ALTER TABLE database_name.table_name ENGINE=InnoDB;’

 

Convert A Single Table To MyISAM:

mysql -Bse ‘ALTER TABLE database_name.table_name ENGINE=MyISAM;’

 

Check out our other articles in this series, MySQL Performance: Identifying Long Queries, to pinpoint slow queries within your database.  Stay tuned for our next article where we will cover caching and optimization.

MySQL Performance: Identifying Long Queries

Every MySQL backed application can benefit from a finely tuned database server. The Liquid Web Heroic Support team has encountered numerous situations over the years where some minor adjustments have made a world of difference in website and application performance. In this series of articles, we have outlined some of the more common recommendations that have had the largest impact on performance.

Preflight Check

This article applies to most Linux based MySQL servers. This includes, but is not limited to, both Traditional Dedicated and Cloud VPS servers running a variety of common Linux distributions. The article can be used with the following Liquid Web system types:

  • Core-managed CentOS 6x/7x
  • Core-managed Ubuntu 14.04/16.04
  • Fully-managed CentOS 6/7 cPanel
  • Fully-managed CentOS 7 Plesk Onyx 17
  • Self-managed Linux servers
Note
Self-managed systems, which have opted out of direct support can take advantage of the techniques discussed here, however, the Liquid Web Heroic Support Team cannot offer direct aid on these server types.

This series of articles assumes familiarity with the following basic system administration concepts:

 

What is MySQL Optimization?

There is no clearly defined definition for the term MySQL Optimization. It can mean something different depending on the person,  administrator, group or company. For the sake of this series of articles on MySQL Optimization, we will define MySQL Optimization as:  The configuration of a MySQL or MariaDB server which has been configured to avoid commonly encountered bottlenecks discussed in this series of articles.

What is a bottleneck?

Very similar to the neck on a soda bottle, a bottleneck as a technical term is a point in an application or server configuration where a small amount of traffic or data can pass through without issue. However, a larger volume of the same type of traffic or data is hindered or blocked and cannot operate successfully as-is. See the following example of a configuration bottleneck:

Visual Difference between Optimized and Non-Optimized DatabaseIn this example, the server is capable of handling 10 connections simultaneously. However, the configuration only accepts 5 connections. This issue would not manifest so long as there were 5 or less connections at one time. However, when traffic ramps up to 10 connections, half of them start to fail due to unused resources in the server configuration. The above examples illustrates the bottleneck shape where it derives its name versus an optimized configuration which corrects the bottleneck.

When Should I Optimize My MySQL database?

Ideally, database performance tuning should occur regularly and before productivity is affected. It is best practice behavior to conduct weekly or monthly audits of database performance to prevent issues from adversely affecting  applications. The most obvious symptoms of performance problems are:

  • Queries stack up and never completing in the MySQL process table.
  • Applications or websites using the database become sluggish.
  • Connection timeouts errors, especially during peak hours.

While it is normal for there to be several concurrent queries running at one time on a busy system, it becomes a problem when these queries are taking too long to finish on a regular basis. Although the specific threshold varies per system and per application, average query times exceeding several seconds will manifest as a slowdown within attached websites and applications. These slowdowns can sometimes start out small and go unnoticed until a large traffic surge hits a particular bottleneck.

Identifying Performance Issues

Knowing how to examine the MySQL process table is vital for diagnosing the specific bottleneck being encountered. There is a number of ways to view the process table depending on your particular server and preference. For the sake of brevity this series will focus on the most common methods used via Secure Shell (SSH) access:

 

Using The MySQL Process Table: Method 1

Use the ‘mysqladmin’ command line tool with the flag ‘processlist’ or ‘proc’ for short. (Adding the flag ‘statistics’ or ‘stat’ for short will show running statistics for queries since MySQL’s last restart.)

Command:

mysqladmin proc stat

Output:

 +-------+------+-----------+-----------+---------+------+-------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+-------+------+-----------+-----------+---------+------+-------+--------------------+----------+
| 77255 | root | localhost | employees | Query | 150 | | call While_Loop2() | 0.000 |
| 77285 | root | localhost | | Query | 0 | init | show processlist | 0.000 |
+-------+------+-----------+-----------+---------+------+-------+--------------------+----------+
Uptime: 861755 Threads: 2 Questions: 20961045 Slow queries: 0 Opens: 2976 Flush tables: 1 Open tables: 1011 Queries per second avg: 24.323

Pro: Used on the shell interface, this makes piping output to other scripts and tools very easy.
Con: The process table’s info column is always truncated so does not provide the full query on longer queries.

Using The MySQL Process Table: Method 2

Run the ‘show processlist;’ query from within MySQL interactive mode prompt. (Adding the ‘full’  modifier to the command disables truncation of the Info column. This is necessary when viewing long queries.)

 

Command:

show processlist;

Output:
MariaDB [(none)]> show full processlist;
+-------+------+-----------+-----------+---------+------+-------+-----------------------+----------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+-------+------+-----------+-----------+---------+------+-------+-----------------------+----------+
| 77006 | root | localhost | employees | Query | 151 | NULL | call While_Loop2() | 0.000 |
| 77021 | root | localhost | NULL | Query | 0 | init | show full processlist | 0.000 |
+-------+------+-----------+-----------+---------+------+-------+-----------------------+----------+

Pro: Using the full modifier allows for seeing the full query on longer queries.
Con: MySQL Interactive mode cannot access scripts and tools available in the shell interface.

Using The slow query log

Another valuable tool in  MySQL is the included slow query logging feature. This feature is the preferred method for finding long running queries on a regular basis. There are several directives available to adjust this feature. However, the most commonly needed settings are:

 

slow_query_logenable/disable the slow query log
slow_query_log_filename and path of the slow query log file
long_query_timetime in seconds/microseconds defining a slow query

These directives are set within the [mysqld] section of the MySQL configuration file located at /etc/my.cnf and will require a MySQL service restart before they will take affect. See the example below for formatting:

Caution
There is a large disk space concern with the slow query log file, which needs to be attended to continually until the slow query log feature is disabled. Keep in mind, the lower your long_query_time directive the faster the slow query log fills up a disk partition.
[mysqld]
log-error=/var/lib/mysql/mysql.err
innodb_file_per_table=1
default-storage-engine=innodb
innodb_buffer_pool_size=128M
innodb_log_file_size=128M
max_connections=300
key_buffer_size = 8M
slow_query_log=1
slow_query_log_file=/var/lib/mysql/slowquery.log
long_query_time=5

Once the slow query log is enabled you will need to periodically follow-up with it to review unruly queries that need to be adjusted for better performance. To analyze the slow query log file, you can parse it directly to review its contents. The following example shows the statistics for the sample query which ran longer that the configured 5 seconds:

Caution
There is a performance hit taken by enabling the slow query log feature. This is due to the additional routines needed to analyze each query as well as the I/O needed to write the necessary queries to the log file. Because of this, it is considered best practice on production systems to disable the slow query log. The slow query log should only remain enabled for a specific duration when actively looking for troublesome queries that may be impacting the application or website.
# Time: 180717 0:23:28
# User@Host: root[root] @ localhost [] # Thread_id: 32 Schema: employees QC_hit: No
# Query_time: 627.163085 Lock_time: 0.000021 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 0
use employees;
SET timestamp=1531801408;
call While_Loop2();

Optionally, you can use the mysqldumpslow command line tool, which parses the slow query log file and groups like queries together except values of number and string data:
~ $ mysqldumpslow -a /var/lib/mysql/slowquery.log
Reading mysql slow query log from /var/lib/mysql/slowquery.log
Count: 2 Time=316.67s (633s) Lock=0.00s (0s) Rows_sent=0.5 (1), Rows_examined=0.0 (0), Rows_affected=0.0 (0), root[root]@localhost
call While_Loop2()
(For usage information visit MySQL documentation here: mysqldumpslow – Summarize Slow Query Log Files)

So concludes the first part of our Database Optimization series and gives us a solid basis to refer back to for benchmark purposes. Though database issues can be complicated, our series will break down these concepts to provide means to optimize your database through database conversion, table conversion, and indexing.

 

Optimizing Your Website in Cloud Sites

optimizing-your-site-cloud-sties-header

To ensure that your site performs at its best, there are a few things you can do to optimize it when using Liquid Web Cloud Sites technology. This article will take you through the top five best practices to optimize your website.

Continue reading “Optimizing Your Website in Cloud Sites”

How to Install the Memcached PHP Extension on Ubuntu 15.04

Memcached is a distributed, high-performance, in-memory caching system that is primarily used to speed up sites that make heavy use of databases. It can, however, be used to store objects of any kind. Nearly every popular CMS has a plugin or module to take advantage of Memcached, and many programming languages have a Memcached library, including PHP, Perl, Ruby, and Python. Memcached runs in memory and is thus quite speedy since it does not need to write data to disk.

Pre-Flight Check

  • These instructions are intended specifically for installing the Memcached PHP Extension on a single Ubuntu 15.04 node.
  • I’ll be working from a Liquid Web Core Managed Ubuntu 15.04 server, and I’ll be logged in as root.
  • Follow our tutorial on How to Install Memcached on Ubuntu 15.04 prior to this KB!

Continue reading “How to Install the Memcached PHP Extension on Ubuntu 15.04”

How to Install Memcached on Ubuntu 15.04

Memcached is a distributed, high-performance, in-memory caching system that is primarily used to speed up sites that make heavy use of databases. It can however be used to store objects of any kind. Nearly every popular CMS has a plugin or module to take advantage of memcached, and many programming languages have a memcached library, including PHP, Perl, Ruby, and Python. Memcached runs in memory and is thus quite speedy, since it does not need to write data to disk.

Pre-Flight Check

  • These instructions are intended specifically for installing Memcached on a single Ubuntu 15.04 node.
  • I’ll be working from a Liquid Web Core Managed Ubuntu 15.04 server, and I’ll be logged in as root.

Continue reading “How to Install Memcached on Ubuntu 15.04”

How to Resize a Server

Upgrading your Storm® server is a simple process, and can be done in just a few satisfying clicks. Upgrades are a necessity. You work hard on your blog, or ecommerce store, and the traffic grows! Once you’ve optimized your WordPress site or Magento store, reward yourself with an easy upgrade process and increase the available resources to your Storm® by following the steps below. This process can be followed for Storm® VPS or Storm® Dedicated servers.

Pre-Flight Check
  • These instructions are intended specifically for resizing your Storm® server.

Continue reading “How to Resize a Server”