| Subcribe via RSS

PCI DSS & MySQL – Requirement 6

April 7th, 2010 | 1 Comment | Posted in MySQL, MySQL Administration, PCI DSS, Security

Requirement 6 of PCI DSS v1.2 states that in order to be compliant, an organization must:

“Develop and maintain secure systems and applications”

“Unscrupulous individuals use security vulnerabilities to gain privileged access to systems. Many of these vulnerabilities are fixed by vendor- provided security patches, which must be installed by the entities that manage the systems. All critical systems must have the most recently released, appropriate software patches to protect against exploitation and compromise of cardholder data by malicious individuals and malicious software.

“Note: Appropriate software patches are those patches that have been evaluated and tested sufficiently to determine that the patches do not conflict with existing security configurations. For in-house developed applications, numerous vulnerabilities can be avoided by using standard system development processes and secure coding techniques.”

Most of Requirement 6 is outside the scope of MySQL directly and covers things like:

  • Validation of all input
  • Validation of proper error handling
  • Validation of secure cryptographic storage (although you’ll want to validate MySQL’s at-rest encryption as part of your validation processes)
  • Validation of secure communications
  • Separate development/test and production environments
  • Separation of duties between development/test and production environments
  • Production data (live PANs) are not used for testing or development
  • Web Application Security (XSS, CSRF, etc)
  • Formalizing the process for security updates

The parts that specifically apply to MySQL are as follows:

6.2 Establish a process to identify newly discovered security vulnerabilities (for example, subscribe to alert services freely available on the Internet). Update configuration standards as required by PCI DSS Requirement 2.2 to address new vulnerability issues.

Personally, I like a formalized subscription to BugTraq and MySQL’s Announce List, where each relevant bug and announcement are formally evaluated against your target environment. In the event that there is something applicable to your environment, you can begin the (formally documented) process of introducing the mitigation, upgrade, or other required change(s).

6.4 – Follow change control procedures for all changes to system components

The procedures must include the following:

  • Documentation of impact
  • Management sign-off by appropriate parties
  • Testing of operational functionality
  • Back-out procedures

Unfortunately, Change Control for databases is not nearly as widely practiced as for code. We, as a community of database administrators, need to make a greater effort to make this a de facto standard for organizations. In the interest of the greater good, I wanted to call out this requirement directly and provide some relevant links:

Tags: , , ,

PCI DSS & MySQL – Requirement 4

April 7th, 2010 | 1 Comment | Posted in MySQL, MySQL Administration, PCI DSS, Security

Requirement 4 of PCI DSS v1.2 states that we must:

“Encrypt transmission of cardholder data across open, public networks”

Specifically, “Sensitive information must be encrypted during transmission over networks that are easily accessed by malicious individuals. Misconfigured wireless networks and vulnerabilities in legacy encryption and authentication protocols can be continued targets of malicious individuals who exploit these vulnerabilities to gain privileged access to cardholder data environments.”

On the face of it, this seems generally irrelevant to MySQL as nobody in their right mind has their databases on open, public networks. If all of your databases are all located locally to one another (same switch in private network), you’re likely compliant as far as MySQL is concerned (same caveat as earlier posts apply: the other aspects of your stack must be evaluated as this post is solely concerned with MySQL). If you have replication across datacenters, you must ensure that at least one of the following holds true:

  • The traffic between datacenters is encrypted at the network layer (secure VPN, for example)
  • Applicable data is encrypted before being inserted into the database (either by encrypting in the application layer or using RBR as described in my previous post on PCI DSS & MySQL).
  • You use MySQL Replication Over SSL.

Even if we meet the specific goals of Requirement 4, we still will not have satisfied generally-accepted best practices of always encrypting data in transit! A case in point of where this requirement falls short is the 2006-2009 Heartland Data Breach, where (if I recall correctly), malware located on their internal network was able to sniff unencrypted traffic and compromise account information.

The bottom line is that if you’re taking the time to become PCI compliant, take the time to go the extra mile. It could ease any transitional burdens when PCI DSS is updated with stronger requirements (for example, it is rumored that new encryption guidance for end-to-end encryption will be part of the next standard).

Tags: , , ,

PCI DSS & MySQL: Requirement 3

April 7th, 2010 | 3 Comments | Posted in MySQL, PCI DSS, Security

Requirement 3 of the PCI DSS v1.2 is:

“Protect Stored Cardholder Data”

As vague as that sounds, the PCI DSS enumerates exactly what that covers:

Data Element Storage Permitted Protection Required PCI DSS Req 3.4
Primary Account Number Yes Yes Yes
Cardholder Name* Yes Yes No
Service Code* Yes Yes No
Expiration Date* Yes Yes No
Full Magnetic Stripe Data No N/A N/A
CAV2/CVC2/CVV2/CID No N/A N/A
PIN/PIN Block No N/A N/A

* These data elements must be protected if stored in conjunction with the PAN. This protection should be per PCI DSS requirements for general protection of the cardholder data environment. Additionally, other legislation (for example, related to consumer personal data protection, privacy, identity theft, or data security) may require specific protection of this data, or proper disclosure of a company’s practices if consumer-related personal data is being collected during the course of business. PCI DSS, however, does not apply if PANs are not stored, processed, or transmitted.

For those data points that can be stored, MySQL offers a variety of encryption functions to secure the data. For the purposes of this example, we’ll use AES_ENCRYPT() and AES_DECRYPT(). Let’s begin by creating a table to house the data. The formula for determining the space requirements is 16*(CEILING(string_length/16)+1) so we’ll create the table as follows:

mysql> CREATE TABLE `cc_info` (
    ->    `id` int unsigned NOT NULL auto_increment,
    ->    `cc_num` binary(32) NOT NULL, -- Sorry AMEX
    ->    `service_code` varbinary(32) NOT NULL,
    ->    `name_on_card` varbinary(48) NOT NULL, -- Assuming a 32-char max,
    ->    PRIMARY KEY (`id`))
    ->    ENGINE=InnoDB;
Query OK, 0 rows affected (2.79 sec)

Populating it with some data, where I’ve chosen ‘secret_key’ as the key_str used to encrypt:

mysql> INSERT INTO `cc_info` (`cc_num`, `service_code`, `name_on_card`) VALUES (
    ->   AES_ENCRYPT('1234123412341234', 'secret_key'),
    ->   AES_ENCRYPT('1234', 'secret_key'),
    ->   AES_ENCRYPT('John Doe', 'secret_key'));
Query OK, 1 row affected (0.35 sec)

If we try and SELECT the data, we get

mysql> SELECT id, cc_num, service_code, name_on_card FROM cc_info\G
*************************** 1. row ***************************
          id: 1
      cc_num: ??
                ? q$?!~c?3Pg?"xu&3?:?,am?
service_code: y.??A??
?? ?a??
name_on_card: ?93s?!? X?8?|nZ
1 row in set (0.00 sec)

We need to use AES_DECRYPT as follows:

mysql> SELECT id,
    ->        AES_DECRYPT(`cc_num`, 'secret_key') AS `cc_num`,
    ->        AES_DECRYPT(`service_code`, 'secret_key') AS `service_code`,
    ->        AES_DECRYPT(`name_on_card`, 'secret_key') AS `name_on_card`
    ->   FROM `cc_info`\G
*************************** 1. row ***************************
          id: 1
      cc_num: 1234123412341234
service_code: 1234
name_on_card: John Doe
1 row in set (0.00 sec)

The PCI DSS v1.2 standard says that we must use “strong cryptography” to protect the data, where they define “strong cryptography” as “Cryptography based on industry-tested and accepted algorithms, along with strong key lengths and proper key-management practices. Cryptography is a method to protect data and includes both encryption (which is reversible) and hashing (which is not reversible, or “one way”). SHA-1 is an example of an industry-tested and accepted hashing algorithm. Examples of industry-tested and accepted standards and algorithms for encryption include AES (128 bits and higher), TDES (minimum double-length keys), RSA (1024 bits and higher), ECC (160 bits and higher), and ElGamal (1024 bits and higher).”

MySQL uses 128-bit key length, satisfying the requirement. You can, however, take things a step further by editing the file include/my_aes.h and updating the following line:

#define AES_KEY_LENGTH 128 /* must be 128 192 or 256 */

and compiling from source. I DO NOT recommend taking this approach unless you have a very compelling reason, as it is easy to mess up the build process, more difficult to manage upgrades, and generally more error-prone than simply using standard package managers for your chosen distribution.

So we’re done, right? Nope:) We must secure the information not only in the database, but in logs, backups, and any other place it resides on disk. If we’re using MySQL replication, which is quite typical, we must also consider the binary logs. Looking at the transactions above, we see:

%> mysqlbinlog log-bin.000001
...
BEGIN
/*!*/;
# at 172
#100406 16:35:31 server id 1  end_log_pos 200 	Intvar
SET INSERT_ID=1/*!*/;
# at 200
#100406 16:35:31 server id 1  end_log_pos 461 	Query	thread_id=1	exec_time=0	error_code=0
use cc/*!*/;
SET TIMESTAMP=1270596931/*!*/;
INSERT INTO `cc_info` (`cc_num`, `service_code`, `name_on_card`) VALUES (
   AES_ENCRYPT('1234123412341234', 'secret_key'),
   AES_ENCRYPT('1234', 'secret_key'),
   AES_ENCRYPT('John Doe', 'secret_key'))
/*!*/;
# at 461
#100406 16:35:31 server id 1  end_log_pos 488 	Xid = 6
COMMIT/*!*/;
...

Unfortunately we can see all of the information that we worked so hard to secure! As of MySQL 5.1, you can use row-based replication, which logs changes in individual table rows rather than simply replicating full SQL statements. The same statements as above, as executed when binlog_format=ROW, appear in the binary log as:

%> mysqlbinlog -v log-bin.000001
...
#100406 16:39:30 server id 1  end_log_pos 936 	Query	thread_id=3	exec_time=0	error_code=0
SET TIMESTAMP=1270597170/*!*/;
BEGIN
/*!*/;
# at 936
# at 989
#100406 16:39:30 server id 1  end_log_pos 989 	Table_map: `cc`.`cc_info` mapped to number 15
#100406 16:39:30 server id 1  end_log_pos 1090 	Write_rows: table id 15 flags: STMT_END_F

BINLOG '
Msa7SxMBAAAANQAAAN0DAAAAAA8AAAAAAAAAAmNjAAdjY19pbmZvAAQD/g8PBv4gIAAwAAA=
Msa7SxcBAAAAZQAAAEIEAAAQAA8AAAAAAAEABP/wBAAAACCY3AusAHEkreUIIX5jyzNQZ90ieHUm
M8Y6BgflLGFtjxB5Lo7nQeK6zQr+wQCVYabiELI5M3P+IYoAWOo4iHxuWhI=
'/*!*/;
### INSERT INTO cc.cc_info
### SET
###   @1=1
###   @2='??\x0b?\x00q$??\x08!~c?3Pg?"xu&3?:\x06\x07?,am?'
###   @3='y.??A??\x0a??\x00?a??'
###   @4='?93s?!?\x00X?8?|nZ\x12'
# at 1090
#100406 16:39:30 server id 1  end_log_pos 1117 	Xid = 23
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
...

Here, we can only see the encrypted binary data; thus meeting PCI standards. MySQL offers a third binlog_format, and that is MIXED. This is essentially where non-deterministic statements are executed with row-based logging, while deterministic statements use statement-based logging (this is an oversimplification, full details on MIXED logging format can be found here). Since these encryption functions are deterministic, the full SQL statements will be shown in the log and won’t work in a PCI environment.

We can’t obfuscate MySQL’s other logs, though. It is generally considered best practice to ensure that the slow query log, error log, and any other log in which full SQL queries may appear to be written on an encrypted volume if you’re using the “pure MySQL” approach to encryption. If encryption is done at the application layer, the logs are obfuscated for you.

These methodologies described above represent but one of the “Pure MySQL” ways to meet requirement 3. The other most common approaches are to call encryption functions within the application or to use a full-disk encryption solution. Be very careful with this approach, though; in my experience with these, they are mostly expensive, buggy, a performance killer, or some combination thereof (YMMV)! If this approach is chosen, I encourage one of your tests to be to physically yank the power cord during a load test; see if you can access the data after.

Much of the body of requirement 3 is dedicated to key management, documentation, distribution, storage, and generation. Here we’ll go through the remaining requirements that are relevant to our MySQL-only focus.

3.1 – Keep cardholder data storage to a minimum. Develop a data retention and disposal policy. Limit storage amount and retention time to that which is required for business, legal, and/or regulatory purposes, as documented in the data retention policy.

This is just generally good advice, not storing what you won’t or can’t use. It pollutes the database, requires more on-disk (and potentially in-memory) space while providing no business value. The important thing here is to develop a policy. A quick Google search for PCI Policy Templates can show some samples that can provide a jumping-off point while developing policies for your own organization.

3.2 – Do not store sensitive authentication data after authorization (even if encrypted)

The subsections of 3.2 basically boil down to a “don’t list”: DO NOT

  • store sensitive authentication data after authorization (even if encrypted) [3.2]
  • store the full contents of any track from the magnetic stripe (located on the back of a card, contained in a chip, or elsewhere). This data is alternatively called full track, track, track 1, track 2, and magnetic-stripe data [3.2.1]
  • store the card- verification code or value (three- digit or four-digit number printed on the front or back of a payment card) used to verify card-not- present transactions [3.2.2]
  • store the personal identification number (PIN) or the encrypted PIN block [3.2.3]

3.3 – Mask PAN when displayed (the first six and last four digits are the maximum number of digits to be displayed).

As a user, I actually consider this maximum to be incredibly rude! Imagine if you logged in to Amazon and they displayed your “masked” card as: 123412******1234! The standard method of display is **** 1234 (where the number of stars are variable). Since we generally don’t want unencrypted data floating around the network, we can access the data as follows:

mysql> SELECT RIGHT(AES_DECRYPT(`cc_num`, 'secret_key'),4) AS `last_four`
    ->   FROM `cc_info`
    ->   WHERE `id`=1;
+-----------+
| last_four |
+-----------+
| 1234      |
+-----------+
1 row in set (0.00 sec)

And then format it with leading *’s on the front-end as required.

3.4 – Render PAN, at minimum, unreadable anywhere it is stored (including on portable digital media, backup media, in logs) by using any of the following approaches: One-way hashes based on strong cryptography, Truncation, Index tokens and pads (pads must be securely stored)

This cannot be done natively in MySQL when using replication without one of the approaches listed above or writing logs to a dedicated encrypted partition.

3.4.1 – If disk encryption is used (rather than file- or column-level database encryption), logical access must be managed independently of native operating system access control mechanisms (for example, by not using local user account databases). Decryption keys must not be tied to user accounts.

When you evaluate a full-disk encryption solution, you must evaluate it primarily against the metrics outlined in the PCI DSS.

Tags: , ,

PCI DSS & MySQL – Requirement 2

April 6th, 2010 | 2 Comments | Posted in MySQL, PCI DSS, Security

Requirement 2 of the PCI DSS v1.2 is:

“Do not use vendor-supplied defaults for system passwords and other security parameters”

Understanding that we’re limiting the discussion solely to MySQL (OS, Network Devices, and other software will no doubt apply to overall compliance), we can do this easily. The vendor-supplied default MySQL 5.1.43 (they’re similar across other versions, but only this version was tested here) credentials can be seen as follows:

mysql> SELECT user, host, password FROM mysql.user;
+------+-----------+----------+
| user | host      | password |
+------+-----------+----------+
| root | localhost |          |
| root | testbox1  |          |
| root | 127.0.0.1 |          |
|      | localhost |          |
|      | testbox1  |          |
+------+-----------+----------+
5 rows in set (0.28 sec)

MySQL handily provides us with a script called mysql_secure_installation that will, assuming we choose Y to all questions, satisfy this requirement:

%> mysql_secure_installation
NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MySQL
      SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!

In order to log into MySQL to secure it, we'll need the current
password for the root user.  If you've just installed MySQL, and
you haven't set the root password yet, the password will be blank,
so you should just press enter here.

Enter current password for root (enter for none):
OK, successfully used password, moving on...

Setting the root password ensures that nobody can log into the MySQL
root user without the proper authorisation.

Set root password? [Y/n] Y
New password:
Re-enter new password:
Password updated successfully!
Reloading privilege tables..
 ... Success!

By default, a MySQL installation has an anonymous user, allowing anyone
to log into MySQL without having to have a user account created for
them.  This is intended only for testing, and to make the installation
go a bit smoother.  You should remove them before moving into a
production environment.

Remove anonymous users? [Y/n] Y
 ... Success!

Normally, root should only be allowed to connect from 'localhost'.  This
ensures that someone cannot guess at the root password from the network.

Disallow root login remotely? [Y/n] Y
 ... Success!

By default, MySQL comes with a database named 'test' that anyone can
access.  This is also intended only for testing, and should be removed
before moving into a production environment.

Remove test database and access to it? [Y/n] Y
 - Dropping test database...
 ... Success!
 - Removing privileges on test database...
 ... Success!

Reloading the privilege tables will ensure that all changes made so far
will take effect immediately.

Reload privilege tables now? [Y/n] Y
 ... Success!

Cleaning up...

All done!  If you've completed all of the above steps, your MySQL
installation should now be secure.

Thanks for using MySQL!

(Emphasis added was mine). This gives us the following:

mysql> SELECT user, host, password FROM mysql.user;
+------+-----------+-------------------------------------------+
| user | host      | password                                  |
+------+-----------+-------------------------------------------+
| root | localhost | *F169C0AFEEC30BFF924130B124E6AE3E875D5F60 |
+------+-----------+-------------------------------------------+
1 row in set (0.00 sec)

Note that the mysql_secure_installation has done more than the PCI minimum of removing default accounts, but also completed items considered to be best practices:

  • Remove anonymous users
  • Disallow root login remotely
  • Remove test database and access to it

We should take this a step further and ensure that MySQL’s old (insecure) password format is not enabled:

mysql> SHOW GLOBAL VARIABLES LIKE 'old_passwords';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| old_passwords | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

To find any users with the old-style (or empty) passwords:

mysql> SELECT user, host FROM mysql.user WHERE LENGTH(password)!=41;
Empty set (0.00 sec)

Now we’ve done this for a new install, but few of us have the privilege of fresh installs. Instead, we work with existing systems. So, here’s a query (for 5.1.43) to find out if we have any insecure user accounts or system defaults:

mysql> SELECT user, host FROM mysql.user WHERE user='' OR password='' OR host='%';
Empty set (0.00 sec)

On the face of it, we seem to have satisfied PCI DSS Requirement 2. If only it were that easy … Requirement 2 actually consists of a variety of subsections. Most of these are outside the scope of MySQL directly, but the following will apply:

2.2 – Develop configuration standards for all system components. Assure that these standards address all known security vulnerabilities and are consistent with industry-accepted system hardening standards.

This requirement is to ensure that all configurations are codified and put into use. In addition to written policies and procedures, the use of an automated configuration management system (cfengine, Chef, Puppet, etc) can simplify adherence to corporate policies and standards.

2.2.3 – Configure system security parameters to prevent misuse.

I’m still waiting for the NSA to release a Security Configuration Guide for MySQL as it has for other databases, but there are other common security configurations that prevent misuse. As every installation is different, do not consider this list comprehensive, it is meant to be a starting point.

  • Be judicious with your GRANTs. If an application requires only SELECT, INSERT, UPDATE, and DELETE privileges on a particular database, limit the GRANT statement accordingly. Specifically avoid WITH GRANT OPTION and SUPER wherever possible (I mention SUPER in particular because it allows the one reserved administrative connection to be used by the app, meaning what may have only been a momentary blip in service results in extended downtime while mysqld is restarted; this goes against the “A” in the CIA Triad)
  • Disable local_infile. I won’t repeat the common wisdom here, but you can read more about the security implications of this on MySQL’s website
  • Disable old_passwords. The hashing algorithm has gotten more secure over the years, it is best practice (as described above) to use only the newer hash algorithm. You can read more about password hashing in MySQL 5.1 here
  • Set read_only=ON. Although not typically viewed as a security configuration parameter, this setting can provide an additional layer of protection that ensures data on your replication slaves remains in sync with your masters (think I in the CIA Triad)
  • Enable secure_auth. This blocks connections from all accounts that have passwords stored in the old (pre-4.1) format. Even though old_passwords will be OFF, this can provide a bit of redundancy to that option

I hope this post has helped secure at least one MySQL server. Look for more PCI DSS + MySQL coming up soon!

Tags: , ,

Nagios Checks For MMM

July 15th, 2009 | No Comments | Posted in MySQL, MySQL Administration

I’ve written some new Nagios checks for MMM (MMM on Google CodeMMM on Launchpad). check_mmm is a part of http://code.google.com/p/check-mysql-all/, and is meant to be called locally on the MMM Monitor server (usually via NRPE). Feedback is welcome, usage is as follows:

Usage:
     check_mmm --cluster C# 

     Options:
       --cluster=    The MMM Cluster to check
       -c, --critical=
    The level at which a critical alarm is raised.
       -h, --help                Display this message and exit
       -v, --verbose             Increase verbosity level
       -V, --version             Display version information and exit
       -w, --warning             The level at which a warning is raised.

     Defaults are:

     ATTRIBUTE                  VALUE
     -------------------------- ------------------
     cluster                    No default value
     critical                   HARD_OFFLINE,REPLICATION_FAIL
     help                       FALSE
     verbose                    1 (out of 3)
     version                    FALSE
     warning                    ADMIN_OFFLINE,AWAITING_RECOVERY,REPLICATION_DELAY
Tags: , ,

Nagios MySQL Plug-Ins

March 9th, 2009 | 4 Comments | Posted in MySQL, MySQL Administration, Nagios

There currently exist many plugins for MySQL to use with Nagios. Many of them, however, are not version-independent, leaving organizations that use multiple versions of MySQL to either install multiple plugins or not monitor specific versions of MySQL. As such, I’ve compiled what I consider to be the most useful checks into a single plugin: check_mysql

Usage:
     check_mysql check_name [options]

     Options:
       --args|a     Optional arguments.  Comma-separated.  Check-specific.
       --critical|c The level at which a critical alarm is raised.  Check-specific.
       --database   The database to use (defaults to mysql)
       --help|?     Display this message and exit
       --hostname|H     The target MySQL server host (defaults to localhost)
       --password|p The password of the MySQL user
       --port       The port MySQL is listening on (defaults to 3306)
       --user|u     The MySQL user used to connect
       --version|V  Display version information and exit
       --warning|w  The level at which a warning is raised.  Check-specific.

     defaults are:

     ATTRIBUTE                  VALUE
     -------------------------- ------------------
     args                       No default value
     critical                   Check-specific
     database                   mysql
     help                       FALSE
     host                       localhost
     password                   No default value
     port                       3306
     timeout                    10 seconds
     user                       No default value
     verbose                    1 (out of 3)
     version                    FALSE
     warning                    Check-specific

Current Checks Supported:

* connect – Check to see whether or not one can connect to MySQL (USAGE)
* repl_io – Check to see whether on not the IO Replication thread is running (REPLICATION CLIENT)
* repl_sql – Check to see whether or not the SQL Replication thread is running (REPLICATION CLIENT)
* repl_sbm – Check how many seconds behind the master the slave is (REPLICATION CLIENT)
* mysql_query – Run a given query, test if it executes properly (SELECT)
* connections – Test if the percentage of used connections is over a given threshold (PROCESS)

I am open to requests for additional checks etc.

Tags: ,

OmniSQL 0.0.7 Released

October 28th, 2008 | No Comments | Posted in MySQL, MySQL Administration, OmniSQL

OmniSQL (a command line tool for DBAs needing to issue ad-hoc queries against sharded data) version 0.0.7 is officially released.

Instead of logging in separately to multiple databases to issue the same query, groups of databases can be specified in a configuration file and queries will be automatically issued against all targeted MySQL instances.

Let me know of any bugs found or features you would like to see in upcoming releases!

Download at http://code.google.com/p/omnisql/.

CHANGE LOG
- Fixed Bug #3: Script failure when only one group is defined
- Fixed Bug #4: Too Slow (now it forks)
- Fixed Bug #5: Host-Specific parameters don't work
- INCOMPATIBLE CHANGE: Changed format of config file
- INCOMPATIBLE CHANGE: Use XML::SimpleObject instead of XML::Simple
- Added group summary (Issue #6)

Tags: ,

The Awkward Stage of Scaling

October 9th, 2008 | No Comments | Posted in MySQL, MySQL Performance, Performance

A lot of my clients are in a position where their database performance is deteriorating but they are not “big enough” (or not willing/able to) explore sharding all of their data structures. They’re too big for the solution to be adding another read slave, but too small to justify the resources for re-designing their architecture. They’ve often implemented memcache, re-factored schema, and tried other ways to improve database performance but are looking for quick wins without the hassles & risks of full-fledged sharding. As such, I find myself regularly recommending that customers explore basic functional partitioning and “mini sharding”. It is a great way to stave off that inevitable day when you have to re-architect the entire application:)

If you are lucky enough to have built-in debug code and know which module is taking the most time, by all means ignore this list and separate that module from the core architecture if possible. If not, read on.

Registration

Your user table probably has a column like the following:

`status` enum('PENDING','ACTIVE','DELETED') NOT NULL default 'PENDING'

Where a user is ‘PENDING’ until they complete email verification. If you do a:

SELECT `status`, COUNT(*) FROM `user` GROUP BY `status`;

(go ahead and do it … I’ll wait)

You will probably notice that a lot of your users are ‘PENDING’ (my guess is ~20% … am I right?). If, upon registration, user records are placed in a pending_user until successfully completing email verification, the following benefits can be realized:

1) The number of records in the main user table is smaller. It is common for as many as 15-25% of new registrations to never complete email verification. Keeping them out of the main user table will slow the data/index size growth rate.
2) Queries against the user table will not be invalidated as often. Because of the way the MySQL query cache works, inserts into a table invalidate queries against the target table. By not inserting as many records into the table (see benefit #1), the queries do not get invalidated as often.
3) Queries using “WHERE `status` != ‘PENDING’” now don’t have to use that where clause. Conversely, queries looking only for records where `status` = ‘PENDING’ can query the pending_user table directly and not interfere with the *real* tables.

The same goes for *anything* that goes through an approval process (media, posts, etc). Put it in a pending table so that queries don’t have to have that extra WHERE clause.

Archive server(s)

The logical extension of separating PENDING records is to do the same for DELETED records. If deleted records (whether users or posts) are only used in specific queries, consider setting up archive_user or archive_post tables. This way, they can easily be moved to separate nodes and not waste storage/index space on your primary database servers.

Search

If search isn’t killing your database now, it probably will in the future. This is one of the easiest modules to detach from the primary database servers (it should simply involve changing your Search.class). Check out Sphinx, Lucene, or Solr. Because search tends to be widely used, this should free up the database to serve a higher volume of other queries.

Click-Tracking

Everybody wants to record what their users do, and when they do it. This can be the backbone of providing recommendations, UX research, and user profiling. It is important. But it is also write-heavy and can cause replication lag. With a little bit of work, the raw data can be separated from the core database server and written to different nodes. Some general thoughts on how to approach this:

- Log the clicks directly from http logs (set up a lightweight daemon on a dedicated node)
- Log clicks to a file and then periodically write to a database that is separate from your application (consider using LOAD DATA INFILE)
- Group the raw data by day/week/month/year and use partitioning
- Move data that is not actively used to another node (usually one with larger, slower disks)

The above recommendations are all relatively easy to implement and can provide tremendous benefit to your application. 64993021DF5E7E3652226B74779DCB92

Tags: , ,

Managing MySQL Configuration Files

September 3rd, 2008 | 2 Comments | Posted in MySQL, MySQL Administration

It is good practice to manage changes to MySQL configuration files (/etc/my.cnf) by using a version control system. I usually use a home-brewed (not brewed by me!) svn+cfengine application to propagate my.cnf (and other configuration file) changes to defined classes of machines (classes are based on application role, replication role, etc).

When managing hundreds of different database servers with dozens of roles, templating my.cnf becomes a necessity! If I have to change a variable for a given class, that can mean editing a dozen my.cnf files on as many servers. Not a productive use of time!!! Usually, the only parameters that differ between database hosts in the same class are the replication options. As such, I find it useful to have a base template for each class of machine and use the !include startup option to specify host-specific startup options.

Here is a sample of what I would have as a base my.cnf for billing-class MySQL Replication Slaves:

[mysqld]
# Default Configuration For billing-class MySQL Replication Slaves


#############################################################################
# GENERAL STORAGE ENGINES
#############################################################################

skip-bdb
default-storage-engine = InnoDB

#############################################################################
# InnoDB
#############################################################################

innodb_file_per_table = 1
innodb_log_file_size = 256M
innodb_buffer_pool_size = 12G
innodb_flush_method = O_DIRECT

#############################################################################
# MyISAM
#############################################################################

key_buffer_size = 32M

#############################################################################
# Query Cache
#############################################################################

query_cache_type = 1
query_cache_size = 64M

#############################################################################
# REPLICATION
#############################################################################

!include /etc/my_replication.cnf

#############################################################################
# LOGGING
#############################################################################

log-err = /var/lib/mysql_logs/err.log
log-slow = /var/lib/mysql_logs/slow.log
long-query-time = 10000
log-queries-not-using-indexes

#############################################################################
# MAINTENANCE & RECOVERY
#############################################################################

myisam_recover = FORCE,BACKUP

And then I would store host-specific replication options in a smaller file:

[mysqld]
#############################################################################
# REPLICATION
#############################################################################

server-id = 100
...

To re-iterate, templating MySQL configuration files can help ensure consistency between servers of the same class as well as making it easier to push changes to all relevant hosts instead of doing it individually.

Tags: ,

The Dangers of Having status fields

August 27th, 2008 | 1 Comment | Posted in MySQL, MySQL Performance

Having a status column is very common in databases today. It can be used to denote a user status:

CREATE TABLE IF NOT EXISTS `user` (
`user_id` int(10) unsigned NOT NULL auto_increment,
`email` varchar(32) NOT NULL,
`pw_hash` char(40) NOT NULL COLLATE latin1_general_cs,
`status` enum('PENDING', 'ACTIVE', 'DISABLED') default 'PENDING',
`date_created` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`user_id`),
UNIQUE KEY `idx_email` (`email`)
);

or user-uploaded media status:

CREATE TABLE IF NOT EXISTS `media` (
`media_id` int unsigned NOT NULL auto_increment,
`owner_user_id` int(10) unsigned NOT NULL,
`title` varchar(32) NOT NULL,
`path` varchar(255) NOT NULL,
`description` varchar(128) NOT NULL,
`status` enum('PROCESSING', 'ACTIVE', 'FAILED', 'DISABLED') default 'PROCESSING',
`date_created` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`media_id`),
KEY `idx_owner` (`owner_user_id`)
);

Let’s say that we needed to retrieve a list of media associated with users. This can be easily accomplished with the following query:

SELECT SQL_NO_CACHE
`u`.`user_id`,
`m`.`media_id`
FROM `user` AS `u`
JOIN `media` AS `m` ON (`u`.`user_id` = `m`.`owner_user_id`);

If there are status columns involved, the query then becomes:

SELECT SQL_NO_CACHE
`u`.`user_id`,
`m`.`media_id`
FROM `user` AS `u`
JOIN `media` AS `m` ON (`u`.`user_id` = `m`.`owner_user_id`)
WHERE `u`.`status` = 'ACTIVE'
AND `m`.`status` = 'ACTIVE';

If non-active records are removed, the status columns dropped, and the first query run, there is a 15% increase in qps over the second query on the original tables in my test environment. The difference can becomes even more pronounced as the number of tables referenced in a JOIN increases!

If you must have status fields, choose the right data type. But if they can be avoided by archiving disabled records to “cold” databases and keeping “pending” records separate (say, until a user completes email verification), that is generally a better solution.

Tags: ,