Percona Toolkit Documentation
                    Release 2.1.1




                    Percona Inc




                        April 04, 2012
CONTENTS



1  Getting Percona Toolkit
   1.1  Installation

2  Tools
   2.1   pt-align
   2.2   pt-archiver
   2.3   pt-config-diff
   2.4   pt-deadlock-logger
   2.5   pt-diskstats
   2.6   pt-duplicate-key-checker
   2.7   pt-fifo-split
   2.8   pt-find
   2.9   pt-fingerprint
   2.10  pt-fk-error-logger
   2.11  pt-heartbeat
   2.12  pt-index-usage
   2.13  pt-ioprofile
   2.14  pt-kill
   2.15  pt-log-player
   2.16  pt-mext
   2.17  pt-mysql-summary
   2.18  pt-online-schema-change
   2.19  pt-pmp
   2.20  pt-query-advisor
   2.21  pt-query-digest
   2.22  pt-show-grants
   2.23  pt-sift
   2.24  pt-slave-delay
   2.25  pt-slave-find
   2.26  pt-slave-restart
   2.27  pt-stalk
   2.28  pt-summary
   2.29  pt-table-checksum
   2.30  pt-table-sync
   2.31  pt-table-usage
   2.32  pt-tcp-model
   2.33  pt-trend
   2.34  pt-upgrade
   2.35  pt-variable-advisor
   2.36  pt-visual-explain

3  Configuration
   3.1  CONFIGURATION FILES
   3.2  DSN (DATA SOURCE NAME) SPECIFICATIONS
   3.3  ENVIRONMENT
   3.4  SYSTEM REQUIREMENTS

4  Miscellaneous
   4.1  BUGS
   4.2  AUTHORS
   4.3  COPYRIGHT, LICENSE, AND WARRANTY
   4.4  VERSION
   4.5  Release Notes

Index


Percona Toolkit is a collection of advanced command-line tools used by Percona (http://www.percona.com/) support
staff to perform a variety of MySQL and system tasks that are too difficult or complex to perform manually.
These tools are ideal alternatives to private or “one-off” scripts because they are professionally developed, formally
tested, and fully documented. They are also fully self-contained, so installation is quick and easy and no libraries are
installed.
Percona Toolkit is derived from Maatkit and Aspersa, two of the best-known toolkits for MySQL server administration.
It is developed and supported by Percona Inc. For more information and other free, open-source software developed
by Percona, visit http://www.percona.com/software/.




CHAPTER ONE

GETTING PERCONA TOOLKIT

1.1 Installation

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.
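
For example, assuming an RPM- or Debian-based system and the package names above, the downloaded package might be
installed like this (a sketch; the tarball directory layout is also an assumption, so adjust paths for your platform):

rpm -ivh percona-toolkit.rpm        # RPM-based systems
dpkg -i percona-toolkit.deb         # Debian-based systems

# Or unpack the tarball and run a tool in place (directory name assumed)
tar xzf percona-toolkit.tar.gz
cd percona-toolkit-2.1.1 && perl bin/pt-archiver --help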




CHAPTER TWO

TOOLS

2.1 pt-align

2.1.1 NAME

pt-align - Align output from other tools to columns.


2.1.2 SYNOPSIS

Usage

pt-align [FILES]

pt-align aligns output from other tools to columns. If no FILES are specified, STDIN is read.
If a tool prints the following output,
DATABASE TABLE   ROWS
foo      bar      100
long_db_name table 1
another long_name 500

then pt-align reprints the output as,
DATABASE          TABLE     ROWS
foo               bar        100
long_db_name      table        1
another           long_name 500



2.1.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-align is a read-only tool. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-align.




See also “BUGS” for more information on filing bugs and getting help.


2.1.4 DESCRIPTION

pt-align reads lines and splits them into words. It counts how many words each line has, and if there is one number that
predominates, it assumes this is the number of words in each line. Then it discards all lines that don’t have that many
words, and looks at the 2nd line that does. It assumes this is the first non-header line. Based on whether each word
looks numeric or not, it decides on column alignment. Finally, it goes through and decides how wide each column
should be, and then prints them out.
This is useful for things like aligning the output of vmstat or iostat so it is easier to read.
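
For example, a typical invocation is to pipe a status tool through pt-align, or to pass a saved report as a file
argument (the vmstat and iostat arguments below are only illustrative):

vmstat 1 5 | pt-align               # align five one-second vmstat samples

iostat -dx 1 3 > iostat.out
pt-align iostat.out                 # align a previously saved report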


2.1.5 OPTIONS

This tool does not have any command-line options.


2.1.6 ENVIRONMENT

This tool does not use any environment variables.


2.1.7 SYSTEM REQUIREMENTS

You need Perl, and some core packages that ought to be installed in any reasonably new version of Perl.


2.1.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-align.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.1.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb






You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.1.10 AUTHORS

Baron Schwartz, Brian Fraser, and Daniel Nichter


2.1.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.1.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.1.13 VERSION

pt-align 2.1.1


2.2 pt-archiver

2.2.1 NAME

pt-archiver - Archive rows from a MySQL table into another table or a file.


2.2.2 SYNOPSIS

Usage

pt-archiver [OPTION...] --source DSN --where WHERE






pt-archiver nibbles records from a MySQL table. The --source and --dest arguments use DSN syntax; if COPY is yes,
--dest defaults to the key’s value from --source.


Examples

Archive all rows from oltp_server to olap_server and to a file:
pt-archiver --source h=oltp_server,D=test,t=tbl --dest h=olap_server \
  --file '/var/log/archive/%Y-%m-%d-%D.%t' \
  --where "1=1" --limit 1000 --commit-each

Purge (delete) orphan rows from child table:
pt-archiver --source h=host,D=db,t=child --purge \
  --where 'NOT EXISTS(SELECT * FROM parent WHERE col=child.col)'



2.2.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-archiver is a read-write tool. It deletes data from the source by default, so you should test your archiving jobs with
the --dry-run option if you’re not sure about them. It is designed to have as little impact on production systems as
possible, but tuning with --limit, --txn-size and similar options might be a good idea too.
If you write or use --plugin modules, you should ensure they are good quality and well-tested.
At the time of this release there is an unverified bug with --bulk-insert that may cause data loss.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL:
http://www.percona.com/bugs/pt-archiver.
See also “BUGS” for more information on filing bugs and getting help.


2.2.4 DESCRIPTION

pt-archiver is the tool I use to archive tables as described in http://tinyurl.com/mysql-archiving. The goal is a
low-impact, forward-only job to nibble old data out of the table without impacting OLTP queries much. You can insert
the data into another table, which need not be on the same server. You can also write it to a file in a format suitable
for LOAD DATA INFILE. Or you can do neither, in which case it’s just an incremental DELETE.
pt-archiver is extensible via a plugin mechanism. You can inject your own code to add advanced archiving logic that
could be useful for archiving dependent data, applying complex business rules, or building a data warehouse during
the archiving process.
You need to choose values carefully for some options. The most important are --limit, --retries, and
--txn-size.
The strategy is to find the first row(s), then scan some index forward-only to find more rows efficiently. Each
subsequent query should not scan the entire table; it should seek into the index, then scan until it finds more
archivable rows. Specifying the index with the ‘i’ part of the --source argument can be crucial for this; use
--dry-run to examine the generated queries and be sure to EXPLAIN them to see if they are efficient (most of the time
you probably want to scan the PRIMARY key, which is the default). Even better, profile pt-archiver with mk-query-profiler
(http://maatkit.org/get/mk-query-profiler) and make sure it is not scanning the whole table every query.
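
As a sketch of that workflow (the host, table, column, and index names here are hypothetical), print the statements
first and inspect them before running the real job:

# Print the SQL pt-archiver would run, without touching any data
pt-archiver --source h=oltp_server,D=test,t=tbl,i=PRIMARY \
  --where "ts < NOW() - INTERVAL 90 DAY" --purge --limit 1000 --dry-run

# Then EXPLAIN the printed SELECT in the mysql client and verify that it
# seeks into the chosen index rather than scanning the whole table.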





You can disable the seek-then-scan optimizations partially or wholly with --no-ascend and --ascend-first.
Sometimes this may be more efficient for multi-column keys. Be aware that pt-archiver is built to start at the beginning
of the index it chooses and scan it forward-only. This might result in long table scans if you’re trying to nibble from
the end of the table by an index other than the one it prefers. See --source and read the documentation on the i
part if this applies to you.


2.2.5 OUTPUT

If you specify --progress, the output is a header row, plus status output at intervals. Each row in the status output
lists the current date and time, how many seconds pt-archiver has been running, and how many rows it has archived.
If you specify --statistics, pt-archiver outputs timing and other information to help you identify which part of
your archiving process takes the most time.


2.2.6 ERROR-HANDLING

pt-archiver tries to catch signals and exit gracefully; for example, if you send it SIGTERM (or press Ctrl-C, which sends
SIGINT on UNIX-ish systems), it will catch the signal, print a message about the signal, and exit fairly normally. It will not execute
--analyze or --optimize, because these may take a long time to finish. It will run all other code normally,
including calling after_finish() on any plugins (see “EXTENDING”).
In other words, a signal, if caught, will break out of the main archiving loop and skip optimize/analyze.


2.2.7 OPTIONS

Specify at least one of --dest, --file, or --purge.
--ignore and --replace are mutually exclusive.
--txn-size and --commit-each are mutually exclusive.
--low-priority-insert and --delayed-insert are mutually exclusive.
--share-lock and --for-update are mutually exclusive.
--analyze and --optimize are mutually exclusive.
--no-ascend and --no-delete are mutually exclusive.
DSN values in --dest default to values from --source if COPY is yes.
-analyze
    type: string
      Run ANALYZE TABLE afterwards on --source and/or --dest.
      Runs ANALYZE TABLE after finishing. The argument is an arbitrary string. If it contains the letter ‘s’, the
      source will be analyzed. If it contains ‘d’, the destination will be analyzed. You can specify either or both. For
      example, the following will analyze both:
      --analyze=ds

      See http://dev.mysql.com/doc/en/analyze-table.html for details on ANALYZE TABLE.
-ascend-first
    Ascend only first column of index.
      If you do want to use the ascending index optimization (see --no-ascend), but do not want to incur the
      overhead of ascending a large multi-column index, you can use this option to tell pt-archiver to ascend only the




     leftmost column of the index. This can provide a significant performance boost over not ascending the index at
     all, while avoiding the cost of ascending the whole index.
     See “EXTENDING” for a discussion of how this interacts with plugins.
-ask-pass
    Prompt for a password when connecting to MySQL.
-buffer
    Buffer output to --file and flush at commit.
     Disables autoflushing to --file and flushes --file to disk only when a transaction commits. This typically
     means the file is block-flushed by the operating system, so there may be some implicit flushes to disk between
     commits as well. The default is to flush --file to disk after every row.
     The danger is that a crash might cause lost data.
     The performance increase I have seen from using --buffer is around 5 to 15 percent. Your mileage may vary.
-bulk-delete
    Delete each chunk with a single statement (implies --commit-each).
     Delete each chunk of rows in bulk with a single DELETE statement. The statement deletes every row between
     the first and last row of the chunk, inclusive. It implies --commit-each, since it would be a bad idea to
     INSERT rows one at a time and commit them before the bulk DELETE.
     The normal method is to delete every row by its primary key. Bulk deletes might be a lot faster. They also
     might not be faster if you have a complex WHERE clause.
     This option completely defers all DELETE processing until the chunk of rows is finished. If you have a plugin
     on the source, its before_delete method will not be called. Instead, its before_bulk_delete method
     is called later.
     WARNING: if you have a plugin on the source that sometimes doesn’t return true from is_archivable(),
     you should use this option only if you understand what it does. If the plugin instructs pt-archiver not to archive
     a row, it will still be deleted by the bulk delete!
-[no]bulk-delete-limit
    default: yes
     Add --limit to --bulk-delete statement.
     This is an advanced option and you should not disable it unless you know what you are doing and why! By
     default, --bulk-delete appends a --limit clause to the bulk delete SQL statement. In certain cases, this
     clause can be omitted by specifying --no-bulk-delete-limit. --limit must still be specified.
-bulk-insert
    Insert each chunk with LOAD DATA INFILE (implies --bulk-delete --commit-each).
     Insert each chunk of rows with LOAD DATA LOCAL INFILE. This may be much faster than inserting a row
     at a time with INSERT statements. It is implemented by creating a temporary file for each chunk of rows, and
     writing the rows to this file instead of inserting them. When the chunk is finished, it uploads the rows.
     To protect the safety of your data, this option forces bulk deletes to be used. It would be unsafe to delete each
     row as it is found, before inserting the rows into the destination first. Forcing bulk deletes guarantees that the
     deletion waits until the insertion is successful.
     The --low-priority-insert, --replace, and --ignore options work with this option, but
     --delayed-insert does not.
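
      For instance, a higher-throughput copy job might combine the bulk operations like this (hostnames, table, and
      WHERE clause are hypothetical; --bulk-insert already implies --bulk-delete and --commit-each):

      pt-archiver --source h=oltp_server,D=sales,t=orders --dest h=olap_server \
        --where "created < NOW() - INTERVAL 1 YEAR" \
        --bulk-insert --limit 500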
-charset
    short form: -A; type: string






     Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
     option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
     on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
     See also --[no]check-charset.
-[no]check-charset
    default: yes
     Ensure connection and table character sets are the same. Disabling this check may cause text to be erroneously
     converted from one character set to another (usually from utf8 to latin1) which may cause data loss or mojibake.
     Disabling this check may be useful or necessary when character set conversions are intended.
-[no]check-columns
    default: yes
     Ensure --source and --dest have same columns.
     Enabled by default; causes pt-archiver to check that the source and destination tables have the same columns. It
     does not check column order, data type, etc. It just checks that all columns in the source exist in the destination
     and vice versa. If there are any differences, pt-archiver will exit with an error.
      To disable this check, specify --no-check-columns.
-check-interval
    type: time; default: 1s
     How often to check for slave lag if --check-slave-lag is given.
-check-slave-lag
    type: string
     Pause archiving until the specified DSN’s slave lag is less than --max-lag.
-columns
    short form: -c; type: array
     Comma-separated list of columns to archive.
     Specify a comma-separated list of columns to fetch, write to the file, and insert into the destination table. If
     specified, pt-archiver ignores other columns unless it needs to add them to the SELECT statement for ascending
     an index or deleting rows. It fetches and uses these extra columns internally, but does not write them to the file
     or to the destination table. It does pass them to plugins.
     See also --primary-key-only.
-commit-each
    Commit each set of fetched and archived rows (disables --txn-size).
     Commits transactions and flushes --file after each set of rows has been archived, before fetching the next set
     of rows, and before sleeping if --sleep is specified. Disables --txn-size; use --limit to control the
     transaction size with --commit-each.
     This option is useful as a shortcut to make --limit and --txn-size the same value, but more importantly
     it avoids transactions being held open while searching for more rows. For example, imagine you are archiving
     old rows from the beginning of a very large table, with --limit 1000 and --txn-size 1000. After some
     period of finding and archiving 1000 rows at a time, pt-archiver finds the last 999 rows and archives them,
     then executes the next SELECT to find more rows. This scans the rest of the table, but never finds any more
     rows. It has held open a transaction for a very long time, only to determine it is finished anyway. You can use
     --commit-each to avoid this.
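
      A common pattern, then, is to let --limit define the batch size and commit each batch (all names and values
      below are only an example):

      pt-archiver --source h=host,D=db,t=big_tbl --dest h=host2 \
        --where "created < '2011-01-01'" --limit 1000 --commit-each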
-config
    type: Array





      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-delayed-insert
    Add the DELAYED modifier to INSERT statements.
      Adds the DELAYED modifier to INSERT or REPLACE statements. See http://dev.mysql.com/doc/en/insert.html for
      details.
-dest
    type: DSN
      DSN specifying the table to archive to.
      This item specifies a table into which pt-archiver will insert rows archived from --source. It uses the same
      key=val argument format as --source. Most missing values default to the same values as --source, so
      you don’t have to repeat options that are the same in --source and --dest. Use the --help option to see
      which values are copied from --source.
      WARNING: Using a default options file (F) DSN option that defines a socket for --source causes pt-
      archiver to connect to --dest using that socket unless another socket for --dest is specified. This means
      that pt-archiver may incorrectly connect to --source when it connects to --dest. For example:
      --source F=host1.cnf,D=db,t=tbl --dest h=host2

      When pt-archiver connects to --dest, host2, it will connect via the --source, host1, socket defined in
      host1.cnf.
-dry-run
    Print queries and exit without doing anything.
      Causes pt-archiver to exit after printing the filename and SQL statements it will use.
-file
    type: string
      File to archive to, with DATE_FORMAT()-like formatting.
      Filename to write archived rows to. A subset of MySQL’s DATE_FORMAT() formatting codes are allowed in
      the filename, as follows:
      %d      Day of the month, numeric (01..31)
      %H      Hour (00..23)
      %i      Minutes, numeric (00..59)
      %m      Month, numeric (01..12)
      %s      Seconds (00..59)
      %Y      Year, numeric, four digits

      You can use the following extra format codes too:
      %D      Database name
      %t      Table name

      Example:
      --file ’/var/log/archive/%Y-%m-%d-%D.%t’

      The file’s contents are in the same format used by SELECT INTO OUTFILE, as documented in the MySQL
      manual: rows terminated by newlines, columns terminated by tabs, NULL characters are represented by \N, and
      special characters are escaped by \. This lets you reload a file with LOAD DATA INFILE’s default settings.
      If you want a column header at the top of the file, see --header. The file is auto-flushed by default; see
      --buffer.
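
      Because the file uses SELECT INTO OUTFILE’s default format, it can later be reloaded with LOAD DATA INFILE’s
      default settings; a minimal sketch, assuming hypothetical file, database, and table names (LOAD DATA LOCAL may
      require the local-infile option to be enabled on the client and server):

      mysql -h olap_server test -e \
        "LOAD DATA LOCAL INFILE '/var/log/archive/2012-04-04-test.tbl' INTO TABLE tbl"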





-for-update
    Adds the FOR UPDATE modifier to SELECT statements.
      For details, see http://dev.mysql.com/doc/en/innodb-locking-reads.html.
-header
    Print column header at top of --file.
      Writes column names as the first line in the file given by --file. If the file exists, does not write headers; this
      keeps the file loadable with LOAD DATA INFILE in case you append more output to it.
-help
    Show help and exit.
-high-priority-select
    Adds the HIGH_PRIORITY modifier to SELECT statements.
      See http://dev.mysql.com/doc/en/select.html for details.
-host
    short form: -h; type: string
      Connect to host.
-ignore
    Use IGNORE for INSERT statements.
      Causes INSERTs into --dest to be INSERT IGNORE.
-limit
    type: int; default: 1
      Number of rows to fetch and archive per statement.
      Limits the number of rows returned by the SELECT statements that retrieve rows to archive. Default is one
      row. It may be more efficient to increase the limit, but be careful if you are archiving sparsely, skipping over
      many rows; this can potentially cause more contention with other queries, depending on the storage engine,
      transaction isolation level, and options such as --for-update.
-local
    Do not write OPTIMIZE or ANALYZE queries to binlog.
      Adds the NO_WRITE_TO_BINLOG modifier to ANALYZE and OPTIMIZE queries. See --analyze for
      details.
-low-priority-delete
    Adds the LOW_PRIORITY modifier to DELETE statements.
      See http://dev.mysql.com/doc/en/delete.html for details.
-low-priority-insert
    Adds the LOW_PRIORITY modifier to INSERT or REPLACE statements.
      See http://dev.mysql.com/doc/en/insert.html for details.
-max-lag
    type: time; default: 1s
      Pause archiving if the slave given by --check-slave-lag lags.
      This option causes pt-archiver to look at the slave every time it’s about to fetch another row. If the slave’s lag
      is greater than the option’s value, or if the slave isn’t running (so its lag is NULL), pt-archiver sleeps
      for --check-interval seconds and then looks at the lag again. It repeats until the slave is caught up, then
      proceeds to fetch and archive the row.





       This option may eliminate the need for --sleep or --sleep-coef.
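
        For example, to throttle a purge on a master while watching one slave (the DSNs, WHERE clause, and thresholds
        are hypothetical):

        pt-archiver --source h=master,D=db,t=tbl --purge \
          --where "ts < NOW() - INTERVAL 90 DAY" \
          --check-slave-lag h=slave1 --max-lag 5s --check-interval 2s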
-no-ascend
    Do not use ascending index optimization.
       The default ascending-index optimization causes pt-archiver to optimize repeated SELECT queries so they
       seek into the index where the previous query ended, then scan along it, rather than scanning from the beginning
       of the table every time. This is enabled by default because it is generally a good strategy for repeated accesses.
       Large, multiple-column indexes may cause the WHERE clause to be complex enough that this could actually
       be less efficient. Consider for example a four-column PRIMARY KEY on (a, b, c, d). The WHERE clause to
       start where the last query ended is as follows:
        WHERE (a > ?)
           OR (a = ? AND b > ?)
           OR (a = ? AND b = ? AND c > ?)
           OR (a = ? AND b = ? AND c = ? AND d >= ?)

       Populating the placeholders with values uses memory and CPU, adds network traffic and parsing overhead, and
       may make the query harder for MySQL to optimize. A four-column key isn’t a big deal, but a ten-column key
       in which every column allows NULL might be.
       Ascending the index might not be necessary if you know you are simply removing rows from the beginning
       of the table in chunks, but not leaving any holes, so starting at the beginning of the table is actually the most
       efficient thing to do.
       See also --ascend-first. See “EXTENDING” for a discussion of how this interacts with plugins.
-no-delete
    Do not delete archived rows.
       Causes pt-archiver not to delete rows after processing them. This disallows --no-ascend, because enabling
       them both would cause an infinite loop.
       If there is a plugin on the source DSN, its before_delete method is called anyway, even though pt-archiver
       will not execute the delete. See “EXTENDING” for more on plugins.
-optimize
    type: string
       Run OPTIMIZE TABLE afterwards on --source and/or --dest.
        Runs OPTIMIZE TABLE after finishing. See --analyze for the option syntax and
        http://dev.mysql.com/doc/en/optimize-table.html for details on OPTIMIZE TABLE.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-plugin
    type: string
       Perl module name to use as a generic plugin.
       Specify the Perl module name of a general-purpose plugin. It is currently used only for statistics (see
       --statistics) and must have new() and a statistics() method.




      The new(src => $src, dst => $dst, opts => $o) method gets the source and destination DSNs, and their
      database connections, just like the connection-specific plugins do. It also gets an OptionParser object ($o) for
      accessing command-line options (example: $o->get('purge');).
      The statistics(\%stats, $time) method gets a hashref of the statistics collected by the archiving
      job, and the time the whole job started.
-port
    short form: -P; type: int
      Port number to use for connection.
-primary-key-only
    Primary key columns only.
      A shortcut for specifying --columns with the primary key columns. This is an efficiency if you just want
      to purge rows; it avoids fetching the entire row, when only the primary key columns are needed for DELETE
      statements. See also --purge.
-progress
    type: int
      Print progress information every X rows.
      Prints current time, elapsed time, and rows archived every X rows.
-purge
    Purge instead of archiving; allows omitting --file and --dest.
      Allows archiving without a --file or --dest argument, which is effectively a purge since the rows are just
      deleted.
      If you just want to purge rows, consider specifying the table’s primary key columns with
      --primary-key-only. This will prevent fetching all columns from the server for no reason.
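
      A minimal purge job might therefore look like this (names and values are hypothetical):

      pt-archiver --source h=host,D=db,t=tbl --purge --primary-key-only \
        --where "status = 'dead'" --limit 1000 --commit-each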
-quick-delete
    Adds the QUICK modifier to DELETE statements.
      See http://dev.mysql.com/doc/en/delete.html for details. As stated in the documentation, in some cases it may
      be faster to use DELETE QUICK followed by OPTIMIZE TABLE. You can use --optimize for this.
-quiet
    short form: -q
      Do not print any output, such as for --statistics.
      Suppresses normal output, including the output of --statistics, but doesn’t suppress the output from
      --why-quit.
-replace
    Causes INSERTs into --dest to be written as REPLACE.
-retries
    type: int; default: 1
      Number of retries per timeout or deadlock.
      Specifies the number of times pt-archiver should retry when there is an InnoDB lock wait timeout or deadlock.
      When retries are exhausted, pt-archiver will exit with an error.
      Consider carefully what you want to happen when you are archiving between a mixture of transactional and
      non-transactional storage engines. The INSERT to --dest and DELETE from --source are on separate
      connections, so they do not actually participate in the same transaction even if they’re on the same server.






      However, pt-archiver implements simple distributed transactions in code, so commits and rollbacks should
      happen as desired across the two connections.
      At this time I have not written any code to handle errors with transactional storage engines other than InnoDB.
      Request that feature if you need it.
-run-time
    type: time
      Time to run before exiting.
      Optional suffix s=seconds, m=minutes, h=hours, d=days; if no suffix, s is used.
-[no]safe-auto-increment
    default: yes
      Do not archive row with max AUTO_INCREMENT.
      Adds an extra WHERE clause to prevent pt-archiver from removing the newest row when ascending a single-
      column AUTO_INCREMENT key. This guards against re-using AUTO_INCREMENT values if the server
      restarts, and is enabled by default.
      The extra WHERE clause contains the maximum value of the auto-increment column as of the beginning of the
      archive or purge job. If new rows are inserted while pt-archiver is running, it will not see them.
-sentinel
    type: string; default: /tmp/pt-archiver-sentinel
      Exit if this file exists.
      The presence of the file specified by --sentinel will cause pt-archiver to stop archiving and exit. The
      default is /tmp/pt-archiver-sentinel. You might find this handy to stop cron jobs gracefully if necessary. See also
      --stop.
-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables.
      Specify any variables you want to be set immediately after connecting to MySQL. These will be included in a
      SET command.
-share-lock
    Adds the LOCK IN SHARE MODE modifier to SELECT statements.
      See http://dev.mysql.com/doc/en/innodb-locking-reads.html.
-skip-foreign-key-checks
    Disables foreign key checks with SET FOREIGN_KEY_CHECKS=0.
-sleep
    type: int
      Sleep time between fetches.
      Specifies how long to sleep between SELECT statements. Default is not to sleep at all. Transactions are NOT
      committed, and the --file file is NOT flushed, before sleeping. See --txn-size to control that.
      If --commit-each is specified, committing and flushing happens before sleeping.
-sleep-coef
    type: float
      Calculate --sleep as a multiple of the last SELECT time.






      If this option is specified, pt-archiver will sleep for the query time of the last SELECT multiplied by the
      specified coefficient.
      This is a slightly more sophisticated way to throttle the SELECTs: sleep a varying amount of time between each
      SELECT, depending on how long the SELECTs are taking.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-source
    type: DSN
      DSN specifying the table to archive from (required). This argument is a DSN. See DSN OPTIONS for the
      syntax. Most options control how pt-archiver connects to MySQL, but there are some extended DSN options
      in this tool’s syntax. The D, t, and i options select a table to archive:
      --source h=my_server,D=my_database,t=my_tbl

      The a option specifies the database to set as the connection’s default with USE. If the b option is true, it disables
      binary logging with SQL_LOG_BIN. The m option specifies pluggable actions, which an external Perl module
      can provide. The only required part is the table; other parts may be read from various places in the environment
      (such as options files).
      The ‘i’ part deserves special mention. This tells pt-archiver which index it should scan to archive. This appears
      in a FORCE INDEX or USE INDEX hint in the SELECT statements used to fetch archivable rows. If you don’t
      specify anything, pt-archiver will auto-discover a good index, preferring a PRIMARY KEY if one exists. In my
      experience this usually works well, so most of the time you can probably just omit the ‘i’ part.
      The index is used to optimize repeated accesses to the table; pt-archiver remembers the last row it retrieves from
      each SELECT statement, and uses it to construct a WHERE clause, using the columns in the specified index,
      that should allow MySQL to start the next SELECT where the last one ended, rather than potentially scanning
      from the beginning of the table with each successive SELECT. If you are using external plugins, please see
      “EXTENDING” for a discussion of how they interact with ascending indexes.
      The ‘a’ and ‘b’ options allow you to control how statements flow through the binary log. If you specify the ‘b’
      option, binary logging will be disabled on the specified connection. If you specify the ‘a’ option, the connection
      will USE the specified database, which you can use to prevent slaves from executing the binary log events with
      --replicate-ignore-db options. These two options can be used as different methods to achieve the same
      goal: archive data off the master, but leave it on the slave. For example, you can run a purge job on the master
      and prevent it from happening on the slave using your method of choice.
      WARNING: Using a default options file (F) DSN option that defines a socket for --source causes pt-
      archiver to connect to --dest using that socket unless another socket for --dest is specified. This means
      that pt-archiver may incorrectly connect to --source when it is meant to connect to --dest. For example:
      --source F=host1.cnf,D=db,t=tbl --dest h=host2

      When pt-archiver connects to --dest, host2, it will connect via the --source, host1, socket defined in
      host1.cnf.
-statistics
    Collect and print timing statistics.
      Causes pt-archiver to collect timing statistics about what it does. These statistics are available to the plugin
      specified by --plugin
      Unless you specify --quiet, pt-archiver prints the statistics when it exits. The statistics look like this:







      Started at 2008-07-18T07:18:53, ended at 2008-07-18T07:18:53
      Source: D=db,t=table
      SELECT 4
      INSERT 4
      DELETE 4
      Action         Count       Time        Pct
      commit            10     0.1079      88.27
      select             5     0.0047       3.87
      deleting           4     0.0028       2.29
      inserting          4     0.0028       2.28
      other              0     0.0040       3.29

      The first two (or three) lines show times and the source and destination tables. The next three lines show how
      many rows were fetched, inserted, and deleted.
      The remaining lines show counts and timing. The columns are the action, the total number of times that action
      was timed, the total time it took, and the percent of the program’s total runtime. The rows are sorted in order of
      descending total time. The last row is the rest of the time not explicitly attributed to anything. Actions will vary
      depending on command-line options.
      If --why-quit is given, its behavior is changed slightly. This option causes it to print the reason for exiting
      even when it’s just because there are no more rows.
      This option requires the standard Time::HiRes module, which is part of core Perl on reasonably new Perl re-
      leases.
-stop
    Stop running instances by creating the sentinel file.
      Causes pt-archiver to create the sentinel file specified by --sentinel and exit. This should have the effect
      of stopping all running instances which are watching the same sentinel file.
-txn-size
    type: int; default: 1
      Number of rows per transaction.
      Specifies the size, in number of rows, of each transaction. Zero disables transactions altogether. After pt-
      archiver processes this many rows, it commits both the --source and the --dest if given, and flushes the
      file given by --file.
      This parameter is critical to performance. If you are archiving from a live server, which for example is doing
      heavy OLTP work, you need to choose a good balance between transaction size and commit overhead. Larger
      transactions create the possibility of more lock contention and deadlocks, but smaller transactions cause more
      frequent commit overhead, which can be significant. To give an idea, on a small test set I worked with while
      writing pt-archiver, a value of 500 caused archiving to take about 2 seconds per 1000 rows on an otherwise
      quiet MySQL instance on my desktop machine, archiving to disk and to another table. Disabling transactions
      with a value of zero, which turns on autocommit, dropped performance to 38 seconds per thousand rows.
      If you are not archiving from or to a transactional storage engine, you may want to disable transactions so
      pt-archiver doesn’t try to commit.
-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.
-where
    type: string




      WHERE clause to limit which rows to archive (required).
      Specifies a WHERE clause to limit which rows are archived. Do not include the word WHERE. You may need
      to quote the argument to prevent your shell from interpreting it. For example:
      --where ’ts < current_date - interval 90 day’

      For safety, --where is required. If you do not require a WHERE clause, use --where 1=1.
-why-quit
    Print reason for exiting unless rows exhausted.
      Causes pt-archiver to print a message if it exits for any reason other than running out of rows to archive.
      This can be useful if you have a cron job with --run-time specified, for example, and you want to be sure
      pt-archiver is finishing before running out of time.
      If --statistics is given, the behavior is changed slightly. It will print the reason for exiting even when it’s
      just because there are no more rows.
      This output prints even if --quiet is given. That’s so you can put pt-archiver in a cron job and get an email
      if there’s an abnormal exit.


2.2.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • a
      copy: no
      Database to USE when executing queries.
   • A
      dsn: charset; copy: yes
      Default character set.
   • b
      copy: no
      If true, disable binlog with SQL_LOG_BIN.
   • D
      dsn: database; copy: yes
      Database that contains the table.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • i




       copy: yes
       Index to use.
     • m
       copy: no
       Plugin module name.
     • p
       dsn: password; copy: yes
       Password to use when connecting.
     • P
       dsn: port; copy: yes
       Port number to use for connection.
     • S
       dsn: mysql_socket; copy: yes
       Socket file to use for connection.
     • t
       copy: yes
       Table to archive from/to.
     • u
       dsn: user; copy: yes
       User for login if not current user.
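
Putting several of these options together, a complete invocation might look like the following (every host, schema,
and column name is a hypothetical example; b=1 keeps the source’s deletes out of the binary log, and t is copied to
--dest because its copy attribute is yes):

pt-archiver \
  --source h=127.0.0.1,P=3306,u=archiver,D=sales,t=orders,i=PRIMARY,b=1 \
  --dest h=archive_host,D=sales_archive \
  --where "shipped = 1" --limit 500 --commit-each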


2.2.9 EXTENDING

pt-archiver is extensible by plugging in external Perl modules to handle some logic and/or actions. You can specify a
module for both the --source and the --dest, with the ‘m’ part of the specification. For example:
--source D=test,t=test1,m=My::Module1 --dest m=My::Module2,t=test2

This will cause pt-archiver to load the My::Module1 and My::Module2 packages, create instances of them, and then
make calls to them during the archiving process.
You can also specify a plugin with --plugin.
The module must provide this interface:
new(dbh => $dbh, db => $db_name, tbl => $tbl_name)
       The plugin’s constructor is passed a reference to the database handle, the database name, and table name.
       The plugin is created just after pt-archiver opens the connection, and before it examines the table given
       in the arguments. This gives the plugin a chance to create and populate temporary tables, or do other setup
       work.
before_begin(cols => \@cols, allcols => \@allcols)
       This method is called just before pt-archiver begins iterating through rows and archiving them, but after
       it does all other setup work (examining table structures, designing SQL queries, and so on). This is the
       only time pt-archiver tells the plugin column names for the rows it will pass the plugin while archiving.




     The cols argument is the column names the user requested to be archived, either by default or by the
     --columns option. The allcols argument is the list of column names for every row pt-archiver will
     fetch from the source table. It may fetch more columns than the user requested, because it needs some
     columns for its own use. When subsequent plugin functions receive a row, it is the full row containing all
     the extra columns, if any, added to the end.
is_archivable(row => \@row)
     This method is called for each row to determine whether it is archivable. This applies only to --source.
     The argument is the row itself, as an arrayref. If the method returns true, the row will be archived;
     otherwise it will be skipped.
     Skipping a row adds complications for non-unique indexes. Normally pt-archiver uses a WHERE clause
     designed to target the last processed row as the place to start the scan for the next SELECT statement. If
     you have skipped the row by returning false from is_archivable(), pt-archiver could get into an infinite
     loop because the row still exists. Therefore, when you specify a plugin for the --source argument, pt-
     archiver will change its WHERE clause slightly. Instead of starting at “greater than or equal to” the last
     processed row, it will start “strictly greater than.” This will work fine on unique indexes such as primary
     keys, but it may skip rows (leave holes) on non-unique indexes or when ascending only the first column
     of an index.
     pt-archiver will change the clause in the same way if you specify --no-delete, because again an
     infinite loop is possible.
     If you specify the --bulk-delete option and return false from this method, pt-archiver may not do
     what you want. The row won’t be archived, but it will be deleted, since bulk deletes operate on ranges of
     rows and don’t know which rows the plugin selected to keep.
     If you specify the --bulk-insert option, this method’s return value will influence whether the row
     is written to the temporary file for the bulk insert, so bulk inserts will work as expected. However, bulk
     inserts require bulk deletes.
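      For example, a plugin might archive only rows older than some cutoff. The sketch below assumes a
      hypothetical DATETIME column named created and the col_pos map built in the before_begin() sketch
      above; MySQL DATETIME strings compare correctly as plain strings:

      sub is_archivable {
         my ( $self, %args ) = @_;
         my $row = $args{row};                  # arrayref, in allcols order
         my $pos = $self->{col_pos}{created};   # hypothetical column
         # Archive anything created before a cutoff string such as
         # '2012-01-01 00:00:00' that was computed and stored in new().
         return $row->[$pos] lt $self->{cutoff};
      }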
before_delete(row => @row)
     This method is called for each row just before it is deleted. This applies only to --source. This is a
     good place for you to handle dependencies, such as deleting things that are foreign-keyed to the row you
     are about to delete. You could also use this to recursively archive all dependent tables.
     This plugin method is called even if --no-delete is given, but not if --bulk-delete is given.
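      For example, a --source plugin could remove rows in a hypothetical child table that are foreign-keyed
      to the row about to be deleted, using the handle passed to new() and the col_pos map from the
      before_begin() sketch above (pt-archiver, not the plugin, commits the handle):

      sub before_delete {
         my ( $self, %args ) = @_;
         my $id = $args{row}[ $self->{col_pos}{id} ];   # hypothetical key column
         # Delete dependent rows first so the parent row can be deleted.
         $self->{dbh}->do(
            'DELETE FROM child_table WHERE parent_id = ?', undef, $id );
      }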
before_bulk_delete(first_row => @row, last_row => @row)
     This method is called just before a bulk delete is executed. It is similar to the before_delete method,
     except its arguments are the first and last row of the range to be deleted. It is called even if --no-delete
     is given.
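      A bulk version of the previous sketch could delete the dependent rows for the whole range in one
      statement, again with hypothetical table and column names:

      sub before_bulk_delete {
         my ( $self, %args ) = @_;
         my $pos = $self->{col_pos}{id};   # hypothetical key column, ascending
         $self->{dbh}->do(
            'DELETE FROM child_table WHERE parent_id BETWEEN ? AND ?',
            undef, $args{first_row}[$pos], $args{last_row}[$pos] );
      }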
before_insert(row => @row)
     This method is called for each row just before it is inserted. This applies only to --dest. You could
     use this to insert the row into multiple tables, perhaps with an ON DUPLICATE KEY UPDATE clause to
     build summary tables in a data warehouse.
     This method is not called if --bulk-insert is given.
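      For example, a --dest plugin could maintain a hypothetical per-day summary table next to the archive
      table (this assumes a created DATETIME column and a unique key on archive_day):

      sub before_insert {
         my ( $self, %args ) = @_;
         # Take the date part of the hypothetical DATETIME column as the key.
         my $day = substr( $args{row}[ $self->{col_pos}{created} ], 0, 10 );
         $self->{dbh}->do(
            'INSERT INTO daily_counts (archive_day, rows_archived) VALUES (?, 1)
             ON DUPLICATE KEY UPDATE rows_archived = rows_archived + 1',
            undef, $day );
      }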
before_bulk_insert(first_row => @row, last_row => @row, filename => bulk_insert_filename)
     This method is called just before a bulk insert is executed. It is similar to the before_insert method,
      except its arguments are the first and last row of the range to be inserted.
custom_sth(row => @row, sql => $sql)
      This method is called just before inserting the row, but after "before_insert()". It allows the plugin to
      specify a different INSERT statement if desired. The return value (if any) should be a DBI statement
      handle. The sql parameter is the SQL text used to prepare the default INSERT statement. This method
      is not called if you specify --bulk-insert.
      If no value is returned, the default INSERT statement handle is used.
      This method applies only to the plugin specified for --dest, so if your plugin isn't doing what you
      expect, check that you've specified it for the destination and not the source.
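      For example, a --dest plugin could rewrite the default statement as INSERT IGNORE and return a handle
      prepared from the modified SQL (this assumes the default statement begins with the INSERT keyword; the
      returned handle takes the same bind values because only the verb changes):

      sub custom_sth {
         my ( $self, %args ) = @_;
         ( my $sql = $args{sql} ) =~ s/\AINSERT\b/INSERT IGNORE/;
         # Returning nothing would make pt-archiver keep its default handle.
         return $self->{dbh}->prepare($sql);
      }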
custom_sth_bulk(first_row => @row, last_row => @row, sql => $sql, filename => $bulk_insert_filename)
       If you've specified --bulk-insert, this method is called just before the bulk insert, but after
       "before_bulk_insert()", and the arguments are different.
       This method's return value is treated the same way as the return value of "custom_sth()".
after_finish()
      This method is called after pt-archiver exits the archiving loop, commits all database handles, closes
      --file, and prints the final statistics, but before pt-archiver runs ANALYZE or OPTIMIZE (see
      --analyze and --optimize).
If you specify a plugin for both --source and --dest, pt-archiver constructs, calls before_begin(), and calls
after_finish() on the two plugins in the order --source, --dest.
pt-archiver assumes it controls transactions, and that the plugin will NOT commit or roll back the database handle.
The database handle passed to the plugin’s constructor is the same handle pt-archiver uses itself. Remember that
--source and --dest are separate handles.
A sample module might look like this:
package My::Module;

sub new {
   my ( $class, %args ) = @_;
   return bless(\%args, $class);
}

sub before_begin {
   my ( $self, %args ) = @_;
   # Save column names for later
   $self->{cols} = $args{cols};
}

sub is_archivable {
   my ( $self, %args ) = @_;
   # Do some advanced logic with $args{row}
   return 1;
}

sub before_delete {}   # Take no action
sub before_insert {}   # Take no action
sub custom_sth    {}   # Take no action
sub after_finish  {}   # Take no action

1;



2.2.10 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:


PTDEBUG=1 pt-archiver ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.2.11 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.2.12 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-archiver.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.2.13 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.2.14 AUTHORS

Baron Schwartz


2.2.15 ACKNOWLEDGMENTS

Andrew O’Brien




2.2.16 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.2.17 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.2.18 VERSION

pt-archiver 2.1.1


2.3 pt-config-diff

2.3.1 NAME

pt-config-diff - Diff MySQL configuration files and server variables.


2.3.2 SYNOPSIS

Usage

pt-config-diff [OPTION...] CONFIG CONFIG [CONFIG...]

pt-config-diff diffs MySQL configuration files and server variables. CONFIG can be a filename or a DSN. At least
two CONFIG sources must be given. Like standard Unix diff, there is no output if there are no differences.
Diff host1 config from SHOW VARIABLES against host2:
pt-config-diff h=host1 h=host2

Diff config from [mysqld] section in my.cnf against host1 config:
pt-config-diff /etc/my.cnf h=host1

Diff the [mysqld] section of two option files:




pt-config-diff /etc/my-small.cnf /etc/my-large.cnf



2.3.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-config-diff only reads and examines MySQL's configuration, so it is very low risk.
At the time of this release there are no known bugs that pose a serious risk.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-config-diff.
See also “BUGS” for more information on filing bugs and getting help.


2.3.4 DESCRIPTION

pt-config-diff diffs MySQL configurations by examining the values of server system variables from two or more
CONFIG sources specified on the command line. A CONFIG source can be a DSN or a filename containing the
output of mysqld --help --verbose, my_print_defaults, SHOW VARIABLES, or an option file (e.g.
my.cnf).
For each DSN CONFIG, pt-config-diff connects to MySQL and gets variables and values by executing SHOW
/*!40103 GLOBAL*/ VARIABLES. This is an “active config” because it shows what server values MySQL is
actively (currently) running with.
Only variables that all CONFIG sources have are compared because if a variable is not present then we cannot know
or safely guess its value. For example, if you compare an option file (e.g. my.cnf) to an active config (i.e. SHOW
VARIABLES from a DSN CONFIG), the option file will probably only have a few variables, whereas the active config
has every variable. Only values of the variables present in both configs are compared.
Option file and DSN configs provide the best results.
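The comparison itself is conceptually simple. The following Perl sketch is not the tool's code, but it illustrates the
idea for two DSN configs, using hypothetical hosts and credentials: fetch the active config from each server, keep
only the variables both configs have, and print any values that differ:
use strict;
use warnings;
use DBI;

# Hypothetical hosts and credentials; substitute your own.
my @configs;
for my $host (qw(host1 host2)) {
   my $dbh  = DBI->connect("DBI:mysql:host=$host", 'user', 'pass',
                           { RaiseError => 1 });
   my $rows = $dbh->selectall_arrayref('SHOW /*!40103 GLOBAL*/ VARIABLES');
   # "Active config": variable name => value.
   push @configs, { map { $_->[0] => defined $_->[1] ? $_->[1] : '' } @$rows };
}

# Compare only the variables that both configs have.
for my $var ( sort keys %{ $configs[0] } ) {
   next unless exists $configs[1]{$var};
   next if $configs[0]{$var} eq $configs[1]{$var};
   printf "%-30s %-25s %-25s\n", $var, $configs[0]{$var}, $configs[1]{$var};
}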


2.3.5 OUTPUT

There is no output when there are no differences. When there are differences, pt-config-diff prints a report to STDOUT
that looks similar to the following:
2 config differences
Variable                            my.master.cnf         my.slave.cnf
=========================           ===============       ===============
datadir                             /tmp/12345/data       /tmp/12346/data
port                                12345                 12346

Comparing MySQL variables is difficult because there are many variations and subtleties across the many versions
and distributions of MySQL. When a comparison fails, the tool prints a warning to STDERR, such as the following:
Comparing log_error values (mysqld.log, /tmp/12345/data/mysqld.log)
caused an error: Argument "/tmp/12345/data/mysqld.log" isn’t numeric
in numeric eq (==) at ./pt-config-diff line 2311.

Please report these warnings so the comparison functions can be improved.



2.3.6 EXIT STATUS

pt-config-diff exits with a zero exit status when there are no differences, and 1 if there are.


2.3.7 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
       Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
       option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
       on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--config
    type: Array
       Read this comma-separated list of config files; if specified, this must be the first option on the command line.
       (This option does not specify a CONFIG source to diff; it reads configuration files for the tool itself.)
--daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
--defaults-file
    short form: -F; type: string
       Only read mysql options from the given file. You must give an absolute pathname.
--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--ignore-variables
    type: array
       Ignore, do not compare, these variables.
--password
    short form: -p; type: string
       Password to use for connection.
--pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
--port
    short form: -P; type: int
       Port number to use for connection.




--[no]report
    default: yes
      Print the MySQL config diff report to STDOUT. If you just want to check if the given configs are different or
      not by examining the tool’s exit status, then specify --no-report to suppress the report.
--report-width
    type: int; default: 78
      Truncate report lines to this many characters. Since some variable values can be long, or when comparing
      multiple configs, it may help to increase the report width so values are not truncated beyond readability.
--set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
--socket
    short form: -S; type: string
      Socket file to use for connection.
--user
    short form: -u; type: string
      MySQL user if not current user.
--version
    Show version and exit.


2.3.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
   • D
      dsn: database; copy: yes
      Default database.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • p




       dsn: password; copy: yes
       Password to use when connecting.
     • P
       dsn: port; copy: yes
       Port number to use for connection.
     • S
       dsn: mysql_socket; copy: yes
       Socket file to use for connection.
     • u
       dsn: user; copy: yes
       User for login if not current user.


2.3.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-config-diff ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.3.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.3.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-config-diff.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
     • Complete command-line used to run the tool
     • Tool --version
     • MySQL version of all servers involved
     • Output from the tool including STDERR
     • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.3.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:




wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.3.13 AUTHORS

Baron Schwartz and Daniel Nichter


2.3.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.3.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.3.16 VERSION

pt-config-diff 2.1.1


2.4 pt-deadlock-logger

2.4.1 NAME

pt-deadlock-logger - Extract and log MySQL deadlock information.




2.4.2 SYNOPSIS

Usage

pt-deadlock-logger [OPTION...] SOURCE_DSN

pt-deadlock-logger extracts and saves information about the most recent deadlock in a MySQL server.
Print deadlocks on SOURCE_DSN:
pt-deadlock-logger SOURCE_DSN

Store deadlock information from SOURCE_DSN in test.deadlocks table on SOURCE_DSN (source and destination
are the same host):
pt-deadlock-logger SOURCE_DSN --dest D=test,t=deadlocks

Store deadlock information from SOURCE_DSN in test.deadlocks table on DEST_DSN (source and destination are
different hosts):
pt-deadlock-logger SOURCE_DSN --dest DEST_DSN,D=test,t=deadlocks

Daemonize and check for deadlocks on SOURCE_DSN every 30 seconds for 4 hours:
pt-deadlock-logger SOURCE_DSN --dest D=test,t=deadlocks --daemonize --run-time 4h --interval 30s



2.4.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-deadlock-logger is a read-only tool unless you specify a --dest table. In some cases polling SHOW INNODB
STATUS too rapidly can cause extra load on the server. If you’re using it on a production server under very heavy
load, you might want to set --interval to 30 seconds or more.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-deadlock-logger.
See also “BUGS” for more information on filing bugs and getting help.


2.4.4 DESCRIPTION

pt-deadlock-logger extracts deadlock data from a MySQL server. Currently only InnoDB deadlock information is
available. You can print the information to standard output, store it in a database table, or both. If neither --print
nor --dest are given, then the deadlock information is printed by default. If only --dest is given, then the deadlock
information is only stored. If both options are given, then the deadlock information is printed and stored.
The source host can be specified using one of two methods. The first method is to use at least one of the standard
connection-related command line options: --defaults-file, --password, --host, --port, --socket
or --user. These options only apply to the source host; they cannot be used to specify the destination host.
The second method to specify the source host, or the optional destination host using --dest, is a DSN. A
DSN is a special syntax that can be either just a hostname (like server.domain.com or 1.2.3.4), or a
key=value,key=value string. Keys are a single letter:



KEY    MEANING
===    =======
h      Connect to host
P      Port number to use for connection
S      Socket file to use for connection
u      User for login if not current user
p      Password to use when connecting
F      Only read default options from the given file

If you omit any values from the destination host DSN, they are filled in with values from the source host, so you don’t
need to specify them in both places. pt-deadlock-logger reads all normal MySQL option files, such as ~/.my.cnf, so
you may not need to specify username, password and other common options at all.
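Conceptually, the DSN handling works like the following sketch (not the tool's code): parse each key=value list into
a hash, then copy into the destination any key it is missing from the source:
# Parse a key=value,key=value DSN string into a hash.
sub parse_dsn {
   my ($dsn) = @_;
   my %dsn = map { split /=/, $_, 2 } split /,/, $dsn;
   return \%dsn;
}

# Hypothetical source and destination DSNs.
my $source = parse_dsn('h=host1,P=3306,u=monitor,p=secret');
my $dest   = parse_dsn('h=host2,D=test,t=deadlocks');

# Keys missing from the destination are filled in from the source,
# so u and p do not have to be repeated for --dest.
for my $key ( keys %$source ) {
   $dest->{$key} = $source->{$key} unless exists $dest->{$key};
}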


2.4.5 OUTPUT

You can choose which columns are output and/or saved to --dest with the --columns argument. The default
columns are as follows:
server
         The (source) server on which the deadlock occurred. This might be useful if you’re tracking deadlocks on
         many servers.
ts
         The date and time of the last detected deadlock.
thread
         The MySQL thread number, which is the same as the connection ID in SHOW FULL PROCESSLIST.
txn_id
         The InnoDB transaction ID, which InnoDB expresses as two unsigned integers. I have multiplied them
         out to be one number.
txn_time
         How long the transaction was active when the deadlock happened.
user
         The connection’s database username.
hostname
         The connection’s host.
ip
         The connection's IP address. If you specify --numeric-ip, this is converted to an unsigned integer (see the
         sketch after this column list).
db
         The database in which the deadlock occurred.
tbl
         The table on which the deadlock occurred.
idx
         The index on which the deadlock occurred.
lock_type


        The lock type the transaction held on the lock that caused the deadlock.
lock_mode
        The lock mode of the lock that caused the deadlock.
wait_hold
        Whether the transaction was waiting for the lock or holding the lock. Usually you will see the two waited-
        for locks.
victim
        Whether the transaction was selected as the deadlock victim and rolled back.
query
        The query that caused the deadlock.
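The --numeric-ip conversion mentioned for the ip column above is the usual dotted-quad to unsigned 32-bit integer
mapping. In Perl it can be done as follows (a sketch of the conversion, not necessarily the tool's exact code):
use Socket qw(inet_aton);

# Dotted quad to unsigned 32-bit integer, e.g. 192.168.1.10 -> 3232235786.
my $numeric_ip = unpack 'N', inet_aton('192.168.1.10');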


2.4.6 INNODB CAVEATS AND DETAILS

InnoDB’s output is hard to parse and sometimes there’s no way to do it right.
Sometimes not all information (for example, username or IP address) is included in the deadlock information. In this
case there’s nothing for the script to put in those columns. It may also be the case that the deadlock output is so long
(because there were a lot of locks) that the whole thing is truncated.
Though there are usually two transactions involved in a deadlock, there are more locks than that; at a minimum,
one more lock than transactions is necessary to create a cycle in the waits-for graph. pt-deadlock-logger prints the
transactions (always two in the InnoDB output, even when there are more transactions in the waits-for graph than that)
and fills in locks. It prefers waited-for over held when choosing lock information to output, but you can figure out the
rest with a moment’s thought. If you see one wait-for and one held lock, you’re looking at the same lock, so of course
you’d prefer to see both wait-for locks and get more information. If the two waited-for locks are not on the same table,
more than two transactions were involved in the deadlock.


2.4.7 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
        Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
        option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
        on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--clear-deadlocks
    type: string
        Use this table to create a small deadlock. This usually has the effect of clearing out a huge deadlock, which
        otherwise consumes the entire output of SHOW INNODB STATUS. The table must not exist. pt-deadlock-
        logger will create it with the following MAGIC_clear_deadlocks structure:
        CREATE TABLE test.deadlock_maker(a INT PRIMARY KEY) ENGINE=InnoDB;

        After creating the table and causing a small deadlock, the tool will drop the table again.




--[no]collapse
    Collapse whitespace in queries to a single space. This might make it easier to inspect on the command line or
    in a query. By default, whitespace is collapsed when printing with --print, but not modified when storing to
    --dest. (That is, the default is different for each action).
--columns
    type: hash
      Output only this comma-separated list of columns. See “OUTPUT” for more details on columns.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--create-dest-table
    Create the table specified by --dest.
      Normally the --dest table is expected to exist already. This option causes pt-deadlock-logger to create the
      table automatically using the suggested table structure.
--daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
--defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
--dest
    type: DSN
      DSN for where to store deadlocks; specify at least a database (D) and table (t).
      Missing values are filled in with the same values from the source host, so you can usually omit most parts of
      this argument if you’re storing deadlocks on the same server on which they happen.
      By default, whitespace in the query column is left intact; use --[no]collapse if you want whitespace
      collapsed.
      The following MAGIC_dest_table is suggested if you want to store all the information pt-deadlock-logger can
      extract about deadlocks:
      CREATE TABLE deadlocks (
        server char(20) NOT NULL,
        ts datetime NOT NULL,
        thread int unsigned NOT NULL,
        txn_id bigint unsigned NOT NULL,
        txn_time smallint unsigned NOT NULL,
        user char(16) NOT NULL,
        hostname char(20) NOT NULL,
        ip char(15) NOT NULL, -- alternatively, ip int unsigned NOT NULL
        db char(64) NOT NULL,
        tbl char(64) NOT NULL,
        idx char(64) NOT NULL,
        lock_type char(16) NOT NULL,
        lock_mode char(1) NOT NULL,
        wait_hold char(1) NOT NULL,
        victim tinyint unsigned NOT NULL,
        query text NOT NULL,
        PRIMARY KEY (server,ts,thread)
      ) ENGINE=InnoDB




       If you use --columns, you can omit whichever columns you don’t want to store.
--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--interval
    type: time
       How often to check for deadlocks. If no --run-time is specified, pt-deadlock-logger runs forever, checking
       for deadlocks at every interval. See also --run-time.
--log
       type: string
       Print all output to this file when daemonized.
--numeric-ip
    Express IP addresses as integers.
--password
    short form: -p; type: string
       Password to use when connecting.
--pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
--port
    short form: -P; type: int
       Port number to use for connection.
--print
    Print results on standard output. See “OUTPUT” for more. By default, enables --[no]collapse unless you
    explicitly disable it.
       If --interval or --run-time is specified, only new deadlocks are printed at each interval. A fingerprint
       for each deadlock is created using --columns server, ts and thread (even if those columns were not specified
       by --columns) and if the current deadlock’s fingerprint is different from the last deadlock’s fingerprint, then
       it is printed.
--run-time
    type: time
       How long to run before exiting. By default pt-deadlock-logger runs once, checks for deadlocks, and exits. If
       --run-time is specified but no --interval is specified, a default 1 second interval will be used.
--set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
--socket
    short form: -S; type: string



       Socket file to use for connection.
--tab
       Print tab-separated columns, instead of aligned.
--user
    short form: -u; type: string
       User for login if not current user.
--version
    Show version and exit.


2.4.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
       dsn: charset; copy: yes
       Default character set.
   • D
       dsn: database; copy: yes
       Default database.
   • F
       dsn: mysql_read_default_file; copy: yes
       Only read default options from the given file
   • h
       dsn: host; copy: yes
       Connect to host.
   • p
       dsn: password; copy: yes
       Password to use when connecting.
   • P
       dsn: port; copy: yes
       Port number to use for connection.
   • S
       dsn: mysql_socket; copy: yes
       Socket file to use for connection.
   • t
       Table in which to store deadlock information.
   • u


       dsn: user; copy: yes
       User for login if not current user.


2.4.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-deadlock-logger ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.4.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.4.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-deadlock-logger.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
     • Complete command-line used to run the tool
     • Tool --version
     • MySQL version of all servers involved
     • Output from the tool including STDERR
     • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.4.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.4.13 AUTHORS

Baron Schwartz


2.4.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.4.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.4.16 VERSION

pt-deadlock-logger 2.1.1


2.5 pt-diskstats

2.5.1 NAME

pt-diskstats - An interactive I/O monitoring tool for GNU/Linux.


2.5.2 SYNOPSIS

Usage

pt-diskstats [OPTION...] [FILES]

pt-diskstats prints disk I/O statistics for GNU/Linux. It is somewhat similar to iostat, but it is interactive and more
detailed. It can analyze samples gathered from another machine.


2.5.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-diskstats simply reads /proc/diskstats. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.


The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-diskstats.
See also “BUGS” for more information on filing bugs and getting help.


2.5.4 DESCRIPTION

The pt-diskstats tool is similar to iostat, but has some advantages. It prints read and write statistics separately, and
has more columns. It is menu-driven and interactive, with several different ways to aggregate the data. It integrates
well with the pt-stalk tool. It also does the “right thing” by default, such as hiding disks that are idle. These properties
make it very convenient for quickly drilling down into I/O performance and inspecting disk behavior.
This program works in two modes. The default is to collect samples of /proc/diskstats and print out the formatted
statistics at intervals. The other mode is to process a file that contains saved samples of /proc/diskstats; there is a shell
script later in this documentation that shows how to collect such a file.
In both cases, the tool is interactively controlled by keystrokes, so you can redisplay and slice the data flexibly and
easily. It loops forever, until you exit with the ‘q’ key. If you press the ‘?’ key, you will bring up the interactive help
menu that shows which keys control the program.
When the program is gathering samples of /proc/diskstats and refreshing its display, it prints information about the
newest sample each time it refreshes. When it is operating on a file of saved samples, it redraws the entire file’s
contents every time you change an option.
The program doesn’t print information about every block device on the system. It hides devices that it has never
observed to have any activity. You can enable and disable this by pressing the ‘i’ key.


2.5.5 OUTPUT

In the rest of this documentation, we will try to clarify the distinction between block devices (/dev/sda1, for example),
which the kernel presents to the application via a filesystem, versus the (usually) physical device underneath the block
device, which could be a disk, a RAID controller, and so on. We will sometimes refer to logical I/O operations, which
occur at the block device, versus physical I/Os which are performed on the underlying device. When we refer to the
queue, we are speaking of the queue associated with the block device, which holds requests until they’re issued to the
physical device.
The program’s output looks like the following sample, which is too wide for this manual page, so we have formatted
it as several samples with line breaks:
#ts   device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc                     rd_rt
{6}   sda     0.9     4.2     0.0     0%    0.0                      17.9
{6}   sdb     0.4     4.0     0.0     0%    0.0                      26.1
{6}   dm-0    0.0     4.0     0.0     0%    0.0                      13.5
{6}   dm-1    0.8     4.0     0.0     0%    0.0                      16.0

      ...      wr_s wr_avkb wr_mb_s wr_mrg wr_cnc                   wr_rt
      ...      99.7     6.2     0.6    35%    3.7                    23.7
      ...      14.5    15.8     0.2    75%    0.5                     9.2
      ...       1.0     4.0     0.0     0%    0.0                     2.3
      ...     117.7     4.0     0.5     0%    4.1                    35.1

      ...                    busy in_prg          io_s     qtime stime
      ...                      6%      0         100.6      23.3   0.4
      ...                      4%      0          14.9       8.6   0.6
      ...                      0%      0           1.1       1.5   1.2
      ...                      5%      0         118.5      34.5   0.4


The columns are as follows:
#ts
        This column’s contents vary depending on the tool’s aggregation mode. In the default mode, when each
        line contains information about a single disk but possibly aggregates across several samples from that disk,
        this column shows the number of samples that were included into the line of output, in {curly braces}. In
        the example shown, each line of output aggregates {6} samples of /proc/diskstats.
        In the “all” group-by mode, this column shows timestamp offsets, relative to the time the tool began
        aggregating or the timestamp of the previous lines printed, depending on the mode. The output can be
        confusing to explain, but it’s rather intuitive when you see the lines appearing on your screen periodically.
        Similarly, in “sample” group-by mode, the number indicates the total time span that is grouped into each
        sample.
        If you specify --show-timestamps, this field instead shows the timestamp at which the sample was
        taken; if multiple timestamps are present in a single line of output, then the first timestamp is used.
device
        The device name. If there is more than one device, then instead the number of devices aggregated into the
        line is shown, in {curly braces}.
rd_s
        The average number of reads per second. This is the number of I/O requests that were sent to the underly-
        ing device. This usually is a smaller number than the number of logical IO requests made by applications.
        More requests might have been queued to the block device, but some of them usually are merged before
        being sent to the disk.
        This field is computed from the contents of /proc/diskstats as follows. See “KERNEL DOCUMENTA-
        TION” below for the meaning of the field numbers:
        delta[field1] / delta[time]

rd_avkb
        The average size of the reads, in kilobytes. This field is computed as follows:
        delta[field3] / (2 * delta[field1])

rd_mb_s
        The average number of megabytes read per second. Computed as follows:
        delta[field3] / (2 * 1024 * delta[time])

rd_mrg
        The percentage of read requests that were merged together in the queue scheduler before being sent to the
        physical device. The field is computed as follows:
        100 * delta[field2] / (delta[field2] + delta[field1])

rd_cnc
        The average concurrency of the read operations, as computed by Little’s Law. This is the end-to-end
        concurrency on the block device, not the underlying disk’s concurrency. It includes time spent in the
        queue. The field is computed as follows:
        delta[field4] / delta[time] / 1000 / devices-in-group

rd_rt


        The average response time of the read operations, in milliseconds. This is the end-to-end response time,
        including time spent in the queue. It is the response time that the application making I/O requests sees,
        not the response time of the physical disk underlying the block device. It is computed as follows:
        delta[field4] / (delta[field1] + delta[field2])

wr_s, wr_avkb, wr_mb_s, wr_mrg, wr_cnc, wr_rt
        These columns show write activity, and they match the corresponding columns for read activity.
busy
        The fraction of wall-clock time that the device had at least one request in progress; this is what iostat
        calls %util, and indeed it is utilization, depending on how you define utilization, but that is sometimes
        ambiguous in common parlance. It may also be called the residence time; the time during which at least
        one request was resident in the system. It is computed as follows:
        100 * delta[field10] / (1000 * delta[time])

        This field cannot exceed 100% unless there is a rounding error, but it is a common mistake to think that a
        device that’s busy all the time is saturated. A device such as a RAID volume should support concurrency
        higher than 1, and solid-state drives can support very high concurrency. Concurrency can grow without
        bound, and is a more reliable indicator of how loaded the device really is.
in_prg
        The number of requests that were in progress. Unlike the read and write concurrencies, which are averages
        that are generated from reliable numbers, this number is an instantaneous sample, and you can see that
        it might represent a spike of requests, rather than the true long-term average. If this number is large, it
        essentially means that the device is heavily loaded. It is computed as follows:
        field9

io_s
        The average throughput of the physical device, in I/O operations per second (IOPS). This column shows
        the total IOPS the underlying device is handling. It is the sum of rd_s and wr_s.
qtime
        The average queue time; that is, time a request spends in the device scheduler queue before being sent to
        the physical device. This is an average over reads and writes.
        It is computed in a slightly complex way: the average response time seen by the application, minus the
        average service time (see the description of the next column). This is derived from the queueing theory
        formula for response time, R = W + S: response time = queue time + service time. This is solved for W,
        of course, to give W = R - S. The computation follows:
        delta[field11] / (delta[field1, 2, 5, 6] + delta[field9])
           - delta[field10] / delta[field1, 2, 5, 6]

        See the description for stime for more details and cautions.
stime
        The average service time; that is, the time elapsed while the physical device processes the request, after
        the request finishes waiting in the queue. This is an average over reads and writes. It is computed from
        the queueing theory utilization formula, U = SX, solved for S. This means that utilization divided by
        throughput gives service time:
        delta[field10] / (delta[field1, 2, 5, 6])




      Note, however, that there can be some kernel bugs that cause field 9 in /proc/diskstats to become negative,
      and this can cause field 10 to be wrong, thus making the service time computation not wholly trustworthy.
      Note that in the above formula we use utilization very specifically. It is a duration, not a percentage.
      You can compare the stime and qtime columns to see whether the response time for reads and writes is
      spent in the queue or on the physical device. However, you cannot see the difference between reads and
      writes. Changing the block device scheduler algorithm might improve queue time greatly. The default
      algorithm, cfq, is very bad for servers, and should only be used on laptops and workstations that perform
      tasks such as working with spreadsheets and surfing the Internet.
If you are used to using iostat, you might wonder where you can find the same information in pt-diskstats. Here are
two samples of output from both tools on the same machine at the same time, for /dev/sda, wrapped to fit:
     #ts    dev rd_s rd_avkb rd_mb_s rd_mrg rd_cnc                   rd_rt
08:50:10    sda 0.0      0.0     0.0     0%    0.0                     0.0
08:50:20    sda 0.4      4.0     0.0     0%    0.0                    15.5
08:50:30    sda 2.1      4.4     0.0     0%    0.0                    21.1
08:50:40    sda 2.4      4.0     0.0     0%    0.0                    15.4
08:50:50    sda 0.1      4.0     0.0     0%    0.0                    33.0

                  wr_s wr_avkb wr_mb_s wr_mrg wr_cnc                 wr_rt
                   7.7    25.5     0.2    84%    0.0                   0.3
                  49.6     6.8     0.3    41%    2.4                  28.8
                 210.1     5.6     1.1    28%    7.4                  25.2
                 297.1     5.4     1.6    26%   11.4                  28.3
                  11.9    11.7     0.1    66%    0.2                   4.9

                            busy     in_prg      io_s     qtime      stime
                              1%          0       7.7       0.1        0.2
                              6%          0      50.0      28.1        0.7
                             12%          0     212.2      24.8        0.4
                             16%          0     299.5      27.8        0.4
                              1%          0      12.0       4.7        0.3

            Dev rrqm/s      wrqm/s      r/s    w/s        rMB/s    wMB/s
08:50:10    sda   0.00       41.40     0.00   7.70         0.00     0.19
08:50:20    sda   0.00       34.70     0.40 49.60          0.00     0.33
08:50:30    sda   0.00       83.30     2.10 210.10         0.01     1.15
08:50:40    sda   0.00      105.10     2.40 297.90         0.01     1.58
08:50:50    sda   0.00       22.50     0.10 11.10          0.00     0.13

                     avgrq-sz avgqu-sz          await     svctm    %util
                        51.01     0.02           2.04      1.25     0.96
                        13.55     2.44          48.76      1.16     5.79
                        11.15     7.45          35.10      0.55    11.76
                        10.81    11.40          37.96      0.53    15.97
                        24.07     0.17          15.60      0.87     0.97

The correspondence between the columns is not one-to-one. In particular:
rrqm/s, wrqm/s
      These columns in iostat are replaced by rd_mrg and wr_mrg in pt-diskstats.
avgrq-sz
      This column is in sectors in iostat, and is a combination of reads and writes. The pt-diskstats output
      breaks these out separately and shows them in kB. You can derive it via a weighted average of rd_avkb
      and wr_avkb in pt-diskstats, and then multiply by 2 to get sectors (each sector is 512 bytes).



avgqu-sz
        This column really represents concurrency at the block device scheduler. The pt-diskstats output shows
        concurrency for reads and writes separately: rd_cnc and wr_cnc.
await
        This column is the average response time from the beginning to the end of a request to the block device,
        including queue time and service time, and is not shown in pt-diskstats. Instead, pt-diskstats shows
        individual response times at the disk level for reads and writes (rd_rt and wr_rt), as well as queue time
        versus service time for reads and writes in aggregate.
svctm
        This column is the average service time at the disk, and is shown as stime in pt-diskstats.
%util
        This column is called busy in pt-diskstats. Utilization is usually defined as the portion of time during
        which there was at least one active request, not as a percentage, which is why we chose to avoid this
        confusing term.


2.5.6 COLLECTING DATA

It is straightforward to gather a sample of data for this tool. Files should have this format, with a timestamp line
preceding each sample of statistics:
TS <timestamp>
<contents of /proc/diskstats>
TS <timestamp>
<contents of /proc/diskstats>
... et cetera

You can simply use pt-diskstats with --save-samples to collect this data for you. If you wish to capture samples
as part of some other tool, and use pt-diskstats to analyze them, you can include a snippet of shell script such as the
following:
INTERVAL=1
while true; do
   sleep=$(date +%s.%N | awk "{print $INTERVAL - (\$1 % $INTERVAL)}")
   sleep $sleep
   date +"TS %s.%N %F %T" >> diskstats-samples.txt
   cat /proc/diskstats >> diskstats-samples.txt
done



2.5.7 KERNEL DOCUMENTATION

This documentation supplements the official documentation (http://www.kernel.org/doc/Documentation/iostats.txt) on
the contents of /proc/diskstats. That documentation can sometimes be difficult to understand for those who are not
familiar with Linux kernel internals. The contents of /proc/diskstats are generated by the diskstats_show()
function in the kernel source file block/genhd.c.
Here is a sample of /proc/diskstats on a recent kernel.
8 1 sda1 426 243 3386 2056 3 0 18 87 0 2135 2142

The fields in this sample are as follows. The first three fields are the major and minor device numbers (8, 1), and the
device name (sda1). They are followed by 11 fields of statistics:



   1. The number of reads completed. This is the number of physical reads done by the underlying disk, not the
      number of reads that applications made from the block device. This means that 426 actual reads have completed
      successfully to the disk on which /dev/sda1 resides. Reads are not counted until they complete.
   2. The number of reads merged because they were adjacent. In the sample, 243 reads were merged. This means
       that /dev/sda1 actually received 669 logical reads, but sent only 426 physical reads to the underlying physical
      device.
   3. The number of sectors read successfully. The 426 physical reads to the disk read 3386 sectors. Sectors are 512
      bytes, so a total of about 1.65MB have been read from /dev/sda1.
   4. The number of milliseconds spent reading. This counts only reads that have completed, not reads that are in
      progress. It counts the time spent from when requests are placed on the queue until they complete, not the
      time that the underlying disk spends servicing the requests. That is, it measures the total response time seen by
      applications, not disk response times.
   5. Ditto for field 1, but for writes.
   6. Ditto for field 2, but for writes.
   7. Ditto for field 3, but for writes.
   8. Ditto for field 4, but for writes.
   9. The number of I/Os currently in progress, that is, they’ve been scheduled by the queue scheduler and issued to
      the disk (submitted to the underlying disk’s queue), but not yet completed. There are bugs in some kernels that
      cause this number, and thus fields 10 and 11, to be wrong sometimes.
 10. The total number of milliseconds spent doing I/Os. This is not the total response time seen by the applications;
     it is the total amount of time during which at least one I/O was in progress. If one I/O is issued at time 100,
     another comes in at 101, and both of them complete at 102, then this field increments by 2, not 3.
 11. This field counts the total response time of all I/Os. In contrast to field 10, it counts double when two I/Os
     overlap. In our previous example, this field would increment by 3, not 2.
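To make the field numbering concrete, the following Perl sketch (not the tool's code) takes the sample line above plus
a second, made-up sample of the same device taken 10 seconds later, and computes a few of the columns using the
formulas from "OUTPUT" above:
# Two samples of the same device line from /proc/diskstats, taken
# $elapsed seconds apart; the second sample's numbers are invented.
my @prev = split ' ', '8 1 sda1 426 243 3386 2056 3 0 18 87 0 2135 2142';
my @curr = split ' ', '8 1 sda1 526 283 4106 2556 5 0 34 107 0 2335 2642';
my $elapsed = 10;

# $delta[$n] is the change in field $n; the fields start after the major,
# minor and device-name columns ($delta[0] is unused).
my @delta = ( 0, map { $curr[ $_ + 2 ] - $prev[ $_ + 2 ] } 1 .. 11 );

my $rd_s    = $delta[1] / $elapsed;                          # reads per second
my $rd_avkb = $delta[3] / ( 2 * $delta[1] );                 # average read size, kB
my $rd_mrg  = 100 * $delta[2] / ( $delta[2] + $delta[1] );   # % of reads merged
my $rd_rt   = $delta[4] / ( $delta[1] + $delta[2] );         # read response time, ms
my $busy    = 100 * $delta[10] / ( 1000 * $elapsed );        # % of time busy

printf "rd_s %.1f  rd_avkb %.1f  rd_mrg %.0f%%  rd_rt %.1f  busy %.0f%%\n",
       $rd_s, $rd_avkb, $rd_mrg, $rd_rt, $busy;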


2.5.8 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--columns-regex
    type: string; default: .
      Print columns that match this Perl regex.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--devices-regex
    type: string
      Print devices that match this Perl regex.
--group-by
    type: string; default: disk
      Group-by mode: disk, sample, or all. In disk mode, each line of output shows one disk device. In sample mode,
      each line of output shows one sample of statistics. In all mode, each line of output shows one sample and one
      disk device.




--headers
    type: Hash; default: group,scroll
      If group is present, each sample will be separated by a blank line, unless the sample is only one line. If
      scroll is present, the tool will print the headers as often as needed to prevent them from scrolling out of view.
      Note that you can press the space bar, or the enter key, to reprint headers at will.
--help
    Show help and exit.
--interval
    type: int; default: 1
      When in interactive mode, wait N seconds before printing to the screen. Also, how often the tool should sample
      /proc/diskstats.
      The tool attempts to gather statistics exactly on even intervals of clock time. That is, if you specify a 5-second
      interval, it will try to capture samples at 12:00:00, 12:00:05, and so on; it will not gather at 12:00:01, 12:00:06
      and so forth.
      This can lead to slightly odd delays in some circumstances, because the tool waits one full cycle before printing
      out the first set of lines. (Unlike iostat and vmstat, pt-diskstats does not start with a line representing the
      averages since the computer was booted.) Therefore, the rule has an exception to avoid very long delays.
      Suppose you specify a 10-second interval, but you start the tool at 12:00:00.01. The tool might wait until
      12:00:20 to print its first lines of output, and in the intervening 19.99 seconds, it would appear to do nothing.
      To alleviate this, the tool waits until the next even interval of time to gather, unless more than 20% of that interval
      remains. This means the tool will never wait more than 120% of the sampling interval to produce output; e.g. if
      you start the tool at 12:00:53 with a 10-second sampling interval, then the first sample will be only 7 seconds
      long, not 10 seconds. (A short sketch of this scheduling rule appears after this option list.)
--iterations
    type: int
      When in interactive mode, stop after N samples. Run forever by default.
--sample-time
    type: int; default: 1
      In --group-by sample mode, include N seconds of samples per group.
--save-samples
    type: string
      File to save diskstats samples in; these can be used for later analysis.
--show-inactive
    Show inactive devices.
--show-timestamps
    Show a ‘HH:MM:SS’ timestamp in the #ts column. If multiple timestamps are aggregated into one line, the
    first timestamp is shown.
--version
    Show version and exit.
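The --interval scheduling rule above can be pictured with a small shell sketch (hypothetical, not part of the tool; it only estimates how long the first lines of output are delayed):
interval=10                                 # the --interval value, in seconds
now=$(date +%s)
remaining=$(( interval - now % interval ))  # seconds until the next even boundary
if [ "$remaining" -le $(( interval / 5 )) ]; then
    # 20% of the interval or less remains, so skip ahead one more full interval
    remaining=$(( remaining + interval ))
fi
echo "first output in about ${remaining}s"  # never more than 120% of the interval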


2.5.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:



PTDEBUG=1 pt-diskstats ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.5.10 SYSTEM REQUIREMENTS

This tool requires Perl v5.8.0 or newer and the /proc filesystem, unless reading from files.


2.5.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-diskstats.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.5.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.5.13 AUTHORS

Baron Schwartz, Brian Fraser, and Daniel Nichter


2.5.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.






2.5.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.5.16 VERSION

pt-diskstats 2.1.1


2.6 pt-duplicate-key-checker

2.6.1 NAME

pt-duplicate-key-checker - Find duplicate indexes and foreign keys on MySQL tables.


2.6.2 SYNOPSIS

Usage

pt-duplicate-key-checker [OPTION...] [DSN]

pt-duplicate-key-checker examines MySQL tables for duplicate or redundant indexes and foreign keys. Connection
options are read from MySQL option files.
pt-duplicate-key-checker --host host1



2.6.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-duplicate-key-checker is a read-only tool that executes SHOW CREATE TABLE and related queries to inspect
table structures, and thus is very low-risk.
At the time of this release, there is an unconfirmed bug that causes the tool to crash.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-duplicate-key-checker.
See also “BUGS” for more information on filing bugs and getting help.




2.6.4 DESCRIPTION

This program examines the output of SHOW CREATE TABLE on MySQL tables, and if it finds indexes that cover
the same columns as another index in the same order, or cover an exact leftmost prefix of another index, it prints
out the suspicious indexes. By default, indexes must be of the same type, so a BTREE index is not a duplicate of a
FULLTEXT index, even if they have the same columns. You can override this.
It also looks for duplicate foreign keys. A duplicate foreign key covers the same columns as another in the same table,
and references the same parent table.
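As a minimal illustration (the database, table, and index names here are hypothetical), the index idx_a below covers an exact leftmost prefix of idx_a_b, so the tool reports it as a duplicate and prints an ALTER TABLE ... DROP KEY statement for it:
mysql -e "CREATE TABLE test.dupe_example (
            a INT NOT NULL,
            b INT NOT NULL,
            KEY idx_a (a),
            KEY idx_a_b (a, b)
          ) ENGINE=InnoDB"
pt-duplicate-key-checker --databases test --tables dupe_example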


2.6.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--all-structs
    Compare indexes with different structs (BTREE, HASH, etc).
      By default this is disabled, because a BTREE index that covers the same columns as a FULLTEXT index is not
      really a duplicate, for example.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--[no]clustered
    default: yes
      PK columns appended to secondary key is duplicate.
      Detects when a suffix of a secondary key is a leftmost prefix of the primary key, and treats it as a duplicate
      key. Only detects this condition on storage engines whose primary keys are clustered (currently InnoDB and
      solidDB).
      Clustered storage engines append the primary key columns to the leaf nodes of all secondary keys anyway, so
      you might consider it redundant to have them appear in the internal nodes as well. Of course, you may also want
      them in the internal nodes, because just having them at the leaf nodes won’t help for some queries. It does help
      for covering index queries, however.
      Here’s an example of a key that is considered redundant with this option:
      PRIMARY KEY (`a`)
      KEY `b` (`b`,`a`)

      The use of such indexes is rather subtle. For example, suppose you have the following query:
      SELECT ... WHERE b=1 ORDER BY a;

      This query will do a filesort if we remove the index on b,a. But if we shorten the index on b,a to just b and
      also remove the ORDER BY, the query should return the same results.
      The tool suggests shortening duplicate clustered keys by dropping the key and re-adding it without the primary
      key prefix. The shortened clustered key may still duplicate another key, but the tool cannot currently detect
      when this happens without being run a second time to re-check the newly shortened clustered keys. Therefore,
      if you shorten any duplicate clustered keys, you should run the tool again.




--config
    type: Array
       Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--databases
    short form: -d; type: hash
       Check only this comma-separated list of databases.
--defaults-file
    short form: -F; type: string
       Only read mysql options from the given file. You must give an absolute pathname.
--engines
    short form: -e; type: hash
       Check only tables whose storage engine is in this comma-separated list.
--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--ignore-databases
    type: Hash
       Ignore this comma-separated list of databases.
--ignore-engines
    type: Hash
       Ignore this comma-separated list of storage engines.
--ignore-order
    Ignore index order so KEY(a,b) duplicates KEY(b,a).
--ignore-tables
    type: Hash
       Ignore this comma-separated list of tables. Table names may be qualified with the database name.
--key-types
    type: string; default: fk
       Check for duplicate f=foreign keys, k=keys or fk=both.
--password
    short form: -p; type: string
       Password to use when connecting.
--pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.






--port
    short form: -P; type: int
      Port number to use for connection.
--set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
--socket
    short form: -S; type: string
      Socket file to use for connection.
--[no]sql
    default: yes
      Print DROP KEY statement for each duplicate key. By default an ALTER TABLE DROP KEY statement is
      printed below each duplicate key so that, if you want to remove the duplicate key, you can copy-paste the
      statement into MySQL.
      To disable printing these statements, specify --no-sql.
--[no]summary
    default: yes
      Print summary of indexes at end of output.
--tables
    short form: -t; type: hash
      Check only this comma-separated list of tables.
      Table names may be qualified with the database name.
--user
    short form: -u; type: string
      User for login if not current user.
--verbose
    short form: -v
      Output all keys and/or foreign keys found, not just redundant ones.
--version
    Show version and exit.


2.6.6 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Default database.
     • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
     • h
      dsn: host; copy: yes
      Connect to host.
     • p
      dsn: password; copy: yes
      Password to use when connecting.
     • P
      dsn: port; copy: yes
      Port number to use for connection.
     • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
     • u
      dsn: user; copy: yes
      User for login if not current user.
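Putting these DSN parts together on the command line (the host, port, user, and database values are illustrative):
pt-duplicate-key-checker h=db1.example.com,P=3306,u=dba,D=app --ask-pass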


2.6.7 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-duplicate-key-checker ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.6.8 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.6.9 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-duplicate-key-checker.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
     • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.6.10 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.6.11 AUTHORS

Baron Schwartz and Daniel Nichter


2.6.12 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.6.13 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.6.14 VERSION

pt-duplicate-key-checker 2.1.1





2.7 pt-fifo-split

2.7.1 NAME

pt-fifo-split - Split files and pipe lines to a fifo without really splitting.


2.7.2 SYNOPSIS

Usage

pt-fifo-split [options] [FILE ...]

pt-fifo-split splits FILE and pipes lines to a fifo. With no FILE, or when FILE is -, read standard input.
Read hugefile.txt in chunks of a million lines without physically splitting it:
pt-fifo-split --lines 1000000 hugefile.txt
while [ -e /tmp/pt-fifo-split ]; do cat /tmp/pt-fifo-split; done



2.7.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-fifo-split creates and/or deletes the --fifo file. Otherwise, no other files are modified, and it merely reads lines
from the file given on the command-line. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-fifo-split.
See also “BUGS” for more information on filing bugs and getting help.


2.7.4 DESCRIPTION

pt-fifo-split lets you read from a file as though it contains only some of the lines in the file. When you read from it
again, it contains the next set of lines; when you have gone all the way through it, the file disappears. This works only
on Unix-like operating systems.
You can specify multiple files on the command line. If you don’t specify any, or if you use the special filename -,
lines are read from standard input.
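A common use (a sketch that assumes a dump file named dump.txt, the default fifo location, and a server with LOCAL INFILE enabled) is to load a huge file into MySQL in chunks rather than in one enormous statement:
pt-fifo-split --lines 100000 dump.txt &
while [ -e /tmp/pt-fifo-split ]; do
    mysql test -e "LOAD DATA LOCAL INFILE '/tmp/pt-fifo-split' INTO TABLE big_table"
done

Each pass through the loop reads the next 100,000 lines from the fifo, so the server never has to ingest the whole file at once.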


2.7.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.





--fifo
    type: string; default: /tmp/pt-fifo-split
       The name of the fifo from which the lines can be read.
--force
    Remove the fifo if it exists already, then create it again.
--help
    Show help and exit.
--lines
    type: int; default: 1000
       The number of lines to read in each chunk.
--offset
    type: int; default: 0
       Begin at the Nth line. If the argument is 0, all lines are printed to the fifo. If 1, then beginning at the first line,
       lines are printed (exactly the same as 0). If 2, the first line is skipped, and the 2nd and subsequent lines are
       printed to the fifo.
--pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
--statistics
    Print out statistics between chunks. The statistics are the number of chunks, the number of lines, elapsed time,
    and lines per second overall and during the last chunk.
--version
    Show version and exit.


2.7.6 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-fifo-split ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.7.7 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.7.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-fifo-split.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:




     • Complete command-line used to run the tool
     • Tool --version
     • MySQL version of all servers involved
     • Output from the tool including STDERR
     • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.7.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.7.10 AUTHORS

Baron Schwartz


2.7.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.7.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.





2.7.13 VERSION

pt-fifo-split 2.1.1


2.8 pt-find

2.8.1 NAME

pt-find - Find MySQL tables and execute actions, like GNU find.


2.8.2 SYNOPSIS

Usage

pt-find [OPTION...] [DATABASE...]

pt-find searches for MySQL tables and executes actions, like GNU find. The default action is to print the database
and table name.
Find all tables created more than a day ago, which use the MyISAM engine, and print their names:
pt-find --ctime +1 --engine MyISAM

Find InnoDB tables that haven’t been updated in a month, and convert them to MyISAM storage engine (data ware-
housing, anyone?):
pt-find --mtime +30 --engine InnoDB --exec "ALTER TABLE %D.%N ENGINE=MyISAM"

Find tables created by a process that no longer exists, following the name_sid_pid naming convention, and remove
them.
pt-find --connection-id '\D_\d+_(\d+)$' --server-id '\D_(\d+)_\d+$' --exec-plus "DROP TABLE %s"

Find empty tables in the test and junk databases, and delete them:
pt-find --empty junk test --exec-plus "DROP TABLE %s"

Find tables more than five gigabytes in total size:
pt-find --tablesize +5G

Find all tables and print their total data and index size, and sort largest tables first (sort is a different program, by the
way).
pt-find --printf "%T\t%D.%N\n" | sort -rn

As above, but this time, insert the data back into the database for posterity:
pt-find --noquote --exec "INSERT INTO sysdata.tblsize(db, tbl, size) VALUES('%D', '%N', %T)"



2.8.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.




pt-find only reads and prints information by default, but --exec and --exec-plus can execute user-defined SQL.
You should be as careful with it as you are with any command-line tool that can execute queries against your database.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-find.
See also “BUGS” for more information on filing bugs and getting help.


2.8.4 DESCRIPTION

pt-find looks for MySQL tables that pass the tests you specify, and executes the actions you specify. The default
action is to print the database and table name to STDOUT.
pt-find is simpler than GNU find. It doesn’t allow you to specify complicated expressions on the command line.
pt-find uses SHOW TABLES when possible, and SHOW TABLE STATUS when needed.


2.8.5 OPTION TYPES

There are three types of options: normal options, which determine some behavior or setting; tests, which determine
whether a table should be included in the list of tables found; and actions, which do something to the tables pt-find
finds.
pt-find uses standard Getopt::Long option parsing, so you should use double dashes in front of long option names,
unlike GNU find.


2.8.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--case-insensitive
    Specifies that all regular expression searches are case-insensitive.
--charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--day-start
    Measure times (for --mmin, etc) from the beginning of today rather than from the current time.
--defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.






--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--or
       Combine tests with OR, not AND.
       By default, tests are evaluated as though there were an AND between them. This option switches it to OR.
       Option parsing is not implemented by pt-find itself, so you cannot specify complicated expressions with paren-
       theses and mixtures of OR and AND.
--password
    short form: -p; type: string
       Password to use when connecting.
--pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
--port
    short form: -P; type: int
       Port number to use for connection.
--[no]quote
    default: yes
       Quotes MySQL identifier names with MySQL’s standard backtick character.
       Quoting happens after tests are run, and before actions are run.
--set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
--socket
    short form: -S; type: string
       Socket file to use for connection.
--user
    short form: -u; type: string
       User for login if not current user.
--version
    Show version and exit.






2.8.7 TESTS

Most tests check some criterion against a column of SHOW TABLE STATUS output. Numeric arguments can be
specified as +n for greater than n, -n for less than n, and n for exactly n. All numeric options can take an optional
suffix multiplier of k, M or G (1_024, 1_048_576, and 1_073_741_824 respectively). All patterns are Perl regular
expressions (see ‘man perlre’) unless specified as SQL LIKE patterns.
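For example, numeric tests combine with the suffix multipliers and with each other (a sketch; the engine and thresholds are illustrative):
pt-find --engine InnoDB --datasize +100M --rows -1000

This finds InnoDB tables whose data is larger than 100MB but which report fewer than 1000 rows.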
Dates and times are all measured relative to the same instant, when pt-find first asks the database server what time it
is. All date and time manipulation is done in SQL, so if you say to find tables modified 5 days ago, that translates to
      SELECT DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 5 DAY). If you specify --day-start, of course it’s
relative to CURRENT_DATE instead.
However, table sizes and other metrics are not consistent at an instant in time. It can take some time for MySQL to
process all the SHOW queries, and pt-find can’t do anything about that. These measurements are as of the time they’re
taken.
If you need some test that’s not in this list, file a bug report and I’ll enhance pt-find for you. It’s really easy.
--autoinc
    type: string; group: Tests
      Table’s next AUTO_INCREMENT is n. This tests the Auto_increment column.
--avgrowlen
    type: size; group: Tests
      Table avg row len is n bytes. This tests the Avg_row_length column. The specified size can be “NULL” to test
      where Avg_row_length IS NULL.
--checksum
    type: string; group: Tests
      Table checksum is n. This tests the Checksum column.
--cmin
    type: size; group: Tests
      Table was created n minutes ago. This tests the Create_time column.
--collation
    type: string; group: Tests
      Table collation matches pattern. This tests the Collation column.
--column-name
    type: string; group: Tests
      A column name in the table matches pattern.
--column-type
    type: string; group: Tests
      A column in the table matches this type (case-insensitive).
      Examples of types are: varchar, char, int, smallint, bigint, decimal, year, timestamp, text, enum.
--comment
    type: string; group: Tests
      Table comment matches pattern. This tests the Comment column.
--connection-id
    type: string; group: Tests






     Table name has nonexistent MySQL connection ID. This tests the table name for a pattern. The argument to this
      test must be a Perl regular expression that captures digits like this: (\d+). If the table name matches the pattern,
     these captured digits are taken to be the MySQL connection ID of some process. If the connection doesn’t exist
     according to SHOW FULL PROCESSLIST, the test returns true. If the connection ID is greater than pt-find‘s
     own connection ID, the test returns false for safety.
     Why would you want to do this? If you use MySQL statement-based replication, you probably know the trouble
     temporary tables can cause. You might choose to work around this by creating real tables with unique names,
     instead of temporary tables. One way to do this is to append your connection ID to the end of the table, thusly:
     scratch_table_12345. This assures the table name is unique and lets you have a way to find which connection
     it was associated with. And perhaps most importantly, if the connection no longer exists, you can assume the
     connection died without cleaning up its tables, and this table is a candidate for removal.
     This is how I manage scratch tables, and that’s why I included this test in pt-find.
      The argument I use to --connection-id is “\D_(\d+)$”. That finds tables with a series of numbers at the
     end, preceded by an underscore and some non-number character (the latter criterion prevents me from examining
     tables with a date at the end, which people tend to do: baron_scratch_2007_05_07 for example). It’s better to
     keep the scratch tables separate of course.
     If you do this, make sure the user pt-find runs as has the PROCESS privilege! Otherwise it will only see
     connections from the same user, and might think some tables are ready to remove when they’re still in use. For
     safety, pt-find checks this for you.
     See also --server-id.
--createopts
    type: string; group: Tests
     Table create option matches pattern. This tests the Create_options column.
--ctime
    type: size; group: Tests
     Table was created n days ago. This tests the Create_time column.
--datafree
    type: size; group: Tests
     Table has n bytes of free space. This tests the Data_free column. The specified size can be “NULL” to test
     where Data_free IS NULL.
--datasize
    type: size; group: Tests
     Table data uses n bytes of space. This tests the Data_length column. The specified size can be “NULL” to test
     where Data_length IS NULL.
--dblike
    type: string; group: Tests
     Database name matches SQL LIKE pattern.
--dbregex
    type: string; group: Tests
     Database name matches this pattern.
--empty
    group: Tests
     Table has no rows. This tests the Rows column.






--engine
    type: string; group: Tests
     Table storage engine matches this pattern. This tests the Engine column, or in earlier versions of MySQL, the
     Type column.
--function
    type: string; group: Tests
     Function definition matches pattern.
--indexsize
    type: size; group: Tests
     Table indexes use n bytes of space. This tests the Index_length column. The specified size can be “NULL” to
     test where Index_length IS NULL.
--kmin
    type: size; group: Tests
     Table was checked n minutes ago. This tests the Check_time column.
--ktime
    type: size; group: Tests
     Table was checked n days ago. This tests the Check_time column.
--mmin
    type: size; group: Tests
     Table was last modified n minutes ago. This tests the Update_time column.
--mtime
    type: size; group: Tests
     Table was last modified n days ago. This tests the Update_time column.
--procedure
    type: string; group: Tests
     Procedure definition matches pattern.
--rowformat
    type: string; group: Tests
     Table row format matches pattern. This tests the Row_format column.
--rows
    type: size; group: Tests
     Table has n rows. This tests the Rows column. The specified size can be “NULL” to test where Rows IS NULL.
--server-id
    type: string; group: Tests
     Table name contains the server ID. If you create temporary tables with the naming convention explained in
     --connection-id, but also add the server ID of the server on which the tables are created, then you can use
     this pattern match to ensure tables are dropped only on the server they’re created on. This prevents a table from
     being accidentally dropped on a slave while it’s in use (provided that your server IDs are all unique, which they
     should be for replication to work).
     For example, on the master (server ID 22) you create a table called scratch_table_22_12345. If you see this
     table on the slave (server ID 23), you might think it can be dropped safely if there’s no such connection 12345.
      But if you also force the name to match the server ID with --server-id '\D_(\d+)_\d+$', the table
     won’t be dropped on the slave.




--tablesize
    type: size; group: Tests
     Table uses n bytes of space. This tests the sum of the Data_length and Index_length columns.
--tbllike
    type: string; group: Tests
     Table name matches SQL LIKE pattern.
--tblregex
    type: string; group: Tests
     Table name matches this pattern.
--tblversion
    type: size; group: Tests
     Table version is n. This tests the Version column.
--trigger
    type: string; group: Tests
     Trigger action statement matches pattern.
--trigger-table
    type: string; group: Tests
     --trigger is defined on table matching pattern.
--view
    type: string; group: Tests
     CREATE VIEW matches this pattern.


2.8.8 ACTIONS

The --exec-plus action happens after everything else, but otherwise actions happen in an indeterminate order. If
you need determinism, file a bug report and I’ll add this feature.
--exec
    type: string; group: Actions
     Execute this SQL with each item found.         The SQL can contain escapes and formatting directives (see
     --printf).
--exec-dsn
    type: string; group: Actions
     Specify a DSN in key-value format to use when executing SQL with --exec and --exec-plus. Any values
     not specified are inherited from command-line arguments.
--exec-plus
    type: string; group: Actions
     Execute this SQL with all items at once. This option is unlike --exec. There are no escaping or formatting
     directives; there is only one special placeholder for the list of database and table names, %s. The list of tables
     found will be joined together with commas and substituted wherever you place %s.
     You might use this, for example, to drop all the tables you found:
     DROP TABLE %s






      This is sort of like GNU find’s “-exec command {} +” syntax. Only it’s not totally cryptic. And it doesn’t
      require me to write a command-line parser.
--print
    group: Actions
      Print the database and table name, followed by a newline. This is the default action if no other action is specified.
--printf
    type: string; group: Actions
      Print format on the standard output, interpreting ‘\’ escapes and ‘%’ directives. Escapes are backslashed char-
      acters, like \n and \t. Perl interprets these, so you can use any escapes Perl knows about. Directives are replaced
      by %s, and as of this writing, you can’t add any special formatting instructions, like field widths or alignment
      (though I’m musing over ways to do that).
      Here is a list of the directives. Note that most of them simply come from columns of SHOW TABLE STATUS.
      If the column is NULL or doesn’t exist, you get an empty string in the output. A % character followed by any
      character not in the following list is discarded (but the other character is printed).
      CHAR   DATA SOURCE               NOTES
      ----   ------------------        ------------------------------------------
      a      Auto_increment
      A      Avg_row_length
      c      Checksum
      C      Create_time
      D      Database                  The database name in which the table lives
      d      Data_length
      E      Engine                    In older versions of MySQL, this is Type
      F      Data_free
      f      Innodb_free               Parsed from the Comment field
      I      Index_length
      K      Check_time
      L      Collation
      M      Max_data_length
      N      Name
      O      Comment
      P      Create_options
      R      Row_format
      S      Rows
      T      Table_length              Data_length+Index_length
      U      Update_time
      V      Version
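      For instance, the directives can be combined into a quick size report (a sketch; sort is ordinary GNU sort, as in the
      SYNOPSIS example):
      pt-find --printf "%E\t%T\t%D.%N\n" | sort -k2 -rn

      This prints the storage engine, the total of Data_length and Index_length, and the qualified table name, with the
      largest tables first.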



2.8.9 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
     • A
      dsn: charset; copy: yes
      Default character set.
     • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.8.10 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-find ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.8.11 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.8.12 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-find.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
     • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.8.13 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.8.14 AUTHORS

Baron Schwartz


2.8.15 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.8.16 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.8.17 VERSION

pt-find 2.1.1






2.9 pt-fingerprint

2.9.1 NAME

pt-fingerprint - Convert queries into fingerprints.


2.9.2 SYNOPSIS

Usage

pt-fingerprint [OPTIONS] [FILES]

pt-fingerprint converts queries into fingerprints. With the --query option, converts the option’s value into a fingerprint.
With no options, treats command-line arguments as FILEs and reads and converts semicolon-separated queries from
the FILEs. When FILE is -, it reads standard input.
Convert a single query:
pt-fingerprint --query "select a, b, c from users where id = 500"

Convert a file full of queries:
pt-fingerprint /path/to/file.txt



2.9.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
The pt-fingerprint tool simply reads data and transforms it, so risks are minimal.
See also “BUGS” for more information on filing bugs and getting help.


2.9.4 DESCRIPTION

A query fingerprint is the abstracted form of a query, which makes it possible to group similar queries together.
Abstracting a query removes literal values, normalizes whitespace, and so on. For example, consider these two queries:
SELECT name, password FROM user WHERE id='12823';
select name,   password from user
   where id=5;

Both of those queries will fingerprint to
select name, password from user where id=?

Once the query’s fingerprint is known, we can then talk about a query as though it represents all similar queries.
Query fingerprinting accommodates a great many special cases, which have proven necessary in the real world. For
example, an IN list with 5 literals is really equivalent to one with 4 literals, so lists of literals are collapsed to a single
one. If you want to understand more about how and why all of these cases are handled, please review the test cases in
the Subversion repository. If you find something that is not fingerprinted properly, please submit a bug report with a
reproducible test case. Here is a list of transformations during fingerprinting, which might not be exhaustive:




     • Group all SELECT queries from mysqldump together, even if they are against different tables. Ditto for all of
       pt-table-checksum’s checksum queries.
     • Shorten multi-value INSERT statements to a single VALUES() list.
     • Strip comments.
     • Abstract the databases in USE statements, so all USE statements are grouped together.
     • Replace all literals, such as quoted strings. For efficiency, the code that replaces literal numbers is somewhat
       non-selective, and might replace some things as numbers when they really are not. Hexadecimal literals are
       also replaced. NULL is treated as a literal. Numbers embedded in identifiers are also replaced, so tables named
       similarly will be fingerprinted to the same values (e.g. users_2009 and users_2010 will fingerprint identically).
     • Collapse all whitespace into a single space.
     • Lowercase the entire query.
     • Replace all literals inside of IN() and VALUES() lists with a single placeholder, regardless of cardinality.
     • Collapse multiple identical UNION queries into a single one.
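To see a couple of these transformations in action, pass queries through --query (a sketch; the exact fingerprints quoted here follow the rules above but may differ slightly between versions):
pt-fingerprint --query "SELECT * FROM users WHERE id IN (1, 2, 3)"
pt-fingerprint --query "INSERT INTO t (a, b) VALUES (1, 2), (3, 4)"

The first collapses the IN() list to a single placeholder, giving something like select * from users where id in(?+); the second shortens the multi-value insert to something like insert into t (a, b) values(?+).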


2.9.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--config
    type: Array
       Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--help
    Show help and exit.
--match-embedded-numbers
    Match numbers embedded in words and replace as single values. This option causes the tool to be more careful
    about matching numbers so that words with numbers, like catch22 are matched and replaced as a single ?
    placeholder. Otherwise the default number matching pattern will replace catch22 as catch?.
       This is helpful if database or table names contain numbers.
--match-md5-checksums
    Match MD5 checksums and replace as single values. This option causes the tool to be more careful about
    matching numbers so that MD5 checksums like fbc5e685a5d3d45aa1d0347fdb7c4d35 are matched
    and replaced as a single ? placeholder. Otherwise, the default number matching pattern will replace
    fbc5e685a5d3d45aa1d0347fdb7c4d35 as fbc?.
--query
    type: string
       The query to convert into a fingerprint.
--version
    Show version and exit.


2.9.6 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:







PTDEBUG=1 pt-fingerprint ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.9.7 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.9.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-fingerprint.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.9.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.9.10 AUTHORS

Baron Schwartz and Daniel Nichter


2.9.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.





2.9.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.9.13 VERSION

pt-fingerprint 2.1.1


2.10 pt-fk-error-logger

2.10.1 NAME

pt-fk-error-logger - Extract and log MySQL foreign key errors.


2.10.2 SYNOPSIS

Usage

pt-fk-error-logger [OPTION...] SOURCE_DSN

pt-fk-error-logger extracts and saves information about the most recent foreign key errors in a MySQL server.
Print foreign key errors on host1:
pt-fk-error-logger h=host1

Save foreign key errors on host1 to db.foreign_key_errors table on host2:
pt-fk-error-logger h=host1 --dest h=host2,D=db,t=foreign_key_errors



2.10.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-fk-error-logger is read-only unless you specify --dest. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-fk-error-logger.




See also “BUGS” for more information on filing bugs and getting help.


2.10.4 DESCRIPTION

pt-fk-error-logger prints or saves the foreign key error text from SHOW INNODB STATUS. The errors are not
parsed or interpreted in any way. Foreign key errors are uniquely identified by their timestamp. Only new (more
recent) errors are printed or saved.


2.10.5 OUTPUT

If --print is given or no --dest is given, then pt-fk-error-logger prints the foreign key error text to STDOUT
exactly as it appeared in SHOW INNODB STATUS.
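For example, to poll for new errors every 30 seconds for one hour and record them on the same server the tool connects to (a sketch; the DSN, interval, run time, and log path are illustrative):
pt-fk-error-logger h=host1 --dest D=db,t=foreign_key_errors \
    --interval 30s --run-time 1h --daemonize --log /var/log/pt-fk-error-logger.log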


2.10.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
--defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
--dest
    type: DSN
      DSN for where to store foreign key errors; specify at least a database (D) and table (t).
      Missing values are filled in with the same values from the source host, so you can usually omit most parts of
      this argument if you’re storing foreign key errors on the same server on which they happen.
      The following table is suggested:
      CREATE TABLE foreign_key_errors (
        ts datetime NOT NULL,
        error text NOT NULL,
        PRIMARY KEY (ts)
      )

      The only information saved is the timestamp and the foreign key error text.






-help
    Show help and exit.
-host
    short form: -h; type: string
       Connect to host.
-interval
    type: time; default: 0
       How often to check for foreign key errors.
-log
       type: string
       Print all output to this file when daemonized.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-port
    short form: -P; type: int
       Port number to use for connection.
-print
    Print results on standard output. See “OUTPUT” for more.
-run-time
    type: time
       How long to run before exiting.
-set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
-socket
    short form: -S; type: string
       Socket file to use for connection.
-user
    short form: -u; type: string
       User for login if not current user.
-version
    Show version and exit.






2.10.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • t
      Table in which to store foreign key errors.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.10.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-fk-error-logger ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.




2.10.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.10.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-fk-error-logger.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
     • Complete command-line used to run the tool
     • Tool --version
     • MySQL version of all servers involved
     • Output from the tool including STDERR
     • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.10.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.10.12 AUTHORS

Daniel Nichter


2.10.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.






2.10.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.10.15 VERSION

pt-fk-error-logger 2.1.1


2.11 pt-heartbeat

2.11.1 NAME

pt-heartbeat - Monitor MySQL replication delay.


2.11.2 SYNOPSIS

Usage

pt-heartbeat [OPTION...] [DSN] --update|--monitor|--check|--stop

pt-heartbeat measures replication lag on a MySQL or PostgreSQL server. You can use it to update a master or monitor
a replica. If possible, MySQL connection options are read from your .my.cnf file.
Start daemonized process to update test.heartbeat table on master:
pt-heartbeat -D test --update -h master-server --daemonize

Monitor replication lag on slave:
pt-heartbeat -D test --monitor -h slave-server

pt-heartbeat -D test --monitor -h slave-server --dbi-driver Pg

Check slave lag once and exit (using optional DSN to specify slave host):
pt-heartbeat -D test --check h=slave-server



2.11.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.




pt-heartbeat merely reads and writes a single record in a table. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
heartbeat.
See also “BUGS” for more information on filing bugs and getting help.


2.11.4 DESCRIPTION

pt-heartbeat is a two-part MySQL and PostgreSQL replication delay monitoring system that measures delay by
looking at actual replicated data. This avoids reliance on the replication mechanism itself (for example, SHOW
SLAVE STATUS on MySQL), which is unreliable.
The first part is an --update instance of pt-heartbeat that connects to a master and updates a timestamp (“heartbeat
record”) every --interval seconds. Since the heartbeat table may contain records from multiple masters (see
“MULTI-SLAVE HIERARCHY”), the server’s ID (@@server_id) is used to identify records.
The second part is a --monitor or --check instance of pt-heartbeat that connects to a slave, examines the
replicated heartbeat record from its immediate master or the specified --master-server-id, and computes the
difference from the current system time. If replication between the slave and the master is delayed or broken, the
computed difference will be greater than zero and potentially increase if --monitor is specified.
You must either manually create the heartbeat table on the master or use --create-table. See
--create-table for the proper heartbeat table structure. For MySQL, the MEMORY storage engine is suggested
but not required.
The heartbeat table must contain a heartbeat row. By default, a heartbeat row is inserted if it doesn’t exist. This
feature can be disabled with the --[no]insert-heartbeat-row option in case the database user does not have
INSERT privileges.
pt-heartbeat depends only on the heartbeat record being replicated to the slave, so it works regardless of the replication
mechanism (built-in replication, a system such as Continuent Tungsten, etc). It works at any depth in the replication
hierarchy; for example, it will reliably report how far a slave lags its master’s master’s master. And if replication is
stopped, it will continue to work and report (accurately!) that the slave is falling further and further behind the master.
pt-heartbeat has a maximum resolution of 0.01 second. The clocks on the master and slave servers must be closely
synchronized via NTP. By default, --update checks happen on the edge of the second (e.g. 00:01) and --monitor
checks happen halfway between seconds (e.g. 00:01.5). As long as the servers’ clocks are closely synchronized and
replication events are propagating in less than half a second, pt-heartbeat will report zero seconds of delay.
pt-heartbeat will try to reconnect if the connection has an error, but will not retry if it can’t get a connection when it
first starts.
The --dbi-driver option lets you use pt-heartbeat to monitor PostgreSQL as well. It is reported to work well
with Slony-1 replication.
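As a rough sketch combining the options described below (the host names, test database, and PID file path are
illustrative), a typical MySQL setup looks like this:
# On the master: create the heartbeat table if needed and start the updater.
pt-heartbeat -h master-server -D test --create-table --update --daemonize \
  --pid /var/run/pt-heartbeat.pid

# On a monitoring host: watch a slave's delay and its moving averages.
pt-heartbeat -h slave-server -D test --monitor --frames 1m,5m,15m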


2.11.5 MULTI-SLAVE HIERARCHY

If the replication hierarchy has multiple slaves which are masters of other slaves, like “master -> slave1 ->
slave2”, --update instances can be run on the slaves as well as the master. The default heartbeat table
(see --create-table) is keyed on the server_id column, so each server will update the row where
server_id=@@server_id.






For --monitor and --check, if --master-server-id is not specified, the tool tries to discover and use the
slave’s immediate master. If this fails, or if you want to monitor lag from another master, then you can specify the
--master-server-id to use.
For example, if the replication hierarchy is “master -> slave1 -> slave2” with corresponding server IDs 1, 2 and 3, you
can:
pt-heartbeat --daemonize -D test --update -h master
pt-heartbeat --daemonize -D test --update -h slave1

Then check (or monitor) the replication delay from master to slave2:
pt-heartbeat -D test --master-server-id 1 --check slave2

Or check the replication delay from slave1 to slave2:
pt-heartbeat -D test --master-server-id 2 --check slave2

Stopping the --update instance on slave1 will not affect the instance on master.


2.11.6 MASTER AND SLAVE STATUS

The default heartbeat table (see --create-table) has columns for saving information from SHOW MASTER
STATUS and SHOW SLAVE STATUS. These columns are optional. If any are present, their corresponding infor-
mation will be saved.
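For example, a sketch of inspecting the replicated row on a slave, assuming the default table created by
--create-table (the host and database names are illustrative):
mysql -h slave-server -D test -e \
  "SELECT ts, server_id, file, position, relay_master_log_file, exec_master_log_pos FROM heartbeat"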


2.11.7 OPTIONS

Specify at least one of --stop, --update, --monitor, or --check.
--update, --monitor, and --check are mutually exclusive.
--daemonize and --check are mutually exclusive.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-check
    Check slave delay once and exit. If you also specify --recurse, the tool will try to discover slaves of the
    given slave and check and print their lag, too. The hostname or IP and port for each slave is printed before its
    delay. --recurse only works with MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-create-table
    Create the heartbeat --table if it does not exist.






      This option causes the table specified by --database and --table to be created with the following
      MAGIC_create_heartbeat table definition:
      CREATE TABLE heartbeat (
        ts                    varchar(26) NOT NULL,
        server_id             int unsigned NOT NULL PRIMARY KEY,
        file                  varchar(255) DEFAULT NULL,    -- SHOW MASTER STATUS
        position              bigint unsigned DEFAULT NULL, -- SHOW MASTER STATUS
        relay_master_log_file varchar(255) DEFAULT NULL,    -- SHOW SLAVE STATUS
        exec_master_log_pos   bigint unsigned DEFAULT NULL  -- SHOW SLAVE STATUS
      );

      The heartbeat table requires at least one row. If you manually create the heartbeat table, then you must insert a
      row by doing:
      INSERT INTO heartbeat (ts, server_id) VALUES (NOW(), N);

      where N is the server’s ID; do not use @@server_id because it will replicate and slaves will insert their own
      server ID instead of the master’s server ID.
      This is done automatically by --create-table.
      A legacy version of the heartbeat table is still supported:
      CREATE TABLE heartbeat (
        id int NOT NULL PRIMARY KEY,
        ts datetime NOT NULL
      );

      Legacy tables do not support --update instances on each slave of a multi-slave hierarchy like “master ->
      slave1 -> slave2”. To manually insert the one required row into a legacy table:
      INSERT INTO heartbeat (id, ts) VALUES (1, NOW());

      The tool automatically detects if the heartbeat table is legacy.
      See also “MULTI-SLAVE HIERARCHY”.
-daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
-database
    short form: -D; type: string
      The database to use for the connection.
-dbi-driver
    default: mysql; type: string
      Specify a driver for the connection; mysql and Pg are supported.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-file
    type: string
      Print latest --monitor output to this file.
      When --monitor is given, prints output to the specified file instead of to STDOUT. The file is opened, trun-
      cated, and closed every interval, so it will only contain the most recent statistics. Useful when --daemonize
      is given.




-frames
    type: string; default: 1m,5m,15m
       Timeframes for averages.
       Specifies the timeframes over which to calculate moving averages when --monitor is given. Specify as a
       comma-separated list of numbers with suffixes. The suffix can be s for seconds, m for minutes, h for hours, or d
       for days. The size of the largest frame determines the maximum memory usage, as up to the specified number
       of per-second samples are kept in memory to calculate the averages. You can specify as many timeframes as
       you like.
-help
    Show help and exit.
-host
    short form: -h; type: string
       Connect to host.
-[no]insert-heartbeat-row
    default: yes
       Insert a heartbeat row in the --table if one doesn’t exist.
       The heartbeat --table requires a heartbeat row, else there’s nothing to --update, --monitor, or
       --check! By default, the tool will insert a heartbeat row if one is not already present. You can disable this
       feature by specifying --no-insert-heartbeat-row in case the database user does not have INSERT
       privileges.
-interval
    type: float; default: 1.0
       How often to update or check the heartbeat --table. Updates and checks begin on the first whole second
       then repeat every --interval seconds for --update and every --interval plus --skew seconds for
       --monitor.
       For example, if at 00:00.4 an --update instance is started at 0.5 second intervals, the first update happens at
       00:01.0, the next at 00:01.5, etc. If at 00:10.7 a --monitor instance is started at 0.05 second intervals with
       the default 0.5 second --skew, then the first check happens at 00:11.5 (00:11.0 + 0.5) which will be --skew
       seconds after the last update which, because the instances are checking at synchronized intervals, happened at
       00:11.0.
       The tool waits for and begins on the first whole second just to make the interval calculations simpler. Therefore,
       the tool could wait up to 1 second before updating or checking.
       The minimum (fastest) interval is 0.01, and the maximum precision is two decimal places, so 0.015 will be
       rounded to 0.02.
       If a legacy heartbeat table (see --create-table) is used, then the maximum precision is 1s because the ts
       column is type datetime.
-log
       type: string
       Print all output to this file when daemonized.
-master-server-id
    type: string
       Calculate delay from this master server ID for --monitor or --check. If not given, pt-heartbeat attempts
       to connect to the server’s master and determine its server id.






-monitor
    Monitor slave delay continuously.
       Specifies that pt-heartbeat should check the slave’s delay every second and report to STDOUT (or if --file
       is given, to the file instead). The output is the current delay followed by moving averages over the timeframe
       given in --frames. For example,
       5s [    0.25s,     0.05s,     0.02s ]

-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-port
    short form: -P; type: int
       Port number to use for connection.
-print-master-server-id
    Print the auto-detected or given --master-server-id. If --check or --monitor is specified, specify-
    ing this option will print the auto-detected or given --master-server-id at the end of each line.
-recurse
    type: int
       Check slaves recursively to this depth in --check mode.
       Try to discover slave servers recursively, to the specified depth. After discovering servers, run the check on each
       one of them and print the hostname (if possible), followed by the slave delay.
       This currently works only with MySQL. See --recursion-method.
-recursion-method
    type: string
       Preferred recursion method used to find slaves.
       Possible methods are:
       METHOD           USES
       ===========      ================
       processlist      SHOW PROCESSLIST
       hosts            SHOW SLAVE HOSTS

       The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method
       is required if the server uses a non-standard port (not 3306). Usually pt-heartbeat does the right thing and finds
       the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the other
       methods will be tried.
-replace
    Use REPLACE instead of UPDATE for --update.
       When running in --update mode, use REPLACE instead of UPDATE to set the heartbeat table’s timestamp.
       The REPLACE statement is a MySQL extension to SQL. This option is useful when you don’t know whether
       the table contains any rows or not. It must be used in conjunction with --update.





-run-time
    type: time
      Time to run before exiting.
-sentinel
    type: string; default: /tmp/pt-heartbeat-sentinel
      Exit if this file exists.
-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-skew
    type: float; default: 0.5
      How long to delay checks.
      The default is to delay checks one half second. Since the update happens as soon as possible after the beginning
      of the second on the master, this allows one half second of replication delay before reporting that the slave lags
      the master by one second. If your clocks are not completely accurate or there is some other reason you’d like to
      delay the slave more or less, you can tweak this value. Try setting the PTDEBUG environment variable to see
      the effect this has.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-stop
    Stop running instances by creating the sentinel file.
      This should have the effect of stopping all running instances which are watching the same sentinel file. If none
      of --update, --monitor or --check is specified, pt-heartbeat will exit after creating the file. If one of
      these is specified, pt-heartbeat will wait the interval given by --interval, then remove the file and continue
      working.
      You might find this handy to stop cron jobs gracefully if necessary, or to replace one running instance with
      another. For example, if you want to stop and restart pt-heartbeat every hour (just to make sure that it is
      restarted every hour, in case of a server crash or some other problem), you could use a crontab line like this:
      0 * * * * pt-heartbeat --update -D test --stop \
        --sentinel /tmp/pt-heartbeat-hourly

      The non-default --sentinel will make sure the hourly cron job stops only instances previously started with
      the same options (that is, from the same cron job).
      See also --sentinel.
-table
    type: string; default: heartbeat
      The table to use for the heartbeat.
      Don’t specify database.table; use --database to specify the database.
      See --create-table.
-update
    Update a master’s heartbeat.





-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.


2.11.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
     • A
      dsn: charset; copy: yes
      Default character set.
     • D
      dsn: database; copy: yes
      Default database.
     • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
     • h
      dsn: host; copy: yes
      Connect to host.
     • p
      dsn: password; copy: yes
      Password to use when connecting.
     • P
      dsn: port; copy: yes
      Port number to use for connection.
     • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
     • u
      dsn: user; copy: yes
      User for login if not current user.






2.11.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-heartbeat ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.11.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.11.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-heartbeat.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.11.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.11.13 AUTHORS

Proven Scaling LLC, SixApart Ltd, Baron Schwartz, and Daniel Nichter






2.11.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.11.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2006 Proven Scaling LLC and Six Apart Ltd, 2007-2012 Percona Inc. Feedback and
improvements are welcome.
Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.11.16 VERSION

pt-heartbeat 2.1.1


2.12 pt-index-usage

2.12.1 NAME

pt-index-usage - Read queries from a log and analyze how they use indexes.


2.12.2 SYNOPSIS

Usage

pt-index-usage [OPTION...] [FILE...]

pt-index-usage reads queries from logs and analyzes how they use indexes.
Analyze queries in slow.log and print reports:
pt-index-usage /path/to/slow.log --host localhost

Disable reports and save results to mk database for later analysis:
pt-index-usage slow.log --no-report --save-results-database mk






2.12.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
This tool is read-only unless you use --save-results-database. It reads a log of queries and EXPLAIN them.
It also gathers information about all tables in all databases. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
index-usage.
See also “BUGS” for more information on filing bugs and getting help.


2.12.4 DESCRIPTION

This tool connects to a MySQL database server, reads through a query log, and uses EXPLAIN to ask MySQL how it
will use each query. When it is finished, it prints out a report on indexes that the queries didn’t use.
The query log needs to be in MySQL’s slow query log format. If you need to input a different format, you can use
pt-query-digest to translate the formats. If you don’t specify a filename, the tool reads from STDIN.
The tool runs two stages. In the first stage, the tool takes inventory of all the tables and indexes in your database, so
it can compare the existing indexes to those that were actually used by the queries in the log. In the second stage, it
runs EXPLAIN on each query in the query log. It uses separate database connections to inventory the tables and run
EXPLAIN, so it opens two connections to the database.
If a query is not a SELECT, it tries to transform it to a roughly equivalent SELECT query so it can be EXPLAINed.
This is not a perfect process, but it is good enough to be useful.
The tool skips the EXPLAIN step for queries that are exact duplicates of those seen before. It assumes that the same
query will generate the same EXPLAIN plan as it did previously (usually a safe assumption, and generally good for
performance), and simply increments the count of times that the indexes were used. However, queries that have the
same fingerprint but different checksums will be re-EXPLAINed. Queries that have different literal constants can have
different execution plans, and this is important to measure.
After EXPLAIN-ing the query, it is necessary to try to map aliases in the query back to the original table names. For
example, consider the EXPLAIN plan for the following query:
SELECT * FROM tbl1 AS foo;

The EXPLAIN output will show access to table foo, and that must be translated back to tbl1. This process involves
complex parsing. It is generally very accurate, but there is some chance that it might not work right. If you find cases
where it fails, submit a bug report and a reproducible test case.
Queries that cannot be EXPLAINed will cause all subsequent queries with the same fingerprint to be blacklisted. This
is to reduce the work they cause, and prevent them from continuing to print error messages. However, at least in this
stage of the tool’s development, it is my opinion that it’s not a good idea to preemptively silence these, or prevent them
from being EXPLAINed at all. I am looking for lots of feedback on how to improve things like the query parsing. So
please submit your test cases based on the errors the tool prints!
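For example, because the tool reads STDIN when no filename is given, a compressed slow log can be piped in
directly; the path and credentials here are illustrative:
zcat /path/to/slow.log.gz | pt-index-usage --host localhost --user root --ask-pass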


2.12.5 OUTPUT

After it reads all the events in the log, the tool prints out DROP statements for every index that was not used. It skips
indexes for tables that were never accessed by any queries in the log, to avoid false-positive results.




If you don’t specify --quiet, the tool also outputs warnings about statements that cannot be EXPLAINed and
similar. These go to standard error.
Progress reports are enabled by default (see --progress). These also go to standard error.
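The following sketch shows a run that suppresses the report and saves results for later querying, reusing the mk
database name from the synopsis; the slow log path is illustrative, and the view name assumes the MAGIC_view_
naming convention described under --save-results-database:
# Analyze the log, skip the printed report, and save results to the mk database.
pt-index-usage /var/log/mysql/slow.log --host localhost --no-report \
  --create-save-results-database --save-results-database mk

# Afterwards, query one of the auto-created example views.
mysql -D mk -e "SELECT * FROM query_uses_several_indexes"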


2.12.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-create-save-results-database
    Create the --save-results-database if it does not exist.
      If the --save-results-database already exists and this option is specified, the database is used and the
      necessary tables are created if they do not already exist.
-[no]create-views
    Create views for --save-results-database example queries.
      Several example queries are given for querying the tables in the --save-results-database. These
      example queries are, by default, created as views. Specifying --no-create-views prevents these views
      from being created.
-database
    short form: -D; type: string
      The database to use for the connection.
-databases
    short form: -d; type: hash
      Only get tables and indexes from this comma-separated list of databases.
-databases-regex
    type: string
      Only get tables and indexes from database whose names match this Perl regex.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-drop
    type: Hash; default: non-unique
      Suggest dropping only these types of unused indexes.
      By default pt-index-usage will only suggest to drop unused secondary indexes, not primary or unique indexes.
      You can specify which types of unused indexes the tool suggests to drop: primary, unique, non-unique, all.




      A separate ALTER TABLE statement for each type is printed. So if you specify --drop all and there is
      a primary key and a non-unique index, the ALTER TABLE ... DROP for each will be printed on separate
      lines.
-empty-save-results-tables
    Drop and re-create all pre-existing tables in the --save-results-database. This allows information
    from previous runs to be removed before the current run.
-help
    Show help and exit.
-host
    short form: -h; type: string
      Connect to host.
-ignore-databases
    type: Hash
      Ignore this comma-separated list of databases.
-ignore-databases-regex
    type: string
      Ignore databases whose names match this Perl regex.
-ignore-tables
    type: Hash
      Ignore this comma-separated list of table names.
      Table names may be qualified with the database name.
-ignore-tables-regex
    type: string
      Ignore tables whose names match the Perl regex.
-password
    short form: -p; type: string
      Password to use when connecting.
-port
    short form: -P; type: int
      Port number to use for connection.
-progress
    type: array; default: time,30
      Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be
      percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage,
      seconds, or number of iterations.
-quiet
    short form: -q
      Do not print any warnings. Also disables --progress.
-[no]report
    default: yes
      Print the reports for --report-format.






     You may want to disable the reports by specifying --no-report if, for example, you also specify
     --save-results-database and you only want to query the results tables later.
-report-format
    type: Array; default: drop_unused_indexes
     Right now there is only one report: drop_unused_indexes. This report prints SQL statements for dropping any
     unused indexes. See also --drop.
     See also --[no]report.
-save-results-database
    type: DSN
     Save results to tables in this database. Information about indexes, queries, tables and their usage is stored in
     several tables in the specified database. The tables are auto-created if they do not exist. If the database doesn’t
     exist, it can be auto-created with --create-save-results-database. In this case the connection is
     initially created with no default database, then after the database is created, it is USE’ed.
     pt-index-usage executes INSERT statements to save the results. Therefore, you should be careful if you use
     this feature on a production server. It might increase load, or cause trouble if you don’t want the server to be
     written to, or so on.
     This is a new feature. It may change in future releases.
     After a run, you can query the usage tables to answer various questions about index usage. The tables have the
     following CREATE TABLE definitions:
     MAGIC_create_indexes:
     CREATE TABLE IF NOT EXISTS indexes (
       db           VARCHAR(64) NOT NULL,
       tbl          VARCHAR(64) NOT NULL,
       idx          VARCHAR(64) NOT NULL,
       cnt          BIGINT UNSIGNED NOT NULL DEFAULT 0,
       PRIMARY KEY (db, tbl, idx)
     )

     MAGIC_create_queries:
     CREATE TABLE IF NOT EXISTS queries (
       query_id     BIGINT UNSIGNED NOT NULL,
       fingerprint TEXT NOT NULL,
       sample       TEXT NOT NULL,
       PRIMARY KEY (query_id)
     )

     MAGIC_create_tables:
     CREATE TABLE IF NOT EXISTS tables (
       db           VARCHAR(64) NOT NULL,
       tbl          VARCHAR(64) NOT NULL,
       cnt          BIGINT UNSIGNED NOT NULL DEFAULT 0,
       PRIMARY KEY (db, tbl)
     )

     MAGIC_create_index_usage:
      CREATE TABLE IF NOT EXISTS index_usage (
        query_id     BIGINT UNSIGNED NOT NULL,
        db           VARCHAR(64) NOT NULL,
        tbl          VARCHAR(64) NOT NULL,
        idx          VARCHAR(64) NOT NULL,
        cnt          BIGINT UNSIGNED NOT NULL DEFAULT 1,
        UNIQUE INDEX (query_id, db, tbl, idx)
      )

     MAGIC_create_index_alternatives:
     CREATE TABLE IF       NOT EXISTS index_alternatives (
       query_id            BIGINT UNSIGNED NOT NULL, -- This query used
       db                  VARCHAR(64) NOT NULL,     -- this index, but...
       tbl                 VARCHAR(64) NOT NULL,     --
       idx                 VARCHAR(64) NOT NULL,     --
       alt_idx             VARCHAR(64) NOT NULL,     -- was an alternative
       cnt                 BIGINT UNSIGNED NOT NULL DEFAULT 1,
       UNIQUE INDEX        (query_id, db, tbl, idx, alt_idx),
       INDEX               (db, tbl, idx),
       INDEX               (db, tbl, alt_idx)
     )

     The following are some queries you can run against these tables to answer common questions you might have.
      Each query is also created as a view (with MySQL v5.0 and newer) if --[no]create-views is true (it is
      by default). The view names are the strings after the MAGIC_view_ prefix.
     Question: which queries sometimes use different indexes, and what fraction of the time is each index chosen?
     MAGIC_view_query_uses_several_indexes:
      SELECT iu.query_id, CONCAT_WS('.', iu.db, iu.tbl, iu.idx) AS idx,
        variations, iu.cnt, iu.cnt / total_cnt * 100 AS pct
     FROM index_usage AS iu
        INNER JOIN (
           SELECT query_id, db, tbl, SUM(cnt) AS total_cnt,
             COUNT(*) AS variations
           FROM index_usage
           GROUP BY query_id, db, tbl
           HAVING COUNT(*) > 1
        ) AS qv USING(query_id, db, tbl);

     Question: which indexes have lots of alternatives, i.e. are chosen instead of other indexes, and for what queries?
     MAGIC_view_index_has_alternates:
      SELECT CONCAT_WS('.', db, tbl, idx) AS idx_chosen,
        GROUP_CONCAT(DISTINCT alt_idx) AS alternatives,
        GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
     FROM index_alternatives
     GROUP BY db, tbl, idx
     HAVING COUNT(*) > 1;

     Question: which indexes are considered as alternates for other indexes, and for what queries?
     MAGIC_view_index_alternates:
      SELECT CONCAT_WS('.', db, tbl, alt_idx) AS idx_considered,
        GROUP_CONCAT(DISTINCT idx) AS alternative_to,
        GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
     FROM index_alternatives
     GROUP BY db, tbl, alt_idx
     HAVING COUNT(*) > 1;

      Question: which of those are never chosen by any queries, and are therefore superfluous?
      MAGIC_view_unused_index_alternates:
      SELECT CONCAT_WS('.', i.db, i.tbl, i.idx) AS idx,
         alt.alternative_to, alt.queries, alt.cnt
      FROM indexes AS i
         INNER JOIN (
            SELECT db, tbl, alt_idx, GROUP_CONCAT(DISTINCT idx) AS alternative_to,
               GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt
            FROM index_alternatives
            GROUP BY db, tbl, alt_idx
            HAVING COUNT(*) > 1
         ) AS alt ON i.db = alt.db AND i.tbl = alt.tbl
           AND i.idx = alt.alt_idx
      WHERE i.cnt = 0;

      Question: given a table, which indexes were used, by how many queries, with how many distinct fingerprints?
      Were there alternatives? Which indexes were not used? You can edit the following query’s SELECT list to also
      see the query IDs in question. MAGIC_view_index_usage:
      SELECT i.idx, iu.usage_cnt, iu.usage_total,
         ia.alt_cnt, ia.alt_total
      FROM indexes AS i
         LEFT OUTER JOIN (
            SELECT db, tbl, idx, COUNT(*) AS usage_cnt,
               SUM(cnt) AS usage_total, GROUP_CONCAT(query_id) AS used_by
            FROM index_usage
            GROUP BY db, tbl, idx
         ) AS iu ON i.db=iu.db AND i.tbl=iu.tbl AND i.idx = iu.idx
         LEFT OUTER JOIN (
            SELECT db, tbl, idx, COUNT(*) AS alt_cnt,
               SUM(cnt) AS alt_total,
               GROUP_CONCAT(query_id) AS alt_queries
            FROM index_alternatives
            GROUP BY db, tbl, idx
         ) AS ia ON i.db=ia.db AND i.tbl=ia.tbl AND i.idx = ia.idx;

      Question: which indexes on a given table are vital for at least one query (there is no alternative)?
      MAGIC_view_required_indexes:
      SELECT i.db, i.tbl, i.idx, no_alt.queries
      FROM indexes AS i
         INNER JOIN (
            SELECT iu.db, iu.tbl, iu.idx,
               GROUP_CONCAT(iu.query_id) AS queries
            FROM index_usage AS iu
               LEFT OUTER JOIN index_alternatives AS ia
                  USING(db, tbl, idx)
            WHERE ia.db IS NULL
            GROUP BY iu.db, iu.tbl, iu.idx
         ) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl
            AND no_alt.idx = i.idx
      ORDER BY i.db, i.tbl, i.idx, no_alt.queries;

-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string




      Socket file to use for connection.
-tables
    short form: -t; type: hash
      Only get indexes from this comma-separated list of tables.
-tables-regex
    type: string
      Only get indexes from tables whose names match this Perl regex.
-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.


2.12.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
   • D
      dsn: database; copy: yes
      Database to connect to.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • p
      dsn: password; copy: yes
      Password to use when connecting.
   • P
      dsn: port; copy: yes
      Port number to use for connection.
   • S






       dsn: mysql_socket; copy: yes
       Socket file to use for connection.
     • u
       dsn: user; copy: yes
       User for login if not current user.


2.12.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-index-usage ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.12.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.12.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-index-usage.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
     • Complete command-line used to run the tool
     • Tool --version
     • MySQL version of all servers involved
     • Output from the tool including STDERR
     • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.12.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.




2.12.12 AUTHORS

Baron Schwartz and Daniel Nichter


2.12.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.12.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.12.15 VERSION

pt-index-usage 2.1.1


2.13 pt-ioprofile

2.13.1 NAME

pt-ioprofile - Watch process IO and print a table of file and I/O activity.


2.13.2 SYNOPSIS

Usage

pt-ioprofile [OPTIONS] [FILE]

pt-ioprofile does two things: 1) get lsof+strace for --run-time seconds, 2) aggregate the result. If you specify a
FILE, then step 1) is not performed.






2.13.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-ioprofile is a read-only tool, so your data is not at risk. However, it works by attaching strace to the process
using ptrace(), which will make it run very slowly until strace detaches. In addition to freezing the server, there
is also some risk of the process crashing or performing badly after strace detaches from it, or indeed of strace
not detaching cleanly and leaving the process in a sleeping state. As a result, this should be considered an intrusive
tool, and should not be used on production servers unless you are comfortable with that.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
ioprofile.
See also “BUGS” for more information on filing bugs and getting help.


2.13.4 DESCRIPTION

pt-ioprofile uses strace and lsof to watch a process’s IO and print out a table of files and I/O activity. By default,
it watches the mysqld process for 30 seconds. The output is like:
Tue Dec 27 15:33:57 PST 2011
Tracing process ID 1833
     total       read      write                    lseek     ftruncate filename
  0.000150   0.000029   0.000068                 0.000038      0.000015 /tmp/ibBE5opS

You probably need to run this tool as root.
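For example, a sketch of capturing a sample and re-aggregating it later, using the options documented below (the
sample path is illustrative):
# Profile mysqld I/O for 60 seconds and keep the raw lsof+strace sample.
sudo pt-ioprofile --run-time 60 --save-samples /tmp/mysqld-io.txt

# Re-aggregate the saved sample offline, grouping by filename and showing
# I/O sizes instead of timings.
pt-ioprofile --cell sizes --group-by filename /tmp/mysqld-io.txt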


2.13.5 OPTIONS

-aggregate
    short form: -a; type: string; default: sum
      The aggregate function, either sum or avg.
      If sum, then each cell will contain the sum of the values in it. If avg, then each cell will contain the average of
      the values in it.
-cell
    short form: -c; type: string; default: times
      The cell contents.
      Valid values are:
      VALUE     CELLS CONTAIN
      =====     =======================
      count     Count of I/O operations
      sizes     Sizes of I/O operations
      times     I/O operation timing

-group-by
    short form: -g; type: string; default: filename
      The group-by item.




      Valid values are:
      VALUE         GROUPING
      =====         ======================================
      all           Summarize into a single line of output
      filename      One line of output per filename
      pid           One line of output per process ID

-help
    Print help and exit.
-profile-pid
    short form: -p; type: int
      The PID to profile, overrides --profile-process.
-profile-process
    short form: -b; type: string; default: mysqld
      The process name to profile.
-run-time
    type: int; default: 30
      How long to profile.
-save-samples
    type: string
      Filename to save samples in; these can be used for later analysis.
-version
    Print the tool’s version and exit.


2.13.6 ENVIRONMENT

This tool does not use any environment variables.


2.13.7 SYSTEM REQUIREMENTS

This tool requires the Bourne shell (/bin/sh).


2.13.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-ioprofile.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.






2.13.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.13.10 AUTHORS

Baron Schwartz


2.13.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.13.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.13.13 VERSION

pt-ioprofile 2.1.1







2.14 pt-kill

2.14.1 NAME

pt-kill - Kill MySQL queries that match certain criteria.


2.14.2 SYNOPSIS

Usage

pt-kill [OPTIONS] [FILE...]

pt-kill kills MySQL connections. If no FILE is given, pt-kill connects to MySQL and gets queries from SHOW
PROCESSLIST. Otherwise, it reads queries from one or more FILEs containing the output of SHOW PROCESSLIST.
If FILE is -, pt-kill reads from STDIN.
Kill queries running longer than 60s:
pt-kill --busy-time 60 --kill

Print, do not kill, queries running longer than 60s:
pt-kill --busy-time 60 --print

Check for sleeping processes and kill them all every 10s:
pt-kill --match-command Sleep --kill --victims all --interval 10

Print all login processes:
pt-kill --match-state login --print --victims all

See which queries in the processlist right now would match:
mysql -e "SHOW PROCESSLIST" > proclist.txt
pt-kill --test-matching proclist.txt --busy-time 60 --print



2.14.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-kill kills queries if you use the --kill option, so it can of course disrupt your database’s users. If you’re unsure
what the tool will do, test with the --print option, which is safe.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-kill.
See also “BUGS” for more information on filing bugs and getting help.






2.14.4 DESCRIPTION

pt-kill captures queries from SHOW PROCESSLIST, filters them, and then either kills or prints them. This is also
known as a “slow query sniper” in some circles. The idea is to watch for queries that might be consuming too many
resources, and kill them.
For brevity, we talk about killing queries, but they may just be printed (or some other future action) depending on what
options are given.
Normally pt-kill connects to MySQL to get queries from SHOW PROCESSLIST. Alternatively, it can read SHOW
PROCESSLIST output from files. In this case, pt-kill does not connect to MySQL and --kill has no effect. You
should use --print instead when reading files. The ability to read a file with --test-matching allows you to
capture SHOW PROCESSLIST and test it later with pt-kill to make sure that your matches kill the proper queries.
There are a lot of special rules to follow, such as “don’t kill replication threads,” so be careful not to kill something
important!
Two important options to know are --busy-time and --victims. First, whereas most match/filter options match
their corresponding value from SHOW PROCESSLIST (e.g. --match-command matches a query’s Command
value), the Time value is matched by --busy-time. See also --interval.
Second, --victims controls which matching queries from each class are killed. By default, the matching query
with the highest Time value is killed (the oldest query). See the next section, “GROUP, MATCH AND KILL”, for
more details.
Usually you need to specify at least one --match option, else no queries will match. Or, you can specify
--match-all to match all queries that aren’t ignored by an --ignore option.


2.14.5 GROUP, MATCH AND KILL

Queries pass through several steps that determine exactly which ones will be killed (or printed, or whatever action is specified).
Understanding these steps will help you match precisely the queries you want.
The first step is grouping queries into classes. The --group-by option controls grouping. By default, this option
has no value so all queries are grouped into one default class. All types of matching and filtering (the next step) are
applied per-class. Therefore, you may need to group queries in order to match/filter some classes but not others.
The second step is matching. Matching implies filtering since if a query doesn’t match some criteria, it is removed
from its class. Matching happens for each class. First, queries are filtered from their class by the various Query
Matches options like --match-user. Then, entire classes are filtered by the various Class Matches options
like --query-count.
The third step is victim selection, that is, which matching queries in each class to kill. This is controlled by the
--victims option. Although many queries in a class may match, you may only want to kill the oldest query, or all
queries, etc.
The fourth and final step is to take some action on all matching queries from all classes. The Actions options specify
which actions will be taken. At this step, there are no more classes, just a single list of queries to kill, print, etc.
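As an illustration only (the grouping, pattern, and threshold below are arbitrary values, not recommendations), the
four steps might be combined like this:
# Step 1: group by fingerprint; step 2: match SELECTs busy longer than
# 10 seconds; step 3: pick the oldest query in each class; step 4: print
# (rather than kill) the chosen victims.
pt-kill --group-by fingerprint --match-info '^SELECT' --busy-time 10 \
  --victims oldest --print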


2.14.6 OUTPUT

If only --kill is given, then there is no output. If only --print is given, then a timestamped KILL statement is
printed for every query that would have been killed, like:
# 2009-07-15T15:04:01 KILL 8 (Query 42 sec) SELECT * FROM huge_table

The line shows a timestamp, the query’s Id (8), its Time (42 sec) and its Info (usually the query SQL).






If both --kill and --print are given, then matching queries are killed and a line for each like the one above is
printed.
Any command executed by --execute-command is responsible for its own output and logging. After being
executed, pt-kill has no control or interaction with the command.


2.14.7 OPTIONS

Specify at least one of --kill, --kill-query, --print, --execute-command or --stop.
--any-busy-time and --each-busy-time are mutually exclusive.
--kill and --kill-query are mutually exclusive.
--daemonize and --test-matching are mutually exclusive.
-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-filter
    type: string
      Discard events for which this Perl code doesn’t return true.
      This option is a string of Perl code or a file containing Perl code that gets compiled into a subroutine with one
      argument: $event. This is a hashref. If the given value is a readable file, then pt-kill reads the entire file and
      uses its contents as the code. The file should not contain a shebang (#!/usr/bin/perl) line.
      If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the
      subroutine other than return $event. The subroutine template is:
      sub { $event = shift; filter && return $event; }

Filters given on the command line are wrapped inside parentheses like ( filter ). For complex, multi-
      line filters, you must put the code inside a file so it will not be wrapped inside parentheses. Either way, the filter
      must produce syntactically valid code given the template. For example, an if-else branch given on the command
      line would not be valid:
--filter 'if () { } else { }'              # WRONG






       Since it’s given on the command line, the if-else branch would be wrapped inside parentheses which is not
       syntactically valid. So to accomplish something more complex like this would require putting the code in a file,
       for example filter.txt:
       my $event_ok; if (...) { $event_ok=1; } else { $event_ok=0; } $event_ok

       Then specify --filter filter.txt to read the code from filter.txt.
       If the filter code won’t compile, pt-kill will die with an error. If the filter code does compile, an error may still
       occur at runtime if the code tries to do something wrong (like pattern match an undefined value). pt-kill does
       not provide any safeguards so code carefully!
       It is permissible for the code to have side effects (to alter $event).
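      For example, a command-line filter might keep only events whose query text looks like a SELECT. This is a
      hypothetical sketch: it assumes the query text is exposed as $event->{Info}, which is not documented here, so
      inspect the $event hashref to confirm the actual attribute names before relying on it:
      pt-kill --match-all --busy-time 60 --print \
        --filter '$event->{Info} && $event->{Info} =~ m/^SELECT/i'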
-group-by
    type: string
       Apply matches to each class of queries grouped by this SHOW PROCESSLIST column. In addition to
       the basic columns of SHOW PROCESSLIST (user, host, command, state, etc.), queries can be matched by
       fingerprint which abstracts the SQL query in the Info column.
       By default, queries are not grouped, so matches and actions apply to all queries. Grouping allows matches and
       actions to apply to classes of similar queries, if any queries in the class match.
       For example, detecting cache stampedes (see all-but-oldest under --victims for an explanation of
       that term) requires that queries are grouped by the arg attribute. This creates classes of identical queries
       (stripped of comments). So queries "SELECT c FROM t WHERE id=1" and "SELECT c FROM t
WHERE id=1" are grouped into the same class, but query "SELECT c FROM t WHERE id=3" is not identical
to the first two queries, so it is grouped into another class. Then when --victims all-but-oldest is
       specified, all but the oldest query in each class is killed for each class of queries that matches the match criteria.
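      For example, a hypothetical anti-stampede invocation along the lines described above might be (the threshold is
      illustrative):
      # Group identical queries, then kill all but the oldest in each class.
      pt-kill --group-by arg --busy-time 30 --victims all-but-oldest --kill --print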
-help
    Show help and exit.
-host
    short form: -h; type: string; default: localhost
       Connect to host.
-interval
    type: time
       How often to check for queries to kill. If --busy-time is not given, then the default interval is 30 seconds.
       Else the default is half as often as --busy-time. If both --interval and --busy-time are given, then
       the explicit --interval value is used.
       See also --run-time.
-log
       type: string
       Print all output to this file when daemonized.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
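      For example, the daemon-related options are often combined; the paths and values below are placeholders only:
      pt-kill --busy-time 300 --kill --interval 30 --daemonize \
        --pid /tmp/pt-kill.pid --log /tmp/pt-kill.log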




-port
    short form: -P; type: int
      Port number to use for connection.
-run-time
    type: time
      How long to run before exiting. By default pt-kill runs forever, or until its process is killed or stopped by the
      creation of a --sentinel file. If this option is specified, pt-kill runs for the specified amount of time and
      sleeps --interval seconds between each check of the PROCESSLIST.
-sentinel
    type: string; default: /tmp/pt-kill-sentinel
      Exit if this file exists.
      The presence of the file specified by --sentinel will cause all running instances of pt-kill to exit. You might
      find this handy to stop cron jobs gracefully if necessary. See also --stop.
-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-stop
    Stop running instances by creating the --sentinel file.
      Causes pt-kill to create the sentinel file specified by --sentinel and exit. This should have the effect of
      stopping all running instances which are watching the same sentinel file.
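      For example, to stop a daemonized instance like the one shown under --pid above, run a second instance with
      --stop; because both use the same default --sentinel file, the running instance will exit. Remove the sentinel
      file before starting pt-kill again, since pt-kill exits if the file exists:
      pt-kill --stop
      rm /tmp/pt-kill-sentinel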
-[no]strip-comments
    default: yes
      Remove SQL comments from queries in the Info column of the PROCESSLIST.
-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.
-victims
    type: string; default: oldest
      Which of the matching queries in each class will be killed. After classes have been matched/filtered, this option
      specifies which of the matching queries in each class will be killed (or printed, etc.). The following values are
      possible:
      oldest
            Only kill the single oldest query. This is to prevent killing queries that aren’t really long-running,
            they’re just long-waiting. This sorts matching queries by Time and kills the one with the highest
            Time value.
      all





           Kill all queries in the class.
      all-but-oldest
           Kill all but the oldest query. This is the inverse of the oldest value.
           This value can be used to prevent “cache stampedes”, the condition where several identical queries
           are executed and create a backlog while the first query attempts to finish. Since all queries are
           identical, all but the first query are killed so that it can complete and populate the cache.
-wait-after-kill
    type: time
      Wait after killing a query, before looking for more to kill. The purpose of this is to give blocked queries a chance
      to execute, so we don’t kill a query that’s blocking a bunch of others, and then kill the others immediately
      afterwards.
-wait-before-kill
    type: time
      Wait before killing a query. The purpose of this is to give --execute-command a chance to see the matching
      query and gather other MySQL or system information before it’s killed.


2.14.8 QUERY MATCHES

These options filter queries from their classes. If a query does not match, it is removed from its class. The --ignore
options take precedence. The matches for command, db, host, etc. correspond to the columns returned by SHOW
PROCESSLIST: Command, db, Host, etc. All pattern matches are case-sensitive by default, but they can be made
case-insensitive by specifying a regex pattern like (?i-xsm:select).
See also “GROUP, MATCH AND KILL”.
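For instance, to make a pattern case-insensitive as described above, embed the modifier in the regex itself (the
pattern and threshold are illustrative):
pt-kill --match-info '(?i-xsm:^select)' --busy-time 60 --print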
-busy-time
    type: time; group: Query Matches
      Match queries that have been running for longer than this time. The queries must be in Command=Query status.
      This matches a query’s Time value as reported by SHOW PROCESSLIST.
-idle-time
    type: time; group: Query Matches
      Match queries that have been idle/sleeping for longer than this time. The queries must be in Command=Sleep
      status. This matches a query’s Time value as reported by SHOW PROCESSLIST.
-ignore-command
    type: string; group: Query Matches
      Ignore queries whose Command matches this Perl regex.
      See --match-command.
-ignore-db
    type: string; group: Query Matches
      Ignore queries whose db (database) matches this Perl regex.
      See --match-db.
-ignore-host
    type: string; group: Query Matches
      Ignore queries whose Host matches this Perl regex.
      See --match-host.




-ignore-info
    type: string; group: Query Matches
     Ignore queries whose Info (query) matches this Perl regex.
     See --match-info.
-[no]ignore-self
    default: yes; group: Query Matches
Don’t kill pt-kill’s own connection.
-ignore-state
    type: string; group: Query Matches; default: Locked
     Ignore queries whose State matches this Perl regex. The default is to keep threads from being killed if they are
     locked waiting for another thread.
     See --match-state.
-ignore-user
    type: string; group: Query Matches
     Ignore queries whose user matches this Perl regex.
     See --match-user.
-match-all
    group: Query Matches
     Match all queries that are not ignored. If no ignore options are specified, then every query matches (except
     replication threads, unless --replication-threads is also specified). This option allows you to specify
     negative matches, i.e. “match every query except...” where the exceptions are defined by specifying various
     --ignore options.
     This option is not the same as --victims all. This option matches all queries within a class, whereas
     --victims all specifies that all matching queries in a class (however they matched) will be killed. Normally,
     however, the two are used together because if, for example, you specify --victims oldest, then although
     all queries may match, only the oldest will be killed.
-match-command
    type: string; group: Query Matches
     Match only queries whose Command matches this Perl regex.
     Common Command values are:
     Query
     Sleep
     Binlog Dump
     Connect
     Delayed insert
     Execute
     Fetch
     Init DB
     Kill
     Prepare
     Processlist
     Quit
     Reset stmt
     Table Dump

See http://dev.mysql.com/doc/refman/5.1/en/thread-commands.html for a full list and description of Command
     values.




-match-db
    type: string; group: Query Matches
      Match only queries whose db (database) matches this Perl regex.
-match-host
    type: string; group: Query Matches
      Match only queries whose Host matches this Perl regex.
The Host value often includes the port, like “host:port”.
-match-info
    type: string; group: Query Matches
      Match only queries whose Info (query) matches this Perl regex.
      The Info column of the processlist shows the query that is being executed or NULL if no query is being executed.
-match-state
    type: string; group: Query Matches
      Match only queries whose State matches this Perl regex.
      Common State values are:
      Locked
      login
      copy to tmp table
      Copying to tmp table
      Copying to tmp table on disk
      Creating tmp table
      executing
      Reading from net
      Sending data
      Sorting for order
      Sorting result
      Table lock
      Updating

See http://dev.mysql.com/doc/refman/5.1/en/general-thread-states.html for a full list and description of State
      values.
-match-user
    type: string; group: Query Matches
      Match only queries whose User matches this Perl regex.
-replication-threads
    group: Query Matches
      Allow matching and killing replication threads.
      By default, matches do not apply to replication threads; i.e. replication threads are completely ignored. Speci-
      fying this option allows matches to match (and potentially kill) replication threads on masters and slaves.
-test-matching
    type: array; group: Query Matches
      Files with processlist snapshots to test matching options against. Since the matching options can be complex,
      you can save snapshots of processlist in files, then test matching options against queries in those files.
      This option disables --run-time, --interval, and --[no]ignore-self.






2.14.9 CLASS MATCHES

These matches apply to entire query classes. Classes are created by specifying the --group-by option, else all
queries are members of a single, default class.
See also “GROUP, MATCH AND KILL”.
-any-busy-time
    type: time; group: Class Matches
     Match query class if any query has been running for longer than this time. “Longer than” means that if you
     specify 10, for example, the class will only match if there’s at least one query that has been running for greater
     than 10 seconds.
     See --each-busy-time for more details.
-each-busy-time
    type: time; group: Class Matches
     Match query class if each query has been running for longer than this time. “Longer than” means that if you
     specify 10, for example, the class will only match if each and every query has been running for greater than 10
     seconds.
     See also --any-busy-time (to match a class if ANY query has been running longer than the specified time)
     and --busy-time.
-query-count
    type: int; group: Class Matches
     Match query class if it has at least this many queries. When queries are grouped into classes by specify-
     ing --group-by, this option causes matches to apply only to classes with at least this many queries. If
     --group-by is not specified then this option causes matches to apply only if there are at least this many
     queries in the entire SHOW PROCESSLIST.
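      As a sketch (the thresholds are arbitrary), class matches might be combined with grouping like this:
      # Match a class only if it contains at least 10 identical queries and at
      # least one of them has been running for more than 5 seconds.
      pt-kill --group-by arg --query-count 10 --any-busy-time 5 \
        --victims all-but-oldest --print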
-verbose
    short form: -v
     Print information to STDOUT about what is being done.


2.14.10 ACTIONS

These actions are taken for every matching query from all classes. The actions are taken in this order: --print,
--execute-command, --kill / --kill-query. This order allows --execute-command to see the output
of --print and the query before --kill / --kill-query. This may be helpful because pt-kill does not
pass any information to --execute-command.
See also “GROUP, MATCH AND KILL”.
-execute-command
    type: string; group: Actions
     Execute this command when a query matches.
      After the command is executed, pt-kill has no control over it, so the command is responsible for its own info
      gathering, logging, interval, etc. The command is executed each time a query matches, so be careful that the
      command behaves well when multiple instances are run. No information from pt-kill is passed to the command.
     See also --wait-before-kill.
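      For example, an external notification script might be run alongside --print and --kill; the script path is a
      placeholder, and as noted above the script receives no information from pt-kill:
      pt-kill --busy-time 120 --print --execute-command '/usr/local/bin/notify-dba.sh' \
        --wait-before-kill 5 --kill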
-kill
    group: Actions





      Kill the connection for matching queries.
      This option makes pt-kill kill the connections (a.k.a. processes, threads) that have matching queries. Use
      --kill-query if you only want to kill individual queries and not their connections.
      Unless --print is also given, no other information is printed that shows that pt-kill matched and killed a
      query.
      See also --wait-before-kill and --wait-after-kill.
-kill-query
    group: Actions
      Kill matching queries.
      This option makes pt-kill kill matching queries. This requires MySQL 5.0 or newer. Unlike --kill which
      kills the connection for matching queries, this option only kills the query, not its connection.
-print
    group: Actions
      Print a KILL statement for matching queries; does not actually kill queries.
      If you just want to see which queries match and would be killed without actually killing them, specify --print.
      To both kill and print matching queries, specify both --kill and --print.


2.14.11 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
   • D
      dsn: database; copy: yes
      Default database.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • p
      dsn: password; copy: yes
      Password to use when connecting.
   • P






      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.
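For example, the connection can be given with the standard connection options, which correspond to the h, P, and
u DSN keys above (the host and user below are placeholders):
pt-kill --host db1.example.com --port 3306 --user monitor --ask-pass \
  --busy-time 60 --print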


2.14.12 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-kill ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.14.13 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.14.14 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-kill.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.14.15 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb






You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.14.16 AUTHORS

Baron Schwartz and Daniel Nichter


2.14.17 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.14.18 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2009-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.14.19 VERSION

pt-kill 2.1.1


2.15 pt-log-player

2.15.1 NAME

pt-log-player - Replay MySQL query logs.


2.15.2 SYNOPSIS

Usage

pt-log-player [OPTION...] [DSN]






pt-log-player splits and plays slow log files.
Split slow.log on Thread_id into 16 session files, save in ./sessions:
pt-log-player --split Thread_id --session-files 16 --base-dir ./sessions slow.log

Play all those sessions on host1, save results in ./results:
pt-log-player --play ./sessions --base-dir ./results h=host1

Use pt-query-digest to summarize the results:
pt-query-digest ./results/*



2.15.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
This tool is meant to load a server as much as possible, for stress-testing purposes. It is not designed to be used on
production servers.
At the time of this release there is a bug which causes pt-log-player to exceed max open files during --split.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-log-player.
See also “BUGS” for more information on filing bugs and getting help.


2.15.4 DESCRIPTION

pt-log-player does two things: it splits MySQL query logs into session files and it plays (executes) queries in session
files on a MySQL server. Only session files can be played; slow logs cannot be played directly without being split.
A session is a group of queries from the slow log that all share a common attribute, usually Thread_id. The common
attribute is specified with --split. Multiple sessions are saved into a single session file. See --session-files,
--max-sessions, --base-file-name and --base-dir. These session files are played with --play.
pt-log-player will --play session files in parallel, using the number of processes given by --threads. (They’re not technically
threads, but we call them that anyway.) Each thread will play all the sessions in its given session files. The sessions are played
as fast as possible (there are no delays) because the goal is to stress-test and load-test the server. So be careful using
this script on a production server!
Each --play thread writes its results to a separate file. These result files are in slow log format so they can be
aggregated and summarized with pt-query-digest. See “OUTPUT”.


2.15.5 OUTPUT

Both --split and --play have two outputs: status messages printed to STDOUT to let you know what the script
is doing, and session or result files written to separate files saved in --base-dir. You can suppress all output to
STDOUT for each with --quiet, or increase output with --verbose.
The session files written by --split are simple text files containing queries grouped into sessions. For example:







-- START SESSION 10

use foo

SELECT col FROM foo_tbl

The format of these session files is important: each query must be on a single line, and queries must be separated by a single
blank line. The “-- START SESSION” comment tells pt-log-player where individual sessions begin and end so that --play can
correctly fake Thread_id in its result files.
The result files written by --play are in slow log format with a minimal header: the only attributes printed are
Thread_id, Query_time and Schema.


2.15.6 OPTIONS

Specify at least one of --play, --split or --split-random.
--play and --split are mutually exclusive.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    group: Play
      Prompt for a password when connecting to MySQL.
-base-dir
    type: string; default: ./
      Base directory for --split session files and --play result file.
-base-file-name
    type: string; default: session
      Base file name for --split session files and --play result file.
      Each --split session file will be saved as <base-file-name>-N.txt, where N is a four digit, zero-padded
      session ID. For example: session-0003.txt.
      Each --play result file will be saved as <base-file-name>-results-PID.txt, where PID is the process ID of the
      executing thread.
      All files are saved in --base-dir.
-charset
    short form: -A; type: string; group: Play
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file.
-dry-run
    Print which processes play which session files then exit.




-filter
    type: string; group: Split
      Discard --split events for which this Perl code doesn’t return true.
      This option only works with --split.
      This option allows you to inject Perl code into the tool to affect how the tool runs. Usually your code should
      examine $event to decide whether or not to allow the event. $event is a hashref of attributes and values
      of the event being filtered. Or, your code could add new attribute-value pairs to $event for use by other
      options that accept event attributes as their value. You can find an explanation of the structure of $event at
      http://code.google.com/p/maatkit/wiki/EventAttributes.
      There are two ways to supply your code: on the command line or in a file. If you supply your code on the
      command line, it is injected into the following subroutine where $filter is your code:
      sub {
         PTDEBUG && _d('callback: filter');
         my( $event ) = shift;
         ( $filter ) && return $event;
      }

      Therefore you must ensure two things: first, that you correctly escape any special characters that need to be
      escaped on the command line for your shell, and two, that your code is syntactically valid when injected into
      the subroutine above.
      Here’s an example filter supplied on the command line that discards events that are not SELECT statements:
      --filter '$event->{arg} =~ m/^select/i'

      The second way to supply your code is in a file. If your code is too complex to be expressed on the command
      line that results in valid syntax in the subroutine above, then you need to put the code in a file and give the
      file name as the value to --filter. The file should not contain a shebang (#!/usr/bin/perl) line. The
      entire contents of the file is injected into the following subroutine:
      sub {
         PTDEBUG && _d('callback: filter');
         my( $event ) = shift;
         $filter && return $event;
      }

      That subroutine is almost identical to the one above except your code is not wrapped in parentheses. This allows
      you to write multi-line code like:
      my $event_ok;
      if (...) {
         $event_ok = 1;
      }
      else {
         $event_ok = 0;
      }
      $event_ok

      Notice that the last line is not syntactically valid by itself, but it becomes syntactically valid when injected into
      the subroutine because it becomes:
      $event_ok && return $event;

      If your code doesn’t compile, the tool will die with an error. Even if your code compiles, it may crash the
      tool at runtime if, for example, it tries to pattern match an undefined value. No safeguards of any kind are
      provided, so code carefully!




-help
    Show help and exit.
-host
    short form: -h; type: string; group: Play
       Connect to host.
-iterations
    type: int; default: 1; group: Play
       How many times each thread should play all its session files.
-max-sessions
    type: int; default: 5000000; group: Split
       Maximum number of sessions to --split.
       By default, pt-log-player tries to split every session from the log file. For huge logs, however, this can result in
       millions of sessions. This option causes only the first N number of sessions to be saved. All sessions after this
       number are ignored, but sessions split before this number will continue to have their queries split even if those
       queries appear near the end of the log and after this number has been reached.
-only-select
    group: Play
       Play only SELECT and USE queries; ignore all others.
-password
    short form: -p; type: string; group: Play
       Password to use when connecting.
-pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
-play
    type: string; group: Play
       Play (execute) session files created by --split.
       The argument to play must be a comma-separated list of session files created by --split or a directory. If the
       argument is a directory, ALL files in that directory will be played.
-port
    short form: -P; type: int; group: Play
       Port number to use for connection.
-print
    group: Play
       Print queries instead of playing them; requires --play.
       You must also specify --play with --print. Although the queries will not be executed, --play is required
       to specify which session files to read.






-quiet
    short form: -q
      Do not print anything; disables --verbose.
-[no]results
    default: yes
      Print --play results to files in --base-dir.
-session-files
    type: int; default: 8; group: Split
      Number of session files to create with --split.
      The number of session files should either be equal to the number of --threads you intend to --play or be
      an even multiple of --threads. This number is important for maximum performance because it:
      * allows each thread to have roughly the same amount of sessions to play
      * avoids having to open/close many session files
      * avoids disk IO overhead by doing large sequential reads

      You may want to increase this number beyond --threads if each session file becomes too large. For example,
      splitting a 20G log into 8 session files may yield roughly eight 2G session files.
      See also --max-sessions.
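      For example, a split/play pair following the guideline above might use 8 session files and 4 threads (an even
      multiple); the file and host names are placeholders:
      pt-log-player --split Thread_id --session-files 8 --base-dir ./sessions slow.log
      pt-log-player --play ./sessions --threads 4 --base-dir ./results h=host1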
-set-vars
    type: string; group: Play; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string; group: Play
      Socket file to use for connection.
-split
    type: string; group: Split
      Split log by given attribute to create session files.
      Valid attributes are any which appear in the log: Thread_id, Schema, etc.
-split-random
    group: Split
      Split log without an attribute, write queries round-robin to session files.
      This option, if specified, overrides --split and causes the log to be split query-by-query, writing each query
      to the next session file in round-robin style. If you don’t care about “sessions” and just want to split a log into N
      session files and the relation or order of the queries does not matter, then use this option.
-threads
    type: int; default: 2; group: Play
      Number of threads used to play sessions concurrently.
      Specifies the number of parallel processes to run. The default is 2. On GNU/Linux machines, the default is the
      number of times ‘processor’ appears in /proc/cpuinfo. On Windows, the default is read from the environment.
      In any case, the default is at least 2, even when there’s only a single processor.
      See also --session-files.





-type
    type: string; group: Split
      The type of log to --split (default slowlog). The permitted types are
      binlog
           Split the output of running mysqlbinlog against a binary log file. Currently, splitting binary logs
           does not always work well depending on what the binary logs contain. Be sure to check the session
           files after splitting to ensure proper “OUTPUT”.
            If the binary log contains row-based replication data, you need to run mysqlbinlog with the options
            --base64-output=decode-rows --verbose; otherwise, invalid statements will be written to the
            session files. See the example after this option’s description.
      genlog
           Split a general log file.
      slowlog
           Split a log file in any variation of MySQL slow-log format.
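      For the binlog case described above, a hypothetical invocation (the file names are placeholders) might decode the
      binary log first and then split it query-by-query; check the resulting session files afterwards as noted:
      mysqlbinlog --base64-output=decode-rows --verbose mysql-bin.000001 > decoded.log
      pt-log-player --split-random --type binlog --base-dir ./sessions decoded.log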
-user
    short form: -u; type: string; group: Play
      User for login if not current user.
-verbose
    short form: -v; cumulative: yes; default: 0
      Increase verbosity; can be specified multiple times.
      This option is disabled by --quiet.
-version
    Show version and exit.
-[no]warnings
    default: no; group: Play
      Print warnings about SQL errors such as invalid queries to STDERR.


2.15.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Default database.
    • F






      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.15.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-log-player ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.15.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.15.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-log-player.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)




If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.15.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.15.12 AUTHORS

Daniel Nichter


2.15.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.15.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2008-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.15.15 VERSION

pt-log-player 2.1.1







2.16 pt-mext

2.16.1 NAME

pt-mext - Look at many samples of MySQL SHOW GLOBAL STATUS side-by-side.


2.16.2 SYNOPSIS

Usage

pt-mext [OPTIONS] -- COMMAND

pt-mext columnizes repeated output from a program like mysqladmin extended.
Get output from mysqladmin:
pt-mext -r -- mysqladmin ext -i10 -c3

Get output from a file:
pt-mext -r -- cat mysqladmin-output.txt



2.16.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-mext is a read-only tool. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-mext.
See also “BUGS” for more information on filing bugs and getting help.


2.16.4 DESCRIPTION

pt-mext executes the COMMAND you specify, and reads through the result one line at a time. It places each line into a
temporary file. When it finds a blank line, it assumes that a new sample of SHOW GLOBAL STATUS is starting, and
it creates a new temporary file. At the end of this process, it has a number of temporary files. It joins the temporary
files together side-by-side and prints the result. If the “-r” option is given, it first subtracts each sample from the one
after it before printing results.


2.16.5 OPTIONS

       -r                     Relative: subtract each column from the previous column.


2.16.6 ENVIRONMENT

This tool does not use any environment variables.




2.16.7 SYSTEM REQUIREMENTS

This tool requires the Bourne shell (/bin/sh) and the seq program.


2.16.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-mext.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.16.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.16.10 AUTHORS

Baron Schwartz


2.16.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.






2.16.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.16.13 VERSION

pt-mext 2.1.1


2.17 pt-mysql-summary

2.17.1 NAME

pt-mysql-summary - Summarize MySQL information nicely.


2.17.2 SYNOPSIS

Usage

pt-mysql-summary [OPTIONS] [-- MYSQL OPTIONS]

pt-mysql-summary conveniently summarizes the status and configuration of a MySQL database server so that you
can learn about it at a glance. It is not a tuning tool or diagnosis tool. It produces a report that is easy to diff and can
be pasted into emails without losing the formatting. It should work well on any modern UNIX system.


2.17.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-mysql-summary is a read-only tool. It should be very low-risk.
At the time of this release, we know of no bugs that could harm users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-mysql-summary.
See also “BUGS” for more information on filing bugs and getting help.






2.17.4 DESCRIPTION

pt-mysql-summary works by connecting to a MySQL database server and querying it for status and configuration
information. It saves these bits of data into files in a temporary directory, and then formats them neatly with awk and
other scripting languages.
To use, simply execute it. Optionally add a double dash and then the same command-line options you would use to
connect to MySQL, such as the following:
pt-mysql-summary -- --user=root

The tool interacts minimally with the server upon which it runs. It assumes that you’ll run it on the same server you’re
inspecting, and therefore it assumes that it will be able to find the my.cnf configuration file, for example. However, it
should degrade gracefully if this is not the case. Note, however, that its output does not indicate which information
comes from the MySQL database and which comes from the host operating system, so it is possible for confusing
output to be generated if you run the tool on one server and connect to a MySQL database server running on another
server.


2.17.5 OUTPUT

Many of the outputs from this tool are deliberately rounded to show their magnitude but not the exact detail. This is
called fuzzy-rounding. The idea is that it does not matter whether a server is running 918 queries per second or 921
queries per second; such a small variation is insignificant, and only makes the output hard to compare to other servers.
Fuzzy-rounding rounds in larger increments as the input grows. It begins by rounding to the nearest 5, then the nearest
10, nearest 25, and then repeats by a factor of 10 larger (50, 100, 250), and so on, as the input grows.
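The following is a minimal sketch of that idea in awk, for illustration only; the thresholds below are assumptions,
not the tool’s actual code:
echo 918 | awk '{
   n = $1; factor = 1;
   while (1) {
      if      (n <= 50  * factor) { inc = 5  * factor; break }   # nearest 5, 50, 500, ...
      else if (n <= 100 * factor) { inc = 10 * factor; break }   # nearest 10, 100, 1000, ...
      else if (n <= 250 * factor) { inc = 25 * factor; break }   # nearest 25, 250, 2500, ...
      factor *= 10;
   }
   printf "%d\n", int(n / inc + 0.5) * inc;   # 918 and 921 both print 900
}'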
The following is a sample of the report that the tool produces:
# Percona Toolkit MySQL Summary Report #######################
              System time | 2012-03-30 18:46:05 UTC
                            (local TZ: EDT -0400)
# Instances ##################################################
  Port Data Directory              Nice OOM Socket
  ===== ========================== ==== === ======
  12345 /tmp/12345/data            0    0   /tmp/12345.sock
  12346 /tmp/12346/data            0    0   /tmp/12346.sock
  12347 /tmp/12347/data            0    0   /tmp/12347.sock

The first two sections show which server the report was generated on and which MySQL instances are running on the
server. This is detected from the output of ps and does not always detect all instances and parameters, but often works
well. From this point forward, the report will be focused on a single MySQL instance, although several instances may
appear in the above paragraph.
# Report On Port 12345 #######################################
                     User | msandbox@%
                     Time | 2012-03-30 14:46:05 (EDT)
                 Hostname | localhost.localdomain
                  Version | 5.5.20-log MySQL Community Server (GPL)
                 Built On | linux2.6 i686
                  Started | 2012-03-28 23:33 (up 1+15:12:09)
                Databases | 4
                  Datadir | /tmp/12345/data/
                Processes | 2 connected, 2 running
              Replication | Is not a slave, has 1 slaves connected
                  Pidfile | /tmp/12345/data/12345.pid (exists)






This section is a quick summary of the MySQL instance: version, uptime, and other very basic parameters. The Time
output is generated from the MySQL server, unlike the system date and time printed earlier, so you can see whether
the database and operating system times match.
# Processlist ################################################

  Command                        COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  Binlog Dump                           1       1    150000    150000
  Query                                 1       1         0         0

  User                           COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  msandbox                              2       2    150000    150000

  Host                           COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  localhost                             2       2    150000    150000

  db                             COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  NULL                                  2       2    150000    150000

  State                          COUNT(*) Working SUM(Time) MAX(Time)
  ------------------------------ -------- ------- --------- ---------
  Master has sent all binlog to         1       1    150000    150000
  NULL                                  1       1         0         0

This section is a summary of the output from SHOW PROCESSLIST. Each sub-section is aggregated by a different
item, which is shown as the first column heading. When summarized by Command, every row in SHOW PROCESSLIST
is included, but otherwise, rows whose Command is Sleep are excluded from the SUM and MAX columns, so they do
not skew the numbers too much. In the example shown, the server is idle except for this tool itself, and one connected
replica, which is executing Binlog Dump.
The columns are the number of rows included, the number that are not in Sleep status, the sum of the Time column,
and the maximum Time column. The numbers are fuzzy-rounded.
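The User sub-section, for example, is roughly what the following query would produce (a sketch only; the tool actually
parses SHOW PROCESSLIST output in the shell rather than running this query):
mysql -e "SELECT user,
                 COUNT(*)                             AS total,
                 SUM(command <> 'Sleep')              AS working,
                 SUM(IF(command <> 'Sleep', time, 0)) AS sum_time,
                 MAX(IF(command <> 'Sleep', time, 0)) AS max_time
          FROM information_schema.processlist
          GROUP BY user"
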
# Status Counters (Wait 10 Seconds) ##########################
Variable                            Per day Per second      10 secs
Binlog_cache_disk_use                     4
Binlog_cache_use                         80
Bytes_received                     15000000         175         200
Bytes_sent                         15000000         175        2000
Com_admin_commands                        1
...................(many lines omitted)............................
Threads_created                          40                       1
Uptime                                90000           1           1

This section shows selected counters from two snapshots of SHOW GLOBAL STATUS, gathered approximately 10
seconds apart and fuzzy-rounded. It includes only items that are incrementing counters; it does not include absolute
numbers such as the Threads_running status variable, which represents a current value, rather than an accumulated
number over time.
The first column is the variable name. The second column is the counter from the first snapshot, divided by the server’s
uptime and multiplied by 86400 (the number of seconds in a day), so it shows the approximate magnitude of the counter’s
change per day. Because 86400 fuzzy-rounds to 90000, the Uptime counter should always show about 90000.
The third column is the value from the first snapshot, divided by Uptime and then fuzzy-rounded, so it represents
approximately how quickly the counter is growing per-second over the uptime of the server.


The fourth column is the incremental difference between the first and second snapshots, divided by the difference in
uptime and then fuzzy-rounded. It therefore shows how quickly the counter was growing per second at the time the
report was generated.
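In shell terms, the three numeric columns for a single counter are computed roughly as follows (a sketch of the
arithmetic, not the tool's code; the variable names are placeholders for the raw values):
# C1 and U1 are the counter and Uptime from the first snapshot; C2 and U2 are
# from the second snapshot, taken about 10 seconds later. Integer math for brevity.
per_day=$(( C1 * 86400 / U1 ))                # column 2: change per day
per_second=$(( C1 / U1 ))                     # column 3: average rate over the whole uptime
last_10_seconds=$(( (C2 - C1) / (U2 - U1) ))  # column 4: rate during the 10-second sample
# each result is fuzzy-rounded before it is printed
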
# Table cache ################################################
                     Size | 400
                    Usage | 15%

This section shows the size of the table cache, followed by the percentage of the table cache in use. The usage is
fuzzy-rounded.
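The underlying values can be checked by hand with something like this (the variable is named table_cache before
MySQL 5.1.3 and table_open_cache afterward); the usage figure is roughly Open_tables divided by the cache size:
mysql -e "SHOW GLOBAL VARIABLES LIKE 'table_open_cache'; SHOW GLOBAL STATUS LIKE 'Open_tables'"
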
# Key Percona Server features ################################
      Table & Index Stats | Not Supported
     Multiple I/O Threads | Enabled
     Corruption Resilient | Not Supported
      Durable Replication | Not Supported
     Import InnoDB Tables | Not Supported
     Fast Server Restarts | Not Supported
         Enhanced Logging | Not Supported
     Replica Perf Logging | Not Supported
      Response Time Hist. | Not Supported
          Smooth Flushing | Not Supported
      HandlerSocket NoSQL | Not Supported
           Fast Hash UDFs | Unknown

This section shows features that are available in Percona Server and whether they are enabled or not. In the example
shown, the server is standard MySQL, not Percona Server, so the features are generally not supported.
# Plugins ####################################################
       InnoDB compression | ACTIVE

This section shows specific plugins and whether they are enabled.
# Query cache ################################################
         query_cache_type | ON
                     Size | 0.0
                    Usage | 0%
         HitToInsertRatio | 0%

This section shows whether the query cache is enabled and its size, followed by the percentage of the cache in use and
the hit-to-insert ratio. The latter two are fuzzy-rounded.
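Roughly the same figures can be derived by hand from the raw counters (a sketch, not the tool's exact arithmetic):
mysql -e "SHOW GLOBAL VARIABLES LIKE 'query_cache_size'; SHOW GLOBAL STATUS LIKE 'Qcache%'"
# Usage is roughly 1 - (Qcache_free_memory / query_cache_size);
# HitToInsertRatio is roughly Qcache_hits / Qcache_inserts.
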
# Schema #####################################################
Would you like to mysqldump -d the schema and analyze it? y/n y
There are 4 databases. Would you like to dump all, or just one?
Type the name of the database, or press Enter to dump all of them.

  Database           Tables Views SPs Trigs Funcs                     FKs Partn
  mysql                  24
  performance_schema     17
  sakila                 16     7   3     6     3                      22

  Database           MyISAM CSV PERFORMANCE_SCHEMA InnoDB
  mysql                  22   2
  performance_schema                            17
  sakila                  8                            15

  Database                  BTREE FULLTEXT
  mysql                        31
  performance_schema
  sakila                           63           1

                      c   t   s   e   l   d   i   t   m   v   s
                      h   i   e   n   o   a   n   i   e   a   m
                      a   m   t   u   n   t   t   n   d   r   a
                      r   e       m   g   e       y   i   c   l
                          s           b   t       i   u   h   l
                          t           l   i       n   m   a   i
                          a           o   m       t   t   r   n
                          m           b   e           e       t
                          p                           x
                                                      t
  Database           === === === === === === === === === === ===
  mysql               61  10   6  78   5   4  26   3   4   5   3
  performance_schema               5          16          33
  sakila               1  15   1   3       4   3  19      42  26

If you choose to dump the schema and analyze it, the tool prints the section above, which summarizes the number
and type of objects in the database. It is generated by running mysqldump --no-data, not by querying the
INFORMATION_SCHEMA, which can freeze a busy server. You can use the --databases option to specify
which databases to examine. If you do not, and you run the tool interactively, it will prompt you as shown.
You can choose not to dump the schema, to dump all of the databases, or to dump only a single named one, by
specifying the appropriate options. In the example above, we are dumping all databases.
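For example, to skip the prompt and summarize only two schemas non-interactively (the connection values here match
the sandbox used in this report and are purely illustrative):
pt-mysql-summary --databases mysql,sakila -- --user=msandbox --password=msandbox --socket=/tmp/12345/mysql_sandbox12345.sock
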
The first sub-report in the section is the count of objects by type in each database: tables, views, and so on. The
second one shows how many tables use various storage engines in each database. The third sub-report shows the
number of indexes of each type in each database.
The last sub-report shows the number of columns of various data types in each database. For compact display, the
column headers are formatted vertically, so you need to read downwards from the top. In this example, the first
column is char and the second column is timestamp. The example is truncated so that it does not wrap on a terminal.
All of the numbers in this portion of the output are exact, not fuzzy-rounded.
# Noteworthy Technologies           ####################################
       Full Text Indexing           | Yes
         Geospatial Types           | No
             Foreign Keys           | Yes
             Partitioning           | No
       InnoDB Compression           | Yes
                      SSL           | No
     Explicit LOCK TABLES           | No
           Delayed Insert           | No
          XA Transactions           | No
              NDB Cluster           | No
      Prepared Statements           | No
 Prepared statement count           | 0

This section shows some specific technologies used on this server. Some of them are detected from the schema dump
performed for the previous sections; others can be detected by looking at SHOW GLOBAL STATUS.
# InnoDB #####################################################
                  Version | 1.1.8
         Buffer Pool Size | 16.0M
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 0%
            File Per Table | OFF
                 Page Size | 16k
             Log File Size | 2 * 5.0M = 10.0M
           Log Buffer Size | 8M
              Flush Method |
       Flush Log At Commit | 1
                XA Support | ON
                 Checksums | ON
               Doublewrite | ON
           R/W I/O Threads | 4 4
              I/O Capacity | 200
        Thread Concurrency | 0
       Concurrency Tickets | 500
        Commit Concurrency | 0
       Txn Isolation Level | REPEATABLE-READ
         Adaptive Flushing | ON
       Adaptive Checkpoint |
            Checkpoint Age | 0
              InnoDB Queue | 0 queries inside InnoDB, 0 queries in queue
        Oldest Transaction | 0 Seconds
          History List Len | 209
                Read Views | 1
          Undo Log Entries | 1 transactions, 1 total undo, 1 max undo
         Pending I/O Reads | 0 buf pool reads, 0 normal AIO, 0 ibuf AIO, 0 preads
        Pending I/O Writes | 0 buf pool (0 LRU, 0 flush list, 0 page);
                             0 AIO, 0 sync, 0 log IO (0 log, 0 chkp); 0 pwrites
       Pending I/O Flushes | 0 buf pool, 0 log
        Transaction States | 1xnot started

This section shows important configuration variables for the InnoDB storage engine. The buffer pool fill percent and
dirty percent are fuzzy-rounded. The last few lines are derived from the output of SHOW INNODB STATUS. It is
likely that this output will change in the future to become more useful.
# MyISAM #####################################################
                Key Cache | 16.0M
                 Pct Used | 10%
                Unflushed | 0%

This section shows the size of the MyISAM key cache, followed by the percentage of the cache in use and percentage
unflushed (fuzzy-rounded).
# Security ###################################################
                    Users | 2 users, 0 anon, 0 w/o pw, 0 old pw
            Old Passwords | OFF

This section is generated from queries to tables in the mysql system database. It shows how many users exist, and
various potential security risks such as old-style passwords and users without passwords.
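The counts can be reproduced approximately with a query such as this (a sketch; the tool's own query may differ in detail):
mysql -e "SELECT COUNT(*) AS users,
                 SUM(user = '') AS anon,
                 SUM(password = '') AS without_pw,
                 SUM(LENGTH(password) = 16) AS old_pw
          FROM mysql.user"
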
# Binary Logging #############################################
                  Binlogs | 1
               Zero-Sized | 0
               Total Size | 21.8M
            binlog_format | STATEMENT
         expire_logs_days | 0
              sync_binlog | 0
                server_id | 12345
             binlog_do_db |
         binlog_ignore_db |


This section shows configuration and status of the binary logs. If there are zero-sized binary logs, then it is possible
that the binlog index is out of sync with the binary logs that actually exist on disk.
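A quick manual cross-check compares what the server lists against the files on disk (the paths here follow the example
above; adjust them for your server):
mysql -e "SHOW BINARY LOGS"          # binlogs the server knows about from its index
ls -lh /tmp/12345/data/mysql-bin.*   # binlog files that actually exist on disk
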
# Noteworthy Variables #######################################
     Auto-Inc Incr/Offset | 1/1
   default_storage_engine | InnoDB
               flush_time | 0
             init_connect |
                init_file |
                 sql_mode |
         join_buffer_size | 128k
         sort_buffer_size | 2M
         read_buffer_size | 128k
     read_rnd_buffer_size | 256k
       bulk_insert_buffer | 0.00
      max_heap_table_size | 16M
           tmp_table_size | 16M
       max_allowed_packet | 1M
             thread_stack | 192k
                      log | OFF
                log_error | /tmp/12345/data/mysqld.log
             log_warnings | 1
         log_slow_queries | ON
log_queries_not_using_indexes | OFF
        log_slave_updates | ON

This section shows several noteworthy server configuration variables that might be important to know about when
working with this server.
# Configuration File #########################################
              Config File | /tmp/12345/my.sandbox.cnf
[client]
user                                = msandbox
password                            = msandbox
port                                = 12345
socket                              = /tmp/12345/mysql_sandbox12345.sock
[mysqld]
port                                = 12345
socket                              = /tmp/12345/mysql_sandbox12345.sock
pid-file                            = /tmp/12345/data/mysql_sandbox12345.pid
basedir                             = /home/baron/5.5.20
datadir                             = /tmp/12345/data
key_buffer_size                     = 16M
innodb_buffer_pool_size             = 16M
innodb_data_home_dir                = /tmp/12345/data
innodb_log_group_home_dir           = /tmp/12345/data
innodb_data_file_path               = ibdata1:10M:autoextend
innodb_log_file_size                = 5M
log-bin                             = mysql-bin
relay_log                           = mysql-relay-bin
log_slave_updates
server-id                           = 12345
report-host                         = 127.0.0.1
report-port                         = 12345
log-error                           = mysqld.log
innodb_lock_wait_timeout            = 3
# The End ####################################################



This section shows a pretty-printed version of the my.cnf file, with comments removed and with whitespace added to
align things for easy reading. The tool tries to detect the my.cnf file by looking at the output of ps, and if it does not
find the location of the file there, it tries common locations until it finds a file. Note that this file might not actually
correspond with the server from which the report was generated. This can happen when the tool isn’t run on the same
server it’s reporting on, or when detecting the location of the configuration file fails.
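By hand, you can approximate the detection by checking whether the server was started with an explicit
--defaults-file (a rough sketch; the tool's own logic also falls back to common locations):
ps -o cmd= -C mysqld | tr ' ' '\n' | grep -- '--defaults-file='
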


2.17.6 OPTIONS

All options after -- are passed to mysql.
--config
    type: string
      Read this comma-separated list of config files. If specified, this must be the first option on the command line.
--help
    Print help and exit.
--save-samples
    type: string
      Save the data files used to generate the summary in this directory.
--read-samples
    type: string
      Create a report from the files found in this directory.
--databases
    type: string
      Names of databases to summarize. If you want all of them, you can use the value --all-databases; you
      can also pass in a comma-separated list of database names. If not provided, the program will ask you for manual
      input.
--sleep
    type: int; default: 10
      Seconds to sleep when gathering status counters.
--version
    Print tool’s version and exit.


2.17.7 ENVIRONMENT

This tool does not use any environment variables.


2.17.8 SYSTEM REQUIREMENTS

This tool requires Bash v3 or newer, Perl 5.8 or newer, and binutils. These are generally already provided by most
distributions. On BSD systems, it may require a mounted procfs.


2.17.9 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-mysql-summary.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:


    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.17.10 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.17.11 AUTHORS

Baron Schwartz, Brian Fraser, and Daniel Nichter.


2.17.12 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.17.13 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.



2.17.14 VERSION

pt-mysql-summary 2.1.1


2.18 pt-online-schema-change

2.18.1 NAME

pt-online-schema-change - ALTER tables without locking them.


2.18.2 SYNOPSIS

Usage

pt-online-schema-change [OPTIONS] DSN

pt-online-schema-change alters a table’s structure without blocking reads or writes. Specify the database and table
in the DSN. Do not use this tool before reading its documentation and checking your backups carefully.
Add a column to sakila.actor:
pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor

Change sakila.actor to InnoDB, effectively performing OPTIMIZE TABLE in a non-blocking fashion because it is
already an InnoDB table:
pt-online-schema-change --alter "ENGINE=InnoDB" D=sakila,t=actor



2.18.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-online-schema-change modifies data and structures. You should be careful with it, and test it before using it in
production. You should also ensure that you have recoverable backups before using this tool.
At the time of this release, we know of no bugs that could cause harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL:
http://www.percona.com/bugs/pt-online-schema-change.
See also “BUGS” for more information on filing bugs and getting help.


2.18.4 DESCRIPTION

pt-online-schema-change emulates the way that MySQL alters tables internally, but it works on a copy of the table
you wish to alter. This means that the original table is not locked, and clients may continue to read and change data in
it.
pt-online-schema-change works by creating an empty copy of the table to alter, modifying it as desired, and then
copying rows from the original table into the new table. When the copy is complete, it moves away the original table
and replaces it with the new one. By default, it also drops the original table.
The data copy process is performed in small chunks of data, which are varied to attempt to make them execute in
a specific amount of time (see --chunk-time). This process is very similar to how other tools, such as pt-table-
checksum, work. Any modifications to data in the original tables during the copy will be reflected in the new table,
because the tool creates triggers on the original table to update the corresponding rows in the new table. The use of
triggers means that the tool will not work if any triggers are already defined on the table.
When the tool finishes copying data into the new table, it uses an atomic RENAME TABLE operation to simultaneously
rename the original and new tables. After this is complete, the tool drops the original table.
Foreign keys complicate the tool’s operation and introduce additional risk. The technique of atomically renaming the
original and new tables does not work when foreign keys refer to the table. The tool must update foreign keys to refer
to the new table after the schema change is complete. The tool supports two methods for accomplishing this. You can
read more about this in the documentation for --alter-foreign-keys-method.
Foreign keys also cause some side effects. The final table will have the same foreign keys and indexes as the original
table (unless you specify differently in your ALTER statement), but the names of the objects may be changed slightly
to avoid object name collisions in MySQL and InnoDB.
For safety, the tool does not modify the table unless you specify the --execute option, which is not enabled
by default. The tool supports a variety of other measures to prevent unwanted load or other problems, including
automatically detecting replicas, connecting to them, and using the following safety checks:
    • The tool refuses to operate if it detects replication filters. See --[no]check-replication-filters for
      details.
    • The tool pauses the data copy operation if it observes any replicas that are delayed in replication. See
      --max-lag for details.
    • The tool pauses or aborts its operation if it detects too much load on the server. See --max-load and
      --critical-load for details.
    • The tool sets its lock wait timeout to 1 second so that it is more likely to be the victim of any lock contention,
      and less likely to disrupt other transactions. See --lock-wait-timeout for details.
    • The tool refuses to alter the table if foreign key constraints reference it, unless you specify
      --alter-foreign-keys-method.
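For example, a cautious workflow is to run the same command with --dry-run first, and repeat it with --execute
only when the output looks reasonable (reusing the sakila example from the synopsis):
pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor --dry-run
pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor --execute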


2.18.5 OUTPUT

The tool prints information about its activities to STDOUT so that you can see what it is doing. During the data copy
phase, it prints progress reports to STDERR. You can get additional information with the --print option.


2.18.6 OPTIONS

--dry-run and --execute are mutually exclusive.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--alter
    type: string
      The schema modification, without the ALTER TABLE keywords. You can perform multiple modifications to the
      table by specifying them with commas. Please refer to the MySQL manual for the syntax of ALTER TABLE.
      You cannot use the RENAME clause to ALTER TABLE, or the tool will fail.
--alter-foreign-keys-method
    type: string
      How to modify foreign keys so they reference the new table. Foreign keys that reference the table to be altered
      must be treated specially to ensure that they continue to reference the correct table. When the tool renames the
      original table to let the new one take its place, the foreign keys “follow” the renamed table, and must be changed
      to reference the new table instead.
      The tool supports two techniques to achieve this. It automatically finds “child tables” that reference the table to
      be altered.
      auto
             Automatically determine which method is best. The tool uses rebuild_constraints if possible
             (see the description of that method for details), and if not, then it uses drop_swap.
      rebuild_constraints
             This method uses ALTER TABLE to drop and re-add foreign key constraints that reference the new
             table. This is the preferred technique, unless one or more of the “child” tables is so large that the
             ALTER would take too long. The tool determines that by comparing the number of rows in the child
             table to the rate at which the tool is able to copy rows from the old table to the new table. If the tool
             estimates that the child table can be altered in less time than the --chunk-time, then it will use
             this technique. For purposes of estimating the time required to alter the child table, the tool multiplies
             the row-copying rate by --chunk-size-limit, because MySQL’s ALTER TABLE is typically
             much faster than the external process of copying rows.
             Due to a limitation in MySQL, foreign keys will not have the same names after the ALTER that they
             did prior to it. The tool has to rename the foreign key when it redefines it, which adds a leading
             underscore to the name. In some cases, MySQL also automatically renames indexes required for the
             foreign key.
      drop_swap
             Disable foreign key checks (FOREIGN_KEY_CHECKS=0), then drop the original table before re-
             naming the new table into its place. This is different from the normal method of swapping the old
             and new table, which uses an atomic RENAME that is undetectable to client applications.
             This method is faster and does not block, but it is riskier for two reasons. First, for a short time
             between dropping the original table and renaming the temporary table, the table to be altered simply
             does not exist, and queries against it will result in an error. Secondly, if there is an error and the new
             table cannot be renamed into the place of the old one, then it is too late to abort, because the old table
             is gone permanently.
      none
             This method is like drop_swap without the “swap”. Any foreign keys that referenced the original
             table will now reference a nonexistent table. This will typically cause foreign key violations that are
             visible in SHOW ENGINE INNODB STATUS, similar to the following:
             Trying to add to index ‘idx_fk_staff_id‘ tuple:
             DATA TUPLE: 2 fields;
             0: len 1; hex 05; asc ;;
             1: len 4; hex 80000001; asc     ;;
             But the parent table ‘sakila‘.‘staff_old‘
             or its .ibd file does not currently exist!

              This is because the original table (in this case, sakila.staff) was renamed to sakila.staff_old and then
              dropped. This method of handling foreign key constraints is provided so that the database administrator
              can disable the tool’s built-in functionality if desired.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
     Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
     option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
     on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--check-interval
    type: time; default: 1
     Sleep time between checks for --max-lag.
--[no]check-replication-filters
    default: yes
     Abort if any replication filter is set on any server. The tool looks for server options that filter replication, such
     as binlog_ignore_db and replicate_do_db. If it finds any such filters, it aborts with an error.
     If the replicas are configured with any filtering options, you should be careful not to modify any databases
     or tables that exist on the master and not the replicas, because it could cause replication to fail. For more
      information on replication rules, see http://dev.mysql.com/doc/en/replication-rules.html.
--check-slave-lag
    type: string
      Pause the data copy until this replica’s lag is less than --max-lag. The value is a DSN that inherits
      properties from the connection options (--port, --user, etc.). This option overrides the normal behavior
     of finding and continually monitoring replication lag on ALL connected replicas. If you don’t want to mon-
     itor ALL replicas, but you want more than just one replica to be monitored, then use the DSN option to the
     --recursion-method option instead of this option.
--chunk-index
    type: string
     Prefer this index for chunking tables. By default, the tool chooses the most appropriate index for chunking.
     This option lets you specify the index that you prefer. If the index doesn’t exist, then the tool will fall back to
     its default behavior of choosing an index. The tool adds the index to the SQL statements in a FORCE INDEX
     clause. Be careful when using this option; a poor choice of index could cause bad performance.
--chunk-size
    type: size; default: 1000
     Number of rows to select for each chunk copied. Allowable suffixes are k, M, G.
     This option can override the default behavior, which is to adjust chunk size dynamically to try to make chunks
     run in exactly --chunk-time seconds. When this option isn’t set explicitly, its default value is used as a
     starting point, but after that, the tool ignores this option’s value. If you set this option explicitly, however, then
     it disables the dynamic adjustment behavior and tries to make all chunks exactly the specified number of rows.
     There is a subtlety: if the chunk index is not unique, then it’s possible that chunks will be larger than desired.
     For example, if a table is chunked by an index that contains 10,000 of a given value, there is no way to write a
     WHERE clause that matches only 1,000 of the values, and that chunk will be at least 10,000 rows large. Such a
     chunk will probably be skipped because of --chunk-size-limit.
--chunk-size-limit
    type: float; default: 4.0
      Do not copy chunks this much larger than the desired chunk size.
      When a table has no unique indexes, chunk sizes can be inaccurate. This option specifies a maximum tolerable
      limit to the inaccuracy. The tool uses EXPLAIN to estimate how many rows are in the chunk. If that estimate
      exceeds the desired chunk size times the limit, then the tool skips the chunk.
      The minimum value for this option is 1, which means that no chunk can be larger than --chunk-size. You
      probably don’t want to specify 1, because rows reported by EXPLAIN are estimates, which can be different
      from the real number of rows in the chunk. You can disable oversized chunk checking by specifying a value of
      0.
      The tool also uses this option to determine how to handle foreign keys that reference the table to be altered. See
      --alter-foreign-keys-method for details.
--chunk-time
    type: float; default: 0.5
      Adjust the chunk size dynamically so each data-copy query takes this long to execute. The tool tracks the copy
      rate (rows per second) and adjusts the chunk size after each data-copy query, so that the next query takes this
      amount of time (in seconds) to execute. It keeps an exponentially decaying moving average of queries per
      second, so that if the server’s performance changes due to changes in server load, the tool adapts quickly.
      If this option is set to zero, the chunk size doesn’t auto-adjust, so query times will vary, but query chunk sizes
      will not. Another way to do the same thing is to specify a value for --chunk-size explicitly, instead of
      leaving it at the default.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--critical-load
    type: Array; default: Threads_running=50
      Examine SHOW GLOBAL STATUS after every chunk, and abort if the load is too high. The option accepts a
      comma-separated list of MySQL status variables and thresholds. An optional =MAX_VALUE (or :MAX_VALUE)
      can follow each variable. If not given, the tool determines a threshold by examining the current value at startup
      and doubling it.
      See --max-load for further details. These options work similarly, except that this option aborts the tool’s
      operation instead of pausing it, and the default value is computed differently if you specify no threshold. This
      option exists as a safety check in case the triggers on the original table add so much load to the server that it
      causes downtime. There is probably no single value of Threads_running that is right for every server, but a
      value of 50 is likely to be unacceptably high on most servers, indicating that the operation should be canceled
      immediately.
--defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
--[no]drop-old-table
    default: yes
      Drop the original table after renaming it. After the original table has been successfully renamed to let the new
      table take its place, and if there are no errors, the tool drops the original table by default. If there are any errors,
      the tool leaves the original table in place.
--dry-run
    Create and alter the new table, but do not create triggers, copy data, or replace the original table.
--execute
    Indicate that you have read the documentation and want to alter the table. You must specify this option to alter
      the table. If you do not, then the tool will only perform some safety checks and exit. This helps ensure that you
      have read the documentation and understand how to use this tool. If you have not read the documentation, then
      do not specify this option.
--help
    Show help and exit.
--host
    short form: -h; type: string
      Connect to host.
--lock-wait-timeout
    type: int; default: 1
      Set the session value of innodb_lock_wait_timeout. This option helps guard against long lock waits
      if the data-copy queries become slow for some reason. Setting this option dynamically requires the InnoDB
      plugin, so this works only on newer InnoDB and MySQL versions. If the setting’s current value is greater than
      the specified value, and the tool cannot set the value as desired, then it prints a warning. If the tool cannot set
      the value but the current value is less than or equal to the desired value, there is no error.
--max-lag
    type: time; default: 1s
      Pause the data copy until all replicas’ lag is less than this value. After each data-copy query (each chunk),
      the tool looks at the replication lag of all replicas to which it connects, using Seconds_Behind_Master. If any
      replica is lagging more than the value of this option, then the tool will sleep for --check-interval seconds,
      then check all replicas again. If you specify --check-slave-lag, then the tool only examines that server
      for lag, not all servers. If you want to control exactly which servers the tool monitors, use the DSN value to
      --recursion-method.
      The tool waits forever for replicas to stop lagging. If any replica is stopped, the tool waits forever until the
      replica is started. The data copy continues when all replicas are running and not lagging too much.
      The tool prints progress reports while waiting. If a replica is stopped, it prints a progress report immediately,
      then again at every progress report interval.
--max-load
    type: Array; default: Threads_running=25
      Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than their
      thresholds. The option accepts a comma-separated list of MySQL status variables. An optional =MAX_VALUE
      (or :MAX_VALUE) can follow each variable. If not given, the tool determines a threshold by examining the
      current value and increasing it by 20%.
      For example, if you want the tool to pause when Threads_connected gets too high, you can specify
      “Threads_connected”, and the tool will check the current value when it starts working and add 20% to that
      value. If the current value is 100, then the tool will pause when Threads_connected exceeds 120, and resume
      working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use either
      “Threads_connected:110” or “Threads_connected=110”.
      The purpose of this option is to prevent the tool from adding too much load to the server. If the data-copy queries
      are intrusive, or if they cause lock waits, then other queries on the server will tend to block and queue. This will
      typically cause Threads_running to increase, and the tool can detect that by running SHOW GLOBAL STATUS
      immediately after each query finishes. If you specify a threshold for this variable, then you can instruct the tool
      to wait until queries are running normally again. This will not prevent queueing, however; it will only give the
      server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time.
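      For example, to pause at an explicit Threads_connected threshold while also aborting if Threads_running climbs
      too high (the thresholds below are illustrative, not recommendations):
      pt-online-schema-change --alter "ENGINE=InnoDB" D=sakila,t=actor \
        --max-load Threads_running=25,Threads_connected=110 \
        --critical-load Threads_running=100 --execute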
--password
    short form: -p; type: string
      Password to use when connecting.
--pid
       type: string
       Create the given PID file. The file contains the process ID of the tool’s instance. The PID file is removed when
       the tool exits. The tool checks for the existence of the PID file when starting; if it exists and the process with the
       matching PID exists, the tool exits.
--port
    short form: -P; type: int
       Port number to use for connection.
--print
    Print SQL statements to STDOUT. Specifying this option allows you to see most of the statements that the tool
    executes. You can use this option with --dry-run, for example.
--progress
    type: array; default: time,30
       Print progress reports to STDERR while copying rows. The value is a comma-separated list with two parts. The
       first part can be percentage, time, or iterations; the second part specifies how often an update should be printed,
       in percentage, seconds, or number of iterations.
--quiet
    short form: -q
       Do not print messages to STDOUT. Errors and warnings are still printed to STDERR.
--recurse
    type: int
       Number of levels to recurse in the hierarchy when discovering replicas.             Default is infinite.    See also
       --recursion-method.
--recursion-method
    type: string
       Preferred recursion method for discovering replicas. Possible methods are:
       METHOD            USES
       ===========       ==================
       processlist       SHOW PROCESSLIST
       hosts             SHOW SLAVE HOSTS
       dsn=DSN           DSNs from a table

       The processlist method is the default, because SHOW SLAVE HOSTS is not reliable. However, the hosts
       method can work better if the server uses a non-standard port (not 3306). The tool usually does the right thing
       and finds all replicas, but you may give a preferred method and it will be used first.
       The hosts method requires replicas to be configured with report_host, report_port, etc.
       The dsn method is special: it specifies a table from which other DSN strings are read. The specified DSN must
       specify a D and t, or a database-qualified t. The DSN table should have the following structure:
       CREATE TABLE `dsns` (
         `id` int(11) NOT NULL AUTO_INCREMENT,
         `parent_id` int(11) DEFAULT NULL,
         `dsn` varchar(255) NOT NULL,
         PRIMARY KEY (`id`)
       );

      To make the tool monitor only the hosts 10.10.1.16 and 10.10.1.17 for replication lag, insert the values
      h=10.10.1.16 and h=10.10.1.17 into the table. Currently, the DSNs are ordered by id, but id and
      parent_id are otherwise ignored.
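      Putting it together, a hypothetical setup that keeps the DSN table in a database named percona might look like
      this (the schema and table names are only an example; the table uses the structure shown above):
      mysql percona -e "INSERT INTO dsns (dsn) VALUES ('h=10.10.1.16'), ('h=10.10.1.17')"
      pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor \
        --recursion-method dsn=D=percona,t=dsns --execute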
--retries
    type: int; default: 3
      Retry a chunk this many times when there is a nonfatal error. Nonfatal errors are problems such as a lock wait
      timeout or the query being killed. This option applies to the data copy operation.
--set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
--socket
    short form: -S; type: string
      Socket file to use for connection.
--[no]swap-tables
    default: yes
      Swap the original table and the new, altered table. This step completes the online schema change process by
      making the table with the new schema take the place of the original table. The original table becomes the “old
      table,” and the tool drops it unless you disable --[no]drop-old-table.
--user
    short form: -u; type: string
      User for login if not current user.
--version
    Show version and exit.


2.18.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Database for the old and new table.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • t
      dsn: table; copy: no
      Table to alter.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.18.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-online-schema-change ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.18.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.
This tool works only on MySQL 5.0.2 and newer versions, because earlier versions do not support triggers.


2.18.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-online-schema-change.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.18.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.18.12 AUTHORS

Daniel Nichter and Baron Schwartz


2.18.13 ACKNOWLEDGMENTS

The “online schema change” concept was first implemented by Shlomi Noach in his tool
oak-online-alter-table, part of http://code.google.com/p/openarkkit/. Engineers at Facebook then built
another version called OnlineSchemaChange.php as explained by their blog post: http://tinyurl.com/32zeb86.
This tool is a hybrid of both approaches, with additional features and functionality not present in either.


2.18.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.18.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.18.16 VERSION

pt-online-schema-change 2.1.1


2.19 pt-pmp

2.19.1 NAME

pt-pmp - Aggregate GDB stack traces for a selected program.


2.19.2 SYNOPSIS

Usage

pt-pmp [OPTIONS] [FILES]

pt-pmp is a poor man’s profiler, inspired by http://poormansprofiler.org. It can create and summarize full stack traces
of processes on Linux. Summaries of stack traces can be an invaluable tool for diagnosing what a process is waiting
for.


2.19.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-pmp is a read-only tool. However, collecting GDB stacktraces is achieved by attaching GDB to the program and
printing stack traces from all threads. This will freeze the program for some period of time, ranging from a second or
so to much longer on very busy systems with a lot of memory and many threads in the program. In the tool’s default
usage as a MySQL profiling tool, this means that MySQL will be unresponsive while the tool runs, although if you are
using the tool to diagnose an unresponsive server, there is really no reason not to do this. In addition to freezing the
server, there is also some risk of the server crashing or performing badly after GDB detaches from it.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-pmp.
See also “BUGS” for more information on filing bugs and getting help.


2.19.4 DESCRIPTION

pt-pmp performs two tasks: it gets a stack trace, and it summarizes the stack trace. If a file is given on the command
line, the tool skips the first step and just aggregates the file.
To summarize the stack trace, the tool extracts the function name (symbol) from each level of the stack, and combines
them with commas. It does this for each thread in the output. Afterwards, it sorts similar threads together and counts
how many of each one there are, then sorts them most-frequent first.




2.19.5 OPTIONS

Options must precede files on the command line.
       -b BINARY              Which binary to trace (default mysqld)
       -i ITERATIONS          How many traces to gather and aggregate (default 1)
       -k KEEPFILE            Keep the raw traces in this file after aggregation
       -l NUMBER              Aggregate only first NUMBER functions; 0=infinity (default 0)
       -p PID                 Process ID of the process to trace; overrides -b
       -s SLEEPTIME           Number of seconds to sleep between iterations (default 0)
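
For example, to gather ten traces of mysqld one second apart and keep the raw traces for later inspection (the output
file name is arbitrary):
pt-pmp -i 10 -s 1 -k /tmp/pmp-stacks.txt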


2.19.6 ENVIRONMENT

This tool does not use any environment variables.


2.19.7 SYSTEM REQUIREMENTS

This tool requires Bash v3 or newer.


2.19.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-pmp.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.19.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.



2.19.10 AUTHORS

Baron Schwartz, based on a script by Domas Mituzas (http://poormansprofiler.org/)


2.19.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.19.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.19.13 VERSION

pt-pmp 2.1.1


2.20 pt-query-advisor

2.20.1 NAME

pt-query-advisor - Analyze queries and advise on possible problems.


2.20.2 SYNOPSIS

Usage

pt-query-advisor [OPTION...] [FILE]

pt-query-advisor analyzes queries and advises on possible problems. Queries are given either by specifying slow
query log files, --query, or --review.
Analyze all queries in a slow log:
pt-query-advisor /path/to/slow-query.log

Analyze all queries in a general log:
pt-query-advisor --type genlog mysql.log

Get queries from tcpdump using pt-query-digest:
pt-query-digest --type tcpdump tcpdump.txt --print --no-report | pt-query-advisor



2.20.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-query-advisor simply reads queries and examines them, and is thus very low risk.
At the time of this release there is a bug that may cause an infinite (or very long) loop when parsing very large queries.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL:
http://www.percona.com/bugs/pt-query-advisor.
See also “BUGS” for more information on filing bugs and getting help.


2.20.4 DESCRIPTION

pt-query-advisor examines queries and applies rules to them, trying to find queries that look bad according to the
rules. It reports on queries that match the rules, so you can find bad practices or hidden problems in your SQL. By
default, it accepts a MySQL slow query log as input.


2.20.5 RULES

These are the rules that pt-query-advisor will apply to the queries it examines. Each rule has three bits of information:
an ID, a severity and a description.
The rule’s ID is its identifier. We use a seven-character ID, and the naming convention is three characters, a period,
and a three-digit number. The first three characters are sort of an abbreviation of the general class of the rule. For
example, ALI.001 is some rule related to how the query uses aliases.
The rule’s severity is an indication of how important it is that this rule matched a query. We use NOTE, WARN, and
CRIT to denote these levels.
The rule’s description is a textual, human-readable explanation of what it means when a query matches this rule.
Depending on the verbosity of the report you generate, you will see more of the text in the description. By default,
you’ll see only the first sentence, which is sort of a terse synopsis of the rule’s meaning. At a higher verbosity, you’ll
see subsequent sentences.
ALI.001
      severity: note
      Aliasing without the AS keyword. Explicitly using the AS keyword in column or table aliases, such as
      “tbl AS alias,” is more readable than implicit aliases such as “tbl alias”.
ALI.002
      severity: warn
      Aliasing the ‘*’ wildcard. Aliasing a column wildcard, such as “SELECT tbl.* col1, col2” probably
      indicates a bug in your SQL. You probably meant for the query to retrieve col1, but instead it renames the
      last column in the *-wildcarded list.
ALI.003
      severity: note
      Aliasing without renaming. The table or column’s alias is the same as its real name, and the alias just
      makes the query harder to read.
ARG.001
      severity: warn
      Argument with leading wildcard. An argument has a leading wildcard character, such as “%foo”. The
      predicate with this argument is not sargable and cannot use an index if one exists.
ARG.002
      severity: note
      LIKE without a wildcard. A LIKE pattern that does not include a wildcard is potentially a bug in the SQL.
CLA.001
      severity: warn
      SELECT without WHERE. The SELECT statement has no WHERE clause.
CLA.002
      severity: note
      ORDER BY RAND(). ORDER BY RAND() is a very inefficient way to retrieve a random row from the
      results.
CLA.003
      severity: note
      LIMIT with OFFSET. Paginating a result set with LIMIT and OFFSET is O(n^2) complexity, and will
      cause performance problems as the data grows larger.
CLA.004
      severity: note
      Ordinal in the GROUP BY clause. Using a number in the GROUP BY clause, instead of an expression or
      column name, can cause problems if the query is changed.
CLA.005
      severity: warn
      ORDER BY constant column.
CLA.006
      severity: warn
      GROUP BY or ORDER BY different tables will force a temp table and filesort.
CLA.007
      severity: warn
      ORDER BY different directions prevents index from being used. All tables in the ORDER BY clause
      must be either ASC or DESC, else MySQL cannot use an index.


COL.001
     severity: note
     SELECT *. Selecting all columns with the * wildcard will cause the query’s meaning and behavior to
     change if the table’s schema changes, and might cause the query to retrieve too much data.
COL.002
     severity: note
     Blind INSERT. The INSERT or REPLACE query doesn’t specify the columns explicitly, so the query’s
     behavior will change if the table’s schema changes; use “INSERT INTO tbl(col1, col2) VALUES...”
     instead.
LIT.001
     severity: warn
     Storing an IP address as characters. The string literal looks like an IP address, but is not an argument to
     INET_ATON(), indicating that the data is stored as characters instead of as integers. It is more efficient
     to store IP addresses as integers.
LIT.002
     severity: warn
     Unquoted date/time literal. A query such as “WHERE col<2010-02-12” is valid SQL but is probably a
     bug; the literal should be quoted.
KWR.001
     severity: note
     SQL_CALC_FOUND_ROWS is inefficient. SQL_CALC_FOUND_ROWS can cause performance prob-
     lems because it does not scale well; use alternative strategies to build functionality such as paginated result
     screens.
JOI.001
     severity: crit
     Mixing comma and ANSI joins. Mixing comma joins and ANSI joins is confusing to humans, and the
     behavior differs between some MySQL versions.
JOI.002
     severity: crit
     A table is joined twice. The same table appears at least twice in the FROM clause.
JOI.003
     severity: warn
     Reference to outer table column in WHERE clause prevents OUTER JOIN, implicitly converts to INNER
     JOIN.
JOI.004
     severity: warn
     Exclusion join uses wrong column in WHERE. The exclusion join (LEFT OUTER JOIN with a WHERE
     clause that is satisfied only if there is no row in the right-hand table) seems to use the wrong column in the
     WHERE clause. A query such as ”... FROM l LEFT OUTER JOIN r ON l.l=r.r WHERE r.z IS NULL”
     probably ought to list r.r in the WHERE IS NULL clause.



RES.001
      severity: warn
      Non-deterministic GROUP BY. The SQL retrieves columns that are neither in an aggregate function nor
      the GROUP BY expression, so these values will be non-deterministic in the result.
RES.002
      severity: warn
      LIMIT without ORDER BY. LIMIT without ORDER BY causes non-deterministic results, depending on
      the query execution plan.
STA.001
      severity: note
      != is non-standard. Use the <> operator to test for inequality.
SUB.001
      severity: crit
      IN() and NOT IN() subqueries are poorly optimized. MySQL executes the subquery as a dependent
      subquery for each row in the outer query. This is a frequent cause of serious performance problems. This
      might change in version 6.0 of MySQL, but for versions 5.1 and older, the query should be rewritten as a
      JOIN or a LEFT OUTER JOIN, respectively.


2.20.6 OPTIONS

--query and --review are mutually exclusive.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--[no]continue-on-error
    default: yes
      Continue working even if there is an error.
--daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
--database
    short form: -D; type: string
      Connect to this database. This is also used as the default database for --[no]show-create-table if a
      query does not use database-qualified tables.



--defaults-file
    short form: -F; type: string
       Only read mysql options from the given file. You must give an absolute pathname.
--group-by
    type: string; default: rule_id
       Group items in the report by this attribute. Possible attributes are:
       ATTRIBUTE      GROUPS
       =========      ==========================================================
       rule_id        Items matching the same rule ID
       query_id       Queries with the same ID (the same fingerprint)
       none           No grouping, report each query and its advice individually

--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--ignore-rules
    type: hash
       Ignore these rule IDs.
       Specify a comma-separated list of rule IDs (e.g. LIT.001,RES.002,etc.) to ignore. Currently, the rule IDs are
       case-sensitive and must be uppercase.
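      For example, a command like the following (the log path is illustrative) skips the alias and LIMIT/OFFSET rules
      while analyzing a slow log:
      pt-query-advisor --ignore-rules ALI.001,CLA.003 /path/to/slow-query.log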
--password
    short form: -p; type: string
       Password to use when connecting.
--pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
--port
    short form: -P; type: int
       Port number to use for connection.
--print-all
    Print all queries, even those that do not match any rules. With --group-by none, non-matching queries are
    printed in the main report and profile. For other --group-by values, non-matching queries are only printed
    in the profile. Non-matching queries have zeros for NOTE, WARN and CRIT in the profile.
--query
    type: string
       Analyze this single query and ignore files and STDIN. This option allows you to supply a single query on the
       command line. Any files also specified on the command line are ignored.
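      For example, a quick one-off check of a single statement might look like the following (the query text is only
      illustrative); judging by the rules above, it should match at least COL.001 and ARG.001:
      pt-query-advisor --query "SELECT * FROM users WHERE name LIKE '%smith%'"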
--report-format
    type: string; default: compact
      Type of report format: full or compact. In full mode, every query’s report contains the description of the rules
      it matched, even if this information was previously displayed. In compact mode, the repeated information is
      suppressed, and only the rule ID is displayed.
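      For example, to print the full description of every rule each time it matches (the log path is illustrative):
      pt-query-advisor --report-format full /path/to/slow-query.log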
--review
    type: DSN
      Analyze queries from this pt-query-digest query review table.
--sample
    type: int; default: 1
      How many samples of the query to show.
--set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
--[no]show-create-table
    default: yes
      Get SHOW CREATE TABLE for each query’s table.
      If host connection options are given (like --host, --port, etc.) then the tool will also get SHOW CREATE
      TABLE for each query. This information is needed for some rules like JOI.004. If this option is disabled by
      specifying --no-show-create-table then some rules may not be checked.
--socket
    short form: -S; type: string
      Socket file to use for connection.
--type
    type: Array
      The type of input to parse (default slowlog). The permitted types are slowlog and genlog.
--user
    short form: -u; type: string
      User for login if not current user.
--verbose
    short form: -v; cumulative: yes; default: 1
      Increase verbosity of output. At the default level of verbosity, the program prints only the first sentence of each
      rule’s description. At higher levels, the program prints more of the description. See also --report-format.
--version
    Show version and exit.
--where
    type: string
      Apply this WHERE clause to the SELECT query on the --review table.
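      As a sketch, the following command (the connection values, table name, and first_seen column are placeholders
      borrowed from the pt-query-digest review-table examples in this manual) analyzes only recently seen queries from
      a review table:
      pt-query-advisor --review h=host1,D=test,t=query_review --where "first_seen > '2012-01-01'"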


2.20.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Database that contains the query review table.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • t
      Table to use as the query review table.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.20.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-query-advisor ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.




2.20.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.20.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-query-advisor.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.20.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.20.12 AUTHORS

Baron Schwartz and Daniel Nichter


2.20.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.




2.20.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.20.15 VERSION

pt-query-advisor 2.1.1


2.21 pt-query-digest

2.21.1 NAME

pt-query-digest - Analyze query execution logs and generate a query report, filter, replay, or transform queries for
MySQL, PostgreSQL, memcached, and more.


2.21.2 SYNOPSIS

Usage

pt-query-digest [OPTION...] [FILE]

pt-query-digest parses and analyzes MySQL log files. With no FILE, or when FILE is -, it reads standard input.
Analyze, aggregate, and report on a slow query log:
pt-query-digest /path/to/slow.log

Review a slow log, saving results to the test.query_review table in a MySQL server running on host1. See --review
for more on reviewing queries:
pt-query-digest --review h=host1,D=test,t=query_review /path/to/slow.log

Filter out everything but SELECT queries, replay the queries against another server, then use the timings from replay-
ing them to analyze their performance:
pt-query-digest /path/to/slow.log --execute h=another_server \
  --filter '$event->{fingerprint} =~ m/^select/'

Print the structure of events so you can construct a complex --filter:
pt-query-digest /path/to/slow.log --no-report \
  --filter 'print Dumper($event)'




Watch SHOW FULL PROCESSLIST and output a log in slow query log format:
pt-query-digest --processlist h=host1 --print --no-report

The default aggregation and analysis is CPU and memory intensive. Disable it if you don’t need the default report:
pt-query-digest <arguments> --no-report



2.21.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
By default pt-query-digest merely collects and aggregates data from the files specified. It is designed to be as efficient
as possible, but depending on the input you give it, it can use a lot of CPU and memory. Practically speaking, it is safe
to run even on production systems, but you might want to monitor it until you are satisfied that the input you give it
does not cause undue load.
Various options will cause pt-query-digest to insert data into tables, execute SQL queries, and so on. These include
the --execute option and --review.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
query-digest.
See also “BUGS” for more information on filing bugs and getting help.


2.21.4 DESCRIPTION

pt-query-digest is a framework for doing things with events from a query source such as the slow query log or
PROCESSLIST. By default it acts as a very sophisticated log analysis tool. You can group and sort queries in many
different ways simultaneously and find the most expensive queries, or create a timeline of queries in the log, for
example. It can also do a “query review,” which means to save a sample of each type of query into a MySQL table so
you can easily see whether you’ve reviewed and analyzed a query before. The benefit of this is that you can keep track
of changes to your server’s queries and avoid repeated work. You can also save other information with the queries,
such as comments, issue numbers in your ticketing system, and so on.
Note that this tool is a work in progress under very active development, and you should expect incompatible changes in the future.


2.21.5 ATTRIBUTES

pt-query-digest works on events, which are a collection of key/value pairs called attributes. You’ll recognize most of
the attributes right away: Query_time, Lock_time, and so on. You can just look at a slow log and see them. However,
there are some that don’t exist in the slow log, and slow logs may actually include different kinds of attributes (for
example, you may have a server with the Percona patches).
For a full list of attributes, see http://code.google.com/p/maatkit/wiki/EventAttributes.
With creative use of --filter, you can create new attributes derived from existing attributes. For example, to create
an attribute called Row_ratio for examining the ratio of Rows_sent to Rows_examined, specify a filter like:
--filter '($event->{Row_ratio} = $event->{Rows_sent} / ($event->{Rows_examined})) && 1'




The && 1 trick is needed to create a valid one-line syntax that is always true, even if the assignment happens to
evaluate to false. The new attribute will automatically appear in the output:
# Row ratio                   1.00       0.00            1     0.50            1      0.71       0.50

Attributes created this way can be specified for --order-by or any option that requires an attribute.
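For example, reusing the Row_ratio filter above (the log path is illustrative), you could sort the report by the worst
ratio seen in each query class:
pt-query-digest /path/to/slow.log \
  --filter '($event->{Row_ratio} = $event->{Rows_sent} / ($event->{Rows_examined})) && 1' \
  --order-by Row_ratio:max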


2.21.6 memcached

memcached events have additional attributes related to the memcached protocol: cmd, key, res (result) and val. Also,
boolean attributes are created for the various commands, misses and errors: Memc_CMD where CMD is a memcached
command (get, set, delete, etc.), Memc_error and Memc_miss.
These attributes are no different from slow log attributes, so you can use them with --[no]report, --group-by,
in a --filter, etc.
These attributes and more are documented at http://code.google.com/p/maatkit/wiki/EventAttributes.


2.21.7 OUTPUT

The default output is a query analysis report. The --[no]report option controls whether or not this report is
printed. Sometimes you may wish to parse all the queries but suppress the report, for example when using --print
or --review.
There is one paragraph for each class of query analyzed. A “class” of queries all have the same value for the
--group-by attribute which is “fingerprint” by default. (See “ATTRIBUTES”.) A fingerprint is an abstracted ver-
sion of the query text with literals removed, whitespace collapsed, and so forth. The report is formatted so it’s easy to
paste into emails without wrapping, and all non-query lines begin with a comment, so you can save it to a .sql file and
open it in your favorite syntax-highlighting text editor. There is a response-time profile at the beginning.
The output described here is controlled by --report-format. That option allows you to specify what to print and
in what order. The default output in the default order is described here.
The report, by default, begins with a paragraph about the entire analysis run. The information is very similar to what
you’ll see for each class of queries in the log, but it doesn’t have some information that would be too expensive to keep
globally for the analysis. It also has some statistics about the code’s execution itself, such as the CPU and memory
usage, the local date and time of the run, and a list of input files read/parsed.
Following this is the response-time profile over the events. This is a highly summarized view of the unique events in
the detailed query report that follows. It contains the following columns:
Column             Meaning
============       ==========================================================
Rank               The query’s rank within the entire set of queries analyzed
Query ID           The query’s fingerprint
Response time      The total response time, and percentage of overall total
Calls              The number of times this query was executed
R/Call             The mean response time per execution
Apdx               The Apdex score; see --apdex-threshold for details
V/M                The Variance-to-mean ratio of response time
EXPLAIN            If --explain was specified, a sparkline; see --explain
Item               The distilled query

A final line whose rank is shown as MISC contains aggregate statistics on the queries that were not included in the
report, due to options such as --limit and --outliers. For details on the variance-to-mean ratio, please see
http://en.wikipedia.org/wiki/Index_of_dispersion.




Next, the detailed query report is printed. Each query appears in a paragraph. Here is a sample, slightly reformatted
so ‘perldoc’ will not wrap lines in a terminal. The following will all be one paragraph, but we’ll break it up for
commentary.
# Query 2: 0.01 QPS, 0.02x conc, ID 0xFDEA8D2993C9CAF3 at byte 160665

This line identifies the sequential number of the query in the sort order specified by --order-by. Then there’s the
queries per second, and the approximate concurrency for this query (calculated as a function of the timespan and total
Query_time). Next there’s a query ID. This ID is a hex version of the query’s checksum in the database, if you’re
using --review. You can select the reviewed query’s details from the database with a query like SELECT ....
WHERE checksum=0xFDEA8D2993C9CAF3.
If you are investigating the report and want to print out every sample of a particular query, then the following
--filter may be helpful:
pt-query-digest slow-log.log --no-report --print --filter '$event->{fingerprint} && make_checksum($event->{fingerprint}) eq "FDEA8D2993C9CAF3"'

Notice that you must remove the 0x prefix from the checksum in order for this to work.
Finally, in case you want to find a sample of the query in the log file, there’s the byte offset where you can look. (This
is not always accurate, due to some silly anomalies in the slow-log format, but it’s usually right.) The position refers
to the worst sample, which we’ll see more about below.
Next is the table of metrics about this class of queries.
#                pct     total        min       max         avg     95%    stddev     median
#   Count          0         2
#   Exec time     13     1105s      552s      554s       553s      554s         2s       553s
#   Lock time      0     216us      99us     117us      108us     117us       12us      108us
#   Rows sent     20     6.26M     3.13M     3.13M      3.13M     3.13M      12.73      3.13M
#   Rows exam      0     6.26M     3.13M     3.13M      3.13M     3.13M      12.73      3.13M

The first line is column headers for the table. The percentage is the percent of the total for the whole analysis run,
and the total is the actual value of the specified metric. For example, in this case we can see that the query executed 2
times, which is 13% of the total number of queries in the file. The min, max and avg columns are self-explanatory. The
95% column shows the 95th percentile; 95% of the values are less than or equal to this value. The standard deviation
shows you how tightly grouped the values are. The standard deviation and median are both calculated from the 95th
percentile, discarding the extremely large values.
The stddev, median and 95th percentile statistics are approximate. Exact statistics require keeping every value seen,
sorting, and doing some calculations on them. This uses a lot of memory. To avoid this, we keep 1000 buckets,
each of them 5% bigger than the one before, ranging from .000001 up to a very big number. When we see a value
we increment the bucket into which it falls. Thus we have fixed memory per class of queries. The drawback is the
imprecision, which typically falls in the 5 percent range.
Next we have statistics on the users, databases and time range for the query.
# Users       1   user1
# Databases   2     db1(1), db2(1)
# Time range 2008-11-26 04:55:18 to 2008-11-27 00:15:15

The users and databases are shown as a count of distinct values, followed by the values. If there’s only one, it’s shown
alone; if there are many, we show each of the most frequent ones, followed by the number of times it appears.
# Query_time distribution
#   1us
# 10us
# 100us
#   1ms
# 10ms
# 100ms
#     1s
#   10s+    #############################################################

The execution times show a logarithmic chart of time clustering. Each query goes into one of the “buckets” and is
counted up. The buckets are powers of ten. The first bucket is all values in the “single microsecond range” – that is,
less than 10us. The second is “tens of microseconds,” which is from 10us up to (but not including) 100us; and so on.
The charted attribute can be changed by specifying --report-histogram but is limited to time-based attributes.
# Tables
#    SHOW TABLE STATUS LIKE 'table1'\G
#    SHOW CREATE TABLE `table1`\G
# EXPLAIN
SELECT * FROM table1\G

This section is a convenience: if you’re trying to optimize the queries you see in the slow log, you probably want to
examine the table structure and size. These are copy-and-paste-ready commands to do that.
Finally, we see a sample of the queries in this class of query. This is not a random sample. It is the query that performed
the worst, according to the sort order given by --order-by. You will normally see a commented # EXPLAIN line
just before it, so you can copy-paste the query to examine its EXPLAIN plan. But for non-SELECT queries that isn’t
possible to do, so the tool tries to transform the query into a roughly equivalent SELECT query, and adds that below.
If you want to find this sample event in the log, use the offset mentioned above, and something like the following:
tail -c +<offset> /path/to/file | head

See also --report-format.


2.21.8 SPARKLINES

The output also contains sparklines. Sparklines are “data-intense, design-simple, word-sized graphics”
(http://en.wikipedia.org/wiki/Sparkline). There is a sparkline for --report-histogram and for --explain. See
each of those options for details about interpreting their sparklines.


2.21.9 QUERY REVIEWS

A “query review” is the process of storing all the query fingerprints analyzed. This has several benefits:
    • You can add meta-data to classes of queries, such as marking them for follow-up, adding notes to queries, or
      marking them with an issue ID for your issue tracking system.
    • You can refer to the stored values on subsequent runs so you’ll know whether you’ve seen a query before. This
      can help you cut down on duplicated work.
    • You can store historical data such as the row count, query times, and generally anything you can see in the
      report.
To use this feature, you run pt-query-digest with the --review option. It will store the fingerprints and other
information into the table you specify. Next time you run it with the same option, it will do the following:
    • It won’t show you queries you’ve already reviewed. A query is considered to be already reviewed if you’ve
      set a value for the reviewed_by column. (If you want to see queries you’ve already reviewed, use the
      --report-all option.)
    • Queries that you’ve reviewed, and don’t appear in the output, will cause gaps in the query number sequence in
      the first line of each paragraph. And the value you’ve specified for --limit will still be honored. So if you’ve
      reviewed all queries in the top 10 and you ask for the top 10, you won’t see anything in the output.



    • If you want to see the queries you’ve already reviewed, you can specify --report-all. Then you’ll see the
      normal analysis output, but you’ll also see the information from the review table, just below the execution time
      graph. For example,
      # Review information
      #      comments: really bad IN() subquery, fix soon!
      #    first_seen: 2008-12-01 11:48:57
      #   jira_ticket: 1933
      #     last_seen: 2008-12-18 11:49:07
      #      priority: high
      #   reviewed_by: xaprb
      #   reviewed_on: 2008-12-18 15:03:11

      You can see how useful this meta-data is – as you analyze your queries, you get your comments integrated right
      into the report.
      If you add the --review-history option, it will also store information into a separate database table, so
      you can keep historical trending information on classes of queries.
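Putting this together, a minimal review workflow might look like the following (host, database, and table names are
placeholders); the first run creates and populates the review table, and repeating the same command later hides
anything you have already marked as reviewed:
pt-query-digest --review h=host1,D=test,t=query_review --create-review-table /path/to/slow.log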


2.21.10 FINGERPRINTS

A query fingerprint is the abstracted form of a query, which makes it possible to group similar queries together.
Abstracting a query removes literal values, normalizes whitespace, and so on. For example, consider these two queries:
SELECT name, password FROM user WHERE id='12823';
select name,   password from user
   where id=5;

Both of those queries will fingerprint to
select name, password from user where id=?

Once the query’s fingerprint is known, we can then talk about a query as though it represents all similar queries.
What pt-query-digest does is analogous to a GROUP BY statement in SQL. (But note that “multiple columns” doesn’t
define a multi-column grouping; it defines multiple reports!) If your command-line looks like this,
pt-query-digest /path/to/slow.log --select Rows_read,Rows_sent \
    --group-by fingerprint --order-by Query_time:sum --limit 10

The corresponding pseudo-SQL looks like this:
SELECT WORST(query BY Query_time), SUM(Query_time), ...
FROM /path/to/slow.log
GROUP BY FINGERPRINT(query)
ORDER BY SUM(Query_time) DESC
LIMIT 10

You can also use the value distill, which is a kind of super-fingerprint. See --group-by for more.
When parsing memcached input (--type memcached), the fingerprint is an abstracted version of the com-
mand and key, with placeholders removed. For example, get user_123_preferences fingerprints to get
user_?_preferences. There is also a key_print which is a fingerprinted version of the key. This example’s
key_print is user_?_preferences.
Query fingerprinting accommodates a great many special cases, which have proven necessary in the real world. For
example, an IN list with 5 literals is really equivalent to one with 4 literals, so lists of literals are collapsed to a single
one. If you want to understand more about how and why all of these cases are handled, please review the test cases in
the Subversion repository. If you find something that is not fingerprinted properly, please submit a bug report with a
reproducible test case. Here is a list of transformations during fingerprinting, which might not be exhaustive:


   • Group all SELECT queries from mysqldump together, even if they are against different tables. Ditto for all of
     pt-table-checksum’s checksum queries.
   • Shorten multi-value INSERT statements to a single VALUES() list.
   • Strip comments.
   • Abstract the databases in USE statements, so all USE statements are grouped together.
   • Replace all literals, such as quoted strings. For efficiency, the code that replaces literal numbers is somewhat
     non-selective, and might replace some things as numbers when they really are not. Hexadecimal literals are
     also replaced. NULL is treated as a literal. Numbers embedded in identifiers are also replaced, so tables named
     similarly will be fingerprinted to the same values (e.g. users_2009 and users_2010 will fingerprint identically).
   • Collapse all whitespace into a single space.
   • Lowercase the entire query.
   • Replace all literals inside of IN() and VALUES() lists with a single placeholder, regardless of cardinality.
   • Collapse multiple identical UNION queries into a single one.


2.21.11 OPTIONS

DSN values in --review-history default to values in --review if COPY is yes.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--apdex-threshold
    type: float; default: 1.0
     Set Apdex target threshold (T) for query response time. The Application Performance Index (Apdex) Technical
     Specification V1.1 defines T as “a positive decimal value in seconds, having no more than two significant digits
     of granularity.” This value only applies to query response time (Query_time).
     Options can be abbreviated so specifying --apdex-t also works.
     See http://www.apdex.org/.
--ask-pass
    Prompt for a password when connecting to MySQL.
--attribute-aliases
    type: array; default: db|Schema
     List of attribute|alias,etc.
     Certain attributes have multiple names, like db and Schema. If an event does not have the primary attribute,
     pt-query-digest looks for an alias attribute. If it finds an alias, it creates the primary attribute with the alias
     attribute’s value and removes the alias attribute.
     If the event has the primary attribute, all alias attributes are deleted.
     This helps simplify event attributes so that, for example, there will not be report lines for both db and Schema.
--attribute-value-limit
    type: int; default: 4294967296
     A sanity limit for attribute values.
     This option deals with bugs in slow-logging functionality that causes large values for attributes. If the attribute’s
     value is bigger than this, the last-seen value for that class of query is used instead.




--aux-dsn
    type: DSN
      Auxiliary DSN used for special options.
      The following options may require a DSN even when only parsing a slow log file:
      * --since
      * --until

      See each option for why it might require a DSN.
--charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--check-attributes-limit
    type: int; default: 1000
      Stop checking for new attributes after this many events.
      For better speed, pt-query-digest stops checking events for new attributes after a certain number of events. Any
      new attributes after this number will be ignored and will not be reported.
      One special case is new attributes for pre-existing query classes (see --group-by about query classes).
      New attributes will not be added to pre-existing query classes even if the attributes are detected before the
      --check-attributes-limit limit.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--[no]continue-on-error
    default: yes
      Continue parsing even if there is an error.
--create-review-history-table
    Create the --review-history table if it does not exist.
      This option causes the table specified by --review-history to be created with the default structure shown
      in the documentation for that option.
--create-review-table
    Create the --review table if it does not exist.
      This option causes the table specified by --review to be created with the default structure shown in the
      documentation for that option.
--daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
--defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
--embedded-attributes
    type: array
      Two Perl regex patterns to capture pseudo-attributes embedded in queries.


      Embedded attributes might be special attribute-value pairs that you’ve hidden in comments. The first regex
      should match the entire set of attributes (in case there are multiple). The second regex should match and capture
      attribute-value pairs from the first regex.
      For example, suppose your query looks like the following:
      SELECT * from users -- file: /login.php, line: 493;

      You might run pt-query-digest with the following option:
      pt-query-digest --embedded-attributes ' -- .*','(\w+): ([^,]+)'

      The first regular expression captures the whole comment:
      " -- file: /login.php, line: 493;"

      The second one splits it into attribute-value pairs and adds them to the event:
      ATTRIBUTE      VALUE
      =========      ==========
      file           /login.php
      line           493

      NOTE: All commas in the regex patterns must be escaped with \ otherwise the pattern will break.
--execute
    type: DSN
      Execute queries on this DSN.
      Adds a callback into the chain, after filters but before the reports. Events are executed on this DSN. If they are
      successful, the time they take to execute overwrites the event’s Query_time attribute and the original Query_time
      value (from the log) is saved as the Exec_orig_time attribute. If unsuccessful, the callback returns false and
      terminates the chain.
      If the connection fails, pt-query-digest tries to reconnect once per second.
      See also --mirror and --execute-throttle.
--execute-throttle
    type: array
      Throttle values for --execute.
      By default --execute runs without any limitations or concerns for the amount of time that it takes to execute
      the events. The --execute-throttle allows you to limit the amount of time spent doing --execute
      relative to the other processes that handle events. This works by marking some events with a Skip_exec
      attribute when --execute begins to take too much time. --execute will not execute an event if this
      attribute is true. This indirectly decreases the time spent doing --execute.
      The --execute-throttle option takes at least two comma-separated values: max allowed --execute
      time as a percentage and a check interval time. An optional third value is a percentage step for increasing and
      decreasing the probability that an event will be marked Skip_exec true. 5 (percent) is the default step.
      For example: --execute-throttle 70,60,10. This will limit --execute to 70% of total event
      processing time, checked every minute (60 seconds) and probability stepped up and down by 10%. When
      --execute exceeds 70%, the probability that events will be marked Skip_exec true increases by 10%.
      --execute time is checked again after another minute. If it’s still above 70%, then the probability will in-
      crease another 10%. Or, if it’s dropped below 70%, then the probability will decrease by 10%.
--expected-range
    type: array; default: 5,10
      Explain items when there are more or fewer than expected.
      Defines the number of items expected to be seen in the report given by --[no]report, as controlled by
      --limit and --outliers. If there are more or fewer items in the report, each one will explain why it was
      included.
--explain
    type: DSN
      Run EXPLAIN for the sample query with this DSN and print results.
      This works only when --group-by includes fingerprint. It causes pt-query-digest to run EXPLAIN and
      include the output into the report. For safety, queries that appear to have a subquery that EXPLAIN will execute
      won’t be EXPLAINed. Those are typically “derived table” queries of the form
      select ... from ( select .... ) der;

      The EXPLAIN results are printed in three places: a sparkline in the event header, a full vertical format in the
      event report, and a sparkline in the profile.
      The full format appears at the end of each event report in vertical style (G) just like MySQL prints it.
      The sparklines (see “SPARKLINES”) are compact representations of the access type for each table and whether
      or not “Using temporary” or “Using filesort” appear in EXPLAIN. The sparklines look like:
      nr>TF

      That sparkline means that there are two tables, the first uses a range (n) access, the second uses a ref access,
      and both “Using temporary” (T) and “Using filesort” (F) appear. The greater-than character just separates table
      access codes from T and/or F.
      The abbreviated table access codes are:
      a   ALL
      c   const
      e   eq_ref
      f   fulltext
      i   index
      m   index_merge
      n   range
      o   ref_or_null
      r   ref
      s   system
      u   unique_subquery

      A capitalized access code means that “Using index” appears in EXPLAIN for that table.
--filter
    type: string
      Discard events for which this Perl code doesn’t return true.
      This option is a string of Perl code or a file containing Perl code that gets compiled into a subroutine with one
      argument: $event. This is a hashref. If the given value is a readable file, then pt-query-digest reads the entire
      file and uses its contents as the code. The file should not contain a shebang (#!/usr/bin/perl) line.
      If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the
      subroutine other than return $event. The subroutine template is:
      sub { $event = shift; filter && return $event; }

      Filters given on the command line are wrapped inside parentheses like ( filter ). For complex, multi-
      line filters, you must put the code inside a file so it will not be wrapped inside parentheses. Either way, the filter
      must produce syntactically valid code given the template. For example, an if-else branch given on the command
      line would not be valid:
      --filter 'if () { } else { }'              # WRONG

      Since it’s given on the command line, the if-else branch would be wrapped inside parentheses which is not
      syntactically valid. So to accomplish something more complex like this would require putting the code in a file,
      for example filter.txt:
      my $event_ok; if (...) { $event_ok=1; } else { $event_ok=0; } $event_ok

      Then specify --filter filter.txt to read the code from filter.txt.
      If the filter code won’t compile, pt-query-digest will die with an error. If the filter code does compile, an
      error may still occur at runtime if the code tries to do something wrong (like pattern match an undefined value).
      pt-query-digest does not provide any safeguards so code carefully!
      An example filter that discards everything but SELECT statements:
      --filter '$event->{arg} =~ m/^select/i'

      This is compiled into a subroutine like the following:
      sub { $event = shift; ( $event->{arg} =~ m/^select/i ) && return $event; }

      It is permissible for the code to have side effects (to alter $event).
      You can find an explanation of the structure of $event at http://code.google.com/p/maatkit/wiki/EventAttributes.
      Here are more examples of filter code:
      Host/IP matches domain.com
           --filter '($event->{host} || $event->{ip} || "") =~ m/domain.com/'
           Sometimes MySQL logs the host where the IP is expected. Therefore, we check both.
      User matches john
           --filter '($event->{user} || "") =~ m/john/'
      More than 1 warning
           --filter '($event->{Warning_count} || 0) > 1'
      Query does full table scan or full join
           --filter '(($event->{Full_scan} || "") eq "Yes") || (($event->{Full_join} || "") eq "Yes")'
      Query was not served from query cache
           --filter '($event->{QC_Hit} || "") eq "No"'
      Query is 1 MB or larger
           --filter '$event->{bytes} >= 1_048_576'
      Since --filter allows you to alter $event, you can use it to do other things, like create new attributes. See
      “ATTRIBUTES” for an example.
--fingerprints
    Add query fingerprints to the standard query analysis report. This is mostly useful for debugging purposes.
--[no]for-explain
    default: yes
      Print extra information to make analysis easy.


      This option adds code snippets to make it easy to run SHOW CREATE TABLE and SHOW TABLE STATUS for
      the query’s tables. It also rewrites non-SELECT queries into a SELECT that might be helpful for determining
      the non-SELECT statement’s index usage.
--group-by
    type: Array; default: fingerprint
      Which attribute of the events to group by.
      In general, you can group queries into classes based on any attribute of the query, such as user or db, which
      will by default show you which users and which databases get the most Query_time. The default attribute,
      fingerprint, groups similar, abstracted queries into classes; see below and see also “FINGERPRINTS”.
      A report is printed for each --group-by value (unless --no-report is given). Therefore, --group-by
      user,db means “report on queries with the same user and report on queries with the same db”; it does not
      mean “report on queries with the same user and db.” See also “OUTPUT”.
      Every value must have a corresponding value in the same position in --order-by. However, adding values
      to --group-by will automatically add values to --order-by, for your convenience.
      There are several magical values that cause some extra data mining to happen before the grouping takes place:
      fingerprint
            This causes events to be fingerprinted to abstract queries into a canonical form, which is then used to
            group events together into a class. See “FINGERPRINTS” for more about fingerprinting.
      tables
            This causes events to be inspected for what appear to be tables, and then aggregated by that. Note
            that a query that contains two or more tables will be counted as many times as there are tables; so a
            join against two tables will count the Query_time against both tables.
      distill
            This is a sort of super-fingerprint that collapses queries down into a suggestion of what they do, such
            as INSERT SELECT table1 table2.
      If parsing memcached input (--type memcached), there are other attributes which you can group by: key_print
      (see memcached section in “FINGERPRINTS”), cmd, key, res and val (see memcached section in “AT-
      TRIBUTES”).
--help
    Show help and exit.
--host
    short form: -h; type: string
      Connect to host.
--ignore-attributes
    type: array; default: arg, cmd, insert_id, ip, port, Thread_id, timestamp, exptime, flags, key, res, val, server_id,
    offset, end_log_pos, Xid
      Do not aggregate these attributes when auto-detecting --select.
      If you do not specify --select then pt-query-digest auto-detects and aggregates every attribute that it finds
      in the slow log. Some attributes, however, should not be aggregated. This option allows you to specify a list of
      attributes to ignore. This only works when no explicit --select is given.
--inherit-attributes
    type: array; default: db,ts
      If missing, inherit these attributes from the last event that had them.



       This option sets which attributes are inherited or carried forward to events which do not have them. For example,
       if one event has the db attribute equal to “foo”, but the next event doesn’t have the db attribute, then it inherits
       “foo” for its db attribute.
       Inheritance is usually desirable, but in some cases it might confuse things. If a query inherits a database that it
       doesn’t actually use, then this could confuse --execute.
--interval
    type: float; default: .1
       How frequently to poll the processlist, in seconds.
--iterations
    type: int; default: 1
       How many times to iterate through the collect-and-report cycle. If 0, iterate to infinity. Each iteration runs for
       --run-time amount of time. An iteration is usually determined by an amount of time and a report is printed
       when that amount of time elapses. With --run-time-mode interval, an interval is instead determined
       by the interval time you specify with --run-time. See --run-time and --run-time-mode for more
       information.
--limit
    type: Array; default: 95%:20
       Limit output to the given percentage or count.
       If the argument is an integer, report only the top N worst queries. If the argument is an integer followed by the
       % sign, report that percentage of the worst queries. If the percentage is followed by a colon and another integer,
       report the top percentage or the number specified by that integer, whichever comes first.
       The value is actually a comma-separated array of values, one for each item in --group-by. If you don’t
       specify a value for any of those items, the default is the top 95%.
       See also --outliers.
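       For example, to report only the ten worst query classes by total Query_time (the log path is illustrative):
       pt-query-digest /path/to/slow.log --limit 10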
--log
       type: string
       Print all output to this file when daemonized.
--mirror
    type: float
       How often to check whether connections should be moved, depending on read_only.                           Requires
       --processlist and --execute.
       This option causes pt-query-digest to check every N seconds whether it is reading from a read-write server and
       executing against a read-only server, which is a sensible way to set up two servers if you’re doing something like
       master-master replication. The master-master toolkit (http://code.google.com/p/mysql-master-master/) does this.
       The aim is to keep the passive server ready for failover, which is impossible without putting it under a realistic
       workload.
--order-by
    type: Array; default: Query_time:sum
       Sort events by this attribute and aggregate function.
       This is a comma-separated list of order-by expressions, one for each --group-by attribute. The default
       Query_time:sum is used for --group-by attributes without explicitly given --order-by attributes
       (that is, if you specify more --group-by attributes than corresponding --order-by attributes). The syntax
       is attribute:aggregate. See “ATTRIBUTES” for valid attributes. Valid aggregates are:




       Aggregate      Meaning
       =========      ============================
       sum            Sum/total attribute value
       min            Minimum attribute value
       max            Maximum attribute value
       cnt            Frequency/count of the query

       For example, the default Query_time:sum means that queries in the query analysis report will be ordered
       (sorted) by their total query execution time (“Exec time”). Query_time:max orders the queries by their
       maximum query execution time, so the query with the single largest Query_time will be listed first. cnt refers
       more to the frequency of the query as a whole, how often it appears; “Count” is its corresponding line in the
       query analysis report. So any attribute and cnt should yield the same report wherein queries are sorted by the
       number of times they appear.
       When parsing general logs (--type genlog), the default --order-by becomes Query_time:cnt.
       General logs do not report query times so only the cnt aggregate makes sense because all query times are
       zero.
       If you specify an attribute that doesn’t exist in the events, then pt-query-digest falls back to the default
       Query_time:sum and prints a notice at the beginning of the report for each query class. You can create
       attributes with --filter and order by them; see “ATTRIBUTES” for an example.
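       For example, to sort query classes by their single worst execution time instead of their total time (the log
       path is illustrative):
       pt-query-digest /path/to/slow.log --order-by Query_time:max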
--outliers
    type: array; default: Query_time:1:10
       Report outliers by attribute:percentile:count.
       The syntax of this option is a comma-separated list of colon-delimited strings. The first field is the attribute by
       which an outlier is defined. The second is a number that is compared to the attribute’s 95th percentile. The third
       is optional, and is compared to the attribute’s cnt aggregate. Queries that pass this specification are added to the
       report, regardless of any limits you specified in --limit.
       For example, to report queries whose 95th percentile Query_time is at least 60 seconds and which are seen at
       least 5 times, use the following argument:
       --outliers Query_time:60:5

       You can specify an --outliers option for each value in --group-by.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-pipeline-profile
    Print a profile of the pipeline processes.
-port
    short form: -P; type: int
       Port number to use for connection.
-print
    Print log events to STDOUT in standard slow-query-log format.



-print-iterations
    Print the start time for each --iterations.
      This option causes a line like the following to be printed at the start of each --iterations report:
      # Iteration 2 started at 2009-11-24T14:39:48.345780

      This line will print even if --no-report is specified. If --iterations 0 is specified, each iteration
      number will be 0.
-processlist
    type: DSN
      Poll this DSN’s processlist for queries, with --interval sleep between.
      If the connection fails, pt-query-digest tries to reopen it once per second. See also --mirror.
-progress
    type: array; default: time,30
      Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be
      percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage,
      seconds, or number of iterations.
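       For example, to print a progress line roughly every 25% of the way through the input instead of every 30 seconds,
       something like this should work:
       pt-query-digest --progress percentage,25 slow.log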
-read-timeout
    type: time; default: 0
      Wait this long for an event from the input; 0 to wait forever.
      This option sets the maximum time to wait for an event from the input. It applies to all types of input except
      --processlist. If an event is not received after the specified time, the script stops reading the input and
      prints its reports. If --iterations is 0 or greater than 1, the next iteration will begin, else the script will exit.
      This option requires the Perl POSIX module.
-[no]report
    default: yes
      Print out reports on the aggregate results from --group-by.
      This is the standard slow-log analysis functionality. See “OUTPUT” for the description of what this does and
      what the results look like.
-report-all
    Include all queries, even if they have already been reviewed.
-report-format
    type: Array; default: rusage,date,hostname,files,header,profile,query_report,prepared
      Print these sections of the query analysis report.
      SECTION           PRINTS
      ============      ======================================================
      rusage            CPU times and memory usage reported by ps
      date              Current local date and time
       hostname          Hostname of machine on which pt-query-digest was run
       files             Input files read/parsed
      header            Summary of the entire analysis run
      profile           Compact table of queries for an overview of the report
      query_report      Detailed information about each unique query
      prepared          Prepared statements

      The sections are printed in the order specified. The rusage, date, files and header sections are grouped together
      if specified together; other sections are separated by blank lines.


      See “OUTPUT” for more information on the various parts of the query report.
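       For example, to print only the compact profile and skip the other sections, an invocation along these lines should
       work (slow.log is a placeholder):
       pt-query-digest --report-format profile slow.log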
-report-histogram
    type: string; default: Query_time
      Chart the distribution of this attribute’s values.
      The distribution chart is limited to time-based attributes, so charting Rows_examined, for example, will
      produce a useless chart. Charts look like:
      # Query_time distribution
      #   1us
      # 10us
      # 100us
      #   1ms
      # 10ms ################################
      # 100ms ################################################################
      #    1s ########
      # 10s+

      A sparkline (see “SPARKLINES”) of the full chart is also printed in the header for each query event. The
      sparkline of that full chart is:
       # Query_time sparkline: |    .^_ |

      The sparkline itself is the 8 characters between the pipes (|), one character for each of the 8 buckets (1us, 10us,
      etc.) Four character codes are used to represent the approximate relation between each bucket’s value:
      _ . - ^

      The caret ^ represents peaks (buckets with the most values), and the underscore _ represents lows (buckets with
      the least or at least one value). The period . and the hyphen - represent buckets with values between these two
      extremes. If a bucket has no values, a space is printed. So in the example above, the period represents the 10ms
      bucket, the caret the 100ms bucket, and the underscore the 1s bucket.
      See “OUTPUT” for more information.
-review
    type: DSN
      Store a sample of each class of query in this DSN.
      The argument specifies a table to store all unique query fingerprints in. The table must have at least the
      following columns. You can add more columns for your own special purposes, but they won’t be used by
      pt-query-digest. The following CREATE TABLE definition is also used for --create-review-table.
      MAGIC_create_review:
      CREATE TABLE query_review (
         checksum     BIGINT UNSIGNED NOT NULL PRIMARY KEY,
         fingerprint TEXT NOT NULL,
         sample       TEXT NOT NULL,
         first_seen   DATETIME,
         last_seen    DATETIME,
         reviewed_by VARCHAR(20),
         reviewed_on DATETIME,
         comments     TEXT
      )

      The columns are as follows:




     COLUMN            MEANING
     ===========       ===============
     checksum          A 64-bit checksum of the query fingerprint
     fingerprint       The abstracted version of the query; its primary key
     sample            The query text of a sample of the class of queries
     first_seen        The smallest timestamp of this class of queries
     last_seen         The largest timestamp of this class of queries
     reviewed_by       Initially NULL; if set, query is skipped thereafter
     reviewed_on       Initially NULL; not assigned any special meaning
     comments          Initially NULL; not assigned any special meaning

     Note that the fingerprint column is the true primary key for a class of queries. The checksum is just a
     cryptographic hash of this value, which provides a shorter value that is very likely to also be unique.
     After parsing and aggregating events, your table should contain a row for each fingerprint. This option depends
     on --group-by fingerprint (which is the default). It will not work otherwise.
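       As a sketch (the host and database names are placeholders), the following stores a sample of each query class in
       percona.query_review, creating the table if needed via --create-review-table:
       pt-query-digest slow.log --review h=localhost,D=percona,t=query_review --create-review-table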
-review-history
    type: DSN
     The table in which to store historical values for review trend analysis.
     Each time you review queries with --review, pt-query-digest will save information into this table so you
     can see how classes of queries have changed over time.
     This DSN inherits unspecified values from --review. It should mention a table in which to store statistics
     about each class of queries. pt-query-digest verifies the existence of the table, and your privileges to insert,
     delete and update on that table.
     pt-query-digest then inspects the columns in the table. The table must have at least the following columns:
     CREATE TABLE query_review_history (
       checksum     BIGINT UNSIGNED NOT NULL,
       sample       TEXT NOT NULL
     );

     Any columns not mentioned above are inspected to see if they follow a certain naming convention. The column
     is special if the name ends with an underscore followed by any of these MAGIC_history_cols values:
     pct|avt|cnt|sum|min|max|pct_95|stddev|median|rank

     If the column ends with one of those values, then the prefix is interpreted as the event attribute to store in that
     column, and the suffix is interpreted as the metric to be stored. For example, a column named Query_time_min
     will be used to store the minimum Query_time for the class of events. The presence of this column will also add
     Query_time to the --select list.
     The table should also have a primary key, but that is up to you, depending on how you want to store the historical
     data. We suggest adding ts_min and ts_max columns and making them part of the primary key along with the
     checksum. But you could also just add a ts_min column and make it a DATE type, so you’d get one row per
     class of queries per day.
     The default table structure follows. The following MAGIC_create_review_history table definition is used for
     --create-review-history-table:
     CREATE TABLE query_review_history (
       checksum             BIGINT UNSIGNED NOT NULL,
       sample               TEXT NOT NULL,
       ts_min               DATETIME,
       ts_max               DATETIME,
       ts_cnt               FLOAT,
        Query_time_sum       FLOAT,
       Query_time_min        FLOAT,
       Query_time_max        FLOAT,
       Query_time_pct_95     FLOAT,
       Query_time_stddev     FLOAT,
       Query_time_median     FLOAT,
       Lock_time_sum         FLOAT,
       Lock_time_min         FLOAT,
       Lock_time_max         FLOAT,
       Lock_time_pct_95      FLOAT,
       Lock_time_stddev      FLOAT,
       Lock_time_median      FLOAT,
       Rows_sent_sum         FLOAT,
       Rows_sent_min         FLOAT,
       Rows_sent_max         FLOAT,
       Rows_sent_pct_95      FLOAT,
       Rows_sent_stddev      FLOAT,
       Rows_sent_median      FLOAT,
       Rows_examined_sum     FLOAT,
       Rows_examined_min     FLOAT,
       Rows_examined_max     FLOAT,
       Rows_examined_pct_95 FLOAT,
       Rows_examined_stddev FLOAT,
       Rows_examined_median FLOAT,
       -- Percona extended slowlog attributes
        -- http://www.percona.com/docs/wiki/patches:slow_extended
       Rows_affected_sum             FLOAT,
       Rows_affected_min             FLOAT,
       Rows_affected_max             FLOAT,
       Rows_affected_pct_95          FLOAT,
       Rows_affected_stddev          FLOAT,
       Rows_affected_median          FLOAT,
       Rows_read_sum                 FLOAT,
       Rows_read_min                 FLOAT,
       Rows_read_max                 FLOAT,
       Rows_read_pct_95              FLOAT,
       Rows_read_stddev              FLOAT,
       Rows_read_median              FLOAT,
       Merge_passes_sum              FLOAT,
       Merge_passes_min              FLOAT,
       Merge_passes_max              FLOAT,
       Merge_passes_pct_95           FLOAT,
       Merge_passes_stddev           FLOAT,
       Merge_passes_median           FLOAT,
       InnoDB_IO_r_ops_min           FLOAT,
       InnoDB_IO_r_ops_max           FLOAT,
       InnoDB_IO_r_ops_pct_95        FLOAT,
       InnoDB_IO_r_ops_stddev        FLOAT,
       InnoDB_IO_r_ops_median        FLOAT,
       InnoDB_IO_r_bytes_min         FLOAT,
       InnoDB_IO_r_bytes_max         FLOAT,
       InnoDB_IO_r_bytes_pct_95      FLOAT,
       InnoDB_IO_r_bytes_stddev      FLOAT,
       InnoDB_IO_r_bytes_median      FLOAT,
       InnoDB_IO_r_wait_min          FLOAT,
       InnoDB_IO_r_wait_max          FLOAT,
       InnoDB_IO_r_wait_pct_95       FLOAT,
       InnoDB_IO_r_wait_stddev       FLOAT,
        InnoDB_IO_r_wait_median       FLOAT,
        InnoDB_rec_lock_wait_min      FLOAT,
        InnoDB_rec_lock_wait_max      FLOAT,
        InnoDB_rec_lock_wait_pct_95   FLOAT,
        InnoDB_rec_lock_wait_stddev   FLOAT,
        InnoDB_rec_lock_wait_median   FLOAT,
        InnoDB_queue_wait_min         FLOAT,
        InnoDB_queue_wait_max         FLOAT,
        InnoDB_queue_wait_pct_95      FLOAT,
        InnoDB_queue_wait_stddev      FLOAT,
        InnoDB_queue_wait_median      FLOAT,
        InnoDB_pages_distinct_min     FLOAT,
        InnoDB_pages_distinct_max     FLOAT,
        InnoDB_pages_distinct_pct_95 FLOAT,
        InnoDB_pages_distinct_stddev FLOAT,
        InnoDB_pages_distinct_median FLOAT,
        -- Boolean (Yes/No) attributes. Only the cnt and sum are needed for these.
        -- cnt is how many times the attribute was recorded and sum is how many of
        -- those times the value was Yes. Therefore sum/cnt * 100 = % of recorded
        -- times that the value was Yes.
        QC_Hit_cnt          FLOAT,
        QC_Hit_sum          FLOAT,
        Full_scan_cnt       FLOAT,
        Full_scan_sum       FLOAT,
        Full_join_cnt       FLOAT,
        Full_join_sum       FLOAT,
        Tmp_table_cnt       FLOAT,
        Tmp_table_sum       FLOAT,
        Tmp_table_on_disk_cnt FLOAT,
        Tmp_table_on_disk_sum FLOAT,
        Filesort_cnt          FLOAT,
        Filesort_sum          FLOAT,
        Filesort_on_disk_cnt FLOAT,
        Filesort_on_disk_sum FLOAT,
        PRIMARY KEY(checksum, ts_min, ts_max)
      );

      Note that we store the count (cnt) for the ts attribute only; it would be redundant to store this for other attributes.
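      As a sketch (host, database and table names are placeholders), the following reviews queries and also records
      per-run statistics, creating both tables if they do not exist:
      pt-query-digest slow.log --review h=localhost,D=percona,t=query_review --review-history t=query_review_history --create-review-table --create-review-history-table
      Because --review-history inherits unspecified DSN values from --review, only the table needs to be given.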
-run-time
    type: time
       How long to run for each --iterations. The default is to run forever (you can interrupt with CTRL-C). Because
       --iterations defaults to 1, if you only specify --run-time, pt-query-digest runs for that amount of time and
       then exits. The two options are specified together to do collect-and-report cycles. For example, specifying
       --iterations 4 --run-time 15m with a continuous input (like STDIN or --processlist) will
       cause pt-query-digest to run for 1 hour (15 minutes x 4), reporting four times, once at each 15-minute interval.
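       For example, to poll the processlist in four 15-minute collect-and-report cycles (the host is a placeholder):
       pt-query-digest --processlist h=localhost --iterations 4 --run-time 15m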
-run-time-mode
    type: string; default: clock
      Set what the value of --run-time operates on. Following are the possible values for this option:
      clock
              --run-time specifies an amount of real clock time during which the tool should run for each
              --iterations.
      event
              --run-time specifies an amount of log time. Log time is determined by timestamps in the log.
            The first timestamp seen is remembered, and each timestamp after that is compared to the first to
           determine how much log time has passed. For example, if the first timestamp seen is 12:00:00 and
           the next is 12:01:30, that is 1 minute and 30 seconds of log time. The tool will read events until
           the log time is greater than or equal to the specified --run-time value.
           Since timestamps in logs are not always printed, or not always printed frequently, this mode varies in
           accuracy.
      interval
           --run-time specifies interval boundaries of log time into which events are divided and reports
           are generated. This mode is different from the others because it doesn’t specify how long to run.
            The value of --run-time must be an interval that divides evenly into minutes, hours or days. For
            example, 5m divides evenly into hours (60/5=12, so twelve 5-minute intervals per hour) but 7m does not
            (60/7=8.6).
           Specifying --run-time-mode interval --run-time 30m --iterations 0 is simi-
           lar to specifying --run-time-mode clock --run-time 30m --iterations 0. In the
           latter case, pt-query-digest will run forever, producing reports every 30 minutes, but this only works
           effectively with continuous inputs like STDIN and the processlist. For fixed inputs, like log files,
            the former example produces multiple reports by dividing the log into 30-minute intervals based on
            timestamps.
           Intervals are calculated from the zeroth second/minute/hour in which a timestamp occurs, not from
            whatever time it specifies. For example, with 30-minute intervals and a timestamp of 12:10:30, the
            interval is not 12:10:30 to 12:40:30; it is 12:00:00 to 12:29:59. Or, with 1-hour intervals,
            it is 12:00:00 to 12:59:59. When a new timestamp exceeds the interval, a report is printed, and
           the next interval is recalculated based on the new timestamp.
            Since --iterations is 1 by default, you probably want to specify a larger value, else pt-query-digest
            will only get and report on the first interval from the log, since 1 interval = 1 iteration. If you
            want to get and report every interval in a log, specify --iterations 0.
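            For example, the following sketch divides a slow log into one-day intervals and prints one report per
            day found in the log:
            pt-query-digest slow.log --run-time-mode interval --run-time 1d --iterations 0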
-sample
    type: int
      Filter out all but the first N occurrences of each query. The queries are filtered on the first value in
      --group-by, so by default, this will filter by query fingerprint. For example, --sample 2 will permit
      two sample queries for each fingerprint. Useful in conjunction with --print to print out the queries. You
      probably want to set --no-report to avoid the overhead of aggregating and reporting if you’re just using this
      to print out samples of queries. A complete example:
       pt-query-digest --sample 2 --no-report --print slow.log

-select
    type: Array
      Compute aggregate statistics for these attributes.
      By default pt-query-digest auto-detects, aggregates and prints metrics for every query attribute that it finds in
      the slow query log. This option specifies a list of only the attributes that you want. You can specify an alternative
      attribute with a colon. For example, db:Schema uses db if it’s available, and Schema if it’s not.
      Previously, pt-query-digest only aggregated these attributes:
      Query_time,Lock_time,Rows_sent,Rows_examined,user,db:Schema,ts

      Attributes specified in the --review-history table will always be selected even if you do not specify
      --select.
      See also --ignore-attributes and “ATTRIBUTES”.
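       For example, to aggregate only the core timing and row metrics, an invocation like this should work:
       pt-query-digest --select Query_time,Lock_time,Rows_sent,Rows_examined slow.log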



-set-vars
    type: string; default: wait_timeout=10000
     Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
     executed.
-shorten
    type: int; default: 1024
     Shorten long statements in reports.
     Shortens long statements, replacing the omitted portion with a /*... omitted ...*/ comment. This
     applies only to the output in reports, not to information stored for --review or other places. It prevents
     a large statement from causing difficulty in a report. The argument is the preferred length of the shortened
     statement. Not all statements can be shortened, but very large INSERT and similar statements often can; and so
     can IN() lists, although only the first such list in the statement will be shortened.
     If it shortens something beyond recognition, you can find the original statement in the log, at the offset shown
     in the report header (see “OUTPUT”).
-show-all
    type: Hash
     Show all values for these attributes.
      By default pt-query-digest only shows as many of an attribute's values as fit on a single line. This option
     allows you to specify attributes for which all values will be shown (line width is ignored). This only works for
     attributes with string values like user, host, db, etc. Multiple attributes can be specified, comma-separated.
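      For example, to show every client host and user seen for each query class:
      pt-query-digest --show-all host,user slow.log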
-since
    type: string
     Parse only queries newer than this value (parse queries since this date).
     This option allows you to ignore queries older than a certain value and parse only those queries which are more
     recent than the value. The value can be several types:
     * Simple time value N with optional suffix: N[shmd], where
       s=seconds, h=hours, m=minutes, d=days (default s if no suffix
       given); this is like saying "since N[shmd] ago"
     * Full date with optional hours:minutes:seconds:
        YYYY-MM-DD [HH:MM:SS]
     * Short, MySQL-style date:
       YYMMDD [HH:MM:SS]
     * Any time expression evaluated by MySQL:
       CURRENT_DATE - INTERVAL 7 DAY

     If you give a MySQL time expression, then you must also specify a DSN so that pt-query-digest can con-
     nect to MySQL to evaluate the expression. If you specify --execute, --explain, --processlist,
     --review or --review-history, then one of these DSNs will be used automatically. Otherwise, you
     must specify an --aux-dsn or pt-query-digest will die saying that the value is invalid.
     The MySQL time expression is wrapped inside a query like “SELECT UNIX_TIMESTAMP(<expression>)”,
     so be sure that the expression is valid inside this query. For example, do not use UNIX_TIMESTAMP() because
     UNIX_TIMESTAMP(UNIX_TIMESTAMP()) returns 0.
      Events are assumed to be in chronological order: older events at the beginning of the log and newer events at the end
      of the log. --since is strict: it ignores all queries until one is found that is new enough. Therefore, if the
      query events are not consistently timestamped, some may be ignored which are actually new enough.
     See also --until.
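      For example, to analyze only the last 48 hours of a slow log, or a specific date range, invocations along these
      lines should work:
      pt-query-digest --since 48h slow.log
      pt-query-digest --since '2012-03-01' --until '2012-03-08' slow.log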



-socket
    short form: -S; type: string
      Socket file to use for connection.
-statistics
    Print statistics about internal counters. This option is mostly for development and debugging. The statistics
    report is printed for each iteration after all other reports, even if no events are processed or --no-report is
    specified. The statistics report looks like:
      # No events processed.

      #   Statistic                                                          Count %/Events
      #   ================================================                   ====== ========
      #   events_read                                                        142030   100.00
      #   events_parsed                                                       50430    35.51
      #   events_aggregated                                                       0     0.00
      #   ignored_midstream_server_response                                   18111    12.75
      #   no_tcp_data                                                         91600    64.49
      #   pipeline_restarted_after_MemcachedProtocolParser                   142030   100.00
      #   pipeline_restarted_after_TcpdumpParser                                  1     0.00
      #   unknown_client_command                                                  1     0.00
      #   unknown_client_data                                                 32318    22.75

       The first column is the internal counter name; the second column is the counter's count; and the third column is the
       count as a percentage of events_read.
       In this case, it shows why no events were processed/aggregated: 100% of events were rejected by the
       MemcachedProtocolParser. Of those, 35.51% were data packets, but of these, 12.75% were ignored mid-stream
       server responses, one was an unknown client command, and 22.75% were unknown client data. The other
       64.49% were TCP control packets (probably mostly ACKs).
      Since pt-query-digest is complex, you will probably need someone familiar with its code to decipher the statis-
      tics report.
-table-access
    Print a table access report.
      The table access report shows which tables are accessed by all the queries and if the access is a read or write.
      The report looks like:
       write `baz`.`tbl`
       read `baz`.`new_tbl`
       write `baz`.`tbl3`
       write `db6`.`tbl6`

      If you pipe the output to sort, the read and write tables will be grouped together and sorted alphabetically:
       read `baz`.`new_tbl`
       write `baz`.`tbl`
       write `baz`.`tbl3`
       write `db6`.`tbl6`
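       For example, to produce the sorted form shown above directly (slow.log is a placeholder):
       pt-query-digest --table-access slow.log | sort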

-tcpdump-errors
    type: string
      Write the tcpdump data to this file on error. If pt-query-digest doesn’t parse the stream correctly for some
      reason, the session’s packets since the last query event will be written out to create a usable test case. If this
      happens, pt-query-digest will not raise an error; it will just discard the session’s saved state and permit the tool
      to continue working. See “tcpdump” for more information about parsing tcpdump output.



-timeline
    Show a timeline of events.
      This option makes pt-query-digest print another kind of report: a timeline of the events. Each query is still
      grouped and aggregated into classes according to --group-by, but then they are printed in chronological
      order. The timeline report prints out the timestamp, interval, count and value of each class.
     If all you want is the timeline report, then specify --no-report to suppress the default query analy-
     sis report. Otherwise, the timeline report will be printed at the end before the response-time profile (see
     --report-format and “OUTPUT”).
     For example, this:
      pt-query-digest /path/to/log --group-by distill --timeline

     will print something like:
     #   ########################################################
     #   distill report
     #   ########################################################
     #   2009-07-25 11:19:27 1+00:00:01   2 SELECT foo
     #   2009-07-27 11:19:30      00:01   2 SELECT bar
     #   2009-07-27 11:30:00 1+06:30:00   2 SELECT foo

-type
    type: Array
     The type of input to parse (default slowlog). The permitted types are
     binlog
            Parse a binary log file.
     genlog
            Parse a MySQL general log file. General logs lack a lot of “ATTRIBUTES”, notably Query_time.
            The default --order-by for general logs changes to Query_time:cnt.
     http
            Parse HTTP traffic from tcpdump.
     pglog
            Parse a log file in PostgreSQL format. The parser will automatically recognize logs sent to syslog
            and transparently parse the syslog format, too. The recommended configuration for logging in your
            postgresql.conf is as follows.
            The log_destination setting can be set to either syslog or stderr. Syslog has the added benefit of not
            interleaving log messages from several sessions concurrently, which the parser cannot handle, so this
            might be better than stderr. CSV-formatted logs are not supported at this time.
            The log_min_duration_statement setting should be set to 0 to capture all statements with their dura-
            tions. Alternatively, the parser will also recognize and handle various combinations of log_duration
            and log_statement.
            You may enable log_connections and log_disconnections, but this is optional.
            It is highly recommended to set your log_line_prefix to the following:
            log_line_prefix = ’%m c=%c,u=%u,D=%d ’

            This lets the parser find timestamps with milliseconds, session IDs, users, and databases from the
            log. If these items are missing, you'll simply get less information to analyze. For compatibility with
          other log analysis tools such as PQA and pgfouine, various log line prefix formats are supported. The
          general format is as follows: a timestamp can be detected and extracted (the syslog timestamp is NOT
          parsed), and a name=value list of properties can also. Although the suggested format is as shown
          above, any name=value list will be captured and interpreted by using the first letter of the ‘name’
          part, lowercased, to determine the meaning of the item. The lowercased first letter is interpreted
          to mean the same thing as PostgreSQL’s built-in %-codes for the log_line_prefix format string. For
          example, u means user, so unicorn=fred will be interpreted as user=fred; d means database, so D=john
          will be interpreted as database=john. The pgfouine-suggested formatting is user=%u and db=%d, so
          it should Just Work regardless of which format you choose. The main thing is to add as much
          information as possible into the log_line_prefix to permit richer analysis.
          Currently, only English locale messages are supported, so if your server’s locale is set to something
          else, the log won’t be parsed properly. (Log messages with “duration:” and “statement:” won’t be
          recognized.)
      slowlog
          Parse a log file in any variation of MySQL slow-log format.
      tcpdump
          Inspect network packets and decode the MySQL client protocol, extracting queries and responses
          from it.
          pt-query-digest does not actually watch the network (i.e. it does NOT “sniff packets”). Instead, it’s
          just parsing the output of tcpdump. You are responsible for generating this output; pt-query-digest
          does not do it for you. Then you send this to pt-query-digest as you would any log file: as files on
          the command line or to STDIN.
          The parser expects the input to be formatted with the following options: -x -n -q -tttt. For
          example, if you want to capture output from your local machine, you can do something like the
          following (the port must come last on FreeBSD):
          tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 
            > mysql.tcp.txt
            pt-query-digest --type tcpdump mysql.tcp.txt

          The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make sure the output looks
          like this (there is a line break in the first line to avoid man-page problems):
          2009-04-12 09:50:16.804849 IP 127.0.0.1.42167
                 > 127.0.0.1.3306: tcp 37
              0x0000: 4508 0059 6eb2 4000 4006 cde2 7f00 0001
              0x0010: ....

          Remember tcpdump has a handy -c option to stop after it captures some number of packets! That’s
          very useful for testing your tcpdump command. Note that tcpdump can’t capture traffic on a Unix
            socket. Read http://bugs.mysql.com/bug.php?id=31577 if you're confused about this.
          Devananda Van Der Veen explained on the MySQL Performance Blog how to capture traffic without
          dropping packets on busy servers. Dropped packets cause pt-query-digest to miss the response to
          a request, then see the response to a later request and assign the wrong execution time to the query.
          You can change the filter to something like the following to help capture a subset of the queries. (See
            http://www.mysqlperformanceblog.com/?p=6092 for details.)
          tcpdump -i any -s 65535 -x -n -q -tttt 
             ’port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2’

            All MySQL servers running on port 3306 are automatically detected in the tcpdump output.
            Therefore, if the tcpdump output contains packets from multiple servers on port 3306 (for example,
            10.0.0.1:3306, 10.0.0.2:3306, etc.), all packets/queries from all these servers will be analyzed
            together as if they were one server.
           If you’re analyzing traffic for a MySQL server that is not running on port 3306, see
           --watch-server.
           Also note that pt-query-digest may fail to report the database for queries when parsing tcpdump out-
           put. The database is discovered only in the initial connect events for a new client or when <USE db>
           is executed. If the tcpdump output contains neither of these, then pt-query-digest cannot discover
           the database.
           Server-side prepared statements are supported. SSL-encrypted traffic cannot be inspected and de-
           coded.
      memcached
           Similar to tcpdump, but the expected input is memcached packets instead of MySQL packets. For
           example:
           tcpdump -i any port 11211 -s 65535 -x -nn -q -tttt 
             > memcached.tcp.txt
            pt-query-digest --type memcached memcached.tcp.txt

           memcached uses port 11211 by default.
-until
    type: string
      Parse only queries older than this value (parse queries until this date).
      This option allows you to ignore queries newer than a certain value and parse only those queries which are older
      than the value. The value can be one of the same types listed for --since.
      Unlike --since, --until is not strict: all queries are parsed until one has a timestamp that is equal to or
      greater than --until. Then all subsequent queries are ignored.
-user
    short form: -u; type: string
      User for login if not current user.
-variations
    type: Array
      Report the number of variations in these attributes’ values.
      Variations show how many distinct values an attribute had within a class. The usual value for this option is arg
      which shows how many distinct queries were in the class. This can be useful to determine a query’s cacheability.
      Distinct values are determined by CRC32 checksums of the attributes’ values. These checksums are reported in
      the query report for attributes specified by this option, like:
      # arg crc            109 (1/25%), 144 (1/25%)... 2 more

      In that class there were 4 distinct queries. The checksums of the first two variations are shown, and each one
      occurred once (or, 25% of the time).
       The count of distinct variations is approximate because only 1,000 variations are saved: the full CRC32
       checksum modulo (%) 1000 is saved, so some distinct checksums are treated as equal.
-version
    Show version and exit.




-watch-server
    type: string
      This option tells pt-query-digest which server IP address and port (like “10.0.0.1:3306”) to watch when parsing
      tcpdump (for --type tcpdump and memcached); all other servers are ignored. If you don’t specify it, pt-
      query-digest watches all servers by looking for any IP address using port 3306 or “mysql”. If you’re watching
      a server with a non-standard port, this won’t work, so you must specify the IP address and port to watch.
      If you want to watch a mix of servers, some running on standard port 3306 and some running on non-standard
      ports, you need to create separate tcpdump outputs for the non-standard port servers and then specify this option
      for each. At present pt-query-digest cannot auto-detect servers on port 3306 and also be told to watch a server
      on a non-standard port.
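       For example, to analyze tcpdump output for a single server listening on a non-standard port (the address is a
       placeholder):
       pt-query-digest --type tcpdump --watch-server 10.0.0.1:3307 mysql.tcp.txt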
-[no]zero-admin
    default: yes
      Zero out the Rows_XXX properties for administrator command events.
-[no]zero-bool
    default: yes
      Print 0% boolean values in report.


2.21.12 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
   • D
      dsn: database; copy: yes
      Database that contains the query review table.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • p
      dsn: password; copy: yes
      Password to use when connecting.
   • P
      dsn: port; copy: yes
      Port number to use for connection.


    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • t
      Table to use as the query review table.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.21.13 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-query-digest ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.21.14 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.21.15 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-query-digest.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.21.16 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb




You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.21.17 AUTHORS

Baron Schwartz and Daniel Nichter


2.21.18 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.21.19 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2008-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.21.20 VERSION

pt-query-digest 2.1.1


2.22 pt-show-grants

2.22.1 NAME

pt-show-grants - Canonicalize and print MySQL grants so you can effectively replicate, compare and version-control
them.


2.22.2 SYNOPSIS

Usage

pt-show-grants [OPTION...] [DSN]

pt-show-grants shows grants (user privileges) from a MySQL server.


Examples

pt-show-grants

pt-show-grants --separate --revoke | diff othergrants.sql -



2.22.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-show-grants is read-only by default, and very low-risk. If you specify --flush, it will execute FLUSH
PRIVILEGES.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-show-grants.
See also “BUGS” for more information on filing bugs and getting help.


2.22.4 DESCRIPTION

pt-show-grants extracts, orders, and then prints grants for MySQL user accounts.
Why would you want this? There are several reasons.
The first is to easily replicate users from one server to another; you can simply extract the grants from the first server
and pipe the output directly into another server.
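As a sketch (the server names are placeholders), this copies all grants from one server into another:
pt-show-grants h=server1 | mysql -h server2 -u root -p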
The second use is to place your grants into version control. If you do a daily automated grant dump into version
control, you’ll get lots of spurious changesets for grants that don’t change, because MySQL prints the actual grants
out in a seemingly random order. For instance, one day it’ll say
GRANT DELETE, INSERT, UPDATE ON `test`.* TO 'foo'@'%';

And then another day it’ll say
GRANT INSERT, DELETE, UPDATE ON `test`.* TO 'foo'@'%';

The grants haven’t changed, but the order has. This script sorts the grants within the line, between ‘GRANT’ and
‘ON’. If there are multiple rows from SHOW GRANTS, it sorts the rows too, except that it always prints the row
with the user’s password first, if it exists. This removes three kinds of inconsistency you’ll get from running SHOW
GRANTS, and avoids spurious changesets in version control.
Third, if you want to diff grants across servers, it will be hard without “canonicalizing” them, which pt-show-grants
does. The output is fully diff-able.
With the --revoke, --separate and other options, pt-show-grants also makes it easy to revoke specific privi-
leges from users. This is tedious otherwise.


2.22.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.



-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-database
    short form: -D; type: string
      The database to use for the connection.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-drop
    Add DROP USER before each user in the output.
-flush
    Add FLUSH PRIVILEGES after output.
      You might need this on pre-4.1.1 servers if you want to drop a user completely.
-[no]header
    default: yes
      Print dump header.
      The header precedes the dumped grants. It looks like:
      -- Grants dumped by pt-show-grants 1.0.19
      -- Dumped from server Localhost via UNIX socket, MySQL 5.0.82-log at 2009-10-26 10:01:04

      See also --[no]timestamp.
-help
    Show help and exit.
-host
    short form: -h; type: string
      Connect to host.
-ignore
    type: array
      Ignore this comma-separated list of users.
-only
    type: array
      Only show grants for this comma-separated list of users.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
-port
    short form: -P; type: int
       Port number to use for connection.
-revoke
    Add REVOKE statements for each GRANT statement.
-separate
    List each GRANT or REVOKE separately.
       The default output from MySQL’s SHOW GRANTS command lists many privileges on a single line. With
       --flush, this places a FLUSH PRIVILEGES after each user, instead of once at the end of all the output.
-set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
-socket
    short form: -S; type: string
       Socket file to use for connection.
-[no]timestamp
    default: yes
       Add timestamp to the dump header.
       See also --[no]header.
-user
    short form: -u; type: string
       User for login if not current user.
-version
    Show version and exit.


2.22.6 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
       dsn: charset; copy: yes
       Default character set.


    • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.22.7 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-show-grants ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.22.8 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.22.9 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-show-grants.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool


    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.22.10 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.22.11 AUTHORS

Baron Schwartz


2.22.12 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.22.13 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.




2.22.14 VERSION

pt-show-grants 2.1.1


2.23 pt-sift

2.23.1 NAME

pt-sift - Browses files created by pt-collect.


2.23.2 SYNOPSIS

Usage

pt-sift FILE|PREFIX|DIRECTORY

pt-sift browses the files created by pt-collect. If you specify a FILE or PREFIX, it browses only files with that prefix.
If you specify a DIRECTORY, then it browses all files within that directory.


2.23.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-sift is a read-only tool. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-sift.
See also “BUGS” for more information on filing bugs and getting help.


2.23.4 DESCRIPTION

pt-sift downloads other tools that it might need, such as pt-diskstats, and then makes a list of the unique timestamp
prefixes of all the files in the directory, as written by the pt-collect tool. If the user specified a timestamp on the
command line, then it begins with that sample of data; otherwise it begins by showing a list of the timestamps and
prompting for a selection. Thereafter, it displays a summary of the selected sample, and the user can navigate and
inspect with keystrokes. The keystroke commands you can use are as follows:
d
      Sets the action to start the pt-diskstats tool on the sample’s disk performance statistics.
i
      Sets the action to view the first INNODB STATUS sample in less.
m
      Displays the first 4 samples of SHOW STATUS counters side by side with the pt-mext tool.
n
      Summarizes the first sample of netstat data in two ways: by originating host, and by connection state.
j
      Select the next timestamp as the active sample.
k
      Select the previous timestamp as the active sample.
q
      Quit the program.
1
      Sets the action for each sample to the default, which is to view a summary of the sample.
0
      Sets the action to just list the files in the sample.
*
      Sets the action to view all of the sample's files in the less program.


2.23.5 OPTIONS

This tool does not have any command-line options.


2.23.6 ENVIRONMENT

This tool does not use any environment variables.


2.23.7 SYSTEM REQUIREMENTS

This tool requires Bash v3 and the following programs: pt-diskstats, pt-pmp, pt-mext, and pt-align. If these programs
are not in your PATH, they will be fetched from the Internet if curl is available.


2.23.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-sift.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.




2.23.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.23.10 AUTHORS

Baron Schwartz


2.23.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.23.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING,
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.23.13 VERSION

pt-sift 2.1.1







2.24 pt-slave-delay

2.24.1 NAME

pt-slave-delay - Make a MySQL slave server lag behind its master.


2.24.2 SYNOPSIS

Usage

pt-slave-delay [OPTION...] SLAVE-HOST [MASTER-HOST]

pt-slave-delay starts and stops a slave server as needed to make it lag behind the master. The SLAVE-HOST and
MASTER-HOST use DSN syntax, and values are copied from the SLAVE-HOST to the MASTER-HOST if omitted.
To hold slavehost one minute behind its master for ten minutes:
pt-slave-delay --delay 1m --interval 15s --run-time 10m slavehost



2.24.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-slave-delay is generally very low-risk. It simply starts and stops the replication SQL thread. This might cause
monitoring systems to think the slave is having trouble.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
slave-delay.
See also “BUGS” for more information on filing bugs and getting help.


2.24.4 DESCRIPTION

pt-slave-delay watches a slave and starts and stops its replication SQL thread as necessary to hold it at least as
far behind the master as you request. In practice, it will typically cause the slave to lag between --delay and
--delay plus --interval behind the master.
It bases the delay on binlog positions in the slave’s relay logs by default, so there is no need to connect to the master.
This works well if the IO thread doesn’t lag the master much, which is typical in most replication setups; the IO thread
lag is usually milliseconds on a fast network. If your IO thread’s lag is too large for your purposes, pt-slave-delay can
also connect to the master for information about binlog positions.
If the slave’s I/O thread reports that it is waiting for the SQL thread to free some relay log space, pt-slave-delay will
automatically connect to the master to find binary log positions. If --ask-pass and --daemonize are given, it
is possible that this could cause it to ask for a password while daemonized. In this case, it exits. Therefore, if you
think your slave might encounter this condition, you should be sure to either specify --use-master explicitly when
daemonizing, or don’t specify --ask-pass.
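For example, here is a hedged sketch of a daemonized invocation that always reads binlog positions from the master;
the host names and the log path are placeholders:
pt-slave-delay --delay 1h --use-master --daemonize --log /var/log/pt-slave-delay.log slavehost masterhost
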
The SLAVE-HOST and optional MASTER-HOST are both DSNs. See “DSN OPTIONS”. Missing MASTER-HOST
values are filled in with values from SLAVE-HOST, so you don’t need to specify them in both places. pt-slave-delay
reads all normal MySQL option files, such as ~/.my.cnf, so you may not need to specify username, password and other
common options at all.
pt-slave-delay tries to exit gracefully by trapping signals such as Ctrl-C. You cannot bypass --[no]continue with
a trappable signal.


2.24.5 PRIVILEGES

pt-slave-delay requires the following privileges: PROCESS, REPLICATION CLIENT, and SUPER.


2.24.6 OUTPUT

If you specify --quiet, there is no output. Otherwise, the normal output is a status message consisting of a timestamp
and information about what pt-slave-delay is doing: starting the slave, stopping the slave, or just observing.


2.24.7 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-[no]continue
    default: yes
      Continue replication normally on exit. After exiting, restart the slave’s SQL thread with no UNTIL condition,
      so it will run as usual and catch up to the master. This is enabled by default and works even if you terminate
      pt-slave-delay with Control-C.
-daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-delay
    type: time; default: 1h
      How far the slave should lag its master.
-help
    Show help and exit.






-host
    short form: -h; type: string
       Connect to host.
-interval
    type: time; default: 1m
       How frequently pt-slave-delay should check whether the slave needs to be started or stopped.
-log
       type: string
       Print all output to this file when daemonized.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-port
    short form: -P; type: int
       Port number to use for connection.
-quiet
    short form: -q
       Don’t print informational messages about operation. See OUTPUT for details.
-run-time
    type: time
       How long pt-slave-delay should run before exiting. The default is to run forever.
-set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
-socket
    short form: -S; type: string
       Socket file to use for connection.
-use-master
    Get binlog positions from master, not slave. Don’t trust the binlog positions in the slave’s relay log. Connect
    to the master and get binlog positions instead. If you specify this option without giving a MASTER-HOST on
    the command line, pt-slave-delay examines the slave’s SHOW SLAVE STATUS to determine the hostname and
    port for connecting to the master.
       pt-slave-delay uses only the MASTER_HOST and MASTER_PORT values from SHOW SLAVE STATUS for
       the master connection. It does not use the MASTER_USER value. If you want to specify a different username
       for the master than the one you use to connect to the slave, you should specify the MASTER-HOST option
       explicitly on the command line.






-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.


2.24.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
   • D
      dsn: database; copy: yes
      Default database.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • p
      dsn: password; copy: yes
      Password to use when connecting.
   • P
      dsn: port; copy: yes
      Port number to use for connection.
   • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
   • u
      dsn: user; copy: yes
      User for login if not current user.
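As an illustration of this syntax (host names, port, and user are placeholders), a slave DSN combining several of
these options, followed by an explicit master DSN, might look like:
pt-slave-delay --delay 30m h=slave1,P=3307,u=maint h=master1,P=3306

Any value not given in the master DSN is copied from the slave DSN.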






2.24.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-slave-delay ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.24.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.24.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-slave-delay.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.24.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.24.13 AUTHORS

Sergey Zhuravlev and Baron Schwartz






2.24.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.24.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Sergey Zhuravle and Baron Schwartz, 2011-2012 Percona Inc. Feedback and
improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.24.16 VERSION

pt-slave-delay 2.1.1


2.25 pt-slave-find

2.25.1 NAME

pt-slave-find - Find and print replication hierarchy tree of MySQL slaves.


2.25.2 SYNOPSIS

Usage

pt-slave-find [OPTION...] MASTER-HOST

pt-slave-find finds and prints a hierarchy tree of MySQL slaves.


Examples

pt-slave-find --host master-host






2.25.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-slave-find is read-only and very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
slave-find.
See also “BUGS” for more information on filing bugs and getting help.


2.25.4 DESCRIPTION

pt-slave-find connects to a MySQL replication master and finds its slaves. Currently the only thing it can do is print a
tree-like view of the replication hierarchy.
The master host can be specified using one of two methods. The first method is to use the standard connection-related
command line options: --defaults-file, --password, --host, --port, --socket or --user.
The second method to specify the master host is a DSN. A DSN is a special syntax that can be either just a hostname
(like server.domain.com or 1.2.3.4), or a key=value,key=value string. Keys are a single letter:
KEY   MEANING
===   =======
h     Connect to host
P     Port number to use for connection
S     Socket file to use for connection
u     User for login if not current user
p     Password to use when connecting
F     Only read default options from the given file

pt-slave-find reads all normal MySQL option files, such as ~/.my.cnf, so you may not need to specify username,
password and other common options at all.
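For example, a hedged sketch that specifies the master as a DSN instead of using connection options (the host, port,
and user are placeholders):
pt-slave-find --report-format hostname h=master1,P=3306,u=readonly
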


2.25.5 EXIT STATUS

An exit status of 0 (sometimes also called a return value or return code) indicates success. Any other value represents
the exit status of the Perl process itself.


2.25.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.




-config
    type: Array
       Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-database
    type: string; short form: -D
       Database to use.
-defaults-file
    short form: -F; type: string
       Only read mysql options from the given file. You must give an absolute pathname.
-help
    Show help and exit.
-host
    short form: -h; type: string
       Connect to host.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
-port
    short form: -P; type: int
       Port number to use for connection.
-recurse
    type: int
       Number of levels to recurse in the hierarchy. Default is infinite.
       See --recursion-method.
-recursion-method
    type: string
       Preferred recursion method used to find slaves.
       Possible methods are:
       METHOD             USES
       ===========        ================
       processlist        SHOW PROCESSLIST
       hosts              SHOW SLAVE HOSTS

       The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method
       is required if the server uses a non-standard port (not 3306). Usually pt-slave-find does the right thing and finds
       the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the other
       methods will be tried.
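      For example (the host and port are placeholders), to try the hosts method first when the servers listen on a
      non-standard port:
      pt-slave-find --recursion-method hosts h=master1,P=3307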




-report-format
    type: string; default: summary
      Set what information about the slaves is printed. The report format can be one of the following:
          •hostname

            Print just the hostname of the slaves. It looks like:
           127.0.0.1:12345
           +- 127.0.0.1:12346
              +- 127.0.0.1:12347


          •summary

           Print a summary of each slave’s settings. This report shows more information about each slave, like:
           127.0.0.1:12345
           Version         5.1.34-log
           Server ID       12345
           Uptime          04:56 (started 2010-06-17T11:21:22)
           Replication     Is not a slave, has 1 slaves connected
           Filters
           Binary logging STATEMENT
           Slave status
           Slave mode      STRICT
           Auto-increment increment 1, offset 1
           +- 127.0.0.1:12346
              Version         5.1.34-log
              Server ID       12346
              Uptime          04:54 (started 2010-06-17T11:21:24)
              Replication     Is a slave, has 1 slaves connected
              Filters
              Binary logging STATEMENT
              Slave status    0 seconds behind, running, no errors
              Slave mode      STRICT
              Auto-increment increment 1, offset 1


-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.






2.25.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.25.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-slave-find ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.






2.25.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.25.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-slave-find.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.25.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.25.12 AUTHORS

Baron Schwartz and Daniel Nichter


2.25.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.






2.25.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.25.15 VERSION

pt-slave-find 2.1.1


2.26 pt-slave-restart

2.26.1 NAME

pt-slave-restart - Watch and restart MySQL replication after errors.


2.26.2 SYNOPSIS

Usage

pt-slave-restart [OPTION...] [DSN]

pt-slave-restart watches one or more MySQL replication slaves for errors, and tries to restart replication if it stops.


2.26.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-slave-restart is a brute-force way to try to keep a slave server running when it is having problems with replication.
Don’t be too hasty to use it unless you need to. If you use this tool carelessly, you might miss the chance to really
solve the slave server’s problems.
At the time of this release there is a bug that causes an invalid CHANGE MASTER TO statement to be executed.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
slave-restart.
See also “BUGS” for more information on filing bugs and getting help.






2.26.4 DESCRIPTION

pt-slave-restart watches one or more MySQL replication slaves and tries to skip statements that cause errors. It polls
slaves intelligently with an exponentially varying sleep time. You can specify errors to skip and run the slaves until a
certain binlog position.
Note: it has come to my attention that Yahoo! had or has an internal tool called fix_repl, described to me by a past
Yahoo! employee and mentioned in the first edition of High Performance MySQL. Apparently this tool does the same
thing. Make no mistake, though: this is not a way to “fix replication.” In fact I would not even encourage its use on a
regular basis; I use it only when I have an error I know I just need to skip past.


2.26.5 OUTPUT

If you specify --verbose, pt-slave-restart prints a line every time it sees the slave has an error. See --verbose
for details.


2.26.6 SLEEP

pt-slave-restart sleeps intelligently between polling the slave. The current sleep time varies.
    • The initial sleep time is given by --sleep.
    • If it checks and finds an error, it halves the previous sleep time.
    • If it finds no error, it doubles the previous sleep time.
    • The sleep time is bounded below by --min-sleep and above by --max-sleep.
    • Immediately after finding an error, pt-slave-restart assumes another error is very likely to happen next, so it
      sleeps the current sleep time or the initial sleep time, whichever is less.
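One way to read these rules as a shell sketch (this is an illustration, not the tool’s actual code; next_sleep is a
hypothetical helper, and INITIAL, MIN, and MAX stand in for the --sleep, --min-sleep, and --max-sleep values):
next_sleep() {   # usage: next_sleep CURRENT_SLEEP HAD_ERROR(0|1); prints the next sleep time in seconds
    awk -v t="$1" -v err="$2" -v init="$INITIAL" -v lo="$MIN" -v hi="$MAX" 'BEGIN {
        if (err + 0) {
            t = t / 2                # halve the previous sleep time after an error
            if (t > init) t = init   # right after an error, never sleep longer than the initial time
        } else {
            t = t * 2                # double the sleep time when no error was found
        }
        if (t < lo) t = lo           # bounded below by --min-sleep
        if (t > hi) t = hi           # bounded above by --max-sleep
        print t
    }'
}

For example, with INITIAL=1, MIN=0.015625, and MAX=64 (the documented defaults), next_sleep 8 0 prints 16 and
next_sleep 8 1 prints 1.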


2.26.7 EXIT STATUS

An exit status of 0 (sometimes also called a return value or return code) indicates success. Any other value represents
the exit status of the Perl process itself, or of the last forked process that exited if there were multiple servers to
monitor.


2.26.8 COMPATIBILITY

pt-slave-restart should work on many versions of MySQL. Lettercase of many output columns from SHOW SLAVE
STATUS has changed over time, so it treats them all as lowercase.


2.26.9 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-always
    Start slaves even when there is no error. With this option enabled, pt-slave-restart will not let you stop the slave
    manually if you want to!
-ask-pass
    Prompt for a password when connecting to MySQL.






-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-[no]check-relay-log
    default: yes
      Check the last relay log file and position before checking for slave errors.
      By default pt-slave-restart will not do anything (it will just sleep) if neither the relay log file nor the relay
      log position have changed since the last check. This prevents infinite loops (i.e. restarting the same error in the
      same relay log file at the same relay log position).
      For certain slave errors, however, this check needs to be disabled by specifying --no-check-relay-log.
      Do not do this unless you know what you are doing!
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
-database
    short form: -D; type: string
      Database to use.
-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-error-length
    type: int
      Max length of error message to print. When --verbose is set high enough to print the error, this option will
      truncate the error text to the specified length. This can be useful to prevent wrapping on the terminal.
-error-numbers
    type: hash
      Only restart this comma-separated list of errors. Makes pt-slave-restart only try to restart if the error number
      is in this comma-separated list of errors. If it sees an error not in the list, it will exit.
      The error number is in the last_errno column of SHOW SLAVE STATUS.
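      For example, assuming you only want to skip past duplicate-key errors (MySQL error 1062) and exit on anything
      else (the DSN is a placeholder):
      pt-slave-restart --error-numbers 1062 h=slave1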
-error-text
    type: string
      Only restart errors that match this pattern. A Perl regular expression against which the error text, if any, is
      matched. If the error text exists and matches, pt-slave-restart will try to restart the slave. If it exists but doesn’t
      match, pt-slave-restart will exit.
      The error text is in the last_error column of SHOW SLAVE STATUS.
-help
    Show help and exit.






-host
    short form: -h; type: string
       Connect to host.
-log
       type: string
       Print all output to this file when daemonized.
-max-sleep
    type: float; default: 64
       Maximum sleep seconds.
       The maximum time pt-slave-restart will sleep before polling the slave again. This is also the time that pt-slave-
       restart will wait for all other running instances to quit if both --stop and --monitor are specified.
       See “SLEEP”.
-min-sleep
    type: float; default: 0.015625
       The minimum time pt-slave-restart will sleep before polling the slave again. See “SLEEP”.
-monitor
    Whether to monitor the slave (default). Unless you specify --monitor explicitly, --stop will disable it.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-port
    short form: -P; type: int
       Port number to use for connection.
-quiet
    short form: -q
       Suppresses normal output (disables --verbose).
-recurse
    type: int; default: 0
       Watch slaves of the specified server, up to the specified number of servers deep in the hierarchy. The default
       depth of 0 means “just watch the slave specified.”
       pt-slave-restart examines SHOW PROCESSLIST and tries to determine which connections are from slaves,
       then connects to them. See --recursion-method.
       Recursion works by finding all slaves when the program starts, then watching them. If there is more than one
       slave, pt-slave-restart uses fork() to monitor them.
       This also works if you have configured your slaves to show up in SHOW SLAVE HOSTS. The minimal config-
       uration for this is the report_host parameter, but there are other “report” parameters as well for the port,
       username, and password.





-recursion-method
    type: string
      Preferred recursion method used to find slaves.
      Possible methods are:
      METHOD             USES
      ===========        ================
      processlist        SHOW PROCESSLIST
      hosts              SHOW SLAVE HOSTS

      The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method
      is required if the server uses a non-standard port (not 3306). Usually pt-slave-restart does the right thing and
      finds the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the
      other methods will be tried.
-run-time
    type: time
      Time to run before exiting. Causes pt-slave-restart to stop after the specified time has elapsed. Optional suffix:
      s=seconds, m=minutes, h=hours, d=days; if no suffix, s is used.
-sentinel
    type: string; default: /tmp/pt-slave-restart-sentinel
      Exit if this file exists.
-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-skip-count
    type: int; default: 1
      Number of statements to skip when restarting the slave.
-sleep
    type: int; default: 1
      Initial sleep seconds between checking the slave.
      See “SLEEP”.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-stop
    Stop running instances by creating the sentinel file.
      Causes pt-slave-restart to create the sentinel file specified by --sentinel. This should have the effect of
      stopping all running instances which are watching the same sentinel file. If --monitor isn’t specified, pt-
      slave-restart will exit after creating the file. If it is specified, pt-slave-restart will wait the interval given by
      --max-sleep, then remove the file and continue working.
      You might find this handy to stop cron jobs gracefully if necessary, or to replace one running instance with
      another. For example, if you want to stop and restart pt-slave-restart every hour (just to make sure that it is
      restarted every hour, in case of a server crash or some other problem), you could use a crontab line like this:

      0 * * * * pt-slave-restart --monitor --stop --sentinel /tmp/pt-slave-restartup

      The non-default --sentinel will make sure the hourly cron job stops only instances previously started with
      the same options (that is, from the same cron job).
      See also --sentinel.
-until-master
    type: string
      Run until this master log file and position. Start the slave, and retry if it fails, until it reaches the given repli-
      cation coordinates. The coordinates are the logfile and position on the master, given by relay_master_log_file,
      exec_master_log_pos. The argument must be in the format “file,pos”. Separate the filename and position with a
      single comma and no space.
      This will also cause an UNTIL clause to be given to START SLAVE.
      After reaching this point, the slave should be stopped and pt-slave-restart will exit.
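      For illustration only (the binlog file name, position, and DSN here are placeholders):
      pt-slave-restart --until-master mysql-bin.000042,107356 h=slave1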
-until-relay
    type: string
      Run until this relay log file and position. Like --until-master, but in the slave’s relay logs instead. The
      coordinates are given by relay_log_file, relay_log_pos.
-user
    short form: -u; type: string
      User for login if not current user.
-verbose
    short form: -v; cumulative: yes; default: 1
      Be verbose; can specify multiple times. Verbosity 1 outputs connection information, a timestamp, relay_log_file,
      relay_log_pos, and last_errno. Verbosity 2 adds last_error. See also --error-length. Verbosity 3 prints
      the current sleep time each time pt-slave-restart sleeps.
-version
    Show version and exit.


2.26.10 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.26.11 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-slave-restart ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.26.12 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.26.13 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-slave-restart.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)




If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.26.14 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.26.15 AUTHORS

Baron Schwartz


2.26.16 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.26.17 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.26.18 VERSION

pt-slave-restart 2.1.1







2.27 pt-stalk

2.27.1 NAME

pt-stalk - Gather forensic data about MySQL when a problem occurs.


2.27.2 SYNOPSIS

Usage

pt-stalk [OPTIONS] [-- MYSQL OPTIONS]

pt-stalk watches for a trigger condition to become true, and then collects data to help in diagnosing problems. It
is designed to run as a daemon with root privileges, so that you can diagnose intermittent problems that you cannot
observe directly. You can also use it to execute a custom command, or to gather the data on demand without waiting
for the trigger to happen.


2.27.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-stalk is a read-write tool; it collects data from the system and writes it into a series of files. It should be very
low-risk. Some of the options can cause intrusive data collection to be performed, however, so if you enable any
non-default options, you should read their documentation carefully.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-stalk.
See also “BUGS” for more information on filing bugs and getting help.


2.27.4 DESCRIPTION

Sometimes a problem happens infrequently and for a short time, giving you no chance to see the system when it
happens. How do you solve intermittent MySQL problems when you can’t observe them? That’s why pt-stalk exists.
In addition to using it when there’s a known problem on your servers, it is a good idea to run pt-stalk all the time, even
when you think nothing is wrong. You will appreciate the data it gathers when a problem occurs, because problems
such as MySQL lockups or spikes of activity typically leave no evidence to use in root cause analysis.
This tool does two things: it watches a server (typically MySQL) for a trigger to occur, and it gathers diagnostic data.
To use it effectively, you need to define a good trigger condition. A good trigger is sensitive enough to fire reliably
when a problem occurs, so that you don’t miss a chance to solve problems. On the other hand, a good trigger isn’t
prone to false positives, so you don’t gather information when the server is functioning normally.
The most reliable triggers for MySQL tend to be the number of connections to the server, and the number of queries
running concurrently. These are available in the SHOW GLOBAL STATUS command as Threads_connected and
Threads_running. Sometimes Threads_connected is not a reliable indicator of trouble, but Threads_running usually
is. Your job, as the tool’s user, is to define an appropriate trigger condition for the tool. Choose carefully, because the
quality of your results will depend on the trigger you choose.






You can define the trigger with the --function, --variable, and --threshold options, among others. Please
read the documentation for --function to learn how to do this.
The pt-stalk tool, by default, simply watches MySQL repeatedly until the trigger becomes true. It then gathers
diagnostics for a while, and sleeps afterwards for some time to prevent repeatedly gathering data if the condition
remains true. In crude pseudocode, omitting some subtleties,
while true; do
  if --variable from --function is greater than --threshold; then
    observations++
    if observations is greater than --cycles; then
      capture diagnostics for --run-time seconds
      exit if --iterations is exceeded
      sleep for --sleep seconds
    fi
  fi
  clean up data that’s older than --retention-time
  sleep for --interval seconds
done

The diagnostic data is written to files whose names begin with a timestamp, so you can distinguish samples from each
other in case the tool collects data multiple times. The pt-sift tool is designed to help you browse and analyze the
resulting samples of data.
Although this sounds simple enough, in practice there are a number of subtleties, such as detecting when the disk is
beginning to fill up so that the tool doesn’t cause the server to run out of disk space. This tool handles these types
of potential problems, so it’s a good idea to use this tool instead of writing something from scratch and possibly
experiencing some of the hazards this tool is designed to prevent.


2.27.5 CONFIGURING

You can use standard Percona Toolkit configuration files to set commandline options.
You will probably want to run the tool as a daemon and customize at least the diagnostic threshold. Here’s a sample
configuration file for triggering when there are more than 20 queries running at once:
daemonize
threshold=20

If you’re not running the tool as it’s designed (as a root user, daemonized) then you’ll need to set several options, such
as --dest, to locations that are writable by non-root users.
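For example, here is a hedged sketch of a configuration file for running as a non-root user (all paths are
placeholders):
daemonize
threshold=20
dest=/home/myuser/pt-stalk/data
log=/home/myuser/pt-stalk/pt-stalk.log
pid=/home/myuser/pt-stalk/pt-stalk.pid
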


2.27.6 OPTIONS

-collect
    default: yes; negatable: yes
      Collect system information. You can negate this option to make the tool watch the system but not actually gather
      any diagnostic data.
      See also --stalk.
-collect-gdb
    Collect GDB stacktraces. This is achieved by attaching to MySQL and printing stack traces from all threads.
    This will freeze the server for some period of time, ranging from a second or so to much longer on very busy
    systems with a lot of memory and many threads in the server. For this reason, it is disabled by default. However,
    if you are trying to diagnose a server stall or lockup, freezing the server causes no additional harm, and the stack
    traces can be vital for diagnosis.




      In addition to freezing the server, there is also some risk of the server crashing or performing badly after GDB
      detaches from it.
-collect-oprofile
    Collect oprofile data. This is achieved by starting an oprofile session, letting it run for the collection time, and
    then stopping and saving the resulting profile data in the system’s default location. Please read your system’s
    oprofile documentation to learn more about this.
-collect-strace
    Collect strace data. This is achieved by attaching strace to the server, which will make it run very slowly until
    strace detaches. The same cautions apply as those listed in --collect-gdb. You should not enable this option
    together with --collect-gdb, because GDB and strace can’t attach to the server process simultaneously.
-collect-tcpdump
    Collect tcpdump data. This option causes tcpdump to capture all traffic on all interfaces for the port on which
    MySQL is listening. You can later use pt-query-digest to decode the MySQL protocol and extract a log of query
    traffic from it.
-config
    type: string
      Read this comma-separated list of config files. If specified, this must be the first option on the command line.
-cycles
    type: int; default: 5
      The number of times the trigger condition must be true before collecting data. This helps prevent false positives,
      and makes the trigger condition less likely to fire when the problem recovers quickly.
-daemonize
    Daemonize the tool. This causes the tool to fork into the background and log its output as specified in --log.
-dest
    type: string; default: /var/lib/pt-stalk
      Where to store the diagnostic data. Each time the tool collects data, it writes to a new set of files, which are
      named with the current system timestamp.
-disk-bytes-free
    type: size; default: 100M
      Don’t collect data if the disk has less than this much free space. This prevents the tool from filling up the disk
      with diagnostic data.
      If the --dest directory contains a previously captured sample of data, the tool will measure its size and use
      that as an estimate of how much data is likely to be gathered this time, too. It will then be even more pessimistic,
      and will refuse to collect data unless the disk has enough free space to hold the sample and still have the desired
      amount of free space. For example, if you’d like 100MB of free space and the previous diagnostic sample
      consumed 100MB, the tool won’t collect any data unless the disk has 200MB free.
      Valid size value suffixes are k, M, G, and T.
-disk-pct-free
    type: int; default: 5
      Don’t collect data if the disk has less than this percent free space. This prevents the tool from filling up the disk
      with diagnostic data.
      This option works similarly to --disk-bytes-free but specifies a percentage margin of safety instead of
      a bytes margin of safety. The tool honors both options, and will not collect any data unless both margins are
      satisfied.






-function
    type: string; default: status
       Specifies what to watch for a diagnostic trigger. The default value watches SHOW GLOBAL STATUS, but
       you can also watch SHOW PROCESSLIST or supply a plugin file with your own custom code. This function
       supplies the value of --variable, which is then compared against --threshold to see if the trigger
       condition is met. Additional options may be required as well; see below. Possible values:
           •status

            This value specifies that the source of data for the diagnostic trigger is SHOW GLOBAL STATUS.
            The value of --variable then defines which status counter is the trigger.

           •processlist

            This value specifies that the data for the diagnostic trigger comes from SHOW FULL PROCESSLIST.
            The trigger value is the count of processes whose --variable column matches the --match
            option. For example, to trigger when more than 10 processes are in the “statistics” state, use the
            following options:
            --function processlist --variable State \
              --match statistics --threshold 10


       In addition, you can specify a file that contains your custom trigger function, written in Unix shell script. This
       can be a wrapper that executes anything you wish. If the argument to --function is a file, then it takes precedence
       over builtin functions, so if there is a file in the working directory named “status” or “processlist” then the tool
       will use that file as a plugin, even though those are otherwise recognized as reserved words for this option.
       The plugin file works by providing a function called trg_plugin, and the tool simply sources the file and
       executes the function. For example, the function might look like the following:
       trg_plugin() {
           mysql $EXT_ARGV -e "SHOW ENGINE INNODB STATUS" \
            | grep -c "has waited at"
       }

       This snippet will count the number of mutex waits inside of InnoDB. It illustrates the general principle: the
       function must output a number, which is then compared to the threshold as usual. The $EXT_ARGV variable
       contains the MySQL options mentioned in the “SYNOPSIS” above.
       The plugin should not alter the tool’s existing global variables. Prefix any plugin-specific global variables with
       “PLUGIN_” or make them local.
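       For instance, if you saved such a function in a file (the path and threshold here are arbitrary placeholders),
       you might point the tool at it like this:
       pt-stalk --function /etc/pt-stalk-innodb-waits.sh --threshold 5 --cycles 2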
-help
    Print help and exit.
-interval
    type: int; default: 1
       Interval between checks for the diagnostic trigger.
-iterations
    type: int
       Exit after collecting diagnostics this many times. By default, the tool will continue to watch the server forever,
       but this is useful for scenarios where you want to capture once and then exit, for example.
-log
       type: string; default: /var/log/pt-stalk.log
       Print all output to this file when daemonized.




-match
    type: string
       The pattern to use when watching SHOW PROCESSLIST. See the documentation for --function for details.
-notify-by-email
    type: string
       Send mail to this list of addresses when data is collected.
-pid
       type: string; default: /var/run/pt-stalk.pid
       Create a PID file when daemonized.
-prefix
    type: string
       The filename prefix for diagnostic samples. By default, samples have a timestamp prefix based on the current
       local time, such as 2011_12_06_14_02_02, which is December 6, 2011 at 14:02:02.
-retention-time
    type: int; default: 30
       Number of days to retain collected samples. Any samples that are older will be purged.
-run-time
    type: int; default: 30
       How long the tool will collect data when it triggers. This should not be longer than --sleep. It is usually not
       necessary to change this; if the default 30 seconds hasn’t gathered enough diagnostic data, running longer is not
       likely to do so. In fact, in many cases a shorter collection period is appropriate.
-sleep
    type: int; default: 300
       How long to sleep after collecting data. This prevents the tool from triggering continuously, which might be a
       problem if the collection process is intrusive. It also prevents filling up the disk or gathering too much data to
       analyze reasonably.
-stalk
    default: yes; negatable: yes
       Watch the server and wait for the trigger to occur. You can negate this option to make the tool immediately
       gather any diagnostic data once and exit. This is useful if a problem is already happening, but pt-stalk is not
       running, so you only want to collect diagnostic data.
       If this option is negated, --daemonize, --log, --pid, and other stalking-related options have no effect;
       the tool simply collects diagnostic data and exits. Safeguard options, like --disk-bytes-free and
       --disk-pct-free, are still respected.
       See also --collect.
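       For example, to collect one set of diagnostics immediately and then exit (assuming the tool accepts the
       --no-stalk negation; the destination path is a placeholder):
       pt-stalk --no-stalk --dest /tmp/pt-stalk-now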
-threshold
    type: int; default: 25
       The threshold at which the diagnostic trigger should fire. See --function for details.
-variable
    type: string; default: Threads_running
       The variable to compare against the threshold. See --function for details.
-version
    Print tool’s version and exit.




2.27.7 ENVIRONMENT

This tool does not use any environment variables for configuration.


2.27.8 SYSTEM REQUIREMENTS

This tool requires Bash v3 or newer.


2.27.9 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-stalk.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.27.10 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.27.11 AUTHORS

Baron Schwartz, Justin Swanhart, Fernando Ipar, and Daniel Nichter


2.27.12 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.






2.27.13 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.27.14 VERSION

pt-stalk 2.1.1


2.28 pt-summary

2.28.1 NAME

pt-summary - Summarize system information nicely.


2.28.2 SYNOPSIS

Usage

pt-summary

pt-summary conveniently summarizes the status and configuration of a server. It is not a tuning tool or diagnosis tool.
It produces a report that is easy to diff and can be pasted into emails without losing the formatting. This tool works
well on many types of Unix systems.
Download and run:
wget http://percona.com/get/pt-summary
bash ./pt-summary



2.28.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-summary is a read-only tool. It should be very low-risk.
At the time of this release, we know of no bugs that could harm users.






The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
summary.
See also “BUGS” for more information on filing bugs and getting help.


2.28.4 DESCRIPTION

pt-summary runs a large variety of commands to inspect system status and configuration, saves the output into files
in a temporary directory, and then runs Unix commands on these results to format them nicely. It works best when
executed as a privileged user, but will also work without privileges, although some output might not be possible to
generate without root.
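For example (the directory name is illustrative), you can save the raw collected output for later inspection and then
rebuild the report from it:
sudo pt-summary --save-samples /tmp/pt-summary-samples

pt-summary --read-samples /tmp/pt-summary-samples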


2.28.5 OUTPUT

Many of the outputs from this tool are deliberately rounded to show their magnitude but not the exact detail. This
is called fuzzy-rounding. The idea is that it doesn’t matter whether a particular counter is 918 or 921; such a small
variation is insignificant, and only makes the output hard to compare to other servers. Fuzzy-rounding rounds in larger
increments as the input grows. It begins by rounding to the nearest 5, then the nearest 10, nearest 25, and then repeats
by a factor of 10 larger (50, 100, 250), and so on, as the input grows.
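The following is a rough Bash sketch of this rounding scheme, shown only to illustrate the idea; it is not the tool’s
actual implementation:
fuzzy_round () {
    local n=$1 factor=1
    while true; do
        for step in 5 10 25; do
            local incr=$(( step * factor ))
            # round to the nearest "incr" while the value is still small relative to it
            if [ "$n" -lt $(( incr * 10 )) ]; then
                echo $(( (n + incr / 2) / incr * incr ))
                return
            fi
        done
        factor=$(( factor * 10 ))   # grow the increments: 50, 100, 250, ...
    done
}

fuzzy_round 918   # prints 900; 921 also rounds to 900, so the two compare as equal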
The following is a simple report generated from a CentOS virtual machine, broken into sections with commentary
following each section. Some long lines are reformatted for clarity when reading this documentation as a manual page
in a terminal.
# Percona Toolkit System Summary Report ######################
        Date | 2012-03-30 00:58:07 UTC (local TZ: EDT -0400)
    Hostname | localhost.localdomain
      Uptime | 20:58:06 up 1 day, 20 min, 1 user,
               load average: 0.14, 0.18, 0.18
      System | innotek GmbH; VirtualBox; v1.2 ()
 Service Tag | 0
    Platform | Linux
     Release | CentOS release 5.5 (Final)
      Kernel | 2.6.18-194.el5
Architecture | CPU = 32-bit, OS = 32-bit
   Threading | NPTL 2.5
    Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-48).
     SELinux | Enforcing
 Virtualized | VirtualBox

This section shows the current date and time, and a synopsis of the server and operating system.
# Processor ##################################################
  Processors | physical = 1, cores = 0, virtual = 1, hyperthreading = no
      Speeds | 1x2510.626
      Models | 1xIntel(R) Core(TM) i5-2400S CPU @ 2.50GHz
      Caches | 1x6144 KB

This section is derived from /proc/cpuinfo.
# Memory #####################################################
       Total | 503.2M
        Free | 29.0M
        Used | physical = 474.2M, swap allocated = 1.0M,
               swap used = 16.0k, virtual = 474.3M






     Buffers |      33.9M
      Caches |      262.6M
       Dirty |      396 kB
     UsedRSS |      201.9M
  Swappiness |      60
 DirtyPolicy |      40, 10
 Locator Size        Speed       Form Factor       Type      Type Detail
 ======= ====        =====       ===========       ====      ===========

Information about memory is gathered from free. The UsedRSS statistic is the total of the RSS sizes displayed by ps.
The Dirty statistic comes from /proc/meminfo. On Linux, the swappiness settings are gathered from
sysctl. The final portion of this section is a table of the DIMMs, which comes from dmidecode. In this example
there is no output.
# Mounted Filesystems ########################################
  Filesystem                       Size Used Type Opts Mountpoint
  /dev/mapper/VolGroup00-LogVol00   15G 17% ext3 rw     /
  /dev/sda1                         99M 13% ext3 rw     /boot
  tmpfs                            252M   0% tmpfs rw   /dev/shm

The mounted filesystem section is a combination of information from mount and df. This section is skipped if you
disable --summarize-mounts.
# Disk Schedulers And Queue Size #############################
        dm-0 | UNREADABLE
        dm-1 | UNREADABLE
         hdc | [cfq] 128
         md0 | UNREADABLE
         sda | [cfq] 128

The disk scheduler information is extracted from the /sys filesystem in Linux.
# Disk Partioning ############################################
Device       Type      Start        End               Size
============ ==== ========== ========== ==================
/dev/sda     Disk                              17179869184
/dev/sda1    Part          1         13           98703360
/dev/sda2    Part         14       2088        17059230720

Information about disk partitioning comes from fdisk -l.
# Kernel Inode      State   #########################################
dentry-state |      10697   8559 45 0 0 0
     file-nr |      960     0 50539
    inode-nr |      14059   8139

These lines are from the files of the same name in the /proc/sys/fs directory on Linux. Read the proc man page to
learn about the meaning of these files on your system.
# LVM Volumes ################################################
LV       VG         Attr   LSize   Origin Snap% Move Log Copy% Convert
LogVol00 VolGroup00 -wi-ao 269.00G
LogVol01 VolGroup00 -wi-ao   9.75G

This section shows the output of lvs.
# RAID Controller ############################################
  Controller | No RAID controller detected






The tool can detect a variety of RAID controllers by examining lspci and dmesg information. If the controller
software is installed on the system, in many cases it is able to execute status commands and show a summary of the
RAID controller’s status and configuration. If your system is not supported, please file a bug report.
# Network Config #############################################
  Controller | Intel Corporation 82540EM Gigabit Ethernet Controller
 FIN Timeout | 60
  Port Range | 61000

The network controllers attached to the system are detected from lspci. The TCP/IP protocol configuration param-
eters are extracted from sysctl. You can skip this section by disabling the --summarize-network option.
# Interface Statistics #######################################
interface rx_bytes rx_packets rx_errors tx_bytes tx_packets tx_errors
========= ======== ========== ========= ======== ========== =========
lo        60000000      12500         0 60000000      12500         0
eth0      15000000      80000         0 1500000       10000         0
sit0             0          0         0        0          0         0

Interface statistics are gathered from ip -s link and are fuzzy-rounded. The columns are received and transmitted
bytes, packets, and errors. You can skip this section by disabling the --summarize-network option.
# Network Connections ########################################
  Connections from remote IP addresses
    127.0.0.1           2
  Connections to local IP addresses
    127.0.0.1           2
  Connections to top 10 local ports
    38346               1
    60875               1
  States of connections
    ESTABLISHED         5
    LISTEN              8

This section shows a summary of network connections, retrieved from netstat and “fuzzy-rounded” to make them
easier to compare when the numbers grow large. There are two sub-sections showing how many connections there
are per origin and destination IP address, and a sub-section showing the count of ports in use. The section ends with
the count of the network connections’ states. You can skip this section by disabling the --summarize-network
option.
# Top Processes ##############################################
  PID USER PR NI VIRT RES SHR S %CPU %MEM         TIME+ COMMAND
    1 root 15    0 2072 628 540 S 0.0 0.1        0:02.55 init
    2 root RT -5       0    0    0 S 0.0 0.0     0:00.00 migration/0
    3 root 34 19       0    0    0 S 0.0 0.0     0:00.03 ksoftirqd/0
    4 root RT -5       0    0    0 S 0.0 0.0     0:00.00 watchdog/0
    5 root 10 -5       0    0    0 S 0.0 0.0     0:00.97 events/0
    6 root 10 -5       0    0    0 S 0.0 0.0     0:00.00 khelper
    7 root 10 -5       0    0    0 S 0.0 0.0     0:00.00 kthread
   10 root 10 -5       0    0    0 S 0.0 0.0     0:00.13 kblockd/0
   11 root 20 -5       0    0    0 S 0.0 0.0     0:00.00 kacpid
# Notable Processes ##########################################
  PID    OOM    COMMAND
 2028    +0    sshd

This section shows the first few lines of top so that you can see what processes are actively using CPU time. The
notable processes include the SSH daemon and any process whose out-of-memory-killer priority is set to 17. You can
skip this section by disabling the --summarize-processes option.






# Simplified and fuzzy rounded vmstat (wait please) ##########
  procs ---swap-- -----io---- ---system---- --------cpu--------
   r b     si   so    bi    bo     ir     cs us sy il wa st
   2 0      0    0     3    15     30    125   0   0 99    0   0
   0 0      0    0     0     0   1250    800   6 10 84     0   0
   0 0      0    0     0     0   1000    125   0   0 100   0   0
   0 0      0    0     0     0   1000    125   0   0 100   0   0
   0 0      0    0     0   450   1000    125   0   1 88 11     0
# The End ####################################################

This section is a trimmed-down sample of vmstat 1 5, so you can see the general status of the system at present.
The values in the table are fuzzy-rounded, except for the CPU columns. You can skip this section by disabling the
--summarize-processes option.


2.28.6 OPTIONS

-config
    type: string
      Read this comma-separated list of config files. If specified, this must be the first option on the command line.
-help
    Print help and exit.
-save-samples
    type: string
      Save the collected data in this directory.
-read-samples
    type: string
      Create a report from the files in this directory.
-summarize-mounts
    default: yes; negatable: yes
      Report on mounted filesystems and disk usage.
-summarize-network
    default: yes; negatable: yes
      Report on network controllers and configuration.
-summarize-processes
    default: yes; negatable: yes
      Report on top processes and vmstat output.
-sleep
    type: int; default: 5
      How long to sleep when gathering samples from vmstat.
-version
    Print tool’s version and exit.


2.28.7 SYSTEM REQUIREMENTS

This tool requires the Bourne shell (/bin/sh).




2.28.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-summary.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.28.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.28.10 AUTHORS

Baron Schwartz and Kevin van Zonneveld (http://kevin.vanzonneveld.net)


2.28.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.28.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.






This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.28.13 VERSION

pt-summary 2.1.1


2.29 pt-table-checksum

2.29.1 NAME

pt-table-checksum - Verify MySQL replication integrity.


2.29.2 SYNOPSIS

Usage

pt-table-checksum [OPTION...] [DSN]

pt-table-checksum performs an online replication consistency check by executing checksum queries on the master,
which produces different results on replicas that are inconsistent with the master. The optional DSN specifies the
master host. The tool’s exit status is nonzero if any differences are found, or if any warnings or errors occur.
The following command will connect to the replication master on localhost, checksum every table, and report the
results on every detected replica:
pt-table-checksum

This tool is focused on finding data differences efficiently. If any data is different, you can resolve the problem with
pt-table-sync.


2.29.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-table-checksum can add load to the MySQL server, although it has many safeguards to prevent this. It inserts a
small amount of data into a table that contains checksum results. It has checks that, if disabled, can potentially cause
replication to fail when unsafe replication options are used. In short, it is safe by default, but it permits you to turn off
its safety checks.
The tool presumes that schemas and tables are identical on the master and all replicas. Replication will break if, for
example, a replica does not have a schema that exists on the master (and that schema is checksummed), or if the
structure of a table on a replica is different than on the master.
At the time of this release, we know of no bugs that could cause harm to users.






The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-table-
checksum.
See also “BUGS” for more information on filing bugs and getting help.


2.29.4 DESCRIPTION

pt-table-checksum is designed to do the right thing by default in almost every case. When in doubt, use --explain
to see how the tool will checksum a table. The following is a high-level overview of how the tool functions.
In contrast to older versions of pt-table-checksum, this tool is focused on a single purpose, and does not have a lot
of complexity or support many different checksumming techniques. It executes checksum queries on only one server,
and these flow through replication to re-execute on replicas. If you need the older behavior, you can use Percona
Toolkit version 1.0.
pt-table-checksum connects to the server you specify, and finds databases and tables that match the filters you specify
(if any). It works one table at a time, so it does not accumulate large amounts of memory or do a lot of work before
beginning to checksum. This makes it usable on very large servers. We have used it on servers with hundreds of
thousands of databases and tables, and trillions of rows. No matter how large the server is, pt-table-checksum works
equally well.
One reason it can work on very large tables is that it divides each table into chunks of rows, and checksums each chunk
with a single REPLACE..SELECT query. It varies the chunk size to make the checksum queries run in the desired
amount of time. The goal of chunking the tables, instead of doing each table with a single big query, is to ensure that
checksums are unintrusive and don’t cause too much replication lag or load on the server. That’s why the target time
for each chunk is 0.5 seconds by default.
The tool keeps track of how quickly the server is able to execute the queries, and adjusts the chunks as it learns more
about the server’s performance. It uses an exponentially decaying weighted average to keep the chunk size stable,
yet remain responsive if the server’s performance changes during checksumming for any reason. This means that the
tool will quickly throttle itself if your server becomes heavily loaded during a traffic spike or a background task, for
example.
Chunking is accomplished by a technique that we used to call “nibbling” in other tools in Percona Toolkit. It is the
same technique used for pt-archiver, for example. The legacy chunking algorithms used in older versions of pt-table-
checksum are removed, because they did not result in predictably sized chunks, and didn’t work well on many tables.
All that is required to divide a table into chunks is an index of some sort (preferably a primary key or unique index). If
there is no index, and the table contains a suitably small number of rows, the tool will checksum the table in a single
chunk.
pt-table-checksum has many other safeguards to ensure that it does not interfere with any server’s operation, including
replicas. To accomplish this, pt-table-checksum detects replicas and connects to them automatically. (If this fails,
you can give it a hint with the --recursion-method option.)
The tool monitors replicas continually. If any replica falls too far behind in replication, pt-table-checksum pauses to
allow it to catch up. If any replica has an error, or replication stops, pt-table-checksum pauses and waits. In addition,
pt-table-checksum looks for common causes of problems, such as replication filters, and refuses to operate unless you
force it to. Replication filters are dangerous, because the queries that pt-table-checksum executes could potentially
conflict with them and cause replication to fail.
pt-table-checksum verifies that chunks are not too large to checksum safely. It performs an EXPLAIN query on each
chunk, and skips chunks that might be larger than the desired number of rows. You can configure the sensitivity of
this safeguard with the --chunk-size-limit option. If a table will be checksummed in a single chunk because
it has a small number of rows, then pt-table-checksum additionally verifies that the table isn’t oversized on replicas.
This avoids the following scenario: a table is empty on the master but is very large on a replica, and is checksummed
in a single large query, which causes a very long delay in replication.





There are several other safeguards. For example, pt-table-checksum sets its session-level innodb_lock_wait_timeout
to 1 second, so that if there is a lock wait, it will be the victim instead of causing other queries to time out. Another
safeguard checks the load on the database server, and pauses if the load is too high. There is no single right answer for
how to do this, but by default pt-table-checksum will pause if there are more than 25 concurrently executing queries.
You should probably set a sane value for your server with the --max-load option.
Checksumming usually is a low-priority task that should yield to other work on the server. However, a tool that must
be restarted constantly is difficult to use. Thus, pt-table-checksum is very resilient to errors. For example, if the
database administrator needs to kill pt-table-checksum’s queries for any reason, that is not a fatal error. Users often
run pt-kill to kill any long-running checksum queries. The tool will retry a killed query once, and if it fails again, it will
move on to the next chunk of that table. The same behavior applies if there is a lock wait timeout. The tool will print
a warning if such an error happens, but only once per table. If the connection to any server fails, pt-table-checksum
will attempt to reconnect and continue working.
If pt-table-checksum encounters a condition that causes it to stop completely, it is easy to resume it with the
--resume option. It will begin from the last chunk of the last table that it processed. You can also safely stop
the tool with CTRL-C. It will finish the chunk it is currently processing, and then exit. You can resume it as usual
afterwards.
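For example (the DSN is illustrative), an interrupted run can be continued from where it stopped:
pt-table-checksum --resume h=master-host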
After pt-table-checksum finishes checksumming all of the chunks in a table, it pauses and waits for all detected
replicas to finish executing the checksum queries. Once that is finished, it checks all of the replicas to see if they have
the same data as the master, and then prints a line of output with the results. You can see a sample of its output later in
this documentation.
The tool prints progress indicators during time-consuming operations. It prints a progress indicator as each table is
checksummed. The progress is computed by the estimated number of rows in the table. It will also print a progress
report when it pauses to wait for replication to catch up, and when it is waiting to check replicas for differences from
the master. You can make the output less verbose with the --quiet option.
If you wish, you can query the checksum tables manually to get a report of which tables and chunks have differences
from the master. The following query will report every database and table with differences, along with a summary of
the number of chunks and rows possibly affected:
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE (
 master_cnt <> this_cnt
 OR master_crc <> this_crc
 OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;

The table referenced in that query is the checksum table, where the checksums are stored. Each row in the table
contains the checksum of one chunk of data from some table in the server.
Version 2.0 of pt-table-checksum is not backwards compatible with pt-table-sync version 1.0. In some cases this is not
a serious problem. Adding a “boundaries” column to the table, and then updating it with a manually generated WHERE
clause, may suffice to let pt-table-sync version 1.0 interoperate with pt-table-checksum version 2.0. Assuming an
integer primary key named id, you can try something like the following:
ALTER TABLE checksums ADD boundaries VARCHAR(500);
UPDATE checksums
 SET boundaries = COALESCE(CONCAT('id BETWEEN ', lower_boundary,
    ' AND ', upper_boundary), '1=1');



2.29.5 OUTPUT

The tool prints tabular results, one line per table:






            TS ERRORS          DIFFS     ROWS    CHUNKS SKIPPED          TIME   TABLE
10-20T08:36:50      0              0      200         1       0         0.005   db1.tbl1
10-20T08:36:50      0              0      603         7       0         0.035   db1.tbl2
10-20T08:36:50      0              0       16         1       0         0.003   db2.tbl3
10-20T08:36:50      0              0      600         6       0         0.024   db2.tbl4

Errors, warnings, and progress reports are printed to standard error. See also --quiet.
Each table’s results are printed when the tool finishes checksumming the table. The columns are as follows:
TS The timestamp (without the year) when the tool finished checksumming the table.
ERRORS The number of errors and warnings that occurred while checksumming the table. Errors and warnings are
    printed to standard error while the table is in progress.
DIFFS The number of chunks that differ from the master on one or more replicas. If --no-replicate-check is
    specified, this column will always have zeros. If --replicate-check-only is specified, then only tables
    with differences are printed.
ROWS The number of rows selected and checksummed from the table. It might be different from the number of rows
   in the table if you use the --where option.
CHUNKS The number of chunks into which the table was divided.
SKIPPED The number of chunks that were skipped due to errors or warnings, or because they were oversized.
TIME The time elapsed while checksumming the table.
TABLE The database and table that was checksummed.
If --replicate-check-only is specified, only checksum differences on detected replicas are printed. The
output is different: one paragraph per replica, one checksum difference per line, and values are separated by spaces:
Differences on h=127.0.0.1,P=12346
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
db1.tbl1 1 0 1 PRIMARY 1 100
db1.tbl1 6 0 1 PRIMARY 501 600

Differences on h=127.0.0.1,P=12347
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
db1.tbl1 1 0 1 PRIMARY 1 100
db2.tbl2 9 5 0 PRIMARY 101 200

The first line of a paragraph indicates the replica with differences. In this example there are two: h=127.0.0.1,P=12346
and h=127.0.0.1,P=12347. The columns are as follows:
TABLE The database and table that differs from the master.
CHUNK The chunk number of the table that differs from the master.
CNT_DIFF The number of chunk rows on the replica minus the number of chunk rows on the master.
CRC_DIFF 1 if the CRC of the chunk on the replica is different than the CRC of the chunk on the master, else 0.
CHUNK_INDEX The index used to chunk the table.
LOWER_BOUNDARY The index values that define the lower boundary of the chunk.
UPPER_BOUNDARY The index values that define the upper boundary of the chunk.


2.29.6 EXIT STATUS

A non-zero exit status indicates errors, warnings, or checksum differences.




2.29.7 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    group: Connection
      Prompt for a password when connecting to MySQL.
-check-interval
    type: time; default: 1; group: Throttle
      Sleep time between checks for --max-lag.
-[no]check-replication-filters
    default: yes; group: Safety
      Do not checksum if any replication filters are set on any replicas. The tool looks for server options that filter
      replication, such as binlog_ignore_db and replicate_do_db. If it finds any such filters, it aborts with an error.
      If the replicas are configured with any filtering options, you should be careful not to checksum any databases or
      tables that exist on the master and not the replicas. Changes to such tables might normally be skipped on the
      replicas because of the filtering options, but the checksum queries modify the contents of the table that stores
      the checksums, not the tables whose data you are checksumming. Therefore, these queries will be executed on
      the replica, and if the table or database you’re checksumming does not exist, the queries will cause replication
      to fail. For more information on replication rules, see http://dev.mysql.com/doc/en/replication-rules.html.
      Replication filtering makes it impossible to be sure that the checksum queries won’t break replication (or simply
      fail to replicate). If you are sure that it’s OK to run the checksum queries, you can negate this option to disable
      the checks. See also --replicate-database.
-check-slave-lag
    type: string; group: Throttle
      Pause checksumming until this replica’s lag is less than --max-lag. The value is a DSN that inherits proper-
      ties from the master host and the connection options (--port, --user, etc.). This option overrides the normal
      behavior of finding and continually monitoring replication lag on ALL connected replicas. If you don’t want to
      monitor ALL replicas, but you want more than just one replica to be monitored, then use the DSN option to the
      --recursion-method option instead of this option.
-chunk-index
    type: string
      Prefer this index for chunking tables. By default, pt-table-checksum chooses the most appropriate index for
      chunking. This option lets you specify the index that you prefer. If the index doesn’t exist, then pt-table-
      checksum will fall back to its default behavior of choosing an index. pt-table-checksum adds the index to the
      checksum SQL statements in a FORCE INDEX clause. Be careful when using this option; a poor choice of
      index could cause bad performance. This is probably best to use when you are checksumming only a single
      table, not an entire server.
-chunk-size
    type: size; default: 1000
      Number of rows to select for each checksum query. Allowable suffixes are k, M, G.
      This option can override the default behavior, which is to adjust chunk size dynamically to try to make chunks
      run in exactly --chunk-time seconds. When this option isn’t set explicitly, its default value is used as a
      starting point, but after that, the tool ignores this option’s value. If you set this option explicitly, however, then
      it disables the dynamic adjustment behavior and tries to make all chunks exactly the specified number of rows.
      There is a subtlety: if the chunk index is not unique, then it’s possible that chunks will be larger than desired.
      For example, if a table is chunked by an index that contains 10,000 of a given value, there is no way to write a




      WHERE clause that matches only 1,000 of the values, and that chunk will be at least 10,000 rows large. Such a
      chunk will probably be skipped because of --chunk-size-limit.
-chunk-size-limit
    type: float; default: 2.0; group: Safety
      Do not checksum chunks this much larger than the desired chunk size.
      When a table has no unique indexes, chunk sizes can be inaccurate. This option specifies a maximum tolerable
      limit to the inaccuracy. The tool uses EXPLAIN to estimate how many rows are in the chunk. If that estimate
      exceeds the desired chunk size times the limit (twice as large, by default), then the tool skips the chunk.
      The minimum value for this option is 1, which means that no chunk can be larger than --chunk-size. You
      probably don’t want to specify 1, because rows reported by EXPLAIN are estimates, which can be different
      from the real number of rows in the chunk. If the tool skips too many chunks because they are oversized, you
      might want to specify a value larger than the default of 2.
      You can disable oversized chunk checking by specifying a value of 0.
-chunk-time
    type: float; default: 0.5
      Adjust the chunk size dynamically so each checksum query takes this long to execute.
      The tool tracks the checksum rate (rows per second) for all tables and each table individually. It uses these rates
      to adjust the chunk size after each checksum query, so that the next checksum query takes this amount of time
      (in seconds) to execute.
      The algorithm is as follows: at the beginning of each table, the chunk size is initialized from the overall average
      rows per second since the tool began working, or the value of --chunk-size if the tool hasn’t started working
      yet. For each subsequent chunk of a table, the tool adjusts the chunk size to try to make queries run in the desired
      amount of time. It keeps an exponentially decaying moving average of queries per second, so that if the server’s
      performance changes due to changes in server load, the tool adapts quickly. This allows the tool to achieve
      predictably timed queries for each table, and for the server overall.
      If this option is set to zero, the chunk size doesn’t auto-adjust, so query checksum times will vary, but query
      checksum sizes will not. Another way to do the same thing is to specify a value for --chunk-size explicitly,
      instead of leaving it at the default.
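      For example (the values are illustrative), the first command aims for shorter 0.25-second checksum queries,
      while the second disables dynamic sizing entirely by fixing the chunk size at 500 rows:
      pt-table-checksum --chunk-time 0.25

      pt-table-checksum --chunk-size 500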
-columns
    short form: -c; type: array; group: Filter
      Checksum only this comma-separated list of columns.
-config
    type: Array; group: Config
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-[no]create-replicate-table
    default: yes
      Create the --replicate database and table if they do not exist. The structure of the replicate table is the
      same as the suggested table mentioned in --replicate.
-databases
    short form: -d; type: hash; group: Filter
      Only checksum this comma-separated list of databases.
-databases-regex
    type: string; group: Filter
      Only checksum databases whose names match this Perl regex.




-defaults-file
    short form: -F; type: string; group: Connection
      Only read mysql options from the given file. You must give an absolute pathname.
-[no]empty-replicate-table
    default: yes
      Delete previous checksums for each table before checksumming the table. This option does not truncate the
      entire table, it only deletes rows (checksums) for each table just before checksumming the table. Therefore, if
      checksumming stops prematurely and there was preexisting data, there will still be rows for tables that were not
      checksummed before the tool was stopped.
      If you’re resuming from a previous checksum run, then the checksum records for the table from which the tool
      resumes won’t be emptied.
-engines
    short form: -e; type: hash; group: Filter
      Only checksum tables which use these storage engines.
-explain
    cumulative: yes; default: 0; group: Output
      Show, but do not execute, checksum queries (disables --[no]empty-replicate-table). If specified
      twice, the tool actually iterates through the chunking algorithm, printing the upper and lower boundary values
      for each chunk, but not executing the checksum queries.
-float-precision
    type: int
      Precision for FLOAT and DOUBLE number-to-string conversion. Causes FLOAT and DOUBLE values to be
      rounded to the specified number of digits after the decimal point, with the ROUND() function in MySQL.
      This can help avoid checksum mismatches due to different floating-point representations of the same values on
      different MySQL versions and hardware. The default is no rounding; the values are converted to strings by the
      CONCAT() function, and MySQL chooses the string representation. If you specify a value of 2, for example,
      then the values 1.008 and 1.009 will be rounded to 1.01, and will checksum as equal.
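      For example (the precision is illustrative), to round FLOAT and DOUBLE values to three decimal places before
      checksumming:
      pt-table-checksum --float-precision 3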
-function
    type: string
      Hash function for checksums (FNV1A_64, MURMUR_HASH, SHA1, MD5, CRC32, etc).
      The default is to use CRC32(), but MD5() and SHA1() also work, and you can use your own function, such as
      a compiled UDF, if you wish. The function you specify is run in SQL, not in Perl, so it must be available to
      MySQL.
      MySQL doesn’t have good built-in hash functions that are fast. CRC32() is too prone to hash collisions, and
      MD5() and SHA1() are very CPU-intensive. The FNV1A_64() UDF that is distributed with Percona Server is a
      faster alternative. It is very simple to compile and install; look at the header in the source code for instructions. If
      it is installed, it is preferred over MD5(). You can also use the MURMUR_HASH() function if you compile and
      install that as a UDF; the source is also distributed with Percona Server, and it might be better than FNV1A_64().
-help
    group: Help
      Show help and exit.
-host
    short form: -h; type: string; default: localhost; group: Connection
      Host to connect to.





-ignore-columns
    type: Hash; group: Filter
      Ignore this comma-separated list of columns when calculating the checksum.
-ignore-databases
    type: Hash; group: Filter
      Ignore this comma-separated list of databases.
-ignore-databases-regex
    type: string; group: Filter
      Ignore databases whose names match this Perl regex.
-ignore-engines
    type: Hash; default: FEDERATED,MRG_MyISAM; group: Filter
      Ignore this comma-separated list of storage engines.
-ignore-tables
    type: Hash; group: Filter
      Ignore this comma-separated list of tables. Table names may be qualified with the database name. The
      --replicate table is always automatically ignored.
-ignore-tables-regex
    type: string; group: Filter
      Ignore tables whose names match the Perl regex.
-lock-wait-timeout
    type: int; default: 1
      Set the session value of innodb_lock_wait_timeout on the master host. This option helps guard against
      long lock waits if the checksum queries become slow for some reason. Setting this option dynamically requires
      the InnoDB plugin, so this works only on newer InnoDB and MySQL versions. If setting the value fails and the
      current server value is greater than the specified value, then a warning is printed; else, if the current server value
      is less than or equal to the specified value, no warning is printed.
-max-lag
    type: time; default: 1s; group: Throttle
      Pause checksumming until all replicas’ lag is less than this value. After each checksum query (each chunk), pt-
      table-checksum looks at the replication lag of all replicas to which it connects, using Seconds_Behind_Master.
      If any replica is lagging more than the value of this option, then pt-table-checksum will sleep for
      --check-interval seconds, then check all replicas again. If you specify --check-slave-lag, then
      the tool only examines that server for lag, not all servers. If you want to control exactly which servers the tool
      monitors, use the DSN value to --recursion-method.
      The tool waits forever for replicas to stop lagging. If any replica is stopped, the tool waits forever until the
      replica is started. Checksumming continues once all replicas are running and not lagging too much.
      The tool prints progress reports while waiting. If a replica is stopped, it prints a progress report immediately,
      then again at every progress report interval.
-max-load
    type: Array; default: Threads_running=25; group: Throttle
      Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than the
      threshold. The option accepts a comma-separated list of MySQL status variables to check for a threshold.
      An optional =MAX_VALUE (or :MAX_VALUE) can follow each variable. If not given, the tool determines a
      threshold by examining the current value and increasing it by 20%.





       For example, if you want the tool to pause when Threads_connected gets too high, you can specify
       “Threads_connected”, and the tool will check the current value when it starts working and add 20% to that
       value. If the current value is 100, then the tool will pause when Threads_connected exceeds 120, and resume
       working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use either
       “Threads_connected:110” or “Threads_connected=110”.
       The purpose of this option is to prevent the tool from adding too much load to the server. If the checksum
       queries are intrusive, or if they cause lock waits, then other queries on the server will tend to block and queue.
       This will typically cause Threads_running to increase, and the tool can detect that by running SHOW GLOBAL
       STATUS immediately after each checksum query finishes. If you specify a threshold for this variable, then you
       can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however;
       it will only give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease
       the chunk time.
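        For example (the thresholds and host are illustrative), to pause whenever more than 50 queries are running or
        more than 500 clients are connected:
        pt-table-checksum --max-load Threads_running=50,Threads_connected=500 h=master-host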
-password
    short form: -p; type: string; group: Connection
       Password to use when connecting.
-pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
-port
    short form: -P; type: int; group: Connection
       Port number to use for connection.
-progress
    type: array; default: time,30
       Print progress reports to STDERR.
       The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the
       second part specifies how often an update should be printed, in percentage, seconds, or number of iterations.
       The tool prints progress reports for a variety of time-consuming operations, including waiting for replicas to
       catch up if they become lagged.
-quiet
    short form: -q; cumulative: yes; default: 0
       Print only the most important information (disables --progress). Specifying this option once causes the tool
       to print only errors, warnings, and tables that have checksum differences.
       Specifying this option twice causes the tool to print only errors. In this case, you can use the tool’s exit status to
       determine if there were any warnings or checksum differences.
-recurse
    type: int
       Number of levels to recurse in the hierarchy when discovering replicas.              Default is infinite.    See also
       --recursion-method.
-recursion-method
    type: string
       Preferred recursion method for discovering replicas. Possible methods are:






     METHOD           USES
     ===========      ==================
     processlist      SHOW PROCESSLIST
     hosts            SHOW SLAVE HOSTS
     dsn=DSN          DSNs from a table

     The processlist method is the default, because SHOW SLAVE HOSTS is not reliable. However, the hosts
     method can work better if the server uses a non-standard port (not 3306). The tool usually does the right thing
     and finds all replicas, but you may give a preferred method and it will be used first.
     The hosts method requires replicas to be configured with report_host, report_port, etc.
     The dsn method is special: it specifies a table from which other DSN strings are read. The specified DSN must
     specify a D and t, or a database-qualified t. The DSN table should have the following structure:
      CREATE TABLE `dsns` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `parent_id` int(11) DEFAULT NULL,
        `dsn` varchar(255) NOT NULL,
        PRIMARY KEY (`id`)
      );

     To make the tool monitor only the hosts 10.10.1.16 and 10.10.1.17 for replication lag and checksum differences,
     insert the values h=10.10.1.16 and h=10.10.1.17 into the table. Currently, the DSNs are ordered by id,
     but id and parent_id are otherwise ignored.
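      For example, assuming the DSN table above was created as percona.dsns (the database, host addresses, and
      master DSN are illustrative), you could populate it and point the tool at it like this:
      mysql -e "INSERT INTO percona.dsns (dsn) VALUES ('h=10.10.1.16'), ('h=10.10.1.17')"

      pt-table-checksum --recursion-method dsn=D=percona,t=dsns h=master-host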
-replicate
    type: string; default: percona.checksums
     Write checksum results to this table. The replicate table must have this structure (MAGIC_create_replicate):
      CREATE TABLE checksums (
         db             char(64)     NOT NULL,
         tbl            char(64)     NOT NULL,
         chunk          int          NOT NULL,
         chunk_time     float            NULL,
         chunk_index    varchar(200)     NULL,
         lower_boundary text             NULL,
         upper_boundary text             NULL,
         this_crc       char(40)     NOT NULL,
         this_cnt       int          NOT NULL,
         master_crc     char(40)         NULL,
         master_cnt     int              NULL,
         ts             timestamp    NOT NULL,
         PRIMARY KEY (db, tbl, chunk),
         INDEX ts_db_tbl (ts, db, tbl)
      ) ENGINE=InnoDB;

     By default, --[no]create-replicate-table is true, so the database and the table specified by this
     option are created automatically if they do not exist.
     Be sure to choose an appropriate storage engine for the replicate table. If you are checksumming InnoDB tables,
     and you use MyISAM for this table, a deadlock will break replication, because the mixture of transactional and
     non-transactional tables in the checksum statements will cause it to be written to the binlog even though it had
     an error. It will then replay without a deadlock on the replicas, and break replication with “different error on
     master and slave.” This is not a problem with pt-table-checksum; it’s a problem with MySQL replication, and
     you can read more about it in the MySQL manual.
     The replicate table is never checksummed (the tool automatically adds this table to --ignore-tables).





-[no]replicate-check
    default: yes
      Check replicas for data differences after finishing each table. The tool finds differences by executing a simple
      SELECT statement on all detected replicas. The query compares the replica’s checksum results to the master’s
      checksum results. It reports differences in the DIFFS column of the output.
-replicate-check-only
    Check replicas for consistency without executing checksum queries. This option is used only with
    --[no]replicate-check. If specified, pt-table-checksum doesn’t checksum any tables. It checks repli-
    cas for differences found by previous checksumming, and then exits. It might be useful if you run pt-table-
    checksum quietly in a cron job, for example, and later want a report on the results of the cron job, perhaps to
    implement a Nagios check.
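     For example (the schedule and DSN are illustrative), a crontab entry could run the checksums quietly overnight,
     and a separate command could later report any differences that were found:
     0 1 * * * pt-table-checksum --quiet h=master-host

     pt-table-checksum --replicate-check-only h=master-host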
-replicate-database
    type: string
      USE only this database. By default, pt-table-checksum executes USE to select the database that contains
      the table it’s currently working on. This is a best-effort attempt to avoid problems with replication filters such as
      binlog_ignore_db and replicate_ignore_db. However, replication filters can create a situation where there simply
      is no one right way to do things. Some statements might not be replicated, and others might cause replication
      to fail. In such cases, you can use this option to specify a default database that pt-table-checksum selects with
      USE, and never changes. See also --[no]check-replication-filters.
-resume
    Resume checksumming from the last completed chunk (disables --[no]empty-replicate-table). If
    the tool stops before it checksums all tables, this option makes checksumming resume from the last chunk of
    the last table that it finished.
-retries
    type: int; default: 2
      Retry a chunk this many times when there is a nonfatal error. Nonfatal errors are problems such as a lock wait
      timeout or the query being killed.
-separator
    type: string; default: #
      The separator character used for CONCAT_WS(). This character is used to join the values of columns when
      checksumming.
-set-vars
    type: string; default: wait_timeout=10000; group: Connection
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string; group: Connection
      Socket file to use for connection.
-tables
    short form: -t; type: hash; group: Filter
      Checksum only this comma-separated list of tables. Table names may be qualified with the database name.
-tables-regex
    type: string; group: Filter
      Checksum only tables whose names match this Perl regex.






-trim
    Add TRIM() to VARCHAR columns (helps when comparing 4.1 to >= 5.0). This is useful when you don’t
    care about the trailing space differences between MySQL versions that vary in their handling of trailing spaces.
    MySQL 5.0 and later all retain trailing spaces in VARCHAR, while previous versions would remove them.
    These differences will cause false checksum differences.
-user
    short form: -u; type: string; group: Connection
      User for login if not current user.
-version
    group: Help
      Show version and exit.
-where
    type: string
      Do only rows matching this WHERE clause. You can use this option to limit the checksum to only part of the
      table. This is particularly useful if you have append-only tables and don’t want to constantly re-check all rows;
      you could run a daily job to just check yesterday’s rows, for instance.
      This option is much like the -w option to mysqldump. Do not specify the WHERE keyword. You might need to
      quote the value. Here is an example:
      pt-table-checksum --where "ts > CURRENT_DATE - INTERVAL 1 DAY"



2.29.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      copy: no
      DSN table database.
    • F
      dsn: mysql_read_default_file; copy: no
      Only read default options from the given file.
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.




    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: no
      Socket file to use for connection.
    • t
      copy: no
      DSN table table.
    • u
      dsn: user; copy: yes
      User for login if not current user.
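For example (all values are illustrative), a DSN that specifies the host, port, user, and default character set looks like
this:
pt-table-checksum h=master.example.com,P=3306,u=checksum_user,A=utf8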


2.29.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-table-checksum ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.29.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.29.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-table-checksum.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.






2.29.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.29.13 AUTHORS

Baron Schwartz and Daniel Nichter


2.29.14 ACKNOWLEDGMENTS

Claus Jeppesen, Francois Saint-Jacques, Giuseppe Maxia, Heikki Tuuri, James Briggs, Martin Friebe, and Sergey
Zhuravlev


2.29.15 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.29.16 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.29.17 VERSION

pt-table-checksum 2.1.1






2.30 pt-table-sync

2.30.1 NAME

pt-table-sync - Synchronize MySQL table data efficiently.


2.30.2 SYNOPSIS

Usage

pt-table-sync [OPTION...] DSN [DSN...]

pt-table-sync synchronizes data efficiently between MySQL tables.
This tool changes data, so for maximum safety, you should back up your data before you use it. When synchronizing
a server that is a replication slave with the --replicate or --sync-to-master methods, it always makes the changes on the
replication master, never the replication slave directly. This is in general the only safe way to bring a replica back
in sync with its master; changes to the replica are usually the source of the problems in the first place. However, the
changes it makes on the master should be no-op changes that set the data to their current values, and actually affect
only the replica. Please read the detailed documentation that follows to learn more about this.
Sync db.tbl on host1 to host2:
pt-table-sync --execute h=host1,D=db,t=tbl h=host2

Sync all tables on host1 to host2 and host3:
pt-table-sync --execute host1 host2 host3

Make slave1 have the same data as its replication master:
pt-table-sync --execute --sync-to-master slave1

Resolve differences that pt-table-checksum found on all slaves of master1:
pt-table-sync --execute --replicate test.checksum master1

Same as above but only resolve differences on slave1:
pt-table-sync --execute --replicate test.checksum \
  --sync-to-master slave1

Sync master2 in a master-master replication configuration, where master2’s copy of db.tbl is known or suspected to be
incorrect:
pt-table-sync --execute --sync-to-master h=master2,D=db,t=tbl

Note that in the master-master configuration, the following will NOT do what you want, because it will make changes
directly on master2, which will then flow through replication and change master1’s data:
# Don’t do this in a master-master setup!
pt-table-sync --execute h=master1,D=db,t=tbl master2






2.30.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
With great power comes great responsibility! This tool changes data, so it is a good idea to back up your data. It is
also very powerful, which means it is very complex, so you should run it with the --dry-run option to see what
it will do, until you’re familiar with its operation. If you want to see which rows are different, without changing any
data, use --print instead of --execute.
Be careful when using pt-table-sync in any master-master setup. Master-master replication is inherently tricky, and
it’s easy to make mistakes. You need to be sure you’re using the tool correctly for master-master replication. See the
“SYNOPSIS” for the overview of the correct usage.
Also be careful with tables that have foreign key constraints with ON DELETE or ON UPDATE definitions because
these might cause unintended changes on the child tables.
In general, this tool is best suited when your tables have a primary key or unique index. Although it can synchronize
data in tables lacking a primary key or unique index, it might be best to synchronize that data by another means.
At the time of this release, there is a potential bug using --lock-and-rename with MySQL 5.1, a bug detecting
certain differences, a bug using ROUND() across different platforms, and a bug mixing collations.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-table-
sync.
See also “BUGS” for more information on filing bugs and getting help.


2.30.4 DESCRIPTION

pt-table-sync does one-way and bidirectional synchronization of table data. It does not synchronize table structures,
indexes, or any other schema objects. The following describes one-way synchronization. “BIDIRECTIONAL SYNC-
ING” is described later.
This tool is complex and functions in several different ways. To use it safely and effectively, you should understand
three things: the purpose of --replicate, finding differences, and specifying hosts. These three concepts are
closely related and determine how the tool will run. The following is the abbreviated logic:
if DSN has a t part, sync only that table:
   if 1 DSN:
      if --sync-to-master:
         The DSN is a slave. Connect to its master and sync.
   if more than 1 DSN:
      The first DSN is the source. Sync each DSN in turn.
else if --replicate:
   if --sync-to-master:
      The DSN is a slave. Connect to its master, find records
      of differences, and fix.
   else:
      The DSN is the master. Find slaves and connect to each,
      find records of differences, and fix.
else:
   if only 1 DSN and --sync-to-master:
      The DSN is a slave. Connect to its master, find tables and
      filter with --databases etc, and sync each table to the master.
   else:







        find tables, filtering with --databases etc, and sync each
        DSN to the first.

pt-table-sync can run in one of two ways: with --replicate or without. The default is to run without
--replicate which causes pt-table-sync to automatically find differences efficiently with one of several algo-
rithms (see “ALGORITHMS”). Alternatively, the value of --replicate, if specified, causes pt-table-sync to
use the differences already found by having previously run pt-table-checksum with its own --replicate option.
Strictly speaking, you don’t need to use --replicate because pt-table-sync can find differences, but many people
use --replicate if, for example, they checksum regularly using pt-table-checksum then fix differences as needed
with pt-table-sync. If you’re unsure, read each tool’s documentation carefully and decide for yourself, or consult with
an expert.
Regardless of whether --replicate is used or not, you need to specify which hosts to sync. There are two ways:
with --sync-to-master or without. Specifying --sync-to-master makes pt-table-sync expect one and
only one slave DSN on the command line. The tool will automatically discover the slave’s master and sync it so that
its data is the same as its master. This is accomplished by making changes on the master which then flow through
replication and update the slave to resolve its differences. Be careful though: although this option specifies and syncs
a single slave, if there are other slaves on the same master, they will receive via replication the changes intended for
the slave that you’re trying to sync.
Alternatively, if you do not specify --sync-to-master, the first DSN given on the command line is the source
host. There is only ever one source host. If you do not also specify --replicate, then you must specify at least
one other DSN as the destination host. There can be one or more destination hosts. Source and destination hosts must
be independent; they cannot be in the same replication topology. pt-table-sync will die with an error if it detects that
a destination host is a slave because changes are written directly to destination hosts (and it’s not safe to write directly
to slaves). Or, if you specify --replicate (but not --sync-to-master) then pt-table-sync expects one and
only one master DSN on the command line. The tool will automatically discover all the master’s slaves and sync them
to the master. This is the only way to sync several (all) slaves at once (because --sync-to-master only specifies
one slave).
Each host on the command line is specified as a DSN. The first DSN (or only DSN for cases like
--sync-to-master) provides default values for other DSNs, whether those other DSNs are specified on the com-
mand line or auto-discovered by the tool. So in this example,
pt-table-sync --execute h=host1,u=msandbox,p=msandbox h=host2

the host2 DSN inherits the u and p DSN parts from the host1 DSN. Use the --explain-hosts option to see how
pt-table-sync will interpret the DSNs given on the command line.
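For example, the following command (reusing the illustrative DSNs above) only prints the connection information that
pt-table-sync derives from the DSNs, then exits:
pt-table-sync --explain-hosts h=host1,u=msandbox,p=msandbox h=host2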


2.30.5 OUTPUT

If you specify the --verbose option, you’ll see information about the differences between the tables. There is one
row per table. Each server is printed separately. For example,
# Syncing h=host1,D=test,t=test1
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      3      0 Chunk     13:00:00 13:00:17 2    test.test1

Table test.test1 on host1 required 3 INSERT statements to synchronize and it used the Chunk algorithm (see “ALGO-
RITHMS”). The sync operation for this table started at 13:00:00 and ended 17 seconds later (times taken from NOW()
on the source host). Because differences were found, its “EXIT STATUS” was 2.
If you specify the --print option, you’ll see the actual SQL statements that the script uses to synchronize the table
if --execute is also specified.
If you want to see the SQL statements that pt-table-sync is using to select chunks, nibbles, rows, etc., then specify
--print once and --verbose twice. Be careful though: this can print a lot of SQL statements.
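For example, this invocation (with illustrative DSNs) prints the sync statements plus the chunk and row selection
queries, without changing any data:
pt-table-sync --print --verbose --verbose h=host1,D=db,t=tbl h=host2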




There are cases where no combination of INSERT, UPDATE or DELETE statements can resolve differences without
violating some unique key. For example, suppose there’s a primary key on column a and a unique key on column b.
Then there is no way to sync these two tables with straightforward UPDATE statements:
+---+---+      +---+---+
| a | b |      | a | b |
+---+---+      +---+---+
| 1 | 2 |      | 1 | 1 |
| 2 | 1 |      | 2 | 2 |
+---+---+      +---+---+

The tool rewrites queries to DELETE and REPLACE in this case. This is automatically handled after the first index
violation, so you don’t have to worry about it.
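As an illustration only, assume the left-hand table above is the source, the right-hand table is the destination, and the
table is named tbl (a hypothetical name). A DELETE/REPLACE resolution of the same general shape could be:
-- Illustrative sketch; the statements pt-table-sync actually generates
-- depend on the algorithm and the data.
DELETE FROM tbl WHERE a=2 LIMIT 1;
REPLACE INTO tbl (a, b) VALUES (1, 2);
REPLACE INTO tbl (a, b) VALUES (2, 1);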


2.30.6 REPLICATION SAFETY

Synchronizing a replication master and slave safely is a non-trivial problem, in general. There are all sorts of issues
to think about, such as other processes changing data, trying to change data on the slave, whether the destination and
source are a master-master pair, and much more.
In general, the safe way to do it is to change the data on the master, and let the changes flow through replication to
the slave like any other changes. However, this works only if it’s possible to REPLACE into the table on the master.
REPLACE works only if there’s a unique index on the table (otherwise it just acts like an ordinary INSERT).
If your table has unique keys, you should use the --sync-to-master and/or --replicate options to sync a
slave to its master. This will generally do the right thing. When there is no unique key on the table, there is no choice
but to change the data on the slave, and pt-table-sync will detect that you’re trying to do so. It will complain and die
unless you specify --no-check-slave (see --[no]check-slave).
If you’re syncing a table without a primary or unique key on a master-master pair, you must change the data on the
destination server. Therefore, you need to specify --no-bin-log for safety (see --[no]bin-log). If you don’t,
the changes you make on the destination server will replicate back to the source server and change the data there!
The generally safe thing to do on a master-master pair is to use the --sync-to-master option so you don’t change
the data on the destination server. You will also need to specify --no-check-slave to keep pt-table-sync from
complaining that it is changing data on a slave.
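Putting the above together, a typical master-master invocation might look like the following (a sketch; the DSN values
are illustrative):
pt-table-sync --execute --sync-to-master --no-check-slave h=master2,D=db,t=tbl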


2.30.7 ALGORITHMS

pt-table-sync has a generic data-syncing framework which uses different algorithms to find differences. The tool
automatically chooses the best algorithm for each table based on indexes, column types, and the algorithm preferences
specified by --algorithms. The following algorithms are available, listed in their default order of preference:
Chunk
      Finds an index whose first column is numeric (including date and time types), and divides the column’s
      range of values into chunks of approximately --chunk-size rows. Syncs a chunk at a time by check-
      summing the entire chunk. If the chunk differs on the source and destination, checksums each chunk’s
      rows individually to find the rows that differ.
      It is efficient when the column has sufficient cardinality to make the chunks end up about the right size.
      The initial per-chunk checksum is quite small and results in minimal network traffic and memory con-
      sumption. If a chunk’s rows must be examined, only the primary key columns and a checksum are sent
      over the network, not the entire row. If a row is found to be different, the entire row will be fetched, but
      not before.
Nibble




       Finds an index and ascends the index in fixed-size nibbles of --chunk-size rows, using a non-
       backtracking algorithm (see pt-archiver for more on this algorithm). It is very similar to “Chunk”, but
       instead of pre-calculating the boundaries of each piece of the table based on index cardinality, it uses
       LIMIT to define each nibble’s upper limit, and the previous nibble’s upper limit to define the lower limit.
       It works in steps: one query finds the row that will define the next nibble’s upper boundary, and the next
       query checksums the entire nibble. If the nibble differs between the source and destination, it examines
       the nibble row-by-row, just as “Chunk” does.
GroupBy
       Selects the entire table grouped by all columns, with a COUNT(*) column added. Compares all columns,
       and if they’re the same, compares the COUNT(*) column’s value to determine how many rows to insert
       into or delete from the destination. Works on tables with no primary key or unique index.
Stream
       Selects the entire table in one big stream, selecting and comparing all columns. It is much less
       efficient than the other algorithms, but works when there is no suitable index for them to use.
Future Plans
       Possibilities for future algorithms are TempTable (what I originally called bottom-up in earlier versions
       of this tool), DrillDown (what I originally called top-down), and GroupByPrefix (similar to how SqlYOG
       Job Agent works). Each algorithm has strengths and weaknesses. If you’d like to implement your favorite
       technique for finding differences between two sources of data on possibly different servers, I’m willing to
       help. The algorithms adhere to a simple interface that makes it pretty easy to write your own.


2.30.8 BIDIRECTIONAL SYNCING

Bidirectional syncing is a new, experimental feature. To make it work reliably there are a number of strict limitations:
*   only works when syncing one server to other independent servers
*   does not work in any way with replication
*   requires that the table(s) are chunkable with the Chunk algorithm
*   is not N-way, only bidirectional between two servers at a time
*   does not handle DELETE changes

For example, suppose we have three servers: c1, r1, r2. c1 is the central server, a pseudo-master to the other servers
(viz. r1 and r2 are not slaves to c1). r1 and r2 are remote servers. Rows in table foo are updated and inserted on all
three servers and we want to synchronize all the changes between all the servers. Table foo has columns:
id       int PRIMARY KEY
ts       timestamp auto updated
name     varchar
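
In SQL terms, such a table might be defined as follows (an illustrative sketch; the column sizes and storage engine are
assumptions, not something this tool requires):
CREATE TABLE foo (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ts   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  name VARCHAR(64)
) ENGINE=InnoDB;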

Auto-increment offsets are used so that new rows from any server do not create conflicting primary key (id) values. In
general, newer rows, as determined by the ts column, take precedence when a same but differing row is found during
the bidirectional sync. “Same but differing” means that two rows have the same primary key (id) value but different
values for some other column, like the name column in this example. Same but differing rows create a conflict, which
is resolved by comparing some column of the competing rows to determine a “winner”. The winning row
becomes the source and its values are used to update the other row.
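The auto-increment offsets themselves are arranged with MySQL’s standard auto-increment variables, not by
pt-table-sync; for example (illustrative values for the three servers in this example):
-- On every server; persist the same settings in the server configuration.
SET GLOBAL auto_increment_increment = 3;
-- Use a different offset per server: 1 on c1, 2 on r1, 3 on r2.
SET GLOBAL auto_increment_offset = 1;
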
There are subtle differences between three columns used to achieve bidirectional syncing that you should be fa-
miliar with: chunk column (--chunk-column), comparison column(s) (--columns), and conflict column
(--conflict-column). The chunk column is only used to chunk the table; e.g. “WHERE id >= 5 AND id
< 10”. Chunks are checksummed and when chunk checksums reveal a difference, the tool selects the rows in that
chunk and checksums the --columns for each row. If a column checksum differs, the rows have one or more con-
flicting column values. In a traditional unidirectional sync, the conflict is a moot point because it can be resolved




simply by updating the entire destination row with the source row’s values. In a bidirectional sync, however, the
--conflict-column (in accordance with the other --conflict-* options listed below) is compared to determine
which row is “correct” or “authoritative”; this row becomes the “source”.
To sync all three servers completely, two runs of pt-table-sync are required. The first run syncs c1 and r1, then syncs
c1 and r2 including any changes from r1. At this point c1 and r2 are completely in sync, but r1 is missing any changes
from r2 because c1 didn’t have these changes when it and r1 were synced. So a second run is needed which syncs the
servers in the same order, but this time when c1 and r1 are synced r1 gets r2’s changes.
The tool does not sync N-ways, only bidirectionally between the first DSN given on the command line and each
subsequent DSN in turn. So the tool in this example would be run twice like:
pt-table-sync --bidirectional h=c1 h=r1 h=r2

The --bidirectional option enables this feature and causes various sanity checks to be performed. You must
specify other options that tell pt-table-sync how to resolve conflicts for same but differing rows. These options are:
*   --conflict-column
*   --conflict-comparison
*   --conflict-value
*   --conflict-threshold
*   --conflict-error (optional)

Use --print to test this option before --execute. The printed SQL statements will have comments saying on
which host the statement would be executed if you used --execute.
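For the example above, each of the two passes might be invoked like this, using the ts column and the “newest row
wins” rule described earlier (test with --print before --execute; depending on your data you may also need
--conflict-threshold and related options):
pt-table-sync --bidirectional --print \
  --conflict-column ts --conflict-comparison newest \
  h=c1 h=r1 h=r2
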
Technical side note: the first DSN is always the “left” server and the other DSNs are always the “right” server. Since
either server can become the source or destination it’s confusing to think of them as “src” and “dst”. Therefore, they’re
generically referred to as left and right. It’s easy to remember this because the first DSN is always to the left of the
other server DSNs on the command line.


2.30.9 EXIT STATUS

The following are the exit statuses (also called return values, or return codes) when pt-table-sync finishes and exits.
STATUS     MEANING
======     =======================================================
0          Success.
1          Internal error.
2          At least one table differed on the destination.
3          Combination of 1 and 2.
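
In a wrapper script these codes can be checked directly; a minimal sketch (the command line is illustrative):
pt-table-sync --execute --sync-to-master slave1
status=$?
# Exit status 2 (or 3) indicates that at least one table differed.
if [ $status -eq 2 ] || [ $status -eq 3 ]; then
    echo "Differences were found and synced"
fi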



2.30.10 OPTIONS

Specify at least one of --print, --execute, or --dry-run.
--where and --replicate are mutually exclusive.
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-algorithms
    type: string; default: Chunk,Nibble,GroupBy,Stream
      Algorithm to use when comparing the tables, in order of preference.
      For each table, pt-table-sync will check if the table can be synced with the given algorithms in the order that
      they’re given. The first algorithm that can sync the table is used. See “ALGORITHMS”.






-ask-pass
    Prompt for a password when connecting to MySQL.
-bidirectional
    Enable bidirectional sync between first and subsequent hosts.
      See “BIDIRECTIONAL SYNCING” for more information.
-[no]bin-log
    default: yes
      Log to the binary log (SET SQL_LOG_BIN=1).
      Specifying --no-bin-log will SET SQL_LOG_BIN=0.
-buffer-in-mysql
    Instruct MySQL to buffer queries in its memory.
      This option adds the SQL_BUFFER_RESULT option to the comparison queries. This causes MySQL to execute
      the queries and place them in a temporary table internally before sending the results back to pt-table-sync. The
      advantage of this strategy is that pt-table-sync can fetch rows as desired without using a lot of memory inside
      the Perl process, while releasing locks on the MySQL table (to reduce contention with other queries). The
      disadvantage is that it uses more memory on the MySQL server instead.
      You probably want to leave --[no]buffer-to-client enabled too, because buffering into a temp table
      and then fetching it all into Perl’s memory is probably a silly thing to do. This option is most useful for the
      GroupBy and Stream algorithms, which may fetch a lot of data from the server.
-[no]buffer-to-client
    default: yes
      Fetch rows one-by-one from MySQL while comparing.
      This option enables mysql_use_result which causes MySQL to hold the selected rows on the server until
      the tool fetches them. This allows the tool to use less memory but may keep the rows locked on the server
      longer.
      If this option is disabled by specifying --no-buffer-to-client then mysql_store_result is used
      which causes MySQL to send all selected rows to the tool at once. This may result in the results “cursor” being
      held open for a shorter time on the server, but if the tables are large, it could take a long time anyway, and use
      all your memory.
      For most non-trivial data sizes, you want to leave this option enabled.
      This option is disabled when --bidirectional is used.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-[no]check-master
    default: yes
      With --sync-to-master, try to verify that the detected master is the real master.
-[no]check-privileges
    default: yes
      Check that user has all necessary privileges on source and destination table.






-[no]check-slave
    default: yes
     Check whether the destination server is a slave.
     If the destination server is a slave, it’s generally unsafe to make changes on it. However, sometimes you have
     to; --replace won’t work unless there’s a unique index, for example, so you can’t make changes on the
     master in that scenario. By default pt-table-sync will complain if you try to change data on a slave. Specify
     --no-check-slave to disable this check. Use it at your own risk.
-[no]check-triggers
    default: yes
     Check that no triggers are defined on the destination table.
     Triggers were introduced in MySQL v5.0.2, so for older versions this option has no effect because triggers will
     not be checked.
-chunk-column
    type: string
     Chunk the table on this column.
-chunk-index
    type: string
     Chunk the table using this index.
-chunk-size
    type: string; default: 1000
     Number of rows or data size per chunk.
     The size of each chunk of rows for the “Chunk” and “Nibble” algorithms. The size can be either a number of
     rows, or a data size. Data sizes are specified with a suffix of k=kibibytes, M=mebibytes, G=gibibytes. Data
     sizes are converted to a number of rows by dividing by the average row length.
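      For example (illustrative command line), chunks of roughly 500 kibibytes of row data can be requested with:
      pt-table-sync --execute --sync-to-master --chunk-size 500k slave1
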
-columns
    short form: -c; type: array
     Compare this comma-separated list of columns.
-config
    type: Array
     Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-conflict-column
    type: string
     Compare this column when rows conflict during a --bidirectional sync.
     When a same but differing row is found the value of this column from each row is compared according to
     --conflict-comparison, --conflict-value and --conflict-threshold to determine which
     row has the correct data and becomes the source. The column can be any type for which there is an appropriate
     --conflict-comparison (this is almost all types except, for example, blobs).
     This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.
-conflict-comparison
    type: string
     Choose the --conflict-column with this property as the source.






      The option affects how the --conflict-column values from the conflicting rows are compared. The possible
      comparisons are:
      newest|oldest|greatest|least|equals|matches

      COMPARISON      CHOOSES ROW WITH
      ==========      =========================================================
      newest          Newest temporal --conflict-column value
      oldest          Oldest temporal --conflict-column value
      greatest        Greatest numerical --conflict-column value
      least           Least numerical --conflict-column value
      equals          --conflict-column value equal to --conflict-value
      matches         --conflict-column value matching Perl regex pattern
                      --conflict-value

      This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.
-conflict-error
    type: string; default: warn
      How to report unresolvable conflicts and conflict errors.
      This option changes how the user is notified when a conflict cannot be resolved or causes some kind of error.
      Possible values are:
      * warn: Print a warning to STDERR about the unresolvable conflict
      * die: Die, stop syncing, and print a warning to STDERR

      This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.
-conflict-threshold
    type: string
      Amount by which one --conflict-column must exceed the other.
      The --conflict-threshold prevents a conflict from being resolved if the absolute difference between
      the two --conflict-column values is less than this amount. For example, if the two --conflict-column
      values are the timestamps “2009-12-01 12:00:00” and “2009-12-01 12:05:00”, the difference is 5 minutes. If
      --conflict-threshold is set to “5m” the conflict will be resolved, but if --conflict-threshold
      is set to “6m” the conflict will fail to resolve because the difference is not greater than or equal to 6 minutes. In
      this latter case, --conflict-error will report the failure.
      This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.
-conflict-value
    type: string
      Use this value for certain --conflict-comparison.
      This option gives the value for equals and matches --conflict-comparison.
      This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.
-databases
    short form: -d; type: hash
      Sync only this comma-separated list of databases.
      A common request is to sync tables from one database with tables from another database on the same or different
      server. This is not yet possible. --databases will not do it, and you can’t do it with the D part of the DSN
      either because in the absence of a table name it assumes the whole server should be synced and the D part
      controls only the connection’s default database.





-defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
-dry-run
    Analyze, decide the sync algorithm to use, print and exit.
      Implies --verbose so you can see the results. The results are in the same output format that you’ll see from
      actually running the tool, but there will be zeros for rows affected. This is because the tool actually executes,
      but stops before it compares any data and just returns zeros. The zeros do not mean there are no changes to be
      made.
-engines
    short form: -e; type: hash
      Sync only this comma-separated list of storage engines.
-execute
    Execute queries to make the tables have identical data.
      This option makes pt-table-sync actually sync table data by executing all the queries that it created to resolve
      table differences. Therefore, the tables will be changed! And unless you also specify --verbose, the
      changes will be made silently. If this is not what you want, see --print or --dry-run.
-explain-hosts
    Print connection information and exit.
      Print out a list of hosts to which pt-table-sync will connect, with all the various connection options, and exit.
-float-precision
    type: int
      Precision for FLOAT and DOUBLE number-to-string conversion. Causes FLOAT and DOUBLE values to be
      rounded to the specified number of digits after the decimal point, with the ROUND() function in MySQL.
      This can help avoid checksum mismatches due to different floating-point representations of the same values on
      different MySQL versions and hardware. The default is no rounding; the values are converted to strings by the
      CONCAT() function, and MySQL chooses the string representation. If you specify a value of 2, for example,
      then the values 1.008 and 1.009 will be rounded to 1.01, and will checksum as equal.
-[no]foreign-key-checks
    default: yes
      Enable foreign key checks (SET FOREIGN_KEY_CHECKS=1).
      Specifying --no-foreign-key-checks will SET FOREIGN_KEY_CHECKS=0.
-function
    type: string
      Which hash function you’d like to use for checksums.
      The default is CRC32. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user-
      defined function, pt-table-sync will detect it and prefer to use it, because it is much faster than the built-ins.
      You can also use MURMUR_HASH if you’ve installed that user-defined function. Both of these are distributed
      with Maatkit. See pt-table-checksum for more information and benchmarks.
-help
    Show help and exit.
-[no]hex-blob
    default: yes
      HEX() BLOB, TEXT and BINARY columns.




      When row data from the source is fetched to create queries to sync the data (i.e. the queries seen with --print
      and executed by --execute), binary columns are wrapped in HEX() so the binary data does not produce an
      invalid SQL statement. You can disable this option but you probably shouldn’t.
-host
    short form: -h; type: string
      Connect to host.
-ignore-columns
    type: Hash
      Ignore this comma-separated list of column names in comparisons.
      This option causes columns not to be compared. However, if a row is determined to differ between tables, all
      columns in that row will be synced, regardless. (It is not currently possible to exclude columns from the sync
      process itself, only from the comparison.)
-ignore-databases
    type: Hash
      Ignore this comma-separated list of databases.
-ignore-engines
    type: Hash; default: FEDERATED,MRG_MyISAM
      Ignore this comma-separated list of storage engines.
-ignore-tables
    type: Hash
      Ignore this comma-separated list of tables.
      Table names may be qualified with the database name.
-[no]index-hint
    default: yes
      Add FORCE/USE INDEX hints to the chunk and row queries.
      By default pt-table-sync adds a FORCE/USE INDEX hint to each SQL statement to coerce MySQL into using
      the index chosen by the sync algorithm or specified by --chunk-index. This is usually a good thing, but
      in rare cases the index may not be the best for the query so you can suppress the index hint by specifying
      --no-index-hint and let MySQL choose the index.
      This does not affect the queries printed by --print; it only affects the chunk and row queries that pt-table-
      sync uses to select and compare rows.
-lock
    type: int
      Lock tables: 0=none, 1=per sync cycle, 2=per table, or 3=globally.
      This uses LOCK TABLES. This can help prevent tables being changed while you’re examining them. The
      possible values are as follows:
      VALUE     MEANING
      =====     =======================================================
      0         Never lock tables.
      1         Lock and unlock one time per sync cycle (as implemented
                by the syncing algorithm). This is the most granular
                level of locking available. For example, the Chunk
                algorithm will lock each chunk of C<N> rows, and then
                unlock them if they are the same on the source and the






                destination, before moving on to the next chunk.
       2        Lock and unlock before and after each table.
       3        Lock and unlock once for every server (DSN) synced, with
                C<FLUSH TABLES WITH READ LOCK>.

       A replication slave is never locked if --replicate or --sync-to-master is specified, since in theory
       locking the table on the master should prevent any changes from taking place. (You are not changing data on
       your slave, right?) If --wait is given, the master (source) is locked and then the tool waits for the slave to
       catch up to the master before continuing.
       If --transaction is specified, LOCK TABLES is not used. Instead, lock and unlock are implemented by
       beginning and committing transactions. The exception is if --lock is 3.
       If --no-transaction is specified, then LOCK TABLES is used for any value of --lock.                              See
       --[no]transaction.
-lock-and-rename
    Lock the source and destination table, sync, then swap names. This is useful as a less-blocking ALTER TABLE,
    once the tables are reasonably in sync with each other (which you may choose to accomplish via any number
    of means, including dump and reload or even something like pt-archiver). It requires exactly two DSNs and
    assumes they are on the same server, so it does no waiting for replication or the like. Tables are locked with
    LOCK TABLES.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
-port
    short form: -P; type: int
       Port number to use for connection.
-print
    Print queries that will resolve differences.
       If you don’t trust pt-table-sync, or just want to see what it will do, this is a good way to be safe. These queries
       are valid SQL and you can run them yourself if you want to sync the tables manually.
-recursion-method
    type: string
       Preferred recursion method used to find slaves.
       Possible methods are:
       METHOD            USES
       ===========       ================
       processlist       SHOW PROCESSLIST
       hosts             SHOW SLAVE HOSTS

       The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method
       is required if the server uses a non-standard port (not 3306). Usually pt-table-sync does the right thing and finds




      the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the other
      methods will be tried.
-replace
    Write all INSERT and UPDATE statements as REPLACE.
      This is automatically switched on as needed when there are unique index violations.
-replicate
    type: string
      Sync tables listed as different in this table.
      Specifies that pt-table-sync should examine the specified table to find data that differs. The table is exactly the
      same as the argument of the same name to pt-table-checksum. That is, it contains records of which tables (and
      ranges of values) differ between the master and slave.
      For each table and range of values that shows differences between the master and slave,
      pt-table-sync will sync that table, with the appropriate WHERE clause, to its master.
      This automatically sets --wait to 60 and causes changes to be made on the master instead of the slave.
      If --sync-to-master is specified, the tool will assume the server you specified is the slave, and connect to
      the master as usual to sync.
      Otherwise, it will try to use SHOW PROCESSLIST to find slaves of the server you specified. If it is unable to
      find any slaves via SHOW PROCESSLIST, it will inspect SHOW SLAVE HOSTS instead. You must configure
      each slave’s report-host, report-port and other options for this to work right. After finding slaves, it
      will inspect the specified table on each slave to find data that needs to be synced, and sync it.
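      The report settings are part of each slave’s normal MySQL server configuration, not something pt-table-sync
      manages; for example (illustrative hostname):
      [mysqld]
      report-host = slave1.example.com
      report-port = 3306
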
      The tool examines the master’s copy of the table first, assuming that the master is potentially a slave as well. Any
      table that shows differences there will NOT be synced on the slave(s). For example, suppose your replication is
      set up as A->B, B->C, B->D. Suppose you use this argument and specify server B. The tool will examine server
      B’s copy of the table. If it looks like server B’s data in table test.tbl1 is different from server A’s copy, the
      tool will not sync that table on servers C and D.
-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-sync-to-master
    Treat the DSN as a slave and sync it to its master.
      Treat the server you specified as a slave. Inspect SHOW SLAVE STATUS, connect to the server’s master, and
      treat the master as the source and the slave as the destination. Causes changes to be made on the master. Sets
      --wait to 60 by default, sets --lock to 1 by default, and disables --[no]transaction by default. See
      also --replicate, which changes this option’s behavior.
-tables
    short form: -t; type: hash
      Sync only this comma-separated list of tables.
      Table names may be qualified with the database name.
-timeout-ok
    Keep going if --wait fails.




      If you specify --wait and the slave doesn’t catch up to the master’s position before the wait times out, the
      default behavior is to abort. This option makes the tool keep going anyway. Warning: if you are trying to get a
      consistent comparison between the two servers, you probably don’t want to keep going after a timeout.
-[no]transaction
    Use transactions instead of LOCK TABLES.
      The granularity of beginning and committing transactions is controlled by --lock. This is enabled by default,
      but since --lock is disabled by default, it has no effect.
      Most options that enable locking also disable transactions by default, so if you want to use transactional locking
      (via LOCK IN SHARE MODE and FOR UPDATE), you must specify --transaction explicitly.
      If you don’t specify --transaction explicitly, pt-table-sync will decide on a per-table basis whether to use
      transactions or table locks. It currently uses transactions on InnoDB tables, and table locks on all others.
      If --no-transaction is specified, then pt-table-sync will not use transactions at all (not even for InnoDB
      tables) and locking is controlled by --lock.
      When enabled, either explicitly or implicitly, the transaction isolation level is set to REPEATABLE READ and
      transactions are started WITH CONSISTENT SNAPSHOT.
-trim
    TRIM() VARCHAR columns in BIT_XOR and ACCUM modes. Helps when comparing MySQL 4.1 to >= 5.0.
      This is useful when you don’t care about the trailing space differences between MySQL versions which vary in
      their handling of trailing spaces. MySQL 5.0 and later all retain trailing spaces in VARCHAR, while previous
      versions would remove them.
-[no]unique-checks
    default: yes
      Enable unique key checks (SET UNIQUE_CHECKS=1).
      Specifying --no-unique-checks will SET UNIQUE_CHECKS=0.
-user
    short form: -u; type: string
      User for login if not current user.
-verbose
    short form: -v; cumulative: yes
      Print results of sync operations.
      See “OUTPUT” for more details about the output.
-version
    Show version and exit.
-wait
    short form: -w; type: time
      How long to wait for slaves to catch up to their master.
      Make the master wait for the slave to catch up in replication before comparing the tables. The value is
      the number of seconds to wait before timing out (see also --timeout-ok). Sets --lock to 1 and
      --[no]transaction to 0 by default. If you see an error such as the following,
      MASTER_POS_WAIT returned -1

      It means the timeout was exceeded and you need to increase it.
      The default value of this option is influenced by other options. To see what value is in effect, run with --help.




      To disable waiting entirely (except for locks), specify --wait 0. This helps when the slave is lagging on tables
      that are not being synced.
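      For example (illustrative DSN), to wait up to five minutes for the slave and keep going even if it does not
      catch up in time:
      pt-table-sync --execute --sync-to-master --wait 300 --timeout-ok slave1
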
-where
    type: string
      WHERE clause to restrict syncing to part of the table.
-[no]zero-chunk
    default: yes
      Add a chunk for rows with zero or zero-equivalent values. This option only has an effect when --chunk-size is
      specified. The purpose of the zero chunk is to capture a potentially large number of zero values that would
      imbalance the size of the first chunk. For example, if a lot of negative numbers were inserted into an unsigned
      integer column causing them to be stored as zeros, then these zero values are captured by the zero chunk instead
      of the first chunk and all its non-zero values.


2.30.11 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.
   • D
      dsn: database; copy: yes
      Database containing the table to be synced.
   • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
   • h
      dsn: host; copy: yes
      Connect to host.
   • p
      dsn: password; copy: yes
      Password to use when connecting.
   • P
      dsn: port; copy: yes
      Port number to use for connection.
   • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.





    • t
      copy: yes
      Table to be synced.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.30.12 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-table-sync ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.30.13 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.30.14 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-table-sync.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.30.15 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:







wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.30.16 AUTHORS

Baron Schwartz


2.30.17 ACKNOWLEDGMENTS

My work is based in part on Giuseppe Maxia’s work on distributed databases,
http://www.sysadminmag.com/articles/2004/0408/ and code derived from that article. There is more
explanation, and a link to the code, at http://www.perlmonks.org/?node_id=381053.
Another programmer extended Maxia’s work even further. Fabien Coelho changed and generalized Maxia’s technique,
introducing symmetry and avoiding some problems that might have caused too-frequent checksum collisions. This
work grew into pg_comparator, http://www.coelho.net/pg_comparator/. Coelho also explained the technique further
in a paper titled “Remote Comparison of Database Tables” (http://cri.ensmp.fr/classement/doc/A-375.pdf).
This existing literature mostly addressed how to find the differences between the tables, not how to resolve them
once found. I needed a tool that would not only find them efficiently, but would then resolve them. I first began
thinking about how to improve the technique further with my article http://tinyurl.com/mysql-data-diff-algorithm,
where I discussed a number of problems with the Maxia/Coelho “bottom-up” algorithm. After writing that article,
I began to write this tool. I wanted to actually implement their algorithm with some improvements so I was sure
I understood it completely. I discovered it is not what I thought it was, and is considerably more complex than it
appeared to me at first. Fabien Coelho was kind enough to address some questions over email.
The first versions of this tool implemented a version of the Coelho/Maxia algorithm, which I called “bottom-up”, and
my own, which I called “top-down.” Those algorithms are considerably more complex than the current algorithms and
I have removed them from this tool, and may add them back later. The improvements to the bottom-up algorithm are
my original work, as is the top-down algorithm. The techniques to actually resolve the differences are also my own
work.
Another tool that can synchronize tables is the SQLyog Job Agent from webyog. Thanks to Rohit Nadhani, SJA’s
author, for the conversations about the general techniques. There is a comparison of pt-table-sync and SJA at
http://tinyurl.com/maatkit-vs-sqlyog
Thanks to the following people and organizations for helping in many ways:
The Rimm-Kaufman Group http://www.rimmkaufman.com/, MySQL AB http://www.mysql.com/, Blue Ridge Inter-
netWorks http://www.briworks.com/, Percona http://www.percona.com/, Fabien Coelho, Giuseppe Maxia and others
at MySQL AB, Kristian Koehntopp (MySQL AB), Rohit Nadhani (WebYog), the helpful monks at Perlmonks, and
others too numerous to mention.


2.30.18 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.






2.30.19 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.30.20 VERSION

pt-table-sync 2.1.1


2.31 pt-table-usage

2.31.1 NAME

pt-table-usage - Analyze how queries use tables.


2.31.2 SYNOPSIS

Usage

pt-table-usage [OPTIONS] [FILES]

pt-table-usage reads queries from a log and analyzes how they use tables. If no FILE is specified, it reads STDIN. It
prints a report for each query.


2.31.3 RISKS

pt-table-usage is very low risk. By default, it simply reads queries from a log. It executes EXPLAIN EXTENDED if you
specify the --explain-extended option.
At the time of this release, we know of no bugs that could harm users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-table-
usage.
See also “BUGS” for more information on filing bugs and getting help.






2.31.4 DESCRIPTION

pt-table-usage reads queries from a log and analyzes how they use tables. The log should be in MySQL’s slow query
log format.
Table usage is more than simply an indication of which tables the query reads or writes. It also indicates data flow:
data in and data out. The tool determines the data flow by the contexts in which tables appear. A single query can
use a table in several different contexts simultaneously. The tool’s output lists every context for every table. This
CONTEXT-TABLE list indicates how data flows between tables. The “OUTPUT” section lists the possible contexts
and describes how to read a table usage report.
The tool analyzes data flow down to the level of individual columns, so it is helpful if columns are identified un-
ambiguously in the query. If a query uses only one table, then all columns must be from that table, and there’s no
difficulty. But if a query uses multiple tables and the column names are not table-qualified, then it is necessary to use
EXPLAIN EXTENDED, followed by SHOW WARNINGS, to determine to which tables the columns belong.
If the tool does not know the query’s default database, which can occur when the database is not printed in the log,
then EXPLAIN EXTENDED can fail. In this case, you can specify a default database with --database. You can
also use the --create-table-definitions option to help resolve ambiguities.
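For example (the database name and log path are illustrative), a default database can be supplied while analyzing a
slow log whose entries do not record one:
pt-table-usage --database test /var/log/mysql/slow.log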


2.31.5 OUTPUT

The tool prints a usage report for each table in every query, similar to the following:
Query_id: 0x1CD27577D202A339.1
UPDATE t1
SELECT DUAL
JOIN t1
JOIN t2
WHERE t1

Query_id: 0x1CD27577D202A339.2
UPDATE t2
SELECT DUAL
JOIN t1
JOIN t2
WHERE t1

The first line contains the query ID, which by default is the same as those shown in pt-query-digest reports. It is an
MD5 checksum of the query’s “fingerprint,” which is what remains after removing literals, collapsing white space,
and a variety of other transformations. The value has two parts separated by a period: the query ID and the table
number. If you wish to use a different value to identify the query, you can specify the --id-attribute option.
The previous example shows two paragraphs for a single query, not two queries. Note that the query ID is identical for
the two, but the table number differs. The table number increments by 1 for each table that the query updates. Only
multi-table UPDATE queries can update multiple tables with a single query, so the table number is 1 for all other types
of queries. (The tool does not support multi-table DELETE queries.) The example output above is from this query:
UPDATE t1 AS a JOIN t2 AS b USING (id)
SET a.foo="bar", b.foo="bat"
WHERE a.id=1;

The SET clause indicates that the query updates two tables: a aliased as t1, and b aliased as t2.
After the first line, the tool prints a variable number of CONTEXT-TABLE lines. Possible contexts are as follows:
    • SELECT






     SELECT means that the query retrieves data from the table for one of two reasons. The first is to be
     returned to the user as part of a result set. Only SELECT queries return result sets, so the report always
     shows a SELECT context for SELECT queries.
     The second case is when data flows to another table as part of an INSERT or UPDATE. For example, the
     UPDATE query in the example above has the usage:
     SELECT DUAL

     This refers to:
     SET a.foo="bar", b.foo="bat"

     The tool uses DUAL for any values that do not originate in a table, in this case the literal values “bar” and
     “bat”. If that SET clause were SET a.foo=b.foo instead, then the complete usage would be:
     Query_id: 0x1CD27577D202A339.1
     UPDATE t1
     SELECT t2
     JOIN t1
     JOIN t2
     WHERE t1

     The presence of a SELECT context after another context, such as UPDATE or INSERT, indicates where
     the UPDATE or INSERT retrieves its data. The example immediately above reflects an UPDATE query
     that updates rows in table t1 with data from table t2.
   • Any other verb
     Any other verb, such as INSERT, UPDATE, DELETE, etc. may be a context. These verbs indicate that the
     query modifies data in some way. If a SELECT context follows one of these verbs, then the query reads
     data from the SELECT table and writes it to this table. This happens, for example, with INSERT..SELECT
     or UPDATE queries that use values from tables instead of constant values.
     These query types are not supported: SET, LOAD, and multi-table DELETE.
   • JOIN
     The JOIN context lists tables that are joined, either with an explicit JOIN in the FROM clause, or implicitly
     in the WHERE clause, such as t1.id = t2.id.
   • WHERE
     The WHERE context lists tables that are used in the WHERE clause to filter results. This does not include
     tables that are implicitly joined in the WHERE clause; those are listed as JOIN contexts. For example:
     WHERE t1.id > 100 AND t1.id < 200 AND t2.foo IS NOT NULL

     Results in:
     WHERE t1
     WHERE t2

     The tool lists only distinct tables; that is why table t1 is listed only once.
   • TLIST
     The TLIST context lists tables that the query accesses, but which do not appear in any other context.
     These tables are usually an implicit cartesian join. For example, the query SELECT * FROM t1, t2
     results in:







      Query_id: 0xBDDEB6EDA41897A8.1
      SELECT t1
      SELECT t2
      TLIST t1
      TLIST t2

      First of all, there are two SELECT contexts, because SELECT * selects rows from all tables; t1 and
      t2 in this case. Secondly, the tables are implicitly joined, but without any kind of join condition, which
      results in a cartesian join as indicated by the TLIST context for each.


2.31.6 EXIT STATUS

pt-table-usage exits 1 on any kind of error, or 0 if no errors.


2.31.7 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    Prompt for a password when connecting to MySQL.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-constant-data-value
    type: string; default: DUAL
      Table to print as the source for constant data (literals). This is any data not retrieved from tables (or subqueries,
      because subqueries are not supported). This includes literal values such as strings (“foo”) and numbers (42),
      or functions such as NOW(). For example, in the query INSERT INTO t (c) VALUES (’a’), the string
      ‘a’ is constant data, so the table usage report is:
      INSERT t
      SELECT DUAL

      The first line indicates that the query inserts data into table t, and the second line indicates that the inserted data
      comes from some constant value.
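      For example, to label constant data with the placeholder name CONST instead of DUAL (an illustrative command; the log file name is a placeholder):
      pt-table-usage --constant-data-value CONST slow.log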
-[no]continue-on-error
    default: yes
      Continue to work even if there is an error.
-create-table-definitions
    type: array
      Read CREATE TABLE definitions from this list of comma-separated files.             If you cannot use
      --explain-extended to fully qualify table and column names, you can save the output of mysqldump
      --no-data to one or more files and specify those files with this option. The tool will parse all CREATE





       TABLE definitions from the files and use this information to qualify table and column names. If a column name
       appears in multiple tables, or a table name appears in multiple databases, the ambiguities cannot be resolved.
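       For example, the following is an illustrative sketch (the dump file and log file names are placeholders); it captures schema definitions with mysqldump and uses them to qualify names:
       mysqldump --no-data --all-databases > schema.sql
       pt-table-usage --create-table-definitions schema.sql slow.log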
-daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
-database
    short form: -D; type: string
       Default database.
-defaults-file
    short form: -F; type: string
       Only read mysql options from the given file. You must give an absolute pathname.
-explain-extended
    type: DSN
       A server to execute EXPLAIN EXTENDED queries. This may be necessary to resolve ambiguous (unqualified)
       column and table names.
-filter
    type: string
       Discard events for which this Perl code doesn’t return true.
       This option is a string of Perl code or a file containing Perl code that is compiled into a subroutine with one
       argument: $event. If the given value is a readable file, then pt-table-usage reads the entire file and uses its
       contents as the code.
       Filters are implemented in the same fashion as in the pt-query-digest tool, so please refer to its documentation
       for more information.
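       For example, the following command (an illustrative sketch; the log file name is a placeholder) analyzes only SELECT statements:
       pt-table-usage --filter '$event->{arg} =~ m/^select/i' slow.log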
-help
    Show help and exit.
-host
    short form: -h; type: string
       Connect to host.
-id-attribute
    type: string
       Identify each event using this attribute. The default is to use a query ID, which is an MD5 checksum of the
       query’s fingerprint.
-log
       type: string
       Print all output to this file when daemonized.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when running. The file contains the process ID of the daemonized instance. The PID
       file is removed when the daemonized instance exits. The program checks for the existence of the PID file when
       starting; if it exists and the process with the matching PID exists, the program exits.





-port
    short form: -P; type: int
      Port number to use for connection.
-progress
    type: array; default: time,30
      Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be
      percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage,
      seconds, or number of iterations.
-query
    type: string
      Analyze the specified query instead of reading a log file.
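      For example (an illustrative command; the query is a placeholder), to analyze a single ad-hoc query without a log:
      pt-table-usage --query 'SELECT * FROM t1 JOIN t2 USING (id)'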
-read-timeout
    type: time; default: 0
      Wait this long for an event from the input; 0 to wait forever.
      This option sets the maximum time to wait for an event from the input. If an event is not received after the
      specified time, the tool stops reading the input and prints its reports.
      This option requires the Perl POSIX module.
-run-time
    type: time
      How long to run before exiting. The default is to run forever (you can interrupt with CTRL-C).
-set-vars
    type: string; default: wait_timeout=10000
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-socket
    short form: -S; type: string
      Socket file to use for connection.
-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.


2.31.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the =, and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.





    • D
      copy: no
      Default database.
    • F
      dsn: mysql_read_default_file; copy: no
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: no
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.31.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-table-usage ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.31.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.31.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-table-usage.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool




    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.31.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.31.13 AUTHORS

Daniel Nichter


2.31.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.31.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.






2.31.16 VERSION

pt-table-usage 2.1.1


2.32 pt-tcp-model

2.32.1 NAME

pt-tcp-model - Transform tcpdump into metrics that permit performance and scalability modeling.


2.32.2 SYNOPSIS

Usage

pt-tcp-model [OPTION...] [FILE]

pt-tcp-model parses and analyzes tcpdump files. With no FILE, or when FILE is -, it reads standard input.
Dump TCP requests and responses to a file, capturing only the packet headers to avoid dropped packets, and ignoring
any packets without a payload (such as ack-only packets). Capture port 3306 (MySQL database traffic). Note that to
avoid line breaking in terminals and man pages, the TCP filtering expression that follows has a line break at the end
of the second line; you should omit this from your tcpdump command.
tcpdump -s 384 -i any -nnq -tttt \
      'tcp port 3306 and (((ip[2:2] - ((ip[0]&0xf)<<2))
     - ((tcp[12]&0xf0)>>2)) != 0)' \
  > /path/to/tcp-file.txt

Extract individual response times, sorted by end time:
pt-tcp-model /path/to/tcp-file.txt > requests.txt

Sort the result by arrival time, for input to the next step:
sort -n -k1,1 requests.txt > sorted.txt

Slice the result into 10-second intervals and emit throughput, concurrency, and response time metrics for each interval:
pt-tcp-model --type=requests --run-time=10 sorted.txt > sliced.txt

Transform the result for modeling with Aspersa’s usl tool, discarding the first and last line of each file if you specify
multiple files (the first and last line are normally incomplete observation periods and are aberrant):
for f in sliced.txt; do
   tail -n +2 "$f" | head -n -1 | awk '{print $2, $3, $7/$4}'
done > usl-input.txt



2.32.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-tcp-model merely reads and transforms its input, printing it to the output. It should be very low risk.




At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-tcp-model.
See also “BUGS” for more information on filing bugs and getting help.


2.32.4 DESCRIPTION

This tool recognizes requests and responses in a TCP stream, and extracts the “conversations”. You can use it to
capture the response times of individual queries to a database, for example. It expects the TCP input to be in the
following format, which should result from the sample shown in the SYNOPSIS:
<date> <time.microseconds> IP <IP.port> > <IP.port>: <junk>

The tool watches for “incoming” packets to the port you specify with the --watch-server option. This begins a
request. If multiple inbound packets follow each other, then by default the last inbound packet seen determines the
time at which the request is assumed to begin. This is logical if one assumes that a server must receive the whole SQL
statement before beginning execution, for example.
When the first outbound packet is seen, the server is considered to have responded to the request. The tool might see
an inbound packet, but never see a response. This can happen when the kernel drops packets, for example. As a result,
the tool never prints a request unless it sees the response to it. However, the tool actually does not print any request
until it sees the “last” outbound packet. It determines this by waiting for either another inbound packet, or EOF, and
then considers the previous inbound/outbound pair to be complete. As a result, the tool prints requests in a relatively
random order. Most types of analysis require processing in either arrival or completion order. Therefore, the second
type of processing this tool can do requires that you sort the output from the first stage and supply it as input.
The second type of processing is selected with the --type option set to “requests”. In this mode, the tool reads a
group of requests and aggregates them, then emits the aggregated metrics.


2.32.5 OUTPUT

In the default mode (parsing tcpdump output), requests are printed out one per line, in the following format:
<id> <start> <end> <elapsed> <IP:port>

The ID is an incrementing number, assigned in arrival order in the original TCP traffic. The start and end timestamps,
and the elapsed time, can be customized with the --start-end option.
In --type=requests mode, the tool prints out one line per time interval as defined by --run-time, with the
following columns: ts, concurrency, throughput, arrivals, completions, busy_time, weighted_time, sum_time,
variance_mean, quantile_time, obs_time. A detailed explanation follows:
ts
      The timestamp that defines the beginning of the interval.
concurrency
      The average number of requests resident in the server during the interval.
throughput
      The number of arrivals per second during the interval.
arrivals
      The number of arrivals during the interval.




completions
     The number of completions during the interval.
busy_time
     The total amount of time during which at least one request was resident in the server during the interval.
weighted_time
     The total response time of all the requests resident in the server during the interval, including requests that
     neither arrived nor completed during the interval.
sum_time
     The total response time of all the requests that arrived in the interval.
variance_mean
     The variance-to-mean ratio (index of dispersion) of the response times of the requests that arrived in the
     interval.
quantile_time
     The Nth percentile response time for all the requests that arrived in the interval. See also --quantile.
obs_time
     The length of the observation time window. This will usually be the same as the interval length, except
     for the first and last intervals in a file, which might have a shorter observation time.


2.32.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-config
    type: Array
     Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-help
    Show help and exit.
-progress
    type: array; default: time,30
     Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be
     percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage,
     seconds, or number of iterations.
-quantile
    type: float
     The percentile for the last column when --type is “requests” (default .99).
-run-time
    type: float
     The size of the aggregation interval in seconds when --type is “requests” (default 1). Fractional values are
     permitted.
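      For example, the following is an illustrative sketch (sorted.txt is assumed to be request output sorted as shown in the “SYNOPSIS”); it aggregates into 60-second intervals and reports the 95th percentile response time:
      pt-tcp-model --type=requests --run-time=60 --quantile=0.95 sorted.txt > sliced.txt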
-start-end
    type: Array; default: ts,end






      Define how the arrival and completion timestamps of a query, and thus its response time (elapsed time) are
      computed. Recall that there may be multiple inbound and outbound packets per request and response, and refer
      to the following ASCII diagram. Suppose that a client sends a series of three inbound (I) packets to the server,
      which computes the result and then sends two outbound (O) packets back:
      I I    I ..................... O    O
      |<---->|<---response time----->|<-->|
      ts0    ts                      end end1

      By default, the query is considered to arrive at time ts, and complete at time end. However, this might not be
      what you want. Perhaps you do not want to consider the query to have completed until time end1. You can
      accomplish this by setting this option to ts,end1.
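      For example (an illustrative command; the input file name is a placeholder), to consider each query complete only at the last outbound packet:
      pt-tcp-model --start-end ts,end1 /path/to/tcp-file.txt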
-type
    type: string
      The type of input to parse (default tcpdump). The permitted types are
      tcpdump
           The parser expects the input to be formatted with the following options: -x -n -q -tttt. For
           example, if you want to capture output from your local machine, you can do something like the
           following (the port must come last on FreeBSD):
            tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 \
              > mysql.tcp.txt
            pt-query-digest --type tcpdump mysql.tcp.txt

           The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make sure the output looks
           like this (there is a line break in the first line to avoid man-page problems):
           2009-04-12 09:50:16.804849 IP 127.0.0.1.42167
                  > 127.0.0.1.3306: tcp 37

           All MySQL servers running on port 3306 are automatically detected in the tcpdump output.
            Therefore, if the tcpdump output contains packets from multiple servers on port 3306 (for example,
            10.0.0.1:3306, 10.0.0.2:3306, etc.), all packets/queries from all these servers will be analyzed
            together as if they were one server.
           If you’re analyzing traffic for a protocol that is not running on port 3306, see --watch-server.
-version
    Show version and exit.
-watch-server
    type: string; default: 10.10.10.10:3306
      This option tells pt-tcp-model which server IP address and port (such as “10.0.0.1:3306”) to watch when parsing
      tcpdump for --type tcpdump. If you don’t specify it, the tool watches all servers by looking for any IP address
      using port 3306. If you’re watching a server with a non-standard port, this won’t work, so you must specify the
      IP address and port to watch.
      Currently, IP address filtering isn’t implemented; so even though you must specify the option in IP:port form, it
      ignores the IP and only looks at the port number.
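      For example (an illustrative sketch; the address, port, and file name are placeholders), to watch a server running on a non-standard port:
      pt-tcp-model --watch-server 10.0.0.1:3307 /path/to/tcp-file.txt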


2.32.7 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:






PTDEBUG=1 pt-tcp-model ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.32.8 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.32.9 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-tcp-model.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.32.10 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.32.11 AUTHORS

Baron Schwartz


2.32.12 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.





2.32.13 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.32.14 VERSION

pt-tcp-model 2.1.1


2.33 pt-trend

2.33.1 NAME

pt-trend - Compute statistics over a set of time-series data points.


2.33.2 SYNOPSIS

Usage

pt-trend [OPTION...] [FILE ...]

pt-trend reads a slow query log and outputs statistics on it.


2.33.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-trend simply reads files given on the command line. It should be very low risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool
will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-trend.
See also “BUGS” for more information on filing bugs and getting help.


2.33.4 DESCRIPTION

You can specify multiple files on the command line. If you don’t specify any, or if you use the special filename -,
lines are read from standard input.
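For example (an illustrative sketch; the file names are placeholders), you can analyze several logs at once, or feed a compressed log through standard input:
pt-trend slow1.log slow2.log
gunzip -c slow.log.gz | pt-trend -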




2.33.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-config
    type: Array
       Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-help
    Show help and exit.
-pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
-progress
    type: array; default: time,15
       Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be
       percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage,
       seconds, or number of iterations.
-quiet
    short form: -q
       Disables --progress.
-version
    Show version and exit.


2.33.6 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-trend ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.33.7 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.33.8 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-trend.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool





    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.33.9 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.33.10 AUTHORS

Baron Schwartz


2.33.11 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.33.12 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.






2.33.13 VERSION

pt-trend 2.1.1


2.34 pt-upgrade

2.34.1 NAME

pt-upgrade - Execute queries on multiple servers and check for differences.


2.34.2 SYNOPSIS

Usage

pt-upgrade [OPTION...] DSN [DSN...] [FILE]

pt-upgrade compares query execution on two hosts by executing queries in the given file (or STDIN if no file given)
and examining the results, errors, warnings, etc. produced on each.
Execute and compare all queries in slow.log on host1 to host2:
pt-upgrade slow.log h=host1 h=host2

Use pt-query-digest to get, execute and compare queries from tcpdump:
tcpdump -i eth0 port 3306 -s 65535 -x -n -q -tttt \
  | pt-query-digest --type tcpdump --no-report --print \
  | pt-upgrade h=host1 h=host2

Compare only query times on host1 to host2 and host3:
pt-upgrade slow.log h=host1 h=host2 h=host3 --compare query_times

Compare a single query, no slowlog needed:
pt-upgrade h=host1 h=host2 --query 'SELECT * FROM db.tbl'



2.34.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-upgrade is a read-only tool that is meant to be used on non-production servers. It executes the SQL that you give
it as input, which could cause undesired load on a production server.
At the time of this release, there is a bug that causes the tool to crash, and a bug that causes a deadlock.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-upgrade.
See also “BUGS” for more information on filing bugs and getting help.






2.34.4 DESCRIPTION

pt-upgrade executes queries from slowlogs on one or more MySQL servers to find differences in query time,
warnings, results, and other aspects of the queries’ execution. This helps evaluate upgrades, migrations and configuration
changes. The comparisons specified by --compare determine what differences can be found. A report is printed
which outlines all the differences found; see “OUTPUT” below.
The first DSN (host) specified on the command line is authoritative; it defines the results to which the other DSNs are
compared. You can “compare” only one host, in which case there will be no differences but the output can be saved to
be diffed later against the output of another single host “comparison”.
At present, pt-upgrade only reads slowlogs. Use pt-query-digest --print to transform other log formats to
slowlog.
DSNs and slowlog files can be specified in any order. pt-upgrade will automatically determine if an argument is a
DSN or a slowlog file. If no slowlog files are given and --query is not specified then pt-upgrade will read from
STDIN.


2.34.5 OUTPUT

Queries are grouped by fingerprint, and any with differences are printed. The first part of a query report is a summary of
differences. In the example below, the query returns a different number of rows (row counts) on each server. The
second part is the side-by-side comparison of values obtained from the query on each server. Then a sample of the
query is printed, preceded by its ID which can be used to locate more information in the sub-report at the end. There
are sub-reports for various types of differences.
# Query 1: ID 0x3C830E3839B916D7 at byte 0 _______________________________
# Found 1 differences in 1 samples:
#   column counts   0
#   column types    0
#   column values   0
#   row counts      1
#   warning counts 0
#   warning levels 0
#   warnings        0
#            127.1:12345 127.1:12348
# Errors               0            0
# Warnings             0            0
# Query_time
#   sum                0            0
#   min                0            0
#   max                0            0
#   avg                0            0
#   pct_95             0            0
#   stddev             0            0
#   median             0            0
# row_count
#   sum                4            3
#   min                4            3
#   max                4            3
#   avg                4            3
#   pct_95             4            3
#   stddev             0            0
#   median             4            3
use `test`;
select i from t where i is not null







/* 3C830E3839B916D7-1 */ select i from t where i is not null

#   Row count differences
#   Query ID           127.1:12345 127.1:12348
#   ================== =========== ===========
#   3C830E3839B916D7-1           4           3

The output will vary slightly depending on which options are specified.


2.34.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
-ask-pass
    Prompt for a password when connecting to MySQL.
-base-dir
    type: string; default: /tmp
      Save outfiles for the rows comparison method in this directory.
      See the rows --compare-results-method.
-charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
-[no]clear-warnings
    default: yes
      Clear warnings before each warnings comparison.
      If comparing warnings (--compare includes warnings), this option causes pt-upgrade to execute a suc-
      cessful SELECT statement which clears any warnings left over from previous queries. This requires a current
      database that pt-upgrade usually detects automatically, but in some cases it might be necessary to specify
      --temp-database. If pt-upgrade can’t auto-detect the current database, it will create a temporary table in
      the --temp-database called mk_upgrade_clear_warnings.
-clear-warnings-table
    type: string
      Execute SELECT * FROM ...            LIMIT 1 from this table to clear warnings.
-compare
    type: Hash; default: query_times,results,warnings
      What to compare for each query executed on each host.
      Comparisons determine differences when the queries are executed on the hosts. More comparisons enable more
      differences to be detected. The following comparisons are available:
      query_times
           Compare query execution times. If this comparison is disabled, the queries are still executed so that
           other comparisons will work, but the query time attributes are removed from the events.
      results






             Compare result sets to find differences in rows, columns, etc.
             What differences can be found depends on the --compare-results-method used.
      warnings
             Compare warnings from SHOW WARNINGS. Requires at least MySQL 4.1.
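      For example (an illustrative command; the host names and log file are placeholders), to compare only result sets and warnings and skip query time comparison:
      pt-upgrade slow.log h=host1 h=host2 --compare results,warnings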
-compare-results-method
    type: string; default: CHECKSUM; group: Comparisons
      Method to use for --compare results. This option has no effect if --no-compare-results is given.
      Available compare methods (case-insensitive):
      CHECKSUM
             Do CREATE TEMPORARY TABLE `mk_upgrade` AS query then CHECKSUM TABLE
             `mk_upgrade`. This method is fast and simple, but in rare cases it might be inaccurate because
             the MySQL manual says:
             [The] fact that two tables produce the same checksum does not mean that
             the tables are identical.

             Requires at least MySQL 4.1.
      rows
             Compare rows one-by-one to find differences. This method has advantages and disadvantages. Its
             disadvantages are that it may be slower and it requires writing and reading outfiles from disk. Its
             advantages are that it is universal (works for all versions of MySQL), it doesn’t alter the query in any
             way, and it can find column value differences.
             The rows method works as follows:
             1. Rows from each host are compared one-by-one.
             2. If no differences are found, comparison stops, else...
             3. All remaining rows (after the point where they begin to differ)
                are written to outfiles.
             4. The outfiles are loaded into temporary tables with
                LOAD DATA LOCAL INFILE.
             5. The temporary tables are analyzed to determine the differences.

             The outfiles are written to the --base-dir.
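      For example (an illustrative sketch; the host names, directory, and log file are placeholders), to use the rows method with a custom outfile directory and stop comparing after 50 row differences:
      pt-upgrade slow.log h=host1 h=host2 --compare-results-method rows --base-dir /data/tmp --max-different-rows 50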
-config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
-continue-on-error
    Continue working even if there is an error.
-convert-to-select
    Convert non-SELECT statements to SELECTs and compare.
      By default non-SELECT statements are not allowed. This option causes non-SELECT statements (like UPDATE,
      INSERT and DELETE) to be converted to SELECT statements, executed and compared.
      For example, DELETE col FROM tbl WHERE id=1 is converted to SELECT col FROM tbl
      WHERE id=1.
-daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.



264                                                                                                  Chapter 2. Tools
Percona Toolkit Documentation, Release 2.1.1


-explain-hosts
    Print connection information and exit.
-filter
    type: string
      Discard events for which this Perl code doesn’t return true.
      This option is a string of Perl code or a file containing Perl code that gets compiled into a subroutine with one
      argument: $event. This is a hashref. If the given value is a readable file, then pt-upgrade reads the entire file
      and uses its contents as the code. The file should not contain a shebang (#!/usr/bin/perl) line.
      If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the
      subroutine other than return $event. The subroutine template is:
      sub { $event = shift; filter && return $event; }

      Filters given on the command line are wrapped inside parentheses like ( filter ). For complex, multi-
      line filters, you must put the code inside a file so it will not be wrapped inside parentheses. Either way, the filter
      must produce syntactically valid code given the template. For example, an if-else branch given on the command
      line would not be valid:
      --filter 'if () { } else { }'               # WRONG

      Since it’s given on the command line, the if-else branch would be wrapped inside parentheses which is not
      syntactically valid. So to accomplish something more complex like this would require putting the code in a file,
      for example filter.txt:
      my $event_ok; if (...) { $event_ok=1; } else { $event_ok=0; } $event_ok

      Then specify --filter filter.txt to read the code from filter.txt.
      If the filter code won’t compile, pt-upgrade will die with an error. If the filter code does compile, an error
      may still occur at runtime if the code tries to do something wrong (like pattern match an undefined value).
      pt-upgrade does not provide any safeguards so code carefully!
      An example filter that discards everything but SELECT statements:
      --filter '$event->{arg} =~ m/^select/i'

      This is compiled into a subroutine like the following:
      sub { $event = shift; ( $event->{arg} =~ m/^select/i ) && return $event; }

      It is permissible for the code to have side effects (to alter $event).
      You can find an explanation of the structure of $event at http://code.google.com/p/maatkit/wiki/EventAttributes.
-fingerprints
    Add query fingerprints to the standard query analysis report. This is mostly useful for debugging purposes.
-float-precision
    type: int
      Round float, double and decimal values to this many places.
      This option helps eliminate false-positives caused by floating-point imprecision.
-help
    Show help and exit.
-host
    short form: -h; type: string




       Connect to host.
-iterations
    type: int; default: 1
        How many times to iterate through the collect-and-report cycle. If 0, iterate to infinity. See also --run-time.
-limit
    type: string; default: 95%:20
       Limit output to the given percentage or count.
       If the argument is an integer, report only the top N worst queries. If the argument is an integer followed by the
       % sign, report that percentage of the worst queries. If the percentage is followed by a colon and another integer,
       report the top percentage or the number specified by that integer, whichever comes first.
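        For example (an illustrative command; the host names and log file are placeholders), to report only the ten worst queries:
        pt-upgrade slow.log h=host1 h=host2 --limit 10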
-log
       type: string
       Print all output to this file when daemonized.
-max-different-rows
    type: int; default: 10
       Stop comparing rows for --compare-results-method rows after this many differences are found.
-order-by
    type: string; default: differences:sum
       Sort events by this attribute and aggregate function.
-password
    short form: -p; type: string
       Password to use when connecting.
-pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
-port
    short form: -P; type: int
       Port number to use for connection.
-query
    type: string
       Execute and compare this single query; ignores files on command line.
       This option allows you to supply a single query on the command line. Any slowlogs also specified on the
       command line are ignored.
-reports
    type: Hash; default: queries,differences,errors,statistics
       Print these reports. Valid reports are queries, differences, errors, and statistics.
       See “OUTPUT” for more information on the various parts of the report.
-run-time
    type: time





      How long to run before exiting. The default is to run forever (you can interrupt with CTRL-C).
-set-vars
    type: string; default: wait_timeout=10000,query_cache_type=0
      Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
      executed.
-shorten
    type: int; default: 1024
      Shorten long statements in reports.
      Shortens long statements, replacing the omitted portion with a /*... omitted ...*/ comment. This
      applies only to the output in reports. It prevents a large statement from causing difficulty in a report. The
      argument is the preferred length of the shortened statement. Not all statements can be shortened, but very large
      INSERT and similar statements often can; and so can IN() lists, although only the first such list in the statement
      will be shortened.
      If it shortens something beyond recognition, you can find the original statement in the log, at the offset shown
      in the report header (see “OUTPUT”).
-socket
    short form: -S; type: string
      Socket file to use for connection.
-temp-database
    type: string
      Use this database for creating temporary tables.
      If given, this database is used for creating temporary tables for the results comparison (see --compare).
      Otherwise, the current database (from the last event that specified its database) is used.
-temp-table
    type: string; default: mk_upgrade
      Use this table for checksumming results.
-user
    short form: -u; type: string
      User for login if not current user.
-version
    Show version and exit.
-zero-query-times
    Zero the query times in the report.


2.34.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the =, and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
      dsn: charset; copy: yes
      Default character set.




    • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.34.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-upgrade ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.34.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.34.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-upgrade.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool




    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.34.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.34.12 AUTHORS

Daniel Nichter


2.34.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.34.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2009-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.






2.34.15 VERSION

pt-upgrade 2.1.1


2.35 pt-variable-advisor

2.35.1 NAME

pt-variable-advisor - Analyze MySQL variables and advise on possible problems.


2.35.2 SYNOPSIS

Usage

pt-variable-advisor [OPTION...] [DSN]

pt-variable-advisor analyzes variables and advises on possible problems.
Get SHOW VARIABLES from localhost:
pt-variable-advisor localhost

Get SHOW VARIABLES output saved in vars.txt:
pt-variable-advisor --source-of-variables vars.txt



2.35.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-variable-advisor reads MySQL’s configuration and examines it and is thus very low risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-variable-advisor.
See also “BUGS” for more information on filing bugs and getting help.


2.35.4 DESCRIPTION

pt-variable-advisor examines SHOW VARIABLES for bad values and settings according to the “RULES” described
below. It reports on variables that match the rules, so you can find bad settings in your MySQL server.
At the time of this release, pt-variable-advisor only examines SHOW VARIABLES, but other input sources are
planned, such as SHOW STATUS and SHOW SLAVE STATUS.
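For example (an illustrative sketch; the host and file names are placeholders, and the mysql client invocation is one assumed way to produce the saved output), you can capture SHOW VARIABLES from a server and analyze it offline:
mysql -h host1 -e "SHOW GLOBAL VARIABLES" > vars.txt
pt-variable-advisor --source-of-variables vars.txt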






2.35.5 RULES

These are the rules that pt-variable-advisor will apply to SHOW VARIABLES. Each rule has three parts: an ID, a
severity, and a description.
The rule’s ID is a short, unique name for the rule. It usually relates to the variable that the rule examines. If a variable
is examined by several rules, then the rules’ IDs are numbered like “-1”, “-2”, “-N”.
The rule’s severity is an indication of how important it is that this rule matched a query. We use NOTE, WARN, and
CRIT to denote these levels.
The rule’s description is a textual, human-readable explanation of what it means when a variable matches this rule.
Depending on the verbosity of the report you generate, you will see more of the text in the description. By default,
you’ll see only the first sentence, which is sort of a terse synopsis of the rule’s meaning. At a higher verbosity, you’ll
see subsequent sentences.
auto_increment
        severity: note
        Are you trying to write to more than one server in a dual-master or ring replication configuration? This
        is potentially very dangerous and in most cases is a serious mistake. Most people’s reasons for doing this
        are actually not valid at all.
concurrent_insert
        severity: note
        Holes (spaces left by deletes) in MyISAM tables might never be reused.
connect_timeout
        severity: note
        A large value of this setting can create a denial of service vulnerability.
debug
        severity: crit
        Servers built with debugging capability should not be used in production because of the large performance
        impact.
delay_key_write
        severity: warn
        MyISAM index blocks are never flushed until necessary. If there is a server crash, data corruption on
        MyISAM tables can be much worse than usual.
flush
        severity: warn
        This option might decrease performance greatly.
flush_time
        severity: warn
        This option might decrease performance greatly.
have_bdb
        severity: note
        The BDB engine is deprecated. If you aren’t using it, you should disable it with the skip_bdb option.
init_connect
      severity: note
      The init_connect option is enabled on this server.
init_file
      severity: note
      The init_file option is enabled on this server.
init_slave
      severity: note
      The init_slave option is enabled on this server.
innodb_additional_mem_pool_size
      severity: warn
      This variable generally doesn’t need to be larger than 20MB.
innodb_buffer_pool_size
      severity: warn
      The InnoDB buffer pool size is unconfigured. In a production environment it should always be configured
      explicitly, and the default 10MB size is not good.
innodb_checksums
      severity: warn
      InnoDB checksums are disabled. Your data is not protected from hardware corruption or other errors!
innodb_doublewrite
      severity: warn
      InnoDB doublewrite is disabled. Unless you use a filesystem that protects against partial page writes,
      your data is not safe!
innodb_fast_shutdown
      severity: warn
      InnoDB’s shutdown behavior is not the default. This can lead to poor performance, or the need to perform
      crash recovery upon startup.
innodb_flush_log_at_trx_commit-1
      severity: warn
      InnoDB is not configured in strictly ACID mode. If there is a crash, some transactions can be lost.
innodb_flush_log_at_trx_commit-2
      severity: warn
      Setting innodb_flush_log_at_trx_commit to 0 has no performance benefits over setting it to 2, and more
      types of data loss are possible. If you are trying to change it from 1 for performance reasons, you should
      set it to 2 instead of 0.
innodb_force_recovery
     severity: warn
     InnoDB is in forced recovery mode! This should be used only temporarily when recovering from data
     corruption or other bugs, not for normal usage.
innodb_lock_wait_timeout
     severity: warn
     This option has an unusually long value, which can cause system overload if locks are not being released.
innodb_log_buffer_size
     severity: warn
     The InnoDB log buffer size generally should not be set larger than 16MB. If you are doing large BLOB
     operations, InnoDB is not really a good choice of engines anyway.
innodb_log_file_size
     severity: warn
     The InnoDB log file size is set to its default value, which is not usable on production systems.
innodb_max_dirty_pages_pct
     severity: note
     The innodb_max_dirty_pages_pct is lower than the default. This can cause overly aggressive flushing and
     add load to the I/O system.
flush_time
     severity: warn
     This setting is likely to cause very bad performance every flush_time seconds.
key_buffer_size
     severity: warn
     The key buffer size is unconfigured. In a production environment it should always be configured explicitly,
     and the default 8MB size is not good.
large_pages
     severity: note
     Large pages are enabled.
locked_in_memory
     severity: note
        The server is locked in memory with --memlock.
log_warnings-1
     severity: note
     Log_warnings is disabled, so unusual events such as statements unsafe for replication and aborted con-
     nections will not be logged to the error log.
log_warnings-2
     severity: note
     Log_warnings must be set greater than 1 to log unusual events such as aborted connections.
low_priority_updates
       severity: note
       The server is running with non-default lock priority for updates. This could cause update queries to wait
       unexpectedly for read queries.
max_binlog_size
       severity: note
       The max_binlog_size is smaller than the default of 1GB.
max_connect_errors
       severity: note
       max_connect_errors should probably be set as large as your platform allows.
max_connections
       severity: warn
       If the server ever really has more than a thousand threads running, then the system is likely to spend more
       time scheduling threads than really doing useful work. This variable’s value should be considered in light
       of your workload.
myisam_repair_threads
       severity: note
       myisam_repair_threads > 1 enables multi-threaded repair, which is relatively untested and is still listed as
       beta-quality code in the official documentation.
old_passwords
       severity: warn
       Old-style passwords are insecure. They are sent in plain text across the wire.
optimizer_prune_level
       severity: warn
       The optimizer will use an exhaustive search when planning complex queries, which can cause the planning
       process to take a long time.
port
       severity: note
       The server is listening on a non-default port.
query_cache_size-1
       severity: note
       The query cache does not scale to large sizes and can cause unstable performance when larger than
       128MB, especially on multi-core machines.
query_cache_size-2
       severity: warn
       The query cache can cause severe performance problems when it is larger than 256MB, especially on
       multi-core machines.
read_buffer_size-1
      severity: note
      The read_buffer_size variable should generally be left at its default unless an expert determines it is
      necessary to change it.
read_buffer_size-2
      severity: warn
      The read_buffer_size variable should not be larger than 8MB. It should generally be left at its default
      unless an expert determines it is necessary to change it. Making it larger than 2MB can hurt performance
      significantly, and can make the server crash, swap to death, or just become extremely unstable.
read_rnd_buffer_size-1
      severity: note
      The read_rnd_buffer_size variable should generally be left at its default unless an expert determines it is
      necessary to change it.
read_rnd_buffer_size-2
      severity: warn
      The read_rnd_buffer_size variable should not be larger than 4M. It should generally be left at its default
      unless an expert determines it is necessary to change it.
relay_log_space_limit
      severity: warn
      Setting relay_log_space_limit can cause replicas to stop fetching binary logs from their master immedi-
      ately. This could increase the risk that your data will be lost if the master crashes. If the replicas have
      encountered a limit on relay log space, then it is possible that the latest transactions exist only on the
      master and no replica has retrieved them.
slave_net_timeout
      severity: warn
      This variable is set too high. This is too long to wait before noticing that the connection to the master
      has failed and retrying. This should probably be set to 60 seconds or less. It is also a good idea to use
      pt-heartbeat to ensure that the connection does not appear to time out when the master is simply idle.
slave_skip_errors
      severity: crit
      You should not set this option. If replication is having errors, you need to find and resolve the cause of that;
      it is likely that your slave’s data is different from the master. You can find out with pt-table-checksum.
sort_buffer_size-1
      severity: note
      The sort_buffer_size variable should generally be left at its default unless an expert determines it is nec-
      essary to change it.
sort_buffer_size-2
      severity: note
      The sort_buffer_size variable should generally be left at its default unless an expert determines it is nec-
      essary to change it. Making it larger than a few MB can hurt performance significantly, and can make the
      server crash, swap to death, or just become extremely unstable.
sql_notes
      severity: note
      This server is configured not to log Note level warnings to the error log.
sync_frm
      severity: warn
      It is best to set sync_frm so that .frm files are flushed safely to disk in case of a server crash.
tx_isolation-1
      severity: note
      This server’s transaction isolation level is non-default.
tx_isolation-2
      severity: warn
      Most applications should use the default REPEATABLE-READ transaction isolation level, or in a few
      cases READ-COMMITTED.
expire_logs_days
      severity: warn
      Binary logs are enabled, but automatic purging is not enabled. If you do not purge binary logs, your disk
      will fill up. If you delete binary logs externally to MySQL, you will cause unwanted behaviors. Always
      ask MySQL to purge obsolete logs, never delete them externally.
innodb_file_io_threads
      severity: note
      This option is useless except on Windows.
innodb_data_file_path
      severity: note
      Auto-extending InnoDB files can consume a lot of disk space that is very difficult to reclaim later. Some
      people prefer to set innodb_file_per_table and allocate a fixed-size file for ibdata1.
innodb_flush_method
      severity: note
      Most production database servers that use InnoDB should set innodb_flush_method to O_DIRECT to
      avoid double-buffering, unless the I/O system is very low performance.
innodb_locks_unsafe_for_binlog
      severity: warn
      This option makes point-in-time recovery from binary logs, and replication, untrustworthy if statement-
      based logging is used.
innodb_support_xa
      severity: warn
      MySQL’s internal XA transaction support between InnoDB and the binary log is disabled. The binary
      log might not match InnoDB’s state after crash recovery, and replication might drift out of sync due to
      out-of-order statements in the binary log.
log_bin
      severity: warn
      Binary logging is disabled, so point-in-time recovery and replication are not possible.
log_output
      severity: warn
      Directing log output to tables has a high performance impact.
max_relay_log_size
      severity: note
      A custom max_relay_log_size is defined.
myisam_recover_options
      severity: warn
      myisam_recover_options should be set to some value such as BACKUP,FORCE to ensure that table cor-
      ruption is noticed.
storage_engine
      severity: note
      The server is using a non-standard storage engine as default.
sync_binlog
      severity: warn
      Binary logging is enabled, but sync_binlog isn’t configured so that every transaction is flushed to the
      binary log for durability.
tmp_table_size
      severity: note
      The effective minimum size of in-memory implicit temporary tables used internally during query execu-
      tion is min(tmp_table_size, max_heap_table_size), so max_heap_table_size should be at least as large as
      tmp_table_size.
old mysql version
      severity: warn
       These are the recommended minimum versions for each major release: 3.23, 4.1.20, 5.0.37, 5.1.30.
end-of-life mysql version
      severity: note
      Every release older than 5.1 is now officially end-of-life.


2.35.6 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
       Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
       option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
       on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--config
    type: Array
       Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--daemonize
    Fork to the background and detach from the shell. POSIX operating systems only.
--defaults-file
    short form: -F; type: string
       Only read mysql options from the given file. You must give an absolute pathname.
--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--ignore-rules
    type: hash
       Ignore these rule IDs.
       Specify a comma-separated list of rule IDs (e.g. debug,old_passwords) to ignore.
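       For example, a sketch of an invocation that skips two of the rules listed above (the hostname is illustrative):
       pt-variable-advisor --ignore-rules debug,old_passwords localhost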
--password
    short form: -p; type: string
       Password to use when connecting.
--pid
       type: string
       Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The
       PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file
       when starting; if it exists and the process with the matching PID exists, the program exits.
--port
    short form: -P; type: int
       Port number to use for connection.
--set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
--socket
    short form: -S; type: string
       Socket file to use for connection.
--source-of-variables
    type: string; default: mysql
       Read SHOW VARIABLES from this source. Possible values are “mysql”, “none” or a file name. If “mysql” is
       specified then you must also specify a DSN on the command line.
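       As a sketch (the file name is illustrative), you can save the variables once and analyze them offline:
       mysql -e "SHOW GLOBAL VARIABLES" > vars.txt
       pt-variable-advisor --source-of-variables vars.txt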



--user
    short form: -u; type: string
      User for login if not current user.
--verbose
    short form: -v; cumulative: yes; default: 1
      Increase verbosity of output. At the default level of verbosity, the program prints only the first sentence of each
      rule’s description. At higher levels, the program prints more of the description.
--version
    Show version and exit.


2.35.7 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
    • A
      dsn: charset; copy: yes
      Default character set.
    • D
      dsn: database; copy: yes
      Default database.
    • F
      dsn: mysql_read_default_file; copy: yes
      Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.35.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-variable-advisor ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.35.9 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.35.10 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-variable-advisor.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.35.11 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.35.12 AUTHORS

Baron Schwartz and Daniel Nichter


2.35.13 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.35.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.35.15 VERSION

pt-variable-advisor 2.1.1


2.36 pt-visual-explain

2.36.1 NAME

pt-visual-explain - Format EXPLAIN output as a tree.


2.36.2 SYNOPSIS

Usage

pt-visual-explain [OPTION...] [FILE...]

pt-visual-explain transforms EXPLAIN output into a tree representation of the query plan. If FILE is given, input is
read from the file(s). With no FILE, or when FILE is -, read standard input.


Examples

pt-visual-explain <file_containing_explain_output>

pt-visual-explain -c <file_containing_query>

mysql -e "explain select * from mysql.user" | pt-visual-explain




2.36.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this
tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write
tools) and those created by bugs.
pt-visual-explain is read-only and very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this
tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-
visual-explain.
See also “BUGS” for more information on filing bugs and getting help.


2.36.4 DESCRIPTION

pt-visual-explain reverse-engineers MySQL’s EXPLAIN output into a query execution plan, which it then formats
as a left-deep tree – the same way the plan is represented inside MySQL. It is possible to do this by hand, or to
read EXPLAIN’s output directly, but it requires patience and expertise. Many people find a tree representation more
understandable.
You can pipe input into pt-visual-explain or specify a filename at the command line, including the magical ‘-‘ file-
name, which will read from standard input. It can do two things with the input: parse it for something that looks like
EXPLAIN output, or connect to a MySQL instance and run EXPLAIN on the input.
When parsing its input, pt-visual-explain understands three formats: tabular like that shown in the mysql command-
line client, vertical like that created by using the \G line terminator in the mysql command-line client, and tab separated.
It ignores any lines it doesn’t know how to parse.
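For instance, a sketch of feeding vertical (\G) output to the tool; the query is illustrative:
mysql -e "EXPLAIN SELECT * FROM sakila.film\G" | pt-visual-explain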
When executing the input, pt-visual-explain replaces everything in the input up to the first SELECT keyword with
‘EXPLAIN SELECT,’ and then executes the result. You must specify --connect to execute the input as a query.
Either way, it builds a tree from the result set and prints it to standard output. For the following query,
select * from sakila.film_actor join sakila.film using(film_id);

pt-visual-explain generates this query plan:
JOIN
+- Bookmark lookup
| +- Table
| | table             film_actor
| | possible_keys idx_fk_film_id
| +- Index lookup
|     key             film_actor->idx_fk_film_id
|     possible_keys idx_fk_film_id
|     key_len         2
|     ref             sakila.film.film_id
|     rows            2
+- Table scan
   rows           952
   +- Table
      table           film
      possible_keys PRIMARY

The query plan is a left-deep tree traversed depth-first, and the tree’s root is the output node – the last step in the
execution plan. In other words, read it like this:
1. Table scan the ‘film’ table, which accesses an estimated 952 rows.
2. For each row, find matching rows by doing an index lookup into the film_actor->idx_fk_film_id index with the
   value from sakila.film.film_id, then a bookmark lookup into the film_actor table.
For more information on how to read EXPLAIN output, please see http://dev.mysql.com/doc/en/explain.html, and this
talk titled “Query Optimizer Internals and What’s New in the MySQL 5.2 Optimizer,” from Timour Katchaounov, one
of the MySQL developers: http://maatkit.org/presentations/katchaounov_timour.pdf.


2.36.5 MODULES

This program is actually a runnable module, not just an ordinary Perl script. In fact, there are two modules embedded
in it. This makes unit testing easy, but it also makes it easy for you to use the parsing and tree-building functionality if
you want.
The ExplainParser package accepts a string and parses whatever it thinks looks like EXPLAIN output from it. The
synopsis is as follows:
require "pt-visual-explain";
my $p    = ExplainParser->new();
my $rows = $p->parse("some text");
# $rows is an arrayref of hashrefs.

The ExplainTree package accepts a set of rows and turns it into a tree. For convenience, you can also have it delegate
to ExplainParser and parse text for you. Here’s the synopsis:
require "pt-visual-explain";
my $e      = ExplainTree->new();
my $tree   = $e->parse("some text", %options);
my $output = $e->pretty_print($tree);
print $output;



2.36.6 ALGORITHM

This section explains the algorithm that converts EXPLAIN into a tree. You may be interested in reading this if
you want to understand EXPLAIN more fully, or trying to figure out how this works, but otherwise this section will
probably not make your life richer.
The tree can be built by examining the id, select_type, and table columns of each row. Here’s what I know about them:
The id column is the sequential number of the select. This does not indicate nesting; it just comes from counting
SELECT from the left of the SQL statement. It’s like capturing parentheses in a regular expression. A UNION
RESULT row doesn’t have an id, because it isn’t a SELECT. The source code actually refers to UNIONs as a fake_lex,
as I recall.
If two adjacent rows have the same id value, they are joined with the standard single-sweep multi-join method.
The select_type column tells a) that a new sub-scope has opened b) what kind of relationship the row has to the
previous row c) what kind of operation the row represents.
    • SIMPLE means there are no subqueries or unions in the whole query.
    • PRIMARY means there are, but this is the outermost SELECT.
    • [DEPENDENT] UNION means this result is UNIONed with the previous result (not row; a result might encom-
      pass more than one row).
    • UNION RESULT terminates a set of UNIONed results.
    • [DEPENDENT|UNCACHEABLE] SUBQUERY means a new sub-scope is opening. This is the kind of sub-
      query that happens in a WHERE clause, SELECT list or whatnot; it does not return a so-called “derived table.”
    • DERIVED is a subquery in the FROM clause.
Tables that are JOINed all have the same select_type. For example, if you JOIN three tables inside a dependent
subquery, they’ll all say the same thing: DEPENDENT SUBQUERY.
The table column usually specifies the table name or alias, but may also say <derivedN> or <unionN,N...N>. If it says
<derivedN>, the row represents an access to the temporary table that holds the result of the subquery whose id is N. If
it says <unionN,..N> it’s the same thing, but it refers to the results it UNIONs together.
Finally, order matters. If a row’s id is less than the one before it, I think that means it is dependent on something other
than the one before it. For example,
explain select
   (select 1 from sakila.film),
   (select 2 from sakila.film_actor),
   (select 3 from sakila.actor);

| id | select_type | table      |
+----+-------------+------------+
| 1 | PRIMARY      | NULL       |
| 4 | SUBQUERY     | actor      |
| 3 | SUBQUERY     | film_actor |
| 2 | SUBQUERY     | film       |

If the results were in order 2-3-4, I think that would mean 3 is a subquery of 2, and 4 is a subquery of 3. As it is, this
means 4 is a subquery of the nearest previous row with a smaller id, which is 1. Likewise for 3 and 2.
This structure is hard to programmatically build into a tree for the same reason it’s hard to understand by inspection:
there are both forward and backward references. <derivedN> is a forward reference to selectN, while <unionM,N> is a
backward reference to selectM and selectN. That makes recursion and other tree-building algorithms hard to get right
(NOTE: after implementation, I now see how it would be possible to deal with both forward and backward references,
but I have no motivation to change something that works). Consider the following:
select * from (
   select 1 from        sakila.actor as actor_1
   union
   select 1 from        sakila.actor as actor_2
) as der_1
union
select * from (
   select 1 from        sakila.actor as actor_3
   union all
   select 1 from        sakila.actor as actor_4
) as der_2;

| id   | select_type  | table      |
+------+--------------+------------+
| 1    | PRIMARY      | <derived2> |
| 2    | DERIVED      | actor_1    |
| 3    | UNION        | actor_2    |
| NULL | UNION RESULT | <union2,3> |
| 4    | UNION        | <derived5> |
| 5    | DERIVED      | actor_3    |
| 6    | UNION        | actor_4    |
| NULL | UNION RESULT | <union5,6> |
| NULL | UNION RESULT | <union1,4> |

This would be a lot easier to work with if it looked like this (I’ve bracketed the id on rows I moved):
| id   | select_type  | table      |
+------+--------------+------------+
| [1]  | UNION RESULT | <union1,4> |
| 1    | PRIMARY      | <derived2> |
| [2]  | UNION RESULT | <union2,3> |
| 2    | DERIVED      | actor_1    |
| 3    | UNION        | actor_2    |
| 4    | UNION        | <derived5> |
| [5]  | UNION RESULT | <union5,6> |
| 5    | DERIVED      | actor_3    |
| 6    | UNION        | actor_4    |

In fact, why not re-number all the ids, so the PRIMARY row becomes 2, and so on? That would make it even easier
to read. Unfortunately that would also have the effect of destroying the meaning of the id column, which I think is
important to preserve in the final tree. Also, though it makes it easier to read, it doesn’t make it easier to manipulate
programmatically; so it’s fine to leave them numbered as they are.
The goal of re-ordering is to make it easier to figure out which rows are children of which rows in the execution plan.
Given the reordered list and some row whose table is <union...> or <derived>, it is easy to find the beginning of the
slice of rows that should be child nodes in the tree: you just look for the first row whose ID is the same as the first
number in the table.
The next question is how to find the last row that should be a child node of a UNION or DERIVED. I’ll start with
DERIVED, because the solution makes UNION easy.
Consider how MySQL numbers the SELECTs sequentially according to their position in the SQL, left-to-right. Since
a DERIVED table encloses everything within it in a scope, which becomes a temporary table, there are only two things
to think about: its child subqueries and unions (if any), and its next siblings in the scope that encloses it. Its children
will all have an id greater than it does, by definition, so any later rows with a smaller id terminate the scope.
Here’s an example. The middle derived table here has a subquery and a UNION to make it a little more complex for
the example.
explain select 1
from (
   select film_id from sakila.film limit 1
) as der_1
join (
   select film_id, actor_id, (select count(*) from sakila.rental) as r
   from sakila.film_actor limit 1
   union all
   select 1, 1, 1 from sakila.film_actor as dummy
) as der_2 using (film_id)
join (
   select actor_id from sakila.actor limit 1
) as der_3 using (actor_id);

Here’s the output of EXPLAIN:
| id   | select_type  | table      |
| 1    | PRIMARY      | <derived2> |
| 1    | PRIMARY      | <derived6> |
| 1    | PRIMARY      | <derived3> |
| 6    | DERIVED      | actor      |
| 3    | DERIVED      | film_actor |
| 4    | SUBQUERY     | rental     |
| 5    | UNION        | dummy      |
| NULL | UNION RESULT | <union3,5> |
| 2    | DERIVED      | film       |

The siblings all have id 1, and the middle one I care about is derived3. (Notice MySQL doesn’t execute them in the
order I defined them, which is fine). Now notice that MySQL prints out the rows in the opposite order I defined the
subqueries: 6, 3, 2. It always seems to do this, and there might be other methods of finding the scope boundaries
including looking for the lower boundary of the next largest sibling, but this is a good enough heuristic. I am forced
to rely on it for non-DERIVED subqueries, so I rely on it here too. Therefore, I decide that everything greater than or
equal to 3 belongs to the DERIVED scope.
The rule for UNION is simple: they consume the entire enclosing scope, and to find the component parts of each one,
you find each part’s beginning as referred to in the <unionN,...> definition, and its end is either just before the next
one, or if it’s the last part, the end is the end of the scope.
This is only simple because UNION consumes the entire scope, which is either the entire statement, or the scope of a
DERIVED table. This is because a UNION cannot be a sibling of another UNION or a table, DERIVED or not. (Try
writing such a statement if you don’t see it intuitively). Therefore, you can just find the enclosing scope’s boundaries,
and the rest is easy. Notice in the example above, the UNION is over <union3,5>, which includes the row with id 4 –
it includes every row between 3 and 5.
Finally, there are non-derived subqueries to deal with as well. In this case I can’t look at siblings to find the end of the
scope as I did for DERIVED. I have to trust that MySQL executes depth-first. Here’s an example:
explain
select actor_id,
(
   select count(film_id)
   + (select count(*) from sakila.film)
   from sakila.film join sakila.film_actor using(film_id)
   where exists(
      select * from sakila.actor
      where sakila.actor.actor_id = sakila.film_actor.actor_id
   )
)
from sakila.actor;

| id | select_type                    |   table        |
| 1 | PRIMARY                         |   actor        |
| 2 | SUBQUERY                        |   film         |
| 2 | SUBQUERY                        |   film_actor   |
| 4 | DEPENDENT SUBQUERY              |   actor        |
| 3 | SUBQUERY                        |   film         |

In order, the tree should be built like this:
    • See row 1.
    • See row 2. It’s a higher id than 1, so it’s a subquery, along with every other row whose id is greater than 2.
    • Inside this scope, see 2 and 2 and JOIN them. See 4. It’s a higher id than 2, so it’s again a subquery; recurse.
      After that, see 3, which is also higher; recurse.
But the only reason the nested subquery didn’t include select 3 is because select 4 came first. In other words, if
EXPLAIN looked like this,
| id | select_type        | table      |
| 1  | PRIMARY            | actor      |
| 2  | SUBQUERY           | film       |
| 2  | SUBQUERY           | film_actor |
| 3  | SUBQUERY           | film       |
| 4  | DEPENDENT SUBQUERY | actor      |

I would be forced to assume upon seeing select 3 that select 4 is a subquery of it, rather than just being the next sibling
in the enclosing scope. If this is ever wrong, then the algorithm is wrong, and I don’t see what could be done about it.
UNION is a little more complicated than just “the entire scope is a UNION,” because the UNION might itself be inside
an enclosing scope that’s only indicated by the first item inside the UNION. There are only three kinds of enclosing
scopes: UNION, DERIVED, and SUBQUERY. A UNION can’t enclose a UNION, and a DERIVED has its own
“scope markers,” but a SUBQUERY can wholly enclose a UNION, like this strange example on the empty table t1:
explain select * from t1 where not exists(
   (select t11.i from t1 t11) union (select t12.i from t1 t12));

|   id | select_type | table       | Extra                          |
+------+--------------+------------+--------------------------------+
|    1 | PRIMARY      | t1         | const row not found            |
|    2 | SUBQUERY     | NULL       | No tables used                 |
|    3 | SUBQUERY     | NULL       | no matching row in const table |
|    4 | UNION        | t12        | const row not found            |
| NULL | UNION RESULT | <union2,4> |                                |

The UNION’s backward references might make it look like the UNION encloses the subquery, but studying the query
makes it clear this isn’t the case. So when a UNION’s first row says SUBQUERY, it is this special case.
By the way, I don’t fully understand this query plan; there are 4 numbered SELECT in the plan, but only 3 in the
query. The parens around the UNIONs are meaningful. Removing them will make the EXPLAIN different. Please
tell me how and why this works if you know.
Armed with this knowledge, it’s possible to use recursion to turn the parent-child relationship between all the rows
into a tree representing the execution plan.
MySQL prints the rows in execution order, even the forward and backward references. At any given scope, the rows
are processed as a left-deep tree. MySQL does not do “bushy” execution plans. It begins with a table, finds a matching
row in the next table, and continues till the last table, when it emits a row. When it runs out, it backtracks till it can
find the next row and repeats. There are subtleties of course, but this is the basic plan. This is why MySQL transforms
all RIGHT OUTER JOINs into LEFT OUTER JOINs and cannot do FULL OUTER JOIN.
This means in any given scope, say
| id     |   select_type       |   table         |
| 1      |   SIMPLE            |   tbl1          |
| 1      |   SIMPLE            |   tbl2          |
| 1      |   SIMPLE            |   tbl3          |

The execution plan looks like a depth-first traversal of this tree:
        JOIN
       /    \
    JOIN    tbl3
    /  \
 tbl1  tbl2

The JOIN might not be a JOIN. It might be a subquery, for example. This comes from the type column of EXPLAIN.
The documentation says this is a “join type,” but I think “access type” is more accurate, because it’s “how MySQL
accesses rows.”
pt-visual-explain decorates the tree significantly more than just turning rows into nodes. Each node may get a series
of transformations that turn it into a subtree of more than one node. For example, an index scan not marked with
‘Using index’ must do a bookmark lookup into the table rows; that is a three-node subtree. However, after the above
node-ordering and scoping stuff, the rest of the process is pretty simple.


2.36.7 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass
    Prompt for a password when connecting to MySQL.
--charset
    short form: -A; type: string
      Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8
      option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode
      on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
--clustered-pk
    Assume that PRIMARY KEY index accesses don’t need to do a bookmark lookup to retrieve rows. This is the
    case for InnoDB.
--config
    type: Array
      Read this comma-separated list of config files; if specified, this must be the first option on the command line.
--connect
    Treat input as a query, and obtain EXPLAIN output by connecting to a MySQL instance and running EXPLAIN
    on the query. When this option is given, pt-visual-explain uses the other connection-specific options such as
    --user to connect to the MySQL instance. If you have a .my.cnf file, it will read it, so you may not need to
    specify any connection-specific options.
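    For example, a sketch that reads a query from a file and asks the server for its plan (the file name and database
    are illustrative; connection settings come from .my.cnf if present):
    pt-visual-explain --connect --database sakila query.sql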
--database
    short form: -D; type: string
      Connect to this database.
--defaults-file
    short form: -F; type: string
      Only read mysql options from the given file. You must give an absolute pathname.
--format
    type: string; default: tree
      Set output format.
      The default is a terse pretty-printed tree. The valid values are:
      Value    Meaning
      =====    ================================================
      tree     Pretty-printed terse tree.
      dump     Data::Dumper output (see Data::Dumper for more).
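       For example, a sketch that prints the raw parsed plan instead of the tree (the file name is illustrative):
       pt-visual-explain --format dump explain-output.txt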

--help
    Show help and exit.
--host
    short form: -h; type: string
       Connect to host.
--password
    short form: -p; type: string
       Password to use when connecting.
--pid
       type: string
       Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script
       exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and
       writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process
       is running with that PID, then the script dies; or, if there is no process running with that PID, then the script
       overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.
--port
    short form: -P; type: int
       Port number to use for connection.
--set-vars
    type: string; default: wait_timeout=10000
       Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and
       executed.
--socket
    short form: -S; type: string
       Socket file to use for connection.
--user
    short form: -u; type: string
       User for login if not current user.
--version
    Show version and exit.


2.36.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-
sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
   • A
       dsn: charset; copy: yes
       Default character set.
   • D
       dsn: database; copy: yes
       Default database.
   • F
       dsn: mysql_read_default_file; copy: yes
       Only read default options from the given file
    • h
      dsn: host; copy: yes
      Connect to host.
    • p
      dsn: password; copy: yes
      Password to use when connecting.
    • P
      dsn: port; copy: yes
      Port number to use for connection.
    • S
      dsn: mysql_socket; copy: yes
      Socket file to use for connection.
    • u
      dsn: user; copy: yes
      User for login if not current user.


2.36.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-visual-explain ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


2.36.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.36.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-visual-explain.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
    • Complete command-line used to run the tool
    • Tool --version
    • MySQL version of all servers involved
    • Output from the tool including STDERR
    • Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.




2.36.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:
wget percona.com/get/TOOL

Replace TOOL with the name of any tool.


2.36.13 AUTHORS

Baron Schwartz


2.36.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL
support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are
employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.


2.36.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.36.16 VERSION

pt-visual-explain 2.1.1




CHAPTER THREE

CONFIGURATION

3.1 CONFIGURATION FILES

Percona Toolkit tools can read options from configuration files. The configuration file syntax is simple and direct,
and bears some resemblances to the MySQL command-line client tools. The configuration files all follow the same
conventions.
Internally, what actually happens is that the lines are read from the file and then added as command-line options and
arguments to the tool, so just think of the configuration files as a way to write your command lines.


3.1.1 SYNTAX

The syntax of the configuration files is as follows:
* Whitespace followed by a hash sign (#) signifies that the rest of the line is a comment. This is deleted. For example,
  in "cycles=2 # trigger if problem seen twice in a row", everything from the # onward is removed before the option
  is parsed.
* Whitespace is stripped from the beginning and end of all lines.
* Empty lines are ignored.
* Each line is permitted to be in either of the following formats:
  option
  option=value

  Do not prefix the option with --. Do not quote the values, even if they have spaces; values are literal. Whitespace
  around the equals sign is deleted during processing.
* Only long options are recognized.
* A line containing only two hyphens signals the end of option parsing. Any further lines are interpreted as additional
  arguments (not options) to the program.




3.1.2 EXAMPLE

This config file for pt-stalk,
# Config for pt-stalk
variable=Threads_connected
cycles=2 # trigger if problem seen twice in a row
--
--user daniel

is equivalent to this command line:
pt-stalk --variable Threads_connected --cycles 2 -- --user daniel

Options after -- are passed literally to mysql and mysqladmin.


3.1.3 READ ORDER

The tools read several configuration files in order:
   1. The global Percona Toolkit configuration file, /etc/percona-toolkit/percona-toolkit.conf. All tools read this file,
      so you should only add options to it that you want to apply to all tools.
   2. The global tool-specific configuration file, /etc/percona-toolkit/TOOL.conf, where TOOL is a tool name like
      pt-query-digest. This file is named after the specific tool you’re using, so you can add options that apply
      only to that tool.
   3. The user’s own Percona Toolkit configuration file, $HOME/.percona-toolkit.conf. All tools read this file, so you
      should only add options to it that you want to apply to all tools.
   4. The user’s tool-specific configuration file, $HOME/.TOOL.conf, where TOOL is a tool name like
      pt-query-digest. This file is named after the specific tool you’re using, so you can add options that
      apply only to that tool.
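As a sketch of how the read order can be used (the option values here are illustrative, not recommendations), connection
defaults can live in the global file while tool-specific options live in the per-tool file:
# /etc/percona-toolkit/percona-toolkit.conf (read by all tools)
user=monitor

# $HOME/.pt-query-digest.conf (read only by pt-query-digest)
limit=95%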


3.1.4 SPECIFYING

There is a special --config option, which lets you specify which configuration files Percona Toolkit should read.
You specify a comma-separated list of files. However, its behavior is not like other command-line options. It must
be given first on the command line, before any other options. If you try to specify it anywhere else, it will cause an
error. Also, you cannot specify --config=/path/to/file; you must specify the option and the path to the file
separated by whitespace without an equal sign between them, like:
--config /path/to/file

If you don’t want any configuration files at all, specify --config '' to provide an empty list of files.
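For example, a sketch that runs pt-query-digest with no configuration files at all (the slow log path is illustrative);
note that --config must come first:
pt-query-digest --config '' /var/log/mysql/slow.log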


3.2 DSN (DATA SOURCE NAME) SPECIFICATIONS

Percona Toolkit tools use DSNs to specify how to create a DBD connection to a MySQL server. A DSN is a comma-
separated string of key=value parts, like:
h=host1,P=3306,u=bob




The standard key parts are shown below, but some tools add additional key parts. See each tool’s documentation for
details.
Some tools do not use DSNs but still connect to MySQL using options like --host, --user, and --password.
Such tools use these options to create a DSN automatically, behind the scenes.
Other tools use both DSNs and options like the ones above. The options provide defaults for all DSNs that do
not specify the option’s corresponding key part. For example, if DSN h=host1 and option --port=12345 are
specified, then the tool automatically adds P=12345 to the DSN.
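To illustrate (a sketch; the host and port are arbitrary), the following two command lines are equivalent, because
--port supplies the missing P part of the DSN:
pt-variable-advisor --port 12345 h=host1
pt-variable-advisor h=host1,P=12345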


3.2.1 KEY PARTS

Many of the tools add more parts to DSNs for special purposes, and sometimes override parts to make them do
something slightly different. However, all the tools support at least the following:
A
      Specifies the default character set for the connection.
      Enables character set settings in Perl and MySQL. If the value is utf8, sets Perl’s binmode on STDOUT
      to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after
      connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET
      NAMES after connecting to MySQL.
      Unfortunately, there is no way from within Perl itself to specify the client library’s character set. SET
      NAMES only affects the server; if the client library’s settings don’t match, there could be problems. You
      can use the defaults file to specify the client library’s character set, however. See the description of the F
      part below.
D
      Specifies the connection’s default database.
F
      Specifies a defaults file the mysql client library (the C client library used by DBD::mysql, not Percona
      Toolkit itself) should read. The tools all read the [client] section within the defaults file. If you omit
      this, the standard defaults files will be read in the usual order. “Standard” varies from system to system,
      because the filenames to read are compiled into the client library. On Debian systems, for example, it’s
      usually /etc/mysql/my.cnf then ~/.my.cnf. If you place the following into ~/.my.cnf, tools will Do The
      Right Thing:
      [client]
      user=your_user_name
      password=secret

      Omitting the F part is usually the right thing to do. As long as you have configured your ~/.my.cnf
      correctly, that will result in tools connecting automatically without needing a username or password.
      You can also specify a default character set in the defaults file. Unlike the “A” part described above, this
      will actually instruct the client library (DBD::mysql) to change the character set it uses internally, which
      cannot be accomplished any other way as far as I know, except for utf8.
h
      Hostname or IP address for the connection.
p
      Password to use when connecting.
P
      Port number to use for the connection. Note that the usual special-case behaviors apply: if you specify
      localhost as your hostname on Unix systems, the connection actually uses a socket file, not a TCP/IP
      connection, and thus ignores the port.
S
      Socket file to use for the connection (on Unix systems).
u
      User for login if not current user.


3.2.2 BAREWORD

Many of the tools will let you specify a DSN as a single word, without any key=value syntax. This is called a
‘bareword’. How this is handled is tool-specific, but it is usually interpreted as the “h” part. The tool’s --help output
will tell you the behavior for that tool.
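For example (a sketch), for tools that interpret a bareword as the “h” part, these two invocations are usually equivalent:
pt-variable-advisor host1
pt-variable-advisor h=host1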


3.2.3 PROPAGATION

Many tools will let you propagate values from one DSN to the next, so you don’t have to specify all the parts for each
DSN. For example, if you want to specify a username and password for each DSN, you can connect to three hosts as
follows:
h=host1,u=fred,p=wilma host2 host3

This is tool-specific.


3.3 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-table-checksum ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.


3.4 SYSTEM REQUIREMENTS

Most tools require:
* Perl v5.8 or newer
* Bash v3 or newer
* Core Perl modules like Time::HiRes
Tools that connect to MySQL require:
* Perl modules DBI and DBD::mysql
* MySQL 5.0 or newer
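A quick way to check that the Perl prerequisites are present is to try loading the modules (this one-liner is only an
illustration, not part of the toolkit):
perl -MDBI -MDBD::mysql -MTime::HiRes -e 'print "required Perl modules found\n"'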
Percona Toolkit is only tested on UNIX systems, primarily Debian and Red Hat derivatives; other operating systems
are not supported.



Tools that connect to MySQL may work with MySQL v4.1, but this is not tested or supported.




CHAPTER FOUR

MISCELLANEOUS

4.1 BUGS

Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
* Complete command-line used to run the tool
* Tool --version
* MySQL version of all servers involved
* Output from the tool including STDERR
* Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


4.2 AUTHORS

Percona Toolkit is primarily developed by Baron Schwartz and Daniel Nichter, both of whom are employed by Percona
Inc. See each program’s documentation for details.


4.3 COPYRIGHT, LICENSE, AND WARRANTY

Percona Toolkit is copyright 2011-2012 Percona Inc. and others. See each program’s documentation for complete
copyright notices.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN-
CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.







4.4 VERSION

Percona Toolkit v2.1.1 released 2012-04-03


4.5 Release Notes

4.5.1 v2.1.1 released 2012-04-03

Percona Toolkit 2.1.1 has been released. This is the first release in the new 2.1 series, which supersedes the 2.0 series.
We will continue to fix bugs in 2.0, but 2.1 is now the focus of development.
2.1 introduces a lot of new code for:
    • pt-online-schema-change (completely redesigned)
    • pt-mysql-summary (completely redesigned)
    • pt-summary (completely redesigned)
    • pt-fingerprint (new tool)
    • pt-table-usage (new tool)
There were also several bug fixes.
The redesigned tools are meant to replace their 2.0 counterparts: the 2.1 versions offer the same or greater
functionality and are simpler and more reliable. pt-online-schema-change in particular was enhanced to be as safe as
possible, given that the tool is inherently risky.
Percona Toolkit packages can be downloaded from http://www.percona.com/downloads/percona-toolkit/ or the
Percona Software Repositories (http://www.percona.com/software/repositories/).


Changelog

    • Completely redesigned pt-online-schema-change
    • Completely redesigned pt-mysql-summary
    • Completely redesigned pt-summary
    • Added new tool: pt-table-usage
    • Added new tool: pt-fingerprint
    • Fixed bug 955860: pt-stalk doesn’t run vmstat, iostat, and mpstat for --run-time
    • Fixed bug 960513: SHOW TABLE STATUS is used needlessly
    • Fixed bug 969726: pt-online-schema-change loses foreign keys
    • Fixed bug 846028: pt-online-schema-change does not show progress until completed
    • Fixed bug 898695: pt-online-schema-change add useless ORDER BY
    • Fixed bug 952727: pt-diskstats shows incorrect wr_mb_s
    • Fixed bug 963225: pt-query-digest fails to set history columns for disk tmp tables and disk filesort
    • Fixed bug 967451: Char chunking doesn’t quote column name
    • Fixed bug 972399: pt-table-checksum docs are not rendered right





    • Fixed bug 896553: Various documentation spelling fixes
    • Fixed bug 949154: pt-variable-advisor advice for relay-log-space-limit
    • Fixed bug 953461: pt-upgrade manual broken ‘output’ section
    • Fixed bug 949653: pt-table-checksum docs don’t mention risks posed by inconsistent schemas


4.5.2 v2.0.4 released 2012-03-07

Percona Toolkit 2.0.4 has been released. 23 bugs were fixed in this release, and three new features were implemented.
First, --filter was added to pt-kill, which allows for arbitrary --group-by. Second, pt-online-schema-change now
requires that its new --execute option be given, else the tool will just check the tables and exit. This is a safeguard to
encourage users to read the documentation, particularly when replication is involved. Third, pt-stalk also received a
new option: --[no]stalk. To collect immediately without stalking, specify --no-stalk and the tool will collect once and
exit.
This release is completely backwards compatible with previous 2.0 releases. Given the number of bug fixes, it’s worth
upgrading to 2.0.4.
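A brief sketch of the new safeguards described above, assuming the current pt-online-schema-change invocation
style (the database and table names are placeholders):
pt-online-schema-change --alter "ADD COLUMN c1 INT" D=test,t=tbl            # checks the tables and exits
pt-online-schema-change --alter "ADD COLUMN c1 INT" D=test,t=tbl --execute  # actually performs the change
pt-stalk --no-stalk                                                         # collect once and exit, without stalking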


Changelog

    • Added --filter to pt-kill to allow arbitrary --group-by
    • Added --[no]stalk to pt-stalk (bug 932331)
    • Added --execute to pt-online-schema-change (bug 933232)
    • Fixed bug 873598: pt-online-schema-change doesn’t like reserved words in column names
    • Fixed bug 928966: pt-pmp still uses insecure /tmp
    • Fixed bug 933232: pt-online-schema-change can break replication
    • Fixed bug 941225: Use of qw(...) as parentheses is deprecated at pt-kill line 3511
    • Fixed bug 821694: pt-query-digest doesn’t recognize hex InnoDB txn IDs
    • Fixed bug 894255: pt-kill shouldn’t check if STDIN is a tty when --daemonize is given
    • Fixed bug 916999: pt-table-checksum error: DBD::mysql::st execute failed: called with 2 bind variables when
      6 are needed
    • Fixed bug 926598: DBD::mysql bug causes pt-upgrade to use wrong precision (M) and scale (D)
    • Fixed bug 928226: pt-diskstats illegal division by zero
    • Fixed bug 928415: Typo in pt-stalk doc: --trigger should be --function
    • Fixed bug 930317: pt-archiver doc refers to nonexistent pt-query-profiler
    • Fixed bug 930533: pt-sift looking for *-processlist1; broken compatibility with pt-stalk
    • Fixed bug 932331: pt-stalk cannot collect without stalking
    • Fixed bug 932442: pt-table-checksum error when column name has two spaces
    • Fixed bug 932883: File Debian bug after each release
    • Fixed bug 940503: pt-stalk disk space checks wrong on 32bit platforms
    • Fixed bug 944420: --daemonize doesn’t always close STDIN
    • Fixed bug 945834: pt-sift invokes pt-diskstats with deprecated argument
    • Fixed bug 945836: pt-sift prints awk error if there are no stack traces to aggregate




    • Fixed bug 945842: pt-sift generates wrong state sum during processlist analysis
    • Fixed bug 946438: pt-query-digest should print a better message when an unsupported log format is specified
    • Fixed bug 946776: pt-table-checksum ignores --lock-wait-timeout
    • Fixed bug 940440: Bad grammar in pt-kill docs


4.5.3 v2.0.3 released 2012-02-03

Percona Toolkit 2.0.3 has been released. The development team was very busy last month making this release
significant: two completely redesigned and improved tools, pt-diskstats and pt-stalk, and 20 bug fixes.
Both pt-diskstats and pt-stalk were redesigned and rewritten from the ground up. This allowed us to greatly improve
these tools’ functionality and increase testing for them. The accuracy and output of pt-diskstats were enhanced, and the
tool was rewritten in Perl. pt-collect was removed and its functionality was put into a new, enhanced pt-stalk. pt-stalk
is now designed to be a stable, long-running daemon on a variety of common platforms. It is worth re-reading the
documentation for each of these tools.
The 20 bug fixes cover a wide range of problems. The most important are fixes to pt-table-checksum, pt-iostats,
and pt-kill. Apart from pt-diskstats, pt-stalk, and pt-collect (which was removed), no other tools were changed in
backwards-incompatible ways, so it is worth reviewing the full changelog for this release and upgrading if you use
any tools which had bug fixes.
Thank you to the many people who reported bugs and submitted patches.
Download the latest release of Percona Toolkit 2.0 from http://www.percona.com/software/percona-toolkit/ or the
Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).


Changelog

    • Completely redesigned pt-diskstats
    • Completely redesigned pt-stalk
    • Removed pt-collect and put its functionality in pt-stalk
    • Fixed bug 871438: Bash tools are insecure
    • Fixed bug 897758: Failed to prepare TableSyncChunk plugin: Use of uninitialized value $args{“chunk_range”}
      in lc at pt-table-sync line 3055
    • Fixed bug 919819: pt-kill --execute-command creates zombies
    • Fixed bug 925778: pt-ioprofile doesn’t run without a file
    • Fixed bug 925477: pt-ioprofile docs refer to pt-iostats
    • Fixed bug 857091: pt-sift downloads http://percona.com/get/pt-pmp, which does not work
    • Fixed bug 857104: pt-sift tries to invoke mext, should be pt-mext
    • Fixed bug 872699: pt-diskstats: rd_avkb & wr_avkb derived incorrectly
    • Fixed bug 897029: pt-diskstats computes wrong values for md0
    • Fixed bug 882918: pt-stalk spams warning if oprofile isn’t installed
    • Fixed bug 884504: pt-stalk doesn’t check pt-collect
    • Fixed bug 897483: pt-online-schema-change “uninitialized value” due to update-foreign-keys-method






    • Fixed bug 925007: pt-online-schema-change Use of uninitialized value $tables{“old_table”} in concatenation
      (.) or string at line 4330
    • Fixed bug 915598: pt-config-diff ignores –ask-pass option
    • Fixed bug 919352: pt-table-checksum changes binlog_format even if already set to statement
    • Fixed bug 921700: pt-table-checksum doesn’t add --where to chunk size test on replicas
    • Fixed bug 921802: pt-table-checksum does not recognize –recursion-method=processlist
    • Fixed bug 925855: pt-table-checksum index check is case-sensitive
    • Fixed bug 821709: pt-show-grants --revoke and --separate don’t work together
    • Fixed bug 918247: Some tools use VALUE instead of VALUES


4.5.4 v2.0.2 released 2012-01-05

Percona Toolkit 2.0.2 fixes one critical bug: pt-table-sync --replicate did not work with character values, causing an
“Unknown column” error. If using Percona Toolkit 2.0.1, you should upgrade to 2.0.2.
Download the latest release of Percona Toolkit 2.0 from http://www.percona.com/software/percona-toolkit/ or the
Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).


Changelog

    • Fixed bug 911996: pt-table-sync --replicate causes “Unknown column” error


4.5.5 v2.0.1 released 2011-12-30

The Percona Toolkit development team is proud to announce a new major version: 2.0. Beginning with Percona
Toolkit 2.0, we are overhauling, redesigning, and improving the major tools. 2.0 tools are therefore not backwards
compatible with 1.0 tools, which we still support but will not continue to develop.
New in Percona Toolkit 2.0.1 is a completely redesigned pt-table-checksum. The original pt-table-checksum 1.0 was
rather complex, but it worked well for many years. By contrast, the new pt-table-checksum 2.0 is much simpler but
also much more efficient and reliable. We spent months rethinking, redesigning, and testing every aspect of the tool.
The three most significant changes: pt-table-checksum 2.0 does only --replicate, it has only one chunking algorithm,
and its memory usage is stable even with hundreds of thousands of tables and trillions of rows. The tool is now
dedicated to verifying MySQL replication integrity, nothing else, which it does extremely well.
In Percona Toolkit 2.0.1 we also fixed various small bugs and forked ioprofile and align (as pt-ioprofile and pt-align)
from Aspersa.
If you still need functionality from the original pt-table-checksum, the latest Percona Toolkit 1.0 release remains
available for download. Otherwise, all new development in Percona Toolkit will happen in 2.0.
Download the latest release of Percona Toolkit 2.0 from http://www.percona.com/software/percona-toolkit/ or the
Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).


Changelog

    • Completely redesigned pt-table-checksum
    • Fixed bug 856065: pt-trend does not work
    • Fixed bug 887688: Prepared statements crash pt-query-digest




    • Fixed bug 888286: align not part of percona-toolkit
    • Fixed bug 897961: ptc 2.0 replicate-check error does not include hostname
    • Fixed bug 898318: ptc 2.0 --resume with --tables does not always work
    • Fixed bug 903513: MKDEBUG should be PTDEBUG
    • Fixed bug 908256: Percona Toolkit should include pt-ioprofile
    • Fixed bug 821717: pt-tcp-model --type=requests crashes
    • Fixed bug 844038: pt-online-schema-change documentation example w/drop-tmp-table does not work
    • Fixed bug 864205: Remove the query to reset @crc from pt-table-checksum
    • Fixed bug 898663: Typo in pt-log-player documentation


4.5.6 v1.0.1 released 2011-09-01

Percona Toolkit 1.0.1 has been released. In July, Baron announced planned changes to Maatkit and Aspersa
development;[1] Percona Toolkit is the result. In brief, Percona Toolkit is the combined fork of Maatkit and Aspersa,
so although the toolkit is new, the programs are not. That means Percona Toolkit 1.0.1 is mature, stable, and
production-ready. In fact, it’s even a little more stable because we fixed a few bugs in this release.
Percona Toolkit packages can be downloaded from http://www.percona.com/downloads/percona-toolkit/ or the
Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).
Although Maatkit and Aspersa development used Google Code, Percona Toolkit uses Launchpad:
https://launchpad.net/percona-toolkit
[1] http://www.xaprb.com/blog/2011/07/06/planned-change-in-maatkit-aspersa-development/


Changelog

    • Fixed bug 819421: MasterSlave::is_replication_thread() doesn’t match all
    • Fixed bug 821673: pt-table-checksum doesn’t include --where in min max queries
    • Fixed bug 821688: pt-table-checksum SELECT MIN MAX for char chunking is wrong
    • Fixed bug 838211: pt-collect: line 24: [: : integer expression expected
    • Fixed bug 838248: pt-collect creates a “5.1” file


4.5.7 v0.9.5 released 2011-08-04

Percona Toolkit 0.9.5 represents the completed transition from Maatkit and Aspersa. There are no bug fixes or new
features, but some features have been removed (like --save-results from pt-query-digest). This release is the starting
point for the 1.0 series where new development will happen, and no more changes will be made to the 0.9 series.


Changelog

    • Forked, combined, and rebranded Maatkit and Aspersa as Percona Toolkit.




INDEX


Symbols                                                       pt-table-usage command line option, 248
–aggregate                                                    pt-upgrade command line option, 263
     pt-ioprofile command line option, 92                      pt-variable-advisor command line option, 277
–algorithms                                                   pt-visual-explain command line option, 288
     pt-table-sync command line option, 233             –attribute-aliases
–all-structs                                                  pt-query-digest command line option, 153
     pt-duplicate-key-checker command line option, 47   –attribute-value-limit
–alter                                                        pt-query-digest command line option, 153
     pt-online-schema-change command line option, 127   –autoinc
–alter-foreign-keys-method                                    pt-find command line option, 58
     pt-online-schema-change command line option, 127   –aux-dsn
–always                                                       pt-query-digest command line option, 153
     pt-slave-restart command line option, 195          –avgrowlen
–analyze                                                      pt-find command line option, 58
     pt-archiver command line option, 9                 –base-dir
–any-busy-time                                                pt-log-player command line option, 108
     pt-kill command line option, 103                         pt-upgrade command line option, 263
–apdex-threshold                                        –base-file-name
     pt-query-digest command line option, 153                 pt-log-player command line option, 108
–ascend-first                                            –bidirectional
     pt-archiver command line option, 9                       pt-table-sync command line option, 234
–ask-pass                                               –buffer
     pt-archiver command line option, 10                      pt-archiver command line option, 10
     pt-config-diff command line option, 26              –buffer-in-mysql
     pt-deadlock-logger command line option, 32               pt-table-sync command line option, 234
     pt-duplicate-key-checker command line option, 47   –bulk-delete
     pt-find command line option, 56                           pt-archiver command line option, 10
     pt-fk-error-logger command line option, 69         –bulk-insert
     pt-heartbeat command line option, 75                     pt-archiver command line option, 10
     pt-index-usage command line option, 84             –busy-time
     pt-kill command line option, 97                          pt-kill command line option, 100
     pt-log-player command line option, 108             –case-insensitive
     pt-online-schema-change command line option, 128         pt-find command line option, 56
     pt-query-advisor command line option, 142          –cell
     pt-query-digest command line option, 153                 pt-ioprofile command line option, 92
     pt-show-grants command line option, 175            –charset
     pt-slave-delay command line option, 184                  pt-archiver command line option, 10
     pt-slave-find command line option, 189                    pt-config-diff command line option, 26
     pt-slave-restart command line option, 195                pt-deadlock-logger command line option, 32
     pt-table-checksum command line option, 218               pt-duplicate-key-checker command line option, 47
     pt-table-sync command line option, 233                   pt-find command line option, 56
                                                              pt-fk-error-logger command line option, 69


                                                                                                           305
Percona Toolkit Documentation, Release 2.1.1


     pt-heartbeat command line option, 75               –collation
     pt-index-usage command line option, 84                  pt-find command line option, 58
     pt-kill command line option, 97                    –collect
     pt-log-player command line option, 108                  pt-stalk command line option, 203
     pt-online-schema-change command line option, 129   –collect-gdb
     pt-query-advisor command line option, 142               pt-stalk command line option, 203
     pt-query-digest command line option, 154           –collect-oprofile
     pt-show-grants command line option, 176                 pt-stalk command line option, 204
     pt-slave-delay command line option, 184            –collect-strace
     pt-slave-find command line option, 189                   pt-stalk command line option, 204
     pt-slave-restart command line option, 195          –collect-tcpdump
     pt-table-sync command line option, 234                  pt-stalk command line option, 204
     pt-table-usage command line option, 248            –column-name
     pt-upgrade command line option, 263                     pt-find command line option, 58
     pt-variable-advisor command line option, 277       –column-type
     pt-visual-explain command line option, 288              pt-find command line option, 58
–check                                                  –columns
     pt-heartbeat command line option, 75                    pt-archiver command line option, 11
–check-attributes-limit                                      pt-deadlock-logger command line option, 33
     pt-query-digest command line option, 154                pt-table-checksum command line option, 219
–check-interval                                              pt-table-sync command line option, 235
     pt-archiver command line option, 11                –columns-regex
     pt-online-schema-change command line option, 129        pt-diskstats command line option, 43
     pt-table-checksum command line option, 218         –comment
–check-slave-lag                                             pt-find command line option, 58
     pt-archiver command line option, 11                –commit-each
     pt-online-schema-change command line option, 129        pt-archiver command line option, 11
     pt-table-checksum command line option, 218         –compare
–checksum                                                    pt-upgrade command line option, 263
     pt-find command line option, 58                     –compare-results-method
–chunk-column                                                pt-upgrade command line option, 264
     pt-table-sync command line option, 235             –config
–chunk-index                                                 pt-archiver command line option, 11
     pt-online-schema-change command line option, 129        pt-config-diff command line option, 26
     pt-table-checksum command line option, 218              pt-deadlock-logger command line option, 33
     pt-table-sync command line option, 235                  pt-diskstats command line option, 43
–chunk-size                                                  pt-duplicate-key-checker command line option, 47
     pt-online-schema-change command line option, 129        pt-fifo-split command line option, 52
     pt-table-checksum command line option, 218              pt-find command line option, 56
     pt-table-sync command line option, 235                  pt-fingerprint command line option, 66
–chunk-size-limit                                            pt-fk-error-logger command line option, 69
     pt-online-schema-change command line option, 129        pt-heartbeat command line option, 75
     pt-table-checksum command line option, 219              pt-index-usage command line option, 84
–chunk-time                                                  pt-kill command line option, 97
     pt-online-schema-change command line option, 130        pt-log-player command line option, 108
     pt-table-checksum command line option, 219              pt-mysql-summary command line option, 124
–clear-deadlocks                                             pt-online-schema-change command line option, 130
     pt-deadlock-logger command line option, 32              pt-query-advisor command line option, 142
–clear-warnings-table                                        pt-query-digest command line option, 154
     pt-upgrade command line option, 263                     pt-show-grants command line option, 176
–clustered-pk                                                pt-slave-delay command line option, 184
     pt-visual-explain command line option, 288              pt-slave-find command line option, 189
–cmin                                                        pt-slave-restart command line option, 196
     pt-find command line option, 58                          pt-stalk command line option, 204


306                                                                                                    Index
Percona Toolkit Documentation, Release 2.1.1


      pt-summary command line option, 212                     pt-kill command line option, 97
      pt-table-checksum command line option, 219              pt-query-advisor command line option, 142
      pt-table-sync command line option, 235                  pt-query-digest command line option, 154
      pt-table-usage command line option, 248                 pt-slave-delay command line option, 184
      pt-tcp-model command line option, 255                   pt-slave-restart command line option, 196
      pt-trend command line option, 259                       pt-stalk command line option, 204
      pt-upgrade command line option, 264                     pt-table-usage command line option, 249
      pt-variable-advisor command line option, 278            pt-upgrade command line option, 264
      pt-visual-explain command line option, 288              pt-variable-advisor command line option, 278
–conflict-column                                          –database
      pt-table-sync command line option, 235                  pt-heartbeat command line option, 76
–conflict-comparison                                           pt-index-usage command line option, 84
      pt-table-sync command line option, 235                  pt-query-advisor command line option, 142
–conflict-error                                                pt-show-grants command line option, 176
      pt-table-sync command line option, 236                  pt-slave-find command line option, 190
–conflict-threshold                                            pt-slave-restart command line option, 196
      pt-table-sync command line option, 236                  pt-table-usage command line option, 249
–conflict-value                                                pt-visual-explain command line option, 288
      pt-table-sync command line option, 236             –databases
–connect                                                      pt-duplicate-key-checker command line option, 48
      pt-visual-explain command line option, 288              pt-index-usage command line option, 84
–connection-id                                                pt-mysql-summary command line option, 124
      pt-find command line option, 58                          pt-table-checksum command line option, 219
–constant-data-value                                          pt-table-sync command line option, 236
      pt-table-usage command line option, 248            –databases-regex
–continue-on-error                                            pt-index-usage command line option, 84
      pt-upgrade command line option, 264                     pt-table-checksum command line option, 219
–convert-to-select                                       –datafree
      pt-upgrade command line option, 264                     pt-find command line option, 59
–create-dest-table                                       –datasize
      pt-deadlock-logger command line option, 33              pt-find command line option, 59
–create-review-history-table                             –day-start
      pt-query-digest command line option, 154                pt-find command line option, 56
–create-review-table                                     –dbi-driver
      pt-query-digest command line option, 154                pt-heartbeat command line option, 76
–create-save-results-database                            –dblike
      pt-index-usage command line option, 84                  pt-find command line option, 59
–create-table                                            –dbregex
      pt-heartbeat command line option, 75                    pt-find command line option, 59
–create-table-definitions                                 –defaults-file
      pt-table-usage command line option, 248                 pt-config-diff command line option, 26
–createopts                                                   pt-deadlock-logger command line option, 33
      pt-find command line option, 59                          pt-duplicate-key-checker command line option, 48
–critical-load                                                pt-find command line option, 56
      pt-online-schema-change command line option, 130        pt-fk-error-logger command line option, 69
–ctime                                                        pt-heartbeat command line option, 76
      pt-find command line option, 59                          pt-index-usage command line option, 84
–cycles                                                       pt-kill command line option, 97
      pt-stalk command line option, 204                       pt-log-player command line option, 108
–daemonize                                                    pt-online-schema-change command line option, 130
      pt-config-diff command line option, 26                   pt-query-advisor command line option, 142
      pt-deadlock-logger command line option, 33              pt-query-digest command line option, 154
      pt-fk-error-logger command line option, 69              pt-show-grants command line option, 176
      pt-heartbeat command line option, 76                    pt-slave-delay command line option, 184


Index                                                                                                     307
Percona Toolkit Documentation, Release 2.1.1


     pt-slave-find command line option, 190              –exec-plus
     pt-slave-restart command line option, 196               pt-find command line option, 61
     pt-table-checksum command line option, 219         –execute
     pt-table-sync command line option, 236                  pt-online-schema-change command line option, 130
     pt-table-usage command line option, 249                 pt-query-digest command line option, 155
     pt-variable-advisor command line option, 278            pt-table-sync command line option, 237
     pt-visual-explain command line option, 288         –execute-command
–delay                                                       pt-kill command line option, 103
     pt-slave-delay command line option, 184            –execute-throttle
–delayed-insert                                              pt-query-digest command line option, 155
     pt-archiver command line option, 12                –expected-range
–dest                                                        pt-query-digest command line option, 155
     pt-archiver command line option, 12                –explain
     pt-deadlock-logger command line option, 33              pt-query-digest command line option, 156
     pt-fk-error-logger command line option, 69              pt-table-checksum command line option, 220
     pt-stalk command line option, 204                  –explain-extended
–devices-regex                                               pt-table-usage command line option, 249
     pt-diskstats command line option, 43               –explain-hosts
–disk-bytes-free                                             pt-table-sync command line option, 237
     pt-stalk command line option, 204                       pt-upgrade command line option, 264
–disk-pct-free                                          –fifo
     pt-stalk command line option, 204                       pt-fifo-split command line option, 52
–drop                                                   –file
     pt-index-usage command line option, 84                  pt-archiver command line option, 12
     pt-show-grants command line option, 176                 pt-heartbeat command line option, 76
–dry-run                                                –filter
     pt-archiver command line option, 12                     pt-kill command line option, 97
     pt-log-player command line option, 108                  pt-log-player command line option, 108
     pt-online-schema-change command line option, 130        pt-query-digest command line option, 156
     pt-table-sync command line option, 237                  pt-table-usage command line option, 249
–each-busy-time                                              pt-upgrade command line option, 265
     pt-kill command line option, 103                   –fingerprints
–embedded-attributes                                         pt-query-digest command line option, 157
     pt-query-digest command line option, 154                pt-upgrade command line option, 265
–empty                                                  –float-precision
     pt-find command line option, 59                          pt-table-checksum command line option, 220
–empty-save-results-tables                                   pt-table-sync command line option, 237
     pt-index-usage command line option, 85                  pt-upgrade command line option, 265
–engine                                                 –flush
     pt-find command line option, 59                          pt-show-grants command line option, 176
–engines                                                –for-update
     pt-duplicate-key-checker command line option, 48        pt-archiver command line option, 12
     pt-table-checksum command line option, 220         –force
     pt-table-sync command line option, 237                  pt-fifo-split command line option, 53
–error-length                                           –format
     pt-slave-restart command line option, 196               pt-visual-explain command line option, 288
–error-numbers                                          –frames
     pt-slave-restart command line option, 196               pt-heartbeat command line option, 76
–error-text                                             –function
     pt-slave-restart command line option, 196               pt-find command line option, 60
–exec                                                        pt-stalk command line option, 204
     pt-find command line option, 61                          pt-table-checksum command line option, 220
–exec-dsn                                                    pt-table-sync command line option, 237
     pt-find command line option, 61                     –group-by


308                                                                                                    Index
Percona Toolkit Documentation, Release 2.1.1


     pt-diskstats command line option, 43                    pt-log-player command line option, 110
     pt-ioprofile command line option, 92                     pt-online-schema-change command line option, 131
     pt-kill command line option, 98                         pt-query-advisor command line option, 143
     pt-query-advisor command line option, 143               pt-query-digest command line option, 158
     pt-query-digest command line option, 158                pt-show-grants command line option, 176
–header                                                      pt-slave-delay command line option, 184
     pt-archiver command line option, 13                     pt-slave-find command line option, 190
–headers                                                     pt-slave-restart command line option, 196
     pt-diskstats command line option, 43                    pt-table-checksum command line option, 220
–help                                                        pt-table-sync command line option, 238
     pt-archiver command line option, 13                     pt-table-usage command line option, 249
     pt-config-diff command line option, 26                   pt-upgrade command line option, 265
     pt-deadlock-logger command line option, 34              pt-variable-advisor command line option, 278
     pt-diskstats command line option, 44                    pt-visual-explain command line option, 288
     pt-duplicate-key-checker command line option, 48   –id-attribute
     pt-fifo-split command line option, 53                    pt-table-usage command line option, 249
     pt-find command line option, 56                     –idle-time
     pt-fingerprint command line option, 66                   pt-kill command line option, 100
     pt-fk-error-logger command line option, 69         –ignore
     pt-heartbeat command line option, 77                    pt-archiver command line option, 13
     pt-index-usage command line option, 85                  pt-show-grants command line option, 176
     pt-ioprofile command line option, 93                –ignore-attributes
     pt-kill command line option, 98                         pt-query-digest command line option, 158
     pt-log-player command line option, 110             –ignore-columns
     pt-mysql-summary command line option, 124               pt-table-checksum command line option, 220
     pt-online-schema-change command line option, 131        pt-table-sync command line option, 238
     pt-query-advisor command line option, 143          –ignore-command
     pt-query-digest command line option, 158                pt-kill command line option, 100
     pt-show-grants command line option, 176            –ignore-databases
     pt-slave-delay command line option, 184                 pt-duplicate-key-checker command line option, 48
     pt-slave-find command line option, 190                   pt-index-usage command line option, 85
     pt-slave-restart command line option, 196               pt-table-checksum command line option, 221
     pt-stalk command line option, 205                       pt-table-sync command line option, 238
     pt-summary command line option, 212                –ignore-databases-regex
     pt-table-checksum command line option, 220              pt-index-usage command line option, 85
     pt-table-sync command line option, 237                  pt-table-checksum command line option, 221
     pt-table-usage command line option, 249            –ignore-db
     pt-tcp-model command line option, 255                   pt-kill command line option, 100
     pt-trend command line option, 259                  –ignore-engines
     pt-upgrade command line option, 265                     pt-duplicate-key-checker command line option, 48
     pt-variable-advisor command line option, 278            pt-table-checksum command line option, 221
     pt-visual-explain command line option, 288              pt-table-sync command line option, 238
–high-priority-select                                   –ignore-host
     pt-archiver command line option, 13                     pt-kill command line option, 100
–host                                                   –ignore-info
     pt-archiver command line option, 13                     pt-kill command line option, 101
     pt-config-diff command line option, 26              –ignore-order
     pt-deadlock-logger command line option, 34              pt-duplicate-key-checker command line option, 48
     pt-duplicate-key-checker command line option, 48   –ignore-rules
     pt-find command line option, 57                          pt-query-advisor command line option, 143
     pt-fk-error-logger command line option, 70              pt-variable-advisor command line option, 278
     pt-heartbeat command line option, 77               –ignore-state
     pt-index-usage command line option, 85                  pt-kill command line option, 101
     pt-kill command line option, 98                    –ignore-tables


Index                                                                                                    309
Percona Toolkit Documentation, Release 2.1.1


      pt-duplicate-key-checker command line option, 48          pt-table-checksum command line option, 221
      pt-index-usage command line option, 85             –log
      pt-table-checksum command line option, 221             pt-deadlock-logger command line option, 34
      pt-table-sync command line option, 238                 pt-fk-error-logger command line option, 70
–ignore-tables-regex                                         pt-heartbeat command line option, 77
      pt-index-usage command line option, 85                 pt-kill command line option, 98
      pt-table-checksum command line option, 221             pt-query-digest command line option, 159
–ignore-user                                                 pt-slave-delay command line option, 185
      pt-kill command line option, 101                       pt-slave-restart command line option, 197
–ignore-variables                                            pt-stalk command line option, 205
      pt-config-diff command line option, 26                  pt-table-usage command line option, 249
–indexsize                                                   pt-upgrade command line option, 266
      pt-find command line option, 60                     –low-priority-delete
–inherit-attributes                                          pt-archiver command line option, 13
      pt-query-digest command line option, 158           –low-priority-insert
–interval                                                    pt-archiver command line option, 13
      pt-deadlock-logger command line option, 34         –master-server-id
      pt-diskstats command line option, 44                   pt-heartbeat command line option, 77
      pt-fk-error-logger command line option, 70         –match
      pt-heartbeat command line option, 77                   pt-stalk command line option, 206
      pt-kill command line option, 98                    –match-all
      pt-query-digest command line option, 159               pt-kill command line option, 101
      pt-slave-delay command line option, 185            –match-command
      pt-stalk command line option, 205                      pt-kill command line option, 101
–iterations                                              –match-db
      pt-diskstats command line option, 44                   pt-kill command line option, 101
      pt-log-player command line option, 110             –match-embedded-numbers
      pt-query-digest command line option, 159               pt-fingerprint command line option, 66
      pt-stalk command line option, 205                  –match-host
      pt-upgrade command line option, 266                    pt-kill command line option, 102
–key-types                                               –match-info
      pt-duplicate-key-checker command line option, 48       pt-kill command line option, 102
–kill                                                    –match-md5-checksums
      pt-kill command line option, 103                       pt-fingerprint command line option, 66
–kill-query                                              –match-state
      pt-kill command line option, 104                       pt-kill command line option, 102
–kmin                                                    –match-user
      pt-find command line option, 60                         pt-kill command line option, 102
–ktime                                                   –max-different-rows
      pt-find command line option, 60                         pt-upgrade command line option, 266
–limit                                                   –max-lag
      pt-archiver command line option, 13                    pt-archiver command line option, 13
      pt-query-digest command line option, 159               pt-online-schema-change command line option, 131
      pt-upgrade command line option, 266                    pt-table-checksum command line option, 221
–lines                                                   –max-load
      pt-fifo-split command line option, 53                   pt-online-schema-change command line option, 131
–local                                                       pt-table-checksum command line option, 221
      pt-archiver command line option, 13                –max-sessions
–lock                                                        pt-log-player command line option, 110
      pt-table-sync command line option, 238             –max-sleep
–lock-and-rename                                             pt-slave-restart command line option, 197
      pt-table-sync command line option, 239             –min-sleep
–lock-wait-timeout                                           pt-slave-restart command line option, 197
      pt-online-schema-change command line option, 131   –mirror


310                                                                                                          Index
Percona Toolkit Documentation, Release 2.1.1


     pt-query-digest command line option, 159                  pt-visual-explain command line option, 288
–mmin                                                   –pid
     pt-find command line option, 60                          pt-archiver command line option, 14
–monitor                                                     pt-config-diff command line option, 26
     pt-heartbeat command line option, 77                    pt-deadlock-logger command line option, 34
     pt-slave-restart command line option, 197               pt-duplicate-key-checker command line option, 48
–mtime                                                       pt-fifo-split command line option, 53
     pt-find command line option, 60                          pt-find command line option, 57
–no-ascend                                                   pt-fk-error-logger command line option, 70
     pt-archiver command line option, 14                     pt-heartbeat command line option, 78
–no-delete                                                   pt-kill command line option, 98
     pt-archiver command line option, 14                     pt-log-player command line option, 110
–notify-by-email                                             pt-online-schema-change command line option, 131
     pt-stalk command line option, 206                       pt-query-advisor command line option, 143
–numeric-ip                                                  pt-query-digest command line option, 160
     pt-deadlock-logger command line option, 34              pt-show-grants command line option, 177
–offset                                                      pt-slave-delay command line option, 185
     pt-fifo-split command line option, 53                    pt-slave-find command line option, 190
–only                                                        pt-slave-restart command line option, 197
     pt-show-grants command line option, 176                 pt-stalk command line option, 206
–only-select                                                 pt-table-checksum command line option, 222
     pt-log-player command line option, 110                  pt-table-sync command line option, 239
–optimize                                                    pt-table-usage command line option, 249
     pt-archiver command line option, 14                     pt-trend command line option, 259
–or                                                          pt-upgrade command line option, 266
     pt-find command line option, 57                          pt-variable-advisor command line option, 278
–order-by                                                    pt-visual-explain command line option, 289
     pt-query-digest command line option, 159           –pipeline-profile
     pt-upgrade command line option, 266                     pt-query-digest command line option, 160
–outliers                                               –play
     pt-query-digest command line option, 160                pt-log-player command line option, 110
–password                                               –plugin
     pt-archiver command line option, 14                     pt-archiver command line option, 14
     pt-config-diff command line option, 26              –port
     pt-deadlock-logger command line option, 34              pt-archiver command line option, 15
     pt-duplicate-key-checker command line option, 48        pt-config-diff command line option, 26
     pt-find command line option, 57                          pt-deadlock-logger command line option, 34
     pt-fk-error-logger command line option, 70              pt-duplicate-key-checker command line option, 48
     pt-heartbeat command line option, 78                    pt-find command line option, 57
     pt-index-usage command line option, 85                  pt-fk-error-logger command line option, 70
     pt-kill command line option, 98                         pt-heartbeat command line option, 78
     pt-log-player command line option, 110                  pt-index-usage command line option, 85
     pt-online-schema-change command line option, 131        pt-kill command line option, 99
     pt-query-advisor command line option, 143               pt-log-player command line option, 110
     pt-query-digest command line option, 160                pt-online-schema-change command line option, 132
     pt-show-grants command line option, 176                 pt-query-advisor command line option, 143
     pt-slave-delay command line option, 185                 pt-query-digest command line option, 160
     pt-slave-find command line option, 190                   pt-show-grants command line option, 177
     pt-slave-restart command line option, 197               pt-slave-delay command line option, 185
     pt-table-checksum command line option, 222              pt-slave-find command line option, 190
     pt-table-sync command line option, 239                  pt-slave-restart command line option, 197
     pt-table-usage command line option, 249                 pt-table-checksum command line option, 222
     pt-upgrade command line option, 266                     pt-table-sync command line option, 239
     pt-variable-advisor command line option, 278            pt-table-usage command line option, 249


Index                                                                                                       311
Percona Toolkit Documentation, Release 2.1.1


     pt-upgrade command line option, 266                –quiet
     pt-variable-advisor command line option, 278            pt-archiver command line option, 15
     pt-visual-explain command line option, 289              pt-index-usage command line option, 85
–prefix                                                       pt-log-player command line option, 110
     pt-stalk command line option, 206                       pt-online-schema-change command line option, 132
–primary-key-only                                            pt-slave-delay command line option, 185
     pt-archiver command line option, 15                     pt-slave-restart command line option, 197
–print                                                       pt-table-checksum command line option, 222
     pt-deadlock-logger command line option, 34              pt-trend command line option, 259
     pt-find command line option, 62                     –read-samples
     pt-fk-error-logger command line option, 70              pt-mysql-summary command line option, 124
     pt-kill command line option, 104                        pt-summary command line option, 212
     pt-log-player command line option, 110             –read-timeout
     pt-online-schema-change command line option, 132        pt-query-digest command line option, 161
     pt-query-digest command line option, 160                pt-table-usage command line option, 250
     pt-table-sync command line option, 239             –recurse
–print-all                                                   pt-heartbeat command line option, 78
     pt-query-advisor command line option, 143               pt-online-schema-change command line option, 132
–print-iterations                                            pt-slave-find command line option, 190
     pt-query-digest command line option, 160                pt-slave-restart command line option, 197
–print-master-server-id                                      pt-table-checksum command line option, 222
     pt-heartbeat command line option, 78               –recursion-method
–printf                                                      pt-heartbeat command line option, 78
     pt-find command line option, 62                          pt-online-schema-change command line option, 132
–procedure                                                   pt-slave-find command line option, 190
     pt-find command line option, 60                          pt-slave-restart command line option, 197
–processlist                                                 pt-table-checksum command line option, 222
     pt-query-digest command line option, 161                pt-table-sync command line option, 239
–profile-pid                                             –replace
     pt-ioprofile command line option, 93                     pt-archiver command line option, 15
–profile-process                                              pt-heartbeat command line option, 78
     pt-ioprofile command line option, 93                     pt-table-sync command line option, 240
–progress                                               –replicate
     pt-archiver command line option, 15                     pt-table-checksum command line option, 223
     pt-index-usage command line option, 85                  pt-table-sync command line option, 240
     pt-online-schema-change command line option, 132   –replicate-check-only
     pt-query-digest command line option, 161                pt-table-checksum command line option, 224
     pt-table-checksum command line option, 222         –replicate-database
     pt-table-usage command line option, 250                 pt-table-checksum command line option, 224
     pt-tcp-model command line option, 255              –replication-threads
     pt-trend command line option, 259                       pt-kill command line option, 102
–purge                                                  –report-all
     pt-archiver command line option, 15                     pt-query-digest command line option, 161
–quantile                                               –report-format
     pt-tcp-model command line option, 255                   pt-index-usage command line option, 86
–query                                                       pt-query-advisor command line option, 143
     pt-fingerprint command line option, 66                   pt-query-digest command line option, 161
     pt-query-advisor command line option, 143               pt-slave-find command line option, 190
     pt-table-usage command line option, 250            –report-histogram
     pt-upgrade command line option, 266                     pt-query-digest command line option, 162
–query-count                                            –report-width
     pt-kill command line option, 103                        pt-config-diff command line option, 27
–quick-delete                                           –reports
     pt-archiver command line option, 15                     pt-upgrade command line option, 266


312                                                                                                    Index
Percona Toolkit Documentation, Release 2.1.1


–resume                                                 –separate
     pt-table-checksum command line option, 224              pt-show-grants command line option, 177
–retention-time                                         –separator
     pt-stalk command line option, 206                       pt-table-checksum command line option, 224
–retries                                                –server-id
     pt-archiver command line option, 15                     pt-find command line option, 60
     pt-online-schema-change command line option, 133   –session-files
     pt-table-checksum command line option, 224              pt-log-player command line option, 111
–review                                                 –set-vars
     pt-query-advisor command line option, 144               pt-archiver command line option, 16
     pt-query-digest command line option, 162                pt-config-diff command line option, 27
–review-history                                              pt-deadlock-logger command line option, 34
     pt-query-digest command line option, 163                pt-duplicate-key-checker command line option, 49
–revoke                                                      pt-find command line option, 57
     pt-show-grants command line option, 177                 pt-fk-error-logger command line option, 70
–rowformat                                                   pt-heartbeat command line option, 79
     pt-find command line option, 60                          pt-index-usage command line option, 88
–rows                                                        pt-kill command line option, 99
     pt-find command line option, 60                          pt-log-player command line option, 111
–run-time                                                    pt-online-schema-change command line option, 133
     pt-archiver command line option, 16                     pt-query-advisor command line option, 144
     pt-deadlock-logger command line option, 34              pt-query-digest command line option, 166
     pt-fk-error-logger command line option, 70              pt-show-grants command line option, 177
     pt-heartbeat command line option, 78                    pt-slave-delay command line option, 185
     pt-ioprofile command line option, 93                     pt-slave-find command line option, 191
     pt-kill command line option, 99                         pt-slave-restart command line option, 198
     pt-query-digest command line option, 165                pt-table-checksum command line option, 224
     pt-slave-delay command line option, 185                 pt-table-sync command line option, 240
     pt-slave-restart command line option, 198               pt-table-usage command line option, 250
     pt-stalk command line option, 206                       pt-upgrade command line option, 267
     pt-table-usage command line option, 250                 pt-variable-advisor command line option, 278
     pt-tcp-model command line option, 255                   pt-visual-explain command line option, 289
     pt-upgrade command line option, 266                –share-lock
–run-time-mode                                               pt-archiver command line option, 16
     pt-query-digest command line option, 165           –shorten
–sample                                                      pt-query-digest command line option, 167
     pt-query-advisor command line option, 144               pt-upgrade command line option, 267
     pt-query-digest command line option, 166           –show-all
–sample-time                                                 pt-query-digest command line option, 167
     pt-diskstats command line option, 44               –show-inactive
–save-results-database                                       pt-diskstats command line option, 44
     pt-index-usage command line option, 86             –show-timestamps
–save-samples                                                pt-diskstats command line option, 44
     pt-diskstats command line option, 44               –since
     pt-ioprofile command line option, 93                     pt-query-digest command line option, 167
     pt-mysql-summary command line option, 124          –skew
     pt-summary command line option, 212                     pt-heartbeat command line option, 79
–select                                                 –skip-count
     pt-query-digest command line option, 166                pt-slave-restart command line option, 198
–sentinel                                               –skip-foreign-key-checks
     pt-archiver command line option, 16                     pt-archiver command line option, 16
     pt-heartbeat command line option, 79               –sleep
     pt-kill command line option, 99                         pt-archiver command line option, 16
     pt-slave-restart command line option, 198               pt-mysql-summary command line option, 124




     pt-slave-restart command line option, 198          –summarize-processes
     pt-stalk command line option, 206                        pt-summary command line option, 212
     pt-summary command line option, 212                –sync-to-master
–sleep-coef                                                   pt-table-sync command line option, 240
     pt-archiver command line option, 16                –tab
–socket                                                       pt-deadlock-logger command line option, 35
     pt-archiver command line option, 17                –table
     pt-config-diff command line option, 27                    pt-heartbeat command line option, 79
     pt-deadlock-logger command line option, 34         –table-access
     pt-duplicate-key-checker command line option, 49         pt-query-digest command line option, 168
     pt-find command line option, 57                     –tables
     pt-fk-error-logger command line option, 70               pt-duplicate-key-checker command line option, 49
     pt-heartbeat command line option, 79                     pt-index-usage command line option, 89
     pt-index-usage command line option, 88                   pt-table-checksum command line option, 224
     pt-kill command line option, 99                          pt-table-sync command line option, 240
     pt-log-player command line option, 111             –tables-regex
     pt-online-schema-change command line option, 133         pt-index-usage command line option, 89
     pt-query-advisor command line option, 144                pt-table-checksum command line option, 224
     pt-query-digest command line option, 167           –tablesize
     pt-show-grants command line option, 177                  pt-find command line option, 60
     pt-slave-delay command line option, 185            –tbllike
     pt-slave-find command line option, 191                    pt-find command line option, 61
     pt-slave-restart command line option, 198          –tblregex
     pt-table-checksum command line option, 224               pt-find command line option, 61
     pt-table-sync command line option, 240             –tblversion
     pt-table-usage command line option, 250                  pt-find command line option, 61
     pt-upgrade command line option, 267                –tcpdump-errors
     pt-variable-advisor command line option, 278             pt-query-digest command line option, 168
     pt-visual-explain command line option, 289         –temp-database
–source                                                       pt-upgrade command line option, 267
     pt-archiver command line option, 17                –temp-table
–source-of-variables                                          pt-upgrade command line option, 267
     pt-variable-advisor command line option, 278       –test-matching
–split                                                        pt-kill command line option, 102
     pt-log-player command line option, 111             –threads
–split-random                                                 pt-log-player command line option, 111
     pt-log-player command line option, 111             –threshold
–stalk                                                        pt-stalk command line option, 206
     pt-stalk command line option, 206                  –timeline
–start-end                                                    pt-query-digest command line option, 168
     pt-tcp-model command line option, 255              –timeout-ok
–statistics                                                   pt-table-sync command line option, 240
     pt-archiver command line option, 17                –trigger
     pt-fifo-split command line option, 53                     pt-find command line option, 61
     pt-query-digest command line option, 168           –trigger-table
–stop                                                         pt-find command line option, 61
     pt-archiver command line option, 18                –trim
     pt-heartbeat command line option, 79                     pt-table-checksum command line option, 224
     pt-kill command line option, 99                          pt-table-sync command line option, 241
     pt-slave-restart command line option, 198          –txn-size
–summarize-mounts                                             pt-archiver command line option, 18
     pt-summary command line option, 212                –type
–summarize-network                                            pt-log-player command line option, 111
     pt-summary command line option, 212                      pt-query-advisor command line option, 144




     pt-query-digest command line option, 169                pt-fifo-split command line option, 53
     pt-tcp-model command line option, 256                   pt-find command line option, 57
–until                                                       pt-fingerprint command line option, 66
     pt-query-digest command line option, 171                pt-fk-error-logger command line option, 70
–until-master                                                pt-heartbeat command line option, 80
     pt-slave-restart command line option, 199               pt-index-usage command line option, 89
–until-relay                                                 pt-ioprofile command line option, 93
     pt-slave-restart command line option, 199               pt-kill command line option, 99
–update                                                      pt-log-player command line option, 112
     pt-heartbeat command line option, 79                    pt-mysql-summary command line option, 124
–use-master                                                  pt-online-schema-change command line option, 133
     pt-slave-delay command line option, 185                 pt-query-advisor command line option, 144
–user                                                        pt-query-digest command line option, 171
     pt-archiver command line option, 18                     pt-show-grants command line option, 177
     pt-config-diff command line option, 27                   pt-slave-delay command line option, 186
     pt-deadlock-logger command line option, 35              pt-slave-find command line option, 191
     pt-duplicate-key-checker command line option, 49        pt-slave-restart command line option, 199
     pt-find command line option, 57                          pt-stalk command line option, 206
     pt-fk-error-logger command line option, 70              pt-summary command line option, 212
     pt-heartbeat command line option, 79                    pt-table-checksum command line option, 225
     pt-index-usage command line option, 89                  pt-table-sync command line option, 241
     pt-kill command line option, 99                         pt-table-usage command line option, 250
     pt-log-player command line option, 112                  pt-tcp-model command line option, 256
     pt-online-schema-change command line option, 133        pt-trend command line option, 259
     pt-query-advisor command line option, 144               pt-upgrade command line option, 267
     pt-query-digest command line option, 171                pt-variable-advisor command line option, 279
     pt-show-grants command line option, 177                 pt-visual-explain command line option, 289
     pt-slave-delay command line option, 185            –victims
     pt-slave-find command line option, 191                   pt-kill command line option, 99
     pt-slave-restart command line option, 199          –view
     pt-table-checksum command line option, 225              pt-find command line option, 61
     pt-table-sync command line option, 241             –wait
     pt-table-usage command line option, 250                 pt-table-sync command line option, 241
     pt-upgrade command line option, 267                –wait-after-kill
     pt-variable-advisor command line option, 278            pt-kill command line option, 100
     pt-visual-explain command line option, 289         –wait-before-kill
–variable                                                    pt-kill command line option, 100
     pt-stalk command line option, 206                  –watch-server
–variations                                                  pt-query-digest command line option, 171
     pt-query-digest command line option, 171                pt-tcp-model command line option, 256
–verbose                                                –where
     pt-duplicate-key-checker command line option, 49        pt-archiver command line option, 18
     pt-kill command line option, 103                        pt-query-advisor command line option, 144
     pt-log-player command line option, 112                  pt-table-checksum command line option, 225
     pt-query-advisor command line option, 144               pt-table-sync command line option, 242
     pt-slave-restart command line option, 199          –why-quit
     pt-table-sync command line option, 241                  pt-archiver command line option, 19
     pt-variable-advisor command line option, 279       –zero-query-times
–version                                                     pt-upgrade command line option, 267
     pt-archiver command line option, 18                –[no]bin-log
     pt-config-diff command line option, 27                   pt-table-sync command line option, 234
     pt-deadlock-logger command line option, 35         –[no]buffer-to-client
     pt-diskstats command line option, 44                    pt-table-sync command line option, 234
     pt-duplicate-key-checker command line option, 49   –[no]bulk-delete-limit




     pt-archiver command line option, 10                –[no]replicate-check
–[no]check-charset                                           pt-table-checksum command line option, 223
     pt-archiver command line option, 11                –[no]report
–[no]check-columns                                           pt-config-diff command line option, 26
     pt-archiver command line option, 11                     pt-index-usage command line option, 85
–[no]check-master                                            pt-query-digest command line option, 161
     pt-table-sync command line option, 234             –[no]results
–[no]check-privileges                                        pt-log-player command line option, 111
     pt-table-sync command line option, 234             –[no]safe-auto-increment
–[no]check-relay-log                                         pt-archiver command line option, 16
     pt-slave-restart command line option, 196          –[no]show-create-table
–[no]check-replication-filters                                pt-query-advisor command line option, 144
     pt-online-schema-change command line option, 129   –[no]sql
     pt-table-checksum command line option, 218              pt-duplicate-key-checker command line option, 49
–[no]check-slave                                        –[no]strip-comments
     pt-table-sync command line option, 234                  pt-kill command line option, 99
–[no]check-triggers                                     –[no]summary
     pt-table-sync command line option, 235                  pt-duplicate-key-checker command line option, 49
–[no]clear-warnings                                     –[no]swap-tables
     pt-upgrade command line option, 263                     pt-online-schema-change command line option, 133
–[no]clustered                                          –[no]timestamp
     pt-duplicate-key-checker command line option, 47        pt-show-grants command line option, 177
–[no]collapse                                           –[no]transaction
     pt-deadlock-logger command line option, 32              pt-table-sync command line option, 241
–[no]continue                                           –[no]unique-checks
     pt-slave-delay command line option, 184                 pt-table-sync command line option, 241
–[no]continue-on-error                                  –[no]warnings
     pt-query-advisor command line option, 142               pt-log-player command line option, 112
     pt-query-digest command line option, 154           –[no]zero-admin
     pt-table-usage command line option, 248                 pt-query-digest command line option, 172
–[no]create-replicate-table                             –[no]zero-bool
     pt-table-checksum command line option, 219              pt-query-digest command line option, 172
–[no]create-views                                       –[no]zero-chunk
     pt-index-usage command line option, 84                  pt-table-sync command line option, 242
–[no]drop-old-table
     pt-online-schema-change command line option, 130   P
–[no]empty-replicate-table                              pt-archiver command line option
     pt-table-checksum command line option, 220              –analyze, 9
–[no]for-explain                                             –ascend-first, 9
     pt-query-digest command line option, 157                –ask-pass, 10
–[no]foreign-key-checks                                      –buffer, 10
     pt-table-sync command line option, 237                  –bulk-delete, 10
–[no]header                                                  –bulk-insert, 10
     pt-show-grants command line option, 176                 –charset, 10
–[no]hex-blob                                                –check-interval, 11
     pt-table-sync command line option, 237                  –check-slave-lag, 11
–[no]ignore-self                                             –columns, 11
     pt-kill command line option, 101                        –commit-each, 11
–[no]index-hint                                              –config, 11
     pt-table-sync command line option, 238                  –delayed-insert, 12
–[no]insert-heartbeat-row                                    –dest, 12
     pt-heartbeat command line option, 77                    –dry-run, 12
–[no]quote                                                   –file, 12
     pt-find command line option, 57                          –for-update, 12




     –header, 13                         –pid, 26
     –help, 13                           –port, 26
     –high-priority-select, 13           –report-width, 27
     –host, 13                           –set-vars, 27
     –ignore, 13                         –socket, 27
     –limit, 13                          –user, 27
     –local, 13                          –version, 27
     –low-priority-delete, 13            –[no]report, 26
     –low-priority-insert, 13       pt-deadlock-logger command line option
     –max-lag, 13                        –ask-pass, 32
     –no-ascend, 14                      –charset, 32
     –no-delete, 14                      –clear-deadlocks, 32
     –optimize, 14                       –columns, 33
     –password, 14                       –config, 33
     –pid, 14                            –create-dest-table, 33
     –plugin, 14                         –daemonize, 33
     –port, 15                           –defaults-file, 33
     –primary-key-only, 15               –dest, 33
     –progress, 15                       –help, 34
     –purge, 15                          –host, 34
     –quick-delete, 15                   –interval, 34
     –quiet, 15                          –log, 34
     –replace, 15                        –numeric-ip, 34
     –retries, 15                        –password, 34
     –run-time, 16                       –pid, 34
     –sentinel, 16                       –port, 34
     –set-vars, 16                       –print, 34
     –share-lock, 16                     –run-time, 34
     –skip-foreign-key-checks, 16        –set-vars, 34
     –sleep, 16                          –socket, 34
     –sleep-coef, 16                     –tab, 35
     –socket, 17                         –user, 35
     –source, 17                         –version, 35
     –statistics, 17                     –[no]collapse, 32
     –stop, 18                      pt-diskstats command line option
     –txn-size, 18                       –columns-regex, 43
     –user, 18                           –config, 43
     –version, 18                        –devices-regex, 43
     –where, 18                          –group-by, 43
     –why-quit, 19                       –headers, 43
     –[no]bulk-delete-limit, 10          –help, 44
     –[no]check-charset, 11              –interval, 44
     –[no]check-columns, 11              –iterations, 44
     –[no]safe-auto-increment, 16        –sample-time, 44
pt-config-diff command line option        –save-samples, 44
     –ask-pass, 26                       –show-inactive, 44
     –charset, 26                        –show-timestamps, 44
     –config, 26                          –version, 44
     –daemonize, 26                 pt-duplicate-key-checker command line option
     –defaults-file, 26                   –all-structs, 47
     –help, 26                           –ask-pass, 47
     –host, 26                           –charset, 47
     –ignore-variables, 26               –config, 47
     –password, 26                       –databases, 48




     –defaults-file, 48                              –engine, 59
     –engines, 48                                   –exec, 61
     –help, 48                                      –exec-dsn, 61
     –host, 48                                      –exec-plus, 61
     –ignore-databases, 48                          –function, 60
     –ignore-engines, 48                            –help, 56
     –ignore-order, 48                              –host, 57
     –ignore-tables, 48                             –indexsize, 60
     –key-types, 48                                 –kmin, 60
     –password, 48                                  –ktime, 60
     –pid, 48                                       –mmin, 60
     –port, 48                                      –mtime, 60
     –set-vars, 49                                  –or, 57
     –socket, 49                                    –password, 57
     –tables, 49                                    –pid, 57
     –user, 49                                      –port, 57
     –verbose, 49                                   –print, 62
     –version, 49                                   –printf, 62
     –[no]clustered, 47                             –procedure, 60
     –[no]sql, 49                                   –rowformat, 60
     –[no]summary, 49                               –rows, 60
pt-fifo-split command line option                    –server-id, 60
     –config, 52                                     –set-vars, 57
     –fifo, 52                                       –socket, 57
     –force, 53                                     –tablesize, 60
     –help, 53                                      –tbllike, 61
     –lines, 53                                     –tblregex, 61
     –offset, 53                                    –tblversion, 61
     –pid, 53                                       –trigger, 61
     –statistics, 53                                –trigger-table, 61
     –version, 53                                   –user, 57
pt-find command line option                          –version, 57
     –ask-pass, 56                                  –view, 61
     –autoinc, 58                                   –[no]quote, 57
     –avgrowlen, 58                            pt-fingerprint command line option
     –case-insensitive, 56                          –config, 66
     –charset, 56                                   –help, 66
     –checksum, 58                                  –match-embedded-numbers, 66
     –cmin, 58                                      –match-md5-checksums, 66
     –collation, 58                                 –query, 66
     –column-name, 58                               –version, 66
     –column-type, 58                          pt-fk-error-logger command line option
     –comment, 58                                   –ask-pass, 69
     –config, 56                                     –charset, 69
     –connection-id, 58                             –config, 69
     –createopts, 59                                –daemonize, 69
     –ctime, 59                                     –defaults-file, 69
     –datafree, 59                                  –dest, 69
     –datasize, 59                                  –help, 69
     –day-start, 56                                 –host, 70
     –dblike, 59                                    –interval, 70
     –dbregex, 59                                   –log, 70
     –defaults-file, 56                              –password, 70
     –empty, 59                                     –pid, 70




     –port, 70                                –help, 85
     –print, 70                               –host, 85
     –run-time, 70                            –ignore-databases, 85
     –set-vars, 70                            –ignore-databases-regex, 85
     –socket, 70                              –ignore-tables, 85
     –user, 70                                –ignore-tables-regex, 85
     –version, 70                             –password, 85
pt-heartbeat command line option              –port, 85
     –ask-pass, 75                            –progress, 85
     –charset, 75                             –quiet, 85
     –check, 75                               –report-format, 86
     –config, 75                               –save-results-database, 86
     –create-table, 75                        –set-vars, 88
     –daemonize, 76                           –socket, 88
     –database, 76                            –tables, 89
     –dbi-driver, 76                          –tables-regex, 89
     –defaults-file, 76                        –user, 89
     –file, 76                                 –version, 89
     –frames, 76                              –[no]create-views, 84
     –help, 77                                –[no]report, 85
     –host, 77                           pt-ioprofile command line option
     –interval, 77                            –aggregate, 92
     –log, 77                                 –cell, 92
     –master-server-id, 77                    –group-by, 92
     –monitor, 77                             –help, 93
     –password, 78                            –profile-pid, 93
     –pid, 78                                 –profile-process, 93
     –port, 78                                –run-time, 93
     –print-master-server-id, 78              –save-samples, 93
     –recurse, 78                             –version, 93
     –recursion-method, 78               pt-kill command line option
     –replace, 78                             –any-busy-time, 103
     –run-time, 78                            –ask-pass, 97
     –sentinel, 79                            –busy-time, 100
     –set-vars, 79                            –charset, 97
     –skew, 79                                –config, 97
     –socket, 79                              –daemonize, 97
     –stop, 79                                –defaults-file, 97
     –table, 79                               –each-busy-time, 103
     –update, 79                              –execute-command, 103
     –user, 79                                –filter, 97
     –version, 80                             –group-by, 98
     –[no]insert-heartbeat-row, 77            –help, 98
pt-index-usage command line option            –host, 98
     –ask-pass, 84                            –idle-time, 100
     –charset, 84                             –ignore-command, 100
     –config, 84                               –ignore-db, 100
     –create-save-results-database, 84        –ignore-host, 100
     –database, 84                            –ignore-info, 101
     –databases, 84                           –ignore-state, 101
     –databases-regex, 84                     –ignore-user, 101
     –defaults-file, 84                        –interval, 98
     –drop, 84                                –kill, 103
     –empty-save-results-tables, 85           –kill-query, 104




     –log, 98                                       –type, 111
     –match-all, 101                                –user, 112
     –match-command, 101                            –verbose, 112
     –match-db, 101                                 –version, 112
     –match-host, 102                               –[no]results, 111
     –match-info, 102                               –[no]warnings, 112
     –match-state, 102                         pt-mysql-summary command line option
     –match-user, 102                               –config, 124
     –password, 98                                  –databases, 124
     –pid, 98                                       –help, 124
     –port, 99                                      –read-samples, 124
     –print, 104                                    –save-samples, 124
     –query-count, 103                              –sleep, 124
     –replication-threads, 102                      –version, 124
     –run-time, 99                             pt-online-schema-change command line option
     –sentinel, 99                                  –alter, 127
     –set-vars, 99                                  –alter-foreign-keys-method, 127
     –socket, 99                                    –ask-pass, 128
     –stop, 99                                      –charset, 129
     –test-matching, 102                            –check-interval, 129
     –user, 99                                      –check-slave-lag, 129
     –verbose, 103                                  –chunk-index, 129
     –version, 99                                   –chunk-size, 129
     –victims, 99                                   –chunk-size-limit, 129
     –wait-after-kill, 100                          –chunk-time, 130
     –wait-before-kill, 100                         –config, 130
     –[no]ignore-self, 101                          –critical-load, 130
     –[no]strip-comments, 99                        –defaults-file, 130
pt-log-player command line option                   –dry-run, 130
     –ask-pass, 108                                 –execute, 130
     –base-dir, 108                                 –help, 131
     –base-file-name, 108                            –host, 131
     –charset, 108                                  –lock-wait-timeout, 131
     –config, 108                                    –max-lag, 131
     –defaults-file, 108                             –max-load, 131
     –dry-run, 108                                  –password, 131
     –filter, 108                                    –pid, 131
     –help, 110                                     –port, 132
     –host, 110                                     –print, 132
     –iterations, 110                               –progress, 132
     –max-sessions, 110                             –quiet, 132
     –only-select, 110                              –recurse, 132
     –password, 110                                 –recursion-method, 132
     –pid, 110                                      –retries, 133
     –play, 110                                     –set-vars, 133
     –port, 110                                     –socket, 133
     –print, 110                                    –user, 133
     –quiet, 110                                    –version, 133
     –session-files, 111                             –[no]check-replication-filters, 129
     –set-vars, 111                                 –[no]drop-old-table, 130
     –socket, 111                                   –[no]swap-tables, 133
     –split, 111                               pt-query-advisor command line option
     –split-random, 111                             –ask-pass, 142
     –threads, 111                                  –charset, 142




     –config, 142                              –mirror, 159
     –daemonize, 142                          –order-by, 159
     –database, 142                           –outliers, 160
     –defaults-file, 142                       –password, 160
     –group-by, 143                           –pid, 160
     –help, 143                               –pipeline-profile, 160
     –host, 143                               –port, 160
     –ignore-rules, 143                       –print, 160
     –password, 143                           –print-iterations, 160
     –pid, 143                                –processlist, 161
     –port, 143                               –progress, 161
     –print-all, 143                          –read-timeout, 161
     –query, 143                              –report-all, 161
     –report-format, 143                      –report-format, 161
     –review, 144                             –report-histogram, 162
     –sample, 144                             –review, 162
     –set-vars, 144                           –review-history, 163
     –socket, 144                             –run-time, 165
     –type, 144                               –run-time-mode, 165
     –user, 144                               –sample, 166
     –verbose, 144                            –select, 166
     –version, 144                            –set-vars, 166
     –where, 144                              –shorten, 167
     –[no]continue-on-error, 142              –show-all, 167
     –[no]show-create-table, 144              –since, 167
pt-query-digest command line option           –socket, 167
     –apdex-threshold, 153                    –statistics, 168
     –ask-pass, 153                           –table-access, 168
     –attribute-aliases, 153                  –tcpdump-errors, 168
     –attribute-value-limit, 153              –timeline, 168
     –aux-dsn, 153                            –type, 169
     –charset, 154                            –until, 171
     –check-attributes-limit, 154             –user, 171
     –config, 154                              –variations, 171
     –create-review-history-table, 154        –version, 171
     –create-review-table, 154                –watch-server, 171
     –daemonize, 154                          –[no]continue-on-error, 154
     –defaults-file, 154                       –[no]for-explain, 157
     –embedded-attributes, 154                –[no]report, 161
     –execute, 155                            –[no]zero-admin, 172
     –execute-throttle, 155                   –[no]zero-bool, 172
     –expected-range, 155                pt-show-grants command line option
     –explain, 156                            –ask-pass, 175
     –filter, 156                              –charset, 176
     –fingerprints, 157                        –config, 176
     –group-by, 158                           –database, 176
     –help, 158                               –defaults-file, 176
     –host, 158                               –drop, 176
     –ignore-attributes, 158                  –flush, 176
     –inherit-attributes, 158                 –help, 176
     –interval, 159                           –host, 176
     –iterations, 159                         –ignore, 176
     –limit, 159                              –only, 176
     –log, 159                                –password, 176




      –pid, 177                                      –config, 196
      –port, 177                                     –daemonize, 196
      –revoke, 177                                   –database, 196
      –separate, 177                                 –defaults-file, 196
      –set-vars, 177                                 –error-length, 196
      –socket, 177                                   –error-numbers, 196
      –user, 177                                     –error-text, 196
      –version, 177                                  –help, 196
      –[no]header, 176                               –host, 196
      –[no]timestamp, 177                            –log, 197
pt-slave-delay command line option                   –max-sleep, 197
      –ask-pass, 184                                 –min-sleep, 197
      –charset, 184                                  –monitor, 197
      –config, 184                                    –password, 197
      –daemonize, 184                                –pid, 197
      –defaults-file, 184                             –port, 197
      –delay, 184                                    –quiet, 197
      –help, 184                                     –recurse, 197
      –host, 184                                     –recursion-method, 197
      –interval, 185                                 –run-time, 198
      –log, 185                                      –sentinel, 198
      –password, 185                                 –set-vars, 198
      –pid, 185                                      –skip-count, 198
      –port, 185                                     –sleep, 198
      –quiet, 185                                    –socket, 198
      –run-time, 185                                 –stop, 198
      –set-vars, 185                                 –until-master, 199
      –socket, 185                                   –until-relay, 199
      –use-master, 185                               –user, 199
      –user, 185                                     –verbose, 199
      –version, 186                                  –version, 199
      –[no]continue, 184                             –[no]check-relay-log, 196
pt-slave-find command line option               pt-stalk command line option
      –ask-pass, 189                                 –collect, 203
      –charset, 189                                  –collect-gdb, 203
      –config, 189                                    –collect-oprofile, 204
      –database, 190                                 –collect-strace, 204
      –defaults-file, 190                             –collect-tcpdump, 204
      –help, 190                                     –config, 204
      –host, 190                                     –cycles, 204
      –password, 190                                 –daemonize, 204
      –pid, 190                                      –dest, 204
      –port, 190                                     –disk-bytes-free, 204
      –recurse, 190                                  –disk-pct-free, 204
      –recursion-method, 190                         –function, 204
      –report-format, 190                            –help, 205
      –set-vars, 191                                 –interval, 205
      –socket, 191                                   –iterations, 205
      –user, 191                                     –log, 205
      –version, 191                                  –match, 206
pt-slave-restart command line option                 –notify-by-email, 206
      –always, 195                                   –pid, 206
      –ask-pass, 195                                 –prefix, 206
      –charset, 195                                  –retention-time, 206




     –run-time, 206                          –resume, 224
     –sleep, 206                             –retries, 224
     –stalk, 206                             –separator, 224
     –threshold, 206                         –set-vars, 224
     –variable, 206                          –socket, 224
     –version, 206                           –tables, 224
pt-summary command line option               –tables-regex, 224
     –config, 212                             –trim, 224
     –help, 212                              –user, 225
     –read-samples, 212                      –version, 225
     –save-samples, 212                      –where, 225
     –sleep, 212                             –[no]check-replication-filters, 218
     –summarize-mounts, 212                  –[no]create-replicate-table, 219
     –summarize-network, 212                 –[no]empty-replicate-table, 220
     –summarize-processes, 212               –[no]replicate-check, 223
     –version, 212                      pt-table-sync command line option
pt-table-checksum command line option        –algorithms, 233
     –ask-pass, 218                          –ask-pass, 233
     –check-interval, 218                    –bidirectional, 234
     –check-slave-lag, 218                   –buffer-in-mysql, 234
     –chunk-index, 218                       –charset, 234
     –chunk-size, 218                        –chunk-column, 235
     –chunk-size-limit, 219                  –chunk-index, 235
     –chunk-time, 219                        –chunk-size, 235
     –columns, 219                           –columns, 235
     –config, 219                             –config, 235
     –databases, 219                         –conflict-column, 235
     –databases-regex, 219                   –conflict-comparison, 235
     –defaults-file, 219                      –conflict-error, 236
     –engines, 220                           –conflict-threshold, 236
     –explain, 220                           –conflict-value, 236
     –float-precision, 220                    –databases, 236
     –function, 220                          –defaults-file, 236
     –help, 220                              –dry-run, 237
     –host, 220                              –engines, 237
     –ignore-columns, 220                    –execute, 237
     –ignore-databases, 221                  –explain-hosts, 237
     –ignore-databases-regex, 221            –float-precision, 237
     –ignore-engines, 221                    –function, 237
     –ignore-tables, 221                     –help, 237
     –ignore-tables-regex, 221               –host, 238
     –lock-wait-timeout, 221                 –ignore-columns, 238
     –max-lag, 221                           –ignore-databases, 238
     –max-load, 221                          –ignore-engines, 238
     –password, 222                          –ignore-tables, 238
     –pid, 222                               –lock, 238
     –port, 222                              –lock-and-rename, 239
     –progress, 222                          –password, 239
     –quiet, 222                             –pid, 239
     –recurse, 222                           –port, 239
     –recursion-method, 222                  –print, 239
     –replicate, 223                         –recursion-method, 239
     –replicate-check-only, 224              –replace, 240
     –replicate-database, 224                –replicate, 240




     –set-vars, 240                                  –quantile, 255
     –socket, 240                                    –run-time, 255
     –sync-to-master, 240                            –start-end, 255
     –tables, 240                                    –type, 256
     –timeout-ok, 240                                –version, 256
     –trim, 241                                      –watch-server, 256
     –user, 241                                pt-trend command line option
     –verbose, 241                                   –config, 259
     –version, 241                                   –help, 259
     –wait, 241                                      –pid, 259
     –where, 242                                     –progress, 259
     –[no]bin-log, 234                               –quiet, 259
     –[no]buffer-to-client, 234                      –version, 259
     –[no]check-master, 234                    pt-upgrade command line option
     –[no]check-privileges, 234                      –ask-pass, 263
     –[no]check-slave, 234                           –base-dir, 263
     –[no]check-triggers, 235                        –charset, 263
     –[no]foreign-key-checks, 237                    –clear-warnings-table, 263
     –[no]hex-blob, 237                              –compare, 263
     –[no]index-hint, 238                            –compare-results-method, 264
     –[no]transaction, 241                           –config, 264
     –[no]unique-checks, 241                         –continue-on-error, 264
     –[no]zero-chunk, 242                            –convert-to-select, 264
pt-table-usage command line option                   –daemonize, 264
     –ask-pass, 248                                  –explain-hosts, 264
     –charset, 248                                   –filter, 265
     –config, 248                                     –fingerprints, 265
     –constant-data-value, 248                       –float-precision, 265
     –create-table-definitions, 248                   –help, 265
     –daemonize, 249                                 –host, 265
     –database, 249                                  –iterations, 266
     –defaults-file, 249                              –limit, 266
     –explain-extended, 249                          –log, 266
     –filter, 249                                     –max-different-rows, 266
     –help, 249                                      –order-by, 266
     –host, 249                                      –password, 266
     –id-attribute, 249                              –pid, 266
     –log, 249                                       –port, 266
     –password, 249                                  –query, 266
     –pid, 249                                       –reports, 266
     –port, 249                                      –run-time, 266
     –progress, 250                                  –set-vars, 267
     –query, 250                                     –shorten, 267
     –read-timeout, 250                              –socket, 267
     –run-time, 250                                  –temp-database, 267
     –set-vars, 250                                  –temp-table, 267
     –socket, 250                                    –user, 267
     –user, 250                                      –version, 267
     –version, 250                                   –zero-query-times, 267
     –[no]continue-on-error, 248                     –[no]clear-warnings, 263
pt-tcp-model command line option               pt-variable-advisor command line option
     –config, 255                                     –ask-pass, 277
     –help, 255                                      –charset, 277
     –progress, 255                                  –config, 278




     –daemonize, 278
     –defaults-file, 278
     –help, 278
     –host, 278
     –ignore-rules, 278
     –password, 278
     –pid, 278
     –port, 278
     –set-vars, 278
     –socket, 278
     –source-of-variables, 278
     –user, 278
     –verbose, 279
     –version, 279
pt-visual-explain command line option
     –ask-pass, 288
     –charset, 288
     –clustered-pk, 288
     –config, 288
     –connect, 288
     –database, 288
     –defaults-file, 288
     –format, 288
     –help, 288
     –host, 288
     –password, 288
     –pid, 289
     –port, 289
     –set-vars, 289
     –socket, 289
     –user, 289
     –version, 289





Percona Toolkit is a collection of advanced command-line tools used by Percona (http://www.percona.com/) support staff to perform a variety of MySQL and system tasks that are too difficult or complex to perform manually. These tools are ideal alternatives to private or “one-off” scripts because they are professionally developed, formally tested, and fully documented. They are also fully self-contained, so installation is quick and easy and no libraries are installed. Percona Toolkit is derived from Maatkit and Aspersa, two of the best-known toolkits for MySQL server administration. It is developed and supported by Percona Inc. For more information and other free, open-source software developed by Percona, visit http://www.percona.com/software/.
CHAPTER ONE

GETTING PERCONA TOOLKIT

1.1 Installation

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line:

wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:

wget percona.com/get/TOOL

Replace TOOL with the name of any tool.
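For example, to grab a single tool and use it immediately, or to unpack the full tarball, the steps look roughly like this (a sketch: the chmod step and the extracted directory name are assumptions, not taken from this manual):

wget percona.com/get/pt-archiver
chmod +x pt-archiver            # the tools are Perl scripts; make the download executable (assumed step)
./pt-archiver --version

wget percona.com/get/percona-toolkit.tar.gz
tar zxf percona-toolkit.tar.gz  # assumed to unpack into a percona-toolkit-<version>/ directory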
CHAPTER TWO

TOOLS

2.1 pt-align

2.1.1 NAME

pt-align - Align output from other tools to columns.

2.1.2 SYNOPSIS

Usage

pt-align [FILES]

pt-align aligns output from other tools to columns. If no FILES are specified, STDIN is read.

If a tool prints the following output,

DATABASE TABLE ROWS
foo bar 100
long_db_name table 1
another long_name 500

then pt-align reprints the output as,

DATABASE     TABLE     ROWS
foo          bar        100
long_db_name table        1
another      long_name  500

2.1.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

pt-align is a read-only tool. It should be very low-risk.

At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-align.
  • 10. Percona Toolkit Documentation, Release 2.1.1 See also “BUGS” for more information on filing bugs and getting help. 2.1.4 DESCRIPTION pt-align reads lines and splits them into words. It counts how many words each line has, and if there is one number that predominates, it assumes this is the number of words in each line. Then it discards all lines that don’t have that many words, and looks at the 2nd line that does. It assumes this is the first non-header line. Based on whether each word looks numeric or not, it decides on column alignment. Finally, it goes through and decides how wide each column should be, and then prints them out. This is useful for things like aligning the output of vmstat or iostat so it is easier to read. 2.1.5 OPTIONS This tool does not have any command-line options. 2.1.6 ENVIRONMENT This tool does not use any environment variables. 2.1.7 SYSTEM REQUIREMENTS You need Perl, and some core packages that ought to be installed in any reasonably new version of Perl. 2.1.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-align. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.1.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb 6 Chapter 2. Tools
  • 11. Percona Toolkit Documentation, Release 2.1.1 You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.1.10 AUTHORS Baron Schwartz, Brian Fraser, and Daniel Nichter 2.1.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.1.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.1.13 VERSION pt-align 2.1.1 2.2 pt-archiver 2.2.1 NAME pt-archiver - Archive rows from a MySQL table into another table or a file. 2.2.2 SYNOPSIS Usage pt-archiver [OPTION...] --source DSN --where WHERE 2.2. pt-archiver 7
  • 12. Percona Toolkit Documentation, Release 2.1.1 pt-archiver nibbles records from a MySQL table. The –source and –dest arguments use DSN syntax; if COPY is yes, –dest defaults to the key’s value from –source. Examples Archive all rows from oltp_server to olap_server and to a file: pt-archiver --source h=oltp_server,D=test,t=tbl --dest h=olap_server --file ’/var/log/archive/%Y-%m-%d-%D.%t’ --where "1=1" --limit 1000 --commit-each Purge (delete) orphan rows from child table: pt-archiver --source h=host,D=db,t=child --purge --where ’NOT EXISTS(SELECT * FROM parent WHERE col=child.col)’ 2.2.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-archiver is a read-write tool. It deletes data from the source by default, so you should test your archiving jobs with the --dry-run option if you’re not sure about them. It is designed to have as little impact on production systems as possible, but tuning with --limit, --txn-size and similar options might be a good idea too. If you write or use --plugin modules, you should ensure they are good quality and well-tested. At the time of this release there is an unverified bug with --bulk-insert that may cause data loss. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- archiver. See also “BUGS” for more information on filing bugs and getting help. 2.2.4 DESCRIPTION pt-archiver is the tool I use to archive tables as described in http://tinyurl.com/mysql-archiving. The goal is a low- impact, forward-only job to nibble old data out of the table without impacting OLTP queries much. You can insert the data into another table, which need not be on the same server. You can also write it to a file in a format suitable for LOAD DATA INFILE. Or you can do neither, in which case it’s just an incremental DELETE. pt-archiver is extensible via a plugin mechanism. You can inject your own code to add advanced archiving logic that could be useful for archiving dependent data, applying complex business rules, or building a data warehouse during the archiving process. You need to choose values carefully for some options. The most important are --limit, --retries, and --txn-size. The strategy is to find the first row(s), then scan some index forward-only to find more rows efficiently. Each sub- sequent query should not scan the entire table; it should seek into the index, then scan until it finds more archivable rows. Specifying the index with the ‘i’ part of the --source argument can be crucial for this; use --dry-run to examine the generated queries and be sure to EXPLAIN them to see if they are efficient (most of the time you prob- ably want to scan the PRIMARY key, which is the default). Even better, profile pt-archiver with mk-query-profiler (http://maatkit.org/get/mk-query-profiler) and make sure it is not scanning the whole table every query. 8 Chapter 2. Tools
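As a concrete illustration of the advice above, you can preview the statements for a purge job and check the chosen index before letting it loose (a sketch; the host, database, table, and index names are placeholders):

pt-archiver --source h=oltp_server,D=test,t=tbl,i=PRIMARY \
  --purge --where "ts < current_date - interval 90 day" \
  --limit 1000 --dry-run
# pt-archiver prints the SQL statements it would use; paste the SELECT into the
# mysql client and EXPLAIN it to confirm it seeks into the index rather than
# scanning the whole table.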
  • 13. Percona Toolkit Documentation, Release 2.1.1 You can disable the seek-then-scan optimizations partially or wholly with --no-ascend and --ascend-first. Sometimes this may be more efficient for multi-column keys. Be aware that pt-archiver is built to start at the beginning of the index it chooses and scan it forward-only. This might result in long table scans if you’re trying to nibble from the end of the table by an index other than the one it prefers. See --source and read the documentation on the i part if this applies to you. 2.2.5 OUTPUT If you specify --progress, the output is a header row, plus status output at intervals. Each row in the status output lists the current date and time, how many seconds pt-archiver has been running, and how many rows it has archived. If you specify --statistics, pt-archiver outputs timing and other information to help you identify which part of your archiving process takes the most time. 2.2.6 ERROR-HANDLING pt-archiver tries to catch signals and exit gracefully; for example, if you send it SIGTERM (Ctrl-C on UNIX-ish systems), it will catch the signal, print a message about the signal, and exit fairly normally. It will not execute --analyze or --optimize, because these may take a long time to finish. It will run all other code normally, including calling after_finish() on any plugins (see “EXTENDING”). In other words, a signal, if caught, will break out of the main archiving loop and skip optimize/analyze. 2.2.7 OPTIONS Specify at least one of --dest, --file, or --purge. --ignore and --replace are mutually exclusive. --txn-size and --commit-each are mutually exclusive. --low-priority-insert and --delayed-insert are mutually exclusive. --share-lock and --for-update are mutually exclusive. --analyze and --optimize are mutually exclusive. --no-ascend and --no-delete are mutually exclusive. DSN values in --dest default to values from --source if COPY is yes. -analyze type: string Run ANALYZE TABLE afterwards on --source and/or --dest. Runs ANALYZE TABLE after finishing. The argument is an arbitrary string. If it contains the letter ‘s’, the source will be analyzed. If it contains ‘d’, the destination will be analyzed. You can specify either or both. For example, the following will analyze both: --analyze=ds See http://dev.mysql.com/doc/en/analyze-table.html for details on ANALYZE TABLE. -ascend-first Ascend only first column of index. If you do want to use the ascending index optimization (see --no-ascend), but do not want to incur the overhead of ascending a large multi-column index, you can use this option to tell pt-archiver to ascend only the 2.2. pt-archiver 9
  • 14. Percona Toolkit Documentation, Release 2.1.1 leftmost column of the index. This can provide a significant performance boost over not ascending the index at all, while avoiding the cost of ascending the whole index. See “EXTENDING” for a discussion of how this interacts with plugins. -ask-pass Prompt for a password when connecting to MySQL. -buffer Buffer output to --file and flush at commit. Disables autoflushing to --file and flushes --file to disk only when a transaction commits. This typically means the file is block-flushed by the operating system, so there may be some implicit flushes to disk between commits as well. The default is to flush --file to disk after every row. The danger is that a crash might cause lost data. The performance increase I have seen from using --buffer is around 5 to 15 percent. Your mileage may vary. -bulk-delete Delete each chunk with a single statement (implies --commit-each). Delete each chunk of rows in bulk with a single DELETE statement. The statement deletes every row between the first and last row of the chunk, inclusive. It implies --commit-each, since it would be a bad idea to INSERT rows one at a time and commit them before the bulk DELETE. The normal method is to delete every row by its primary key. Bulk deletes might be a lot faster. They also might not be faster if you have a complex WHERE clause. This option completely defers all DELETE processing until the chunk of rows is finished. If you have a plugin on the source, its before_delete method will not be called. Instead, its before_bulk_delete method is called later. WARNING: if you have a plugin on the source that sometimes doesn’t return true from is_archivable(), you should use this option only if you understand what it does. If the plugin instructs pt-archiver not to archive a row, it will still be deleted by the bulk delete! -[no]bulk-delete-limit default: yes Add --limit to --bulk-delete statement. This is an advanced option and you should not disable it unless you know what you are doing and why! By default, --bulk-delete appends a --limit clause to the bulk delete SQL statement. In certain cases, this clause can be omitted by specifying --no-bulk-delete-limit. --limit must still be specified. -bulk-insert Insert each chunk with LOAD DATA INFILE (implies --bulk-delete --commit-each). Insert each chunk of rows with LOAD DATA LOCAL INFILE. This may be much faster than inserting a row at a time with INSERT statements. It is implemented by creating a temporary file for each chunk of rows, and writing the rows to this file instead of inserting them. When the chunk is finished, it uploads the rows. To protect the safety of your data, this option forces bulk deletes to be used. It would be unsafe to delete each row as it is found, before inserting the rows into the destination first. Forcing bulk deletes guarantees that the deletion waits until the insertion is successful. The --low-priority-insert, --replace, and --ignore options work with this option, but --delayed-insert does not. -charset short form: -A; type: string 10 Chapter 2. Tools
  • 15. Percona Toolkit Documentation, Release 2.1.1 Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. See also --[no]check-charset. -[no]check-charset default: yes Ensure connection and table character sets are the same. Disabling this check may cause text to be erroneously converted from one character set to another (usually from utf8 to latin1) which may cause data loss or mojibake. Disabling this check may be useful or necessary when character set conversions are intended. -[no]check-columns default: yes Ensure --source and --dest have same columns. Enabled by default; causes pt-archiver to check that the source and destination tables have the same columns. It does not check column order, data type, etc. It just checks that all columns in the source exist in the destination and vice versa. If there are any differences, pt-archiver will exit with an error. To disable this check, specify –no-check-columns. -check-interval type: time; default: 1s How often to check for slave lag if --check-slave-lag is given. -check-slave-lag type: string Pause archiving until the specified DSN’s slave lag is less than --max-lag. -columns short form: -c; type: array Comma-separated list of columns to archive. Specify a comma-separated list of columns to fetch, write to the file, and insert into the destination table. If specified, pt-archiver ignores other columns unless it needs to add them to the SELECT statement for ascending an index or deleting rows. It fetches and uses these extra columns internally, but does not write them to the file or to the destination table. It does pass them to plugins. See also --primary-key-only. -commit-each Commit each set of fetched and archived rows (disables --txn-size). Commits transactions and flushes --file after each set of rows has been archived, before fetching the next set of rows, and before sleeping if --sleep is specified. Disables --txn-size; use --limit to control the transaction size with --commit-each. This option is useful as a shortcut to make --limit and --txn-size the same value, but more importantly it avoids transactions being held open while searching for more rows. For example, imagine you are archiving old rows from the beginning of a very large table, with --limit 1000 and --txn-size 1000. After some period of finding and archiving 1000 rows at a time, pt-archiver finds the last 999 rows and archives them, then executes the next SELECT to find more rows. This scans the rest of the table, but never finds any more rows. It has held open a transaction for a very long time, only to determine it is finished anyway. You can use --commit-each to avoid this. -config type: Array 2.2. pt-archiver 11
  • 16. Percona Toolkit Documentation, Release 2.1.1 Read this comma-separated list of config files; if specified, this must be the first option on the command line. -delayed-insert Add the DELAYED modifier to INSERT statements. Adds the DELAYED modifier to INSERT or REPLACE statements. See http://dev.mysql.com/doc/en/insert.html for details. -dest type: DSN DSN specifying the table to archive to. This item specifies a table into which pt-archiver will insert rows archived from --source. It uses the same key=val argument format as --source. Most missing values default to the same values as --source, so you don’t have to repeat options that are the same in --source and --dest. Use the --help option to see which values are copied from --source. WARNING: Using a default options file (F) DSN option that defines a socket for --source causes pt- archiver to connect to --dest using that socket unless another socket for --dest is specified. This means that pt-archiver may incorrectly connect to --source when it connects to --dest. For example: --source F=host1.cnf,D=db,t=tbl --dest h=host2 When pt-archiver connects to --dest, host2, it will connect via the --source, host1, socket defined in host1.cnf. -dry-run Print queries and exit without doing anything. Causes pt-archiver to exit after printing the filename and SQL statements it will use. -file type: string File to archive to, with DATE_FORMAT()-like formatting. Filename to write archived rows to. A subset of MySQL’s DATE_FORMAT() formatting codes are allowed in the filename, as follows: %d Day of the month, numeric (01..31) %H Hour (00..23) %i Minutes, numeric (00..59) %m Month, numeric (01..12) %s Seconds (00..59) %Y Year, numeric, four digits You can use the following extra format codes too: %D Database name %t Table name Example: --file ’/var/log/archive/%Y-%m-%d-%D.%t’ The file’s contents are in the same format used by SELECT INTO OUTFILE, as documented in the MySQL manual: rows terminated by newlines, columns terminated by tabs, NULL characters are represented by N, and special characters are escaped by . This lets you reload a file with LOAD DATA INFILE’s default settings. If you want a column header at the top of the file, see --header. The file is auto-flushed by default; see --buffer. 12 Chapter 2. Tools
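Tying the --file options just described together, a nightly archive-to-file job could look something like this (a sketch; the host, database, table, and retention window are placeholders):

pt-archiver --source h=oltp_server,D=test,t=tbl \
  --file '/var/log/archive/%Y-%m-%d-%D.%t' --header \
  --where "ts < current_date - interval 90 day" \
  --limit 1000 --commit-each --statistics
# The resulting file uses SELECT INTO OUTFILE's default format, so it can be
# reloaded later with LOAD DATA INFILE's default settings.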
--for-update
Adds the FOR UPDATE modifier to SELECT statements. For details, see http://dev.mysql.com/doc/en/innodb-locking-reads.html.

--header
Print column header at top of --file. Writes column names as the first line in the file given by --file. If the file exists, does not write headers; this keeps the file loadable with LOAD DATA INFILE in case you append more output to it.

--help
Show help and exit.

--high-priority-select
Adds the HIGH_PRIORITY modifier to SELECT statements. See http://dev.mysql.com/doc/en/select.html for details.

--host
short form: -h; type: string
Connect to host.

--ignore
Use IGNORE for INSERT statements. Causes INSERTs into --dest to be INSERT IGNORE.

--limit
type: int; default: 1
Number of rows to fetch and archive per statement. Limits the number of rows returned by the SELECT statements that retrieve rows to archive. Default is one row. It may be more efficient to increase the limit, but be careful if you are archiving sparsely, skipping over many rows; this can potentially cause more contention with other queries, depending on the storage engine, transaction isolation level, and options such as --for-update.

--local
Do not write OPTIMIZE or ANALYZE queries to binlog. Adds the NO_WRITE_TO_BINLOG modifier to ANALYZE and OPTIMIZE queries. See --analyze for details.

--low-priority-delete
Adds the LOW_PRIORITY modifier to DELETE statements. See http://dev.mysql.com/doc/en/delete.html for details.

--low-priority-insert
Adds the LOW_PRIORITY modifier to INSERT or REPLACE statements. See http://dev.mysql.com/doc/en/insert.html for details.

--max-lag
type: time; default: 1s
Pause archiving if the slave given by --check-slave-lag lags.
This option causes pt-archiver to look at the slave every time it’s about to fetch another row. If the slave’s lag is greater than the option’s value, or if the slave isn’t running (so its lag is NULL), pt-archiver sleeps for --check-interval seconds and then looks at the lag again. It repeats until the slave is caught up, then proceeds to fetch and archive the row.
  • 18. Percona Toolkit Documentation, Release 2.1.1 This option may eliminate the need for --sleep or --sleep-coef. -no-ascend Do not use ascending index optimization. The default ascending-index optimization causes pt-archiver to optimize repeated SELECT queries so they seek into the index where the previous query ended, then scan along it, rather than scanning from the beginning of the table every time. This is enabled by default because it is generally a good strategy for repeated accesses. Large, multiple-column indexes may cause the WHERE clause to be complex enough that this could actually be less efficient. Consider for example a four-column PRIMARY KEY on (a, b, c, d). The WHERE clause to start where the last query ended is as follows: WHERE (a > ?) OR (a = ? AND b > ?) OR (a = ? AND b = ? AND c > ?) OR (a = ? AND b = ? AND c = ? AND d >= ?) Populating the placeholders with values uses memory and CPU, adds network traffic and parsing overhead, and may make the query harder for MySQL to optimize. A four-column key isn’t a big deal, but a ten-column key in which every column allows NULL might be. Ascending the index might not be necessary if you know you are simply removing rows from the beginning of the table in chunks, but not leaving any holes, so starting at the beginning of the table is actually the most efficient thing to do. See also --ascend-first. See “EXTENDING” for a discussion of how this interacts with plugins. -no-delete Do not delete archived rows. Causes pt-archiver not to delete rows after processing them. This disallows --no-ascend, because enabling them both would cause an infinite loop. If there is a plugin on the source DSN, its before_delete method is called anyway, even though pt-archiver will not execute the delete. See “EXTENDING” for more on plugins. -optimize type: string Run OPTIMIZE TABLE afterwards on --source and/or --dest. Runs OPTIMIZE TABLE after finishing. See --analyze for the option syntax and http://dev.mysql.com/doc/en/optimize-table.html for details on OPTIMIZE TABLE. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -plugin type: string Perl module name to use as a generic plugin. Specify the Perl module name of a general-purpose plugin. It is currently used only for statistics (see --statistics) and must have new() and a statistics() method. 14 Chapter 2. Tools
  • 19. Percona Toolkit Documentation, Release 2.1.1 The new( src = $src, dst => $dst, opts => $o )> method gets the source and destination DSNs, and their database connections, just like the connection-specific plugins do. It also gets an OptionParser object ($o) for accessing command-line options (example: ‘‘$o-‘‘get(‘purge’);>). The statistics(%stats, $time) method gets a hashref of the statistics collected by the archiving job, and the time the whole job started. -port short form: -P; type: int Port number to use for connection. -primary-key-only Primary key columns only. A shortcut for specifying --columns with the primary key columns. This is an efficiency if you just want to purge rows; it avoids fetching the entire row, when only the primary key columns are needed for DELETE statements. See also --purge. -progress type: int Print progress information every X rows. Prints current time, elapsed time, and rows archived every X rows. -purge Purge instead of archiving; allows omitting --file and --dest. Allows archiving without a --file or --dest argument, which is effectively a purge since the rows are just deleted. If you just want to purge rows, consider specifying the table’s primary key columns with --primary-key-only. This will prevent fetching all columns from the server for no reason. -quick-delete Adds the QUICK modifier to DELETE statements. See http://dev.mysql.com/doc/en/delete.html for details. As stated in the documentation, in some cases it may be faster to use DELETE QUICK followed by OPTIMIZE TABLE. You can use --optimize for this. -quiet short form: -q Do not print any output, such as for --statistics. Suppresses normal output, including the output of --statistics, but doesn’t suppress the output from --why-quit. -replace Causes INSERTs into --dest to be written as REPLACE. -retries type: int; default: 1 Number of retries per timeout or deadlock. Specifies the number of times pt-archiver should retry when there is an InnoDB lock wait timeout or deadlock. When retries are exhausted, pt-archiver will exit with an error. Consider carefully what you want to happen when you are archiving between a mixture of transactional and non-transactional storage engines. The INSERT to --dest and DELETE from --source are on separate connections, so they do not actually participate in the same transaction even if they’re on the same server. 2.2. pt-archiver 15
  • 20. Percona Toolkit Documentation, Release 2.1.1 However, pt-archiver implements simple distributed transactions in code, so commits and rollbacks should happen as desired across the two connections. At this time I have not written any code to handle errors with transactional storage engines other than InnoDB. Request that feature if you need it. -run-time type: time Time to run before exiting. Optional suffix s=seconds, m=minutes, h=hours, d=days; if no suffix, s is used. -[no]safe-auto-increment default: yes Do not archive row with max AUTO_INCREMENT. Adds an extra WHERE clause to prevent pt-archiver from removing the newest row when ascending a single- column AUTO_INCREMENT key. This guards against re-using AUTO_INCREMENT values if the server restarts, and is enabled by default. The extra WHERE clause contains the maximum value of the auto-increment column as of the beginning of the archive or purge job. If new rows are inserted while pt-archiver is running, it will not see them. -sentinel type: string; default: /tmp/pt-archiver-sentinel Exit if this file exists. The presence of the file specified by --sentinel will cause pt-archiver to stop archiving and exit. The default is /tmp/pt-archiver-sentinel. You might find this handy to stop cron jobs gracefully if necessary. See also --stop. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Specify any variables you want to be set immediately after connecting to MySQL. These will be included in a SET command. -share-lock Adds the LOCK IN SHARE MODE modifier to SELECT statements. See http://dev.mysql.com/doc/en/innodb-locking-reads.html. -skip-foreign-key-checks Disables foreign key checks with SET FOREIGN_KEY_CHECKS=0. -sleep type: int Sleep time between fetches. Specifies how long to sleep between SELECT statements. Default is not to sleep at all. Transactions are NOT committed, and the --file file is NOT flushed, before sleeping. See --txn-size to control that. If --commit-each is specified, committing and flushing happens before sleeping. -sleep-coef type: float Calculate --sleep as a multiple of the last SELECT time. 16 Chapter 2. Tools
  • 21. Percona Toolkit Documentation, Release 2.1.1 If this option is specified, pt-archiver will sleep for the query time of the last SELECT multiplied by the specified coefficient. This is a slightly more sophisticated way to throttle the SELECTs: sleep a varying amount of time between each SELECT, depending on how long the SELECTs are taking. -socket short form: -S; type: string Socket file to use for connection. -source type: DSN DSN specifying the table to archive from (required). This argument is a DSN. See DSN OPTIONS for the syntax. Most options control how pt-archiver connects to MySQL, but there are some extended DSN options in this tool’s syntax. The D, t, and i options select a table to archive: --source h=my_server,D=my_database,t=my_tbl The a option specifies the database to set as the connection’s default with USE. If the b option is true, it disables binary logging with SQL_LOG_BIN. The m option specifies pluggable actions, which an external Perl module can provide. The only required part is the table; other parts may be read from various places in the environment (such as options files). The ‘i’ part deserves special mention. This tells pt-archiver which index it should scan to archive. This appears in a FORCE INDEX or USE INDEX hint in the SELECT statements used to fetch archivable rows. If you don’t specify anything, pt-archiver will auto-discover a good index, preferring a PRIMARY KEY if one exists. In my experience this usually works well, so most of the time you can probably just omit the ‘i’ part. The index is used to optimize repeated accesses to the table; pt-archiver remembers the last row it retrieves from each SELECT statement, and uses it to construct a WHERE clause, using the columns in the specified index, that should allow MySQL to start the next SELECT where the last one ended, rather than potentially scanning from the beginning of the table with each successive SELECT. If you are using external plugins, please see “EXTENDING” for a discussion of how they interact with ascending indexes. The ‘a’ and ‘b’ options allow you to control how statements flow through the binary log. If you specify the ‘b’ option, binary logging will be disabled on the specified connection. If you specify the ‘a’ option, the connection will USE the specified database, which you can use to prevent slaves from executing the binary log events with --replicate-ignore-db options. These two options can be used as different methods to achieve the same goal: archive data off the master, but leave it on the slave. For example, you can run a purge job on the master and prevent it from happening on the slave using your method of choice. WARNING: Using a default options file (F) DSN option that defines a socket for --source causes pt- archiver to connect to --dest using that socket unless another socket for --dest is specified. This means that pt-archiver may incorrectly connect to --source when it is meant to connect to --dest. For example: --source F=host1.cnf,D=db,t=tbl --dest h=host2 When pt-archiver connects to --dest, host2, it will connect via the --source, host1, socket defined in host1.cnf. -statistics Collect and print timing statistics. Causes pt-archiver to collect timing statistics about what it does. These statistics are available to the plugin specified by --plugin Unless you specify --quiet, pt-archiver prints the statistics when it exits. The statistics look like this: 2.2. pt-archiver 17
  • 22. Percona Toolkit Documentation, Release 2.1.1 Started at 2008-07-18T07:18:53, ended at 2008-07-18T07:18:53 Source: D=db,t=table SELECT 4 INSERT 4 DELETE 4 Action Count Time Pct commit 10 0.1079 88.27 select 5 0.0047 3.87 deleting 4 0.0028 2.29 inserting 4 0.0028 2.28 other 0 0.0040 3.29 The first two (or three) lines show times and the source and destination tables. The next three lines show how many rows were fetched, inserted, and deleted. The remaining lines show counts and timing. The columns are the action, the total number of times that action was timed, the total time it took, and the percent of the program’s total runtime. The rows are sorted in order of descending total time. The last row is the rest of the time not explicitly attributed to anything. Actions will vary depending on command-line options. If --why-quit is given, its behavior is changed slightly. This option causes it to print the reason for exiting even when it’s just because there are no more rows. This option requires the standard Time::HiRes module, which is part of core Perl on reasonably new Perl re- leases. -stop Stop running instances by creating the sentinel file. Causes pt-archiver to create the sentinel file specified by --sentinel and exit. This should have the effect of stopping all running instances which are watching the same sentinel file. -txn-size type: int; default: 1 Number of rows per transaction. Specifies the size, in number of rows, of each transaction. Zero disables transactions altogether. After pt- archiver processes this many rows, it commits both the --source and the --dest if given, and flushes the file given by --file. This parameter is critical to performance. If you are archiving from a live server, which for example is doing heavy OLTP work, you need to choose a good balance between transaction size and commit overhead. Larger transactions create the possibility of more lock contention and deadlocks, but smaller transactions cause more frequent commit overhead, which can be significant. To give an idea, on a small test set I worked with while writing pt-archiver, a value of 500 caused archiving to take about 2 seconds per 1000 rows on an otherwise quiet MySQL instance on my desktop machine, archiving to disk and to another table. Disabling transactions with a value of zero, which turns on autocommit, dropped performance to 38 seconds per thousand rows. If you are not archiving from or to a transactional storage engine, you may want to disable transactions so pt-archiver doesn’t try to commit. -user short form: -u; type: string User for login if not current user. -version Show version and exit. -where type: string 18 Chapter 2. Tools
  • 23. Percona Toolkit Documentation, Release 2.1.1 WHERE clause to limit which rows to archive (required). Specifies a WHERE clause to limit which rows are archived. Do not include the word WHERE. You may need to quote the argument to prevent your shell from interpreting it. For example: --where ’ts < current_date - interval 90 day’ For safety, --where is required. If you do not require a WHERE clause, use --where 1=1. -why-quit Print reason for exiting unless rows exhausted. Causes pt-archiver to print a message if it exits for any reason other than running out of rows to archive. This can be useful if you have a cron job with --run-time specified, for example, and you want to be sure pt-archiver is finishing before running out of time. If --statistics is given, the behavior is changed slightly. It will print the reason for exiting even when it’s just because there are no more rows. This output prints even if --quiet is given. That’s so you can put pt-archiver in a cron job and get an email if there’s an abnormal exit. 2.2.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • a copy: no Database to USE when executing queries. • A dsn: charset; copy: yes Default character set. • b copy: no If true, disable binlog with SQL_LOG_BIN. • D dsn: database; copy: yes Database that contains the table. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • i 2.2. pt-archiver 19
  • 24. Percona Toolkit Documentation, Release 2.1.1 copy: yes Index to use. • m copy: no Plugin module name. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • t copy: yes Table to archive from/to. • u dsn: user; copy: yes User for login if not current user. 2.2.9 EXTENDING pt-archiver is extensible by plugging in external Perl modules to handle some logic and/or actions. You can specify a module for both the --source and the --dest, with the ‘m’ part of the specification. For example: --source D=test,t=test1,m=My::Module1 --dest m=My::Module2,t=test2 This will cause pt-archiver to load the My::Module1 and My::Module2 packages, create instances of them, and then make calls to them during the archiving process. You can also specify a plugin with --plugin. The module must provide this interface: new(dbh => $dbh, db => $db_name, tbl => $tbl_name) The plugin’s constructor is passed a reference to the database handle, the database name, and table name. The plugin is created just after pt-archiver opens the connection, and before it examines the table given in the arguments. This gives the plugin a chance to create and populate temporary tables, or do other setup work. before_begin(cols => @cols, allcols => @allcols) This method is called just before pt-archiver begins iterating through rows and archiving them, but after it does all other setup work (examining table structures, designing SQL queries, and so on). This is the only time pt-archiver tells the plugin column names for the rows it will pass the plugin while archiving. 20 Chapter 2. Tools
  • 25. Percona Toolkit Documentation, Release 2.1.1 The cols argument is the column names the user requested to be archived, either by default or by the --columns option. The allcols argument is the list of column names for every row pt-archiver will fetch from the source table. It may fetch more columns than the user requested, because it needs some columns for its own use. When subsequent plugin functions receive a row, it is the full row containing all the extra columns, if any, added to the end. is_archivable(row => @row) This method is called for each row to determine whether it is archivable. This applies only to --source. The argument is the row itself, as an arrayref. If the method returns true, the row will be archived; otherwise it will be skipped. Skipping a row adds complications for non-unique indexes. Normally pt-archiver uses a WHERE clause designed to target the last processed row as the place to start the scan for the next SELECT statement. If you have skipped the row by returning false from is_archivable(), pt-archiver could get into an infinite loop because the row still exists. Therefore, when you specify a plugin for the --source argument, pt- archiver will change its WHERE clause slightly. Instead of starting at “greater than or equal to” the last processed row, it will start “strictly greater than.” This will work fine on unique indexes such as primary keys, but it may skip rows (leave holes) on non-unique indexes or when ascending only the first column of an index. pt-archiver will change the clause in the same way if you specify --no-delete, because again an infinite loop is possible. If you specify the --bulk-delete option and return false from this method, pt-archiver may not do what you want. The row won’t be archived, but it will be deleted, since bulk deletes operate on ranges of rows and don’t know which rows the plugin selected to keep. If you specify the --bulk-insert option, this method’s return value will influence whether the row is written to the temporary file for the bulk insert, so bulk inserts will work as expected. However, bulk inserts require bulk deletes. before_delete(row => @row) This method is called for each row just before it is deleted. This applies only to --source. This is a good place for you to handle dependencies, such as deleting things that are foreign-keyed to the row you are about to delete. You could also use this to recursively archive all dependent tables. This plugin method is called even if --no-delete is given, but not if --bulk-delete is given. before_bulk_delete(first_row => @row, last_row => @row) This method is called just before a bulk delete is executed. It is similar to the before_delete method, except its arguments are the first and last row of the range to be deleted. It is called even if --no-delete is given. before_insert(row => @row) This method is called for each row just before it is inserted. This applies only to --dest. You could use this to insert the row into multiple tables, perhaps with an ON DUPLICATE KEY UPDATE clause to build summary tables in a data warehouse. This method is not called if --bulk-insert is given. before_bulk_insert(first_row => @row, last_row => @row, filename => bulk_insert_filename) This method is called just before a bulk insert is executed. It is similar to the before_insert method, except its arguments are the first and last row of the range to be deleted. custom_sth(row => @row, sql => $sql) This method is called just before inserting the row, but after “before_insert()”. 
It allows the plugin to specify a different INSERT statement if desired. The return value (if any) should be a DBI statement
handle. The sql parameter is the SQL text used to prepare the default INSERT statement. This method is not called if you specify --bulk-insert. If no value is returned, the default INSERT statement handle is used.

This method applies only to the plugin specified for --dest, so if your plugin isn’t doing what you expect, check that you’ve specified it for the destination and not the source.

custom_sth_bulk(first_row => \@row, last_row => \@row, sql => $sql, filename => $bulk_insert_filename)

If you’ve specified --bulk-insert, this method is called just before the bulk insert, but after “before_bulk_insert()”, and the arguments are different. This method’s return value, etc., is similar to the “custom_sth()” method.

after_finish()

This method is called after pt-archiver exits the archiving loop, commits all database handles, closes --file, and prints the final statistics, but before pt-archiver runs ANALYZE or OPTIMIZE (see --analyze and --optimize).

If you specify a plugin for both --source and --dest, pt-archiver constructs, calls before_begin(), and calls after_finish() on the two plugins in the order --source, --dest.

pt-archiver assumes it controls transactions, and that the plugin will NOT commit or roll back the database handle. The database handle passed to the plugin’s constructor is the same handle pt-archiver uses itself. Remember that --source and --dest are separate handles.

A sample module might look like this:

package My::Module;

sub new {
   my ( $class, %args ) = @_;
   return bless(\%args, $class);
}

sub before_begin {
   my ( $self, %args ) = @_;
   # Save column names for later
   $self->{cols} = $args{cols};
}

sub is_archivable {
   my ( $self, %args ) = @_;
   # Do some advanced logic with $args{row}
   return 1;
}

sub before_delete {} # Take no action
sub before_insert {} # Take no action
sub custom_sth    {} # Take no action
sub after_finish  {} # Take no action

1;

2.2.10 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like:
  • 27. Percona Toolkit Documentation, Release 2.1.1 PTDEBUG=1 pt-archiver ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.2.11 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.2.12 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-archiver. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.2.13 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.2.14 AUTHORS Baron Schwartz 2.2.15 ACKNOWLEDGMENTS Andrew O’Brien 2.2. pt-archiver 23
  • 28. Percona Toolkit Documentation, Release 2.1.1 2.2.16 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.2.17 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.2.18 VERSION pt-archiver 2.1.1 2.3 pt-config-diff 2.3.1 NAME pt-config-diff - Diff MySQL configuration files and server variables. 2.3.2 SYNOPSIS Usage pt-config-diff [OPTION...] CONFIG CONFIG [CONFIG...] pt-config-diff diffs MySQL configuration files and server variables. CONFIG can be a filename or a DSN. At least two CONFIG sources must be given. Like standard Unix diff, there is no output if there are no differences. Diff host1 config from SHOW VARIABLES against host2: pt-config-diff h=host1 h=host2 Diff config from [mysqld] section in my.cnf against host1 config: pt-config-diff /etc/my.cnf h=host1 Diff the [mysqld] section of two option files: 24 Chapter 2. Tools
  • 29. Percona Toolkit Documentation, Release 2.1.1 pt-config-diff /etc/my-small.cnf /etc/my-large.cnf 2.3.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-config-diff reads MySQL’s configuration and examines it and is thus very low risk. At the time of this release there are no known bugs that pose a serious risk. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- config-diff. See also “BUGS” for more information on filing bugs and getting help. 2.3.4 DESCRIPTION pt-config-diff diffs MySQL configurations by examining the values of server system variables from two or more CONFIG sources specified on the command line. A CONFIG source can be a DSN or a filename containing the output of mysqld --help --verbose, my_print_defaults, SHOW VARIABLES, or an option file (e.g. my.cnf). For each DSN CONFIG, pt-config-diff connects to MySQL and gets variables and values by executing SHOW /*!40103 GLOBAL*/ VARIABLES. This is an “active config” because it shows what server values MySQL is actively (currently) running with. Only variables that all CONFIG sources have are compared because if a variable is not present then we cannot know or safely guess its value. For example, if you compare an option file (e.g. my.cnf) to an active config (i.e. SHOW VARIABLES from a DSN CONFIG), the option file will probably only have a few variables, whereas the active config has every variable. Only values of the variables present in both configs are compared. Option file and DSN configs provide the best results. 2.3.5 OUTPUT There is no output when there are no differences. When there are differences, pt-config-diff prints a report to STDOUT that looks similar to the following: 2 config differences Variable my.master.cnf my.slave.cnf ========================= =============== =============== datadir /tmp/12345/data /tmp/12346/data port 12345 12346 Comparing MySQL variables is difficult because there are many variations and subtleties across the many versions and distributions of MySQL. When a comparison fails, the tool prints a warning to STDERR, such as the following: Comparing log_error values (mysqld.log, /tmp/12345/data/mysqld.log) caused an error: Argument "/tmp/12345/data/mysqld.log" isn’t numeric in numeric eq (==) at ./pt-config-diff line 2311. Please report these warnings so the comparison functions can be improved. 2.3. pt-config-diff 25
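For example, to routinely verify that two servers are running with matching configurations while skipping variables that are expected to differ, something like the following could be used (a sketch; the host names and the variable list are placeholders, and it assumes --ignore-variables accepts a comma-separated list, as array-typed options in this toolkit generally do):

pt-config-diff h=host1 h=host2 --ignore-variables=server_id,datadir,port
# No output and a zero exit status mean the active configs match; a non-zero
# exit status (see EXIT STATUS below) makes this easy to wire into a cron job
# or monitoring check.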
  • 30. Percona Toolkit Documentation, Release 2.1.1 2.3.6 EXIT STATUS pt-config-diff exits with a zero exit status when there are no differences, and 1 if there are. 2.3.7 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. (This option does not specify a CONFIG; it’s equivalent to --defaults-file.) -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore-variables type: array Ignore, do not compare, these variables. -password short form: -p; type: string Password to use for connection. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. 26 Chapter 2. Tools
  • 31. Percona Toolkit Documentation, Release 2.1.1 -[no]report default: yes Print the MySQL config diff report to STDOUT. If you just want to check if the given configs are different or not by examining the tool’s exit status, then specify --no-report to suppress the report. -report-width type: int; default: 78 Truncate report lines to this many characters. Since some variable values can be long, or when comparing multiple configs, it may help to increase the report width so values are not truncated beyond readability. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -user short form: -u; type: string MySQL user if not current user. -version Show version and exit. 2.3.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p 2.3. pt-config-diff 27
  • 32. Percona Toolkit Documentation, Release 2.1.1 dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.3.9 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-config-diff ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.3.10 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.3.11 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-config-diff. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.3.12 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: 28 Chapter 2. Tools
  • 33. Percona Toolkit Documentation, Release 2.1.1 wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.3.13 AUTHORS Baron Schwartz and Daniel Nichter 2.3.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.3.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.3.16 VERSION pt-config-diff 2.1.1 2.4 pt-deadlock-logger 2.4.1 NAME pt-deadlock-logger - Extract and log MySQL deadlock information. 2.4. pt-deadlock-logger 29
  • 34. Percona Toolkit Documentation, Release 2.1.1 2.4.2 SYNOPSIS Usage pt-deadlock-logger [OPTION...] SOURCE_DSN pt-deadlock-logger extracts and saves information about the most recent deadlock in a MySQL server. Print deadlocks on SOURCE_DSN: pt-deadlock-logger SOURCE_DSN Store deadlock information from SOURCE_DSN in test.deadlocks table on SOURCE_DSN (source and destination are the same host): pt-deadlock-logger SOURCE_DSN --dest D=test,t=deadlocks Store deadlock information from SOURCE_DSN in test.deadlocks table on DEST_DSN (source and destination are different hosts): pt-deadlock-logger SOURCE_DSN --dest DEST_DSN,D=test,t=deadlocks Daemonize and check for deadlocks on SOURCE_DSN every 30 seconds for 4 hours: pt-deadlock-logger SOURCE_DSN --dest D=test,t=deadlocks --daemonize --run-time 4h --interval 30s 2.4.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-deadlock-logger is a read-only tool unless you specify a --dest table. In some cases polling SHOW INNODB STATUS too rapidly can cause extra load on the server. If you’re using it on a production server under very heavy load, you might want to set --interval to 30 seconds or more. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- deadlock-logger. See also “BUGS” for more information on filing bugs and getting help. 2.4.4 DESCRIPTION pt-deadlock-logger extracts deadlock data from a MySQL server. Currently only InnoDB deadlock information is available. You can print the information to standard output, store it in a database table, or both. If neither --print nor --dest are given, then the deadlock information is printed by default. If only --dest is given, then the deadlock information is only stored. If both options are given, then the deadlock information is printed and stored. The source host can be specified using one of two methods. The first method is to use at least one of the standard connection-related command line options: --defaults-file, --password, --host, --port, --socket or --user. These options only apply to the source host; they cannot be used to specify the destination host. The second method to specify the source host, or the optional destination host using --dest, is a DSN. A DSN is a special syntax that can be either just a hostname (like server.domain.com or 1.2.3.4), or a key=value,key=value string. Keys are a single letter: 30 Chapter 2. Tools
KEY   MEANING
===   =======
h     Connect to host
P     Port number to use for connection
S     Socket file to use for connection
u     User for login if not current user
p     Password to use when connecting
F     Only read default options from the given file

If you omit any values from the destination host DSN, they are filled in with values from the source host, so you don’t need to specify them in both places. pt-deadlock-logger reads all normal MySQL option files, such as ~/.my.cnf, so you may not need to specify username, password and other common options at all.

2.4.5 OUTPUT

You can choose which columns are output and/or saved to --dest with the --columns argument. The default columns are as follows:

server
  The (source) server on which the deadlock occurred. This might be useful if you’re tracking deadlocks on many servers.

ts
  The date and time of the last detected deadlock.

thread
  The MySQL thread number, which is the same as the connection ID in SHOW FULL PROCESSLIST.

txn_id
  The InnoDB transaction ID, which InnoDB expresses as two unsigned integers. I have multiplied them out to be one number.

txn_time
  How long the transaction was active when the deadlock happened.

user
  The connection’s database username.

hostname
  The connection’s host.

ip
  The connection’s IP address. If you specify --numeric-ip, this is converted to an unsigned integer.

db
  The database in which the deadlock occurred.

tbl
  The table on which the deadlock occurred.

idx
  The index on which the deadlock occurred.

lock_type
  • 36. Percona Toolkit Documentation, Release 2.1.1 The lock type the transaction held on the lock that caused the deadlock. lock_mode The lock mode of the lock that caused the deadlock. wait_hold Whether the transaction was waiting for the lock or holding the lock. Usually you will see the two waited- for locks. victim Whether the transaction was selected as the deadlock victim and rolled back. query The query that caused the deadlock. 2.4.6 INNODB CAVEATS AND DETAILS InnoDB’s output is hard to parse and sometimes there’s no way to do it right. Sometimes not all information (for example, username or IP address) is included in the deadlock information. In this case there’s nothing for the script to put in those columns. It may also be the case that the deadlock output is so long (because there were a lot of locks) that the whole thing is truncated. Though there are usually two transactions involved in a deadlock, there are more locks than that; at a minimum, one more lock than transactions is necessary to create a cycle in the waits-for graph. pt-deadlock-logger prints the transactions (always two in the InnoDB output, even when there are more transactions in the waits-for graph than that) and fills in locks. It prefers waited-for over held when choosing lock information to output, but you can figure out the rest with a moment’s thought. If you see one wait-for and one held lock, you’re looking at the same lock, so of course you’d prefer to see both wait-for locks and get more information. If the two waited-for locks are not on the same table, more than two transactions were involved in the deadlock. 2.4.7 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -clear-deadlocks type: string Use this table to create a small deadlock. This usually has the effect of clearing out a huge deadlock, which otherwise consumes the entire output of SHOW INNODB STATUS. The table must not exist. pt-deadlock- logger will create it with the following MAGIC_clear_deadlocks structure: CREATE TABLE test.deadlock_maker(a INT PRIMARY KEY) ENGINE=InnoDB; After creating the table and causing a small deadlock, the tool will drop the table again. 32 Chapter 2. Tools
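For example, a hypothetical invocation that uses this option to clear a huge deadlock out of SHOW INNODB STATUS while logging (the DSN and table name here are placeholders; adjust them for your environment):

  pt-deadlock-logger h=localhost,u=root --ask-pass --clear-deadlocks test.deadlock_maker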
  • 37. Percona Toolkit Documentation, Release 2.1.1 -[no]collapse Collapse whitespace in queries to a single space. This might make it easier to inspect on the command line or in a query. By default, whitespace is collapsed when printing with --print, but not modified when storing to --dest. (That is, the default is different for each action). -columns type: hash Output only this comma-separated list of columns. See “OUTPUT” for more details on columns. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -create-dest-table Create the table specified by --dest. Normally the --dest table is expected to exist already. This option causes pt-deadlock-logger to create the table automatically using the suggested table structure. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -dest type: DSN DSN for where to store deadlocks; specify at least a database (D) and table (t). Missing values are filled in with the same values from the source host, so you can usually omit most parts of this argument if you’re storing deadlocks on the same server on which they happen. By default, whitespace in the query column is left intact; use --[no]collapse if you want whitespace collapsed. The following MAGIC_dest_table is suggested if you want to store all the information pt-deadlock-logger can extract about deadlocks: CREATE TABLE deadlocks ( server char(20) NOT NULL, ts datetime NOT NULL, thread int unsigned NOT NULL, txn_id bigint unsigned NOT NULL, txn_time smallint unsigned NOT NULL, user char(16) NOT NULL, hostname char(20) NOT NULL, ip char(15) NOT NULL, -- alternatively, ip int unsigned NOT NULL db char(64) NOT NULL, tbl char(64) NOT NULL, idx char(64) NOT NULL, lock_type char(16) NOT NULL, lock_mode char(1) NOT NULL, wait_hold char(1) NOT NULL, victim tinyint unsigned NOT NULL, query text NOT NULL, PRIMARY KEY (server,ts,thread) ) ENGINE=InnoDB 2.4. pt-deadlock-logger 33
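For example, to store deadlocks in this suggested table on a separate monitoring server and let the tool create the table for you, you might run something like the following (the host and database names are placeholders):

  pt-deadlock-logger h=prod-db --dest h=monitor-db,D=percona,t=deadlocks --create-dest-table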
  • 38. Percona Toolkit Documentation, Release 2.1.1 If you use --columns, you can omit whichever columns you don’t want to store. -help Show help and exit. -host short form: -h; type: string Connect to host. -interval type: time How often to check for deadlocks. If no --run-time is specified, pt-deadlock-logger runs forever, checking for deadlocks at every interval. See also --run-time. -log type: string Print all output to this file when daemonized. -numeric-ip Express IP addresses as integers. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -print Print results on standard output. See “OUTPUT” for more. By default, enables --[no]collapse unless you explicitly disable it. If --interval or --run-time is specified, only new deadlocks are printed at each interval. A fingerprint for each deadlock is created using --columns server, ts and thread (even if those columns were not specified by --columns) and if the current deadlock’s fingerprint is different from the last deadlock’s fingerprint, then it is printed. -run-time type: time How long to run before exiting. By default pt-deadlock-logger runs once, checks for deadlocks, and exits. If --run-time is specified but no --interval is specified, a default 1 second interval will be used. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string 34 Chapter 2. Tools
  • 39. Percona Toolkit Documentation, Release 2.1.1 Socket file to use for connection. -tab Print tab-separated columns, instead of aligned. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.4.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • t Table in which to store deadlock information. • u 2.4. pt-deadlock-logger 35
  • 40. Percona Toolkit Documentation, Release 2.1.1 dsn: user; copy: yes User for login if not current user. 2.4.9 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-deadlock-logger ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.4.10 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.4.11 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-deadlock-logger. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.4.12 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.4.13 AUTHORS Baron Schwartz 36 Chapter 2. Tools
  • 41. Percona Toolkit Documentation, Release 2.1.1 2.4.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.4.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.4.16 VERSION pt-deadlock-logger 2.1.1 2.5 pt-diskstats 2.5.1 NAME pt-diskstats - An interactive I/O monitoring tool for GNU/Linux. 2.5.2 SYNOPSIS Usage pt-diskstats [OPTION...] [FILES] pt-diskstats prints disk I/O statistics for GNU/Linux. It is somewhat similar to iostat, but it is interactive and more detailed. It can analyze samples gathered from another machine. 2.5.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-diskstats simply reads /proc/diskstats. It should be very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. 2.5. pt-diskstats 37
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-diskstats.

See also “BUGS” for more information on filing bugs and getting help.

2.5.4 DESCRIPTION

The pt-diskstats tool is similar to iostat, but has some advantages. It prints read and write statistics separately, and has more columns. It is menu-driven and interactive, with several different ways to aggregate the data. It integrates well with the pt-stalk tool. It also does the “right thing” by default, such as hiding disks that are idle. These properties make it very convenient for quickly drilling down into I/O performance and inspecting disk behavior.

This program works in two modes. The default is to collect samples of /proc/diskstats and print out the formatted statistics at intervals. The other mode is to process a file that contains saved samples of /proc/diskstats; there is a shell script later in this documentation that shows how to collect such a file. In both cases, the tool is interactively controlled by keystrokes, so you can redisplay and slice the data flexibly and easily. It loops forever, until you exit with the ‘q’ key. If you press the ‘?’ key, you will bring up the interactive help menu that shows which keys control the program.

When the program is gathering samples of /proc/diskstats and refreshing its display, it prints information about the newest sample each time it refreshes. When it is operating on a file of saved samples, it redraws the entire file’s contents every time you change an option.

The program doesn’t print information about every block device on the system. It hides devices that it has never observed to have any activity. You can enable and disable this by pressing the ‘i’ key.

2.5.5 OUTPUT

In the rest of this documentation, we will try to clarify the distinction between block devices (/dev/sda1, for example), which the kernel presents to the application via a filesystem, versus the (usually) physical device underneath the block device, which could be a disk, a RAID controller, and so on. We will sometimes refer to logical I/O operations, which occur at the block device, versus physical I/Os which are performed on the underlying device. When we refer to the queue, we are speaking of the queue associated with the block device, which holds requests until they’re issued to the physical device.

The program’s output looks like the following sample, which is too wide for this manual page, so we have formatted it as several samples with line breaks:

  #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt
  {6} sda     0.9     4.2     0.0     0%    0.0  17.9
  {6} sdb     0.4     4.0     0.0     0%    0.0  26.1
  {6} dm-0    0.0     4.0     0.0     0%    0.0  13.5
  {6} dm-1    0.8     4.0     0.0     0%    0.0  16.0

  ...  wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt
  ...  99.7     6.2     0.6    35%    3.7  23.7
  ...  14.5    15.8     0.2    75%    0.5   9.2
  ...   1.0     4.0     0.0     0%    0.0   2.3
  ... 117.7     4.0     0.5     0%    4.1  35.1

  ... busy in_prg  io_s qtime stime
  ...   6%      0 100.6  23.3   0.4
  ...   4%      0  14.9   8.6   0.6
  ...   0%      0   1.1   1.5   1.2
  ...   5%      0 118.5  34.5   0.4
  • 43. Percona Toolkit Documentation, Release 2.1.1 The columns are as follows: #ts This column’s contents vary depending on the tool’s aggregation mode. In the default mode, when each line contains information about a single disk but possibly aggregates across several samples from that disk, this column shows the number of samples that were included into the line of output, in {curly braces}. In the example shown, each line of output aggregates {10} samples of /proc/diskstats. In the “all” group-by mode, this column shows timestamp offsets, relative to the time the tool began aggregating or the timestamp of the previous lines printed, depending on the mode. The output can be confusing to explain, but it’s rather intuitive when you see the lines appearing on your screen periodically. Similarly, in “sample” group-by mode, the number indicates the total time span that is grouped into each sample. If you specify --show-timestamps, this field instead shows the timestamp at which the sample was taken; if multiple timestamps are present in a single line of output, then the first timestamp is used. device The device name. If there is more than one device, then instead the number of devices aggregated into the line is shown, in {curly braces}. rd_s The average number of reads per second. This is the number of I/O requests that were sent to the underly- ing device. This usually is a smaller number than the number of logical IO requests made by applications. More requests might have been queued to the block device, but some of them usually are merged before being sent to the disk. This field is computed from the contents of /proc/diskstats as follows. See “KERNEL DOCUMENTA- TION” below for the meaning of the field numbers: delta[field1] / delta[time] rd_avkb The average size of the reads, in kilobytes. This field is computed as follows: 2 * delta[field3] / delta[field1] rd_mb_s The average number of megabytes read per second. Computed as follows: 2 * delta[field3] / delta[time] rd_mrg The percentage of read requests that were merged together in the queue scheduler before being sent to the physical device. The field is computed as follows: 100 * delta[field2] / (delta[field2] + delta[field1]) rd_cnc The average concurrency of the read operations, as computed by Little’s Law. This is the end-to-end concurrency on the block device, not the underlying disk’s concurrency. It includes time spent in the queue. The field is computed as follows: delta[field4] / delta[time] / 1000 / devices-in-group rd_rt 2.5. pt-diskstats 39
  • 44. Percona Toolkit Documentation, Release 2.1.1 The average response time of the read operations, in milliseconds. This is the end-to-end response time, including time spent in the queue. It is the response time that the application making I/O requests sees, not the response time of the physical disk underlying the block device. It is computed as follows: delta[field4] / (delta[field1] + delta[field2]) wr_s, wr_avkb, wr_mb_s, wr_mrg, wr_cnc, wr_rt These columns show write activity, and they match the corresponding columns for read activity. busy The fraction of wall-clock time that the device had at least one request in progress; this is what iostat calls %util, and indeed it is utilization, depending on how you define utilization, but that is sometimes ambiguous in common parlance. It may also be called the residence time; the time during which at least one request was resident in the system. It is computed as follows: 100 * delta[field10] / (1000 * delta[time]) This field cannot exceed 100% unless there is a rounding error, but it is a common mistake to think that a device that’s busy all the time is saturated. A device such as a RAID volume should support concurrency higher than 1, and solid-state drives can support very high concurrency. Concurrency can grow without bound, and is a more reliable indicator of how loaded the device really is. in_prg The number of requests that were in progress. Unlike the read and write concurrencies, which are averages that are generated from reliable numbers, this number is an instantaneous sample, and you can see that it might represent a spike of requests, rather than the true long-term average. If this number is large, it essentially means that the device is heavily loaded. It is computed as follows: field9 ios_s The average throughput of the physical device, in I/O operations per second (IOPS). This column shows the total IOPS the underlying device is handling. It is the sum of rd_s and wr_s. qtime The average queue time; that is, time a request spends in the device scheduler queue before being sent to the physical device. This is an average over reads and writes. It is computed in a slightly complex way: the average response time seen by the application, minus the average service time (see the description of the next column). This is derived from the queueing theory formula for response time, R = W + S: response time = queue time + service time. This is solved for W, of course, to give W = R - S. The computation follows: delta[field11] / (delta[field1, 2, 5, 6] + delta[field9]) - delta[field10] / delta[field1, 2, 5, 6] See the description for stime for more details and cautions. stime The average service time; that is, the time elapsed while the physical device processes the request, after the request finishes waiting in the queue. This is an average over reads and writes. It is computed from the queueing theory utilization formula, U = SX, solved for S. This means that utilization divided by throughput gives service time: delta[field10] / (delta[field1, 2, 5, 6]) 40 Chapter 2. Tools
  • 45. Percona Toolkit Documentation, Release 2.1.1 Note, however, that there can be some kernel bugs that cause field 9 in /proc/diskstats to become negative, and this can cause field 10 to be wrong, thus making the service time computation not wholly trustworthy. Note that in the above formula we use utilization very specifically. It is a duration, not a percentage. You can compare the stime and qtime columns to see whether the response time for reads and writes is spent in the queue or on the physical device. However, you cannot see the difference between reads and writes. Changing the block device scheduler algorithm might improve queue time greatly. The default algorithm, cfq, is very bad for servers, and should only be used on laptops and workstations that perform tasks such as working with spreadsheets and surfing the Internet. If you are used to using iostat, you might wonder where you can find the same information in pt-diskstats. Here are two samples of output from both tools on the same machine at the same time, for /dev/sda, wrapped to fit: #ts dev rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt 08:50:10 sda 0.0 0.0 0.0 0% 0.0 0.0 08:50:20 sda 0.4 4.0 0.0 0% 0.0 15.5 08:50:30 sda 2.1 4.4 0.0 0% 0.0 21.1 08:50:40 sda 2.4 4.0 0.0 0% 0.0 15.4 08:50:50 sda 0.1 4.0 0.0 0% 0.0 33.0 wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt 7.7 25.5 0.2 84% 0.0 0.3 49.6 6.8 0.3 41% 2.4 28.8 210.1 5.6 1.1 28% 7.4 25.2 297.1 5.4 1.6 26% 11.4 28.3 11.9 11.7 0.1 66% 0.2 4.9 busy in_prg io_s qtime stime 1% 0 7.7 0.1 0.2 6% 0 50.0 28.1 0.7 12% 0 212.2 24.8 0.4 16% 0 299.5 27.8 0.4 1% 0 12.0 4.7 0.3 Dev rrqm/s wrqm/s r/s w/s rMB/s wMB/s 08:50:10 sda 0.00 41.40 0.00 7.70 0.00 0.19 08:50:20 sda 0.00 34.70 0.40 49.60 0.00 0.33 08:50:30 sda 0.00 83.30 2.10 210.10 0.01 1.15 08:50:40 sda 0.00 105.10 2.40 297.90 0.01 1.58 08:50:50 sda 0.00 22.50 0.10 11.10 0.00 0.13 avgrq-sz avgqu-sz await svctm %util 51.01 0.02 2.04 1.25 0.96 13.55 2.44 48.76 1.16 5.79 11.15 7.45 35.10 0.55 11.76 10.81 11.40 37.96 0.53 15.97 24.07 0.17 15.60 0.87 0.97 The correspondence between the columns is not one-to-one. In particular: rrqm/s, wrqm/s These columns in iostat are replaced by rd_mrg and wr_mrg in pt-diskstats. avgrq-sz This column is in sectors in iostat, and is a combination of reads and writes. The pt-diskstats output breaks these out separately and shows them in kB. You can derive it via a weighted average of rd_avkb and wr_avkb in pt-diskstats, and then multiply by 2 to get sectors (each sector is 512 bytes). 2.5. pt-diskstats 41
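As a rough worked example using the 08:50:30 rows above (the published values are rounded, so the result is approximate):

  weighted kB per request = (rd_s * rd_avkb + wr_s * wr_avkb) / (rd_s + wr_s)
                          = (2.1 * 4.4 + 210.1 * 5.6) / (2.1 + 210.1)
                          = 1185.8 / 212.2 = 5.6 kB
  avgrq-sz                = 5.6 kB * 2 sectors per kB = 11.2 sectors

which is close to the 11.15 that iostat reports for the same interval.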
avgqu-sz
  This column really represents concurrency at the block device scheduler. The pt-diskstats output shows concurrency for reads and writes separately: rd_cnc and wr_cnc.

await
  This column is the average response time from the beginning to the end of a request to the block device, including queue time and service time, and is not shown in pt-diskstats. Instead, pt-diskstats shows individual response times at the disk level for reads and writes (rd_rt and wr_rt), as well as queue time versus service time for reads and writes in aggregate.

svctm
  This column is the average service time at the disk, and is shown as stime in pt-diskstats.

%util
  This column is called busy in pt-diskstats. Utilization is usually defined as the portion of time during which there was at least one active request, not as a percentage, which is why we chose to avoid this confusing term.

2.5.6 COLLECTING DATA

It is straightforward to gather a sample of data for this tool. Files should have this format, with a timestamp line preceding each sample of statistics:

  TS <timestamp>
  <contents of /proc/diskstats>
  TS <timestamp>
  <contents of /proc/diskstats>
  ... et cetera

You can simply use pt-diskstats with --save-samples to collect this data for you. If you wish to capture samples as part of some other tool, and use pt-diskstats to analyze them, you can include a snippet of shell script such as the following:

  INTERVAL=1
  while true; do
     sleep=$(date +%s.%N | awk "{print $INTERVAL - (\$1 % $INTERVAL)}")
     sleep $sleep
     date +"TS %s.%N %F %T" >> diskstats-samples.txt
     cat /proc/diskstats >> diskstats-samples.txt
  done

2.5.7 KERNEL DOCUMENTATION

This documentation supplements the official documentation on the contents of /proc/diskstats (http://www.kernel.org/doc/Documentation/iostats.txt). That documentation can sometimes be difficult to understand for those who are not familiar with Linux kernel internals. The contents of /proc/diskstats are generated by the diskstats_show() function in the kernel source file block/genhd.c.

Here is a sample of /proc/diskstats on a recent kernel.

  8 1 sda1 426 243 3386 2056 3 0 18 87 0 2135 2142

The fields in this sample are as follows. The first three fields are the major and minor device numbers (8, 1), and the device name (sda1). They are followed by 11 fields of statistics:
  • 47. Percona Toolkit Documentation, Release 2.1.1 1. The number of reads completed. This is the number of physical reads done by the underlying disk, not the number of reads that applications made from the block device. This means that 426 actual reads have completed successfully to the disk on which /dev/sda1 resides. Reads are not counted until they complete. 2. The number of reads merged because they were adjacent. In the sample, 243 reads were merged. This means that /dev/sda1 actually received 869 logical reads, but sent only 426 physical reads to the underlying physical device. 3. The number of sectors read successfully. The 426 physical reads to the disk read 3386 sectors. Sectors are 512 bytes, so a total of about 1.65MB have been read from /dev/sda1. 4. The number of milliseconds spent reading. This counts only reads that have completed, not reads that are in progress. It counts the time spent from when requests are placed on the queue until they complete, not the time that the underlying disk spends servicing the requests. That is, it measures the total response time seen by applications, not disk response times. 5. Ditto for field 1, but for writes. 6. Ditto for field 2, but for writes. 7. Ditto for field 3, but for writes. 8. Ditto for field 4, but for writes. 9. The number of I/Os currently in progress, that is, they’ve been scheduled by the queue scheduler and issued to the disk (submitted to the underlying disk’s queue), but not yet completed. There are bugs in some kernels that cause this number, and thus fields 10 and 11, to be wrong sometimes. 10. The total number of milliseconds spent doing I/Os. This is not the total response time seen by the applications; it is the total amount of time during which at least one I/O was in progress. If one I/O is issued at time 100, another comes in at 101, and both of them complete at 102, then this field increments by 2, not 3. 11. This field counts the total response time of all I/Os. In contrast to field 10, it counts double when two I/Os overlap. In our previous example, this field would increment by 3, not 2. 2.5.8 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -columns-regex type: string; default: . Print columns that match this Perl regex. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -devices-regex type: string Print devices that match this Perl regex. -group-by type: string; default: disk Group-by mode: disk, sample, or all. In disk mode, each line of output shows one disk device. In sample mode, each line of output shows one sample of statistics. In all mode, each line of output shows one sample and one disk device. 2.5. pt-diskstats 43
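For example, to review a previously collected samples file one sample per line instead of one disk per line, you might run (the file name is a placeholder):

  pt-diskstats --group-by sample diskstats-samples.txt

The same modes can also be selected interactively while the tool is running; press the ‘?’ key to see the key bindings.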
  • 48. Percona Toolkit Documentation, Release 2.1.1 -headers type: Hash; default: group,scroll If group is present, each sample will be separated by a blank line, unless the sample is only one line. If scroll is present, the tool will print the headers as often as needed to prevent them from scrolling out of view. Note that you can press the space bar, or the enter key, to reprint headers at will. -help Show help and exit. -interval type: int; default: 1 When in interactive mode, wait N seconds before printing to the screen. Also, how often the tool should sample /proc/diskstats. The tool attempts to gather statistics exactly on even intervals of clock time. That is, if you specify a 5-second interval, it will try to capture samples at 12:00:00, 12:00:05, and so on; it will not gather at 12:00:01, 12:00:06 and so forth. This can lead to slightly odd delays in some circumstances, because the tool waits one full cycle before printing out the first set of lines. (Unlike iostat and vmstat, pt-diskstats does not start with a line representing the averages since the computer was booted.) Therefore, the rule has an exception to avoid very long delays. Suppose you specify a 10-second interval, but you start the tool at 12:00:00.01. The tool might wait until 12:00:20 to print its first lines of output, and in the intervening 19.99 seconds, it would appear to do nothing. To alleviate this, the tool waits until the next even interval of time to gather, unless more than 20% of that interval remains. This means the tool will never wait more than 120% of the sampling interval to produce output, e.g if you start the tool at 12:00:53 with a 10-second sampling interval, then the first sample will be only 7 seconds long, not 10 seconds. -iterations type: int When in interactive mode, stop after N samples. Run forever by default. -sample-time type: int; default: 1 In –group-by sample mode, include N seconds of samples per group. -save-samples type: string File to save diskstats samples in; these can be used for later analysis. -show-inactive Show inactive devices. -show-timestamps Show a ‘HH:MM:SS’ timestamp in the #ts column. If multiple timestamps are aggregated into one line, the first timestamp is shown. -version Show version and exit. 2.5.9 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: 44 Chapter 2. Tools
  • 49. Percona Toolkit Documentation, Release 2.1.1 PTDEBUG=1 pt-diskstats ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.5.10 SYSTEM REQUIREMENTS This tool requires Perl v5.8.0 or newer and the /proc filesystem, unless reading from files. 2.5.11 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-diskstats. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.5.12 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.5.13 AUTHORS Baron Schwartz, Brian Fraser, and Daniel Nichter 2.5.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.5. pt-diskstats 45
  • 50. Percona Toolkit Documentation, Release 2.1.1 2.5.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.5.16 VERSION pt-diskstats 2.1.1 2.6 pt-duplicate-key-checker 2.6.1 NAME pt-duplicate-key-checker - Find duplicate indexes and foreign keys on MySQL tables. 2.6.2 SYNOPSIS Usage pt-duplicate-key-checker [OPTION...] [DSN] pt-duplicate-key-checker examines MySQL tables for duplicate or redundant indexes and foreign keys. Connection options are read from MySQL option files. pt-duplicate-key-checker --host host1 2.6.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-duplicate-key-checker is a read-only tool that executes SHOW CREATE TABLE and related queries to inspect table structures, and thus is very low-risk. At the time of this release, there is an unconfirmed bug that causes the tool to crash. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- duplicate-key-checker. See also “BUGS” for more information on filing bugs and getting help. 46 Chapter 2. Tools
2.6.4 DESCRIPTION

This program examines the output of SHOW CREATE TABLE on MySQL tables, and if it finds indexes that cover the same columns as another index in the same order, or cover an exact leftmost prefix of another index, it prints out the suspicious indexes. By default, indexes must be of the same type, so a BTREE index is not a duplicate of a FULLTEXT index, even if they have the same columns. You can override this.

It also looks for duplicate foreign keys. A duplicate foreign key covers the same columns as another in the same table, and references the same parent table.

2.6.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.

-all-structs
  Compare indexes with different structs (BTREE, HASH, etc). By default this is disabled, because a BTREE index that covers the same columns as a FULLTEXT index is not really a duplicate, for example.

-ask-pass
  Prompt for a password when connecting to MySQL.

-charset
  short form: -A; type: string
  Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.

-[no]clustered
  default: yes
  PK columns appended to secondary key is duplicate. Detects when a suffix of a secondary key is a leftmost prefix of the primary key, and treats it as a duplicate key. Only detects this condition on storage engines whose primary keys are clustered (currently InnoDB and solidDB).
  Clustered storage engines append the primary key columns to the leaf nodes of all secondary keys anyway, so you might consider it redundant to have them appear in the internal nodes as well. Of course, you may also want them in the internal nodes, because just having them at the leaf nodes won’t help for some queries. It does help for covering index queries, however.
  Here’s an example of a key that is considered redundant with this option:

    PRIMARY KEY (`a`)
    KEY `b` (`b`,`a`)

  The use of such indexes is rather subtle. For example, suppose you have the following query:

    SELECT ... WHERE b=1 ORDER BY a;

  This query will do a filesort if we remove the index on b,a. But if we shorten the index on b,a to just b and also remove the ORDER BY, the query should return the same results.
  The tool suggests shortening duplicate clustered keys by dropping the key and re-adding it without the primary key prefix. The shortened clustered key may still duplicate another key, but the tool cannot currently detect when this happens without being run a second time to re-check the newly shortened clustered keys. Therefore, if you shorten any duplicate clustered keys, you should run the tool again.
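Concretely, shortening the example key above means dropping it and re-adding it without the trailing primary key column, along the lines of the following statement (an illustration with a made-up table name; the exact statement pt-duplicate-key-checker prints may differ):

  mysql -e "ALTER TABLE test.t DROP INDEX b, ADD INDEX b (b)"

Afterwards, run the tool again to confirm that the shortened key does not duplicate another index.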
  • 52. Percona Toolkit Documentation, Release 2.1.1 -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -databases short form: -d; type: hash Check only this comma-separated list of databases. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -engines short form: -e; type: hash Check only tables whose storage engine is in this comma-separated list. -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore-databases type: Hash Ignore this comma-separated list of databases. -ignore-engines type: Hash Ignore this comma-separated list of storage engines. -ignore-order Ignore index order so KEY(a,b) duplicates KEY(b,a). -ignore-tables type: Hash Ignore this comma-separated list of tables. Table names may be qualified with the database name. -key-types type: string; default: fk Check for duplicate f=foreign keys, k=keys or fk=both. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. 48 Chapter 2. Tools
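As a hypothetical illustration of combining the scope-limiting options on this page (the database and table names are made up):

  pt-duplicate-key-checker h=localhost --databases app --ignore-tables app.sessions --key-types k

This checks only the app database, skips the app.sessions table, and reports duplicate keys but not duplicate foreign keys.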
  • 53. Percona Toolkit Documentation, Release 2.1.1 -port short form: -P; type: int Port number to use for connection. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -[no]sql default: yes Print DROP KEY statement for each duplicate key. By default an ALTER TABLE DROP KEY statement is printed below each duplicate key so that, if you want to remove the duplicate key, you can copy-paste the statement into MySQL. To disable printing these statements, specify --no-sql. -[no]summary default: yes Print summary of indexes at end of output. -tables short form: -t; type: hash Check only this comma-separated list of tables. Table names may be qualified with the database name. -user short form: -u; type: string User for login if not current user. -verbose short form: -v Output all keys and/or foreign keys found, not just redundant ones. -version Show version and exit. 2.6.6 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D 2.6. pt-duplicate-key-checker 49
  • 54. Percona Toolkit Documentation, Release 2.1.1 dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.6.7 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-duplicate-key-checker ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.6.8 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.6.9 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-duplicate-key-checker. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version 50 Chapter 2. Tools
  • 55. Percona Toolkit Documentation, Release 2.1.1 • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.6.10 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.6.11 AUTHORS Baron Schwartz and Daniel Nichter 2.6.12 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.6.13 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.6.14 VERSION pt-duplicate-key-checker 2.1.1 2.6. pt-duplicate-key-checker 51
  • 56. Percona Toolkit Documentation, Release 2.1.1 2.7 pt-fifo-split 2.7.1 NAME pt-fifo-split - Split files and pipe lines to a fifo without really splitting. 2.7.2 SYNOPSIS Usage pt-fifo-split [options] [FILE ...] pt-fifo-split splits FILE and pipes lines to a fifo. With no FILE, or when FILE is -, read standard input. Read hugefile.txt in chunks of a million lines without physically splitting it: pt-fifo-split --lines 1000000 hugefile.txt while [ -e /tmp/pt-fifo-split ]; do cat /tmp/pt-fifo-split; done 2.7.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-fifo-split creates and/or deletes the --fifo file. Otherwise, no other files are modified, and it merely reads lines from the file given on the command-line. It should be very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-fifo- split. See also “BUGS” for more information on filing bugs and getting help. 2.7.4 DESCRIPTION pt-fifo-split lets you read from a file as though it contains only some of the lines in the file. When you read from it again, it contains the next set of lines; when you have gone all the way through it, the file disappears. This works only on Unix-like operating systems. You can specify multiple files on the command line. If you don’t specify any, or if you use the special filename -, lines are read from standard input. 2.7.5 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. 52 Chapter 2. Tools
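A common pattern that ties this together is loading a huge file into MySQL in manageable chunks. A rough sketch, assuming the default fifo path described under --fifo below (the dump file, database, and table names are placeholders):

  pt-fifo-split --lines 500000 huge-dump.txt &
  while [ -e /tmp/pt-fifo-split ]; do
     mysql --local-infile=1 -e "LOAD DATA LOCAL INFILE '/tmp/pt-fifo-split' INTO TABLE test.load_test"
  done

Each pass of the loop consumes one chunk of 500,000 lines; when the input is exhausted, pt-fifo-split removes the fifo and the loop ends.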
  • 57. Percona Toolkit Documentation, Release 2.1.1 -fifo type: string; default: /tmp/pt-fifo-split The name of the fifo from which the lines can be read. -force Remove the fifo if it exists already, then create it again. -help Show help and exit. -lines type: int; default: 1000 The number of lines to read in each chunk. -offset type: int; default: 0 Begin at the Nth line. If the argument is 0, all lines are printed to the fifo. If 1, then beginning at the first line, lines are printed (exactly the same as 0). If 2, the first line is skipped, and the 2nd and subsequent lines are printed to the fifo. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -statistics Print out statistics between chunks. The statistics are the number of chunks, the number of lines, elapsed time, and lines per second overall and during the last chunk. -version Show version and exit. 2.7.6 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-fifo-split ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.7.7 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.7.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-fifo-split. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: 2.7. pt-fifo-split 53
  • 58. Percona Toolkit Documentation, Release 2.1.1 • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.7.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.7.10 AUTHORS Baron Schwartz 2.7.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.7.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 54 Chapter 2. Tools
2.7.13 VERSION

pt-fifo-split 2.1.1

2.8 pt-find

2.8.1 NAME

pt-find - Find MySQL tables and execute actions, like GNU find.

2.8.2 SYNOPSIS

Usage: pt-find [OPTION...] [DATABASE...]

pt-find searches for MySQL tables and executes actions, like GNU find. The default action is to print the database and table name.

Find all tables created more than a day ago, which use the MyISAM engine, and print their names:

  pt-find --ctime +1 --engine MyISAM

Find InnoDB tables that haven’t been updated in a month, and convert them to MyISAM storage engine (data warehousing, anyone?):

  pt-find --mtime +30 --engine InnoDB --exec "ALTER TABLE %D.%N ENGINE=MyISAM"

Find tables created by a process that no longer exists, following the name_sid_pid naming convention, and remove them:

  pt-find --connection-id '\D_\d+_(\d+)$' --server-id '\D_(\d+)_\d+$' --exec-plus "DROP TABLE %s"

Find empty tables in the test and junk databases, and delete them:

  pt-find --empty junk test --exec-plus "DROP TABLE %s"

Find tables more than five gigabytes in total size:

  pt-find --tablesize +5G

Find all tables and print their total data and index size, and sort largest tables first (sort is a different program, by the way):

  pt-find --printf "%T\t%D.%N\n" | sort -rn

As above, but this time, insert the data back into the database for posterity:

  pt-find --noquote --exec "INSERT INTO sysdata.tblsize(db, tbl, size) VALUES('%D', '%N', %T)"

2.8.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.
  • 60. Percona Toolkit Documentation, Release 2.1.1 pt-find only reads and prints information by default, but --exec and --exec-plus can execute user-defined SQL. You should be as careful with it as you are with any command-line tool that can execute queries against your database. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-find. See also “BUGS” for more information on filing bugs and getting help. 2.8.4 DESCRIPTION pt-find looks for MySQL tables that pass the tests you specify, and executes the actions you specify. The default action is to print the database and table name to STDOUT. pt-find is simpler than GNU find. It doesn’t allow you to specify complicated expressions on the command line. pt-find uses SHOW TABLES when possible, and SHOW TABLE STATUS when needed. 2.8.5 OPTION TYPES There are three types of options: normal options, which determine some behavior or setting; tests, which determine whether a table should be included in the list of tables found; and actions, which do something to the tables pt-find finds. pt-find uses standard Getopt::Long option parsing, so you should use double dashes in front of long option names, unlike GNU find. 2.8.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -case-insensitive Specifies that all regular expression searches are case-insensitive. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -day-start Measure times (for --mmin, etc) from the beginning of today rather than from the current time. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. 56 Chapter 2. Tools
  • 61. Percona Toolkit Documentation, Release 2.1.1 -help Show help and exit. -host short form: -h; type: string Connect to host. -or Combine tests with OR, not AND. By default, tests are evaluated as though there were an AND between them. This option switches it to OR. Option parsing is not implemented by pt-find itself, so you cannot specify complicated expressions with paren- theses and mixtures of OR and AND. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -port short form: -P; type: int Port number to use for connection. -[no]quote default: yes Quotes MySQL identifier names with MySQL’s standard backtick character. Quoting happens after tests are run, and before actions are run. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.8. pt-find 57
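These options can be combined with the tests described in the next section. For example, a sketch that finds tables in a single database which are either empty or carry more than 100 megabytes of free space (the host, user, and database names here are placeholders, not values taken from this manual):
pt-find --host db1.example.com --user ptfind --ask-pass --or --empty --datafree +100M app_db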
  • 62. Percona Toolkit Documentation, Release 2.1.1 2.8.7 TESTS Most tests check some criterion against a column of SHOW TABLE STATUS output. Numeric arguments can be specified as +n for greater than n, -n for less than n, and n for exactly n. All numeric options can take an optional suffix multiplier of k, M or G (1_024, 1_048_576, and 1_073_741_824 respectively). All patterns are Perl regular expressions (see ‘man perlre’) unless specified as SQL LIKE patterns. Dates and times are all measured relative to the same instant, when pt-find first asks the database server what time it is. All date and time manipulation is done in SQL, so if you say to find tables modified 5 days ago, that translates to SELECT DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 5 DAY). If you specify --day-start, if course it’s relative to CURRENT_DATE instead. However, table sizes and other metrics are not consistent at an instant in time. It can take some time for MySQL to process all the SHOW queries, and pt-find can’t do anything about that. These measurements are as of the time they’re taken. If you need some test that’s not in this list, file a bug report and I’ll enhance pt-find for you. It’s really easy. -autoinc type: string; group: Tests Table’s next AUTO_INCREMENT is n. This tests the Auto_increment column. -avgrowlen type: size; group: Tests Table avg row len is n bytes. This tests the Avg_row_length column. The specified size can be “NULL” to test where Avg_row_length IS NULL. -checksum type: string; group: Tests Table checksum is n. This tests the Checksum column. -cmin type: size; group: Tests Table was created n minutes ago. This tests the Create_time column. -collation type: string; group: Tests Table collation matches pattern. This tests the Collation column. -column-name type: string; group: Tests A column name in the table matches pattern. -column-type type: string; group: Tests A column in the table matches this type (case-insensitive). Examples of types are: varchar, char, int, smallint, bigint, decimal, year, timestamp, text, enum. -comment type: string; group: Tests Table comment matches pattern. This tests the Comment column. -connection-id type: string; group: Tests 58 Chapter 2. Tools
  • 63. Percona Toolkit Documentation, Release 2.1.1 Table name has nonexistent MySQL connection ID. This tests the table name for a pattern. The argument to this test must be a Perl regular expression that captures digits like this: (d+). If the table name matches the pattern, these captured digits are taken to be the MySQL connection ID of some process. If the connection doesn’t exist according to SHOW FULL PROCESSLIST, the test returns true. If the connection ID is greater than pt-find‘s own connection ID, the test returns false for safety. Why would you want to do this? If you use MySQL statement-based replication, you probably know the trouble temporary tables can cause. You might choose to work around this by creating real tables with unique names, instead of temporary tables. One way to do this is to append your connection ID to the end of the table, thusly: scratch_table_12345. This assures the table name is unique and lets you have a way to find which connection it was associated with. And perhaps most importantly, if the connection no longer exists, you can assume the connection died without cleaning up its tables, and this table is a candidate for removal. This is how I manage scratch tables, and that’s why I included this test in pt-find. The argument I use to --connection-id is “D_(d+)$”. That finds tables with a series of numbers at the end, preceded by an underscore and some non-number character (the latter criterion prevents me from examining tables with a date at the end, which people tend to do: baron_scratch_2007_05_07 for example). It’s better to keep the scratch tables separate of course. If you do this, make sure the user pt-find runs as has the PROCESS privilege! Otherwise it will only see connections from the same user, and might think some tables are ready to remove when they’re still in use. For safety, pt-find checks this for you. See also --server-id. -createopts type: string; group: Tests Table create option matches pattern. This tests the Create_options column. -ctime type: size; group: Tests Table was created n days ago. This tests the Create_time column. -datafree type: size; group: Tests Table has n bytes of free space. This tests the Data_free column. The specified size can be “NULL” to test where Data_free IS NULL. -datasize type: size; group: Tests Table data uses n bytes of space. This tests the Data_length column. The specified size can be “NULL” to test where Data_length IS NULL. -dblike type: string; group: Tests Database name matches SQL LIKE pattern. -dbregex type: string; group: Tests Database name matches this pattern. -empty group: Tests Table has no rows. This tests the Rows column. 2.8. pt-find 59
  • 64. Percona Toolkit Documentation, Release 2.1.1 -engine type: string; group: Tests Table storage engine matches this pattern. This tests the Engine column, or in earlier versions of MySQL, the Type column. -function type: string; group: Tests Function definition matches pattern. -indexsize type: size; group: Tests Table indexes use n bytes of space. This tests the Index_length column. The specified size can be “NULL” to test where Index_length IS NULL. -kmin type: size; group: Tests Table was checked n minutes ago. This tests the Check_time column. -ktime type: size; group: Tests Table was checked n days ago. This tests the Check_time column. -mmin type: size; group: Tests Table was last modified n minutes ago. This tests the Update_time column. -mtime type: size; group: Tests Table was last modified n days ago. This tests the Update_time column. -procedure type: string; group: Tests Procedure definition matches pattern. -rowformat type: string; group: Tests Table row format matches pattern. This tests the Row_format column. -rows type: size; group: Tests Table has n rows. This tests the Rows column. The specified size can be “NULL” to test where Rows IS NULL. -server-id type: string; group: Tests Table name contains the server ID. If you create temporary tables with the naming convention explained in --connection-id, but also add the server ID of the server on which the tables are created, then you can use this pattern match to ensure tables are dropped only on the server they’re created on. This prevents a table from being accidentally dropped on a slave while it’s in use (provided that your server IDs are all unique, which they should be for replication to work). For example, on the master (server ID 22) you create a table called scratch_table_22_12345. If you see this table on the slave (server ID 23), you might think it can be dropped safely if there’s no such connection 12345. But if you also force the name to match the server ID with --server-id ’D_(d+)_d+$’, the table won’t be dropped on the slave. 60 Chapter 2. Tools
  • 65. Percona Toolkit Documentation, Release 2.1.1 -tablesize type: size; group: Tests Table uses n bytes of space. This tests the sum of the Data_length and Index_length columns. -tbllike type: string; group: Tests Table name matches SQL LIKE pattern. -tblregex type: string; group: Tests Table name matches this pattern. -tblversion type: size; group: Tests Table version is n. This tests the Version column. -trigger type: string; group: Tests Trigger action statement matches pattern. -trigger-table type: string; group: Tests --trigger is defined on table matching pattern. -view type: string; group: Tests CREATE VIEW matches this pattern. 2.8.8 ACTIONS The --exec-plus action happens after everything else, but otherwise actions happen in an indeterminate order. If you need determinism, file a bug report and I’ll add this feature. -exec type: string; group: Actions Execute this SQL with each item found. The SQL can contain escapes and formatting directives (see --printf). -exec-dsn type: string; group: Actions Specify a DSN in key-value format to use when executing SQL with --exec and --exec-plus. Any values not specified are inherited from command-line arguments. -exec-plus type: string; group: Actions Execute this SQL with all items at once. This option is unlike --exec. There are no escaping or formatting directives; there is only one special placeholder for the list of database and table names, %s. The list of tables found will be joined together with commas and substituted wherever you place %s. You might use this, for example, to drop all the tables you found: DROP TABLE %s 2.8. pt-find 61
  • 66. Percona Toolkit Documentation, Release 2.1.1 This is sort of like GNU find’s “-exec command {} +” syntax. Only it’s not totally cryptic. And it doesn’t require me to write a command-line parser. -print group: Actions Print the database and table name, followed by a newline. This is the default action if no other action is specified. -printf type: string; group: Actions Print format on the standard output, interpreting ‘’ escapes and ‘%’ directives. Escapes are backslashed char- acters, like n and t. Perl interprets these, so you can use any escapes Perl knows about. Directives are replaced by %s, and as of this writing, you can’t add any special formatting instructions, like field widths or alignment (though I’m musing over ways to do that). Here is a list of the directives. Note that most of them simply come from columns of SHOW TABLE STATUS. If the column is NULL or doesn’t exist, you get an empty string in the output. A % character followed by any character not in the following list is discarded (but the other character is printed). CHAR DATA SOURCE NOTES ---- ------------------ ------------------------------------------ a Auto_increment A Avg_row_length c Checksum C Create_time D Database The database name in which the table lives d Data_length E Engine In older versions of MySQL, this is Type F Data_free f Innodb_free Parsed from the Comment field I Index_length K Check_time L Collation M Max_data_length N Name O Comment P Create_options R Row_format S Rows T Table_length Data_length+Index_length U Update_time V Version 2.8.9 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes 62 Chapter 2. Tools
  • 67. Percona Toolkit Documentation, Release 2.1.1 Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.8.10 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-find ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.8.11 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.8.12 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-find. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved 2.8. pt-find 63
  • 68. Percona Toolkit Documentation, Release 2.1.1 • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.8.13 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.8.14 AUTHORS Baron Schwartz 2.8.15 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.8.16 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.8.17 VERSION pt-find 2.1.1 64 Chapter 2. Tools
  • 69. Percona Toolkit Documentation, Release 2.1.1 2.9 pt-fingerprint 2.9.1 NAME pt-fingerprint - Convert queries into fingerprints. 2.9.2 SYNOPSIS Usage pt-fingerprint [OPTIONS] [FILES] pt-fingerprint converts queries into fingerprints. With the –query option, converts the option’s value into a fingerprint. With no options, treats command-line arguments as FILEs and reads and converts semicolon-separated queries from the FILEs. When FILE is -, it read standard input. Convert a single query: pt-fingerprint --query "select a, b, c from users where id = 500" Convert a file full of queries: pt-fingerprint /path/to/file.txt 2.9.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. The pt-fingerprint tool simply reads data and transforms it, so risks are minimal. See also “BUGS” for more information on filing bugs and getting help. 2.9.4 DESCRIPTION A query fingerprint is the abstracted form of a query, which makes it possible to group similar queries together. Abstracting a query removes literal values, normalizes whitespace, and so on. For example, consider these two queries: SELECT name, password FROM user WHERE id=’12823’; select name, password from user where id=5; Both of those queries will fingerprint to select name, password from user where id=? Once the query’s fingerprint is known, we can then talk about a query as though it represents all similar queries. Query fingerprinting accommodates a great many special cases, which have proven necessary in the real world. For example, an IN list with 5 literals is really equivalent to one with 4 literals, so lists of literals are collapsed to a single one. If you want to understand more about how and why all of these cases are handled, please review the test cases in the Subversion repository. If you find something that is not fingerprinted properly, please submit a bug report with a reproducible test case. Here is a list of transformations during fingerprinting, which might not be exhaustive: 2.9. pt-fingerprint 65
  • 70. Percona Toolkit Documentation, Release 2.1.1 • Group all SELECT queries from mysqldump together, even if they are against different tables. Ditto for all of pt-table-checksum’s checksum queries. • Shorten multi-value INSERT statements to a single VALUES() list. • Strip comments. • Abstract the databases in USE statements, so all USE statements are grouped together. • Replace all literals, such as quoted strings. For efficiency, the code that replaces literal numbers is somewhat non-selective, and might replace some things as numbers when they really are not. Hexadecimal literals are also replaced. NULL is treated as a literal. Numbers embedded in identifiers are also replaced, so tables named similarly will be fingerprinted to the same values (e.g. users_2009 and users_2010 will fingerprint identically). • Collapse all whitespace into a single space. • Lowercase the entire query. • Replace all literals inside of IN() and VALUES() lists with a single placeholder, regardless of cardinality. • Collapse multiple identical UNION queries into a single one. 2.9.5 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -help Show help and exit. -match-embedded-numbers Match numbers embedded in words and replace as single values. This option causes the tool to be more careful about matching numbers so that words with numbers, like catch22 are matched and replaced as a single ? placeholder. Otherwise the default number matching pattern will replace catch22 as catch?. This is helpful if database or table names contain numbers. -match-md5-checksums Match MD5 checksums and replace as single values. This option causes the tool to be more careful about matching numbers so that MD5 checksums like fbc5e685a5d3d45aa1d0347fdb7c4d35 are matched and replaced as a single ? placeholder. Otherwise, the default number matching pattern will replace fbc5e685a5d3d45aa1d0347fdb7c4d35 as fbc?. -query type: string The query to convert into a fingerprint. -version Show version and exit. 2.9.6 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: 66 Chapter 2. Tools
  • 71. Percona Toolkit Documentation, Release 2.1.1 PTDEBUG=1 pt-fingerprint ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.9.7 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.9.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-fingerprint. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.9.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.9.10 AUTHORS Baron Schwartz and Daniel Nichter 2.9.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.9. pt-fingerprint 67
  • 72. Percona Toolkit Documentation, Release 2.1.1 2.9.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.9.13 VERSION pt-fingerprint 2.1.1 2.10 pt-fk-error-logger 2.10.1 NAME pt-fk-error-logger - Extract and log MySQL foreign key errors. 2.10.2 SYNOPSIS Usage pt-fk-error-logger [OPTION...] SOURCE_DSN pt-fk-error-logger extracts and saves information about the most recent foreign key errors in a MySQL server. Print foreign key errors on host1: pt-fk-error-logger h=host1 Save foreign key errors on host1 to db.foreign_key_errors table on host2: pt-fk-error-logger h=host1 --dest h=host1,D=db,t=foreign_key_errors 2.10.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-fk-error-logger is read-only unless you specify --dest. It should be very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-fk- error-logger. 68 Chapter 2. Tools
  • 73. Percona Toolkit Documentation, Release 2.1.1 See also “BUGS” for more information on filing bugs and getting help. 2.10.4 DESCRIPTION pt-fk-error-logger prints or saves the foreign key errors text from SHOW INNODB STATUS. The errors are not parsed or interpreted in any way. Foreign key errors are uniquely identified by their timestamp. Only new (more recent) errors are printed or saved. 2.10.5 OUTPUT If --print is given or no --dest is given, then pt-fk-error-logger prints the foreign key error text to STDOUT exactly as it appeared in SHOW INNODB STATUS. 2.10.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -dest type: DSN DSN for where to store foreign key errors; specify at least a database (D) and table (t). Missing values are filled in with the same values from the source host, so you can usually omit most parts of this argument if you’re storing foreign key errors on the same server on which they happen. The following table is suggested: CREATE TABLE foreign_key_errors ( ts datetime NOT NULL, error text NOT NULL, PRIMARY KEY (ts), ) The only information saved is the timestamp and the foreign key error text. 2.10. pt-fk-error-logger 69
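Combining the options above, a possible sketch of a long-running, daemonized setup that checks for new foreign key errors every 30 seconds for one day and stores them on the source server itself (the database, table, and log file names are placeholders):
pt-fk-error-logger h=host1 --dest D=percona,t=foreign_key_errors --interval 30s --run-time 1d --daemonize --log /var/log/pt-fk-error-logger.log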
  • 74. Percona Toolkit Documentation, Release 2.1.1 -help Show help and exit. -host short form: -h; type: string Connect to host. -interval type: time; default: 0 How often to check for foreign key errors. -log type: string Print all output to this file when daemonized. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -print Print results on standard output. See “OUTPUT” for more. -run-time type: time How long to run before exiting. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 70 Chapter 2. Tools
  • 75. Percona Toolkit Documentation, Release 2.1.1 2.10.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • t Table in which to store foreign key errors. • u dsn: user; copy: yes User for login if not current user. 2.10.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-fk-error-logger ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.10. pt-fk-error-logger 71
  • 76. Percona Toolkit Documentation, Release 2.1.1 2.10.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.10.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-fk-error-logger. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.10.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.10.12 AUTHORS Daniel Nichter 2.10.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 72 Chapter 2. Tools
  • 77. Percona Toolkit Documentation, Release 2.1.1 2.10.14 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.10.15 VERSION pt-fk-error-logger 2.1.1 2.11 pt-heartbeat 2.11.1 NAME pt-heartbeat - Monitor MySQL replication delay. 2.11.2 SYNOPSIS Usage pt-heartbeat [OPTION...] [DSN] --update|--monitor|--check|--stop pt-heartbeat measures replication lag on a MySQL or PostgreSQL server. You can use it to update a master or monitor a replica. If possible, MySQL connection options are read from your .my.cnf file. Start daemonized process to update test.heartbeat table on master: pt-heartbeat -D test --update -h master-server --daemonize Monitor replication lag on slave: pt-heartbeat -D test --monitor -h slave-server pt-heartbeat -D test --monitor -h slave-server --dbi-driver Pg Check slave lag once and exit (using optional DSN to specify slave host): pt-heartbeat -D test --check h=slave-server 2.11.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. 2.11. pt-heartbeat 73
  • 78. Percona Toolkit Documentation, Release 2.1.1 pt-heartbeat merely reads and writes a single record in a table. It should be very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- heartbeat. See also “BUGS” for more information on filing bugs and getting help. 2.11.4 DESCRIPTION pt-heartbeat is a two-part MySQL and PostgreSQL replication delay monitoring system that measures delay by looking at actual replicated data. This avoids reliance on the replication mechanism itself, which is unreliable. (For example, SHOW SLAVE STATUS on MySQL). The first part is an --update instance of pt-heartbeat that connects to a master and updates a timestamp (“heartbeat record”) every --interval seconds. Since the heartbeat table may contain records from multiple masters (see “MULTI-SLAVE HIERARCHY”), the server’s ID (@@server_id) is used to identify records. The second part is a --monitor or --check instance of pt-heartbeat that connects to a slave, examines the replicated heartbeat record from its immediate master or the specified --master-server-id, and computes the difference from the current system time. If replication between the slave and the master is delayed or broken, the computed difference will be greater than zero and potentially increase if --monitor is specified. You must either manually create the heartbeat table on the master or use --create-table. See --create-table for the proper heartbeat table structure. The MEMORY storage engine is suggested, but not re- quired of course, for MySQL. The heartbeat table must contain a heartbeat row. By default, a heartbeat row is inserted if it doesn’t exist. This feature can be disabled with the --[no]insert-heartbeat-row option in case the database user does not have INSERT privileges. pt-heartbeat depends only on the heartbeat record being replicated to the slave, so it works regardless of the replication mechanism (built-in replication, a system such as Continuent Tungsten, etc). It works at any depth in the replication hierarchy; for example, it will reliably report how far a slave lags its master’s master’s master. And if replication is stopped, it will continue to work and report (accurately!) that the slave is falling further and further behind the master. pt-heartbeat has a maximum resolution of 0.01 second. The clocks on the master and slave servers must be closely synchronized via NTP. By default, --update checks happen on the edge of the second (e.g. 00:01) and --monitor checks happen halfway between seconds (e.g. 00:01.5). As long as the servers’ clocks are closely synchronized and replication events are propagating in less than half a second, pt-heartbeat will report zero seconds of delay. pt-heartbeat will try to reconnect if the connection has an error, but will not retry if it can’t get a connection when it first starts. The --dbi-driver option lets you use pt-heartbeat to monitor PostgreSQL as well. It is reported to work well with Slony-1 replication. 2.11.5 MULTI-SLAVE HIERARCHY If the replication hierarchy has multiple slaves which are masters of other slaves, like “master -> slave1 - > slave2”, --update instances can be ran on the slaves as well as the master. 
The default heartbeat table (see --create-table) is keyed on the server_id column, so each server will update the row where server_id=@@server_id. 74 Chapter 2. Tools
  • 79. Percona Toolkit Documentation, Release 2.1.1 For --monitor and --check, if --master-server-id is not specified, the tool tries to discover and use the slave’s immediate master. If this fails, or if you want monitor lag from another master, then you can specify the --master-server-id to use. For example, if the replication hierarchy is “master -> slave1 -> slave2” with corresponding server IDs 1, 2 and 3, you can: pt-heartbeat --daemonize -D test --update -h master pt-heartbeat --daemonize -D test --update -h slave1 Then check (or monitor) the replication delay from master to slave2: pt-heartbeat -D test --master-server-id 1 --check slave2 Or check the replication delay from slave1 to slave2: pt-heartbeat -D test --master-server-id 2 --check slave2 Stopping the --update instance one slave1 will not affect the instance on master. 2.11.6 MASTER AND SLAVE STATUS The default heartbeat table (see --create-table) has columns for saving information from SHOW MASTER STATUS and SHOW SLAVE STATUS. These columns are optional. If any are present, their corresponding infor- mation will be saved. 2.11.7 OPTIONS Specify at least one of --stop, --update, --monitor, or --check. --update, --monitor, and --check are mutually exclusive. --daemonize and --check are mutually exclusive. This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -check Check slave delay once and exit. If you also specify --recurse, the tool will try to discover slave’s of the given slave and check and print their lag, too. The hostname or IP and port for each slave is printed before its delay. --recurse only works with MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -create-table Create the heartbeat --table if it does not exist. 2.11. pt-heartbeat 75
  • 80. Percona Toolkit Documentation, Release 2.1.1 This option causes the table specified by --database and --table to be created with the following MAGIC_create_heartbeat table definition: CREATE TABLE heartbeat ( ts varchar(26) NOT NULL, server_id int unsigned NOT NULL PRIMARY KEY, file varchar(255) DEFAULT NULL, -- SHOW MASTER STATUS position bigint unsigned DEFAULT NULL, -- SHOW MASTER STATUS relay_master_log_file varchar(255) DEFAULT NULL, -- SHOW SLAVE STATUS exec_master_log_pos bigint unsigned DEFAULT NULL -- SHOW SLAVE STATUS ); The heartbeat table requires at least one row. If you manually create the heartbeat table, then you must insert a row by doing: INSERT INTO heartbeat (ts, server_id) VALUES (NOW(), N); where N is the server’s ID; do not use @@server_id because it will replicate and slaves will insert their own server ID instead of the master’s server ID. This is done automatically by --create-table. A legacy version of the heartbeat table is still supported: CREATE TABLE heartbeat ( id int NOT NULL PRIMARY KEY, ts datetime NOT NULL ); Legacy tables do not support --update instances on each slave of a multi-slave hierarchy like “master -> slave1 -> slave2”. To manually insert the one required row into a legacy table: INSERT INTO heartbeat (id, ts) VALUES (1, NOW()); The tool automatically detects if the heartbeat table is legacy. See also “MULTI-SLAVE HIERARCHY”. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -database short form: -D; type: string The database to use for the connection. -dbi-driver default: mysql; type: string Specify a driver for the connection; mysql and Pg are supported. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -file type: string Print latest --monitor output to this file. When --monitor is given, prints output to the specified file instead of to STDOUT. The file is opened, trun- cated, and closed every interval, so it will only contain the most recent statistics. Useful when --daemonize is given. 76 Chapter 2. Tools
  • 81. Percona Toolkit Documentation, Release 2.1.1 -frames type: string; default: 1m,5m,15m Timeframes for averages. Specifies the timeframes over which to calculate moving averages when --monitor is given. Specify as a comma-separated list of numbers with suffixes. The suffix can be s for seconds, m for minutes, h for hours, or d for days. The size of the largest frame determines the maximum memory usage, as up to the specified number of per-second samples are kept in memory to calculate the averages. You can specify as many timeframes as you like. -help Show help and exit. -host short form: -h; type: string Connect to host. -[no]insert-heartbeat-row default: yes Insert a heartbeat row in the --table if one doesn’t exist. The heartbeat --table requires a heartbeat row, else there’s nothing to --update, --monitor, or --check! By default, the tool will insert a heartbeat row if one is not already present. You can disable this feature by specifying --no-insert-heartbeat-row in case the database user does not have INSERT privileges. -interval type: float; default: 1.0 How often to update or check the heartbeat --table. Updates and checks begin on the first whole second then repeat every --interval seconds for --update and every --interval plus --skew seconds for --monitor. For example, if at 00:00.4 an --update instance is started at 0.5 second intervals, the first update happens at 00:01.0, the next at 00:01.5, etc. If at 00:10.7 a --monitor instance is started at 0.05 second intervals with the default 0.5 second --skew, then the first check happens at 00:11.5 (00:11.0 + 0.5) which will be --skew seconds after the last update which, because the instances are checking at synchronized intervals, happened at 00:11.0. The tool waits for and begins on the first whole second just to make the interval calculations simpler. Therefore, the tool could wait up to 1 second before updating or checking. The minimum (fastest) interval is 0.01, and the maximum precision is two decimal places, so 0.015 will be rounded to 0.02. If a legacy heartbeat table (see --create-table) is used, then the maximum precision is 1s because the ts column is type datetime. -log type: string Print all output to this file when daemonized. -master-server-id type: string Calculate delay from this master server ID for --monitor or --check. If not given, pt-heartbeat attempts to connect to the server’s master and determine its server id. 2.11. pt-heartbeat 77
  • 82. Percona Toolkit Documentation, Release 2.1.1 -monitor Monitor slave delay continuously. Specifies that pt-heartbeat should check the slave’s delay every second and report to STDOUT (or if --file is given, to the file instead). The output is the current delay followed by moving averages over the timeframe given in --frames. For example, 5s [ 0.25s, 0.05s, 0.02s ] -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -print-master-server-id Print the auto-detected or given --master-server-id. If --check or --monitor is specified, specify- ing this option will print the auto-detected or given --master-server-id at the end of each line. -recurse type: int Check slaves recursively to this depth in --check mode. Try to discover slave servers recursively, to the specified depth. After discovering servers, run the check on each one of them and print the hostname (if possible), followed by the slave delay. This currently works only with MySQL. See --recursion-method. -recursion-method type: string Preferred recursion method used to find slaves. Possible methods are: METHOD USES =========== ================ processlist SHOW PROCESSLIST hosts SHOW SLAVE HOSTS The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method is required if the server uses a non-standard port (not 3306). Usually pt-heartbeat does the right thing and finds the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the other methods will be tried. -replace Use REPLACE instead of UPDATE for –update. When running in --update mode, use REPLACE instead of UPDATE to set the heartbeat table’s timestamp. The REPLACE statement is a MySQL extension to SQL. This option is useful when you don’t know whether the table contains any rows or not. It must be used in conjunction with –update. 78 Chapter 2. Tools
  • 83. Percona Toolkit Documentation, Release 2.1.1 -run-time type: time Time to run before exiting. -sentinel type: string; default: /tmp/pt-heartbeat-sentinel Exit if this file exists. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -skew type: float; default: 0.5 How long to delay checks. The default is to delay checks one half second. Since the update happens as soon as possible after the beginning of the second on the master, this allows one half second of replication delay before reporting that the slave lags the master by one second. If your clocks are not completely accurate or there is some other reason you’d like to delay the slave more or less, you can tweak this value. Try setting the PTDEBUG environment variable to see the effect this has. -socket short form: -S; type: string Socket file to use for connection. -stop Stop running instances by creating the sentinel file. This should have the effect of stopping all running instances which are watching the same sentinel file. If none of --update, --monitor or --check is specified, pt-heartbeat will exit after creating the file. If one of these is specified, pt-heartbeat will wait the interval given by --interval, then remove the file and continue working. You might find this handy to stop cron jobs gracefully if necessary, or to replace one running instance with another. For example, if you want to stop and restart pt-heartbeat every hour (just to make sure that it is restarted every hour, in case of a server crash or some other problem), you could use a crontab line like this: 0 * * * * :program:‘pt-heartbeat‘ --update -D test --stop --sentinel /tmp/pt-heartbeat-hourly The non-default --sentinel will make sure the hourly cron job stops only instances previously started with the same options (that is, from the same cron job). See also --sentinel. -table type: string; default: heartbeat The table to use for the heartbeat. Don’t specify database.table; use --database to specify the database. See --create-table. -update Update a master’s heartbeat. 2.11. pt-heartbeat 79
  • 84. Percona Toolkit Documentation, Release 2.1.1 -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.11.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 80 Chapter 2. Tools
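Tying the options and DSN values above together, a minimal sketch of a typical two-part deployment (host names are placeholders): create the heartbeat table and start a daemonized --update instance on the master, then run a daemonized --monitor instance on a replica, writing the latest lag readings to a file:
pt-heartbeat -D test --update -h master-server --create-table --daemonize
pt-heartbeat -D test --monitor -h slave-server --frames 1m,5m,15m --file /tmp/pt-heartbeat-lag.txt --daemonize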
  • 85. Percona Toolkit Documentation, Release 2.1.1 2.11.9 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-heartbeat ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.11.10 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.11.11 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-heartbeat. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.11.12 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.11.13 AUTHORS Proven Scaling LLC, SixApart Ltd, Baron Schwartz, and Daniel Nichter 2.11. pt-heartbeat 81
  • 86. Percona Toolkit Documentation, Release 2.1.1 2.11.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.11.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2006 Proven Scaling LLC and Six Apart Ltd, 2007-2012 Percona Inc. Feedback and improvements are welcome. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.11.16 VERSION pt-heartbeat 2.1.1 2.12 pt-index-usage 2.12.1 NAME pt-index-usage - Read queries from a log and analyze how they use indexes. 2.12.2 SYNOPSIS Usage pt-index-usage [OPTION...] [FILE...] pt-index-usage reads queries from logs and analyzes how they use indexes. Analyze queries in slow.log and print reports: pt-index-usage /path/to/slow.log --host localhost Disable reports and save results to mk database for later analysis: pt-index-usage slow.log --no-report --save-results-database mk 82 Chapter 2. Tools
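As a further sketch building on the examples above, the options described below can limit the analysis to a single application database and store the results for later querying (the app_db database name is a placeholder):
pt-index-usage /path/to/slow.log --host localhost --databases app_db --no-report --save-results-database mk --create-save-results-database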
  • 87. Percona Toolkit Documentation, Release 2.1.1 2.12.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. This tool is read-only unless you use --save-results-database. It reads a log of queries and EXPLAIN them. It also gathers information about all tables in all databases. It should be very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- index-usage. See also “BUGS” for more information on filing bugs and getting help. 2.12.4 DESCRIPTION This tool connects to a MySQL database server, reads through a query log, and uses EXPLAIN to ask MySQL how it will use each query. When it is finished, it prints out a report on indexes that the queries didn’t use. The query log needs to be in MySQL’s slow query log format. If you need to input a different format, you can use pt-query-digest to translate the formats. If you don’t specify a filename, the tool reads from STDIN. The tool runs two stages. In the first stage, the tool takes inventory of all the tables and indexes in your database, so it can compare the existing indexes to those that were actually used by the queries in the log. In the second stage, it runs EXPLAIN on each query in the query log. It uses separate database connections to inventory the tables and run EXPLAIN, so it opens two connections to the database. If a query is not a SELECT, it tries to transform it to a roughly equivalent SELECT query so it can be EXPLAINed. This is not a perfect process, but it is good enough to be useful. The tool skips the EXPLAIN step for queries that are exact duplicates of those seen before. It assumes that the same query will generate the same EXPLAIN plan as it did previously (usually a safe assumption, and generally good for performance), and simply increments the count of times that the indexes were used. However, queries that have the same fingerprint but different checksums will be re-EXPLAINed. Queries that have different literal constants can have different execution plans, and this is important to measure. After EXPLAIN-ing the query, it is necessary to try to map aliases in the query back to the original table names. For example, consider the EXPLAIN plan for the following query: SELECT * FROM tbl1 AS foo; The EXPLAIN output will show access to table foo, and that must be translated back to tbl1. This process involves complex parsing. It is generally very accurate, but there is some chance that it might not work right. If you find cases where it fails, submit a bug report and a reproducible test case. Queries that cannot be EXPLAINed will cause all subsequent queries with the same fingerprint to be blacklisted. This is to reduce the work they cause, and prevent them from continuing to print error messages. However, at least in this stage of the tool’s development, it is my opinion that it’s not a good idea to preemptively silence these, or prevent them from being EXPLAINed at all. I am looking for lots of feedback on how to improve things like the query parsing. 
So please submit your test cases based on the errors the tool prints! 2.12.5 OUTPUT After it reads all the events in the log, the tool prints out DROP statements for every index that was not used. It skips indexes for tables that were never accessed by any queries in the log, to avoid false-positive results. 2.12. pt-index-usage 83
  • 88. Percona Toolkit Documentation, Release 2.1.1 If you don’t specify --quiet, the tool also outputs warnings about statements that cannot be EXPLAINed and similar. These go to standard error. Progress reports are enabled by default (see --progress). These also go to standard error. 2.12.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -create-save-results-database Create the --save-results-database if it does not exist. If the --save-results-database already exists and this option is specified, the database is used and the necessary tables are created if they do not already exist. -[no]create-views Create views for --save-results-database example queries. Several example queries are given for querying the tables in the --save-results-database. These example queries are, by default, created as views. Specifying --no-create-views prevents these views from being created. -database short form: -D; type: string The database to use for the connection. -databases short form: -d; type: hash Only get tables and indexes from this comma-separated list of databases. -databases-regex type: string Only get tables and indexes from database whose names match this Perl regex. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -drop type: Hash; default: non-unique Suggest dropping only these types of unused indexes. By default pt-index-usage will only suggest to drop unused secondary indexes, not primary or unique indexes. You can specify which types of unused indexes the tool suggests to drop: primary, unique, non-unique, all. 84 Chapter 2. Tools
  • 89. Percona Toolkit Documentation, Release 2.1.1 A separate ALTER TABLE statement for each type is printed. So if you specify --drop all and there is a primary key and a non-unique index, the ALTER TABLE ... DROP for each will be printed on separate lines. -empty-save-results-tables Drop and re-create all pre-existing tables in the --save-results-database. This allows information from previous runs to be removed before the current run. -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore-databases type: Hash Ignore this comma-separated list of databases. -ignore-databases-regex type: string Ignore databases whose names match this Perl regex. -ignore-tables type: Hash Ignore this comma-separated list of table names. Table names may be qualified with the database name. -ignore-tables-regex type: string Ignore tables whose names match the Perl regex. -password short form: -p; type: string Password to use when connecting. -port short form: -P; type: int Port number to use for connection. -progress type: array; default: time,30 Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. -quiet short form: -q Do not print any warnings. Also disables --progress. -[no]report default: yes Print the reports for --report-format. 2.12. pt-index-usage 85
  • 90. Percona Toolkit Documentation, Release 2.1.1 You may want to disable the reports by specifying --no-report if, for example, you also specify --save-results-database and you only want to query the results tables later. -report-format type: Array; default: drop_unused_indexes Right now there is only one report: drop_unused_indexes. This report prints SQL statements for dropping any unused indexes. See also --drop. See also --[no]report. -save-results-database type: DSN Save results to tables in this database. Information about indexes, queries, tables and their usage is stored in several tables in the specified database. The tables are auto-created if they do not exist. If the database doesn’t exist, it can be auto-created with --create-save-results-database. In this case the connection is initially created with no default database, then after the database is created, it is USE’ed. pt-index-usage executes INSERT statements to save the results. Therefore, you should be careful if you use this feature on a production server. It might increase load, or cause trouble if you don’t want the server to be written to, or so on. This is a new feature. It may change in future releases. After a run, you can query the usage tables to answer various questions about index usage. The tables have the following CREATE TABLE definitions: MAGIC_create_indexes: CREATE TABLE IF NOT EXISTS indexes ( db VARCHAR(64) NOT NULL, tbl VARCHAR(64) NOT NULL, idx VARCHAR(64) NOT NULL, cnt BIGINT UNSIGNED NOT NULL DEFAULT 0, PRIMARY KEY (db, tbl, idx) ) MAGIC_create_queries: CREATE TABLE IF NOT EXISTS queries ( query_id BIGINT UNSIGNED NOT NULL, fingerprint TEXT NOT NULL, sample TEXT NOT NULL, PRIMARY KEY (query_id) ) MAGIC_create_tables: CREATE TABLE IF NOT EXISTS tables ( db VARCHAR(64) NOT NULL, tbl VARCHAR(64) NOT NULL, cnt BIGINT UNSIGNED NOT NULL DEFAULT 0, PRIMARY KEY (db, tbl) ) MAGIC_create_index_usage: CREATE TABLE IF NOT EXISTS index_usage ( query_id BIGINT UNSIGNED NOT NULL, db VARCHAR(64) NOT NULL, tbl VARCHAR(64) NOT NULL, 86 Chapter 2. Tools
  • 91. Percona Toolkit Documentation, Release 2.1.1 idx VARCHAR(64) NOT NULL, cnt BIGINT UNSIGNED NOT NULL DEFAULT 1, UNIQUE INDEX (query_id, db, tbl, idx) ) MAGIC_create_index_alternatives: CREATE TABLE IF NOT EXISTS index_alternatives ( query_id BIGINT UNSIGNED NOT NULL, -- This query used db VARCHAR(64) NOT NULL, -- this index, but... tbl VARCHAR(64) NOT NULL, -- idx VARCHAR(64) NOT NULL, -- alt_idx VARCHAR(64) NOT NULL, -- was an alternative cnt BIGINT UNSIGNED NOT NULL DEFAULT 1, UNIQUE INDEX (query_id, db, tbl, idx, alt_idx), INDEX (db, tbl, idx), INDEX (db, tbl, alt_idx) ) The following are some queries you can run against these tables to answer common questions you might have. Each query is also created as a view (with MySQL v5.0 and newer) if :option:‘--[no]create-views‘ is true (it is by default). The view names are the strings after the MAGIC_view_ prefix. Question: which queries sometimes use different indexes, and what fraction of the time is each index chosen? MAGIC_view_query_uses_several_indexes: SELECT iu.query_id, CONCAT_WS(’.’, iu.db, iu.tbl, iu.idx) AS idx, variations, iu.cnt, iu.cnt / total_cnt * 100 AS pct FROM index_usage AS iu INNER JOIN ( SELECT query_id, db, tbl, SUM(cnt) AS total_cnt, COUNT(*) AS variations FROM index_usage GROUP BY query_id, db, tbl HAVING COUNT(*) > 1 ) AS qv USING(query_id, db, tbl); Question: which indexes have lots of alternatives, i.e. are chosen instead of other indexes, and for what queries? MAGIC_view_index_has_alternates: SELECT CONCAT_WS(’.’, db, tbl, idx) AS idx_chosen, GROUP_CONCAT(DISTINCT alt_idx) AS alternatives, GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt FROM index_alternatives GROUP BY db, tbl, idx HAVING COUNT(*) > 1; Question: which indexes are considered as alternates for other indexes, and for what queries? MAGIC_view_index_alternates: SELECT CONCAT_WS(’.’, db, tbl, alt_idx) AS idx_considered, GROUP_CONCAT(DISTINCT idx) AS alternative_to, GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt FROM index_alternatives GROUP BY db, tbl, alt_idx HAVING COUNT(*) > 1; Question: which of those are never chosen by any queries, and are therefore superfluous? MAGIC_view_unused_index_alternates: 2.12. pt-index-usage 87
  • 92. Percona Toolkit Documentation, Release 2.1.1 SELECT CONCAT_WS(’.’, i.db, i.tbl, i.idx) AS idx, alt.alternative_to, alt.queries, alt.cnt FROM indexes AS i INNER JOIN ( SELECT db, tbl, alt_idx, GROUP_CONCAT(DISTINCT idx) AS alternative_to, GROUP_CONCAT(DISTINCT query_id) AS queries, SUM(cnt) AS cnt FROM index_alternatives GROUP BY db, tbl, alt_idx HAVING COUNT(*) > 1 ) AS alt ON i.db = alt.db AND i.tbl = alt.tbl AND i.idx = alt.alt_idx WHERE i.cnt = 0; Question: given a table, which indexes were used, by how many queries, with how many distinct fingerprints? Were there alternatives? Which indexes were not used? You can edit the following query’s SELECT list to also see the query IDs in question. MAGIC_view_index_usage: SELECT i.idx, iu.usage_cnt, iu.usage_total, ia.alt_cnt, ia.alt_total FROM indexes AS i LEFT OUTER JOIN ( SELECT db, tbl, idx, COUNT(*) AS usage_cnt, SUM(cnt) AS usage_total, GROUP_CONCAT(query_id) AS used_by FROM index_usage GROUP BY db, tbl, idx ) AS iu ON i.db=iu.db AND i.tbl=iu.tbl AND i.idx = iu.idx LEFT OUTER JOIN ( SELECT db, tbl, idx, COUNT(*) AS alt_cnt, SUM(cnt) AS alt_total, GROUP_CONCAT(query_id) AS alt_queries FROM index_alternatives GROUP BY db, tbl, idx ) AS ia ON i.db=ia.db AND i.tbl=ia.tbl AND i.idx = ia.idx; Question: which indexes on a given table are vital for at least one query (there is no alternative)? MAGIC_view_required_indexes: SELECT i.db, i.tbl, i.idx, no_alt.queries FROM indexes AS i INNER JOIN ( SELECT iu.db, iu.tbl, iu.idx, GROUP_CONCAT(iu.query_id) AS queries FROM index_usage AS iu LEFT OUTER JOIN index_alternatives AS ia USING(db, tbl, idx) WHERE ia.db IS NULL GROUP BY iu.db, iu.tbl, iu.idx ) AS no_alt ON no_alt.db = i.db AND no_alt.tbl = i.tbl AND no_alt.idx = i.idx ORDER BY i.db, i.tbl, i.idx, no_alt.queries; -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string 88 Chapter 2. Tools
  • 93. Percona Toolkit Documentation, Release 2.1.1 Socket file to use for connection. -tables short form: -t; type: hash Only get indexes from this comma-separated list of tables. -tables-regex type: string Only get indexes from tables whose names match this Perl regex. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.12.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Database to connect to. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S 2.12. pt-index-usage 89
  • 94. Percona Toolkit Documentation, Release 2.1.1 dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.12.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-index-usage ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.12.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.12.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-index-usage. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.12.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 90 Chapter 2. Tools
  • 95. Percona Toolkit Documentation, Release 2.1.1 2.12.12 AUTHORS Baron Schwartz and Daniel Nichter 2.12.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.12.14 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.12.15 VERSION pt-index-usage 2.1.1 2.13 pt-ioprofile 2.13.1 NAME pt-ioprofile - Watch process IO and print a table of file and I/O activity. 2.13.2 SYNOPSIS Usage pt-ioprofile [OPTIONS] [FILE] pt-ioprofile does two things: 1) get lsof+strace for -s seconds, 2) aggregate the result. If you specify a FILE, then step 1) is not performed. 2.13. pt-ioprofile 91
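For example (a sketch; the process ID 12345 is hypothetical, and all options used here are described under OPTIONS below), you can profile mysqld for ten seconds, or profile a specific process and aggregate I/O sizes into a single line. As the DESCRIPTION below notes, you will probably need to run the tool as root:

pt-ioprofile --run-time 10
pt-ioprofile --profile-pid 12345 --cell sizes --group-by all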
  • 96. Percona Toolkit Documentation, Release 2.1.1 2.13.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-ioprofile is a read-only tool, so your data is not at risk. However, it works by attaching strace to the process using ptrace(), which will make it run very slowly until strace detaches. In addition to freezing the server, there is also some risk of the process crashing or performing badly after strace detaches from it, or indeed of strace not detaching cleanly and leaving the process in a sleeping state. As a result, this should be considered an intrusive tool, and should not be used on production servers unless you are comfortable with that. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- ioprofile. See also “BUGS” for more information on filing bugs and getting help. 2.13.4 DESCRIPTION pt-ioprofile uses strace and lsof to watch a process’s IO and print out a table of files and I/O activity. By default, it watches the mysqld process for 30 seconds. The output is like: Tue Dec 27 15:33:57 PST 2011 Tracing process ID 1833 total read write lseek ftruncate filename 0.000150 0.000029 0.000068 0.000038 0.000015 /tmp/ibBE5opS You probably need to run this tool as root. 2.13.5 OPTIONS -aggregate short form: -a; type: string; default: sum The aggregate function, either sum or avg. If sum, then each cell will contain the sum of the values in it. If avg, then each cell will contain the average of the values in it. -cell short form: -c; type: string; default: times The cell contents. Valid values are: VALUE CELLS CONTAIN ===== ======================= count Count of I/O operations sizes Sizes of I/O operations times I/O operation timing -group-by short form: -g; type: string; default: filename The group-by item. 92 Chapter 2. Tools
  • 97. Percona Toolkit Documentation, Release 2.1.1 Valid values are: VALUE GROUPING ===== ====================================== all Summarize into a single line of output filename One line of output per filename pid One line of output per process ID -help Print help and exit. -profile-pid short form: -p; type: int The PID to profile, overrides --profile-process. -profile-process short form: -b; type: string; default: mysqld The process name to profile. -run-time type: int; default: 30 How long to profile. -save-samples type: string Filename to save samples in; these can be used for later analysis. -version Print the tool’s version and exit. 2.13.6 ENVIRONMENT This tool does not use any environment variables. 2.13.7 SYSTEM REQUIREMENTS This tool requires the Bourne shell (/bin/sh). 2.13.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-ioprofile. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.13. pt-ioprofile 93
  • 98. Percona Toolkit Documentation, Release 2.1.1 2.13.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.13.10 AUTHORS Baron Schwartz 2.13.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.13.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.13.13 VERSION pt-ioprofile 2.1.1 94 Chapter 2. Tools
2.14 pt-kill

2.14.1 NAME

pt-kill - Kill MySQL queries that match certain criteria.

2.14.2 SYNOPSIS

Usage

pt-kill [OPTIONS]

pt-kill kills MySQL connections. pt-kill connects to MySQL and gets queries from SHOW PROCESSLIST if no FILE is given. Otherwise, it reads queries from one or more FILEs containing the output of SHOW PROCESSLIST. If FILE is -, pt-kill reads from STDIN.

Kill queries running longer than 60s:

pt-kill --busy-time 60 --kill

Print, do not kill, queries running longer than 60s:

pt-kill --busy-time 60 --print

Check for sleeping processes and kill them all every 10s:

pt-kill --match-command Sleep --kill --victims all --interval 10

Print all login processes:

pt-kill --match-state login --print --victims all

See which queries in the processlist right now would match:

mysql -e "SHOW PROCESSLIST" > proclist.txt
pt-kill --test-matching proclist.txt --busy-time 60 --print

2.14.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

pt-kill kills queries if you use the --kill option, so it can disrupt your database’s users, of course. If you are unsure what the tool will do, test with the --print option, which is safe.

At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-kill.

See also “BUGS” for more information on filing bugs and getting help.
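For example, a cautious rollout (a sketch; the 60-second threshold is arbitrary) is to run with --print alone first, and only then add --kill while keeping --print so that every killed query is also logged:

pt-kill --busy-time 60 --print
pt-kill --busy-time 60 --kill --print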
2.14.4 DESCRIPTION

pt-kill captures queries from SHOW PROCESSLIST, filters them, and then either kills or prints them. This is also known as a “slow query sniper” in some circles. The idea is to watch for queries that might be consuming too many resources, and kill them.

For brevity, we talk about killing queries, but they may just be printed (or some other future action) depending on what options are given.

Normally pt-kill connects to MySQL to get queries from SHOW PROCESSLIST. Alternatively, it can read SHOW PROCESSLIST output from files. In this case, pt-kill does not connect to MySQL and --kill has no effect. You should use --print instead when reading files. The ability to read a file with --test-matching allows you to capture SHOW PROCESSLIST and test it later with pt-kill to make sure that your matches kill the proper queries.

There are a lot of special rules to follow, such as “don’t kill replication threads,” so be careful not to kill something important!

Two important options to know are --busy-time and --victims. First, whereas most match/filter options match their corresponding value from SHOW PROCESSLIST (e.g. --match-command matches a query’s Command value), the Time value is matched by --busy-time. See also --interval. Second, --victims controls which matching queries from each class are killed. By default, the matching query with the highest Time value is killed (the oldest query). See the next section, “GROUP, MATCH AND KILL”, for more details.

Usually you need to specify at least one --match option, else no queries will match. Or, you can specify --match-all to match all queries that aren’t ignored by an --ignore option.

2.14.5 GROUP, MATCH AND KILL

Queries pass through several steps to determine which exactly will be killed (or printed, or whatever other action is specified). Understanding these steps will help you match precisely the queries you want.

The first step is grouping queries into classes. The --group-by option controls grouping. By default, this option has no value so all queries are grouped into one default class. All types of matching and filtering (the next step) are applied per-class. Therefore, you may need to group queries in order to match/filter some classes but not others.

The second step is matching. Matching implies filtering since if a query doesn’t match some criteria, it is removed from its class. Matching happens for each class. First, queries are filtered from their class by the various Query Matches options like --match-user. Then, entire classes are filtered by the various Class Matches options like --query-count.

The third step is victim selection, that is, which matching queries in each class to kill. This is controlled by the --victims option. Although many queries in a class may match, you may only want to kill the oldest query, or all queries, etc.

The fourth and final step is to take some action on all matching queries from all classes. The Actions options specify which actions will be taken. At this step, there are no more classes, just a single list of queries to kill, print, etc.

2.14.6 OUTPUT

If only --kill is given, then there is no output. If only --print is given, then a timestamped KILL statement is printed for every query that would have been killed, like:

# 2009-07-15T15:04:01 KILL 8 (Query 42 sec) SELECT * FROM huge_table

The line shows a timestamp, the query’s Id (8), its Time (42 sec) and its Info (usually the query SQL).
If both --kill and --print are given, then matching queries are killed and a line for each like the one above is printed.

Any command executed by --execute-command is responsible for its own output and logging. After being executed, pt-kill has no control or interaction with the command.

2.14.7 OPTIONS

Specify at least one of --kill, --kill-query, --print, --execute-command or --stop.

--any-busy-time and --each-busy-time are mutually exclusive.

--kill and --kill-query are mutually exclusive.

--daemonize and --test-matching are mutually exclusive.

-ask-pass
Prompt for a password when connecting to MySQL.

-charset
short form: -A; type: string
Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.

-config
type: Array
Read this comma-separated list of config files; if specified, this must be the first option on the command line.

-daemonize
Fork to the background and detach from the shell. POSIX operating systems only.

-defaults-file
short form: -F; type: string
Only read mysql options from the given file. You must give an absolute pathname.

-filter
type: string
Discard events for which this Perl code doesn’t return true.
This option is a string of Perl code or a file containing Perl code that gets compiled into a subroutine with one argument: $event. This is a hashref. If the given value is a readable file, then pt-kill reads the entire file and uses its contents as the code. The file should not contain a shebang (#!/usr/bin/perl) line.
If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the subroutine other than return $event. The subroutine template is:

sub { $event = shift; filter && return $event; }

Filters given on the command line are wrapped inside parentheses like ( filter ). For complex, multi-line filters, you must put the code inside a file so it will not be wrapped inside parentheses. Either way, the filter must produce syntactically valid code given the template. For example, an if-else branch given on the command line would not be valid:

--filter 'if () { } else { }' # WRONG
Since it’s given on the command line, the if-else branch would be wrapped inside parentheses which is not syntactically valid. So to accomplish something more complex like this would require putting the code in a file, for example filter.txt:

my $event_ok;
if (...) {
   $event_ok=1;
}
else {
   $event_ok=0;
}
$event_ok

Then specify --filter filter.txt to read the code from filter.txt.
If the filter code won’t compile, pt-kill will die with an error. If the filter code does compile, an error may still occur at runtime if the code tries to do something wrong (like pattern match an undefined value). pt-kill does not provide any safeguards so code carefully!
It is permissible for the code to have side effects (to alter $event).

-group-by
type: string
Apply matches to each class of queries grouped by this SHOW PROCESSLIST column. In addition to the basic columns of SHOW PROCESSLIST (user, host, command, state, etc.), queries can be matched by fingerprint which abstracts the SQL query in the Info column.
By default, queries are not grouped, so matches and actions apply to all queries. Grouping allows matches and actions to apply to classes of similar queries, if any queries in the class match.
For example, detecting cache stampedes (see all-but-oldest under --victims for an explanation of that term) requires that queries are grouped by the arg attribute. This creates classes of identical queries (stripped of comments). So queries "SELECT c FROM t WHERE id=1" and "SELECT c FROM t WHERE id=1" are grouped into the same class, but query "SELECT c FROM t WHERE id=3" is not identical to the first two queries so it is grouped into another class. Then when --victims all-but-oldest is specified, all but the oldest query in each class is killed for each class of queries that matches the match criteria. (A sketch of such a command appears after the --pid entry below.)

-help
Show help and exit.

-host
short form: -h; type: string; default: localhost
Connect to host.

-interval
type: time
How often to check for queries to kill. If --busy-time is not given, then the default interval is 30 seconds. Else the default is half as often as --busy-time. If both --interval and --busy-time are given, then the explicit --interval value is used.
See also --run-time.

-log
type: string
Print all output to this file when daemonized.

-password
short form: -p; type: string
Password to use when connecting.

-pid
type: string
Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits.
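As a sketch of the cache-stampede scenario described under --group-by above (the 10-second threshold is arbitrary, and --print is added only so that killed queries are logged; this is an illustration, not a recommended setting):

pt-kill --group-by arg --busy-time 10 --victims all-but-oldest --kill-query --print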
  • 103. Percona Toolkit Documentation, Release 2.1.1 -port short form: -P; type: int Port number to use for connection. -run-time type: time How long to run before exiting. By default pt-kill runs forever, or until its process is killed or stopped by the creation of a --sentinel file. If this option is specified, pt-kill runs for the specified amount of time and sleeps --interval seconds between each check of the PROCESSLIST. -sentinel type: string; default: /tmp/pt-kill-sentinel Exit if this file exists. The presence of the file specified by --sentinel will cause all running instances of pt-kill to exit. You might find this handy to stop cron jobs gracefully if necessary. See also --stop. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -stop Stop running instances by creating the --sentinel file. Causes pt-kill to create the sentinel file specified by --sentinel and exit. This should have the effect of stopping all running instances which are watching the same sentinel file. -[no]strip-comments default: yes Remove SQL comments from queries in the Info column of the PROCESSLIST. -user short form: -u; type: string User for login if not current user. -version Show version and exit. -victims type: string; default: oldest Which of the matching queries in each class will be killed. After classes have been matched/filtered, this option specifies which of the matching queries in each class will be killed (or printed, etc.). The following values are possible: oldest Only kill the single oldest query. This is to prevent killing queries that aren’t really long-running, they’re just long-waiting. This sorts matching queries by Time and kills the one with the highest Time value. all 2.14. pt-kill 99
  • 104. Percona Toolkit Documentation, Release 2.1.1 Kill all queries in the class. all-but-oldest Kill all but the oldest query. This is the inverse of the oldest value. This value can be used to prevent “cache stampedes”, the condition where several identical queries are executed and create a backlog while the first query attempts to finish. Since all queries are identical, all but the first query are killed so that it can complete and populate the cache. -wait-after-kill type: time Wait after killing a query, before looking for more to kill. The purpose of this is to give blocked queries a chance to execute, so we don’t kill a query that’s blocking a bunch of others, and then kill the others immediately afterwards. -wait-before-kill type: time Wait before killing a query. The purpose of this is to give --execute-command a chance to see the matching query and gather other MySQL or system information before it’s killed. 2.14.8 QUERY MATCHES These options filter queries from their classes. If a query does not match, it is removed from its class. The --ignore options take precedence. The matches for command, db, host, etc. correspond to the columns returned by SHOW PROCESSLIST: Command, db, Host, etc. All pattern matches are case-sensitive by default, but they can be made case-insensitive by specifying a regex pattern like (?i-xsm:select). See also “GROUP, MATCH AND KILL”. -busy-time type: time; group: Query Matches Match queries that have been running for longer than this time. The queries must be in Command=Query status. This matches a query’s Time value as reported by SHOW PROCESSLIST. -idle-time type: time; group: Query Matches Match queries that have been idle/sleeping for longer than this time. The queries must be in Command=Sleep status. This matches a query’s Time value as reported by SHOW PROCESSLIST. -ignore-command type: string; group: Query Matches Ignore queries whose Command matches this Perl regex. See --match-command. -ignore-db type: string; group: Query Matches Ignore queries whose db (database) matches this Perl regex. See --match-db. -ignore-host type: string; group: Query Matches Ignore queries whose Host matches this Perl regex. See --match-host. 100 Chapter 2. Tools
-ignore-info
type: string; group: Query Matches
Ignore queries whose Info (query) matches this Perl regex.
See --match-info.

-[no]ignore-self
default: yes; group: Query Matches
Don’t kill pt-kill’s own connection.

-ignore-state
type: string; group: Query Matches; default: Locked
Ignore queries whose State matches this Perl regex. The default is to keep threads from being killed if they are locked waiting for another thread.
See --match-state.

-ignore-user
type: string; group: Query Matches
Ignore queries whose user matches this Perl regex.
See --match-user.

-match-all
group: Query Matches
Match all queries that are not ignored. If no ignore options are specified, then every query matches (except replication threads, unless --replication-threads is also specified). This option allows you to specify negative matches, i.e. “match every query except...” where the exceptions are defined by specifying various --ignore options.
This option is not the same as --victims all. This option matches all queries within a class, whereas --victims all specifies that all matching queries in a class (however they matched) will be killed. Normally, however, the two are used together because if, for example, you specify --victims oldest, then although all queries may match, only the oldest will be killed.

-match-command
type: string; group: Query Matches
Match only queries whose Command matches this Perl regex.
Common Command values are:
Query
Sleep
Binlog Dump
Connect
Delayed insert
Execute
Fetch
Init DB
Kill
Prepare
Processlist
Quit
Reset stmt
Table Dump
See http://dev.mysql.com/doc/refman/5.1/en/thread-commands.html for a full list and description of Command values.
-match-db
type: string; group: Query Matches
Match only queries whose db (database) matches this Perl regex.

-match-host
type: string; group: Query Matches
Match only queries whose Host matches this Perl regex.
The Host value often includes the port like “host:port”.

-match-info
type: string; group: Query Matches
Match only queries whose Info (query) matches this Perl regex.
The Info column of the processlist shows the query that is being executed or NULL if no query is being executed.

-match-state
type: string; group: Query Matches
Match only queries whose State matches this Perl regex.
Common State values are:
Locked
login
copy to tmp table
Copying to tmp table
Copying to tmp table on disk
Creating tmp table
executing
Reading from net
Sending data
Sorting for order
Sorting result
Table lock
Updating
See http://dev.mysql.com/doc/refman/5.1/en/general-thread-states.html for a full list and description of State values.

-match-user
type: string; group: Query Matches
Match only queries whose User matches this Perl regex.

-replication-threads
group: Query Matches
Allow matching and killing replication threads.
By default, matches do not apply to replication threads; i.e. replication threads are completely ignored. Specifying this option allows matches to match (and potentially kill) replication threads on masters and slaves.

-test-matching
type: array; group: Query Matches
Files with processlist snapshots to test matching options against. Since the matching options can be complex, you can save snapshots of processlist in files, then test matching options against queries in those files.
This option disables --run-time, --interval, and --[no]ignore-self.
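For example, the match and ignore options above can be combined (a sketch; the user and database patterns here are hypothetical) to preview long-running application queries while leaving system schemas alone:

pt-kill --match-user '^app_' --ignore-db '^(mysql|information_schema)$' --busy-time 60 --print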
  • 107. Percona Toolkit Documentation, Release 2.1.1 2.14.9 CLASS MATCHES These matches apply to entire query classes. Classes are created by specifying the --group-by option, else all queries are members of a single, default class. See also “GROUP, MATCH AND KILL”. -any-busy-time type: time; group: Class Matches Match query class if any query has been running for longer than this time. “Longer than” means that if you specify 10, for example, the class will only match if there’s at least one query that has been running for greater than 10 seconds. See --each-busy-time for more details. -each-busy-time type: time; group: Class Matches Match query class if each query has been running for longer than this time. “Longer than” means that if you specify 10, for example, the class will only match if each and every query has been running for greater than 10 seconds. See also --any-busy-time (to match a class if ANY query has been running longer than the specified time) and --busy-time. -query-count type: int; group: Class Matches Match query class if it has at least this many queries. When queries are grouped into classes by specify- ing --group-by, this option causes matches to apply only to classes with at least this many queries. If --group-by is not specified then this option causes matches to apply only if there are at least this many queries in the entire SHOW PROCESSLIST. -verbose short form: -v Print information to STDOUT about what is being done. 2.14.10 ACTIONS These actions are taken for every matching query from all classes. The actions are taken in this order: --print, --execute-command, --kill”/”--kill-query. This order allows --execute-command to see the out- put of --print and the query before --kill”/”--kill-query. This may be helpful because pt-kill does not pass any information to --execute-command. See also “GROUP, MATCH AND KILL”. -execute-command type: string; group: Actions Execute this command when a query matches. After the command is executed, pt-kill has no control over it, so the command is responsible for its own info gathering, logging, interval, etc. The command is executed each time a query matches, so be careful that the command behaves well when multiple instances are ran. No information from pt-kill is passed to the command. See also --wait-before-kill. -kill group: Actions 2.14. pt-kill 103
  • 108. Percona Toolkit Documentation, Release 2.1.1 Kill the connection for matching queries. This option makes pt-kill kill the connections (a.k.a. processes, threads) that have matching queries. Use --kill-query if you only want to kill individual queries and not their connections. Unless --print is also given, no other information is printed that shows that pt-kill matched and killed a query. See also --wait-before-kill and --wait-after-kill. -kill-query group: Actions Kill matching queries. This option makes pt-kill kill matching queries. This requires MySQL 5.0 or newer. Unlike --kill which kills the connection for matching queries, this option only kills the query, not its connection. -print group: Actions Print a KILL statement for matching queries; does not actually kill queries. If you just want to see which queries match and would be killed without actually killing them, specify --print. To both kill and print matching queries, specify both --kill and --print. 2.14.11 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P 104 Chapter 2. Tools
  • 109. Percona Toolkit Documentation, Release 2.1.1 dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.14.12 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-kill ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.14.13 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.14.14 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-kill. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.14.15 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb 2.14. pt-kill 105
  • 110. Percona Toolkit Documentation, Release 2.1.1 You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.14.16 AUTHORS Baron Schwartz and Daniel Nichter 2.14.17 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.14.18 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2009-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.14.19 VERSION pt-kill 2.1.1 2.15 pt-log-player 2.15.1 NAME pt-log-player - Replay MySQL query logs. 2.15.2 SYNOPSIS Usage pt-log-player [OPTION...] [DSN] 106 Chapter 2. Tools
  • 111. Percona Toolkit Documentation, Release 2.1.1 pt-log-player splits and plays slow log files. Split slow.log on Thread_id into 16 session files, save in ./sessions: pt-log-player --split Thread_id --session-files 16 --base-dir ./sessions slow.log Play all those sessions on host1, save results in ./results: pt-log-player --play ./sessions --base-dir ./results h=host1 Use pt-query-digest to summarize the results: pt-query-digest ./results/* 2.15.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. This tool is meant to load a server as much as possible, for stress-testing purposes. It is not designed to be used on production servers. At the time of this release there is a bug which causes pt-log-player to exceed max open files during --split. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-log- player. See also “BUGS” for more information on filing bugs and getting help. 2.15.4 DESCRIPTION pt-log-player does two things: it splits MySQL query logs into session files and it plays (executes) queries in session files on a MySQL server. Only session files can be played; slow logs cannot be played directly without being split. A session is a group of queries from the slow log that all share a common attribute, usually Thread_id. The common attribute is specified with --split. Multiple sessions are saved into a single session file. See --session-files, --max-sessions, --base-file-name and --base-dir. These session files are played with --play. pt-log-player will --play session files in parallel using N number of --threads. (They’re not technically threads, but we call them that anyway.) Each thread will play all the sessions in its given session files. The sessions are played as fast as possible (there are no delays) because the goal is to stress-test and load-test the server. So be careful using this script on a production server! Each --play thread writes its results to a separate file. These result files are in slow log format so they can be aggregated and summarized with pt-query-digest. See “OUTPUT”. 2.15.5 OUTPUT Both --split and --play have two outputs: status messages printed to STDOUT to let you know what the script is doing, and session or result files written to separate files saved in --base-dir. You can suppress all output to STDOUT for each with --quiet, or increase output with --verbose. The session files written by --split are simple text files containing queries grouped into sessions. For example: 2.15. pt-log-player 107
  • 112. Percona Toolkit Documentation, Release 2.1.1 -- START SESSION 10 use foo SELECT col FROM foo_tbl The format of these session files is important: each query must be a single line separated by a single blank line. And the “– START SESSION” comment tells pt-log-player where individual sessions begin and end so that --play can correctly fake Thread_id in its result files. The result files written by --play are in slow log format with a minimal header: the only attributes printed are Thread_id, Query_time and Schema. 2.15.6 OPTIONS Specify at least one of --play, --split or --split-random. --play and --split are mutually exclusive. This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass group: Play Prompt for a password when connecting to MySQL. -base-dir type: string; default: ./ Base directory for --split session files and --play result file. -base-file-name type: string; default: session Base file name for --split session files and --play result file. Each --split session file will be saved as <base-file-name>-N.txt, where N is a four digit, zero-padded session ID. For example: session-0003.txt. Each --play result file will be saved as <base-file-name>-results-PID.txt, where PID is the process ID of the executing thread. All files are saved in --base-dir. -charset short form: -A; type: string; group: Play Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -defaults-file short form: -F; type: string Only read mysql options from the given file. -dry-run Print which processes play which session files then exit. 108 Chapter 2. Tools
-filter
type: string; group: Split
Discard --split events for which this Perl code doesn’t return true.
This option only works with --split.
This option allows you to inject Perl code into the tool to affect how the tool runs. Usually your code should examine $event to decide whether or not to allow the event. $event is a hashref of attributes and values of the event being filtered. Or, your code could add new attribute-value pairs to $event for use by other options that accept event attributes as their value. You can find an explanation of the structure of $event at http://code.google.com/p/maatkit/wiki/EventAttributes.
There are two ways to supply your code: on the command line or in a file. If you supply your code on the command line, it is injected into the following subroutine where $filter is your code:

sub { PTDEBUG && _d('callback: filter'); my( $event ) = shift; ( $filter ) && return $event; }

Therefore you must ensure two things: first, that you correctly escape any special characters that need to be escaped on the command line for your shell, and two, that your code is syntactically valid when injected into the subroutine above.
Here’s an example filter supplied on the command line that discards events that are not SELECT statements:

--filter '$event->{arg} =~ m/^select/i'

The second way to supply your code is in a file. If your code is too complex to be expressed on the command line in a way that results in valid syntax in the subroutine above, then you need to put the code in a file and give the file name as the value to --filter. The file should not contain a shebang (#!/usr/bin/perl) line. The entire contents of the file is injected into the following subroutine:

sub { PTDEBUG && _d('callback: filter'); my( $event ) = shift; $filter && return $event; }

That subroutine is almost identical to the one above except your code is not wrapped in parentheses. This allows you to write multi-line code like:

my $event_ok;
if (...) {
   $event_ok = 1;
}
else {
   $event_ok = 0;
}
$event_ok

Notice that the last line is not syntactically valid by itself, but it becomes syntactically valid when injected into the subroutine because it becomes:

$event_ok && return $event;

If your code doesn’t compile, the tool will die with an error. Even if your code compiles, it may crash the tool at runtime if, for example, it tries to pattern match an undefined value. No safeguards of any kind are provided so code carefully!
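For instance (a sketch reusing the SELECT-only pattern shown above and the directory layout from the SYNOPSIS), you can keep only SELECT statements while splitting a slow log into session files:

pt-log-player --split Thread_id --filter '$event->{arg} =~ m/^select/i' --base-dir ./sessions slow.log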
  • 114. Percona Toolkit Documentation, Release 2.1.1 -help Show help and exit. -host short form: -h; type: string; group: Play Connect to host. -iterations type: int; default: 1; group: Play How many times each thread should play all its session files. -max-sessions type: int; default: 5000000; group: Split Maximum number of sessions to --split. By default, pt-log-player tries to split every session from the log file. For huge logs, however, this can result in millions of sessions. This option causes only the first N number of sessions to be saved. All sessions after this number are ignored, but sessions split before this number will continue to have their queries split even if those queries appear near the end of the log and after this number has been reached. -only-select group: Play Play only SELECT and USE queries; ignore all others. -password short form: -p; type: string; group: Play Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -play type: string; group: Play Play (execute) session files created by --split. The argument to play must be a comma-separated list of session files created by --split or a directory. If the argument is a directory, ALL files in that directory will be played. -port short form: -P; type: int; group: Play Port number to use for connection. -print group: Play Print queries instead of playing them; requires --play. You must also specify --play with --print. Although the queries will not be executed, --play is required to specify which session files to read. 110 Chapter 2. Tools
  • 115. Percona Toolkit Documentation, Release 2.1.1 -quiet short form: -q Do not print anything; disables --verbose. -[no]results default: yes Print --play results to files in --base-dir. -session-files type: int; default: 8; group: Split Number of session files to create with --split. The number of session files should either be equal to the number of --threads you intend to --play or be an even multiple of --threads. This number is important for maximum performance because it: * allows each thread to have roughly the same amount of sessions to play * avoids having to open/close many session files * avoids disk IO overhead by doing large sequential reads You may want to increase this number beyond --threads if each session file becomes too large. For example, splitting a 20G log into 8 sessions files may yield roughly eight 2G session files. See also --max-sessions. -set-vars type: string; group: Play; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string; group: Play Socket file to use for connection. -split type: string; group: Split Split log by given attribute to create session files. Valid attributes are any which appear in the log: Thread_id, Schema, etc. -split-random group: Split Split log without an attribute, write queries round-robin to session files. This option, if specified, overrides --split and causes the log to be split query-by-query, writing each query to the next session file in round-robin style. If you don’t care about “sessions” and just want to split a lot into N many session files and the relation or order of the queries does not matter, then use this option. -threads type: int; default: 2; group: Play Number of threads used to play sessions concurrently. Specifies the number of parallel processes to run. The default is 2. On GNU/Linux machines, the default is the number of times ‘processor’ appears in /proc/cpuinfo. On Windows, the default is read from the environment. In any case, the default is at least 2, even when there’s only a single processor. See also --session-files. 2.15. pt-log-player 111
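To illustrate the relationship between --session-files and --threads described above (a sketch; 16 files played by 8 threads is arbitrary, but keeping the file count an even multiple of the thread count follows the advice under --session-files):

pt-log-player --split Thread_id --session-files 16 --base-dir ./sessions slow.log
pt-log-player --play ./sessions --threads 8 --base-dir ./results h=host1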
  • 116. Percona Toolkit Documentation, Release 2.1.1 -type type: string; group: Split The type of log to --split (default slowlog). The permitted types are binlog Split the output of running mysqlbinlog against a binary log file. Currently, splitting binary logs does not always work well depending on what the binary logs contain. Be sure to check the session files after splitting to ensure proper “OUTPUT”. If the binary log contains row-based replication data, you need to run mysqlbinlog with options --base64-output=decode-rows --verbose, else invalid statements will be written to the session files. genlog Split a general log file. slowlog Split a log file in any variation of MySQL slow-log format. -user short form: -u; type: string; group: Play User for login if not current user. -verbose short form: -v; cumulative: yes; default: 0 Increase verbosity; can be specified multiple times. This option is disabled by --quiet. -version Show version and exit. -[no]warnings default: no; group: Play Print warnings about SQL errors such as invalid queries to STDERR. 2.15.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F 112 Chapter 2. Tools
  • 117. Percona Toolkit Documentation, Release 2.1.1 dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.15.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-log-player ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.15.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.15.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-log-player. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) 2.15. pt-log-player 113
  • 118. Percona Toolkit Documentation, Release 2.1.1 If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.15.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.15.12 AUTHORS Daniel Nichter 2.15.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.15.14 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2008-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.15.15 VERSION pt-log-player 2.1.1 114 Chapter 2. Tools
2.16 pt-mext

2.16.1 NAME
pt-mext - Look at many samples of MySQL SHOW GLOBAL STATUS side-by-side.

2.16.2 SYNOPSIS
Usage
  pt-mext [OPTIONS] -- COMMAND
pt-mext columnizes repeated output from a program like mysqladmin extended.
Get output from mysqladmin:
  pt-mext -r -- mysqladmin ext -i10 -c3
Get output from a file:
  pt-mext -r -- cat mysqladmin-output.txt

2.16.3 RISKS
The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.
pt-mext is a read-only tool. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-mext.
See also "BUGS" for more information on filing bugs and getting help.

2.16.4 DESCRIPTION
pt-mext executes the COMMAND you specify and reads through the result one line at a time. It places each line into a temporary file. When it finds a blank line, it assumes that a new sample of SHOW GLOBAL STATUS is starting, and it creates a new temporary file. At the end of this process, it has a number of temporary files. It joins the temporary files together side-by-side and prints the result. If the "-r" option is given, it first subtracts each sample from the one after it before printing results.

2.16.5 OPTIONS
-r Relative: subtract each column from the previous column.

2.16.6 ENVIRONMENT
This tool does not use any environment variables.
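As a fuller example of the description and the -r option above (the file name and the counters being filtered are illustrative):

  mysqladmin ext -i30 -c10 > mysqladmin-output.txt
  pt-mext -r -- cat mysqladmin-output.txt | grep -E 'Handler_read|Threads_'

The first command saves ten samples taken 30 seconds apart; the second prints the sample-to-sample differences side-by-side and narrows the output to a few counter families.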
  • 120. Percona Toolkit Documentation, Release 2.1.1 2.16.7 SYSTEM REQUIREMENTS This tool requires the Bourne shell (/bin/sh) and the seq program. 2.16.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-mext. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.16.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.16.10 AUTHORS Baron Schwartz 2.16.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 116 Chapter 2. Tools
  • 121. Percona Toolkit Documentation, Release 2.1.1 2.16.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.16.13 VERSION pt-mext 2.1.1 2.17 pt-mysql-summary 2.17.1 NAME pt-mysql-summary - Summarize MySQL information nicely. 2.17.2 SYNOPSIS Usage pt-mysql-summary [OPTIONS] [-- MYSQL OPTIONS] pt-mysql-summary conveniently summarizes the status and configuration of a MySQL database server so that you can learn about it at a glance. It is not a tuning tool or diagnosis tool. It produces a report that is easy to diff and can be pasted into emails without losing the formatting. It should work well on any modern UNIX systems. 2.17.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-mysql-summary is a read-only tool. It should be very low-risk. At the time of this release, we know of no bugs that could harm users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- mysql-summary. See also “BUGS” for more information on filing bugs and getting help. 2.17. pt-mysql-summary 117
  • 122. Percona Toolkit Documentation, Release 2.1.1 2.17.4 DESCRIPTION pt-mysql-summary works by connecting to a MySQL database server and querying it for status and configuration information. It saves these bits of data into files in a temporary directory, and then formats them neatly with awk and other scripting languages. To use, simply execute it. Optionally add a double dash and then the same command-line options you would use to connect to MySQL, such as the following: pt-mysql-summary -- --user=root The tool interacts minimally with the server upon which it runs. It assumes that you’ll run it on the same server you’re inspecting, and therefore it assumes that it will be able to find the my.cnf configuration file, for example. However, it should degrade gracefully if this is not the case. Note, however, that its output does not indicate which information comes from the MySQL database and which comes from the host operating system, so it is possible for confusing output to be generated if you run the tool on one server and connect to a MySQL database server running on another server. 2.17.5 OUTPUT Many of the outputs from this tool are deliberately rounded to show their magnitude but not the exact detail. This is called fuzzy-rounding. The idea is that it does not matter whether a server is running 918 queries per second or 921 queries per second; such a small variation is insignificant, and only makes the output hard to compare to other servers. Fuzzy-rounding rounds in larger increments as the input grows. It begins by rounding to the nearest 5, then the nearest 10, nearest 25, and then repeats by a factor of 10 larger (50, 100, 250), and so on, as the input grows. The following is a sample of the report that the tool produces: # Percona Toolkit MySQL Summary Report ####################### System time | 2012-03-30 18:46:05 UTC (local TZ: EDT -0400) # Instances ################################################## Port Data Directory Nice OOM Socket ===== ========================== ==== === ====== 12345 /tmp/12345/data 0 0 /tmp/12345.sock 12346 /tmp/12346/data 0 0 /tmp/12346.sock 12347 /tmp/12347/data 0 0 /tmp/12347.sock The first two sections show which server the report was generated on and which MySQL instances are running on the server. This is detected from the output of ps and does not always detect all instances and parameters, but often works well. From this point forward, the report will be focused on a single MySQL instance, although several instances may appear in the above paragraph. # Report On Port 12345 ####################################### User | msandbox@% Time | 2012-03-30 14:46:05 (EDT) Hostname | localhost.localdomain Version | 5.5.20-log MySQL Community Server (GPL) Built On | linux2.6 i686 Started | 2012-03-28 23:33 (up 1+15:12:09) Databases | 4 Datadir | /tmp/12345/data/ Processes | 2 connected, 2 running Replication | Is not a slave, has 1 slaves connected Pidfile | /tmp/12345/data/12345.pid (exists) 118 Chapter 2. Tools
  • 123. Percona Toolkit Documentation, Release 2.1.1 This section is a quick summary of the MySQL instance: version, uptime, and other very basic parameters. The Time output is generated from the MySQL server, unlike the system date and time printed earlier, so you can see whether the database and operating system times match. # Processlist ################################################ Command COUNT(*) Working SUM(Time) MAX(Time) ------------------------------ -------- ------- --------- --------- Binlog Dump 1 1 150000 150000 Query 1 1 0 0 User COUNT(*) Working SUM(Time) MAX(Time) ------------------------------ -------- ------- --------- --------- msandbox 2 2 150000 150000 Host COUNT(*) Working SUM(Time) MAX(Time) ------------------------------ -------- ------- --------- --------- localhost 2 2 150000 150000 db COUNT(*) Working SUM(Time) MAX(Time) ------------------------------ -------- ------- --------- --------- NULL 2 2 150000 150000 State COUNT(*) Working SUM(Time) MAX(Time) ------------------------------ -------- ------- --------- --------- Master has sent all binlog to 1 1 150000 150000 NULL 1 1 0 0 This section is a summary of the output from SHOW PROCESSLIST. Each sub-section is aggregated by a differ- ent item, which is shown as the first column heading. When summarized by Command, every row in SHOW PRO- CESSLIST is included, but otherwise, rows whose Command is Sleep are excluded from the SUM and MAX columns, so they do not skew the numbers too much. In the example shown, the server is idle except for this tool itself, and one connected replica, which is executing Binlog Dump. The columns are the number of rows included, the number that are not in Sleep status, the sum of the Time column, and the maximum Time column. The numbers are fuzzy-rounded. # Status Counters (Wait 10 Seconds) ########################## Variable Per day Per second 10 secs Binlog_cache_disk_use 4 Binlog_cache_use 80 Bytes_received 15000000 175 200 Bytes_sent 15000000 175 2000 Com_admin_commands 1 ...................(many lines omitted)............................ Threads_created 40 1 Uptime 90000 1 1 This section shows selected counters from two snapshots of SHOW GLOBAL STATUS, gathered approximately 10 seconds apart and fuzzy-rounded. It includes only items that are incrementing counters; it does not include absolute numbers such as the Threads_running status variable, which represents a current value, rather than an accumulated number over time. The first column is the variable name, and the second column is the counter from the first snapshot divided by 86400 (the number of seconds in a day), so you can see the magnitude of the counter’s change per day. 86400 fuzzy-rounds to 90000, so the Uptime counter should always be about 90000. The third column is the value from the first snapshot, divided by Uptime and then fuzzy-rounded, so it represents approximately how quickly the counter is growing per-second over the uptime of the server. 2.17. pt-mysql-summary 119
The fourth column is the incremental difference between the first and second snapshot, divided by the difference in uptime and then fuzzy-rounded. Therefore, it shows how quickly the counter is growing per second at the time the report was generated.

# Table cache ################################################
                     Size | 400
                    Usage | 15%

This section shows the size of the table cache, followed by the percentage of the table cache in use. The usage is fuzzy-rounded.

# Key Percona Server features ################################
      Table & Index Stats | Not Supported
     Multiple I/O Threads | Enabled
     Corruption Resilient | Not Supported
      Durable Replication | Not Supported
     Import InnoDB Tables | Not Supported
     Fast Server Restarts | Not Supported
         Enhanced Logging | Not Supported
     Replica Perf Logging | Not Supported
      Response Time Hist. | Not Supported
          Smooth Flushing | Not Supported
      HandlerSocket NoSQL | Not Supported
           Fast Hash UDFs | Unknown

This section shows features that are available in Percona Server and whether they are enabled or not. In the example shown, the server is standard MySQL, not Percona Server, so the features are generally not supported.

# Plugins ####################################################
       InnoDB compression | ACTIVE

This section shows specific plugins and whether they are enabled.

# Query cache ################################################
         query_cache_type | ON
                     Size | 0.0
                    Usage | 0%
         HitToInsertRatio | 0%

This section shows whether the query cache is enabled and its size, followed by the percentage of the cache in use and the hit-to-insert ratio. The latter two are fuzzy-rounded.

# Schema #####################################################
Would you like to mysqldump -d the schema and analyze it? y/n y
There are 4 databases. Would you like to dump all, or just one?
Type the name of the database, or press Enter to dump all of them.

  Database            Tables Views SPs Trigs Funcs FKs Partn
  mysql                   24
  performance_schema      17
  sakila                  16     7   3     6     3  22

  Database            MyISAM CSV PERFORMANCE_SCHEMA InnoDB
  mysql                   22   2
  performance_schema                             17
  sakila                   8                            15

  Database            BTREE FULLTEXT
  mysql                  31
  performance_schema
  sakila                 63        1

  [Column-type sub-report: in the original report the data-type column headers are printed
  vertically and did not survive conversion to text; per the explanation below, the first
  column is char and the second is timestamp. The data rows are reproduced here without
  column alignment.]
  Database           === === === === === === === === === === ===
  mysql               61 10 6 78 5 4 26 3 4 5 3
  performance_schema  5 16 33
  sakila              1 15 1 3 4 3 19 42 26

If you select to dump the schema and analyze it, the tool will print the above section. This summarizes the number and type of objects in the database. It is generated by running mysqldump --no-data, not by querying the INFORMATION_SCHEMA, which can freeze a busy server.
You can use the --databases option to specify which databases to examine. If you do not, and you run the tool interactively, it will prompt you as shown. You can choose not to dump the schema, to dump all of the databases, or to dump only a single named one, by specifying the appropriate options. In the example above, we are dumping all databases.
The first sub-report in the section is the count of objects by type in each database: tables, views, and so on. The second one shows how many tables use various storage engines in each database. The third sub-report shows the number of each type of index in each database.
The last sub-report shows the number of columns of various data types in each database. For compact display, the column headers are formatted vertically, so you need to read downwards from the top. In this example, the first column is char and the second column is timestamp. This example is truncated so it does not wrap on a terminal.
All of the numbers in this portion of the output are exact, not fuzzy-rounded.

# Noteworthy Technologies ####################################
       Full Text Indexing | Yes
         Geospatial Types | No
             Foreign Keys | Yes
             Partitioning | No
       InnoDB Compression | Yes
                      SSL | No
     Explicit LOCK TABLES | No
           Delayed Insert | No
          XA Transactions | No
              NDB Cluster | No
      Prepared Statements | No
Prepared statement count | 0

This section shows some specific technologies used on this server. Some of them are detected from the schema dump performed for the previous sections; others can be detected by looking at SHOW GLOBAL STATUS.

# InnoDB #####################################################
                  Version | 1.1.8
         Buffer Pool Size | 16.0M
         Buffer Pool Fill | 100%
        Buffer Pool Dirty | 0%
           File Per Table | OFF
  • 126. Percona Toolkit Documentation, Release 2.1.1 Page Size | 16k Log File Size | 2 * 5.0M = 10.0M Log Buffer Size | 8M Flush Method | Flush Log At Commit | 1 XA Support | ON Checksums | ON Doublewrite | ON R/W I/O Threads | 4 4 I/O Capacity | 200 Thread Concurrency | 0 Concurrency Tickets | 500 Commit Concurrency | 0 Txn Isolation Level | REPEATABLE-READ Adaptive Flushing | ON Adaptive Checkpoint | Checkpoint Age 0 | InnoDB Queue | 0 queries inside InnoDB, 0 queries in queue Oldest Transaction | 0 Seconds History List Len 209 | Read Views 1 | Undo Log Entries | 1 transactions, 1 total undo, 1 max undo Pending I/O Reads | 0 buf pool reads, 0 normal AIO, 0 ibuf AIO, 0 preads Pending I/O Writes | 0 buf pool (0 LRU, 0 flush list, 0 page); 0 AIO, 0 sync, 0 log IO (0 log, 0 chkp); 0 pwrites Pending I/O Flushes | 0 buf pool, 0 log Transaction States | 1xnot started This section shows important configuration variables for the InnoDB storage engine. The buffer pool fill percent and dirty percent are fuzzy-rounded. The last few lines are derived from the output of SHOW INNODB STATUS. It is likely that this output will change in the future to become more useful. # MyISAM ##################################################### Key Cache | 16.0M Pct Used | 10% Unflushed | 0% This section shows the size of the MyISAM key cache, followed by the percentage of the cache in use and percentage unflushed (fuzzy-rounded). # Security ################################################### Users | 2 users, 0 anon, 0 w/o pw, 0 old pw Old Passwords | OFF This section is generated from queries to tables in the mysql system database. It shows how many users exist, and various potential security risks such as old-style passwords and users without passwords. # Binary Logging ############################################# Binlogs | 1 Zero-Sized | 0 Total Size | 21.8M binlog_format | STATEMENT expire_logs_days | 0 sync_binlog | 0 server_id | 12345 binlog_do_db | binlog_ignore_db | 122 Chapter 2. Tools
  • 127. Percona Toolkit Documentation, Release 2.1.1 This section shows configuration and status of the binary logs. If there are zero-sized binary logs, then it is possible that the binlog index is out of sync with the binary logs that actually exist on disk. # Noteworthy Variables ####################################### Auto-Inc Incr/Offset | 1/1 default_storage_engine | InnoDB flush_time | 0 init_connect | init_file | sql_mode | join_buffer_size | 128k sort_buffer_size | 2M read_buffer_size | 128k read_rnd_buffer_size | 256k bulk_insert_buffer | 0.00 max_heap_table_size | 16M tmp_table_size | 16M max_allowed_packet | 1M thread_stack | 192k log | OFF log_error | /tmp/12345/data/mysqld.log log_warnings | 1 log_slow_queries | ON log_queries_not_using_indexes | OFF log_slave_updates | ON This section shows several noteworthy server configuration variables that might be important to know about when working with this server. # Configuration File ######################################### Config File | /tmp/12345/my.sandbox.cnf [client] user = msandbox password = msandbox port = 12345 socket = /tmp/12345/mysql_sandbox12345.sock [mysqld] port = 12345 socket = /tmp/12345/mysql_sandbox12345.sock pid-file = /tmp/12345/data/mysql_sandbox12345.pid basedir = /home/baron/5.5.20 datadir = /tmp/12345/data key_buffer_size = 16M innodb_buffer_pool_size = 16M innodb_data_home_dir = /tmp/12345/data innodb_log_group_home_dir = /tmp/12345/data innodb_data_file_path = ibdata1:10M:autoextend innodb_log_file_size = 5M log-bin = mysql-bin relay_log = mysql-relay-bin log_slave_updates server-id = 12345 report-host = 127.0.0.1 report-port = 12345 log-error = mysqld.log innodb_lock_wait_timeout = 3 # The End #################################################### 2.17. pt-mysql-summary 123
This section shows a pretty-printed version of the my.cnf file, with comments removed and with whitespace added to align things for easy reading. The tool tries to detect the my.cnf file by looking at the output of ps, and if it does not find the location of the file there, it tries common locations until it finds a file. Note that this file might not actually correspond to the server from which the report was generated. This can happen when the tool isn't run on the same server it's reporting on, or when detecting the location of the configuration file fails.

2.17.6 OPTIONS
All options after -- are passed to mysql.
--config
type: string
Read this comma-separated list of config files. If specified, this must be the first option on the command line.
--help
Print help and exit.
--save-samples
type: string
Save the data files used to generate the summary in this directory.
--read-samples
type: string
Create a report from the files found in this directory.
--databases
type: string
Names of databases to summarize. If you want all of them, you can use the value --all-databases; you can also pass in a comma-separated list of database names. If not provided, the program will ask you for manual input.
--sleep
type: int; default: 10
Seconds to sleep when gathering status counters.
--version
Print the tool's version and exit.

2.17.7 ENVIRONMENT
This tool does not use any environment variables.

2.17.8 SYSTEM REQUIREMENTS
This tool requires Bash v3 or newer, Perl 5.8 or newer, and binutils. These are generally already provided by most distributions. On BSD systems, it may require a mounted procfs.

2.17.9 BUGS
For a list of known bugs, see http://www.percona.com/bugs/pt-mysql-summary.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
  • 129. Percona Toolkit Documentation, Release 2.1.1 • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.17.10 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.17.11 AUTHORS Baron Schwartz, Brian Fraser, and Daniel Nichter. 2.17.12 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.17.13 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.17. pt-mysql-summary 125
  • 130. Percona Toolkit Documentation, Release 2.1.1 2.17.14 VERSION pt-mysql-summary 2.1.1 2.18 pt-online-schema-change 2.18.1 NAME pt-online-schema-change - ALTER tables without locking them. 2.18.2 SYNOPSIS Usage pt-online-schema-change [OPTIONS] DSN pt-online-schema-change alters a table’s structure without blocking reads or writes. Specify the database and table in the DSN. Do not use this tool before reading its documentation and checking your backups carefully. Add a column to sakila.actor: pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor Change sakila.actor to InnoDB, effectively performing OPTIMIZE TABLE in a non-blocking fashion because it is already an InnoDB table: pt-online-schema-change --alter "ENGINE=InnoDB" D=sakila,t=actor 2.18.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-online-schema-change modifies data and structures. You should be careful with it, and test it before using it in production. You should also ensure that you have recoverable backups before using this tool. At the time of this release, we know of no bugs that could cause harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- online-schema-change. See also “BUGS” for more information on filing bugs and getting help. 2.18.4 DESCRIPTION pt-online-schema-change emulates the way that MySQL alters tables internally, but it works on a copy of the table you wish to alter. This means that the original table is not locked, and clients may continue to read and change data in it. 126 Chapter 2. Tools
  • 131. Percona Toolkit Documentation, Release 2.1.1 pt-online-schema-change works by creating an empty copy of the table to alter, modifying it as desired, and then copying rows from the original table into the new table. When the copy is complete, it moves away the original table and replaces it with the new one. By default, it also drops the original table. The data copy process is performed in small chunks of data, which are varied to attempt to make them execute in a specific amount of time (see --chunk-time). This process is very similar to how other tools, such as pt-table- checksum, work. Any modifications to data in the original tables during the copy will be reflected in the new table, because the tool creates triggers on the original table to update the corresponding rows in the new table. The use of triggers means that the tool will not work if any triggers are already defined on the table. When the tool finishes copying data into the new table, it uses an atomic RENAME TABLE operation to simultaneously rename the original and new tables. After this is complete, the tool drops the original table. Foreign keys complicate the tool’s operation and introduce additional risk. The technique of atomically renaming the original and new tables does not work when foreign keys refer to the table. The tool must update foreign keys to refer to the new table after the schema change is complete. The tool supports two methods for accomplishing this. You can read more about this in the documentation for --alter-foreign-keys-method. Foreign keys also cause some side effects. The final table will have the same foreign keys and indexes as the original table (unless you specify differently in your ALTER statement), but the names of the objects may be changed slightly to avoid object name collisions in MySQL and InnoDB. For safety, the tool does not modify the table unless you specify the --execute option, which is not enabled by default. The tool supports a variety of other measures to prevent unwanted load or other problems, including automatically detecting replicas, connecting to them, and using the following safety checks: • The tool refuses to operate if it detects replication filters. See --[no]check-replication-filters for details. • The tool pauses the data copy operation if it observes any replicas that are delayed in replication. See --max-lag for details. • The tool pauses or aborts its operation if it detects too much load on the server. See --max-load and --critical-load for details. • The tool sets its lock wait timeout to 1 second so that it is more likely to be the victim of any lock contention, and less likely to disrupt other transactions. See --lock-wait-timeout for details. • The tool refuses to alter the table if foreign key constraints reference it, unless you specify --alter-foreign-keys-method. 2.18.5 OUTPUT The tool prints information about its activities to STDOUT so that you can see what it is doing. During the data copy phase, it prints progress reports to STDERR. You can get additional information with the --print option. 2.18.6 OPTIONS --dry-run and --execute are mutually exclusive. This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -alter type: string The schema modification, without the ALTER TABLE keywords. You can perform multiple modifications to the table by specifying them with commas. Please refer to the MySQL manual for the syntax of ALTER TABLE. 
You cannot use the RENAME clause of ALTER TABLE; if you do, the tool will fail.
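For example, a cautious workflow is to validate the change with --dry-run first and then repeat it with --execute; the column changes below are illustrative:

  pt-online-schema-change --alter "ADD COLUMN c1 INT NOT NULL DEFAULT 0, DROP COLUMN c2" --dry-run D=sakila,t=actor
  pt-online-schema-change --alter "ADD COLUMN c1 INT NOT NULL DEFAULT 0, DROP COLUMN c2" --execute D=sakila,t=actor

Both modifications are passed in a single --alter value, separated by a comma, exactly as they would appear after the ALTER TABLE keywords.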
  • 132. Percona Toolkit Documentation, Release 2.1.1 -alter-foreign-keys-method type: string How to modify foreign keys so they reference the new table. Foreign keys that reference the table to be altered must be treated specially to ensure that they continue to reference the correct table. When the tool renames the original table to let the new one take its place, the foreign keys “follow” the renamed table, and must be changed to reference the new table instead. The tool supports two techniques to achieve this. It automatically finds “child tables” that reference the table to be altered. auto Automatically determine which method is best. The tool uses rebuild_constraints if possible (see the description of that method for details), and if not, then it uses drop_swap. rebuild_constraints This method uses ALTER TABLE to drop and re-add foreign key constraints that reference the new table. This is the preferred technique, unless one or more of the “child” tables is so large that the ALTER would take too long. The tool determines that by comparing the number of rows in the child table to the rate at which the tool is able to copy rows from the old table to the new table. If the tool estimates that the child table can be altered in less time than the --chunk-time, then it will use this technique. For purposes of estimating the time required to alter the child table, the tool multiplies the row-copying rate by --chunk-size-limit, because MySQL’s ALTER TABLE is typically much faster than the external process of copying rows. Due to a limitation in MySQL, foreign keys will not have the same names after the ALTER that they did prior to it. The tool has to rename the foreign key when it redefines it, which adds a leading underscore to the name. In some cases, MySQL also automatically renames indexes required for the foreign key. drop_swap Disable foreign key checks (FOREIGN_KEY_CHECKS=0), then drop the original table before re- naming the new table into its place. This is different from the normal method of swapping the old and new table, which uses an atomic RENAME that is undetectable to client applications. This method is faster and does not block, but it is riskier for two reasons. First, for a short time between dropping the original table and renaming the temporary table, the table to be altered simply does not exist, and queries against it will result in an error. Secondly, if there is an error and the new table cannot be renamed into the place of the old one, then it is too late to abort, because the old table is gone permanently. none This method is like drop_swap without the “swap”. Any foreign keys that referenced the original table will now reference a nonexistent table. This will typically cause foreign key violations that are visible in SHOW ENGINE INNODB STATUS, similar to the following: Trying to add to index ‘idx_fk_staff_id‘ tuple: DATA TUPLE: 2 fields; 0: len 1; hex 05; asc ;; 1: len 4; hex 80000001; asc ;; But the parent table ‘sakila‘.‘staff_old‘ or its .ibd file does not currently exist! This is because the original table (in this case, sakila.staff) was renamed to sakila.staff_old and then dropped. This method of handling foreign key constraints is provided so that the database adminis- trator can disable the tool’s built-in functionality if desired. 128 Chapter 2. Tools
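For example, to rebuild a table that is referenced by foreign keys while letting the tool choose between the two techniques described above (the table and alteration are illustrative):

  pt-online-schema-change --alter "ENGINE=InnoDB" --alter-foreign-keys-method auto --execute D=sakila,t=actor

With auto, the tool uses rebuild_constraints when it estimates that the child tables can be altered within --chunk-time, and falls back to drop_swap otherwise.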
--ask-pass
Prompt for a password when connecting to MySQL.

--charset
short form: -A; type: string
Default character set. If the value is utf8, sets Perl's binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.

--check-interval
type: time; default: 1
Sleep time between checks for --max-lag.

--[no]check-replication-filters
default: yes
Abort if any replication filter is set on any server. The tool looks for server options that filter replication, such as binlog_ignore_db and replicate_do_db. If it finds any such filters, it aborts with an error.
If the replicas are configured with any filtering options, you should be careful not to modify any databases or tables that exist on the master and not the replicas, because it could cause replication to fail. For more information on replication rules, see http://dev.mysql.com/doc/en/replication-rules.html.

--check-slave-lag
type: string
Pause the data copy until this replica's lag is less than --max-lag. The value is a DSN that inherits properties from the connection options (--port, --user, etc.). This option overrides the normal behavior of finding and continually monitoring replication lag on ALL connected replicas. If you don't want to monitor ALL replicas, but you want more than just one replica to be monitored, then use the DSN option to the --recursion-method option instead of this option.

--chunk-index
type: string
Prefer this index for chunking tables. By default, the tool chooses the most appropriate index for chunking. This option lets you specify the index that you prefer. If the index doesn't exist, then the tool will fall back to its default behavior of choosing an index. The tool adds the index to the SQL statements in a FORCE INDEX clause. Be careful when using this option; a poor choice of index could cause bad performance.

--chunk-size
type: size; default: 1000
Number of rows to select for each chunk copied. Allowable suffixes are k, M, G.
This option can override the default behavior, which is to adjust chunk size dynamically to try to make chunks run in exactly --chunk-time seconds. When this option isn't set explicitly, its default value is used as a starting point, but after that, the tool ignores this option's value. If you set this option explicitly, however, then it disables the dynamic adjustment behavior and tries to make all chunks exactly the specified number of rows.
There is a subtlety: if the chunk index is not unique, then it's possible that chunks will be larger than desired. For example, if a table is chunked by an index that contains 10,000 of a given value, there is no way to write a WHERE clause that matches only 1,000 of the values, and that chunk will be at least 10,000 rows large. Such a chunk will probably be skipped because of --chunk-size-limit.

--chunk-size-limit
type: float; default: 4.0
Do not copy chunks this much larger than the desired chunk size.
  • 134. Percona Toolkit Documentation, Release 2.1.1 When a table has no unique indexes, chunk sizes can be inaccurate. This option specifies a maximum tolerable limit to the inaccuracy. The tool uses <EXPLAIN> to estimate how many rows are in the chunk. If that estimate exceeds the desired chunk size times the limit, then the tool skips the chunk. The minimum value for this option is 1, which means that no chunk can be larger than --chunk-size. You probably don’t want to specify 1, because rows reported by EXPLAIN are estimates, which can be different from the real number of rows in the chunk. You can disable oversized chunk checking by specifying a value of 0. The tool also uses this option to determine how to handle foreign keys that reference the table to be altered. See --alter-foreign-keys-method for details. -chunk-time type: float; default: 0.5 Adjust the chunk size dynamically so each data-copy query takes this long to execute. The tool tracks the copy rate (rows per second) and adjusts the chunk size after each data-copy query, so that the next query takes this amount of time (in seconds) to execute. It keeps an exponentially decaying moving average of queries per second, so that if the server’s performance changes due to changes in server load, the tool adapts quickly. If this option is set to zero, the chunk size doesn’t auto-adjust, so query times will vary, but query chunk sizes will not. Another way to do the same thing is to specify a value for --chunk-size explicitly, instead of leaving it at the default. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -critical-load type: Array; default: Threads_running=50 Examine SHOW GLOBAL STATUS after every chunk, and abort if the load is too high. The option accepts a comma-separated list of MySQL status variables and thresholds. An optional =MAX_VALUE (or :MAX_VALUE) can follow each variable. If not given, the tool determines a threshold by examining the current value at startup and doubling it. See --max-load for further details. These options work similarly, except that this option will abort the tool’s operation instead of pausing it, and the default value is computed differently if you specify no threshold. The reason for this option is as a safety check in case the triggers on the original table add so much load to the server that it causes downtime. There is probably no single value of Threads_running that is wrong for every server, but a default of 50 seems likely to be unacceptably high for most servers, indicating that the operation should be canceled immediately. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -[no]drop-old-table default: yes Drop the original table after renaming it. After the original table has been successfully renamed to let the new table take its place, and if there are no errors, the tool drops the original table by default. If there are any errors, the tool leaves the original table in place. -dry-run Create and alter the new table, but do not create triggers, copy data, or replace the original table. -execute Indicate that you have read the documentation and want to alter the table. You must specify this option to alter the table. If you do not, then the tool will only perform some safety checks and exit. This helps ensure that you 130 Chapter 2. Tools
  • 135. Percona Toolkit Documentation, Release 2.1.1 have read the documentation and understand how to use this tool. If you have not read the documentation, then do not specify this option. -help Show help and exit. -host short form: -h; type: string Connect to host. -lock-wait-timeout type: int; default: 1 Set the session value of innodb_lock_wait_timeout. This option helps guard against long lock waits if the data-copy queries become slow for some reason. Setting this option dynamically requires the InnoDB plugin, so this works only on newer InnoDB and MySQL versions. If the setting’s current value is greater than the specified value, and the tool cannot set the value as desired, then it prints a warning. If the tool cannot set the value but the current value is less than or equal to the desired value, there is no error. -max-lag type: time; default: 1s Pause the data copy until all replicas’ lag is less than this value. After each data-copy query (each chunk), the tool looks at the replication lag of all replicas to which it connects, using Seconds_Behind_Master. If any replica is lagging more than the value of this option, then the tool will sleep for --check-interval seconds, then check all replicas again. If you specify --check-slave-lag, then the tool only examines that server for lag, not all servers. If you want to control exactly which servers the tool monitors, use the DSN value to --recursion-method. The tool waits forever for replicas to stop lagging. If any replica is stopped, the tool waits forever until the replica is started. The data copy continues when all replicas are running and not lagging too much. The tool prints progress reports while waiting. If a replica is stopped, it prints a progress report immediately, then again at every progress report interval. -max-load type: Array; default: Threads_running=25 Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than their thresholds. The option accepts a comma-separated list of MySQL status variables. An optional =MAX_VALUE (or :MAX_VALUE) can follow each variable. If not given, the tool determines a threshold by examining the current value and increasing it by 20%. For example, if you want the tool to pause when Threads_connected gets too high, you can specify “Threads_connected”, and the tool will check the current value when it starts working and add 20% to that value. If the current value is 100, then the tool will pause when Threads_connected exceeds 120, and resume working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use either “Threads_connected:110” or “Threads_connected=110”. The purpose of this option is to prevent the tool from adding too much load to the server. If the data-copy queries are intrusive, or if they cause lock waits, then other queries on the server will tend to block and queue. This will typically cause Threads_running to increase, and the tool can detect that by running SHOW GLOBAL STATUS immediately after each query finishes. If you specify a threshold for this variable, then you can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however; it will only give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time. -password short form: -p; type: string Password to use when connecting. 2.18. pt-online-schema-change 131
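As a combined sketch of the throttling options above, the following run pauses when replicas lag or the server gets busy, and aborts outright under severe load; the thresholds, table, and connection details are illustrative:

  pt-online-schema-change --alter "ENGINE=InnoDB" --execute \
    --max-lag 5 --check-interval 2 \
    --max-load Threads_running=40 --critical-load Threads_running=80 \
    h=127.0.0.1,D=sakila,t=actor

Picking --critical-load well above --max-load gives the tool room to pause and recover before it decides to abort.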
  • 136. Percona Toolkit Documentation, Release 2.1.1 -pid type: string Create the given PID file. The file contains the process ID of the tool’s instance. The PID file is removed when the tool exits. The tool checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the tool exits. -port short form: -P; type: int Port number to use for connection. -print Print SQL statements to STDOUT. Specifying this option allows you to see most of the statements that the tool executes. You can use this option with --dry-run, for example. -progress type: array; default: time,30 Print progress reports to STDERR while copying rows. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. -quiet short form: -q Do not print messages to STDOUT. Errors and warnings are still printed to STDERR. -recurse type: int Number of levels to recurse in the hierarchy when discovering replicas. Default is infinite. See also --recursion-method. -recursion-method type: string Preferred recursion method for discovering replicas. Possible methods are: METHOD USES =========== ================== processlist SHOW PROCESSLIST hosts SHOW SLAVE HOSTS dsn=DSN DSNs from a table The processlist method is the default, because SHOW SLAVE HOSTS is not reliable. However, the hosts method can work better if the server uses a non-standard port (not 3306). The tool usually does the right thing and finds all replicas, but you may give a preferred method and it will be used first. The hosts method requires replicas to be configured with report_host, report_port, etc. The dsn method is special: it specifies a table from which other DSN strings are read. The specified DSN must specify a D and t, or a database-qualified t. The DSN table should have the following structure: CREATE TABLE ‘dsns‘ ( ‘id‘ int(11) NOT NULL AUTO_INCREMENT, ‘parent_id‘ int(11) DEFAULT NULL, ‘dsn‘ varchar(255) NOT NULL, PRIMARY KEY (‘id‘) ); 132 Chapter 2. Tools
  • 137. Percona Toolkit Documentation, Release 2.1.1 To make the tool monitor only the hosts 10.10.1.16 and 10.10.1.17 for replication lag, insert the values h=10.10.1.16 and h=10.10.1.17 into the table. Currently, the DSNs are ordered by id, but id and parent_id are otherwise ignored. -retries type: int; default: 3 Retry a chunk this many times when there is a nonfatal error. Nonfatal errors are problems such as a lock wait timeout or the query being killed. This option applies to the data copy operation. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -[no]swap-tables default: yes Swap the original table and the new, altered table. This step completes the online schema change process by making the table with the new schema take the place of the original table. The original table becomes the “old table,” and the tool drops it unless you disable --[no]drop-old-table. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.18.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Database for the old and new table. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h 2.18. pt-online-schema-change 133
  • 138. Percona Toolkit Documentation, Release 2.1.1 dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • t dsn: table; copy: no Table to alter. • u dsn: user; copy: yes User for login if not current user. 2.18.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-online-schema-change ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.18.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. This tool works only on MySQL 5.0.2 and newer versions, because earlier versions do not support triggers. 2.18.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-online-schema-change. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR 134 Chapter 2. Tools
  • 139. Percona Toolkit Documentation, Release 2.1.1 • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.18.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.18.12 AUTHORS Daniel Nichter and Baron Schwartz 2.18.13 ACKNOWLEDGMENTS The “online schema change” concept was first implemented by Shlomi Noach in his tool oak-online-alter-table, part of http://code.google.com/p/openarkkit/. Engineers at Facebook then built another version called OnlineSchemaChange.php as explained by their blog post: http://tinyurl.com/32zeb86. This tool is a hybrid of both approaches, with additional features and functionality not present in either. 2.18.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.18.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.18. pt-online-schema-change 135
  • 140. Percona Toolkit Documentation, Release 2.1.1 2.18.16 VERSION pt-online-schema-change 2.1.1 2.19 pt-pmp 2.19.1 NAME pt-pmp - Aggregate GDB stack traces for a selected program. 2.19.2 SYNOPSIS Usage pt-pmp [OPTIONS] [FILES] pt-pmp is a poor man’s profiler, inspired by http://poormansprofiler.org. It can create and summarize full stack traces of processes on Linux. Summaries of stack traces can be an invaluable tool for diagnosing what a process is waiting for. 2.19.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-pmp is a read-only tool. However, collecting GDB stacktraces is achieved by attaching GDB to the program and printing stack traces from all threads. This will freeze the program for some period of time, ranging from a second or so to much longer on very busy systems with a lot of memory and many threads in the program. In the tool’s default usage as a MySQL profiling tool, this means that MySQL will be unresponsive while the tool runs, although if you are using the tool to diagnose an unresponsive server, there is really no reason not to do this. In addition to freezing the server, there is also some risk of the server crashing or performing badly after GDB detaches from it. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-pmp. See also “BUGS” for more information on filing bugs and getting help. 2.19.4 DESCRIPTION pt-pmp performs two tasks: it gets a stack trace, and it summarizes the stack trace. If a file is given on the command line, the tool skips the first step and just aggregates the file. To summarize the stack trace, the tool extracts the function name (symbol) from each level of the stack, and combines them with commas. It does this for each thread in the output. Afterwards, it sorts similar threads together and counts how many of each one there are, then sorts them most-frequent first. 136 Chapter 2. Tools
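For instance, to take several snapshots of mysqld and keep the raw traces for later re-aggregation (the file path and counts are illustrative; the options are described below):

  pt-pmp -i 10 -s 1 -k /tmp/mysqld-stacks.txt
  pt-pmp /tmp/mysqld-stacks.txt

The first command attaches GDB ten times, one second apart, and prints the aggregated summary; the second merely re-summarizes the saved trace file, so it does not freeze the server again.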
  • 141. Percona Toolkit Documentation, Release 2.1.1 2.19.5 OPTIONS Options must precede files on the command line. -b BINARY Which binary to trace (default mysqld) -i ITERATIONS How many traces to gather and aggregate (default 1) -k KEEPFILE Keep the raw traces in this file after aggregation -l NUMBER Aggregate only first NUMBER functions; 0=infinity (default 0) -p PID Process ID of the process to trace; overrides -b -s SLEEPTIME Number of seconds to sleep between iterations (default 0) 2.19.6 ENVIRONMENT This tool does not use any environment variables. 2.19.7 SYSTEM REQUIREMENTS This tool requires Bash v3 or newer. 2.19.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-pmp. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.19.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.19. pt-pmp 137
  • 142. Percona Toolkit Documentation, Release 2.1.1 2.19.10 AUTHORS Baron Schwartz, based on a script by Domas Mituzas (http://poormansprofiler.org/) 2.19.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.19.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.19.13 VERSION pt-pmp 2.1.1 2.20 pt-query-advisor 2.20.1 NAME pt-query-advisor - Analyze queries and advise on possible problems. 2.20.2 SYNOPSIS Usage pt-query-advisor [OPTION...] [FILE] pt-query-advisor analyzes queries and advises on possible problems. Queries are given either by specifying slowlog files, –query, or –review. Analyze all queries in a slow log: pt-query-advisor /path/to/slow-query.log Analyze all queries in a general log: 138 Chapter 2. Tools
  • 143. Percona Toolkit Documentation, Release 2.1.1 pt-query-advisor --type genlog mysql.log Get queries from tcpdump using pt-query-digest: pt-query-digest --type tcpdump.txt --print --no-report | pt-query-advisor 2.20.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-query-advisor simply reads queries and examines them, and is thus very low risk. At the time of this release there is a bug that may cause an infinite (or very long) loop when parsing very large queries. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- query-advisor. See also “BUGS” for more information on filing bugs and getting help. 2.20.4 DESCRIPTION pt-query-advisor examines queries and applies rules to them, trying to find queries that look bad according to the rules. It reports on queries that match the rules, so you can find bad practices or hidden problems in your SQL. By default, it accepts a MySQL slow query log as input. 2.20.5 RULES These are the rules that pt-query-advisor will apply to the queries it examines. Each rule has three bits of information: an ID, a severity and a description. The rule’s ID is its identifier. We use a seven-character ID, and the naming convention is three characters, a period, and a three-digit number. The first three characters are sort of an abbreviation of the general class of the rule. For example, ALI.001 is some rule related to how the query uses aliases. The rule’s severity is an indication of how important it is that this rule matched a query. We use NOTE, WARN, and CRIT to denote these levels. The rule’s description is a textual, human-readable explanation of what it means when a query matches this rule. Depending on the verbosity of the report you generate, you will see more of the text in the description. By default, you’ll see only the first sentence, which is sort of a terse synopsis of the rule’s meaning. At a higher verbosity, you’ll see subsequent sentences. ALI.001 severity: note Aliasing without the AS keyword. Explicitly using the AS keyword in column or table aliases, such as “tbl AS alias,” is more readable than implicit aliases such as “tbl alias”. ALI.002 severity: warn 2.20. pt-query-advisor 139
  • 144. Percona Toolkit Documentation, Release 2.1.1 Aliasing the ‘*’ wildcard. Aliasing a column wildcard, such as “SELECT tbl.* col1, col2” probably indicates a bug in your SQL. You probably meant for the query to retrieve col1, but instead it renames the last column in the *-wildcarded list. ALI.003 severity: note Aliasing without renaming. The table or column’s alias is the same as its real name, and the alias just makes the query harder to read. ARG.001 severity: warn Argument with leading wildcard. An argument has a leading wildcard character, such as “%foo”. The predicate with this argument is not sargable and cannot use an index if one exists. ARG.002 severity: note LIKE without a wildcard. A LIKE pattern that does not include a wildcard is potentially a bug in the SQL. CLA.001 severity: warn SELECT without WHERE. The SELECT statement has no WHERE clause. CLA.002 severity: note ORDER BY RAND(). ORDER BY RAND() is a very inefficient way to retrieve a random row from the results. CLA.003 severity: note LIMIT with OFFSET. Paginating a result set with LIMIT and OFFSET is O(n^2) complexity, and will cause performance problems as the data grows larger. CLA.004 severity: note Ordinal in the GROUP BY clause. Using a number in the GROUP BY clause, instead of an expression or column name, can cause problems if the query is changed. CLA.005 severity: warn ORDER BY constant column. CLA.006 severity: warn GROUP BY or ORDER BY different tables will force a temp table and filesort. CLA.007 severity: warn ORDER BY different directions prevents index from being used. All tables in the ORDER BY clause must be either ASC or DESC, else MySQL cannot use an index. 140 Chapter 2. Tools
  • 145. Percona Toolkit Documentation, Release 2.1.1 COL.001 severity: note SELECT *. Selecting all columns with the * wildcard will cause the query’s meaning and behavior to change if the table’s schema changes, and might cause the query to retrieve too much data. COL.002 severity: note Blind INSERT. The INSERT or REPLACE query doesn’t specify the columns explicitly, so the query’s behavior will change if the table’s schema changes; use “INSERT INTO tbl(col1, col2) VALUES...” instead. LIT.001 severity: warn Storing an IP address as characters. The string literal looks like an IP address, but is not an argument to INET_ATON(), indicating that the data is stored as characters instead of as integers. It is more efficient to store IP addresses as integers. LIT.002 severity: warn Unquoted date/time literal. A query such as “WHERE col<2010-02-12” is valid SQL but is probably a bug; the literal should be quoted. KWR.001 severity: note SQL_CALC_FOUND_ROWS is inefficient. SQL_CALC_FOUND_ROWS can cause performance prob- lems because it does not scale well; use alternative strategies to build functionality such as paginated result screens. JOI.001 severity: crit Mixing comma and ANSI joins. Mixing comma joins and ANSI joins is confusing to humans, and the behavior differs between some MySQL versions. JOI.002 severity: crit A table is joined twice. The same table appears at least twice in the FROM clause. JOI.003 severity: warn Reference to outer table column in WHERE clause prevents OUTER JOIN, implicitly converts to INNER JOIN. JOI.004 severity: warn Exclusion join uses wrong column in WHERE. The exclusion join (LEFT OUTER JOIN with a WHERE clause that is satisfied only if there is no row in the right-hand table) seems to use the wrong column in the WHERE clause. A query such as ”... FROM l LEFT OUTER JOIN r ON l.l=r.r WHERE r.z IS NULL” probably ought to list r.r in the WHERE IS NULL clause. 2.20. pt-query-advisor 141
  • 146. Percona Toolkit Documentation, Release 2.1.1 RES.001 severity: warn Non-deterministic GROUP BY. The SQL retrieves columns that are neither in an aggregate function nor the GROUP BY expression, so these values will be non-deterministic in the result. RES.002 severity: warn LIMIT without ORDER BY. LIMIT without ORDER BY causes non-deterministic results, depending on the query execution plan. STA.001 severity: note != is non-standard. Use the <> operator to test for inequality. SUB.001 severity: crit IN() and NOT IN() subqueries are poorly optimized. MySQL executes the subquery as a dependent subquery for each row in the outer query. This is a frequent cause of serious performance problems. This might change version 6.0 of MySQL, but for versions 5.1 and older, the query should be rewritten as a JOIN or a LEFT OUTER JOIN, respectively. 2.20.6 OPTIONS --query and --review are mutually exclusive. This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -[no]continue-on-error default: yes Continue working even if there is an error. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -database short form: -D; type: string Connect to this database. This is also used as the default database for --[no]show-create-table if a query does not use database-qualified tables. 142 Chapter 2. Tools
  • 147. Percona Toolkit Documentation, Release 2.1.1 -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -group-by type: string; default: rule_id Group items in the report by this attribute. Possible attributes are: ATTRIBUTE GROUPS ========= ========================================================== rule_id Items matching the same rule ID query_id Queries with the same ID (the same fingerprint) none No grouping, report each query and its advice individually -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore-rules type: hash Ignore these rule IDs. Specify a comma-separated list of rule IDs (e.g. LIT.001,RES.002,etc.) to ignore. Currently, the rule IDs are case-sensitive and must be uppercase. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -print-all Print all queries, even those that do not match any rules. With --group-by none, non-matching queries are printed in the main report and profile. For other --group-by values, non-matching queries are only printed in the profile. Non-matching queries have zeros for NOTE, WARN and CRIT in the profile. -query type: string Analyze this single query and ignore files and STDIN. This option allows you to supply a single query on the command line. Any files also specified on the command line are ignored. -report-format type: string; default: compact 2.20. pt-query-advisor 143
  • 148. Percona Toolkit Documentation, Release 2.1.1 Type of report format: full or compact. In full mode, every query’s report contains the description of the rules it matched, even if this information was previously displayed. In compact mode, the repeated information is suppressed, and only the rule ID is displayed. -review type: DSN Analyze queries from this pt-query-digest query review table. -sample type: int; default: 1 How many samples of the query to show. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -[no]show-create-table default: yes Get SHOW CREATE TABLE for each query’s table. If host connection options are given (like --host, --port, etc.) then the tool will also get SHOW CREATE TABLE for each query. This information is needed for some rules like JOI.004. If this option is disabled by specifying --no-show-create-table then some rules may not be checked. -socket short form: -S; type: string Socket file to use for connection. -type type: Array The type of input to parse (default slowlog). The permitted types are slowlog and genlog. -user short form: -u; type: string User for login if not current user. -verbose short form: -v; cumulative: yes; default: 1 Increase verbosity of output. At the default level of verbosity, the program prints only the first sentence of each rule’s description. At higher levels, the program prints more of the description. See also --report-format. -version Show version and exit. -where type: string Apply this WHERE clause to the SELECT query on the --review table. 2.20.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value 144 Chapter 2. Tools
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details.

• A
   dsn: charset; copy: yes
   Default character set.
• D
   dsn: database; copy: yes
   Database that contains the query review table.
• F
   dsn: mysql_read_default_file; copy: yes
   Only read default options from the given file.
• h
   dsn: host; copy: yes
   Connect to host.
• p
   dsn: password; copy: yes
   Password to use when connecting.
• P
   dsn: port; copy: yes
   Port number to use for connection.
• S
   dsn: mysql_socket; copy: yes
   Socket file to use for connection.
• t
   Table to use as the query review table.
• u
   dsn: user; copy: yes
   User for login if not current user.

2.20.8 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like:

   PTDEBUG=1 pt-query-advisor ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.
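To show how the options and DSN components documented above combine in practice, here are two illustrative invocations. The query text, host, and user are hypothetical; the review table follows the test.query_review convention used elsewhere in this manual. The first analyzes a single statement, which would be expected to match rules such as COL.001 (SELECT *) and ARG.001 (leading wildcard):

   pt-query-advisor --query "SELECT * FROM users WHERE name LIKE '%smith%'"

The second analyzes queries stored in a pt-query-digest review table, ignoring two rules and requesting the full report format:

   pt-query-advisor --review h=host1,D=test,t=query_review,u=dbuser \
     --ask-pass --ignore-rules ALI.003,CLA.003 --report-format full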
  • 150. Percona Toolkit Documentation, Release 2.1.1 2.20.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.20.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-query-advisor. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.20.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.20.12 AUTHORS Baron Schwartz and Daniel Nichter 2.20.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 146 Chapter 2. Tools
2.20.14 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2010-2012 Percona Inc. Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue 'man perlgpl' or 'man perlartistic' to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

2.20.15 VERSION

pt-query-advisor 2.1.1

2.21 pt-query-digest

2.21.1 NAME

pt-query-digest - Analyze query execution logs and generate a query report, filter, replay, or transform queries for MySQL, PostgreSQL, memcached, and more.

2.21.2 SYNOPSIS

Usage: pt-query-digest [OPTION...] [FILE]

pt-query-digest parses and analyzes MySQL log files. With no FILE, or when FILE is -, it reads standard input.

Analyze, aggregate, and report on a slow query log:

   pt-query-digest /path/to/slow.log

Review a slow log, saving results to the test.query_review table in a MySQL server running on host1. See --review for more on reviewing queries:

   pt-query-digest --review h=host1,D=test,t=query_review /path/to/slow.log

Filter out everything but SELECT queries, replay the queries against another server, then use the timings from replaying them to analyze their performance:

   pt-query-digest /path/to/slow.log --execute h=another_server --filter '$event->{fingerprint} =~ m/^select/'

Print the structure of events so you can construct a complex --filter:

   pt-query-digest /path/to/slow.log --no-report --filter 'print Dumper($event)'
  • 152. Percona Toolkit Documentation, Release 2.1.1 Watch SHOW FULL PROCESSLIST and output a log in slow query log format: pt-query-digest --processlist h=host1 --print --no-report The default aggregation and analysis is CPU and memory intensive. Disable it if you don’t need the default report: pt-query-digest <arguments> --no-report 2.21.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. By default pt-query-digest merely collects and aggregates data from the files specified. It is designed to be as efficient as possible, but depending on the input you give it, it can use a lot of CPU and memory. Practically speaking, it is safe to run even on production systems, but you might want to monitor it until you are satisfied that the input you give it does not cause undue load. Various options will cause pt-query-digest to insert data into tables, execute SQL queries, and so on. These include the --execute option and --review. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- query-digest. See also “BUGS” for more information on filing bugs and getting help. 2.21.4 DESCRIPTION pt-query-digest is a framework for doing things with events from a query source such as the slow query log or PROCESSLIST. By default it acts as a very sophisticated log analysis tool. You can group and sort queries in many different ways simultaneously and find the most expensive queries, or create a timeline of queries in the log, for example. It can also do a “query review,” which means to save a sample of each type of query into a MySQL table so you can easily see whether you’ve reviewed and analyzed a query before. The benefit of this is that you can keep track of changes to your server’s queries and avoid repeated work. You can also save other information with the queries, such as comments, issue numbers in your ticketing system, and so on. Note that this is a work in very active progress and you should expect incompatible changes in the future. 2.21.5 ATTRIBUTES pt-query-digest works on events, which are a collection of key/value pairs called attributes. You’ll recognize most of the attributes right away: Query_time, Lock_time, and so on. You can just look at a slow log and see them. However, there are some that don’t exist in the slow log, and slow logs may actually include different kinds of attributes (for example, you may have a server with the Percona patches). For a full list of attributes, see http://code.google.com/p/maatkit/wiki/EventAttributes. With creative use of --filter, you can create new attributes derived from existing attributes. For example, to create an attribute called Row_ratio for examining the ratio of Rows_sent to Rows_examined, specify a filter like: --filter ’($event->{Row_ratio} = $event->{Rows_sent} / ($event->{Rows_examined})) && 1’ 148 Chapter 2. Tools
The && 1 trick is needed to create a valid one-line syntax that is always true, even if the assignment happens to evaluate false. The new attribute will automatically appear in the output:

   # Row ratio 1.00 0.00 1 0.50 1 0.71 0.50

Attributes created this way can be specified for --order-by or any option that requires an attribute.

2.21.6 memcached

memcached events have additional attributes related to the memcached protocol: cmd, key, res (result) and val. Also, boolean attributes are created for the various commands, misses and errors: Memc_CMD where CMD is a memcached command (get, set, delete, etc.), Memc_error and Memc_miss.

These attributes are no different from slow log attributes, so you can use them with --[no]report, --group-by, in a --filter, etc.

These attributes and more are documented at http://code.google.com/p/maatkit/wiki/EventAttributes.

2.21.7 OUTPUT

The default output is a query analysis report. The --[no]report option controls whether or not this report is printed. Sometimes you may wish to parse all the queries but suppress the report, for example when using --print or --review.

There is one paragraph for each class of query analyzed. A "class" of queries all have the same value for the --group-by attribute, which is "fingerprint" by default. (See "ATTRIBUTES".) A fingerprint is an abstracted version of the query text with literals removed, whitespace collapsed, and so forth. The report is formatted so it's easy to paste into emails without wrapping, and all non-query lines begin with a comment, so you can save it to a .sql file and open it in your favorite syntax-highlighting text editor. There is a response-time profile at the beginning.

The output described here is controlled by --report-format. That option allows you to specify what to print and in what order. The default output in the default order is described here.

The report, by default, begins with a paragraph about the entire analysis run. The information is very similar to what you'll see for each class of queries in the log, but it doesn't have some information that would be too expensive to keep globally for the analysis. It also has some statistics about the code's execution itself, such as the CPU and memory usage, the local date and time of the run, and a list of the input files read/parsed.

Following this is the response-time profile over the events. This is a highly summarized view of the unique events in the detailed query report that follows. It contains the following columns:

   Column         Meaning
   =============  ==========================================================
   Rank           The query's rank within the entire set of queries analyzed
   Query ID       The query's fingerprint
   Response time  The total response time, and percentage of overall total
   Calls          The number of times this query was executed
   R/Call         The mean response time per execution
   Apdx           The Apdex score; see --apdex-threshold for details
   V/M            The Variance-to-mean ratio of response time
   EXPLAIN        If --explain was specified, a sparkline; see --explain
   Item           The distilled query

A final line whose rank is shown as MISC contains aggregate statistics on the queries that were not included in the report, due to options such as --limit and --outliers. For details on the variance-to-mean ratio, please see http://en.wikipedia.org/wiki/Index_of_dispersion.
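Both the profile and the detailed report that follows are ordered by the --order-by attribute. As noted in "ATTRIBUTES" above, an attribute created with --filter can be used for --order-by as well; the following sketch reuses the Row_ratio filter from that section to sort query classes by the largest ratio seen (the log path is hypothetical):

   pt-query-digest /path/to/slow.log \
     --filter '($event->{Row_ratio} = $event->{Rows_sent} / ($event->{Rows_examined})) && 1' \
     --order-by Row_ratio:max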
Next, the detailed query report is printed. Each query appears in a paragraph. Here is a sample, slightly reformatted so 'perldoc' will not wrap lines in a terminal. The following will all be one paragraph, but we'll break it up for commentary.

   # Query 2: 0.01 QPS, 0.02x conc, ID 0xFDEA8D2993C9CAF3 at byte 160665

This line identifies the sequential number of the query in the sort order specified by --order-by. Then there's the queries per second, and the approximate concurrency for this query (calculated as a function of the timespan and total Query_time).

Next there's a query ID. This ID is a hex version of the query's checksum in the database, if you're using --review. You can select the reviewed query's details from the database with a query like SELECT .... WHERE checksum=0xFDEA8D2993C9CAF3.

If you are investigating the report and want to print out every sample of a particular query, then the following --filter may be helpful:

   pt-query-digest slow-log.log --no-report --print --filter '$event->{fingerprint} && make_checksum($event->{fingerprint}) eq "FDEA8D2993C9CAF3"'

Notice that you must remove the 0x prefix from the checksum in order for this to work.

Finally, in case you want to find a sample of the query in the log file, there's the byte offset where you can look. (This is not always accurate, due to some silly anomalies in the slow-log format, but it's usually right.) The position refers to the worst sample, which we'll see more about below.

Next is the table of metrics about this class of queries.

   #           pct   total    min    max    avg    95%  stddev  median
   # Count       0       2
   # Exec time  13   1105s   552s   554s   553s   554s      2s    553s
   # Lock time   0   216us   99us  117us  108us  117us    12us   108us
   # Rows sent  20   6.26M  3.13M  3.13M  3.13M  3.13M   12.73   3.13M
   # Rows exam   0   6.26M  3.13M  3.13M  3.13M  3.13M   12.73   3.13M

The first line is column headers for the table. The percentage is the percent of the total for the whole analysis run, and the total is the actual value of the specified metric. For example, in this case we can see that the query executed 2 times, which is 13% of the total number of queries in the file. The min, max and avg columns are self-explanatory. The 95% column shows the 95th percentile; 95% of the values are less than or equal to this value. The standard deviation shows you how tightly grouped the values are. The standard deviation and median are both calculated from the 95th percentile, discarding the extremely large values.

The stddev, median and 95th percentile statistics are approximate. Exact statistics require keeping every value seen, sorting, and doing some calculations on them. This uses a lot of memory. To avoid this, we keep 1000 buckets, each of them 5% bigger than the one before, ranging from .000001 up to a very big number. When we see a value we increment the bucket into which it falls. Thus we have fixed memory per class of queries. The drawback is the imprecision, which typically falls in the 5 percent range.

Next we have statistics on the users, databases and time range for the query.

   # Users       1   user1
   # Databases   2   db1(1), db2(1)
   # Time range 2008-11-26 04:55:18 to 2008-11-27 00:15:15

The users and databases are shown as a count of distinct values, followed by the values. If there's only one, it's shown alone; if there are many, we show each of the most frequent ones, followed by the number of times it appears.

   # Query_time distribution
   #   1us
   #  10us
   # 100us
   #   1ms
   #  10ms
   # 100ms
  • 155. Percona Toolkit Documentation, Release 2.1.1 # 1s # 10s+ ############################################################# The execution times show a logarithmic chart of time clustering. Each query goes into one of the “buckets” and is counted up. The buckets are powers of ten. The first bucket is all values in the “single microsecond range” – that is, less than 10us. The second is “tens of microseconds,” which is from 10us up to (but not including) 100us; and so on. The charted attribute can be changed by specifying --report-histogram but is limited to time-based attributes. # Tables # SHOW TABLE STATUS LIKE ’table1’G # SHOW CREATE TABLE ‘table1‘G # EXPLAIN SELECT * FROM table1G This section is a convenience: if you’re trying to optimize the queries you see in the slow log, you probably want to examine the table structure and size. These are copy-and-paste-ready commands to do that. Finally, we see a sample of the queries in this class of query. This is not a random sample. It is the query that performed the worst, according to the sort order given by --order-by. You will normally see a commented # EXPLAIN line just before it, so you can copy-paste the query to examine its EXPLAIN plan. But for non-SELECT queries that isn’t possible to do, so the tool tries to transform the query into a roughly equivalent SELECT query, and adds that below. If you want to find this sample event in the log, use the offset mentioned above, and something like the following: tail -c +<offset> /path/to/file | head See also --report-format. 2.21.8 SPARKLINES The output also contains sparklines. Sparklines are “data-intense, design-simple, word-sized graphics” (http://en.wikipedia.org/wiki/Sparkline).There is a sparkline for --report-histogram and for --explain. See each of those options for details about interpreting their sparklines. 2.21.9 QUERY REVIEWS A “query review” is the process of storing all the query fingerprints analyzed. This has several benefits: • You can add meta-data to classes of queries, such as marking them for follow-up, adding notes to queries, or marking them with an issue ID for your issue tracking system. • You can refer to the stored values on subsequent runs so you’ll know whether you’ve seen a query before. This can help you cut down on duplicated work. • You can store historical data such as the row count, query times, and generally anything you can see in the report. To use this feature, you run pt-query-digest with the --review option. It will store the fingerprints and other information into the table you specify. Next time you run it with the same option, it will do the following: • It won’t show you queries you’ve already reviewed. A query is considered to be already reviewed if you’ve set a value for the reviewed_by column. (If you want to see queries you’ve already reviewed, use the --report-all option.) • Queries that you’ve reviewed, and don’t appear in the output, will cause gaps in the query number sequence in the first line of each paragraph. And the value you’ve specified for --limit will still be honored. So if you’ve reviewed all queries in the top 10 and you ask for the top 10, you won’t see anything in the output. 2.21. pt-query-digest 151
  • 156. Percona Toolkit Documentation, Release 2.1.1 • If you want to see the queries you’ve already reviewed, you can specify --report-all. Then you’ll see the normal analysis output, but you’ll also see the information from the review table, just below the execution time graph. For example, # Review information # comments: really bad IN() subquery, fix soon! # first_seen: 2008-12-01 11:48:57 # jira_ticket: 1933 # last_seen: 2008-12-18 11:49:07 # priority: high # reviewed_by: xaprb # reviewed_on: 2008-12-18 15:03:11 You can see how useful this meta-data is – as you analyze your queries, you get your comments integrated right into the report. If you add the --review-history option, it will also store information into a separate database table, so you can keep historical trending information on classes of queries. 2.21.10 FINGERPRINTS A query fingerprint is the abstracted form of a query, which makes it possible to group similar queries together. Abstracting a query removes literal values, normalizes whitespace, and so on. For example, consider these two queries: SELECT name, password FROM user WHERE id=’12823’; select name, password from user where id=5; Both of those queries will fingerprint to select name, password from user where id=? Once the query’s fingerprint is known, we can then talk about a query as though it represents all similar queries. What pt-query-digest does is analogous to a GROUP BY statement in SQL. (But note that “multiple columns” doesn’t define a multi-column grouping; it defines multiple reports!) If your command-line looks like this, pt-query-digest /path/to/slow.log --select Rows_read,Rows_sent --group-by fingerprint --order-by Query_time:sum --limit 10 The corresponding pseudo-SQL looks like this: SELECT WORST(query BY Query_time), SUM(Query_time), ... FROM /path/to/slow.log GROUP BY FINGERPRINT(query) ORDER BY SUM(Query_time) DESC LIMIT 10 You can also use the value distill, which is a kind of super-fingerprint. See --group-by for more. When parsing memcached input (--type memcached), the fingerprint is an abstracted version of the com- mand and key, with placeholders removed. For example, get user_123_preferences fingerprints to get user_?_preferences. There is also a key_print which a fingerprinted version of the key. This example’s key_print is user_?_preferences. Query fingerprinting accommodates a great many special cases, which have proven necessary in the real world. For example, an IN list with 5 literals is really equivalent to one with 4 literals, so lists of literals are collapsed to a single one. If you want to understand more about how and why all of these cases are handled, please review the test cases in the Subversion repository. If you find something that is not fingerprinted properly, please submit a bug report with a reproducible test case. Here is a list of transformations during fingerprinting, which might not be exhaustive: 152 Chapter 2. Tools
  • 157. Percona Toolkit Documentation, Release 2.1.1 • Group all SELECT queries from mysqldump together, even if they are against different tables. Ditto for all of pt-table-checksum’s checksum queries. • Shorten multi-value INSERT statements to a single VALUES() list. • Strip comments. • Abstract the databases in USE statements, so all USE statements are grouped together. • Replace all literals, such as quoted strings. For efficiency, the code that replaces literal numbers is somewhat non-selective, and might replace some things as numbers when they really are not. Hexadecimal literals are also replaced. NULL is treated as a literal. Numbers embedded in identifiers are also replaced, so tables named similarly will be fingerprinted to the same values (e.g. users_2009 and users_2010 will fingerprint identically). • Collapse all whitespace into a single space. • Lowercase the entire query. • Replace all literals inside of IN() and VALUES() lists with a single placeholder, regardless of cardinality. • Collapse multiple identical UNION queries into a single one. 2.21.11 OPTIONS DSN values in --review-history default to values in --review if COPY is yes. This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -apdex-threshold type: float; default: 1.0 Set Apdex target threshold (T) for query response time. The Application Performance Index (Apdex) Technical Specification V1.1 defines T as “a positive decimal value in seconds, having no more than two significant digits of granularity.” This value only applies to query response time (Query_time). Options can be abbreviated so specifying --apdex-t also works. See http://www.apdex.org/. -ask-pass Prompt for a password when connecting to MySQL. -attribute-aliases type: array; default: db|Schema List of attribute|alias,etc. Certain attributes have multiple names, like db and Schema. If an event does not have the primary attribute, pt-query-digest looks for an alias attribute. If it finds an alias, it creates the primary attribute with the alias attribute’s value and removes the alias attribute. If the event has the primary attribute, all alias attributes are deleted. This helps simplify event attributes so that, for example, there will not be report lines for both db and Schema. -attribute-value-limit type: int; default: 4294967296 A sanity limit for attribute values. This option deals with bugs in slow-logging functionality that causes large values for attributes. If the attribute’s value is bigger than this, the last-seen value for that class of query is used instead. 2.21. pt-query-digest 153
  • 158. Percona Toolkit Documentation, Release 2.1.1 -aux-dsn type: DSN Auxiliary DSN used for special options. The following options may require a DSN even when only parsing a slow log file: * --since * --until See each option for why it might require a DSN. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -check-attributes-limit type: int; default: 1000 Stop checking for new attributes after this many events. For better speed, pt-query-digest stops checking events for new attributes after a certain number of events. Any new attributes after this number will be ignored and will not be reported. One special case is new attributes for pre-existing query classes (see --group-by about query classes). New attributes will not be added to pre-existing query classes even if the attributes are detected before the --check-attributes-limit limit. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -[no]continue-on-error default: yes Continue parsing even if there is an error. -create-review-history-table Create the --review-history table if it does not exist. This option causes the table specified by --review-history to be created with the default structure shown in the documentation for that option. -create-review-table Create the --review table if it does not exist. This option causes the table specified by --review to be created with the default structure shown in the documentation for that option. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -embedded-attributes type: array Two Perl regex patterns to capture pseudo-attributes embedded in queries. 154 Chapter 2. Tools
  • 159. Percona Toolkit Documentation, Release 2.1.1 Embedded attributes might be special attribute-value pairs that you’ve hidden in comments. The first regex should match the entire set of attributes (in case there are multiple). The second regex should match and capture attribute-value pairs from the first regex. For example, suppose your query looks like the following: SELECT * from users -- file: /login.php, line: 493; You might run pt-query-digest with the following option: :program:‘pt-query-digest‘ --embedded-attributes ’ -- .*’,’(w+): ([^,]+)’ The first regular expression captures the whole comment: " -- file: /login.php, line: 493;" The second one splits it into attribute-value pairs and adds them to the event: ATTRIBUTE VALUE ========= ========== file /login.php line 493 NOTE: All commas in the regex patterns must be escaped with otherwise the pattern will break. -execute type: DSN Execute queries on this DSN. Adds a callback into the chain, after filters but before the reports. Events are executed on this DSN. If they are successful, the time they take to execute overwrites the event’s Query_time attribute and the original Query_time value (from the log) is saved as the Exec_orig_time attribute. If unsuccessful, the callback returns false and terminates the chain. If the connection fails, pt-query-digest tries to reconnect once per second. See also --mirror and --execute-throttle. -execute-throttle type: array Throttle values for --execute. By default --execute runs without any limitations or concerns for the amount of time that it takes to execute the events. The --execute-throttle allows you to limit the amount of time spent doing --execute relative to the other processes that handle events. This works by marking some events with a Skip_exec attribute when --execute begins to take too much time. --execute will not execute an event if this attribute is true. This indirectly decreases the time spent doing --execute. The --execute-throttle option takes at least two comma-separated values: max allowed --execute time as a percentage and a check interval time. An optional third value is a percentage step for increasing and decreasing the probability that an event will be marked Skip_exec true. 5 (percent) is the default step. For example: --execute-throttle 70,60,10. This will limit --execute to 70% of total event processing time, checked every minute (60 seconds) and probability stepped up and down by 10%. When --execute exceeds 70%, the probability that events will be marked Skip_exec true increases by 10%. --execute time is checked again after another minute. If it’s still above 70%, then the probability will in- crease another 10%. Or, if it’s dropped below 70%, then the probability will decrease by 10%. -expected-range type: array; default: 5,10 2.21. pt-query-digest 155
  • 160. Percona Toolkit Documentation, Release 2.1.1 Explain items when there are more or fewer than expected. Defines the number of items expected to be seen in the report given by --[no]report, as controlled by --limit and --outliers. If there are more or fewer items in the report, each one will explain why it was included. -explain type: DSN Run EXPLAIN for the sample query with this DSN and print results. This works only when --group-by includes fingerprint. It causes pt-query-digest to run EXPLAIN and include the output into the report. For safety, queries that appear to have a subquery that EXPLAIN will execute won’t be EXPLAINed. Those are typically “derived table” queries of the form select ... from ( select .... ) der; The EXPLAIN results are printed in three places: a sparkline in the event header, a full vertical format in the event report, and a sparkline in the profile. The full format appears at the end of each event report in vertical style (G) just like MySQL prints it. The sparklines (see “SPARKLINES”) are compact representations of the access type for each table and whether or not “Using temporary” or “Using filesort” appear in EXPLAIN. The sparklines look like: nr>TF That sparkline means that there are two tables, the first uses a range (n) access, the second uses a ref access, and both “Using temporary” (T) and “Using filesort” (F) appear. The greater-than character just separates table access codes from T and/or F. The abbreviated table access codes are: a ALL c const e eq_ref f fulltext i index m index_merge n range o ref_or_null r ref s system u unique_subquery A capitalized access code means that “Using index” appears in EXPLAIN for that table. -filter type: string Discard events for which this Perl code doesn’t return true. This option is a string of Perl code or a file containing Perl code that gets compiled into a subroutine with one argument: $event. This is a hashref. If the given value is a readable file, then pt-query-digest reads the entire file and uses its contents as the code. The file should not contain a shebang (#!/usr/bin/perl) line. If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the subroutine other than return $event. The subroutine template is: sub { $event = shift; filter && return $event; } Filters given on the command line are wrapped inside parentheses like like ( filter ). For complex, multi- line filters, you must put the code inside a file so it will not be wrapped inside parentheses. Either way, the filter 156 Chapter 2. Tools
must produce syntactically valid code given the template. For example, an if-else branch given on the command line would not be valid:

   --filter 'if () { } else { }'   # WRONG

Since it's given on the command line, the if-else branch would be wrapped inside parentheses, which is not syntactically valid. So to accomplish something more complex like this would require putting the code in a file, for example filter.txt:

   my $event_ok; if (...) { $event_ok=1; } else { $event_ok=0; } $event_ok

Then specify --filter filter.txt to read the code from filter.txt.

If the filter code won't compile, pt-query-digest will die with an error. If the filter code does compile, an error may still occur at runtime if the code tries to do something wrong (like pattern match an undefined value). pt-query-digest does not provide any safeguards so code carefully!

An example filter that discards everything but SELECT statements:

   --filter '$event->{arg} =~ m/^select/i'

This is compiled into a subroutine like the following:

   sub { $event = shift; ( $event->{arg} =~ m/^select/i ) && return $event; }

It is permissible for the code to have side effects (to alter $event). You can find an explanation of the structure of $event at http://code.google.com/p/maatkit/wiki/EventAttributes.

Here are more examples of filter code:

Host/IP matches domain.com
   --filter '($event->{host} || $event->{ip} || "") =~ m/domain.com/'
   Sometimes MySQL logs the host where the IP is expected. Therefore, we check both.

User matches john
   --filter '($event->{user} || "") =~ m/john/'

More than 1 warning
   --filter '($event->{Warning_count} || 0) > 1'

Query does full table scan or full join
   --filter '(($event->{Full_scan} || "") eq "Yes") || (($event->{Full_join} || "") eq "Yes")'

Query was not served from query cache
   --filter '($event->{QC_Hit} || "") eq "No"'

Query is 1 MB or larger
   --filter '$event->{bytes} >= 1_048_576'

Since --filter allows you to alter $event, you can use it to do other things, like create new attributes. See "ATTRIBUTES" for an example.

--fingerprints
   Add query fingerprints to the standard query analysis report. This is mostly useful for debugging purposes.

--[no]for-explain
   default: yes
   Print extra information to make analysis easy.
  • 162. Percona Toolkit Documentation, Release 2.1.1 This option adds code snippets to make it easy to run SHOW CREATE TABLE and SHOW TABLE STATUS for the query’s tables. It also rewrites non-SELECT queries into a SELECT that might be helpful for determining the non-SELECT statement’s index usage. -group-by type: Array; default: fingerprint Which attribute of the events to group by. In general, you can group queries into classes based on any attribute of the query, such as user or db, which will by default show you which users and which databases get the most Query_time. The default attribute, fingerprint, groups similar, abstracted queries into classes; see below and see also “FINGERPRINTS”. A report is printed for each --group-by value (unless --no-report is given). Therefore, --group-by user,db means “report on queries with the same user and report on queries with the same db”; it does not mean “report on queries with the same user and db.” See also “OUTPUT”. Every value must have a corresponding value in the same position in --order-by. However, adding values to --group-by will automatically add values to --order-by, for your convenience. There are several magical values that cause some extra data mining to happen before the grouping takes place: fingerprint This causes events to be fingerprinted to abstract queries into a canonical form, which is then used to group events together into a class. See “FINGERPRINTS” for more about fingerprinting. tables This causes events to be inspected for what appear to be tables, and then aggregated by that. Note that a query that contains two or more tables will be counted as many times as there are tables; so a join against two tables will count the Query_time against both tables. distill This is a sort of super-fingerprint that collapses queries down into a suggestion of what they do, such as INSERT SELECT table1 table2. If parsing memcached input (--type memcached), there are other attributes which you can group by: key_print (see memcached section in “FINGERPRINTS”), cmd, key, res and val (see memcached section in “AT- TRIBUTES”). -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore-attributes type: array; default: arg, cmd, insert_id, ip, port, Thread_id, timestamp, exptime, flags, key, res, val, server_id, offset, end_log_pos, Xid Do not aggregate these attributes when auto-detecting --select. If you do not specify --select then pt-query-digest auto-detects and aggregates every attribute that it finds in the slow log. Some attributes, however, should not be aggregated. This option allows you to specify a list of attributes to ignore. This only works when no explicit --select is given. -inherit-attributes type: array; default: db,ts If missing, inherit these attributes from the last event that had them. 158 Chapter 2. Tools
  • 163. Percona Toolkit Documentation, Release 2.1.1 This option sets which attributes are inherited or carried forward to events which do not have them. For example, if one event has the db attribute equal to “foo”, but the next event doesn’t have the db attribute, then it inherits “foo” for its db attribute. Inheritance is usually desirable, but in some cases it might confuse things. If a query inherits a database that it doesn’t actually use, then this could confuse --execute. -interval type: float; default: .1 How frequently to poll the processlist, in seconds. -iterations type: int; default: 1 How many times to iterate through the collect-and-report cycle. If 0, iterate to infinity. Each iteration runs for --run-time amount of time. An iteration is usually determined by an amount of time and a report is printed when that amount of time elapses. With --run-time-mode interval, an interval is instead determined by the interval time you specify with --run-time. See --run-time and --run-time-mode for more information. -limit type: Array; default: 95%:20 Limit output to the given percentage or count. If the argument is an integer, report only the top N worst queries. If the argument is an integer followed by the % sign, report that percentage of the worst queries. If the percentage is followed by a colon and another integer, report the top percentage or the number specified by that integer, whichever comes first. The value is actually a comma-separated array of values, one for each item in --group-by. If you don’t specify a value for any of those items, the default is the top 95%. See also --outliers. -log type: string Print all output to this file when daemonized. -mirror type: float How often to check whether connections should be moved, depending on read_only. Requires --processlist and --execute. This option causes pt-query-digest to check every N seconds whether it is reading from a read-write server and executing against a read-only server, which is a sensible way to set up two servers if you’re doing something like master-master replication. The http://code.google.com/p/mysql-master-master/ master-master toolkit does this. The aim is to keep the passive server ready for failover, which is impossible without putting it under a realistic workload. -order-by type: Array; default: Query_time:sum Sort events by this attribute and aggregate function. This is a comma-separated list of order-by expressions, one for each --group-by attribute. The default Query_time:sum is used for --group-by attributes without explicitly given --order-by attributes (that is, if you specify more --group-by attributes than corresponding --order-by attributes). The syntax is attribute:aggregate. See “ATTRIBUTES” for valid attributes. Valid aggregates are: 2.21. pt-query-digest 159
   Aggregate Meaning
   ========= ============================
   sum       Sum/total attribute value
   min       Minimum attribute value
   max       Maximum attribute value
   cnt       Frequency/count of the query

For example, the default Query_time:sum means that queries in the query analysis report will be ordered (sorted) by their total query execution time ("Exec time"). Query_time:max orders the queries by their maximum query execution time, so the query with the single largest Query_time will be listed first. cnt refers more to the frequency of the query as a whole, how often it appears; "Count" is its corresponding line in the query analysis report. So any attribute and cnt should yield the same report wherein queries are sorted by the number of times they appear.

When parsing general logs (--type genlog), the default --order-by becomes Query_time:cnt. General logs do not report query times, so only the cnt aggregate makes sense because all query times are zero.

If you specify an attribute that doesn't exist in the events, then pt-query-digest falls back to the default Query_time:sum and prints a notice at the beginning of the report for each query class. You can create attributes with --filter and order by them; see "ATTRIBUTES" for an example.

--outliers
   type: array; default: Query_time:1:10
   Report outliers by attribute:percentile:count.
   The syntax of this option is a comma-separated list of colon-delimited strings. The first field is the attribute by which an outlier is defined. The second is a number that is compared to the attribute's 95th percentile. The third is optional, and is compared to the attribute's cnt aggregate. Queries that pass this specification are added to the report, regardless of any limits you specified in --limit. For example, to report queries whose 95th percentile Query_time is at least 60 seconds and which are seen at least 5 times, use the following argument:

   --outliers Query_time:60:5

   You can specify an --outliers option for each value in --group-by.

--password
   short form: -p; type: string
   Password to use when connecting.

--pid
   type: string
   Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits.

--pipeline-profile
   Print a profile of the pipeline processes.

--port
   short form: -P; type: int
   Port number to use for connection.

--print
   Print log events to STDOUT in standard slow-query-log format.
  • 165. Percona Toolkit Documentation, Release 2.1.1 -print-iterations Print the start time for each --iterations. This option causes a line like the following to be printed at the start of each --iterations report: # Iteration 2 started at 2009-11-24T14:39:48.345780 This line will print even if --no-report is specified. If --iterations 0 is specified, each iteration number will be 0. -processlist type: DSN Poll this DSN’s processlist for queries, with --interval sleep between. If the connection fails, pt-query-digest tries to reopen it once per second. See also --mirror. -progress type: array; default: time,30 Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. -read-timeout type: time; default: 0 Wait this long for an event from the input; 0 to wait forever. This option sets the maximum time to wait for an event from the input. It applies to all types of input except --processlist. If an event is not received after the specified time, the script stops reading the input and prints its reports. If --iterations is 0 or greater than 1, the next iteration will begin, else the script will exit. This option requires the Perl POSIX module. -[no]report default: yes Print out reports on the aggregate results from --group-by. This is the standard slow-log analysis functionality. See “OUTPUT” for the description of what this does and what the results look like. -report-all Include all queries, even if they have already been reviewed. -report-format type: Array; default: rusage,date,hostname,files,header,profile,query_report,prepared Print these sections of the query analysis report. SECTION PRINTS ============ ====================================================== rusage CPU times and memory usage reported by ps date Current local date and time hostname Hostname of machine on which :program:‘pt-query-digest‘ was run files Input files read/parse header Summary of the entire analysis run profile Compact table of queries for an overview of the report query_report Detailed information about each unique query prepared Prepared statements The sections are printed in the order specified. The rusage, date, files and header sections are grouped together if specified together; other sections are separated by blank lines. 2.21. pt-query-digest 161
  • 166. Percona Toolkit Documentation, Release 2.1.1 See “OUTPUT” for more information on the various parts of the query report. -report-histogram type: string; default: Query_time Chart the distribution of this attribute’s values. The distribution chart is limited to time-based attributes, so charting Rows_examined, for example, will produce a useless chart. Charts look like: # Query_time distribution # 1us # 10us # 100us # 1ms # 10ms ################################ # 100ms ################################################################ # 1s ######## # 10s+ A sparkline (see “SPARKLINES”) of the full chart is also printed in the header for each query event. The sparkline of that full chart is: # Query_time sparkline: | .^_ | The sparkline itself is the 8 characters between the pipes (|), one character for each of the 8 buckets (1us, 10us, etc.) Four character codes are used to represent the approximate relation between each bucket’s value: _ . - ^ The caret ^ represents peaks (buckets with the most values), and the underscore _ represents lows (buckets with the least or at least one value). The period . and the hyphen - represent buckets with values between these two extremes. If a bucket has no values, a space is printed. So in the example above, the period represents the 10ms bucket, the caret the 100ms bucket, and the underscore the 1s bucket. See “OUTPUT” for more information. -review type: DSN Store a sample of each class of query in this DSN. The argument specifies a table to store all unique query fingerprints in. The table must have at least the following columns. You can add more columns for your own special purposes, but they won’t be used by pt-query-digest. The following CREATE TABLE definition is also used for --create-review-table. MAGIC_create_review: CREATE TABLE query_review ( checksum BIGINT UNSIGNED NOT NULL PRIMARY KEY, fingerprint TEXT NOT NULL, sample TEXT NOT NULL, first_seen DATETIME, last_seen DATETIME, reviewed_by VARCHAR(20), reviewed_on DATETIME, comments TEXT ) The columns are as follows: 162 Chapter 2. Tools
  • 167. Percona Toolkit Documentation, Release 2.1.1 COLUMN MEANING =========== =============== checksum A 64-bit checksum of the query fingerprint fingerprint The abstracted version of the query; its primary key sample The query text of a sample of the class of queries first_seen The smallest timestamp of this class of queries last_seen The largest timestamp of this class of queries reviewed_by Initially NULL; if set, query is skipped thereafter reviewed_on Initially NULL; not assigned any special meaning comments Initially NULL; not assigned any special meaning Note that the fingerprint column is the true primary key for a class of queries. The checksum is just a cryptographic hash of this value, which provides a shorter value that is very likely to also be unique. After parsing and aggregating events, your table should contain a row for each fingerprint. This option depends on --group-by fingerprint (which is the default). It will not work otherwise. -review-history type: DSN The table in which to store historical values for review trend analysis. Each time you review queries with --review, pt-query-digest will save information into this table so you can see how classes of queries have changed over time. This DSN inherits unspecified values from --review. It should mention a table in which to store statistics about each class of queries. pt-query-digest verifies the existence of the table, and your privileges to insert, delete and update on that table. pt-query-digest then inspects the columns in the table. The table must have at least the following columns: CREATE TABLE query_review_history ( checksum BIGINT UNSIGNED NOT NULL, sample TEXT NOT NULL ); Any columns not mentioned above are inspected to see if they follow a certain naming convention. The column is special if the name ends with an underscore followed by any of these MAGIC_history_cols values: pct|avt|cnt|sum|min|max|pct_95|stddev|median|rank If the column ends with one of those values, then the prefix is interpreted as the event attribute to store in that column, and the suffix is interpreted as the metric to be stored. For example, a column named Query_time_min will be used to store the minimum Query_time for the class of events. The presence of this column will also add Query_time to the --select list. The table should also have a primary key, but that is up to you, depending on how you want to store the historical data. We suggest adding ts_min and ts_max columns and making them part of the primary key along with the checksum. But you could also just add a ts_min column and make it a DATE type, so you’d get one row per class of queries per day. The default table structure follows. The following MAGIC_create_review_history table definition is used for --create-review-history-table: CREATE TABLE query_review_history ( checksum BIGINT UNSIGNED NOT NULL, sample TEXT NOT NULL, ts_min DATETIME, ts_max DATETIME, ts_cnt FLOAT, Query_time_sum FLOAT, 2.21. pt-query-digest 163
  • 168. Percona Toolkit Documentation, Release 2.1.1 Query_time_min FLOAT, Query_time_max FLOAT, Query_time_pct_95 FLOAT, Query_time_stddev FLOAT, Query_time_median FLOAT, Lock_time_sum FLOAT, Lock_time_min FLOAT, Lock_time_max FLOAT, Lock_time_pct_95 FLOAT, Lock_time_stddev FLOAT, Lock_time_median FLOAT, Rows_sent_sum FLOAT, Rows_sent_min FLOAT, Rows_sent_max FLOAT, Rows_sent_pct_95 FLOAT, Rows_sent_stddev FLOAT, Rows_sent_median FLOAT, Rows_examined_sum FLOAT, Rows_examined_min FLOAT, Rows_examined_max FLOAT, Rows_examined_pct_95 FLOAT, Rows_examined_stddev FLOAT, Rows_examined_median FLOAT, -- Percona extended slowlog attributes -- http://www.percona.com/docs/wiki/patches:slow_extended Rows_affected_sum FLOAT, Rows_affected_min FLOAT, Rows_affected_max FLOAT, Rows_affected_pct_95 FLOAT, Rows_affected_stddev FLOAT, Rows_affected_median FLOAT, Rows_read_sum FLOAT, Rows_read_min FLOAT, Rows_read_max FLOAT, Rows_read_pct_95 FLOAT, Rows_read_stddev FLOAT, Rows_read_median FLOAT, Merge_passes_sum FLOAT, Merge_passes_min FLOAT, Merge_passes_max FLOAT, Merge_passes_pct_95 FLOAT, Merge_passes_stddev FLOAT, Merge_passes_median FLOAT, InnoDB_IO_r_ops_min FLOAT, InnoDB_IO_r_ops_max FLOAT, InnoDB_IO_r_ops_pct_95 FLOAT, InnoDB_IO_r_ops_stddev FLOAT, InnoDB_IO_r_ops_median FLOAT, InnoDB_IO_r_bytes_min FLOAT, InnoDB_IO_r_bytes_max FLOAT, InnoDB_IO_r_bytes_pct_95 FLOAT, InnoDB_IO_r_bytes_stddev FLOAT, InnoDB_IO_r_bytes_median FLOAT, InnoDB_IO_r_wait_min FLOAT, InnoDB_IO_r_wait_max FLOAT, InnoDB_IO_r_wait_pct_95 FLOAT, InnoDB_IO_r_wait_stddev FLOAT, InnoDB_IO_r_wait_median FLOAT, 164 Chapter 2. Tools
  • 169. Percona Toolkit Documentation, Release 2.1.1 InnoDB_rec_lock_wait_min FLOAT, InnoDB_rec_lock_wait_max FLOAT, InnoDB_rec_lock_wait_pct_95 FLOAT, InnoDB_rec_lock_wait_stddev FLOAT, InnoDB_rec_lock_wait_median FLOAT, InnoDB_queue_wait_min FLOAT, InnoDB_queue_wait_max FLOAT, InnoDB_queue_wait_pct_95 FLOAT, InnoDB_queue_wait_stddev FLOAT, InnoDB_queue_wait_median FLOAT, InnoDB_pages_distinct_min FLOAT, InnoDB_pages_distinct_max FLOAT, InnoDB_pages_distinct_pct_95 FLOAT, InnoDB_pages_distinct_stddev FLOAT, InnoDB_pages_distinct_median FLOAT, -- Boolean (Yes/No) attributes. Only the cnt and sum are needed for these. -- cnt is how many times is attribute was recorded and sum is how many of -- those times the value was Yes. Therefore sum/cnt * 100 = % of recorded -- times that the value was Yes. QC_Hit_cnt FLOAT, QC_Hit_sum FLOAT, Full_scan_cnt FLOAT, Full_scan_sum FLOAT, Full_join_cnt FLOAT, Full_join_sum FLOAT, Tmp_table_cnt FLOAT, Tmp_table_sum FLOAT, Tmp_table_on_disk_cnt FLOAT, Tmp_table_on_disk_sum FLOAT, Filesort_cnt FLOAT, Filesort_sum FLOAT, Filesort_on_disk_cnt FLOAT, Filesort_on_disk_sum FLOAT, PRIMARY KEY(checksum, ts_min, ts_max) ); Note that we store the count (cnt) for the ts attribute only; it will be redundant to store this for other attributes. -run-time type: time How long to run for each --iterations. The default is to run forever (you can interrupt with CTRL-C). Be- cause --iterations defaults to 1, if you only specify --run-time, pt-query-digest runs for that amount of time and then exits. The two options are specified together to do collect-and-report cycles. For example, spec- ifying --iterations 4 --run-time 15m with a continuous input (like STDIN or --processlist) will cause pt-query-digest to run for 1 hour (15 minutes x 4), reporting four times, once at each 15 minute interval. -run-time-mode type: string; default: clock Set what the value of --run-time operates on. Following are the possible values for this option: clock --run-time specifies an amount of real clock time during which the tool should run for each --iterations. event --run-time specifies an amount of log time. Log time is determined by timestamps in the log. The first timestamp seen is remembered, and each timestamp after that is compared to the first to 2.21. pt-query-digest 165
determine how much log time has passed. For example, if the first timestamp seen is 12:00:00 and the next is 12:01:30, that is 1 minute and 30 seconds of log time. The tool will read events until the log time is greater than or equal to the specified --run-time value.

Since timestamps in logs are not always printed, or not always printed frequently, this mode varies in accuracy.

interval

--run-time specifies interval boundaries of log time into which events are divided and reports are generated. This mode is different from the others because it doesn't specify how long to run. The value of --run-time must be an interval that divides evenly into minutes, hours or days. For example, 5m divides evenly into hours (60/5=12, so twelve 5-minute intervals per hour) but 7m does not (60/7=8.6).

Specifying --run-time-mode interval --run-time 30m --iterations 0 is similar to specifying --run-time-mode clock --run-time 30m --iterations 0. In the latter case, pt-query-digest will run forever, producing reports every 30 minutes, but this only works effectively with continuous inputs like STDIN and the processlist. For fixed inputs, like log files, the former example produces multiple reports by dividing the log into 30-minute intervals based on timestamps.

Intervals are calculated from the zeroth second/minute/hour in which a timestamp occurs, not from whatever time it specifies. For example, with 30-minute intervals and a timestamp of 12:10:30, the interval is not 12:10:30 to 12:40:30; it is 12:00:00 to 12:29:59. Or, with 1-hour intervals, it is 12:00:00 to 12:59:59. When a new timestamp exceeds the interval, a report is printed, and the next interval is recalculated based on the new timestamp.

Since --iterations is 1 by default, you probably want to specify a new value, else pt-query-digest will only get and report on the first interval from the log, since 1 interval = 1 iteration. If you want to get and report every interval in a log, specify --iterations 0.

--sample

type: int

Filter out all but the first N occurrences of each query. The queries are filtered on the first value in --group-by, so by default, this will filter by query fingerprint. For example, --sample 2 will permit two sample queries for each fingerprint. Useful in conjunction with --print to print out the queries. You probably want to set --no-report to avoid the overhead of aggregating and reporting if you're just using this to print out samples of queries. A complete example:

pt-query-digest --sample 2 --no-report --print slow.log

--select

type: Array

Compute aggregate statistics for these attributes. By default pt-query-digest auto-detects, aggregates and prints metrics for every query attribute that it finds in the slow query log. This option specifies a list of only the attributes that you want. You can specify an alternative attribute with a colon. For example, db:Schema uses db if it's available, and Schema if it's not.

Previously, pt-query-digest only aggregated these attributes:

Query_time,Lock_time,Rows_sent,Rows_examined,user,db:Schema,ts

Attributes specified in the --review-history table will always be selected even if you do not specify --select.

See also --ignore-attributes and "ATTRIBUTES".
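As a sketch, limiting aggregation to roughly the historical attribute list shown above might look like this (slow.log is a hypothetical input file):

   pt-query-digest --select Query_time,Lock_time,Rows_sent,Rows_examined,db:Schema slow.log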
  • 171. Percona Toolkit Documentation, Release 2.1.1 -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -shorten type: int; default: 1024 Shorten long statements in reports. Shortens long statements, replacing the omitted portion with a /*... omitted ...*/ comment. This applies only to the output in reports, not to information stored for --review or other places. It prevents a large statement from causing difficulty in a report. The argument is the preferred length of the shortened statement. Not all statements can be shortened, but very large INSERT and similar statements often can; and so can IN() lists, although only the first such list in the statement will be shortened. If it shortens something beyond recognition, you can find the original statement in the log, at the offset shown in the report header (see “OUTPUT”). -show-all type: Hash Show all values for these attributes. By default pt-query-digest only shows as many of an attribute’s value that fit on a single line. This option allows you to specify attributes for which all values will be shown (line width is ignored). This only works for attributes with string values like user, host, db, etc. Multiple attributes can be specified, comma-separated. -since type: string Parse only queries newer than this value (parse queries since this date). This option allows you to ignore queries older than a certain value and parse only those queries which are more recent than the value. The value can be several types: * Simple time value N with optional suffix: N[shmd], where s=seconds, h=hours, m=minutes, d=days (default s if no suffix given); this is like saying "since N[shmd] ago" * Full date with optional hours:minutes:seconds: YYYY-MM-DD [HH:MM::SS] * Short, MySQL-style date: YYMMDD [HH:MM:SS] * Any time expression evaluated by MySQL: CURRENT_DATE - INTERVAL 7 DAY If you give a MySQL time expression, then you must also specify a DSN so that pt-query-digest can con- nect to MySQL to evaluate the expression. If you specify --execute, --explain, --processlist, --review or --review-history, then one of these DSNs will be used automatically. Otherwise, you must specify an --aux-dsn or pt-query-digest will die saying that the value is invalid. The MySQL time expression is wrapped inside a query like “SELECT UNIX_TIMESTAMP(<expression>)”, so be sure that the expression is valid inside this query. For example, do not use UNIX_TIMESTAMP() because UNIX_TIMESTAMP(UNIX_TIMESTAMP()) returns 0. Events are assumed to be in chronological: older events at the beginning of the log and newer events at the end of the log. --since is strict: it ignores all queries until one is found that is new enough. Therefore, if the query events are not consistently timestamped, some may be ignored which are actually new enough. See also --until. 2.21. pt-query-digest 167
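Two hedged examples of --since (the file name, host, and credentials are hypothetical); the second value is a MySQL expression, so a DSN such as --aux-dsn is needed to evaluate it:

   pt-query-digest --since '2012-04-01 00:00:00' slow.log
   pt-query-digest --since 'CURRENT_DATE - INTERVAL 7 DAY' --aux-dsn h=localhost,u=digest,p=secret slow.log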
  • 172. Percona Toolkit Documentation, Release 2.1.1 -socket short form: -S; type: string Socket file to use for connection. -statistics Print statistics about internal counters. This option is mostly for development and debugging. The statistics report is printed for each iteration after all other reports, even if no events are processed or --no-report is specified. The statistics report looks like: # No events processed. # Statistic Count %/Events # ================================================ ====== ======== # events_read 142030 100.00 # events_parsed 50430 35.51 # events_aggregated 0 0.00 # ignored_midstream_server_response 18111 12.75 # no_tcp_data 91600 64.49 # pipeline_restarted_after_MemcachedProtocolParser 142030 100.00 # pipeline_restarted_after_TcpdumpParser 1 0.00 # unknown_client_command 1 0.00 # unknown_client_data 32318 22.75 The first column is the internal counter name; the second column is counter’s count; and the third column is the count as a percentage of events_read. In this case, it shows why no events were processed/aggregated: 100% of events were rejected by the MemcachedProtocolParser. Of those, 35.51% were data packets, but of these 12.75% of ignored mid- stream server response, one was an unknown client command, and 22.75% were unknown client data. The other 64.49% were TCP control packets (probably most ACKs). Since pt-query-digest is complex, you will probably need someone familiar with its code to decipher the statis- tics report. -table-access Print a table access report. The table access report shows which tables are accessed by all the queries and if the access is a read or write. The report looks like: write ‘baz‘.‘tbl‘ read ‘baz‘.‘new_tbl‘ write ‘baz‘.‘tbl3‘ write ‘db6‘.‘tbl6‘ If you pipe the output to sort, the read and write tables will be grouped together and sorted alphabetically: read ‘baz‘.‘new_tbl‘ write ‘baz‘.‘tbl‘ write ‘baz‘.‘tbl3‘ write ‘db6‘.‘tbl6‘ -tcpdump-errors type: string Write the tcpdump data to this file on error. If pt-query-digest doesn’t parse the stream correctly for some reason, the session’s packets since the last query event will be written out to create a usable test case. If this happens, pt-query-digest will not raise an error; it will just discard the session’s saved state and permit the tool to continue working. See “tcpdump” for more information about parsing tcpdump output. 168 Chapter 2. Tools
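To get the sorted table-access listing described under --table-access above, one approach (a hedged sketch; slow.log is hypothetical, and --no-report is used to suppress the default analysis output) is:

   pt-query-digest --table-access --no-report slow.log | sort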
--timeline

Show a timeline of events.

This option makes pt-query-digest print another kind of report: a timeline of the events. Each query is still grouped and aggregated into classes according to --group-by, but then the classes are printed in chronological order. The timeline report prints out the timestamp, interval, count, and value of each class.

If all you want is the timeline report, then specify --no-report to suppress the default query analysis report. Otherwise, the timeline report will be printed at the end before the response-time profile (see --report-format and "OUTPUT"). For example, this:

pt-query-digest /path/to/log --group-by distill --timeline

will print something like:

# ########################################################
# distill report
# ########################################################
# 2009-07-25 11:19:27 1+00:00:01 2 SELECT foo
# 2009-07-27 11:19:30      00:01 2 SELECT bar
# 2009-07-27 11:30:00 1+06:30:00 2 SELECT foo

--type

type: Array

The type of input to parse (default slowlog). The permitted types are

binlog

Parse a binary log file.

genlog

Parse a MySQL general log file. General logs lack a lot of "ATTRIBUTES", notably Query_time. The default --order-by for general logs changes to Query_time:cnt.

http

Parse HTTP traffic from tcpdump.

pglog

Parse a log file in PostgreSQL format. The parser will automatically recognize logs sent to syslog and transparently parse the syslog format, too. The recommended configuration for logging in your postgresql.conf is as follows.

The log_destination setting can be set to either syslog or stderr. Syslog has the added benefit of not interleaving log messages from several sessions concurrently, which the parser cannot handle, so this might be better than stderr. CSV-formatted logs are not supported at this time.

The log_min_duration_statement setting should be set to 0 to capture all statements with their durations. Alternatively, the parser will also recognize and handle various combinations of log_duration and log_statement.

You may enable log_connections and log_disconnections, but this is optional.

It is highly recommended to set your log_line_prefix to the following:

log_line_prefix = '%m c=%c,u=%u,D=%d '

This lets the parser find timestamps with milliseconds, session IDs, users, and databases from the log. If these items are missing, you'll simply get less information to analyze. For compatibility with
  • 174. Percona Toolkit Documentation, Release 2.1.1 other log analysis tools such as PQA and pgfouine, various log line prefix formats are supported. The general format is as follows: a timestamp can be detected and extracted (the syslog timestamp is NOT parsed), and a name=value list of properties can also. Although the suggested format is as shown above, any name=value list will be captured and interpreted by using the first letter of the ‘name’ part, lowercased, to determine the meaning of the item. The lowercased first letter is interpreted to mean the same thing as PostgreSQL’s built-in %-codes for the log_line_prefix format string. For example, u means user, so unicorn=fred will be interpreted as user=fred; d means database, so D=john will be interpreted as database=john. The pgfouine-suggested formatting is user=%u and db=%d, so it should Just Work regardless of which format you choose. The main thing is to add as much information as possible into the log_line_prefix to permit richer analysis. Currently, only English locale messages are supported, so if your server’s locale is set to something else, the log won’t be parsed properly. (Log messages with “duration:” and “statement:” won’t be recognized.) slowlog Parse a log file in any variation of MySQL slow-log format. tcpdump Inspect network packets and decode the MySQL client protocol, extracting queries and responses from it. pt-query-digest does not actually watch the network (i.e. it does NOT “sniff packets”). Instead, it’s just parsing the output of tcpdump. You are responsible for generating this output; pt-query-digest does not do it for you. Then you send this to pt-query-digest as you would any log file: as files on the command line or to STDIN. The parser expects the input to be formatted with the following options: -x -n -q -tttt. For example, if you want to capture output from your local machine, you can do something like the following (the port must come last on FreeBSD): tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txt :program:‘pt-query-digest‘ --type tcpdump mysql.tcp.txt The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make sure the output looks like this (there is a line break in the first line to avoid man-page problems): 2009-04-12 09:50:16.804849 IP 127.0.0.1.42167 > 127.0.0.1.3306: tcp 37 0x0000: 4508 0059 6eb2 4000 4006 cde2 7f00 0001 0x0010: .... Remember tcpdump has a handy -c option to stop after it captures some number of packets! That’s very useful for testing your tcpdump command. Note that tcpdump can’t capture traffic on a Unix socket. Read http://bugs.mysql.com/bug.php?id=31577 if you’re confused about this. Devananda Van Der Veen explained on the MySQL Performance Blog how to capture traffic without dropping packets on busy servers. Dropped packets cause pt-query-digest to miss the response to a request, then see the response to a later request and assign the wrong execution time to the query. You can change the filter to something like the following to help capture a subset of the queries. (See http://www.mysqlperformanceblog.com/?p=6092 for details.) tcpdump -i any -s 65535 -x -n -q -tttt ’port 3306 and tcp[1] & 7 == 2 and tcp[3] & 7 == 2’ All MySQL servers running on port 3306 are automatically detected in the tcpdump output. Therefore, if the tcpdump out contains packets from multiple servers on port 3306 (for example, 170 Chapter 2. Tools
  • 175. Percona Toolkit Documentation, Release 2.1.1 10.0.0.1:3306, 10.0.0.2:3306, etc.), all packets/queries from all these servers will be analyzed to- gether as if they were one server. If you’re analyzing traffic for a MySQL server that is not running on port 3306, see --watch-server. Also note that pt-query-digest may fail to report the database for queries when parsing tcpdump out- put. The database is discovered only in the initial connect events for a new client or when <USE db> is executed. If the tcpdump output contains neither of these, then pt-query-digest cannot discover the database. Server-side prepared statements are supported. SSL-encrypted traffic cannot be inspected and de- coded. memcached Similar to tcpdump, but the expected input is memcached packets instead of MySQL packets. For example: tcpdump -i any port 11211 -s 65535 -x -nn -q -tttt > memcached.tcp.txt :program:‘pt-query-digest‘ --type memcached memcached.tcp.txt memcached uses port 11211 by default. -until type: string Parse only queries older than this value (parse queries until this date). This option allows you to ignore queries newer than a certain value and parse only those queries which are older than the value. The value can be one of the same types listed for --since. Unlike --since, --until is not strict: all queries are parsed until one has a timestamp that is equal to or greater than --until. Then all subsequent queries are ignored. -user short form: -u; type: string User for login if not current user. -variations type: Array Report the number of variations in these attributes’ values. Variations show how many distinct values an attribute had within a class. The usual value for this option is arg which shows how many distinct queries were in the class. This can be useful to determine a query’s cacheability. Distinct values are determined by CRC32 checksums of the attributes’ values. These checksums are reported in the query report for attributes specified by this option, like: # arg crc 109 (1/25%), 144 (1/25%)... 2 more In that class there were 4 distinct queries. The checksums of the first two variations are shown, and each one occurred once (or, 25% of the time). The counts of distinct variations is approximate because only 1,000 variations are saved. The mod (%) 1000 of the full CRC32 checksum is saved, so some distinct checksums are treated as equal. -version Show version and exit. 2.21. pt-query-digest 171
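Hedged examples of --until and --variations from the entries above (the date range and file name are hypothetical):

   pt-query-digest --since '2012-04-01' --until '2012-04-02' slow.log
   pt-query-digest --variations arg slow.log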
  • 176. Percona Toolkit Documentation, Release 2.1.1 -watch-server type: string This option tells pt-query-digest which server IP address and port (like “10.0.0.1:3306”) to watch when parsing tcpdump (for --type tcpdump and memcached); all other servers are ignored. If you don’t specify it, pt- query-digest watches all servers by looking for any IP address using port 3306 or “mysql”. If you’re watching a server with a non-standard port, this won’t work, so you must specify the IP address and port to watch. If you want to watch a mix of servers, some running on standard port 3306 and some running on non-standard ports, you need to create separate tcpdump outputs for the non-standard port servers and then specify this option for each. At present pt-query-digest cannot auto-detect servers on port 3306 and also be told to watch a server on a non-standard port. -[no]zero-admin default: yes Zero out the Rows_XXX properties for administrator command events. -[no]zero-bool default: yes Print 0% boolean values in report. 2.21.12 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Database that contains the query review table. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. 172 Chapter 2. Tools
  • 177. Percona Toolkit Documentation, Release 2.1.1 • S dsn: mysql_socket; copy: yes Socket file to use for connection. • t Table to use as the query review table. • u dsn: user; copy: yes User for login if not current user. 2.21.13 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-query-digest ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.21.14 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.21.15 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-query-digest. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.21.16 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb 2.21. pt-query-digest 173
  • 178. Percona Toolkit Documentation, Release 2.1.1 You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.21.17 AUTHORS Baron Schwartz and Daniel Nichter 2.21.18 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.21.19 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2008-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.21.20 VERSION pt-query-digest 2.1.1 2.22 pt-show-grants 2.22.1 NAME pt-show-grants - Canonicalize and print MySQL grants so you can effectively replicate, compare and version-control them. 2.22.2 SYNOPSIS Usage pt-show-grants [OPTION...] [DSN] pt-show-grants shows grants (user privileges) from a MySQL server. 174 Chapter 2. Tools
Examples

pt-show-grants
pt-show-grants --separate --revoke | diff othergrants.sql -

2.22.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

pt-show-grants is read-only by default, and very low-risk. If you specify --flush, it will execute FLUSH PRIVILEGES.

At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-show-grants.

See also "BUGS" for more information on filing bugs and getting help.

2.22.4 DESCRIPTION

pt-show-grants extracts, orders, and then prints grants for MySQL user accounts. Why would you want this? There are several reasons.

The first is to easily replicate users from one server to another; you can simply extract the grants from the first server and pipe the output directly into another server.

The second use is to place your grants into version control. If you do a daily automated grant dump into version control, you'll get lots of spurious changesets for grants that don't change, because MySQL prints the actual grants out in a seemingly random order. For instance, one day it'll say

GRANT DELETE, INSERT, UPDATE ON `test`.* TO 'foo'@'%';

And then another day it'll say

GRANT INSERT, DELETE, UPDATE ON `test`.* TO 'foo'@'%';

The grants haven't changed, but the order has. This script sorts the grants within the line, between GRANT and ON. If there are multiple rows from SHOW GRANTS, it sorts the rows too, except that it always prints the row with the user's password first, if it exists. This removes three kinds of inconsistency you'll get from running SHOW GRANTS, and avoids spurious changesets in version control.

Third, if you want to diff grants across servers, it will be hard without "canonicalizing" them, which pt-show-grants does. The output is fully diff-able.

With the --revoke, --separate and other options, pt-show-grants also makes it easy to revoke specific privileges from users. This is tedious otherwise.

2.22.5 OPTIONS

This tool accepts additional command-line arguments. Refer to the "SYNOPSIS" and usage information for details.
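Before the individual options, here is a hedged sketch of the replication, version-control, and diff workflows described above (host names, file paths, and credentials are hypothetical and omitted where not essential):

   # Copy users from one server to another (add credentials as needed)
   pt-show-grants h=server1 | mysql -h server2
   # Keep a canonicalized dump under version control
   pt-show-grants h=server1 > grants/server1.sql
   # Compare grants between two servers
   pt-show-grants h=server1 > /tmp/server1.sql
   pt-show-grants h=server2 | diff /tmp/server1.sql -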
  • 180. Percona Toolkit Documentation, Release 2.1.1 -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -database short form: -D; type: string The database to use for the connection. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -drop Add DROP USER before each user in the output. -flush Add FLUSH PRIVILEGES after output. You might need this on pre-4.1.1 servers if you want to drop a user completely. -[no]header default: yes Print dump header. The header precedes the dumped grants. It looks like: -- Grants dumped by :program:‘pt-show-grants‘ 1.0.19 -- Dumped from server Localhost via UNIX socket, MySQL 5.0.82-log at 2009-10-26 10:01:04 See also --[no]timestamp. -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore type: array Ignore this comma-separated list of users. -only type: array Only show grants for this comma-separated list of users. -password short form: -p; type: string 176 Chapter 2. Tools
  • 181. Percona Toolkit Documentation, Release 2.1.1 Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -port short form: -P; type: int Port number to use for connection. -revoke Add REVOKE statements for each GRANT statement. -separate List each GRANT or REVOKE separately. The default output from MySQL’s SHOW GRANTS command lists many privileges on a single line. With --flush, places a FLUSH PRIVILEGES after each user, instead of once at the end of all the output. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -[no]timestamp default: yes Add timestamp to the dump header. See also --[no]header. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.22.6 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. 2.22. pt-show-grants 177
  • 182. Percona Toolkit Documentation, Release 2.1.1 • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.22.7 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-show-grants ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.22.8 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.22.9 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-show-grants. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool 178 Chapter 2. Tools
  • 183. Percona Toolkit Documentation, Release 2.1.1 • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.22.10 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.22.11 AUTHORS Baron Schwartz 2.22.12 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.22.13 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.22. pt-show-grants 179
  • 184. Percona Toolkit Documentation, Release 2.1.1 2.22.14 VERSION pt-show-grants 2.1.1 2.23 pt-sift 2.23.1 NAME pt-sift - Browses files created by pt-collect. 2.23.2 SYNOPSIS Usage pt-sift FILE|PREFIX|DIRECTORY pt-sift browses the files created by pt-collect. If you specify a FILE or PREFIX, it browses only files with that prefix. If you specify a DIRECTORY, then it browses all files within that directory. 2.23.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-sift is a read-only tool. It should be very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-sift. See also “BUGS” for more information on filing bugs and getting help. 2.23.4 DESCRIPTION pt-sift downloads other tools that it might need, such as pt-diskstats, and then makes a list of the unique timestamp prefixes of all the files in the directory, as written by the pt-collect tool. If the user specified a timestamp on the command line, then it begins with that sample of data; otherwise it begins by showing a list of the timestamps and prompting for a selection. Thereafter, it displays a summary of the selected sample, and the user can navigate and inspect with keystrokes. The keystroke commands you can use are as follows: d Sets the action to start the pt-diskstats tool on the sample’s disk performance statistics. i Sets the action to view the first INNODB STATUS sample in less. m Displays the first 4 samples of SHOW STATUS counters side by side with the pt-mext tool. n 180 Chapter 2. Tools
  • 185. Percona Toolkit Documentation, Release 2.1.1 Summarizes the first sample of netstat data in two ways: by originating host, and by connection state. j Select the next timestamp as the active sample. k Select the previous timestamp as the active sample. q Quit the program. 1 Sets the action for each sample to the default, which is to view a summary of the sample. 0 Sets the action to just list the files in the sample. • Sets the action to view all of the sample’s files in the less program. 2.23.5 OPTIONS This tool does not have any command-line options. 2.23.6 ENVIRONMENT This tool does not use any environment variables. 2.23.7 SYSTEM REQUIREMENTS This tool requires Bash v3 and the following programs: pt-diskstats, pt-pmp, pt-mext, and pt-align. If these programs are not in your PATH, they will be fetched from the Internet if curl is available. 2.23.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-sift. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.23. pt-sift 181
  • 186. Percona Toolkit Documentation, Release 2.1.1 2.23.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.23.10 AUTHORS Baron Schwartz 2.23.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.23.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.23.13 VERSION pt-sift 2.1.1 182 Chapter 2. Tools
2.24 pt-slave-delay

2.24.1 NAME

pt-slave-delay - Make a MySQL slave server lag behind its master.

2.24.2 SYNOPSIS

Usage

pt-slave-delay [OPTION...] SLAVE-HOST [MASTER-HOST]

pt-slave-delay starts and stops a slave server as needed to make it lag behind the master. The SLAVE-HOST and MASTER-HOST use DSN syntax, and values are copied from the SLAVE-HOST to the MASTER-HOST if omitted.

To hold slavehost one minute behind its master for ten minutes:

pt-slave-delay --delay 1m --interval 15s --run-time 10m slavehost

2.24.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

pt-slave-delay is generally very low-risk. It simply starts and stops the replication SQL thread. This might cause monitoring systems to think the slave is having trouble.

At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-slave-delay.

See also "BUGS" for more information on filing bugs and getting help.

2.24.4 DESCRIPTION

pt-slave-delay watches a slave and starts and stops its replication SQL thread as necessary to hold it at least as far behind the master as you request. In practice, it will typically cause the slave to lag between --delay and --delay plus --interval behind the master.

It bases the delay on binlog positions in the slave's relay logs by default, so there is no need to connect to the master. This works well if the IO thread doesn't lag the master much, which is typical in most replication setups; the IO thread lag is usually milliseconds on a fast network. If your IO thread's lag is too large for your purposes, pt-slave-delay can also connect to the master for information about binlog positions.

If the slave's I/O thread reports that it is waiting for the SQL thread to free some relay log space, pt-slave-delay will automatically connect to the master to find binary log positions. If --ask-pass and --daemonize are given, it is possible that this could cause it to ask for a password while daemonized. In this case, it exits. Therefore, if you think your slave might encounter this condition, you should be sure to either specify --use-master explicitly when daemonizing, or not specify --ask-pass.

The SLAVE-HOST and optional MASTER-HOST are both DSNs. See "DSN OPTIONS". Missing MASTER-HOST values are filled in with values from SLAVE-HOST, so you don't need to specify them in both places. pt-slave-delay
  • 188. Percona Toolkit Documentation, Release 2.1.1 reads all normal MySQL option files, such as ~/.my.cnf, so you may not need to specify username, password and other common options at all. pt-slave-delay tries to exit gracefully by trapping signals such as Ctrl-C. You cannot bypass --[no]continue with a trappable signal. 2.24.5 PRIVILEGES pt-slave-delay requires the following privileges: PROCESS, REPLICATION CLIENT, and SUPER. 2.24.6 OUTPUT If you specify --quiet, there is no output. Otherwise, the normal output is a status message consisting of a timestamp and information about what pt-slave-delay is doing: starting the slave, stopping the slave, or just observing. 2.24.7 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -[no]continue default: yes Continue replication normally on exit. After exiting, restart the slave’s SQL thread with no UNTIL condition, so it will run as usual and catch up to the master. This is enabled by default and works even if you terminate pt-slave-delay with Control-C. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -delay type: time; default: 1h How far the slave should lag its master. -help Show help and exit. 184 Chapter 2. Tools
  • 189. Percona Toolkit Documentation, Release 2.1.1 -host short form: -h; type: string Connect to host. -interval type: time; default: 1m How frequently pt-slave-delay should check whether the slave needs to be started or stopped. -log type: string Print all output to this file when daemonized. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -quiet short form: -q Don’t print informational messages about operation. See OUTPUT for details. -run-time type: time How long pt-slave-delay should run before exiting. The default is to run forever. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -use-master Get binlog positions from master, not slave. Don’t trust the binlog positions in the slave’s relay log. Connect to the master and get binlog positions instead. If you specify this option without giving a MASTER-HOST on the command line, pt-slave-delay examines the slave’s SHOW SLAVE STATUS to determine the hostname and port for connecting to the master. pt-slave-delay uses only the MASTER_HOST and MASTER_PORT values from SHOW SLAVE STATUS for the master connection. It does not use the MASTER_USER value. If you want to specify a different username for the master than the one you use to connect to the slave, you should specify the MASTER-HOST option explicitly on the command line. 2.24. pt-slave-delay 185
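Two hedged invocations building on the options above (host names and file paths are hypothetical). The first keeps a slave roughly an hour behind and runs in the background; the second trusts the master's binlog positions instead of the slave's relay log:

   pt-slave-delay --delay 1h --interval 1m --daemonize --log /var/log/pt-slave-delay.log --pid /var/run/pt-slave-delay.pid slavehost
   pt-slave-delay --delay 30m --use-master slavehost masterhost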
  • 190. Percona Toolkit Documentation, Release 2.1.1 -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.24.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 186 Chapter 2. Tools
  • 191. Percona Toolkit Documentation, Release 2.1.1 2.24.9 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-slave-delay ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.24.10 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.24.11 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-slave-delay. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.24.12 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.24.13 AUTHORS Sergey Zhuravlev and Baron Schwartz 2.24. pt-slave-delay 187
  • 192. Percona Toolkit Documentation, Release 2.1.1 2.24.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.24.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Sergey Zhuravle and Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.24.16 VERSION pt-slave-delay 2.1.1 2.25 pt-slave-find 2.25.1 NAME pt-slave-find - Find and print replication hierarchy tree of MySQL slaves. 2.25.2 SYNOPSIS Usage pt-slave-find [OPTION...] MASTER-HOST pt-slave-find finds and prints a hierarchy tree of MySQL slaves. Examples pt-slave-find --host master-host 188 Chapter 2. Tools
  • 193. Percona Toolkit Documentation, Release 2.1.1 2.25.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-slave-find is read-only and very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- slave-find. See also “BUGS” for more information on filing bugs and getting help. 2.25.4 DESCRIPTION pt-slave-find connects to a MySQL replication master and finds its slaves. Currently the only thing it can do is print a tree-like view of the replication hierarchy. The master host can be specified using one of two methods. The first method is to use the standard connection-related command line options: --defaults-file, --password, --host, --port, --socket or --user. The second method to specify the master host is a DSN. A DSN is a special syntax that can be either just a hostname (like server.domain.com or 1.2.3.4), or a key=value,key=value string. Keys are a single letter: KEY MEANING === ======= h Connect to host P Port number to use for connection S Socket file to use for connection u User for login if not current user p Password to use when connecting F Only read default options from the given file pt-slave-find reads all normal MySQL option files, such as ~/.my.cnf, so you may not need to specify username, password and other common options at all. 2.25.5 EXIT STATUS An exit status of 0 (sometimes also called a return value or return code) indicates success. Any other value represents the exit status of the Perl process itself. 2.25.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. 2.25. pt-slave-find 189
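To make the two connection methods above concrete (the host and user names are placeholders), either of these works:

   # Standard connection options
   pt-slave-find --host master.example.com --user repl_monitor --ask-pass
   # The same master specified as a DSN; the password is read from ~/.my.cnf if present
   pt-slave-find h=master.example.com,u=repl_monitor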
  • 194. Percona Toolkit Documentation, Release 2.1.1 -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -database type: string; short form: -D Database to use. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -help Show help and exit. -host short form: -h; type: string Connect to host. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -port short form: -P; type: int Port number to use for connection. -recurse type: int Number of levels to recurse in the hierarchy. Default is infinite. See --recursion-method. -recursion-method type: string Preferred recursion method used to find slaves. Possible methods are: METHOD USES =========== ================ processlist SHOW PROCESSLIST hosts SHOW SLAVE HOSTS The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method is required if the server uses a non-standard port (not 3306). Usually pt-slave-find does the right thing and finds the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the other methods will be tried. 190 Chapter 2. Tools
  • 195. Percona Toolkit Documentation, Release 2.1.1 -report-format type: string; default: summary Set what information about the slaves is printed. The report format can be one of the following: •hostname Print just the hostname name of the slaves. It looks like: 127.0.0.1:12345 +- 127.0.0.1:12346 +- 127.0.0.1:12347 •summary Print a summary of each slave’s settings. This report shows more information about each slave, like: 127.0.0.1:12345 Version 5.1.34-log Server ID 12345 Uptime 04:56 (started 2010-06-17T11:21:22) Replication Is not a slave, has 1 slaves connected Filters Binary logging STATEMENT Slave status Slave mode STRICT Auto-increment increment 1, offset 1 +- 127.0.0.1:12346 Version 5.1.34-log Server ID 12346 Uptime 04:54 (started 2010-06-17T11:21:24) Replication Is a slave, has 1 slaves connected Filters Binary logging STATEMENT Slave status 0 seconds behind, running, no errors Slave mode STRICT Auto-increment increment 1, offset 1 -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.25. pt-slave-find 191
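For example, to print only the compact hostname tree, stop after two levels of slaves, and prefer SHOW SLAVE HOSTS for discovery (the host name is a placeholder), a reasonable sketch is:

   pt-slave-find h=master.example.com --report-format hostname \
     --recurse 2 --recursion-method hosts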
  • 196. Percona Toolkit Documentation, Release 2.1.1 2.25.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.25.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-slave-find ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 192 Chapter 2. Tools
  • 197. Percona Toolkit Documentation, Release 2.1.1 2.25.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.25.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-slave-find. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.25.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.25.12 AUTHORS Baron Schwartz and Daniel Nichter 2.25.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.25. pt-slave-find 193
  • 198. Percona Toolkit Documentation, Release 2.1.1 2.25.14 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.25.15 VERSION pt-slave-find 2.1.1 2.26 pt-slave-restart 2.26.1 NAME pt-slave-restart - Watch and restart MySQL replication after errors. 2.26.2 SYNOPSIS Usage pt-slave-restart [OPTION...] [DSN] pt-slave-restart watches one or more MySQL replication slaves for errors, and tries to restart replication if it stops. 2.26.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-slave-restart is a brute-force way to try to keep a slave server running when it is having problems with replication. Don’t be too hasty to use it unless you need to. If you use this tool carelessly, you might miss the chance to really solve the slave server’s problems. At the time of this release there is a bug that causes an invalid CHANGE MASTER TO statement to be executed. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- slave-restart. See also “BUGS” for more information on filing bugs and getting help. 194 Chapter 2. Tools
  • 199. Percona Toolkit Documentation, Release 2.1.1 2.26.4 DESCRIPTION pt-slave-restart watches one or more MySQL replication slaves and tries to skip statements that cause errors. It polls slaves intelligently with an exponentially varying sleep time. You can specify errors to skip and run the slaves until a certain binlog position. Note: it has come to my attention that Yahoo! had or has an internal tool called fix_repl, described to me by a past Yahoo! employee and mentioned in the first edition of High Performance MySQL. Apparently this tool does the same thing. Make no mistake, though: this is not a way to “fix replication.” In fact I would not even encourage its use on a regular basis; I use it only when I have an error I know I just need to skip past. 2.26.5 OUTPUT If you specify --verbose, pt-slave-restart prints a line every time it sees the slave has an error. See --verbose for details. 2.26.6 SLEEP pt-slave-restart sleeps intelligently between polling the slave. The current sleep time varies. • The initial sleep time is given by --sleep. • If it checks and finds an error, it halves the previous sleep time. • If it finds no error, it doubles the previous sleep time. • The sleep time is bounded below by --min-sleep and above by --max-sleep. • Immediately after finding an error, pt-slave-restart assumes another error is very likely to happen next, so it sleeps the current sleep time or the initial sleep time, whichever is less. 2.26.7 EXIT STATUS An exit status of 0 (sometimes also called a return value or return code) indicates success. Any other value represents the exit status of the Perl process itself, or of the last forked process that exited if there were multiple servers to monitor. 2.26.8 COMPATIBILITY pt-slave-restart should work on many versions of MySQL. Lettercase of many output columns from SHOW SLAVE STATUS has changed over time, so it treats them all as lowercase. 2.26.9 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -always Start slaves even when there is no error. With this option enabled, pt-slave-restart will not let you stop the slave manually if you want to! -ask-pass Prompt for a password when connecting to MySQL. 2.26. pt-slave-restart 195
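The adaptive polling described in the SLEEP section above can be sketched roughly as follows; slave_has_error is a hypothetical placeholder for the real check, and the arithmetic is only an illustration of the halving/doubling rule, not the tool's actual code:

   sleep_time=$INITIAL_SLEEP             # --sleep
   while true; do
       if slave_has_error; then
           # Errors tend to cluster: halve the wait
           sleep_time=$(awk -v s="$sleep_time" 'BEGIN { print s / 2 }')
       else
           # Quiet period: back off by doubling the wait
           sleep_time=$(awk -v s="$sleep_time" 'BEGIN { print s * 2 }')
       fi
       # Clamp between --min-sleep and --max-sleep
       sleep_time=$(awk -v s="$sleep_time" -v lo="$MIN_SLEEP" -v hi="$MAX_SLEEP" \
           'BEGIN { if (s < lo) s = lo; if (s > hi) s = hi; print s }')
       sleep "$sleep_time"
   done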
• 200. Percona Toolkit Documentation, Release 2.1.1 -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -[no]check-relay-log default: yes Check the last relay log file and position before checking for slave errors. By default pt-slave-restart will not do anything (it will just sleep) if neither the relay log file nor the relay log position have changed since the last check. This prevents infinite loops (i.e. restarting the same error in the same relay log file at the same relay log position). For certain slave errors, however, this check needs to be disabled by specifying --no-check-relay-log. Do not do this unless you know what you are doing! -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -database short form: -D; type: string Database to use. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -error-length type: int Max length of error message to print. When --verbose is set high enough to print the error, this option will truncate the error text to the specified length. This can be useful to prevent wrapping on the terminal. -error-numbers type: hash Only restart this comma-separated list of errors. Makes pt-slave-restart only try to restart if the error number is in this comma-separated list of errors. If it sees an error not in the list, it will exit. The error number is in the last_errno column of SHOW SLAVE STATUS. -error-text type: string Only restart errors that match this pattern. A Perl regular expression against which the error text, if any, is matched. If the error text exists and matches, pt-slave-restart will try to restart the slave. If it exists but doesn’t match, pt-slave-restart will exit. The error text is in the last_error column of SHOW SLAVE STATUS. -help Show help and exit. 196 Chapter 2. Tools
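For instance, to restart only on the common duplicate-key and row-not-found errors and exit on anything else (1062 and 1032 are standard MySQL error codes; the host is a placeholder):

   pt-slave-restart --error-numbers 1062,1032 h=slave1.example.com --ask-pass
   # Or filter by error text instead of error number
   pt-slave-restart --error-text "Duplicate entry" h=slave1.example.com --ask-pass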
  • 201. Percona Toolkit Documentation, Release 2.1.1 -host short form: -h; type: string Connect to host. -log type: string Print all output to this file when daemonized. -max-sleep type: float; default: 64 Maximum sleep seconds. The maximum time pt-slave-restart will sleep before polling the slave again. This is also the time that pt-slave- restart will wait for all other running instances to quit if both --stop and --monitor are specified. See “SLEEP”. -min-sleep type: float; default: 0.015625 The minimum time pt-slave-restart will sleep before polling the slave again. See “SLEEP”. -monitor Whether to monitor the slave (default). Unless you specify –monitor explicitly, --stop will disable it. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -quiet short form: -q Suppresses normal output (disables --verbose). -recurse type: int; default: 0 Watch slaves of the specified server, up to the specified number of servers deep in the hierarchy. The default depth of 0 means “just watch the slave specified.” pt-slave-restart examines SHOW PROCESSLIST and tries to determine which connections are from slaves, then connect to them. See --recursion-method. Recursion works by finding all slaves when the program starts, then watching them. If there is more than one slave, pt-slave-restart uses fork() to monitor them. This also works if you have configured your slaves to show up in SHOW SLAVE HOSTS. The minimal config- uration for this is the report_host parameter, but there are other “report” parameters as well for the port, username, and password. 2.26. pt-slave-restart 197
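Putting several of these options together, a daemonized run that also watches the slave's own slaves might look like the following sketch (the paths and host are examples only):

   pt-slave-restart --daemonize --log /var/log/pt-slave-restart.log \
     --pid /var/run/pt-slave-restart.pid --recurse 1 \
     --max-sleep 120 h=slave1.example.com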
  • 202. Percona Toolkit Documentation, Release 2.1.1 -recursion-method type: string Preferred recursion method used to find slaves. Possible methods are: METHOD USES =========== ================ processlist SHOW PROCESSLIST hosts SHOW SLAVE HOSTS The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method is required if the server uses a non-standard port (not 3306). Usually pt-slave-restart does the right thing and finds the slaves, but you may give a preferred method and it will be used first. If it doesn’t find any slaves, the other methods will be tried. -run-time type: time Time to run before exiting. Causes pt-slave-restart to stop after the specified time has elapsed. Optional suffix: s=seconds, m=minutes, h=hours, d=days; if no suffix, s is used. -sentinel type: string; default: /tmp/pt-slave-restart-sentinel Exit if this file exists. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -skip-count type: int; default: 1 Number of statements to skip when restarting the slave. -sleep type: int; default: 1 Initial sleep seconds between checking the slave. See “SLEEP”. -socket short form: -S; type: string Socket file to use for connection. -stop Stop running instances by creating the sentinel file. Causes pt-slave-restart to create the sentinel file specified by --sentinel. This should have the effect of stopping all running instances which are watching the same sentinel file. If --monitor isn’t specified, pt- slave-restart will exit after creating the file. If it is specified, pt-slave-restart will wait the interval given by --max-sleep, then remove the file and continue working. You might find this handy to stop cron jobs gracefully if necessary, or to replace one running instance with another. For example, if you want to stop and restart pt-slave-restart every hour (just to make sure that it is restarted every hour, in case of a server crash or some other problem), you could use a crontab line like this: 198 Chapter 2. Tools
• 203. Percona Toolkit Documentation, Release 2.1.1 0 * * * * pt-slave-restart --monitor --stop --sentinel /tmp/pt-slave-restartup The non-default --sentinel will make sure the hourly cron job stops only instances previously started with the same options (that is, from the same cron job). See also --sentinel. -until-master type: string Run until this master log file and position. Start the slave, and retry if it fails, until it reaches the given replication coordinates. The coordinates are the logfile and position on the master, given by relay_master_log_file, exec_master_log_pos. The argument must be in the format “file,pos”. Separate the filename and position with a single comma and no space. This will also cause an UNTIL clause to be given to START SLAVE. After reaching this point, the slave should be stopped and pt-slave-restart will exit. -until-relay type: string Run until this relay log file and position. Like --until-master, but in the slave’s relay logs instead. The coordinates are given by relay_log_file, relay_log_pos. -user short form: -u; type: string User for login if not current user. -verbose short form: -v; cumulative: yes; default: 1 Be verbose; can specify multiple times. Verbosity 1 outputs connection information, a timestamp, relay_log_file, relay_log_pos, and last_errno. Verbosity 2 adds last_error. See also --error-length. Verbosity 3 prints the current sleep time each time pt-slave-restart sleeps. -version Show version and exit. 2.26.10 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case-sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F 2.26. pt-slave-restart 199
  • 204. Percona Toolkit Documentation, Release 2.1.1 dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.26.11 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-slave-restart ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.26.12 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.26.13 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-slave-restart. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) 200 Chapter 2. Tools
  • 205. Percona Toolkit Documentation, Release 2.1.1 If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.26.14 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.26.15 AUTHORS Baron Schwartz 2.26.16 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.26.17 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.26.18 VERSION pt-slave-restart 2.1.1 2.26. pt-slave-restart 201
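As a concrete instance of the DOWNLOADING commands above, fetching a single tool and running it in place looks like this (pt-slave-restart is used only as an example tool name):

   wget percona.com/get/pt-slave-restart
   chmod +x pt-slave-restart
   ./pt-slave-restart --help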
  • 206. Percona Toolkit Documentation, Release 2.1.1 2.27 pt-stalk 2.27.1 NAME pt-stalk - Gather forensic data about MySQL when a problem occurs. 2.27.2 SYNOPSIS Usage pt-stalk [OPTIONS] [-- MYSQL OPTIONS] pt-stalk watches for a trigger condition to become true, and then collects data to help in diagnosing problems. It is designed to run as a daemon with root privileges, so that you can diagnose intermittent problems that you cannot observe directly. You can also use it to execute a custom command, or to gather the data on demand without waiting for the trigger to happen. 2.27.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-stalk is a read-write tool; it collects data from the system and writes it into a series of files. It should be very low-risk. Some of the options can cause intrusive data collection to be performed, however, so if you enable any non-default options, you should read their documentation carefully. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-stalk. See also “BUGS” for more information on filing bugs and getting help. 2.27.4 DESCRIPTION Sometimes a problem happens infrequently and for a short time, giving you no chance to see the system when it happens. How do you solve intermittent MySQL problems when you can’t observe them? That’s why pt-stalk exists. In addition to using it when there’s a known problem on your servers, it is a good idea to run pt-stalk all the time, even when you think nothing is wrong. You will appreciate the data it gathers when a problem occurs, because problems such as MySQL lockups or spikes of activity typically leave no evidence to use in root cause analysis. This tool does two things: it watches a server (typically MySQL) for a trigger to occur, and it gathers diagnostic data. To use it effectively, you need to define a good trigger condition. A good trigger is sensitive enough to fire reliably when a problem occurs, so that you don’t miss a chance to solve problems. On the other hand, a good trigger isn’t prone to false positives, so you don’t gather information when the server is functioning normally. The most reliable triggers for MySQL tend to be the number of connections to the server, and the number of queries running concurrently. These are available in the SHOW GLOBAL STATUS command as Threads_connected and Threads_running. Sometimes Threads_connected is not a reliable indicator of trouble, but Threads_running usually is. Your job, as the tool’s user, is to define an appropriate trigger condition for the tool. Choose carefully, because the quality of your results will depend on the trigger you choose. 202 Chapter 2. Tools
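For example, a common starting point is to daemonize the tool and trigger on Threads_running, passing MySQL client options after the -- separator; the threshold and credentials below are illustrative examples, not recommendations:

   pt-stalk --daemonize --variable Threads_running --threshold 50 \
     -- --user=monitor --password=example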
  • 207. Percona Toolkit Documentation, Release 2.1.1 You can define the trigger with the --function, --variable, and --threshold options, among others. Please read the documentation for –function to learn how to do this. The pt-stalk tool, by default, simply watches MySQL repeatedly until the trigger becomes true. It then gathers diagnostics for a while, and sleeps afterwards for some time to prevent repeatedly gathering data if the condition remains true. In crude pseudocode, omitting some subtleties, while true; do if --variable from --function is greater than --threshold; then observations++ if observations is greater than --cycles; then capture diagnostics for --run-time seconds exit if --iterations is exceeded sleep for --sleep seconds done done clean up data that’s older than --retention-time sleep for --interval seconds done The diagnostic data is written to files whose names begin with a timestamp, so you can distinguish samples from each other in case the tool collects data multiple times. The pt-sift tool is designed to help you browse and analyze the resulting samples of data. Although this sounds simple enough, in practice there are a number of subtleties, such as detecting when the disk is beginning to fill up so that the tool doesn’t cause the server to run out of disk space. This tool handles these types of potential problems, so it’s a good idea to use this tool instead of writing something from scratch and possibly experiencing some of the hazards this tool is designed to prevent. 2.27.5 CONFIGURING You can use standard Percona Toolkit configuration files to set commandline options. You will probably want to run the tool as a daemon and customize at least the diagnostic threshold. Here’s a sample configuration file for triggering when there are more than 20 queries running at once: daemonize threshold=20 If you’re not running the tool as it’s designed (as a root user, daemonized) then you’ll need to set several options, such as --dest, to locations that are writable by non-root users. 2.27.6 OPTIONS -collect default: yes; negatable: yes Collect system information. You can negate this option to make the tool watch the system but not actually gather any diagnostic data. See also --stalk. -collect-gdb Collect GDB stacktraces. This is achieved by attaching to MySQL and printing stack traces from all threads. This will freeze the server for some period of time, ranging from a second or so to much longer on very busy systems with a lot of memory and many threads in the server. For this reason, it is disabled by default. However, if you are trying to diagnose a server stall or lockup, freezing the server causes no additional harm, and the stack traces can be vital for diagnosis. 2.27. pt-stalk 203
  • 208. Percona Toolkit Documentation, Release 2.1.1 In addition to freezing the server, there is also some risk of the server crashing or performing badly after GDB detaches from it. -collect-oprofile Collect oprofile data. This is achieved by starting an oprofile session, letting it run for the collection time, and then stopping and saving the resulting profile data in the system’s default location. Please read your system’s oprofile documentation to learn more about this. -collect-strace Collect strace data. This is achieved by attaching strace to the server, which will make it run very slowly until strace detaches. The same cautions apply as those listed in –collect-gdb. You should not enable this option together with –collect-gdb, because GDB and strace can’t attach to the server process simultaneously. -collect-tcpdump Collect tcpdump data. This option causes tcpdump to capture all traffic on all interfaces for the port on which MySQL is listening. You can later use pt-query-digest to decode the MySQL protocol and extract a log of query traffic from it. -config type: string Read this comma-separated list of config files. If specified, this must be the first option on the command line. -cycles type: int; default: 5 The number of times the trigger condition must be true before collecting data. This helps prevent false positives, and makes the trigger condition less likely to fire when the problem recovers quickly. -daemonize Daemonize the tool. This causes the tool to fork into the background and log its output as specified in –log. -dest type: string; default: /var/lib/pt-stalk Where to store the diagnostic data. Each time the tool collects data, it writes to a new set of files, which are named with the current system timestamp. -disk-bytes-free type: size; default: 100M Don’t collect data if the disk has less than this much free space. This prevents the tool from filling up the disk with diagnostic data. If the --dest directory contains a previously captured sample of data, the tool will measure its size and use that as an estimate of how much data is likely to be gathered this time, too. It will then be even more pessimistic, and will refuse to collect data unless the disk has enough free space to hold the sample and still have the desired amount of free space. For example, if you’d like 100MB of free space and the previous diagnostic sample consumed 100MB, the tool won’t collect any data unless the disk has 200MB free. Valid size value suffixes are k, M, G, and T. -disk-pct-free type: int; default: 5 Don’t collect data if the disk has less than this percent free space. This prevents the tool from filling up the disk with diagnostic data. This option works similarly to --disk-bytes-free but specifies a percentage margin of safety instead of a bytes margin of safety. The tool honors both options, and will not collect any data unless both margins are satisfied. 204 Chapter 2. Tools
  • 209. Percona Toolkit Documentation, Release 2.1.1 -function type: string; default: status Specifies what to watch for a diagnostic trigger. The default value watches SHOW GLOBAL STATUS, but you can also watch SHOW PROCESSLIST or supply a plugin file with your own custom code. This function supplies the value of --variable, which is then compared against --threshold to see if the trigger condition is met. Additional options may be required as well; see below. Possible values: •status This value specifies that the source of data for the diagnostic trigger is SHOW GLOBAL STATUS. The value of --variable then defines which status counter is the trigger. •processlist This value specifies that the data for the diagnostic trigger comes from SHOW FULL PROCESSLIST. The trigger value is the count of processes whose --variable column matches the --match option. For example, to trigger when more than 10 processes are in the “statistics” state, use the following options: --function processlist --variable State --match statistics --threshold 10 In addition, you can specify a file that contains your custom trigger function, written in Unix shell script. This can be a wrapper that executes anything you wish. If the argument to –function is a file, then it takes precedence over builtin functions, so if there is a file in the working directory named “status” or “processlist” then the tool will use that file as a plugin, even though those are otherwise recognized as reserved words for this option. The plugin file works by providing a function called trg_plugin, and the tool simply sources the file and executes the function. For example, the function might look like the following: trg_plugin() { mysql $EXT_ARGV -e "SHOW ENGINE INNODB STATUS" | grep -c "has waited at" } This snippet will count the number of mutex waits inside of InnoDB. It illustrates the general principle: the function must output a number, which is then compared to the threshold as usual. The $EXT_ARGV variable contains the MySQL options mentioned in the “SYNOPSIS” above. The plugin should not alter the tool’s existing global variables. Prefix any plugin-specific global variables with “PLUGIN_” or make them local. -help Print help and exit. -interval type: int; default: 1 Interval between checks for the diagnostic trigger. -iterations type: int Exit after collecting diagnostics this many times. By default, the tool will continue to watch the server forever, but this is useful for scenarios where you want to capture once and then exit, for example. -log type: string; default: /var/log/pt-stalk.log Print all output to this file when daemonized. 2.27. pt-stalk 205
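As another illustration of the plugin interface described above, a hypothetical plugin file (not shipped with the tool) could trigger on replication lag by emitting Seconds_Behind_Master as the number that is compared to --threshold:

   # Hypothetical plugin file, e.g. /etc/pt-stalk-lag-plugin
   trg_plugin() {
      local lag
      lag=$(mysql $EXT_ARGV -e "SHOW SLAVE STATUS\G" \
            | awk '/Seconds_Behind_Master:/ { print $2 }')
      # Treat "not a slave" or NULL lag as zero so the trigger does not fire
      if [ -z "$lag" ] || [ "$lag" = "NULL" ]; then
         echo 0
      else
         echo "$lag"
      fi
   }

You would then point --function at that file and set --threshold to the number of seconds of lag you consider a problem.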
• 210. Percona Toolkit Documentation, Release 2.1.1 -match type: string The pattern to use when watching SHOW PROCESSLIST. See the documentation for --function for details. -notify-by-email type: string Send mail to this list of addresses when data is collected. -pid type: string; default: /var/run/pt-stalk.pid Create a PID file when daemonized. -prefix type: string The filename prefix for diagnostic samples. By default, samples have a timestamp prefix based on the current local time, such as 2011_12_06_14_02_02, which is December 6, 2011 at 14:02:02. -retention-time type: int; default: 30 Number of days to retain collected samples. Any samples that are older will be purged. -run-time type: int; default: 30 How long the tool will collect data when it triggers. This should not be longer than --sleep. It is usually not necessary to change this; if the default 30 seconds hasn’t gathered enough diagnostic data, running longer is not likely to do so. In fact, in many cases a shorter collection period is appropriate. -sleep type: int; default: 300 How long to sleep after collecting data. This prevents the tool from triggering continuously, which might be a problem if the collection process is intrusive. It also prevents filling up the disk or gathering too much data to analyze reasonably. -stalk default: yes; negatable: yes Watch the server and wait for the trigger to occur. You can negate this option to make the tool immediately gather any diagnostic data once and exit. This is useful if a problem is already happening, but pt-stalk is not running, so you only want to collect diagnostic data. If this option is negated, --daemonize, --log, --pid, and other stalking-related options have no effect; the tool simply collects diagnostic data and exits. Safeguard options, like --disk-bytes-free and --disk-pct-free, are still respected. See also --collect. -threshold type: int; default: 25 The threshold at which the diagnostic trigger should fire. See --function for details. -variable type: string; default: Threads_running The variable to compare against the threshold. See --function for details. -version Print tool’s version and exit. 206 Chapter 2. Tools
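For example, to skip the stalking loop entirely and gather one 60-second sample right now (the destination directory is a placeholder; MySQL credentials can follow -- as shown earlier, or come from ~/.my.cnf):

   pt-stalk --no-stalk --run-time 60 --dest /tmp/pt-stalk-data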
  • 211. Percona Toolkit Documentation, Release 2.1.1 2.27.7 ENVIRONMENT This tool does not use any environment variables for configuration. 2.27.8 SYSTEM REQUIREMENTS This tool requires Bash v3 or newer. 2.27.9 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-stalk. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.27.10 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.27.11 AUTHORS Baron Schwartz, Justin Swanhart, Fernando Ipar, and Daniel Nichter 2.27.12 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.27. pt-stalk 207
• 212. Percona Toolkit Documentation, Release 2.1.1 2.27.13 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.27.14 VERSION pt-stalk 2.1.1 2.28 pt-summary 2.28.1 NAME pt-summary - Summarize system information nicely. 2.28.2 SYNOPSIS Usage pt-summary pt-summary conveniently summarizes the status and configuration of a server. It is not a tuning tool or diagnosis tool. It produces a report that is easy to diff and can be pasted into emails without losing the formatting. This tool works well on many types of Unix systems. Download and run: wget http://percona.com/get/pt-summary bash ./pt-summary 2.28.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-summary is a read-only tool. It should be very low-risk. At the time of this release, we know of no bugs that could harm users. 208 Chapter 2. Tools
  • 213. Percona Toolkit Documentation, Release 2.1.1 The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- summary. See also “BUGS” for more information on filing bugs and getting help. 2.28.4 DESCRIPTION pt-summary runs a large variety of commands to inspect system status and configuration, saves the output into files in a temporary directory, and then runs Unix commands on these results to format them nicely. It works best when executed as a privileged user, but will also work without privileges, although some output might not be possible to generate without root. 2.28.5 OUTPUT Many of the outputs from this tool are deliberately rounded to show their magnitude but not the exact detail. This is called fuzzy-rounding. The idea is that it doesn’t matter whether a particular counter is 918 or 921; such a small variation is insignificant, and only makes the output hard to compare to other servers. Fuzzy-rounding rounds in larger increments as the input grows. It begins by rounding to the nearest 5, then the nearest 10, nearest 25, and then repeats by a factor of 10 larger (50, 100, 250), and so on, as the input grows. The following is a simple report generated from a CentOS virtual machine, broken into sections with commentary following each section. Some long lines are reformatted for clarity when reading this documentation as a manual page in a terminal. # Percona Toolkit System Summary Report ###################### Date | 2012-03-30 00:58:07 UTC (local TZ: EDT -0400) Hostname | localhost.localdomain Uptime | 20:58:06 up 1 day, 20 min, 1 user, load average: 0.14, 0.18, 0.18 System | innotek GmbH; VirtualBox; v1.2 () Service Tag | 0 Platform | Linux Release | CentOS release 5.5 (Final) Kernel | 2.6.18-194.el5 Architecture | CPU = 32-bit, OS = 32-bit Threading | NPTL 2.5 Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-48). SELinux | Enforcing Virtualized | VirtualBox This section shows the current date and time, and a synopsis of the server and operating system. # Processor ################################################## Processors | physical = 1, cores = 0, virtual = 1, hyperthreading = no Speeds | 1x2510.626 Models | 1xIntel(R) Core(TM) i5-2400S CPU @ 2.50GHz Caches | 1x6144 KB This section is derived from /proc/cpuinfo. # Memory ##################################################### Total | 503.2M Free | 29.0M Used | physical = 474.2M, swap allocated = 1.0M, swap used = 16.0k, virtual = 474.3M 2.28. pt-summary 209
  • 214. Percona Toolkit Documentation, Release 2.1.1 Buffers | 33.9M Caches | 262.6M Dirty | 396 kB UsedRSS | 201.9M Swappiness | 60 DirtyPolicy | 40, 10 Locator Size Speed Form Factor Type Type Detail ======= ==== ===== =========== ==== =========== Information about memory is gathered from free. The Used statistic is the total of the rss sizes displayed by ps. The Dirty statistic for the cached value comes from /proc/meminfo. On Linux, the swappiness settings are gathered from sysctl. The final portion of this section is a table of the DIMMs, which comes from dmidecode. In this example there is no output. # Mounted Filesystems ######################################## Filesystem Size Used Type Opts Mountpoint /dev/mapper/VolGroup00-LogVol00 15G 17% ext3 rw / /dev/sda1 99M 13% ext3 rw /boot tmpfs 252M 0% tmpfs rw /dev/shm The mounted filesystem section is a combination of information from mount and df. This section is skipped if you disable --summarize-mounts. # Disk Schedulers And Queue Size ############################# dm-0 | UNREADABLE dm-1 | UNREADABLE hdc | [cfq] 128 md0 | UNREADABLE sda | [cfq] 128 The disk scheduler information is extracted from the /sys filesystem in Linux. # Disk Partioning ############################################ Device Type Start End Size ============ ==== ========== ========== ================== /dev/sda Disk 17179869184 /dev/sda1 Part 1 13 98703360 /dev/sda2 Part 14 2088 17059230720 Information about disk partitioning comes from fdisk -l. # Kernel Inode State ######################################### dentry-state | 10697 8559 45 0 0 0 file-nr | 960 0 50539 inode-nr | 14059 8139 These lines are from the files of the same name in the /proc/sys/fs directory on Linux. Read the proc man page to learn about the meaning of these files on your system. # LVM Volumes ################################################ LV VG Attr LSize Origin Snap% Move Log Copy% Convert LogVol00 VolGroup00 -wi-ao 269.00G LogVol01 VolGroup00 -wi-ao 9.75G This section shows the output of lvs. # RAID Controller ############################################ Controller | No RAID controller detected 210 Chapter 2. Tools
  • 215. Percona Toolkit Documentation, Release 2.1.1 The tool can detect a variety of RAID controllers by examining lspci and dmesg information. If the controller software is installed on the system, in many cases it is able to execute status commands and show a summary of the RAID controller’s status and configuration. If your system is not supported, please file a bug report. # Network Config ############################################# Controller | Intel Corporation 82540EM Gigabit Ethernet Controller FIN Timeout | 60 Port Range | 61000 The network controllers attached to the system are detected from lspci. The TCP/IP protocol configuration param- eters are extracted from sysctl. You can skip this section by disabling the --summarize-network option. # Interface Statistics ####################################### interface rx_bytes rx_packets rx_errors tx_bytes tx_packets tx_errors ========= ======== ========== ========= ======== ========== ========= lo 60000000 12500 0 60000000 12500 0 eth0 15000000 80000 0 1500000 10000 0 sit0 0 0 0 0 0 0 Interface statistics are gathered from ip -s link and are fuzzy-rounded. The columns are received and transmitted bytes, packets, and errors. You can skip this section by disabling the --summarize-network option. # Network Connections ######################################## Connections from remote IP addresses 127.0.0.1 2 Connections to local IP addresses 127.0.0.1 2 Connections to top 10 local ports 38346 1 60875 1 States of connections ESTABLISHED 5 LISTEN 8 This section shows a summary of network connections, retrieved from netstat and “fuzzy-rounded” to make them easier to compare when the numbers grow large. There are two sub-sections showing how many connections there are per origin and destination IP address, and a sub-section showing the count of ports in use. The section ends with the count of the network connections’ states. You can skip this section by disabling the --summarize-network option. # Top Processes ############################################## PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1 root 15 0 2072 628 540 S 0.0 0.1 0:02.55 init 2 root RT -5 0 0 0 S 0.0 0.0 0:00.00 migration/0 3 root 34 19 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/0 4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0 5 root 10 -5 0 0 0 S 0.0 0.0 0:00.97 events/0 6 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 khelper 7 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kthread 10 root 10 -5 0 0 0 S 0.0 0.0 0:00.13 kblockd/0 11 root 20 -5 0 0 0 S 0.0 0.0 0:00.00 kacpid # Notable Processes ########################################## PID OOM COMMAND 2028 +0 sshd This section shows the first few lines of top so that you can see what processes are actively using CPU time. The notable processes include the SSH daemon and any process whose out-of-memory-killer priority is set to 17. You can skip this section by disabling the --summarize-processes option. 2.28. pt-summary 211
  • 216. Percona Toolkit Documentation, Release 2.1.1 # Simplified and fuzzy rounded vmstat (wait please) ########## procs ---swap-- -----io---- ---system---- --------cpu-------- r b si so bi bo ir cs us sy il wa st 2 0 0 0 3 15 30 125 0 0 99 0 0 0 0 0 0 0 0 1250 800 6 10 84 0 0 0 0 0 0 0 0 1000 125 0 0 100 0 0 0 0 0 0 0 0 1000 125 0 0 100 0 0 0 0 0 0 0 450 1000 125 0 1 88 11 0 # The End #################################################### This section is a trimmed-down sample of vmstat 1 5, so you can see the general status of the system at present. The values in the table are fuzzy-rounded, except for the CPU columns. You can skip this section by disabling the --summarize-processes option. 2.28.6 OPTIONS -config type: string Read this comma-separated list of config files. If specified, this must be the first option on the command line. -help Print help and exit. -save-samples type: string Save the collected data in this directory. -read-samples type: string Create a report from the files in this directory. -summarize-mounts default: yes; negatable: yes Report on mounted filesystems and disk usage. -summarize-network default: yes; negatable: yes Report on network controllers and configuration. -summarize-processes default: yes; negatable: yes Report on top processes and vmstat output. -sleep type: int; default: 5 How long to sleep when gathering samples from vmstat. -version Print tool’s version and exit. 2.28.7 SYSTEM REQUIREMENTS This tool requires the Bourne shell (/bin/sh). 212 Chapter 2. Tools
  • 217. Percona Toolkit Documentation, Release 2.1.1 2.28.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-summary. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.28.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.28.10 AUTHORS Baron Schwartz and Kevin van Zonneveld (http://kevin.vanzonneveld.net) 2.28.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.28.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. 2.28. pt-summary 213
  • 218. Percona Toolkit Documentation, Release 2.1.1 This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.28.13 VERSION pt-summary 2.1.1 2.29 pt-table-checksum 2.29.1 NAME pt-table-checksum - Verify MySQL replication integrity. 2.29.2 SYNOPSIS Usage pt-table-checksum [OPTION...] [DSN] pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master. The optional DSN specifies the master host. The tool’s exit status is nonzero if any differences are found, or if any warnings or errors occur. The following command will connect to the replication master on localhost, checksum every table, and report the results on every detected replica: pt-table-checksum This tool is focused on finding data differences efficiently. If any data is different, you can resolve the problem with pt-table-sync. 2.29.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-table-checksum can add load to the MySQL server, although it has many safeguards to prevent this. It inserts a small amount of data into a table that contains checksum results. It has checks that, if disabled, can potentially cause replication to fail when unsafe replication options are used. In short, it is safe by default, but it permits you to turn off its safety checks. The tool presumes that schemas and tables are identical on the master and all replicas. Replication will break if, for example, a replica does not have a schema that exists on the master (and that schema is checksummed), or if the structure of a table on a replica is different than on the master. At the time of this release, we know of no bugs that could cause harm to users. 214 Chapter 2. Tools
  • 219. Percona Toolkit Documentation, Release 2.1.1 The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-table- checksum. See also “BUGS” for more information on filing bugs and getting help. 2.29.4 DESCRIPTION pt-table-checksum is designed to do the right thing by default in almost every case. When in doubt, use --explain to see how the tool will checksum a table. The following is a high-level overview of how the tool functions. In contrast to older versions of pt-table-checksum, this tool is focused on a single purpose, and does not have a lot of complexity or support many different checksumming techniques. It executes checksum queries on only one server, and these flow through replication to re-execute on replicas. If you need the older behavior, you can use Percona Toolkit version 1.0. pt-table-checksum connects to the server you specify, and finds databases and tables that match the filters you specify (if any). It works one table at a time, so it does not accumulate large amounts of memory or do a lot of work before beginning to checksum. This makes it usable on very large servers. We have used it on servers with hundreds of thousands of databases and tables, and trillions of rows. No matter how large the server is, pt-table-checksum works equally well. One reason it can work on very large tables is that it divides each table into chunks of rows, and checksums each chunk with a single REPLACE..SELECT query. It varies the chunk size to make the checksum queries run in the desired amount of time. The goal of chunking the tables, instead of doing each table with a single big query, is to ensure that checksums are unintrusive and don’t cause too much replication lag or load on the server. That’s why the target time for each chunk is 0.5 seconds by default. The tool keeps track of how quickly the server is able to execute the queries, and adjusts the chunks as it learns more about the server’s performance. It uses an exponentially decaying weighted average to keep the chunk size stable, yet remain responsive if the server’s performance changes during checksumming for any reason. This means that the tool will quickly throttle itself if your server becomes heavily loaded during a traffic spike or a background task, for example. Chunking is accomplished by a technique that we used to call “nibbling” in other tools in Percona Toolkit. It is the same technique used for pt-archiver, for example. The legacy chunking algorithms used in older versions of pt-table- checksum are removed, because they did not result in predictably sized chunks, and didn’t work well on many tables. All that is required to divide a table into chunks is an index of some sort (preferably a primary key or unique index). If there is no index, and the table contains a suitably small number of rows, the tool will checksum the table in a single chunk. pt-table-checksum has many other safeguards to ensure that it does not interfere with any server’s operation, including replicas. To accomplish this, pt-table-checksum detects replicas and connects to them automatically. (If this fails, you can give it a hint with the --recursion-method option.) The tool monitors replicas continually. If any replica falls too far behind in replication, pt-table-checksum pauses to allow it to catch up. 
If any replica has an error, or replication stops, pt-table-checksum pauses and waits. In addition, pt-table-checksum looks for common causes of problems, such as replication filters, and refuses to operate unless you force it to. Replication filters are dangerous, because the queries that pt-table-checksum executes could potentially conflict with them and cause replication to fail. pt-table-checksum verifies that chunks are not too large to checksum safely. It performs an EXPLAIN query on each chunk, and skips chunks that might be larger than the desired number of rows. You can configure the sensitivity of this safeguard with the --chunk-size-limit option. If a table will be checksummed in a single chunk because it has a small number of rows, then pt-table-checksum additionally verifies that the table isn’t oversized on replicas. This avoids the following scenario: a table is empty on the master but is very large on a replica, and is checksummed in a single large query, which causes a very long delay in replication. 2.29. pt-table-checksum 215
  • 220. Percona Toolkit Documentation, Release 2.1.1 There are several other safeguards. For example, pt-table-checksum sets its session-level innodb_lock_wait_timeout to 1 second, so that if there is a lock wait, it will be the victim instead of causing other queries to time out. Another safeguard checks the load on the database server, and pauses if the load is too high. There is no single right answer for how to do this, but by default pt-table-checksum will pause if there are more than 25 concurrently executing queries. You should probably set a sane value for your server with the --max-load option. Checksumming usually is a low-priority task that should yield to other work on the server. However, a tool that must be restarted constantly is difficult to use. Thus, pt-table-checksum is very resilient to errors. For example, if the database administrator needs to kill pt-table-checksum‘s queries for any reason, that is not a fatal error. Users often run pt-kill to kill any long-running checksum queries. The tool will retry a killed query once, and if it fails again, it will move on to the next chunk of that table. The same behavior applies if there is a lock wait timeout. The tool will print a warning if such an error happens, but only once per table. If the connection to any server fails, pt-table-checksum will attempt to reconnect and continue working. If pt-table-checksum encounters a condition that causes it to stop completely, it is easy to resume it with the --resume option. It will begin from the last chunk of the last table that it processed. You can also safely stop the tool with CTRL-C. It will finish the chunk it is currently processing, and then exit. You can resume it as usual afterwards. After pt-table-checksum finishes checksumming all of the chunks in a table, it pauses and waits for all detected replicas to finish executing the checksum queries. Once that is finished, it checks all of the replicas to see if they have the same data as the master, and then prints a line of output with the results. You can see a sample of its output later in this documentation. The tool prints progress indicators during time-consuming operations. It prints a progress indicator as each table is checksummed. The progress is computed by the estimated number of rows in the table. It will also print a progress report when it pauses to wait for replication to catch up, and when it is waiting to check replicas for differences from the master. You can make the output less verbose with the --quiet option. If you wish, you can query the checksum tables manually to get a report of which tables and chunks have differences from the master. The following query will report every database and table with differences, along with a summary of the number of chunks and rows possibly affected: SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl; The table referenced in that query is the checksum table, where the checksums are stored. Each row in the table contains the checksum of one chunk of data from some table in the server. Version 2.0 of pt-table-checksum is not backwards compatible with pt-table-sync version 1.0. In some cases this is not a serious problem. Adding a “boundaries” column to the table, and then updating it with a manually generated WHERE clause, may suffice to let pt-table-sync version 1.0 interoperate with pt-table-checksum version 2.0. 
Assuming an integer primary key named 'id', you can try something like the following:

ALTER TABLE checksums ADD boundaries VARCHAR(500);
UPDATE checksums SET boundaries =
  COALESCE(CONCAT('id BETWEEN ', lower_boundary, ' AND ', upper_boundary), '1=1');

2.29.5 OUTPUT

The tool prints tabular results, one line per table:
TS             ERRORS DIFFS ROWS CHUNKS SKIPPED TIME  TABLE
10-20T08:36:50      0     0  200      1       0 0.005 db1.tbl1
10-20T08:36:50      0     0  603      7       0 0.035 db1.tbl2
10-20T08:36:50      0     0   16      1       0 0.003 db2.tbl3
10-20T08:36:50      0     0  600      6       0 0.024 db2.tbl4

Errors, warnings, and progress reports are printed to standard error. See also --quiet.

Each table's results are printed when the tool finishes checksumming the table. The columns are as follows:

TS The timestamp (without the year) when the tool finished checksumming the table.
ERRORS The number of errors and warnings that occurred while checksumming the table. Errors and warnings are printed to standard error while the table is in progress.
DIFFS The number of chunks that differ from the master on one or more replicas. If --no-replicate-check is specified, this column will always have zeros. If --replicate-check-only is specified, then only tables with differences are printed.
ROWS The number of rows selected and checksummed from the table. It might be different from the number of rows in the table if you use the --where option.
CHUNKS The number of chunks into which the table was divided.
SKIPPED The number of chunks that were skipped due to errors or warnings, or because they were oversized.
TIME The time elapsed while checksumming the table.
TABLE The database and table that was checksummed.

If --replicate-check-only is specified, only checksum differences on detected replicas are printed. The output is different: one paragraph per replica, one checksum difference per line, and values are separated by spaces:

Differences on h=127.0.0.1,P=12346
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
db1.tbl1 1 0 1 PRIMARY 1 100
db1.tbl1 6 0 1 PRIMARY 501 600

Differences on h=127.0.0.1,P=12347
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
db1.tbl1 1 0 1 PRIMARY 1 100
db2.tbl2 9 5 0 PRIMARY 101 200

The first line of a paragraph indicates the replica with differences. In this example there are two: h=127.0.0.1,P=12346 and h=127.0.0.1,P=12347. The columns are as follows:

TABLE The database and table that differs from the master.
CHUNK The chunk number of the table that differs from the master.
CNT_DIFF The number of chunk rows on the replica minus the number of chunk rows on the master.
CRC_DIFF 1 if the CRC of the chunk on the replica is different than the CRC of the chunk on the master, else 0.
CHUNK_INDEX The index used to chunk the table.
LOWER_BOUNDARY The index values that define the lower boundary of the chunk.
UPPER_BOUNDARY The index values that define the upper boundary of the chunk.

2.29.6 EXIT STATUS

A non-zero exit status indicates errors, warnings, or checksum differences.
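Because the exit status alone signals trouble, pt-table-checksum is easy to wire into a cron job or monitoring check. The following is a minimal sketch, not taken from the manual itself; the host name, output path, and mail address are placeholder assumptions.

#!/bin/sh
# Sketch: run a quiet checksum pass and alert on any non-zero exit
# (differences found, or warnings/errors occurred).
# master.example.com, /tmp/ptc.out and ops@example.com are placeholders.
pt-table-checksum --quiet h=master.example.com > /tmp/ptc.out 2>&1
if [ $? -ne 0 ]; then
    mail -s "pt-table-checksum reported problems" ops@example.com < /tmp/ptc.out
fi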
  • 222. Percona Toolkit Documentation, Release 2.1.1 2.29.7 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass group: Connection Prompt for a password when connecting to MySQL. -check-interval type: time; default: 1; group: Throttle Sleep time between checks for --max-lag. -[no]check-replication-filters default: yes; group: Safety Do not checksum if any replication filters are set on any replicas. The tool looks for server options that filter replication, such as binlog_ignore_db and replicate_do_db. If it finds any such filters, it aborts with an error. If the replicas are configured with any filtering options, you should be careful not to checksum any databases or tables that exist on the master and not the replicas. Changes to such tables might normally be skipped on the replicas because of the filtering options, but the checksum queries modify the contents of the table that stores the checksums, not the tables whose data you are checksumming. Therefore, these queries will be executed on the replica, and if the table or database you’re checksumming does not exist, the queries will cause replication to fail. For more information on replication rules, see http://dev.mysql.com/doc/en/replication-rules.html. Replication filtering makes it impossible to be sure that the checksum queries won’t break replication (or simply fail to replicate). If you are sure that it’s OK to run the checksum queries, you can negate this option to disable the checks. See also --replicate-database. -check-slave-lag type: string; group: Throttle Pause checksumming until this replica’s lag is less than --max-lag. The value is a DSN that inherits proper- ties from the master host and the connection options (--port, --user, etc.). This option overrides the normal behavior of finding and continually monitoring replication lag on ALL connected replicas. If you don’t want to monitor ALL replicas, but you want more than just one replica to be monitored, then use the DSN option to the --recursion-method option instead of this option. -chunk-index type: string Prefer this index for chunking tables. By default, pt-table-checksum chooses the most appropriate index for chunking. This option lets you specify the index that you prefer. If the index doesn’t exist, then pt-table- checksum will fall back to its default behavior of choosing an index. pt-table-checksum adds the index to the checksum SQL statements in a FORCE INDEX clause. Be careful when using this option; a poor choice of index could cause bad performance. This is probably best to use when you are checksumming only a single table, not an entire server. -chunk-size type: size; default: 1000 Number of rows to select for each checksum query. Allowable suffixes are k, M, G. This option can override the default behavior, which is to adjust chunk size dynamically to try to make chunks run in exactly --chunk-time seconds. When this option isn’t set explicitly, its default value is used as a starting point, but after that, the tool ignores this option’s value. If you set this option explicitly, however, then it disables the dynamic adjustment behavior and tries to make all chunks exactly the specified number of rows. There is a subtlety: if the chunk index is not unique, then it’s possible that chunks will be larger than desired. For example, if a table is chunked by an index that contains 10,000 of a given value, there is no way to write a 218 Chapter 2. Tools
  • 223. Percona Toolkit Documentation, Release 2.1.1 WHERE clause that matches only 1,000 of the values, and that chunk will be at least 10,000 rows large. Such a chunk will probably be skipped because of --chunk-size-limit. -chunk-size-limit type: float; default: 2.0; group: Safety Do not checksum chunks this much larger than the desired chunk size. When a table has no unique indexes, chunk sizes can be inaccurate. This option specifies a maximum tolerable limit to the inaccuracy. The tool uses <EXPLAIN> to estimate how many rows are in the chunk. If that estimate exceeds the desired chunk size times the limit (twice as large, by default), then the tool skips the chunk. The minimum value for this option is 1, which means that no chunk can be larger than --chunk-size. You probably don’t want to specify 1, because rows reported by EXPLAIN are estimates, which can be different from the real number of rows in the chunk. If the tool skips too many chunks because they are oversized, you might want to specify a value larger than the default of 2. You can disable oversized chunk checking by specifying a value of 0. -chunk-time type: float; default: 0.5 Adjust the chunk size dynamically so each checksum query takes this long to execute. The tool tracks the checksum rate (rows per second) for all tables and each table individually. It uses these rates to adjust the chunk size after each checksum query, so that the next checksum query takes this amount of time (in seconds) to execute. The algorithm is as follows: at the beginning of each table, the chunk size is initialized from the overall average rows per second since the tool began working, or the value of --chunk-size if the tool hasn’t started working yet. For each subsequent chunk of a table, the tool adjusts the chunk size to try to make queries run in the desired amount of time. It keeps an exponentially decaying moving average of queries per second, so that if the server’s performance changes due to changes in server load, the tool adapts quickly. This allows the tool to achieve predictably timed queries for each table, and for the server overall. If this option is set to zero, the chunk size doesn’t auto-adjust, so query checksum times will vary, but query checksum sizes will not. Another way to do the same thing is to specify a value for --chunk-size explicitly, instead of leaving it at the default. -columns short form: -c; type: array; group: Filter Checksum only this comma-separated list of columns. -config type: Array; group: Config Read this comma-separated list of config files; if specified, this must be the first option on the command line. -[no]create-replicate-table default: yes Create the --replicate database and table if they do not exist. The structure of the replicate table is the same as the suggested table mentioned in --replicate. -databases short form: -d; type: hash; group: Filter Only checksum this comma-separated list of databases. -databases-regex type: string; group: Filter Only checksum databases whose names match this Perl regex. 2.29. pt-table-checksum 219
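As a hedged illustration of combining the chunking and filter options above (this example is not from the original manual; the host, database names, and regex are assumptions):

# Checksum only the sales and billing databases, skip archival tables,
# and aim for one-second checksum queries instead of the 0.5s default.
pt-table-checksum --databases sales,billing \
  --ignore-tables-regex '_archive$' \
  --chunk-time 1.0 \
  h=master.example.com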
  • 224. Percona Toolkit Documentation, Release 2.1.1 -defaults-file short form: -F; type: string; group: Connection Only read mysql options from the given file. You must give an absolute pathname. -[no]empty-replicate-table default: yes Delete previous checksums for each table before checksumming the table. This option does not truncate the entire table, it only deletes rows (checksums) for each table just before checksumming the table. Therefore, if checksumming stops prematurely and there was preexisting data, there will still be rows for tables that were not checksummed before the tool was stopped. If you’re resuming from a previous checksum run, then the checksum records for the table from which the tool resumes won’t be emptied. -engines short form: -e; type: hash; group: Filter Only checksum tables which use these storage engines. -explain cumulative: yes; default: 0; group: Output Show, but do not execute, checksum queries (disables --[no]empty-replicate-table). If specified twice, the tool actually iterates through the chunking algorithm, printing the upper and lower boundary values for each chunk, but not executing the checksum queries. -float-precision type: int Precision for FLOAT and DOUBLE number-to-string conversion. Causes FLOAT and DOUBLE values to be rounded to the specified number of digits after the decimal point, with the ROUND() function in MySQL. This can help avoid checksum mismatches due to different floating-point representations of the same values on different MySQL versions and hardware. The default is no rounding; the values are converted to strings by the CONCAT() function, and MySQL chooses the string representation. If you specify a value of 2, for example, then the values 1.008 and 1.009 will be rounded to 1.01, and will checksum as equal. -function type: string Hash function for checksums (FNV1A_64, MURMUR_HASH, SHA1, MD5, CRC32, etc). The default is to use CRC32(), but MD5() and SHA1() also work, and you can use your own function, such as a compiled UDF, if you wish. The function you specify is run in SQL, not in Perl, so it must be available to MySQL. MySQL doesn’t have good built-in hash functions that are fast. CRC32() is too prone to hash collisions, and MD5() and SHA1() are very CPU-intensive. The FNV1A_64() UDF that is distributed with Percona Server is a faster alternative. It is very simple to compile and install; look at the header in the source code for instructions. If it is installed, it is preferred over MD5(). You can also use the MURMUR_HASH() function if you compile and install that as a UDF; the source is also distributed with Percona Server, and it might be better than FNV1A_64(). -help group: Help Show help and exit. -host short form: -h; type: string; default: localhost; group: Connection Host to connect to. 220 Chapter 2. Tools
  • 225. Percona Toolkit Documentation, Release 2.1.1 -ignore-columns type: Hash; group: Filter Ignore this comma-separated list of columns when calculating the checksum. -ignore-databases type: Hash; group: Filter Ignore this comma-separated list of databases. -ignore-databases-regex type: string; group: Filter Ignore databases whose names match this Perl regex. -ignore-engines type: Hash; default: FEDERATED,MRG_MyISAM; group: Filter Ignore this comma-separated list of storage engines. -ignore-tables type: Hash; group: Filter Ignore this comma-separated list of tables. Table names may be qualified with the database name. The --replicate table is always automatically ignored. -ignore-tables-regex type: string; group: Filter Ignore tables whose names match the Perl regex. -lock-wait-timeout type: int; default: 1 Set the session value of innodb_lock_wait_timeout on the master host. This option helps guard against long lock waits if the checksum queries become slow for some reason. Setting this option dynamically requires the InnoDB plugin, so this works only on newer InnoDB and MySQL versions. If setting the value fails and the current server value is greater than the specified value, then a warning is printed; else, if the current server value is less than or equal to the specified value, no warning is printed. -max-lag type: time; default: 1s; group: Throttle Pause checksumming until all replicas’ lag is less than this value. After each checksum query (each chunk), pt- table-checksum looks at the replication lag of all replicas to which it connects, using Seconds_Behind_Master. If any replica is lagging more than the value of this option, then pt-table-checksum will sleep for --check-interval seconds, then check all replicas again. If you specify --check-slave-lag, then the tool only examines that server for lag, not all servers. If you want to control exactly which servers the tool monitors, use the DSN value to --recursion-method. The tool waits forever for replicas to stop lagging. If any replica is stopped, the tool waits forever until the replica is started. Checksumming continues once all replicas are running and not lagging too much. The tool prints progress reports while waiting. If a replica is stopped, it prints a progress report immediately, then again at every progress report interval. -max-load type: Array; default: Threads_running=25; group: Throttle Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than the threshold. The option accepts a comma-separated list of MySQL status variables to check for a threshold. An optional =MAX_VALUE (or :MAX_VALUE) can follow each variable. If not given, the tool determines a threshold by examining the current value and increasing it by 20%. 2.29. pt-table-checksum 221
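For example, a sketch of a more conservative --max-load setting might look like the following; the thresholds and host are illustrative assumptions, not recommendations:

# Pause when Threads_running exceeds an explicit threshold, and let the
# tool derive a threshold for Threads_connected (current value + 20%).
pt-table-checksum --max-load "Threads_running=50,Threads_connected" \
  --check-interval 5 \
  h=master.example.com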
  • 226. Percona Toolkit Documentation, Release 2.1.1 For example, if you want the tool to pause when Threads_connected gets too high, you can specify “Threads_connected”, and the tool will check the current value when it starts working and add 20% to that value. If the current value is 100, then the tool will pause when Threads_connected exceeds 120, and resume working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use either “Threads_connected:110” or “Threads_connected=110”. The purpose of this option is to prevent the tool from adding too much load to the server. If the checksum queries are intrusive, or if they cause lock waits, then other queries on the server will tend to block and queue. This will typically cause Threads_running to increase, and the tool can detect that by running SHOW GLOBAL STATUS immediately after each checksum query finishes. If you specify a threshold for this variable, then you can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however; it will only give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time. -password short form: -p; type: string; group: Connection Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -port short form: -P; type: int; group: Connection Port number to use for connection. -progress type: array; default: time,30 Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. The tool prints progress reports for a variety of time-consuming operations, including waiting for replicas to catch up if they become lagged. -quiet short form: -q; cumulative: yes; default: 0 Print only the most important information (disables --progress). Specifying this option once causes the tool to print only errors, warnings, and tables that have checksum differences. Specifying this option twice causes the tool to print only errors. In this case, you can use the tool’s exit status to determine if there were any warnings or checksum differences. -recurse type: int Number of levels to recurse in the hierarchy when discovering replicas. Default is infinite. See also --recursion-method. -recursion-method type: string Preferred recursion method for discovering replicas. Possible methods are: 222 Chapter 2. Tools
METHOD      USES
=========== ==================
processlist SHOW PROCESSLIST
hosts       SHOW SLAVE HOSTS
dsn=DSN     DSNs from a table

The processlist method is the default, because SHOW SLAVE HOSTS is not reliable. However, the hosts method can work better if the server uses a non-standard port (not 3306). The tool usually does the right thing and finds all replicas, but you may give a preferred method and it will be used first.

The hosts method requires replicas to be configured with report_host, report_port, etc.

The dsn method is special: it specifies a table from which other DSN strings are read. The specified DSN must specify a D and t, or a database-qualified t. The DSN table should have the following structure:

CREATE TABLE `dsns` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `parent_id` int(11) DEFAULT NULL,
  `dsn` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
);

To make the tool monitor only the hosts 10.10.1.16 and 10.10.1.17 for replication lag and checksum differences, insert the values h=10.10.1.16 and h=10.10.1.17 into the table. Currently, the DSNs are ordered by id, but id and parent_id are otherwise ignored.

-replicate type: string; default: percona.checksums
Write checksum results to this table. The replicate table must have this structure (MAGIC_create_replicate):

CREATE TABLE checksums (
  db             char(64)     NOT NULL,
  tbl            char(64)     NOT NULL,
  chunk          int          NOT NULL,
  chunk_time     float            NULL,
  chunk_index    varchar(200)     NULL,
  lower_boundary text             NULL,
  upper_boundary text             NULL,
  this_crc       char(40)     NOT NULL,
  this_cnt       int          NOT NULL,
  master_crc     char(40)         NULL,
  master_cnt     int              NULL,
  ts             timestamp    NOT NULL,
  PRIMARY KEY (db, tbl, chunk),
  INDEX ts_db_tbl (ts, db, tbl)
) ENGINE=InnoDB;

By default, --[no]create-replicate-table is true, so the database and the table specified by this option are created automatically if they do not exist.

Be sure to choose an appropriate storage engine for the replicate table. If you are checksumming InnoDB tables, and you use MyISAM for this table, a deadlock will break replication, because the mixture of transactional and non-transactional tables in the checksum statements will cause it to be written to the binlog even though it had an error. It will then replay without a deadlock on the replicas, and break replication with "different error on master and slave." This is not a problem with pt-table-checksum; it's a problem with MySQL replication, and you can read more about it in the MySQL manual.

The replicate table is never checksummed (the tool automatically adds this table to --ignore-tables).
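Tying the dsn recursion method and the replicate table together, the following sketch registers two replicas and then runs a checksum that reads replica DSNs from that table. It assumes the DSN table was created in the percona database; the table location, host names, and IP addresses are examples, not defaults.

# Register the two replicas to monitor, then tell pt-table-checksum to
# read replica DSNs from that table instead of auto-discovering them.
mysql -e "INSERT INTO percona.dsns (dsn) VALUES ('h=10.10.1.16'), ('h=10.10.1.17')"
pt-table-checksum --recursion-method dsn=D=percona,t=dsns h=master.example.com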
-[no]replicate-check default: yes
Check replicas for data differences after finishing each table. The tool finds differences by executing a simple SELECT statement on all detected replicas. The query compares the replica's checksum results to the master's checksum results. It reports differences in the DIFFS column of the output.

-replicate-check-only
Check replicas for consistency without executing checksum queries. This option is used only with --[no]replicate-check. If specified, pt-table-checksum doesn't checksum any tables. It checks replicas for differences found by previous checksumming, and then exits. It might be useful if you run pt-table-checksum quietly in a cron job, for example, and later want a report on the results of the cron job, perhaps to implement a Nagios check.

-replicate-database type: string
USE only this database. By default, pt-table-checksum executes USE to select the database that contains the table it's currently working on. This is a best effort to avoid problems with replication filters such as binlog_ignore_db and replicate_ignore_db. However, replication filters can create a situation where there simply is no one right way to do things. Some statements might not be replicated, and others might cause replication to fail. In such cases, you can use this option to specify a default database that pt-table-checksum selects with USE, and never changes. See also --[no]check-replication-filters.

-resume
Resume checksumming from the last completed chunk (disables --[no]empty-replicate-table). If the tool stops before it checksums all tables, this option makes checksumming resume from the last chunk of the last table that it finished.

-retries type: int; default: 2
Retry a chunk this many times when there is a nonfatal error. Nonfatal errors are problems such as a lock wait timeout or the query being killed.

-separator type: string; default: #
The separator character used for CONCAT_WS(). This character is used to join the values of columns when checksumming.

-set-vars type: string; default: wait_timeout=10000; group: Connection
Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed.

-socket short form: -S; type: string; group: Connection
Socket file to use for connection.

-tables short form: -t; type: hash; group: Filter
Checksum only this comma-separated list of tables. Table names may be qualified with the database name.

-tables-regex type: string; group: Filter
Checksum only tables whose names match this Perl regex.
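Returning to --replicate-check-only above: one common pattern, sketched here with a placeholder host, is to checksum quietly from cron and then pull a report of recorded differences separately (for example from a Nagios check):

# Nightly cron job: only errors, warnings, and differing tables are printed.
pt-table-checksum --quiet h=master.example.com

# Later, report differences found by the previous run without checksumming again.
pt-table-checksum --replicate-check-only h=master.example.com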
-trim
Add TRIM() to VARCHAR columns (helps when comparing 4.1 to >= 5.0). This is useful when you don't care about the trailing space differences between MySQL versions that vary in their handling of trailing spaces. MySQL 5.0 and later all retain trailing spaces in VARCHAR, while previous versions would remove them. These differences will cause false checksum differences.

-user short form: -u; type: string; group: Connection
User for login if not current user.

-version group: Help
Show version and exit.

-where type: string
Do only rows matching this WHERE clause. You can use this option to limit the checksum to only part of the table. This is particularly useful if you have append-only tables and don't want to constantly re-check all rows; you could run a daily job to just check yesterday's rows, for instance. This option is much like the -w option to mysqldump. Do not specify the WHERE keyword. You might need to quote the value. Here is an example:

pt-table-checksum --where "ts > CURRENT_DATE - INTERVAL 1 DAY"

2.29.8 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details.

• A
dsn: charset; copy: yes
Default character set.

• D
copy: no
DSN table database.

• F
dsn: mysql_read_default_file; copy: no
Only read default options from the given file.

• h
dsn: host; copy: yes
Connect to host.

• p
dsn: password; copy: yes
Password to use when connecting.
• P
dsn: port; copy: yes
Port number to use for connection.

• S
dsn: mysql_socket; copy: no
Socket file to use for connection.

• t
copy: no
DSN table table.

• u
dsn: user; copy: yes
User for login if not current user.

2.29.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like:

PTDEBUG=1 pt-table-checksum ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.

2.29.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl.

2.29.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-table-checksum.

Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:

• Complete command-line used to run the tool
• Tool --version
• MySQL version of all servers involved
• Output from the tool including STDERR
• Input files (log/dump/config files, etc.)

If possible, include debugging output by running the tool with PTDEBUG; see "ENVIRONMENT".
2.29.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line:

wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:

wget percona.com/get/TOOL

Replace TOOL with the name of any tool.

2.29.13 AUTHORS

Baron Schwartz and Daniel Nichter

2.29.14 ACKNOWLEDGMENTS

Claus Jeppesen, Francois Saint-Jacques, Giuseppe Maxia, Heikki Tuuri, James Briggs, Martin Friebe, and Sergey Zhuravlev

2.29.15 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.

2.29.16 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue 'man perlgpl' or 'man perlartistic' to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

2.29.17 VERSION

pt-table-checksum 2.1.1
2.30 pt-table-sync

2.30.1 NAME

pt-table-sync - Synchronize MySQL table data efficiently.

2.30.2 SYNOPSIS

Usage

pt-table-sync [OPTION...] DSN [DSN...]

pt-table-sync synchronizes data efficiently between MySQL tables. This tool changes data, so for maximum safety, you should back up your data before you use it. When synchronizing a server that is a replication slave with the --replicate or --sync-to-master methods, it always makes the changes on the replication master, never the replication slave directly. This is in general the only safe way to bring a replica back in sync with its master; changes to the replica are usually the source of the problems in the first place. However, the changes it makes on the master should be no-op changes that set the data to their current values, and actually affect only the replica. Please read the detailed documentation that follows to learn more about this.

Sync db.tbl on host1 to host2:

pt-table-sync --execute h=host1,D=db,t=tbl h=host2

Sync all tables on host1 to host2 and host3:

pt-table-sync --execute host1 host2 host3

Make slave1 have the same data as its replication master:

pt-table-sync --execute --sync-to-master slave1

Resolve differences that pt-table-checksum found on all slaves of master1:

pt-table-sync --execute --replicate test.checksum master1

Same as above but only resolve differences on slave1:

pt-table-sync --execute --replicate test.checksum --sync-to-master slave1

Sync master2 in a master-master replication configuration, where master2's copy of db.tbl is known or suspected to be incorrect:

pt-table-sync --execute --sync-to-master h=master2,D=db,t=tbl

Note that in the master-master configuration, the following will NOT do what you want, because it will make changes directly on master2, which will then flow through replication and change master1's data:

# Don't do this in a master-master setup!
pt-table-sync --execute h=master1,D=db,t=tbl master2
2.30.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

With great power comes great responsibility! This tool changes data, so it is a good idea to back up your data. It is also very powerful, which means it is very complex, so you should run it with the --dry-run option to see what it will do, until you're familiar with its operation. If you want to see which rows are different, without changing any data, use --print instead of --execute.

Be careful when using pt-table-sync in any master-master setup. Master-master replication is inherently tricky, and it's easy to make mistakes. You need to be sure you're using the tool correctly for master-master replication. See the "SYNOPSIS" for the overview of the correct usage.

Also be careful with tables that have foreign key constraints with ON DELETE or ON UPDATE definitions because these might cause unintended changes on the child tables.

In general, this tool is best suited when your tables have a primary key or unique index. Although it can synchronize data in tables lacking a primary key or unique index, it might be best to synchronize that data by another means.

At the time of this release, there is a potential bug using --lock-and-rename with MySQL 5.1, a bug detecting certain differences, a bug using ROUND() across different platforms, and a bug mixing collations.

The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-table-sync.

See also "BUGS" for more information on filing bugs and getting help.

2.30.4 DESCRIPTION

pt-table-sync does one-way and bidirectional synchronization of table data. It does not synchronize table structures, indexes, or any other schema objects. The following describes one-way synchronization. "BIDIRECTIONAL SYNCING" is described later.

This tool is complex and functions in several different ways. To use it safely and effectively, you should understand three things: the purpose of --replicate, finding differences, and specifying hosts. These three concepts are closely related and determine how the tool will run. The following is the abbreviated logic:

if DSN has a t part, sync only that table:
   if 1 DSN:
      if --sync-to-master:
         The DSN is a slave. Connect to its master and sync.
   if more than 1 DSN:
      The first DSN is the source. Sync each DSN in turn.
else if --replicate:
   if --sync-to-master:
      The DSN is a slave. Connect to its master, find records of
      differences, and fix.
   else:
      The DSN is the master. Find slaves and connect to each,
      find records of differences, and fix.
else:
   if only 1 DSN and --sync-to-master:
      The DSN is a slave. Connect to its master, find tables and
      filter with --databases etc, and sync each table to the master.
   else:
      find tables, filtering with --databases etc, and sync each DSN
      to the first.

pt-table-sync can run in one of two ways: with --replicate or without. The default is to run without --replicate which causes pt-table-sync to automatically find differences efficiently with one of several algorithms (see "ALGORITHMS"). Alternatively, the value of --replicate, if specified, causes pt-table-sync to use the differences already found by having previously run pt-table-checksum with its own --replicate option. Strictly speaking, you don't need to use --replicate because pt-table-sync can find differences, but many people use --replicate if, for example, they checksum regularly using pt-table-checksum then fix differences as needed with pt-table-sync. If you're unsure, read each tool's documentation carefully and decide for yourself, or consult with an expert.

Regardless of whether --replicate is used or not, you need to specify which hosts to sync. There are two ways: with --sync-to-master or without.

Specifying --sync-to-master makes pt-table-sync expect one and only one slave DSN on the command line. The tool will automatically discover the slave's master and sync it so that its data is the same as its master. This is accomplished by making changes on the master which then flow through replication and update the slave to resolve its differences. Be careful though: although this option specifies and syncs a single slave, if there are other slaves on the same master, they will receive via replication the changes intended for the slave that you're trying to sync.

Alternatively, if you do not specify --sync-to-master, the first DSN given on the command line is the source host. There is only ever one source host. If you do not also specify --replicate, then you must specify at least one other DSN as the destination host. There can be one or more destination hosts. Source and destination hosts must be independent; they cannot be in the same replication topology. pt-table-sync will die with an error if it detects that a destination host is a slave because changes are written directly to destination hosts (and it's not safe to write directly to slaves).

Or, if you specify --replicate (but not --sync-to-master) then pt-table-sync expects one and only one master DSN on the command line. The tool will automatically discover all the master's slaves and sync them to the master. This is the only way to sync several (all) slaves at once (because --sync-to-master only specifies one slave).

Each host on the command line is specified as a DSN. The first DSN (or only DSN for cases like --sync-to-master) provides default values for other DSNs, whether those other DSNs are specified on the command line or auto-discovered by the tool. So in this example,

pt-table-sync --execute h=host1,u=msandbox,p=msandbox h=host2

the host2 DSN inherits the u and p DSN parts from the host1 DSN. Use the --explain-hosts option to see how pt-table-sync will interpret the DSNs given on the command line.

2.30.5 OUTPUT

If you specify the --verbose option, you'll see information about the differences between the tables. There is one row per table. Each server is printed separately. For example,

# Syncing h=host1,D=test,t=test1
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      3      0 Chunk     13:00:00 13:00:17 2    test.test1

Table test.test1 on host1 required 3 INSERT statements to synchronize and it used the Chunk algorithm (see "ALGORITHMS").
The sync operation for this table started at 13:00:00 and ended 17 seconds later (times taken from NOW() on the source host). Because differences were found, its “EXIT STATUS” was 2. If you specify the --print option, you’ll see the actual SQL statements that the script uses to synchronize the table if --execute is also specified. If you want to see the SQL statements that pt-table-sync is using to select chunks, nibbles, rows, etc., then specify --print once and --verbose twice. Be careful though: this can print a lot of SQL statements. 230 Chapter 2. Tools
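A cautious workflow, sketched below with placeholder host and table names, is to preview the statements before letting the tool change anything:

# Preview the SQL that would be executed, without changing any data.
pt-table-sync --print h=host1,D=db,t=tbl h=host2

# Once the output looks right, apply the changes and print per-table statistics.
pt-table-sync --execute --verbose h=host1,D=db,t=tbl h=host2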
  • 235. Percona Toolkit Documentation, Release 2.1.1 There are cases where no combination of INSERT, UPDATE or DELETE statements can resolve differences without violating some unique key. For example, suppose there’s a primary key on column a and a unique key on column b. Then there is no way to sync these two tables with straightforward UPDATE statements: +---+---+ +---+---+ | a | b | | a | b | +---+---+ +---+---+ | 1 | 2 | | 1 | 1 | | 2 | 1 | | 2 | 2 | +---+---+ +---+---+ The tool rewrites queries to DELETE and REPLACE in this case. This is automatically handled after the first index violation, so you don’t have to worry about it. 2.30.6 REPLICATION SAFETY Synchronizing a replication master and slave safely is a non-trivial problem, in general. There are all sorts of issues to think about, such as other processes changing data, trying to change data on the slave, whether the destination and source are a master-master pair, and much more. In general, the safe way to do it is to change the data on the master, and let the changes flow through replication to the slave like any other changes. However, this works only if it’s possible to REPLACE into the table on the master. REPLACE works only if there’s a unique index on the table (otherwise it just acts like an ordinary INSERT). If your table has unique keys, you should use the --sync-to-master and/or --replicate options to sync a slave to its master. This will generally do the right thing. When there is no unique key on the table, there is no choice but to change the data on the slave, and pt-table-sync will detect that you’re trying to do so. It will complain and die unless you specify --no-check-slave (see --[no]check-slave). If you’re syncing a table without a primary or unique key on a master-master pair, you must change the data on the destination server. Therefore, you need to specify --no-bin-log for safety (see --[no]bin-log). If you don’t, the changes you make on the destination server will replicate back to the source server and change the data there! The generally safe thing to do on a master-master pair is to use the --sync-to-master option so you don’t change the data on the destination server. You will also need to specify --no-check-slave to keep pt-table-sync from complaining that it is changing data on a slave. 2.30.7 ALGORITHMS pt-table-sync has a generic data-syncing framework which uses different algorithms to find differences. The tool automatically chooses the best algorithm for each table based on indexes, column types, and the algorithm preferences specified by --algorithms. The following algorithms are available, listed in their default order of preference: Chunk Finds an index whose first column is numeric (including date and time types), and divides the column’s range of values into chunks of approximately --chunk-size rows. Syncs a chunk at a time by check- summing the entire chunk. If the chunk differs on the source and destination, checksums each chunk’s rows individually to find the rows that differ. It is efficient when the column has sufficient cardinality to make the chunks end up about the right size. The initial per-chunk checksum is quite small and results in minimal network traffic and memory con- sumption. If a chunk’s rows must be examined, only the primary key columns and a checksum are sent over the network, not the entire row. If a row is found to be different, the entire row will be fetched, but not before. Nibble 2.30. pt-table-sync 231
  • 236. Percona Toolkit Documentation, Release 2.1.1 Finds an index and ascends the index in fixed-size nibbles of --chunk-size rows, using a non- backtracking algorithm (see pt-archiver for more on this algorithm). It is very similar to “Chunk”, but instead of pre-calculating the boundaries of each piece of the table based on index cardinality, it uses LIMIT to define each nibble’s upper limit, and the previous nibble’s upper limit to define the lower limit. It works in steps: one query finds the row that will define the next nibble’s upper boundary, and the next query checksums the entire nibble. If the nibble differs between the source and destination, it examines the nibble row-by-row, just as “Chunk” does. GroupBy Selects the entire table grouped by all columns, with a COUNT(*) column added. Compares all columns, and if they’re the same, compares the COUNT(*) column’s value to determine how many rows to insert or delete into the destination. Works on tables with no primary key or unique index. Stream Selects the entire table in one big stream and compares all columns. Selects all columns. Much less efficient than the other algorithms, but works when there is no suitable index for them to use. Future Plans Possibilities for future algorithms are TempTable (what I originally called bottom-up in earlier versions of this tool), DrillDown (what I originally called top-down), and GroupByPrefix (similar to how SqlYOG Job Agent works). Each algorithm has strengths and weaknesses. If you’d like to implement your favorite technique for finding differences between two sources of data on possibly different servers, I’m willing to help. The algorithms adhere to a simple interface that makes it pretty easy to write your own. 2.30.8 BIDIRECTIONAL SYNCING Bidirectional syncing is a new, experimental feature. To make it work reliably there are a number of strict limitations: * only works when syncing one server to other independent servers * does not work in any way with replication * requires that the table(s) are chunkable with the Chunk algorithm * is not N-way, only bidirectional between two servers at a time * does not handle DELETE changes For example, suppose we have three servers: c1, r1, r2. c1 is the central server, a pseudo-master to the other servers (viz. r1 and r2 are not slaves to c1). r1 and r2 are remote servers. Rows in table foo are updated and inserted on all three servers and we want to synchronize all the changes between all the servers. Table foo has columns: id int PRIMARY KEY ts timestamp auto updated name varchar Auto-increment offsets are used so that new rows from any server do not create conflicting primary key (id) values. In general, newer rows, as determined by the ts column, take precedence when a same but differing row is found during the bidirectional sync. “Same but differing” means that two rows have the same primary key (id) value but different values for some other column, like the name column in this example. Same but differing conflicts are resolved by a “conflict”. A conflict compares some column of the competing rows to determine a “winner”. The winning row becomes the source and its values are used to update the other row. There are subtle differences between three columns used to achieve bidirectional syncing that you should be fa- miliar with: chunk column (--chunk-column), comparison column(s) (--columns), and conflict column (--conflict-column). The chunk column is only used to chunk the table; e.g. “WHERE id >= 5 AND id < 10”. 
Chunks are checksummed and when chunk checksums reveal a difference, the tool selects the rows in that chunk and checksums the --columns for each row. If a column checksum differs, the rows have one or more con- flicting column values. In a traditional unidirectional sync, the conflict is a moot point because it can be resolved 232 Chapter 2. Tools
simply by updating the entire destination row with the source row's values. In a bidirectional sync, however, the --conflict-column (in accordance with the other --conflict-* options listed below) is compared to determine which row is "correct" or "authoritative"; this row becomes the "source".

To sync all three servers completely, two runs of pt-table-sync are required. The first run syncs c1 and r1, then syncs c1 and r2 including any changes from r1. At this point c1 and r2 are completely in sync, but r1 is missing any changes from r2 because c1 didn't have these changes when it and r1 were synced. So a second run is needed which syncs the servers in the same order, but this time when c1 and r1 are synced r1 gets r2's changes.

The tool does not sync N-ways, only bidirectionally between the first DSN given on the command line and each subsequent DSN in turn. So the tool in this example would be run twice like:

pt-table-sync --bidirectional h=c1 h=r1 h=r2

The --bidirectional option enables this feature and causes various sanity checks to be performed. You must specify other options that tell pt-table-sync how to resolve conflicts for same but differing rows. These options are:

* --conflict-column
* --conflict-comparison
* --conflict-value
* --conflict-threshold
* --conflict-error (optional)

Use --print to test this option before --execute. The printed SQL statements will have comments saying on which host the statement would be executed if you used --execute.

Technical side note: the first DSN is always the "left" server and the other DSNs are always the "right" server. Since either server can become the source or destination it's confusing to think of them as "src" and "dst". Therefore, they're generically referred to as left and right. It's easy to remember this because the first DSN is always to the left of the other server DSNs on the command line.

2.30.9 EXIT STATUS

The following are the exit statuses (also called return values, or return codes) when pt-table-sync finishes and exits.

STATUS MEANING
====== =======================================================
0      Success.
1      Internal error.
2      At least one table differed on the destination.
3      Combination of 1 and 2.

2.30.10 OPTIONS

Specify at least one of --print, --execute, or --dry-run.

--where and --replicate are mutually exclusive.

This tool accepts additional command-line arguments. Refer to the "SYNOPSIS" and usage information for details.

-algorithms type: string; default: Chunk,Nibble,GroupBy,Stream
Algorithm to use when comparing the tables, in order of preference. For each table, pt-table-sync will check if the table can be synced with the given algorithms in the order that they're given. The first algorithm that can sync the table is used. See "ALGORITHMS".
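The exit statuses above make it easy for a wrapper script to distinguish "tables differed" from "something went wrong". The following is a minimal sketch with a placeholder host; it is not part of the tool itself:

pt-table-sync --execute --sync-to-master h=slave1.example.com
status=$?
case "$status" in
  0) echo "tables were already in sync" ;;
  2) echo "differences were found (and resolved, since --execute was given)" ;;
  *) echo "internal error (exit $status); check the tool's output" >&2 ;;
esac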
  • 238. Percona Toolkit Documentation, Release 2.1.1 -ask-pass Prompt for a password when connecting to MySQL. -bidirectional Enable bidirectional sync between first and subsequent hosts. See “BIDIRECTIONAL SYNCING” for more information. -[no]bin-log default: yes Log to the binary log (SET SQL_LOG_BIN=1). Specifying --no-bin-log will SET SQL_LOG_BIN=0. -buffer-in-mysql Instruct MySQL to buffer queries in its memory. This option adds the SQL_BUFFER_RESULT option to the comparison queries. This causes MySQL to execute the queries and place them in a temporary table internally before sending the results back to pt-table-sync. The advantage of this strategy is that pt-table-sync can fetch rows as desired without using a lot of memory inside the Perl process, while releasing locks on the MySQL table (to reduce contention with other queries). The disadvantage is that it uses more memory on the MySQL server instead. You probably want to leave --[no]buffer-to-client enabled too, because buffering into a temp table and then fetching it all into Perl’s memory is probably a silly thing to do. This option is most useful for the GroupBy and Stream algorithms, which may fetch a lot of data from the server. -[no]buffer-to-client default: yes Fetch rows one-by-one from MySQL while comparing. This option enables mysql_use_result which causes MySQL to hold the selected rows on the server until the tool fetches them. This allows the tool to use less memory but may keep the rows locked on the server longer. If this option is disabled by specifying --no-buffer-to-client then mysql_store_result is used which causes MySQL to send all selected rows to the tool at once. This may result in the results “cursor” being held open for a shorter time on the server, but if the tables are large, it could take a long time anyway, and use all your memory. For most non-trivial data sizes, you want to leave this option enabled. This option is disabled when --bidirectional is used. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -[no]check-master default: yes With --sync-to-master, try to verify that the detected master is the real master. -[no]check-privileges default: yes Check that user has all necessary privileges on source and destination table. 234 Chapter 2. Tools
  • 239. Percona Toolkit Documentation, Release 2.1.1 -[no]check-slave default: yes Check whether the destination server is a slave. If the destination server is a slave, it’s generally unsafe to make changes on it. However, sometimes you have to; --replace won’t work unless there’s a unique index, for example, so you can’t make changes on the master in that scenario. By default pt-table-sync will complain if you try to change data on a slave. Specify --no-check-slave to disable this check. Use it at your own risk. -[no]check-triggers default: yes Check that no triggers are defined on the destination table. Triggers were introduced in MySQL v5.0.2, so for older versions this option has no effect because triggers will not be checked. -chunk-column type: string Chunk the table on this column. -chunk-index type: string Chunk the table using this index. -chunk-size type: string; default: 1000 Number of rows or data size per chunk. The size of each chunk of rows for the “Chunk” and “Nibble” algorithms. The size can be either a number of rows, or a data size. Data sizes are specified with a suffix of k=kibibytes, M=mebibytes, G=gibibytes. Data sizes are converted to a number of rows by dividing by the average row length. -columns short form: -c; type: array Compare this comma-separated list of columns. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -conflict-column type: string Compare this column when rows conflict during a --bidirectional sync. When a same but differing row is found the value of this column from each row is compared according to --conflict-comparison, --conflict-value and --conflict-threshold to determine which row has the correct data and becomes the source. The column can be any type for which there is an appropriate --conflict-comparison (this is almost all types except, for example, blobs). This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information. -conflict-comparison type: string Choose the --conflict-column with this property as the source. 2.30. pt-table-sync 235
The option affects how the --conflict-column values from the conflicting rows are compared. Possible comparisons are one of the following: newest|oldest|greatest|least|equals|matches

   COMPARISON  CHOOSES ROW WITH
   ==========  =========================================================
   newest      Newest temporal --conflict-column value
   oldest      Oldest temporal --conflict-column value
   greatest    Greatest numerical --conflict-column value
   least       Least numerical --conflict-column value
   equals      --conflict-column value equal to --conflict-value
   matches     --conflict-column value matching Perl regex pattern --conflict-value

This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.

-conflict-error
type: string; default: warn
How to report unresolvable conflicts and conflict errors.
This option changes how the user is notified when a conflict cannot be resolved or causes some kind of error. Possible values are:
* warn: Print a warning to STDERR about the unresolvable conflict
* die: Die, stop syncing, and print a warning to STDERR
This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.

-conflict-threshold
type: string
Amount by which one --conflict-column must exceed the other.
The --conflict-threshold prevents a conflict from being resolved if the absolute difference between the two --conflict-column values is less than this amount. For example, if two --conflict-column values are the timestamps “2009-12-01 12:00:00” and “2009-12-01 12:05:00”, the difference is 5 minutes. If --conflict-threshold is set to “5m” the conflict will be resolved, but if --conflict-threshold is set to “6m” the conflict will fail to resolve because the difference is not greater than or equal to 6 minutes. In this latter case, --conflict-error will report the failure.
This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.

-conflict-value
type: string
Use this value for certain --conflict-comparison.
This option gives the value for the equals and matches --conflict-comparison. This option only works with --bidirectional. See “BIDIRECTIONAL SYNCING” for more information.

-databases
short form: -d; type: hash
Sync only this comma-separated list of databases.
A common request is to sync tables from one database with tables from another database on the same or a different server. This is not yet possible. --databases will not do it, and you can't do it with the D part of the DSN either, because in the absence of a table name it assumes the whole server should be synced and the D part controls only the connection's default database.
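To make the bidirectional conflict options above concrete, here is a hedged sketch (the host, database, table, and column names are hypothetical, and you should verify the behavior with --print or --dry-run before trusting it). It syncs two servers bidirectionally, letting the row with the newest value in a modified_at column win, but only when the timestamps differ by at least one minute, and dies on any conflict it cannot resolve:

   pt-table-sync --bidirectional --execute \
     --conflict-column modified_at --conflict-comparison newest \
     --conflict-threshold 1m --conflict-error die \
     h=site_a,D=app,t=accounts h=site_b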
  • 241. Percona Toolkit Documentation, Release 2.1.1 -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -dry-run Analyze, decide the sync algorithm to use, print and exit. Implies --verbose so you can see the results. The results are in the same output format that you’ll see from actually running the tool, but there will be zeros for rows affected. This is because the tool actually executes, but stops before it compares any data and just returns zeros. The zeros do not mean there are no changes to be made. -engines short form: -e; type: hash Sync only this comma-separated list of storage engines. -execute Execute queries to make the tables have identical data. This option makes pt-table-sync actually sync table data by executing all the queries that it created to resolve table differences. Therefore, the tables will be changed! And unless you also specify --verbose, the changes will be made silently. If this is not what you want, see --print or --dry-run. -explain-hosts Print connection information and exit. Print out a list of hosts to which pt-table-sync will connect, with all the various connection options, and exit. -float-precision type: int Precision for FLOAT and DOUBLE number-to-string conversion. Causes FLOAT and DOUBLE values to be rounded to the specified number of digits after the decimal point, with the ROUND() function in MySQL. This can help avoid checksum mismatches due to different floating-point representations of the same values on different MySQL versions and hardware. The default is no rounding; the values are converted to strings by the CONCAT() function, and MySQL chooses the string representation. If you specify a value of 2, for example, then the values 1.008 and 1.009 will be rounded to 1.01, and will checksum as equal. -[no]foreign-key-checks default: yes Enable foreign key checks (SET FOREIGN_KEY_CHECKS=1). Specifying --no-foreign-key-checks will SET FOREIGN_KEY_CHECKS=0. -function type: string Which hash function you’d like to use for checksums. The default is CRC32. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user- defined function, pt-table-sync will detect it and prefer to use it, because it is much faster than the built-ins. You can also use MURMUR_HASH if you’ve installed that user-defined function. Both of these are distributed with Maatkit. See pt-table-checksum for more information and benchmarks. -help Show help and exit. -[no]hex-blob default: yes HEX() BLOB, TEXT and BINARY columns. 2.30. pt-table-sync 237
When row data from the source is fetched to create queries to sync the data (i.e. the queries seen with --print and executed by --execute), binary columns are wrapped in HEX() so the binary data does not produce an invalid SQL statement. You can disable this option but you probably shouldn't.

-host
short form: -h; type: string
Connect to host.

-ignore-columns
type: Hash
Ignore this comma-separated list of column names in comparisons.
This option causes columns not to be compared. However, if a row is determined to differ between tables, all columns in that row will be synced, regardless. (It is not currently possible to exclude columns from the sync process itself, only from the comparison.)

-ignore-databases
type: Hash
Ignore this comma-separated list of databases.

-ignore-engines
type: Hash; default: FEDERATED,MRG_MyISAM
Ignore this comma-separated list of storage engines.

-ignore-tables
type: Hash
Ignore this comma-separated list of tables. Table names may be qualified with the database name.

-[no]index-hint
default: yes
Add FORCE/USE INDEX hints to the chunk and row queries.
By default pt-table-sync adds a FORCE/USE INDEX hint to each SQL statement to coerce MySQL into using the index chosen by the sync algorithm or specified by --chunk-index. This is usually a good thing, but in rare cases the index may not be the best for the query, so you can suppress the index hint by specifying --no-index-hint and let MySQL choose the index.
This does not affect the queries printed by --print; it only affects the chunk and row queries that pt-table-sync uses to select and compare rows.

-lock
type: int
Lock tables: 0=none, 1=per sync cycle, 2=per table, or 3=globally.
This uses LOCK TABLES. This can help prevent tables being changed while you're examining them. The possible values are as follows:

   VALUE  MEANING
   =====  =======================================================
   0      Never lock tables.
   1      Lock and unlock one time per sync cycle (as implemented
          by the syncing algorithm). This is the most granular
          level of locking available. For example, the Chunk
          algorithm will lock each chunk of N rows, and then
          unlock them if they are the same on the source and the
          destination, before moving on to the next chunk.
   2      Lock and unlock before and after each table.
   3      Lock and unlock once for every server (DSN) synced,
          with FLUSH TABLES WITH READ LOCK.

A replication slave is never locked if --replicate or --sync-to-master is specified, since in theory locking the table on the master should prevent any changes from taking place. (You are not changing data on your slave, right?) If --wait is given, the master (source) is locked and then the tool waits for the slave to catch up to the master before continuing.
If --transaction is specified, LOCK TABLES is not used. Instead, lock and unlock are implemented by beginning and committing transactions. The exception is if --lock is 3.
If --no-transaction is specified, then LOCK TABLES is used for any value of --lock. See --[no]transaction.

-lock-and-rename
Lock the source and destination table, sync, then swap names. This is useful as a less-blocking ALTER TABLE, once the tables are reasonably in sync with each other (which you may choose to accomplish via any number of means, including dump and reload or even something like pt-archiver). It requires exactly two DSNs and assumes they are on the same server, so it does no waiting for replication or the like. Tables are locked with LOCK TABLES.

-password
short form: -p; type: string
Password to use when connecting.

-pid
type: string
Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies.

-port
short form: -P; type: int
Port number to use for connection.

-print
Print queries that will resolve differences.
If you don't trust pt-table-sync, or just want to see what it will do, this is a good way to be safe. These queries are valid SQL and you can run them yourself if you want to sync the tables manually.

-recursion-method
type: string
Preferred recursion method used to find slaves. Possible methods are:

   METHOD       USES
   ===========  ================
   processlist  SHOW PROCESSLIST
   hosts        SHOW SLAVE HOSTS

The processlist method is preferred because SHOW SLAVE HOSTS is not reliable. However, the hosts method is required if the server uses a non-standard port (not 3306). Usually pt-table-sync does the right thing and finds the slaves, but you may give a preferred method and it will be used first. If it doesn't find any slaves, the other methods will be tried.
-replace
Write all INSERT and UPDATE statements as REPLACE.
This is automatically switched on as needed when there are unique index violations.

-replicate
type: string
Sync tables listed as different in this table.
Specifies that pt-table-sync should examine the specified table to find data that differs. The table is exactly the same as the argument of the same name to pt-table-checksum. That is, it contains records of which tables (and ranges of values) differ between the master and slave.
For each table and range of values that shows differences between the master and slave, pt-table-sync will sync that table, with the appropriate WHERE clause, to its master.
This automatically sets --wait to 60 and causes changes to be made on the master instead of the slave.
If --sync-to-master is specified, the tool will assume the server you specified is the slave, and connect to the master as usual to sync.
Otherwise, it will try to use SHOW PROCESSLIST to find slaves of the server you specified. If it is unable to find any slaves via SHOW PROCESSLIST, it will inspect SHOW SLAVE HOSTS instead. You must configure each slave's report-host, report-port and other options for this to work right. After finding slaves, it will inspect the specified table on each slave to find data that needs to be synced, and sync it.
The tool examines the master's copy of the table first, assuming that the master is potentially a slave as well. Any table that shows differences there will NOT be synced on the slave(s). For example, suppose your replication is set up as A->B, B->C, B->D. Suppose you use this argument and specify server B. The tool will examine server B's copy of the table. If it looks like server B's data in table test.tbl1 is different from server A's copy, the tool will not sync that table on servers C and D.

-set-vars
type: string; default: wait_timeout=10000
Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed.

-socket
short form: -S; type: string
Socket file to use for connection.

-sync-to-master
Treat the DSN as a slave and sync it to its master.
Treat the server you specified as a slave. Inspect SHOW SLAVE STATUS, connect to the server's master, and treat the master as the source and the slave as the destination. Causes changes to be made on the master. Sets --wait to 60 by default, sets --lock to 1 by default, and disables --[no]transaction by default. See also --replicate, which changes this option's behavior.

-tables
short form: -t; type: hash
Sync only this comma-separated list of tables. Table names may be qualified with the database name.

-timeout-ok
Keep going if --wait fails.
If you specify --wait and the slave doesn't catch up to the master's position before the wait times out, the default behavior is to abort. This option makes the tool keep going anyway. Warning: if you are trying to get a consistent comparison between the two servers, you probably don't want to keep going after a timeout.

-[no]transaction
Use transactions instead of LOCK TABLES.
The granularity of beginning and committing transactions is controlled by --lock. This is enabled by default, but since --lock is disabled by default, it has no effect.
Most options that enable locking also disable transactions by default, so if you want to use transactional locking (via LOCK IN SHARE MODE and FOR UPDATE), you must specify --transaction explicitly.
If you don't specify --transaction explicitly, pt-table-sync will decide on a per-table basis whether to use transactions or table locks. It currently uses transactions on InnoDB tables, and table locks on all others.
If --no-transaction is specified, then pt-table-sync will not use transactions at all (not even for InnoDB tables) and locking is controlled by --lock.
When enabled, either explicitly or implicitly, the transaction isolation level is set to REPEATABLE READ and transactions are started WITH CONSISTENT SNAPSHOT.

-trim
TRIM() VARCHAR columns in BIT_XOR and ACCUM modes. Helps when comparing MySQL 4.1 to >= 5.0.
This is useful when you don't care about the trailing space differences between MySQL versions which vary in their handling of trailing spaces. MySQL 5.0 and later all retain trailing spaces in VARCHAR, while previous versions would remove them.

-[no]unique-checks
default: yes
Enable unique key checks (SET UNIQUE_CHECKS=1). Specifying --no-unique-checks will SET UNIQUE_CHECKS=0.

-user
short form: -u; type: string
User for login if not current user.

-verbose
short form: -v; cumulative: yes
Print results of sync operations. See “OUTPUT” for more details about the output.

-version
Show version and exit.

-wait
short form: -w; type: time
How long to wait for slaves to catch up to their master.
Make the master wait for the slave to catch up in replication before comparing the tables. The value is the number of seconds to wait before timing out (see also --timeout-ok). Sets --lock to 1 and --[no]transaction to 0 by default. If you see an error such as the following:

   MASTER_POS_WAIT returned -1

it means the timeout was exceeded and you need to increase it.
The default value of this option is influenced by other options. To see what value is in effect, run with --help.
To disable waiting entirely (except for locks), specify --wait 0. This helps when the slave is lagging on tables that are not being synced.

-where
type: string
WHERE clause to restrict syncing to part of the table.

-[no]zero-chunk
default: yes
Add a chunk for rows with zero or zero-equivalent values. This only has an effect when --chunk-size is specified. The purpose of the zero chunk is to capture a potentially large number of zero values that would imbalance the size of the first chunk. For example, if a lot of negative numbers were inserted into an unsigned integer column causing them to be stored as zeros, then these zero values are captured by the zero chunk instead of the first chunk and all its non-zero values.

2.30.11 DSN OPTIONS

These DSN options are used to create a DSN. Each option is given like option=value. The options are case-sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details.

• A
dsn: charset; copy: yes
Default character set.

• D
dsn: database; copy: yes
Database containing the table to be synced.

• F
dsn: mysql_read_default_file; copy: yes
Only read default options from the given file.

• h
dsn: host; copy: yes
Connect to host.

• p
dsn: password; copy: yes
Password to use when connecting.

• P
dsn: port; copy: yes
Port number to use for connection.

• S
dsn: mysql_socket; copy: yes
Socket file to use for connection.
  • 247. Percona Toolkit Documentation, Release 2.1.1 • t copy: yes Table to be synced. • u dsn: user; copy: yes User for login if not current user. 2.30.12 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-table-sync ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.30.13 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.30.14 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-table-sync. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.30.15 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: 2.30. pt-table-sync 243
  • 248. Percona Toolkit Documentation, Release 2.1.1 wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.30.16 AUTHORS Baron Schwartz 2.30.17 ACKNOWLEDGMENTS My work is based in part on Giuseppe Maxia’s work on distributed databases, http://www.sysadminmag.com/articles/2004/0408/ and code derived from that article. There is more explana- tion, and a link to the code, at http://www.perlmonks.org/?node_id=381053. Another programmer extended Maxia’s work even further. Fabien Coelho changed and generalized Maxia’s technique, introducing symmetry and avoiding some problems that might have caused too-frequent checksum collisions. This work grew into pg_comparator, http://www.coelho.net/pg_comparator/. Coelho also explained the technique further in a paper titled “Remote Comparison of Database Tables” (http://cri.ensmp.fr/classement/doc/A-375.pdf). This existing literature mostly addressed how to find the differences between the tables, not how to resolve them once found. I needed a tool that would not only find them efficiently, but would then resolve them. I first began thinking about how to improve the technique further with my article http://tinyurl.com/mysql-data-diff-algorithm, where I discussed a number of problems with the Maxia/Coelho “bottom-up” algorithm. After writing that article, I began to write this tool. I wanted to actually implement their algorithm with some improvements so I was sure I understood it completely. I discovered it is not what I thought it was, and is considerably more complex than it appeared to me at first. Fabien Coelho was kind enough to address some questions over email. The first versions of this tool implemented a version of the Coelho/Maxia algorithm, which I called “bottom-up”, and my own, which I called “top-down.” Those algorithms are considerably more complex than the current algorithms and I have removed them from this tool, and may add them back later. The improvements to the bottom-up algorithm are my original work, as is the top-down algorithm. The techniques to actually resolve the differences are also my own work. Another tool that can synchronize tables is the SQLyog Job Agent from webyog. Thanks to Rohit Nadhani, SJA’s author, for the conversations about the general techniques. There is a comparison of pt-table-sync and SJA at http://tinyurl.com/maatkit-vs-sqlyog Thanks to the following people and organizations for helping in many ways: The Rimm-Kaufman Group http://www.rimmkaufman.com/, MySQL AB http://www.mysql.com/, Blue Ridge Inter- netWorks http://www.briworks.com/, Percona http://www.percona.com/, Fabien Coelho, Giuseppe Maxia and others at MySQL AB, Kristian Koehntopp (MySQL AB), Rohit Nadhani (WebYog), The helpful monks at Perlmonks, And others too numerous to mention. 2.30.18 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 244 Chapter 2. Tools
2.30.19 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

2.30.20 VERSION

pt-table-sync 2.1.1

2.31 pt-table-usage

2.31.1 NAME

pt-table-usage - Analyze how queries use tables.

2.31.2 SYNOPSIS

Usage

pt-table-usage [OPTIONS] [FILES]

pt-table-usage reads queries from a log and analyzes how they use tables. If no FILE is specified, it reads STDIN. It prints a report for each query.

2.31.3 RISKS

pt-table-usage is very low risk. By default, it simply reads queries from a log. It executes EXPLAIN EXTENDED if you specify the --explain-extended option.
At the time of this release, we know of no bugs that could harm users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-table-usage.
See also “BUGS” for more information on filing bugs and getting help.
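As a brief, hedged illustration of the synopsis above (the log path and output file are hypothetical), a typical invocation reads a slow query log and writes the per-query table-usage report to a file:

   pt-table-usage /var/log/mysql/mysql-slow.log > table-usage.txt

If column names in the log are not table-qualified, the DESCRIPTION below explains how --explain-extended or --create-table-definitions can help resolve them.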
  • 250. Percona Toolkit Documentation, Release 2.1.1 2.31.4 DESCRIPTION pt-table-usage reads queries from a log and analyzes how they use tables. The log should be in MySQL’s slow query log format. Table usage is more than simply an indication of which tables the query reads or writes. It also indicates data flow: data in and data out. The tool determines the data flow by the contexts in which tables appear. A single query can use a table in several different contexts simultaneously. The tool’s output lists every context for every table. This CONTEXT-TABLE list indicates how data flows between tables. The “OUTPUT” section lists the possible contexts and describes how to read a table usage report. The tool analyzes data flow down to the level of individual columns, so it is helpful if columns are identified un- ambiguously in the query. If a query uses only one table, then all columns must be from that table, and there’s no difficulty. But if a query uses multiple tables and the column names are not table-qualified, then it is necessary to use EXPLAIN EXTENDED, followed by SHOW WARNINGS, to determine to which tables the columns belong. If the tool does not know the query’s default database, which can occur when the database is not printed in the log, then EXPLAIN EXTENDED can fail. In this case, you can specify a default database with --database. You can also use the --create-table-definitions option to help resolve ambiguities. 2.31.5 OUTPUT The tool prints a usage report for each table in every query, similar to the following: Query_id: 0x1CD27577D202A339.1 UPDATE t1 SELECT DUAL JOIN t1 JOIN t2 WHERE t1 Query_id: 0x1CD27577D202A339.2 UPDATE t2 SELECT DUAL JOIN t1 JOIN t2 WHERE t1 The first line contains the query ID, which by default is the same as those shown in pt-query-digest reports. It is an MD5 checksum of the query’s “fingerprint,” which is what remains after removing literals, collapsing white space, and a variety of other transformations. The query ID has two parts separated by a period: the query ID and the table number. If you wish to use a different value to identify the query, you can specify the --id-attribute option. The previous example shows two paragraphs for a single query, not two queries. Note that the query ID is identical for the two, but the table number differs. The table number increments by 1 for each table that the query updates. Only multi-table UPDATE queries can update multiple tables with a single query, so the table number is 1 for all other types of queries. (The tool does not support multi-table DELETE queries.) The example output above is from this query: UPDATE t1 AS a JOIN t2 AS b USING (id) SET a.foo="bar", b.foo="bat" WHERE a.id=1; The SET clause indicates that the query updates two tables: a aliased as t1, and b aliased as t2. After the first line, the tool prints a variable number of CONTEXT-TABLE lines. Possible contexts are as follows: • SELECT 246 Chapter 2. Tools
  • 251. Percona Toolkit Documentation, Release 2.1.1 SELECT means that the query retrieves data from the table for one of two reasons. The first is to be returned to the user as part of a result set. Only SELECT queries return result sets, so the report always shows a SELECT context for SELECT queries. The second case is when data flows to another table as part of an INSERT or UPDATE. For example, the UPDATE query in the example above has the usage: SELECT DUAL This refers to: SET a.foo="bar", b.foo="bat" The tool uses DUAL for any values that do not originate in a table, in this case the literal values “bar” and “bat”. If that SET clause were SET a.foo=b.foo instead, then the complete usage would be: Query_id: 0x1CD27577D202A339.1 UPDATE t1 SELECT t2 JOIN t1 JOIN t2 WHERE t1 The presence of a SELECT context after another context, such as UPDATE or INSERT, indicates where the UPDATE or INSERT retrieves its data. The example immediately above reflects an UPDATE query that updates rows in table t1 with data from table t2. • Any other verb Any other verb, such as INSERT, UPDATE, DELETE, etc. may be a context. These verbs indicate that the query modifies data in some way. If a SELECT context follows one of these verbs, then the query reads data from the SELECT table and writes it to this table. This happens, for example, with INSERT..SELECT or UPDATE queries that use values from tables instead of constant values. These query types are not supported: SET, LOAD, and multi-table DELETE. • JOIN The JOIN context lists tables that are joined, either with an explicit JOIN in the FROM clause, or implicitly in the WHERE clause, such as t1.id = t2.id. • WHERE The WHERE context lists tables that are used in the WHERE clause to filter results. This does not include tables that are implicitly joined in the WHERE clause; those are listed as JOIN contexts. For example: WHERE t1.id > 100 AND t1.id < 200 AND t2.foo IS NOT NULL Results in: WHERE t1 WHERE t2 The tool lists only distinct tables; that is why table t1 is listed only once. • TLIST The TLIST context lists tables that the query accesses, but which do not appear in any other context. These tables are usually an implicit cartesian join. For example, the query SELECT * FROM t1, t2 results in: 2.31. pt-table-usage 247
  • 252. Percona Toolkit Documentation, Release 2.1.1 Query_id: 0xBDDEB6EDA41897A8.1 SELECT t1 SELECT t2 TLIST t1 TLIST t2 First of all, there are two SELECT contexts, because SELECT * selects rows from all tables; t1 and t2 in this case. Secondly, the tables are implicitly joined, but without any kind of join condition, which results in a cartesian join as indicated by the TLIST context for each. 2.31.6 EXIT STATUS pt-table-usage exits 1 on any kind of error, or 0 if no errors. 2.31.7 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -constant-data-value type: string; default: DUAL Table to print as the source for constant data (literals). This is any data not retrieved from tables (or subqueries, because subqueries are not supported). This includes literal values such as strings (“foo”) and numbers (42), or functions such as NOW(). For example, in the query INSERT INTO t (c) VALUES (’a’), the string ‘a’ is constant data, so the table usage report is: INSERT t SELECT DUAL The first line indicates that the query inserts data into table t, and the second line indicates that the inserted data comes from some constant value. -[no]continue-on-error default: yes Continue to work even if there is an error. -create-table-definitions type: array Read CREATE TABLE definitions from this list of comma-separated files. If you cannot use --explain-extended to fully qualify table and column names, you can save the output of mysqldump --no-data to one or more files and specify those files with this option. The tool will parse all CREATE 248 Chapter 2. Tools
  • 253. Percona Toolkit Documentation, Release 2.1.1 TABLE definitions from the files and use this information to qualify table and column names. If a column name appears in multiple tables, or a table name appears in multiple databases, the ambiguities cannot be resolved. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -database short form: -D; type: string Default database. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -explain-extended type: DSN A server to execute EXPLAIN EXTENDED queries. This may be necessary to resolve ambiguous (unqualified) column and table names. -filter type: string Discard events for which this Perl code doesn’t return true. This option is a string of Perl code or a file containing Perl code that is compiled into a subroutine with one argument: $event. If the given value is a readable file, then pt-table-usage reads the entire file and uses its contents as the code. Filters are implemented in the same fashion as in the pt-query-digest tool, so please refer to its documentation for more information. -help Show help and exit. -host short form: -h; type: string Connect to host. -id-attribute type: string Identify each event using this attribute. The default is to use a query ID, which is an MD5 checksum of the query’s fingerprint. -log type: string Print all output to this file when daemonized. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when running. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. 2.31. pt-table-usage 249
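To tie together several of the options in this list, here is a hedged sketch (the host, user, database, and log path are hypothetical, and the filter expression is only illustrative of the pt-query-digest-style filters that --filter accepts): analyze only SELECT statements from a slow log, using a development server to resolve ambiguous column names:

   pt-table-usage --filter '$event->{arg} =~ m/^SELECT/i' \
     --explain-extended h=dev_host,u=analyst --database sakila /path/to/slow.log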
  • 254. Percona Toolkit Documentation, Release 2.1.1 -port short form: -P; type: int Port number to use for connection. -progress type: array; default: time,30 Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. -query type: string Analyze the specified query instead of reading a log file. -read-timeout type: time; default: 0 Wait this long for an event from the input; 0 to wait forever. This option sets the maximum time to wait for an event from the input. If an event is not received after the specified time, the tool stops reading the input and prints its reports. This option requires the Perl POSIX module. -run-time type: time How long to run before exiting. The default is to run forever (you can interrupt with CTRL-C). -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.31.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. 250 Chapter 2. Tools
  • 255. Percona Toolkit Documentation, Release 2.1.1 • D copy: no Default database. • F dsn: mysql_read_default_file; copy: no Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: no Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.31.9 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-table-usage ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.31.10 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.31.11 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-table-usage. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool 2.31. pt-table-usage 251
  • 256. Percona Toolkit Documentation, Release 2.1.1 • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.31.12 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.31.13 AUTHORS Daniel Nichter 2.31.14 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.31.15 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 252 Chapter 2. Tools
2.31.16 VERSION

pt-table-usage 2.1.1

2.32 pt-tcp-model

2.32.1 NAME

pt-tcp-model - Transform tcpdump into metrics that permit performance and scalability modeling.

2.32.2 SYNOPSIS

Usage

pt-tcp-model [OPTION...] [FILE]

pt-tcp-model parses and analyzes tcpdump files. With no FILE, or when FILE is -, it reads standard input.
Dump TCP requests and responses to a file, capturing only the packet headers to avoid dropped packets, and ignoring any packets without a payload (such as ack-only packets). Capture port 3306 (MySQL database traffic). Note that the TCP filtering expression below may wrap when displayed in terminals and man pages; enter it as a single line in your tcpdump command.

   tcpdump -s 384 -i any -nnq -tttt 'tcp port 3306 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' > /path/to/tcp-file.txt

Extract individual response times, sorted by end time:

   pt-tcp-model /path/to/tcp-file.txt > requests.txt

Sort the result by arrival time, for input to the next step:

   sort -n -k1,1 requests.txt > sorted.txt

Slice the result into 10-second intervals and emit throughput, concurrency, and response time metrics for each interval:

   pt-tcp-model --type=requests --run-time=10 sorted.txt > sliced.txt

Transform the result for modeling with Aspersa's usl tool, discarding the first and last line of each file if you specify multiple files (the first and last line are normally incomplete observation periods and are aberrant):

   for f in sliced.txt; do
      tail -n +2 "$f" | head -n -1 | awk '{print $2, $3, $7/$4}'
   done > usl-input.txt

2.32.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.
pt-tcp-model merely reads and transforms its input, printing it to the output. It should be very low risk.
  • 258. Percona Toolkit Documentation, Release 2.1.1 At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-tcp- model. See also “BUGS” for more information on filing bugs and getting help. 2.32.4 DESCRIPTION This tool recognizes requests and responses in a TCP stream, and extracts the “conversations”. You can use it to capture the response times of individual queries to a database, for example. It expects the TCP input to be in the following format, which should result from the sample shown in the SYNOPSIS: <date> <time.microseconds> IP <IP.port> > <IP.port>: <junk> The tool watches for “incoming” packets to the port you specify with the --watch-server option. This begins a request. If multiple inbound packets follow each other, then by default the last inbound packet seen determines the time at which the request is assumed to begin. This is logical if one assumes that a server must receive the whole SQL statement before beginning execution, for example. When the first outbound packet is seen, the server is considered to have responded to the request. The tool might see an inbound packet, but never see a response. This can happen when the kernel drops packets, for example. As a result, the tool never prints a request unless it sees the response to it. However, the tool actually does not print any request until it sees the “last” outbound packet. It determines this by waiting for either another inbound packet, or EOF, and then considers the previous inbound/outbound pair to be complete. As a result, the tool prints requests in a relatively random order. Most types of analysis require processing in either arrival or completion order. Therefore, the second type of processing this tool can do requires that you sort the output from the first stage and supply it as input. The second type of processing is selected with the --type option set to “requests”. In this mode, the tool reads a group of requests and aggregates them, then emits the aggregated metrics. 2.32.5 OUTPUT In the default mode (parsing tcpdump output), requests are printed out one per line, in the following format: <id> <start> <end> <elapsed> <IP:port> The ID is an incrementing number, assigned in arrival order in the original TCP traffic. The start and end timestamps, and the elapsed time, can be customized with the --start-end option. In --type=requests mode, the tool prints out one line per time interval as defined by --run-time, with the following columns: ts, concurrency, throughput, arrivals, completions, busy_time, weighted_time, sum_time, vari- ance_mean, quantile_time, obs_time. A detailed explanation follows: ts The timestamp that defines the beginning of the interval. concurrency The average number of requests resident in the server during the interval. throughput The number of arrivals per second during the interval. arrivals The number of arrivals during the interval. 254 Chapter 2. Tools
  • 259. Percona Toolkit Documentation, Release 2.1.1 completions The number of completions during the interval. busy_time The total amount of time during which at least one request was resident in the server during the interval. weighted_time The total response time of all the requests resident in the server during the interval, including requests that neither arrived nor completed during the interval. sum_time The total response time of all the requests that arrived in the interval. variance_mean The variance-to-mean ratio (index of dispersion) of the response times of the requests that arrived in the interval. quantile_time The Nth percentile response time for all the requests that arrived in the interval. See also --quantile. obs_time The length of the observation time window. This will usually be the same as the interval length, except for the first and last intervals in a file, which might have a shorter observation time. 2.32.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -help Show help and exit. -progress type: array; default: time,30 Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. -quantile type: float The percentile for the last column when --type is “requests” (default .99). -run-time type: float The size of the aggregation interval in seconds when --type is “requests” (default 1). Fractional values are permitted. -start-end type: Array; default: ts,end 2.32. pt-tcp-model 255
  • 260. Percona Toolkit Documentation, Release 2.1.1 Define how the arrival and completion timestamps of a query, and thus its response time (elapsed time) are computed. Recall that there may be multiple inbound and outbound packets per request and response, and refer to the following ASCII diagram. Suppose that a client sends a series of three inbound (I) packets to the server, which computes the result and then sends two outbound (O) packets back: I I I ..................... O O |<---->|<---response time----->|<-->| ts0 ts end end1 By default, the query is considered to arrive at time ts, and complete at time end. However, this might not be what you want. Perhaps you do not want to consider the query to have completed until time end1. You can accomplish this by setting this option to ts,end1. -type type: string The type of input to parse (default tcpdump). The permitted types are tcpdump The parser expects the input to be formatted with the following options: -x -n -q -tttt. For example, if you want to capture output from your local machine, you can do something like the following (the port must come last on FreeBSD): tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txt pt-query-digest --type tcpdump mysql.tcp.txt The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make sure the output looks like this (there is a line break in the first line to avoid man-page problems): 2009-04-12 09:50:16.804849 IP 127.0.0.1.42167 > 127.0.0.1.3306: tcp 37 All MySQL servers running on port 3306 are automatically detected in the tcpdump output. Therefore, if the tcpdump out contains packets from multiple servers on port 3306 (for example, 10.0.0.1:3306, 10.0.0.2:3306, etc.), all packets/queries from all these servers will be analyzed to- gether as if they were one server. If you’re analyzing traffic for a protocol that is not running on port 3306, see --watch-server. -version Show version and exit. -watch-server type: string; default: 10.10.10.10:3306 This option tells pt-tcp-model which server IP address and port (such as “10.0.0.1:3306”) to watch when parsing tcpdump for --type tcpdump. If you don’t specify it, the tool watches all servers by looking for any IP address using port 3306. If you’re watching a server with a non-standard port, this won’t work, so you must specify the IP address and port to watch. Currently, IP address filtering isn’t implemented; so even though you must specify the option in IP:port form, it ignores the IP and only looks at the port number. 2.32.7 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: 256 Chapter 2. Tools
  • 261. Percona Toolkit Documentation, Release 2.1.1 PTDEBUG=1 pt-tcp-model ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.32.8 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.32.9 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-tcp-model. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.32.10 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.32.11 AUTHORS Baron Schwartz 2.32.12 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.32. pt-tcp-model 257
2.32.13 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

2.32.14 VERSION

pt-tcp-model 2.1.1

2.33 pt-trend

2.33.1 NAME

pt-trend - Compute statistics over a set of time-series data points.

2.33.2 SYNOPSIS

Usage

pt-trend [OPTION...] [FILE ...]

pt-trend reads a slow query log and outputs statistics on it.

2.33.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.
pt-trend simply reads files given on the command line. It should be very low-risk.
At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-trend.
See also “BUGS” for more information on filing bugs and getting help.

2.33.4 DESCRIPTION

You can specify multiple files on the command line. If you don't specify any, or if you use the special filename -, lines are read from standard input.
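For example (the log file names here are hypothetical), you can run pt-trend directly against one or more slow logs, or pipe a decompressed log into it using the special - filename described above:

   pt-trend /var/log/mysql/mysql-slow.log
   zcat mysql-slow.log.gz | pt-trend -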
  • 263. Percona Toolkit Documentation, Release 2.1.1 2.33.5 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -help Show help and exit. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -progress type: array; default: time,15 Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. -quiet short form: -q Disables --progress. -version Show version and exit. 2.33.6 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-trend ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.33.7 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.33.8 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-trend. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool 2.33. pt-trend 259
  • 264. Percona Toolkit Documentation, Release 2.1.1 • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.33.9 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.33.10 AUTHORS Baron Schwartz 2.33.11 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.33.12 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 260 Chapter 2. Tools
2.33.13 VERSION

pt-trend 2.1.1

2.34 pt-upgrade

2.34.1 NAME

pt-upgrade - Execute queries on multiple servers and check for differences.

2.34.2 SYNOPSIS

Usage

pt-upgrade [OPTION...] DSN [DSN...] [FILE]

pt-upgrade compares query execution on two hosts by executing queries in the given file (or STDIN if no file is given) and examining the results, errors, warnings, etc. produced on each.
Execute and compare all queries in slow.log on host1 to host2:

   pt-upgrade slow.log h=host1 h=host2

Use pt-query-digest to get, execute and compare queries from tcpdump:

   tcpdump -i eth0 port 3306 -s 65535 -x -n -q -tttt |
     pt-query-digest --type tcpdump --no-report --print |
     pt-upgrade h=host1 h=host2

Compare only query times on host1 to host2 and host3:

   pt-upgrade slow.log h=host1 h=host2 h=host3 --compare query_times

Compare a single query, no slowlog needed:

   pt-upgrade h=host1 h=host2 --query 'SELECT * FROM db.tbl'

2.34.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.
pt-upgrade is a read-only tool that is meant to be used on non-production servers. It executes the SQL that you give it as input, which could cause undesired load on a production server.
At the time of this release, there is a bug that causes the tool to crash, and a bug that causes a deadlock. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-upgrade.
See also “BUGS” for more information on filing bugs and getting help.
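As the DESCRIPTION below notes, you can also run the tool against a single host and save the report to diff later against another single-host run. A hedged sketch of that workflow (host names and file names are hypothetical):

   pt-upgrade slow.log h=server_before_upgrade > before.txt
   pt-upgrade slow.log h=server_after_upgrade > after.txt
   diff before.txt after.txt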
2.34.4 DESCRIPTION

pt-upgrade executes queries from slowlogs on one or more MySQL servers to find differences in query time, warnings, results, and other aspects of the queries' execution. This helps evaluate upgrades, migrations and configuration changes. The comparisons specified by --compare determine what differences can be found. A report is printed which outlines all the differences found; see "OUTPUT" below.

The first DSN (host) specified on the command line is authoritative; it defines the results to which the other DSNs are compared. You can "compare" only one host, in which case there will be no differences, but the output can be saved to be diffed later against the output of another single-host "comparison".

At present, pt-upgrade only reads slowlogs. Use pt-query-digest --print to transform other log formats to slowlog.

DSNs and slowlog files can be specified in any order. pt-upgrade will automatically determine if an argument is a DSN or a slowlog file. If no slowlog files are given and --query is not specified, then pt-upgrade will read from STDIN.

2.34.5 OUTPUT

Queries are grouped by fingerprint, and any with differences are printed. The first part of a query report is a summary of differences. In the example below, the query returns a different number of rows (row counts) on each server. The second part is the side-by-side comparison of values obtained from the query on each server. Then a sample of the query is printed, preceded by its ID, which can be used to locate more information in the sub-report at the end. There are sub-reports for various types of differences.

# Query 1: ID 0x3C830E3839B916D7 at byte 0 _______________________________
# Found 1 differences in 1 samples:
# column counts 0
# column types 0
# column values 0
# row counts 1
# warning counts 0
# warning levels 0
# warnings 0
# 127.1:12345 127.1:12348
# Errors 0 0
# Warnings 0 0
# Query_time
# sum 0 0
# min 0 0
# max 0 0
# avg 0 0
# pct_95 0 0
# stddev 0 0
# median 0 0
# row_count
# sum 4 3
# min 4 3
# max 4 3
# avg 4 3
# pct_95 4 3
# stddev 0 0
# median 4 3
use `test`;
select i from t where i is not null
  • 267. Percona Toolkit Documentation, Release 2.1.1 /* 3C830E3839B916D7-1 */ select i from t where i is not null # Row count differences # Query ID 127.1:12345 127.1:12348 # ================== =========== =========== # 3C830E3839B916D7-1 4 3 The output will vary slightly depending on which options are specified. 2.34.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -base-dir type: string; default: /tmp Save outfiles for the rows comparison method in this directory. See the rows --compare-results-method. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -[no]clear-warnings default: yes Clear warnings before each warnings comparison. If comparing warnings (--compare includes warnings), this option causes pt-upgrade to execute a suc- cessful SELECT statement which clears any warnings left over from previous queries. This requires a current database that pt-upgrade usually detects automatically, but in some cases it might be necessary to specify --temp-database. If pt-upgrade can’t auto-detect the current database, it will create a temporary table in the --temp-database called mk_upgrade_clear_warnings. -clear-warnings-table type: string Execute SELECT * FROM ... LIMIT 1 from this table to clear warnings. -compare type: Hash; default: query_times,results,warnings What to compare for each query executed on each host. Comparisons determine differences when the queries are executed on the hosts. More comparisons enable more differences to be detected. The following comparisons are available: query_times Compare query execution times. If this comparison is disabled, the queries are still executed so that other comparisons will work, but the query time attributes are removed from the events. results 2.34. pt-upgrade 263
Compare result sets to find differences in rows, columns, etc. What differences can be found depends on the --compare-results-method used.

warnings

Compare warnings from SHOW WARNINGS. Requires at least MySQL 4.1.

--compare-results-method

type: string; default: CHECKSUM; group: Comparisons

Method to use for --compare results. This option has no effect if --no-compare-results is given. Available compare methods (case-insensitive):

CHECKSUM

Do CREATE TEMPORARY TABLE `mk_upgrade` AS query then CHECKSUM TABLE `mk_upgrade`. This method is fast and simple, but in rare cases it might be inaccurate, because the MySQL manual says:

[The] fact that two tables produce the same checksum does not mean that the tables are identical.

Requires at least MySQL 4.1.

rows

Compare rows one-by-one to find differences. This method has advantages and disadvantages. Its disadvantages are that it may be slower and it requires writing and reading outfiles from disk. Its advantages are that it is universal (it works for all versions of MySQL), it doesn't alter the query in any way, and it can find column value differences.

The rows method works as follows:

1. Rows from each host are compared one-by-one.
2. If no differences are found, comparison stops, else...
3. All remaining rows (after the point where they begin to differ) are written to outfiles.
4. The outfiles are loaded into temporary tables with LOAD DATA LOCAL INFILE.
5. The temporary tables are analyzed to determine the differences.

The outfiles are written to the --base-dir.

--config

type: Array

Read this comma-separated list of config files; if specified, this must be the first option on the command line.

--continue-on-error

Continue working even if there is an error.

--convert-to-select

Convert non-SELECT statements to SELECTs and compare.

By default non-SELECT statements are not allowed. This option causes non-SELECT statements (like UPDATE, INSERT and DELETE) to be converted to SELECT statements, executed and compared. For example, DELETE col FROM tbl WHERE id=1 is converted to SELECT col FROM tbl WHERE id=1.

--daemonize

Fork to the background and detach from the shell. POSIX operating systems only.
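As a quick illustrative sketch of the options above (the host names, log file, and output directory are placeholders, not defaults of the tool):

# Use the slower but more thorough rows comparison; outfiles go to --base-dir
pt-upgrade slow.log h=host1 h=host2 \
  --compare-results-method rows --base-dir /tmp/pt-upgrade-outfiles

# Rewrite UPDATE/INSERT/DELETE statements as SELECTs so they can be compared
pt-upgrade slow.log h=host1 h=host2 --convert-to-select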
  • 269. Percona Toolkit Documentation, Release 2.1.1 -explain-hosts Print connection information and exit. -filter type: string Discard events for which this Perl code doesn’t return true. This option is a string of Perl code or a file containing Perl code that gets compiled into a subroutine with one argument: $event. This is a hashref. If the given value is a readable file, then pt-upgrade reads the entire file and uses its contents as the code. The file should not contain a shebang (#!/usr/bin/perl) line. If the code returns true, the chain of callbacks continues; otherwise it ends. The code is the last statement in the subroutine other than return $event. The subroutine template is: sub { $event = shift; filter && return $event; } Filters given on the command line are wrapped inside parentheses like like ( filter ). For complex, multi- line filters, you must put the code inside a file so it will not be wrapped inside parentheses. Either way, the filter must produce syntactically valid code given the template. For example, an if-else branch given on the command line would not be valid: --filter ’if () { } else { }’ # WRONG Since it’s given on the command line, the if-else branch would be wrapped inside parentheses which is not syntactically valid. So to accomplish something more complex like this would require putting the code in a file, for example filter.txt: my $event_ok; if (...) { $event_ok=1; } else { $event_ok=0; } $event_ok Then specify --filter filter.txt to read the code from filter.txt. If the filter code won’t compile, pt-upgrade will die with an error. If the filter code does compile, an error may still occur at runtime if the code tries to do something wrong (like pattern match an undefined value). pt-upgrade does not provide any safeguards so code carefully! An example filter that discards everything but SELECT statements: --filter ’$event->{arg} =~ m/^select/i’ This is compiled into a subroutine like the following: sub { $event = shift; ( $event->{arg} =~ m/^select/i ) && return $event; } It is permissible for the code to have side effects (to alter $event). You can find an explanation of the structure of $event at http://code.google.com/p/maatkit/wiki/EventAttributes. -fingerprints Add query fingerprints to the standard query analysis report. This is mostly useful for debugging purposes. -float-precision type: int Round float, double and decimal values to this many places. This option helps eliminate false-positives caused by floating-point imprecision. -help Show help and exit. -host short form: -h; type: string 2.34. pt-upgrade 265
  • 270. Percona Toolkit Documentation, Release 2.1.1 Connect to host. -iterations type: int; default: 1 How many times to iterate through the collect-and-report cycle. If 0, iterate to infinity. See also –run-time. -limit type: string; default: 95%:20 Limit output to the given percentage or count. If the argument is an integer, report only the top N worst queries. If the argument is an integer followed by the % sign, report that percentage of the worst queries. If the percentage is followed by a colon and another integer, report the top percentage or the number specified by that integer, whichever comes first. -log type: string Print all output to this file when daemonized. -max-different-rows type: int; default: 10 Stop comparing rows for --compare-results-method rows after this many differences are found. -order-by type: string; default: differences:sum Sort events by this attribute and aggregate function. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -query type: string Execute and compare this single query; ignores files on command line. This option allows you to supply a single query on the command line. Any slowlogs also specified on the command line are ignored. -reports type: Hash; default: queries,differences,errors,statistics Print these reports. Valid reports are queries, differences, errors, and statistics. See “OUTPUT” for more information on the various parts of the report. -run-time type: time 266 Chapter 2. Tools
  • 271. Percona Toolkit Documentation, Release 2.1.1 How long to run before exiting. The default is to run forever (you can interrupt with CTRL-C). -set-vars type: string; default: wait_timeout=10000,query_cache_type=0 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -shorten type: int; default: 1024 Shorten long statements in reports. Shortens long statements, replacing the omitted portion with a /*... omitted ...*/ comment. This applies only to the output in reports. It prevents a large statement from causing difficulty in a report. The argument is the preferred length of the shortened statement. Not all statements can be shortened, but very large INSERT and similar statements often can; and so can IN() lists, although only the first such list in the statement will be shortened. If it shortens something beyond recognition, you can find the original statement in the log, at the offset shown in the report header (see “OUTPUT”). -socket short form: -S; type: string Socket file to use for connection. -temp-database type: string Use this database for creating temporary tables. If given, this database is used for creating temporary tables for the results comparison (see --compare). Otherwise, the current database (from the last event that specified its database) is used. -temp-table type: string; default: mk_upgrade Use this table for checksumming results. -user short form: -u; type: string User for login if not current user. -version Show version and exit. -zero-query-times Zero the query times in the report. 2.34.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the =, and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. 2.34. pt-upgrade 267
  • 272. Percona Toolkit Documentation, Release 2.1.1 • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u dsn: user; copy: yes User for login if not current user. 2.34.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-upgrade ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.34.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.34.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-upgrade. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool 268 Chapter 2. Tools
  • 273. Percona Toolkit Documentation, Release 2.1.1 • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.34.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.34.12 AUTHORS Daniel Nichter 2.34.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.34.14 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2009-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.34. pt-upgrade 269
2.34.15 VERSION

pt-upgrade 2.1.1

2.35 pt-variable-advisor

2.35.1 NAME

pt-variable-advisor - Analyze MySQL variables and advise on possible problems.

2.35.2 SYNOPSIS

Usage

pt-variable-advisor [OPTION...] [DSN]

pt-variable-advisor analyzes variables and advises on possible problems.

Get SHOW VARIABLES from localhost:

pt-variable-advisor localhost

Analyze SHOW VARIABLES output saved in vars.txt:

pt-variable-advisor --source-of-variables vars.txt

2.35.3 RISKS

The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.

pt-variable-advisor reads MySQL's configuration and examines it, and is thus very low risk. At the time of this release, we know of no bugs that could cause serious harm to users.

The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt-variable-advisor.

See also "BUGS" for more information on filing bugs and getting help.

2.35.4 DESCRIPTION

pt-variable-advisor examines SHOW VARIABLES for bad values and settings according to the "RULES" described below. It reports on variables that match the rules, so you can find bad settings in your MySQL server.

At the time of this release, pt-variable-advisor only examines SHOW VARIABLES, but other input sources, such as SHOW STATUS and SHOW SLAVE STATUS, are planned.
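For example, one way to analyze a server now and keep a copy of its variables for later review is sketched below; it assumes the standard mysql client is available and that pt-variable-advisor can parse the client's SHOW VARIABLES output (the host name and file name are placeholders):

# Capture the variables, then analyze the saved copy offline
mysql -h host1 -e "SHOW VARIABLES" > vars.txt
pt-variable-advisor --source-of-variables vars.txt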
  • 275. Percona Toolkit Documentation, Release 2.1.1 2.35.5 RULES These are the rules that pt-variable-advisor will apply to SHOW VARIABLES. Each rule has three parts: an ID, a severity, and a description. The rule’s ID is a short, unique name for the rule. It usually relates to the variable that the rule examines. If a variable is examined by several rules, then the rules’ IDs are numbered like “-1”, “-2”, “-N”. The rule’s severity is an indication of how important it is that this rule matched a query. We use NOTE, WARN, and CRIT to denote these levels. The rule’s description is a textual, human-readable explanation of what it means when a variable matches this rule. Depending on the verbosity of the report you generate, you will see more of the text in the description. By default, you’ll see only the first sentence, which is sort of a terse synopsis of the rule’s meaning. At a higher verbosity, you’ll see subsequent sentences. auto_increment severity: note Are you trying to write to more than one server in a dual-master or ring replication configuration? This is potentially very dangerous and in most cases is a serious mistake. Most people’s reasons for doing this are actually not valid at all. concurrent_insert severity: note Holes (spaces left by deletes) in MyISAM tables might never be reused. connect_timeout severity: note A large value of this setting can create a denial of service vulnerability. debug severity: crit Servers built with debugging capability should not be used in production because of the large performance impact. delay_key_write severity: warn MyISAM index blocks are never flushed until necessary. If there is a server crash, data corruption on MyISAM tables can be much worse than usual. flush severity: warn This option might decrease performance greatly. flush_time severity: warn This option might decrease performance greatly. have_bdb severity: note The BDB engine is deprecated. If you aren’t using it, you should disable it with the skip_bdb option. 2.35. pt-variable-advisor 271
  • 276. Percona Toolkit Documentation, Release 2.1.1 init_connect severity: note The init_connect option is enabled on this server. init_file severity: note The init_file option is enabled on this server. init_slave severity: note The init_slave option is enabled on this server. innodb_additional_mem_pool_size severity: warn This variable generally doesn’t need to be larger than 20MB. innodb_buffer_pool_size severity: warn The InnoDB buffer pool size is unconfigured. In a production environment it should always be configured explicitly, and the default 10MB size is not good. innodb_checksums severity: warn InnoDB checksums are disabled. Your data is not protected from hardware corruption or other errors! innodb_doublewrite severity: warn InnoDB doublewrite is disabled. Unless you use a filesystem that protects against partial page writes, your data is not safe! innodb_fast_shutdown severity: warn InnoDB’s shutdown behavior is not the default. This can lead to poor performance, or the need to perform crash recovery upon startup. innodb_flush_log_at_trx_commit-1 severity: warn InnoDB is not configured in strictly ACID mode. If there is a crash, some transactions can be lost. innodb_flush_log_at_trx_commit-2 severity: warn Setting innodb_flush_log_at_trx_commit to 0 has no performance benefits over setting it to 2, and more types of data loss are possible. If you are trying to change it from 1 for performance reasons, you should set it to 2 instead of 0. innodb_force_recovery 272 Chapter 2. Tools
  • 277. Percona Toolkit Documentation, Release 2.1.1 severity: warn InnoDB is in forced recovery mode! This should be used only temporarily when recovering from data corruption or other bugs, not for normal usage. innodb_lock_wait_timeout severity: warn This option has an unusually long value, which can cause system overload if locks are not being released. innodb_log_buffer_size severity: warn The InnoDB log buffer size generally should not be set larger than 16MB. If you are doing large BLOB operations, InnoDB is not really a good choice of engines anyway. innodb_log_file_size severity: warn The InnoDB log file size is set to its default value, which is not usable on production systems. innodb_max_dirty_pages_pct severity: note The innodb_max_dirty_pages_pct is lower than the default. This can cause overly aggressive flushing and add load to the I/O system. flush_time severity: warn This setting is likely to cause very bad performance every flush_time seconds. key_buffer_size severity: warn The key buffer size is unconfigured. In a production environment it should always be configured explicitly, and the default 8MB size is not good. large_pages severity: note Large pages are enabled. locked_in_memory severity: note The server is locked in memory with –memlock. log_warnings-1 severity: note Log_warnings is disabled, so unusual events such as statements unsafe for replication and aborted con- nections will not be logged to the error log. log_warnings-2 severity: note Log_warnings must be set greater than 1 to log unusual events such as aborted connections. low_priority_updates 2.35. pt-variable-advisor 273
  • 278. Percona Toolkit Documentation, Release 2.1.1 severity: note The server is running with non-default lock priority for updates. This could cause update queries to wait unexpectedly for read queries. max_binlog_size severity: note The max_binlog_size is smaller than the default of 1GB. max_connect_errors severity: note max_connect_errors should probably be set as large as your platform allows. max_connections severity: warn If the server ever really has more than a thousand threads running, then the system is likely to spend more time scheduling threads than really doing useful work. This variable’s value should be considered in light of your workload. myisam_repair_threads severity: note myisam_repair_threads > 1 enables multi-threaded repair, which is relatively untested and is still listed as beta-quality code in the official documentation. old_passwords severity: warn Old-style passwords are insecure. They are sent in plain text across the wire. optimizer_prune_level severity: warn The optimizer will use an exhaustive search when planning complex queries, which can cause the planning process to take a long time. port severity: note The server is listening on a non-default port. query_cache_size-1 severity: note The query cache does not scale to large sizes and can cause unstable performance when larger than 128MB, especially on multi-core machines. query_cache_size-2 severity: warn The query cache can cause severe performance problems when it is larger than 256MB, especially on multi-core machines. read_buffer_size-1 274 Chapter 2. Tools
  • 279. Percona Toolkit Documentation, Release 2.1.1 severity: note The read_buffer_size variable should generally be left at its default unless an expert determines it is necessary to change it. read_buffer_size-2 severity: warn The read_buffer_size variable should not be larger than 8MB. It should generally be left at its default unless an expert determines it is necessary to change it. Making it larger than 2MB can hurt performance significantly, and can make the server crash, swap to death, or just become extremely unstable. read_rnd_buffer_size-1 severity: note The read_rnd_buffer_size variable should generally be left at its default unless an expert determines it is necessary to change it. read_rnd_buffer_size-2 severity: warn The read_rnd_buffer_size variable should not be larger than 4M. It should generally be left at its default unless an expert determines it is necessary to change it. relay_log_space_limit severity: warn Setting relay_log_space_limit can cause replicas to stop fetching binary logs from their master immedi- ately. This could increase the risk that your data will be lost if the master crashes. If the replicas have encountered a limit on relay log space, then it is possible that the latest transactions exist only on the master and no replica has retrieved them. slave_net_timeout severity: warn This variable is set too high. This is too long to wait before noticing that the connection to the master has failed and retrying. This should probably be set to 60 seconds or less. It is also a good idea to use pt-heartbeat to ensure that the connection does not appear to time out when the master is simply idle. slave_skip_errors severity: crit You should not set this option. If replication is having errors, you need to find and resolve the cause of that; it is likely that your slave’s data is different from the master. You can find out with pt-table-checksum. sort_buffer_size-1 severity: note The sort_buffer_size variable should generally be left at its default unless an expert determines it is nec- essary to change it. sort_buffer_size-2 severity: note The sort_buffer_size variable should generally be left at its default unless an expert determines it is nec- essary to change it. Making it larger than a few MB can hurt performance significantly, and can make the server crash, swap to death, or just become extremely unstable. sql_notes 2.35. pt-variable-advisor 275
  • 280. Percona Toolkit Documentation, Release 2.1.1 severity: note This server is configured not to log Note level warnings to the error log. sync_frm severity: warn It is best to set sync_frm so that .frm files are flushed safely to disk in case of a server crash. tx_isolation-1 severity: note This server’s transaction isolation level is non-default. tx_isolation-2 severity: warn Most applications should use the default REPEATABLE-READ transaction isolation level, or in a few cases READ-COMMITTED. expire_log_days severity: warn Binary logs are enabled, but automatic purging is not enabled. If you do not purge binary logs, your disk will fill up. If you delete binary logs externally to MySQL, you will cause unwanted behaviors. Always ask MySQL to purge obsolete logs, never delete them externally. innodb_file_io_threads severity: note This option is useless except on Windows. innodb_data_file_path severity: note Auto-extending InnoDB files can consume a lot of disk space that is very difficult to reclaim later. Some people prefer to set innodb_file_per_table and allocate a fixed-size file for ibdata1. innodb_flush_method severity: note Most production database servers that use InnoDB should set innodb_flush_method to O_DIRECT to avoid double-buffering, unless the I/O system is very low performance. innodb_locks_unsafe_for_binlog severity: warn This option makes point-in-time recovery from binary logs, and replication, untrustworthy if statement- based logging is used. innodb_support_xa severity: warn MySQL’s internal XA transaction support between InnoDB and the binary log is disabled. The binary log might not match InnoDB’s state after crash recovery, and replication might drift out of sync due to out-of-order statements in the binary log. log_bin 276 Chapter 2. Tools
  • 281. Percona Toolkit Documentation, Release 2.1.1 severity: warn Binary logging is disabled, so point-in-time recovery and replication are not possible. log_output severity: warn Directing log output to tables has a high performance impact. max_relay_log_size severity: note A custom max_relay_log_size is defined. myisam_recover_options severity: warn myisam_recover_options should be set to some value such as BACKUP,FORCE to ensure that table cor- ruption is noticed. storage_engine severity: note The server is using a non-standard storage engine as default. sync_binlog severity: warn Binary logging is enabled, but sync_binlog isn’t configured so that every transaction is flushed to the binary log for durability. tmp_table_size severity: note The effective minimum size of in-memory implicit temporary tables used internally during query execu- tion is min(tmp_table_size, max_heap_table_size), so max_heap_table_size should be at least as large as tmp_table_size. old mysql version severity: warn These are the recommended minimum version for each major release: 3.23, 4.1.20, 5.0.37, 5.1.30. end-of-life mysql version severity: note Every release older than 5.1 is now officially end-of-life. 2.35.6 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string 2.35. pt-variable-advisor 277
  • 282. Percona Toolkit Documentation, Release 2.1.1 Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -daemonize Fork to the background and detach from the shell. POSIX operating systems only. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -help Show help and exit. -host short form: -h; type: string Connect to host. -ignore-rules type: hash Ignore these rule IDs. Specify a comma-separated list of rule IDs (e.g. LIT.001,RES.002,etc.) to ignore. -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file when daemonized. The file contains the process ID of the daemonized instance. The PID file is removed when the daemonized instance exits. The program checks for the existence of the PID file when starting; if it exists and the process with the matching PID exists, the program exits. -port short form: -P; type: int Port number to use for connection. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -source-of-variables type: string; default: mysql Read SHOW VARIABLES from this source. Possible values are “mysql”, “none” or a file name. If “mysql” is specified then you must also specify a DSN on the command line. 278 Chapter 2. Tools
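A sketch of how these options might be combined; the host name and file name are placeholders, and the IDs passed to --ignore-rules are assumed to match the rule IDs listed under "RULES" above:

# Suppress two rules that have been reviewed and accepted for this server
pt-variable-advisor --ignore-rules sync_binlog,query_cache_size-1 h=prod-db1

# Analyze a saved variables file and show more of each rule's description
pt-variable-advisor --source-of-variables vars.txt --verbose --verbose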
  • 283. Percona Toolkit Documentation, Release 2.1.1 -user short form: -u; type: string User for login if not current user. -verbose short form: -v; cumulative: yes; default: 1 Increase verbosity of output. At the default level of verbosity, the program prints only the first sentence of each rule’s description. At higher levels, the program prints more of the description. -version Show version and exit. 2.35.7 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h dsn: host; copy: yes Connect to host. • p dsn: password; copy: yes Password to use when connecting. • P dsn: port; copy: yes Port number to use for connection. • S dsn: mysql_socket; copy: yes Socket file to use for connection. • u 2.35. pt-variable-advisor 279
  • 284. Percona Toolkit Documentation, Release 2.1.1 dsn: user; copy: yes User for login if not current user. 2.35.8 ENVIRONMENT The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like: PTDEBUG=1 pt-variable-advisor ... > FILE 2>&1 Be careful: debugging output is voluminous and can generate several megabytes of output. 2.35.9 SYSTEM REQUIREMENTS You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl. 2.35.10 BUGS For a list of known bugs, see http://www.percona.com/bugs/pt-variable-advisor. Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report: • Complete command-line used to run the tool • Tool --version • MySQL version of all servers involved • Output from the tool including STDERR • Input files (log/dump/config files, etc.) If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”. 2.35.11 DOWNLOADING Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line: wget percona.com/get/percona-toolkit.tar.gz wget percona.com/get/percona-toolkit.rpm wget percona.com/get/percona-toolkit.deb You can also get individual tools from the latest release: wget percona.com/get/TOOL Replace TOOL with the name of any tool. 2.35.12 AUTHORS Baron Schwartz and Daniel Nichter 280 Chapter 2. Tools
  • 285. Percona Toolkit Documentation, Release 2.1.1 2.35.13 ABOUT PERCONA TOOLKIT This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona. 2.35.14 COPYRIGHT, LICENSE, AND WARRANTY This program is copyright 2010-2012 Percona Inc. Feedback and improvements are welcome. THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, IN- CLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. 2.35.15 VERSION pt-variable-advisor 2.1.1 2.36 pt-visual-explain 2.36.1 NAME pt-visual-explain - Format EXPLAIN output as a tree. 2.36.2 SYNOPSIS Usage pt-visual-explain [OPTION...] [FILE...] pt-visual-explain transforms EXPLAIN output into a tree representation of the query plan. If FILE is given, input is read from the file(s). With no FILE, or when FILE is -, read standard input. Examples pt-visual-explain <file_containing_explain_output> pt-visual-explain -c <file_containing_query> mysql -e "explain select * from mysql.user" | pt-visual-explain 2.36. pt-visual-explain 281
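Another common pattern, sketched here with a placeholder host, is to save EXPLAIN output to a file first and format it afterwards:

# Capture EXPLAIN output from a server, then render it as a tree
mysql -h host1 -e "explain select * from sakila.film_actor join sakila.film using(film_id)" > explain.txt
pt-visual-explain explain.txt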
  • 286. Percona Toolkit Documentation, Release 2.1.1 2.36.3 RISKS The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs. pt-visual-explain is read-only and very low-risk. At the time of this release, we know of no bugs that could cause serious harm to users. The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.percona.com/bugs/pt- visual-explain. See also “BUGS” for more information on filing bugs and getting help. 2.36.4 DESCRIPTION pt-visual-explain reverse-engineers MySQL’s EXPLAIN output into a query execution plan, which it then formats as a left-deep tree – the same way the plan is represented inside MySQL. It is possible to do this by hand, or to read EXPLAIN’s output directly, but it requires patience and expertise. Many people find a tree representation more understandable. You can pipe input into pt-visual-explain or specify a filename at the command line, including the magical ‘-‘ file- name, which will read from standard input. It can do two things with the input: parse it for something that looks like EXPLAIN output, or connect to a MySQL instance and run EXPLAIN on the input. When parsing its input, pt-visual-explain understands three formats: tabular like that shown in the mysql command- line client, vertical like that created by using the G line terminator in the mysql command-line client, and tab separated. It ignores any lines it doesn’t know how to parse. When executing the input, pt-visual-explain replaces everything in the input up to the first SELECT keyword with ‘EXPLAIN SELECT,’ and then executes the result. You must specify --connect to execute the input as a query. Either way, it builds a tree from the result set and prints it to standard output. For the following query, select * from sakila.film_actor join sakila.film using(film_id); pt-visual-explain generates this query plan: JOIN +- Bookmark lookup | +- Table | | table film_actor | | possible_keys idx_fk_film_id | +- Index lookup | key film_actor->idx_fk_film_id | possible_keys idx_fk_film_id | key_len 2 | ref sakila.film.film_id | rows 2 +- Table scan rows 952 +- Table table film possible_keys PRIMARY The query plan is left-deep, depth-first search, and the tree’s root is the output node – the last step in the execution plan. In other words, read it like this: 282 Chapter 2. Tools
  • 287. Percona Toolkit Documentation, Release 2.1.1 1 Table scan the ‘film’ table, which accesses an estimated 952 rows. 2 For each row, find matching rows by doing an index lookup into the film_actor->idx_fk_film_id index with the value from sakila.film.film_id, then a bookmark lookup into the film_actor table. For more information on how to read EXPLAIN output, please see http://dev.mysql.com/doc/en/explain.html, and this talk titled “Query Optimizer Internals and What’s New in the MySQL 5.2 Optimizer,” from Timour Katchaounov, one of the MySQL developers: http://maatkit.org/presentations/katchaounov_timour.pdf. 2.36.5 MODULES This program is actually a runnable module, not just an ordinary Perl script. In fact, there are two modules embedded in it. This makes unit testing easy, but it also makes it easy for you to use the parsing and tree-building functionality if you want. The ExplainParser package accepts a string and parses whatever it thinks looks like EXPLAIN output from it. The synopsis is as follows: require "pt-visual-explain"; my $p = ExplainParser->new(); my $rows = $p->parse("some text"); # $rows is an arrayref of hashrefs. The ExplainTree package accepts a set of rows and turns it into a tree. For convenience, you can also have it delegate to ExplainParser and parse text for you. Here’s the synopsis: require "pt-visual-explain"; my $e = ExplainTree->new(); my $tree = $e->parse("some text", %options); my $output = $e->pretty_print($tree); print $tree; 2.36.6 ALGORITHM This section explains the algorithm that converts EXPLAIN into a tree. You may be interested in reading this if you want to understand EXPLAIN more fully, or trying to figure out how this works, but otherwise this section will probably not make your life richer. The tree can be built by examining the id, select_type, and table columns of each row. Here’s what I know about them: The id column is the sequential number of the select. This does not indicate nesting; it just comes from counting SELECT from the left of the SQL statement. It’s like capturing parentheses in a regular expression. A UNION RESULT row doesn’t have an id, because it isn’t a SELECT. The source code actually refers to UNIONs as a fake_lex, as I recall. If two adjacent rows have the same id value, they are joined with the standard single-sweep multi-join method. The select_type column tells a) that a new sub-scope has opened b) what kind of relationship the row has to the previous row c) what kind of operation the row represents. • SIMPLE means there are no subqueries or unions in the whole query. • PRIMARY means there are, but this is the outermost SELECT. • [DEPENDENT] UNION means this result is UNIONed with the previous result (not row; a result might encom- pass more than one row). 2.36. pt-visual-explain 283
  • 288. Percona Toolkit Documentation, Release 2.1.1 • UNION RESULT terminates a set of UNIONed results. • [DEPENDENT|UNCACHEABLE] SUBQUERY means a new sub-scope is opening. This is the kind of sub- query that happens in a WHERE clause, SELECT list or whatnot; it does not return a so-called “derived table.” • DERIVED is a subquery in the FROM clause. Tables that are JOINed all have the same select_type. For example, if you JOIN three tables inside a dependent subquery, they’ll all say the same thing: DEPENDENT SUBQUERY. The table column usually specifies the table name or alias, but may also say <derivedN> or <unionN,N...N>. If it says <derivedN>, the row represents an access to the temporary table that holds the result of the subquery whose id is N. If it says <unionN,..N> it’s the same thing, but it refers to the results it UNIONs together. Finally, order matters. If a row’s id is less than the one before it, I think that means it is dependent on something other than the one before it. For example, explain select (select 1 from sakila.film), (select 2 from sakila.film_actor), (select 3 from sakila.actor); | id | select_type | table | +----+-------------+------------+ | 1 | PRIMARY | NULL | | 4 | SUBQUERY | actor | | 3 | SUBQUERY | film_actor | | 2 | SUBQUERY | film | If the results were in order 2-3-4, I think that would mean 3 is a subquery of 2, 4 is a subquery of 3. As it is, this means 4 is a subquery of the nearest previous recent row with a smaller id, which is 1. Likewise for 3 and 2. This structure is hard to programmatically build into a tree for the same reason it’s hard to understand by inspection: there are both forward and backward references. <derivedN> is a forward reference to selectN, while <unionM,N> is a backward reference to selectM and selectN. That makes recursion and other tree-building algorithms hard to get right (NOTE: after implementation, I now see how it would be possible to deal with both forward and backward references, but I have no motivation to change something that works). Consider the following: select * from ( select 1 from sakila.actor as actor_1 union select 1 from sakila.actor as actor_2 ) as der_1 union select * from ( select 1 from sakila.actor as actor_3 union all select 1 from sakila.actor as actor_4 ) as der_2; | id | select_type | table | +------+--------------+------------+ | 1 | PRIMARY | <derived2> | | 2 | DERIVED | actor_1 | | 3 | UNION | actor_2 | | NULL | UNION RESULT | <union2,3> | | 4 | UNION | <derived5> | | 5 | DERIVED | actor_3 | | 6 | UNION | actor_4 | | NULL | UNION RESULT | <union5,6> | 284 Chapter 2. Tools
  • 289. Percona Toolkit Documentation, Release 2.1.1 | NULL | UNION RESULT | <union1,4> | This would be a lot easier to work with if it looked like this (I’ve bracketed the id on rows I moved): | id | select_type | table | +------+--------------+------------+ | [1] | UNION RESULT | <union1,4> | | 1 | PRIMARY | <derived2> | | [2] | UNION RESULT | <union2,3> | | 2 | DERIVED | actor_1 | | 3 | UNION | actor_2 | | 4 | UNION | <derived5> | | [5] | UNION RESULT | <union5,6> | | 5 | DERIVED | actor_3 | | 6 | UNION | actor_4 | In fact, why not re-number all the ids, so the PRIMARY row becomes 2, and so on? That would make it even easier to read. Unfortunately that would also have the effect of destroying the meaning of the id column, which I think is important to preserve in the final tree. Also, though it makes it easier to read, it doesn’t make it easier to manipulate programmatically; so it’s fine to leave them numbered as they are. The goal of re-ordering is to make it easier to figure out which rows are children of which rows in the execution plan. Given the reordered list and some row whose table is <union...> or <derived>, it is easy to find the beginning of the slice of rows that should be child nodes in the tree: you just look for the first row whose ID is the same as the first number in the table. The next question is how to find the last row that should be a child node of a UNION or DERIVED. I’ll start with DERIVED, because the solution makes UNION easy. Consider how MySQL numbers the SELECTs sequentially according to their position in the SQL, left-to-right. Since a DERIVED table encloses everything within it in a scope, which becomes a temporary table, there are only two things to think about: its child subqueries and unions (if any), and its next siblings in the scope that encloses it. Its children will all have an id greater than it does, by definition, so any later rows with a smaller id terminate the scope. Here’s an example. The middle derived table here has a subquery and a UNION to make it a little more complex for the example. explain select 1 from ( select film_id from sakila.film limit 1 ) as der_1 join ( select film_id, actor_id, (select count(*) from sakila.rental) as r from sakila.film_actor limit 1 union all select 1, 1, 1 from sakila.film_actor as dummy ) as der_2 using (film_id) join ( select actor_id from sakila.actor limit 1 ) as der_3 using (actor_id); Here’s the output of EXPLAIN: | id | select_type | table | | 1 | PRIMARY | <derived2> | | 1 | PRIMARY | <derived6> | | 1 | PRIMARY | <derived3> | | 6 | DERIVED | actor | | 3 | DERIVED | film_actor | 2.36. pt-visual-explain 285
  • 290. Percona Toolkit Documentation, Release 2.1.1 | 4 | SUBQUERY | rental | | 5 | UNION | dummy | | NULL | UNION RESULT | <union3,5> | | 2 | DERIVED | film | The siblings all have id 1, and the middle one I care about is derived3. (Notice MySQL doesn’t execute them in the order I defined them, which is fine). Now notice that MySQL prints out the rows in the opposite order I defined the subqueries: 6, 3, 2. It always seems to do this, and there might be other methods of finding the scope boundaries including looking for the lower boundary of the next largest sibling, but this is a good enough heuristic. I am forced to rely on it for non-DERIVED subqueries, so I rely on it here too. Therefore, I decide that everything greater than or equal to 3 belongs to the DERIVED scope. The rule for UNION is simple: they consume the entire enclosing scope, and to find the component parts of each one, you find each part’s beginning as referred to in the <unionN,...> definition, and its end is either just before the next one, or if it’s the last part, the end is the end of the scope. This is only simple because UNION consumes the entire scope, which is either the entire statement, or the scope of a DERIVED table. This is because a UNION cannot be a sibling of another UNION or a table, DERIVED or not. (Try writing such a statement if you don’t see it intuitively). Therefore, you can just find the enclosing scope’s boundaries, and the rest is easy. Notice in the example above, the UNION is over <union3,5>, which includes the row with id 4 – it includes every row between 3 and 5. Finally, there are non-derived subqueries to deal with as well. In this case I can’t look at siblings to find the end of the scope as I did for DERIVED. I have to trust that MySQL executes depth-first. Here’s an example: explain select actor_id, ( select count(film_id) + (select count(*) from sakila.film) from sakila.film join sakila.film_actor using(film_id) where exists( select * from sakila.actor where sakila.actor.actor_id = sakila.film_actor.actor_id ) ) from sakila.actor; | id | select_type | table | | 1 | PRIMARY | actor | | 2 | SUBQUERY | film | | 2 | SUBQUERY | film_actor | | 4 | DEPENDENT SUBQUERY | actor | | 3 | SUBQUERY | film | In order, the tree should be built like this: • See row 1. • See row 2. It’s a higher id than 1, so it’s a subquery, along with every other row whose id is greater than 2. • Inside this scope, see 2 and 2 and JOIN them. See 4. It’s a higher id than 2, so it’s again a subquery; recurse. After that, see 3, which is also higher; recurse. But the only reason the nested subquery didn’t include select 3 is because select 4 came first. In other words, if EXPLAIN looked like this, | id | select_type | table | | 1 | PRIMARY | actor | | 2 | SUBQUERY | film | 286 Chapter 2. Tools
  • 291. Percona Toolkit Documentation, Release 2.1.1 | 2 | SUBQUERY | film_actor | | 3 | SUBQUERY | film | | 4 | DEPENDENT SUBQUERY | actor | I would be forced to assume upon seeing select 3 that select 4 is a subquery of it, rather than just being the next sibling in the enclosing scope. If this is ever wrong, then the algorithm is wrong, and I don’t see what could be done about it. UNION is a little more complicated than just “the entire scope is a UNION,” because the UNION might itself be inside an enclosing scope that’s only indicated by the first item inside the UNION. There are only three kinds of enclosing scopes: UNION, DERIVED, and SUBQUERY. A UNION can’t enclose a UNION, and a DERIVED has its own “scope markers,” but a SUBQUERY can wholly enclose a UNION, like this strange example on the empty table t1: explain select * from t1 where not exists( (select t11.i from t1 t11) union (select t12.i from t1 t12)); | id | select_type | table | Extra | +------+--------------+------------+--------------------------------+ | 1 | PRIMARY | t1 | const row not found | | 2 | SUBQUERY | NULL | No tables used | | 3 | SUBQUERY | NULL | no matching row in const table | | 4 | UNION | t12 | const row not found | | NULL | UNION RESULT | <union2,4> | | The UNION’s backward references might make it look like the UNION encloses the subquery, but studying the query makes it clear this isn’t the case. So when a UNION’s first row says SUBQUERY, it is this special case. By the way, I don’t fully understand this query plan; there are 4 numbered SELECT in the plan, but only 3 in the query. The parens around the UNIONs are meaningful. Removing them will make the EXPLAIN different. Please tell me how and why this works if you know. Armed with this knowledge, it’s possible to use recursion to turn the parent-child relationship between all the rows into a tree representing the execution plan. MySQL prints the rows in execution order, even the forward and backward references. At any given scope, the rows are processed as a left-deep tree. MySQL does not do “bushy” execution plans. It begins with a table, finds a matching row in the next table, and continues till the last table, when it emits a row. When it runs out, it backtracks till it can find the next row and repeats. There are subtleties of course, but this is the basic plan. This is why MySQL transforms all RIGHT OUTER JOINs into LEFT OUTER JOINs and cannot do FULL OUTER JOIN. This means in any given scope, say | id | select_type | table | | 1 | SIMPLE | tbl1 | | 1 | SIMPLE | tbl2 | | 1 | SIMPLE | tbl3 | The execution plan looks like a depth-first traversal of this tree: JOIN / JOIN tbl3 / tbl1 tbl2 The JOIN might not be a JOIN. It might be a subquery, for example. This comes from the type column of EXPLAIN. The documentation says this is a “join type,” but I think “access type” is more accurate, because it’s “how MySQL accesses rows.” pt-visual-explain decorates the tree significantly more than just turning rows into nodes. Each node may get a series of transformations that turn it into a subtree of more than one node. For example, an index scan not marked with 2.36. pt-visual-explain 287
  • 292. Percona Toolkit Documentation, Release 2.1.1 ‘Using index’ must do a bookmark lookup into the table rows; that is a three-node subtree. However, after the above node-ordering and scoping stuff, the rest of the process is pretty simple. 2.36.7 OPTIONS This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details. -ask-pass Prompt for a password when connecting to MySQL. -charset short form: -A; type: string Default character set. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL. -clustered-pk Assume that PRIMARY KEY index accesses don’t need to do a bookmark lookup to retrieve rows. This is the case for InnoDB. -config type: Array Read this comma-separated list of config files; if specified, this must be the first option on the command line. -connect Treat input as a query, and obtain EXPLAIN output by connecting to a MySQL instance and running EXPLAIN on the query. When this option is given, pt-visual-explain uses the other connection-specific options such as --user to connect to the MySQL instance. If you have a .my.cnf file, it will read it, so you may not need to specify any connection-specific options. -database short form: -D; type: string Connect to this database. -defaults-file short form: -F; type: string Only read mysql options from the given file. You must give an absolute pathname. -format type: string; default: tree Set output format. The default is a terse pretty-printed tree. The valid values are: Value Meaning ===== ================================================ tree Pretty-printed terse tree. dump Data::Dumper output (see Data::Dumper for more). -help Show help and exit. -host short form: -h; type: string Connect to host. 288 Chapter 2. Tools
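A sketch of the connection-related options above (the host, user, and file names are placeholders): --connect makes the tool run EXPLAIN itself on the query it reads, --clustered-pk suppresses bookmark-lookup nodes for InnoDB primary-key access, and --format dump prints the parsed structure instead of a tree.

# Run EXPLAIN on the query in query.sql and print the plan as a tree
pt-visual-explain --connect query.sql --host host1 --user appuser --ask-pass --clustered-pk

# Same input, but dump the raw parsed data structure
pt-visual-explain --connect query.sql --host host1 --user appuser --ask-pass --format dump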
  • 293. Percona Toolkit Documentation, Release 2.1.1 -password short form: -p; type: string Password to use when connecting. -pid type: string Create the given PID file. The file contains the process ID of the script. The PID file is removed when the script exits. Before starting, the script checks if the PID file already exists. If it does not, then the script creates and writes its own PID to it. If it does, then the script checks the following: if the file contains a PID and a process is running with that PID, then the script dies; or, if there is no process running with that PID, then the script overwrites the file with its own PID and starts; else, if the file contains no PID, then the script dies. -port short form: -P; type: int Port number to use for connection. -set-vars type: string; default: wait_timeout=10000 Set these MySQL variables. Immediately after connecting to MySQL, this string will be appended to SET and executed. -socket short form: -S; type: string Socket file to use for connection. -user short form: -u; type: string User for login if not current user. -version Show version and exit. 2.36.8 DSN OPTIONS These DSN options are used to create a DSN. Each option is given like option=value. The options are case- sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full details. • A dsn: charset; copy: yes Default character set. • D dsn: database; copy: yes Default database. • F dsn: mysql_read_default_file; copy: yes Only read default options from the given file • h 2.36. pt-visual-explain 289
   dsn: host; copy: yes
   Connect to host.

• p
   dsn: password; copy: yes
   Password to use when connecting.

• P
   dsn: port; copy: yes
   Port number to use for connection.

• S
   dsn: mysql_socket; copy: yes
   Socket file to use for connection.

• u
   dsn: user; copy: yes
   User for login if not current user.

2.36.9 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like:

   PTDEBUG=1 pt-visual-explain ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.

2.36.10 SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl.

2.36.11 BUGS

For a list of known bugs, see http://www.percona.com/bugs/pt-visual-explain.

Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:

• Complete command-line used to run the tool
• Tool --version
• MySQL version of all servers involved
• Output from the tool including STDERR
• Input files (log/dump/config files, etc.)

If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.
2.36.12 DOWNLOADING

Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the latest release from the command line:

   wget percona.com/get/percona-toolkit.tar.gz
   wget percona.com/get/percona-toolkit.rpm
   wget percona.com/get/percona-toolkit.deb

You can also get individual tools from the latest release:

   wget percona.com/get/TOOL

Replace TOOL with the name of any tool.

2.36.13 AUTHORS

Baron Schwartz

2.36.14 ABOUT PERCONA TOOLKIT

This tool is part of Percona Toolkit, a collection of advanced command-line tools developed by Percona for MySQL support and consulting. Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by Baron Schwartz and developed primarily by him and Daniel Nichter, both of whom are employed by Percona. Visit http://www.percona.com/software/ for more software developed by Percona.

2.36.15 COPYRIGHT, LICENSE, AND WARRANTY

This program is copyright 2007-2011 Baron Schwartz, 2011-2012 Percona Inc. Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

2.36.16 VERSION

pt-visual-explain 2.1.1
CHAPTER THREE

CONFIGURATION

3.1 CONFIGURATION FILES

Percona Toolkit tools can read options from configuration files. The configuration file syntax is simple and direct, and bears some resemblance to the MySQL command-line client tools.

The configuration files all follow the same conventions. Internally, what actually happens is that the lines are read from the file and then added as command-line options and arguments to the tool, so just think of the configuration files as a way to write your command lines.

3.1.1 SYNTAX

The syntax of the configuration files is as follows:

* Whitespace followed by a hash sign (#) signifies that the rest of the line is a comment. This is deleted. For example:

     option=value   # this part is a comment

* Whitespace is stripped from the beginning and end of all lines.
* Empty lines are ignored.
* Each line is permitted to be in either of the following formats:

     option
     option=value

  Do not prefix the option with --. Do not quote the values, even if they contain spaces; values are literal. Whitespace around the equals sign is deleted during processing.
* Only long options are recognized.
* A line containing only two hyphens signals the end of option parsing. Any further lines are interpreted as additional arguments (not options) to the program.
3.1.2 EXAMPLE

This config file for pt-stalk,

   # Config for pt-stalk
   variable=Threads_connected
   cycles=2  # trigger if problem seen twice in a row
   --
   --user daniel

is equivalent to this command line:

   pt-stalk --variable Threads_connected --cycles 2 -- --user daniel

Options after -- are passed literally to mysql and mysqladmin.

3.1.3 READ ORDER

The tools read several configuration files in order:

1. The global Percona Toolkit configuration file, /etc/percona-toolkit/percona-toolkit.conf. All tools read this file, so you should only add options to it that you want to apply to all tools.
2. The global tool-specific configuration file, /etc/percona-toolkit/TOOL.conf, where TOOL is a tool name like pt-query-digest. This file is named after the specific tool you’re using, so you can add options that apply only to that tool.
3. The user’s own Percona Toolkit configuration file, $HOME/.percona-toolkit.conf. All tools read this file, so you should only add options to it that you want to apply to all tools.
4. The user’s tool-specific configuration file, $HOME/.TOOL.conf, where TOOL is a tool name like pt-query-digest. This file is named after the specific tool you’re using, so you can add options that apply only to that tool.

3.1.4 SPECIFYING

There is a special --config option, which lets you specify which configuration files Percona Toolkit should read. You specify a comma-separated list of files. However, its behavior is not like other command-line options. It must be given first on the command line, before any other options. If you try to specify it anywhere else, it will cause an error. Also, you cannot specify --config=/path/to/file; you must specify the option and the path to the file separated by whitespace without an equal sign between them, like:

   --config /path/to/file

If you don’t want any configuration files at all, specify --config '' to provide an empty list of files.

3.2 DSN (DATA SOURCE NAME) SPECIFICATIONS

Percona Toolkit tools use DSNs to specify how to create a DBD connection to a MySQL server. A DSN is a comma-separated string of key=value parts, like:

   h=host1,P=3306,u=bob
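To make the notation concrete, here is a brief illustrative sketch. The host names, the user bob, and the choice of tools are placeholders rather than examples taken from any tool’s manual; tools that accept DSNs simply take them as ordinary command-line arguments:

   # Compare MySQL configuration variables on two servers, connecting as "bob"
   # and prompting for the password:
   pt-config-diff h=host1,u=bob h=host2,u=bob --ask-pass

   # Discover the replication hierarchy under a master listening on a
   # non-default port:
   pt-slave-find h=host1,P=3307,u=bob --ask-pass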
The standard key parts are shown below, but some tools add additional key parts. See each tool’s documentation for details.

Some tools do not use DSNs but still connect to MySQL using options like --host, --user, and --password. Such tools use these options to create a DSN automatically, behind the scenes. Other tools use both DSNs and options like the ones above. The options provide defaults for all DSNs that do not specify the option’s corresponding key part. For example, if DSN h=host1 and option --port=12345 are specified, then the tool automatically adds P=12345 to the DSN.

3.2.1 KEY PARTS

Many of the tools add more parts to DSNs for special purposes, and sometimes override parts to make them do something slightly different. However, all the tools support at least the following:

A
   Specifies the default character set for the connection.
   Enables character set settings in Perl and MySQL. If the value is utf8, sets Perl’s binmode on STDOUT to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET NAMES UTF8 after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs SET NAMES after connecting to MySQL.
   Unfortunately, there is no way from within Perl itself to specify the client library’s character set. SET NAMES only affects the server; if the client library’s settings don’t match, there could be problems. You can use the defaults file to specify the client library’s character set, however. See the description of the F part below.

D
   Specifies the connection’s default database.

F
   Specifies a defaults file the mysql client library (the C client library used by DBD::mysql, not Percona Toolkit itself) should read. The tools all read the [client] section within the defaults file. If you omit this, the standard defaults files will be read in the usual order. “Standard” varies from system to system, because the filenames to read are compiled into the client library. On Debian systems, for example, it’s usually /etc/mysql/my.cnf then ~/.my.cnf. If you place the following into ~/.my.cnf, tools will Do The Right Thing:

      [client]
      user=your_user_name
      pass=secret

   Omitting the F part is usually the right thing to do. As long as you have configured your ~/.my.cnf correctly, that will result in tools connecting automatically without needing a username or password.
   You can also specify a default character set in the defaults file. Unlike the “A” part described above, this will actually instruct the client library (DBD::mysql) to change the character set it uses internally, which cannot be accomplished any other way as far as I know, except for utf8.

h
   Hostname or IP address for the connection.

p
   Password to use when connecting.

P
   Port number to use for the connection. Note that the usual special-case behaviors apply: if you specify localhost as your hostname on Unix systems, the connection actually uses a socket file, not a TCP/IP connection, and thus ignores the port.

S
   Socket file to use for the connection (on Unix systems).

u
   User for login if not current user.

3.2.2 BAREWORD

Many of the tools will let you specify a DSN as a single word, without any key=value syntax. This is called a ‘bareword’. How this is handled is tool-specific, but it is usually interpreted as the “h” part. The tool’s --help output will tell you the behavior for that tool.

3.2.3 PROPAGATION

Many tools will let you propagate values from one DSN to the next, so you don’t have to specify all the parts for each DSN. For example, if you want to specify a username and password for each DSN, you can connect to three hosts as follows:

   h=host1,u=fred,p=wilma host2 host3

This is tool-specific.

3.3 ENVIRONMENT

The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture all output to a file, run the tool like:

   PTDEBUG=1 pt-table-checksum ... > FILE 2>&1

Be careful: debugging output is voluminous and can generate several megabytes of output.

3.4 SYSTEM REQUIREMENTS

Most tools require:

* Perl v5.8 or newer
* Bash v3 or newer
* Core Perl modules like Time::HiRes

Tools that connect to MySQL require:

* Perl modules DBI and DBD::mysql
* MySQL 5.0 or newer

Percona Toolkit is only tested on UNIX systems, primarily Debian and Red Hat derivatives; other operating systems are not supported.
Tools that connect to MySQL may work with MySQL v4.1, but this is not tested or supported.
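If you are unsure whether a host meets these requirements, the interpreter and module versions can be queried directly from the shell. The following commands are a generic sanity check, not part of Percona Toolkit itself:

   # Perl version (needs v5.8 or newer) and Bash version (needs v3 or newer)
   perl -v | head -n 2
   bash --version | head -n 1

   # Verify that the DBI and DBD::mysql Perl modules are installed;
   # each command prints the module version, or an error if it is missing.
   perl -MDBI -e 'print "DBI $DBI::VERSION\n"'
   perl -MDBD::mysql -e 'print "DBD::mysql $DBD::mysql::VERSION\n"'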
CHAPTER FOUR

MISCELLANEOUS

4.1 BUGS

Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:

* Complete command-line used to run the tool
* Tool --version
* MySQL version of all servers involved
* Output from the tool including STDERR
* Input files (log/dump/config files, etc.)

If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.

4.2 AUTHORS

Percona Toolkit is primarily developed by Baron Schwartz and Daniel Nichter, both of whom are employed by Percona Inc. See each program’s documentation for details.

4.3 COPYRIGHT, LICENSE, AND WARRANTY

Percona Toolkit is copyright 2011-2012 Percona Inc. and others. See each program’s documentation for complete copyright notices.

THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
4.4 VERSION

Percona Toolkit v2.1.1 released 2012-04-03

4.5 Release Notes

4.5.1 v2.1.1 released 2012-04-03

Percona Toolkit 2.1.1 has been released. This is the first release in the new 2.1 series which supersedes the 2.0 series. We will continue to fix bugs in 2.0, but 2.1 is now the focus of development. 2.1 introduces a lot of new code for:

• pt-online-schema-change (completely redesigned)
• pt-mysql-summary (completely redesigned)
• pt-summary (completely redesigned)
• pt-fingerprint (new tool)
• pt-table-usage (new tool)

There were also several bug fixes. The redesigned tools are meant to replace their 2.0 counterparts because the 2.1 versions have the same or more functionality and they are simpler and more reliable. pt-online-schema-change was particularly enhanced to be as safe as possible given that the tool is inherently risky.

Percona Toolkit packages can be downloaded from http://www.percona.com/downloads/percona-toolkit/ or the Percona Software Repositories (http://www.percona.com/software/repositories/).

Changelog

• Completely redesigned pt-online-schema-change
• Completely redesigned pt-mysql-summary
• Completely redesigned pt-summary
• Added new tool: pt-table-usage
• Added new tool: pt-fingerprint
• Fixed bug 955860: pt-stalk doesn’t run vmstat, iostat, and mpstat for --run-time
• Fixed bug 960513: SHOW TABLE STATUS is used needlessly
• Fixed bug 969726: pt-online-schema-change loses foreign keys
• Fixed bug 846028: pt-online-schema-change does not show progress until completed
• Fixed bug 898695: pt-online-schema-change add useless ORDER BY
• Fixed bug 952727: pt-diskstats shows incorrect wr_mb_s
• Fixed bug 963225: pt-query-digest fails to set history columns for disk tmp tables and disk filesort
• Fixed bug 967451: Char chunking doesn’t quote column name
• Fixed bug 972399: pt-table-checksum docs are not rendered right
• Fixed bug 896553: Various documentation spelling fixes
• Fixed bug 949154: pt-variable-advisor advice for relay-log-space-limit
• Fixed bug 953461: pt-upgrade manual broken ‘output’ section
• Fixed bug 949653: pt-table-checksum docs don’t mention risks posed by inconsistent schemas

4.5.2 v2.0.4 released 2012-03-07

Percona Toolkit 2.0.4 has been released. 23 bugs were fixed in this release, and three new features were implemented. First, --filter was added to pt-kill which allows for arbitrary --group-by. Second, pt-online-schema-change now requires that its new --execute option be given, else the tool will just check the tables and exit. This is a safeguard to encourage users to read the documentation, particularly when replication is involved. Third, pt-stalk also received a new option: --[no]stalk. To collect immediately without stalking, specify --no-stalk and the tool will collect once and exit.

This release is completely backwards compatible with previous 2.0 releases. Given the number of bug fixes, it’s worth upgrading to 2.0.4.

Changelog

• Added --filter to pt-kill to allow arbitrary --group-by
• Added --[no]stalk to pt-stalk (bug 932331)
• Added --execute to pt-online-schema-change (bug 933232)
• Fixed bug 873598: pt-online-schema-change doesn’t like reserved words in column names
• Fixed bug 928966: pt-pmp still uses insecure /tmp
• Fixed bug 933232: pt-online-schema-change can break replication
• Fixed bug 941225: Use of qw(...) as parentheses is deprecated at pt-kill line 3511
• Fixed bug 821694: pt-query-digest doesn’t recognize hex InnoDB txn IDs
• Fixed bug 894255: pt-kill shouldn’t check if STDIN is a tty when --daemonize is given
• Fixed bug 916999: pt-table-checksum error: DBD::mysql::st execute failed: called with 2 bind variables when 6 are needed
• Fixed bug 926598: DBD::mysql bug causes pt-upgrade to use wrong precision (M) and scale (D)
• Fixed bug 928226: pt-diskstats illegal division by zero
• Fixed bug 928415: Typo in pt-stalk doc: --trigger should be --function
• Fixed bug 930317: pt-archiver doc refers to nonexistent pt-query-profiler
• Fixed bug 930533: pt-sift looking for *-processlist1; broken compatibility with pt-stalk
• Fixed bug 932331: pt-stalk cannot collect without stalking
• Fixed bug 932442: pt-table-checksum error when column name has two spaces
• Fixed bug 932883: File Debian bug after each release
• Fixed bug 940503: pt-stalk disk space checks wrong on 32bit platforms
• Fixed bug 944420: --daemonize doesn’t always close STDIN
• Fixed bug 945834: pt-sift invokes pt-diskstats with deprecated argument
• Fixed bug 945836: pt-sift prints awk error if there are no stack traces to aggregate
• Fixed bug 945842: pt-sift generates wrong state sum during processlist analysis
• Fixed bug 946438: pt-query-digest should print a better message when an unsupported log format is specified
• Fixed bug 946776: pt-table-checksum ignores --lock-wait-timeout
• Fixed bug 940440: Bad grammar in pt-kill docs

4.5.3 v2.0.3 released 2012-02-03

Percona Toolkit 2.0.3 has been released. The development team was very busy last month making this release significant: two completely redesigned and improved tools, pt-diskstats and pt-stalk, and 20 bug fixes.

Both pt-diskstats and pt-stalk were redesigned and rewritten from the ground up. This allowed us to greatly improve these tools’ functionality and increase testing for them. The accuracy and output of pt-diskstats was enhanced, and the tool was rewritten in Perl. pt-collect was removed and its functionality was put into a new, enhanced pt-stalk. pt-stalk is now designed to be a stable, long-running daemon on a variety of common platforms. It is worth re-reading the documentation for each of these tools.

The 20 bug fixes cover a wide range of problems. The most important are fixes to pt-table-checksum, pt-iostats, and pt-kill. Apart from pt-diskstats, pt-stalk, and pt-collect (which was removed), no other tools were changed in backwards-incompatible ways, so it is worth reviewing the full changelog for this release and upgrading if you use any tools which had bug fixes.

Thank you to the many people who reported bugs and submitted patches.

Download the latest release of Percona Toolkit 2.0 from http://www.percona.com/software/percona-toolkit/ or the Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).

Changelog

• Completely redesigned pt-diskstats
• Completely redesigned pt-stalk
• Removed pt-collect and put its functionality in pt-stalk
• Fixed bug 871438: Bash tools are insecure
• Fixed bug 897758: Failed to prepare TableSyncChunk plugin: Use of uninitialized value $args{“chunk_range”} in lc at pt-table-sync line 3055
• Fixed bug 919819: pt-kill --execute-command creates zombies
• Fixed bug 925778: pt-ioprofile doesn’t run without a file
• Fixed bug 925477: pt-ioprofile docs refer to pt-iostats
• Fixed bug 857091: pt-sift downloads http://percona.com/get/pt-pmp, which does not work
• Fixed bug 857104: pt-sift tries to invoke mext, should be pt-mext
• Fixed bug 872699: pt-diskstats: rd_avkb & wr_avkb derived incorrectly
• Fixed bug 897029: pt-diskstats computes wrong values for md0
• Fixed bug 882918: pt-stalk spams warning if oprofile isn’t installed
• Fixed bug 884504: pt-stalk doesn’t check pt-collect
• Fixed bug 897483: pt-online-schema-change “uninitialized value” due to update-foreign-keys-method
• Fixed bug 925007: pt-online-schema-change Use of uninitialized value $tables{“old_table”} in concatenation (.) or string at line 4330
• Fixed bug 915598: pt-config-diff ignores --ask-pass option
• Fixed bug 919352: pt-table-checksum changes binlog_format even if already set to statement
• Fixed bug 921700: pt-table-checksum doesn’t add --where to chunk size test on replicas
• Fixed bug 921802: pt-table-checksum does not recognize --recursion-method=processlist
• Fixed bug 925855: pt-table-checksum index check is case-sensitive
• Fixed bug 821709: pt-show-grants --revoke and --separate don’t work together
• Fixed bug 918247: Some tools use VALUE instead of VALUES

4.5.4 v2.0.2 released 2012-01-05

Percona Toolkit 2.0.2 fixes one critical bug: pt-table-sync --replicate did not work with character values, causing an “Unknown column” error. If using Percona Toolkit 2.0.1, you should upgrade to 2.0.2.

Download the latest release of Percona Toolkit 2.0 from http://www.percona.com/software/percona-toolkit/ or the Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).

Changelog

• Fixed bug 911996: pt-table-sync --replicate causes “Unknown column” error

4.5.5 v2.0.1 released 2011-12-30

The Percona Toolkit development team is proud to announce a new major version: 2.0. Beginning with Percona Toolkit 2.0, we are overhauling, redesigning, and improving the major tools. 2.0 tools are therefore not backwards compatible with 1.0 tools, which we still support but will not continue to develop.

New in Percona Toolkit 2.0.1 is a completely redesigned pt-table-checksum. The original pt-table-checksum 1.0 was rather complex, but it worked well for many years. By contrast, the new pt-table-checksum 2.0 is much simpler but also much more efficient and reliable. We spent months rethinking, redesigning, and testing every aspect of the tool. The three most significant changes: pt-table-checksum 2.0 does only --replicate, it has only one chunking algorithm, and its memory usage is stable even with hundreds of thousands of tables and trillions of rows. The tool is now dedicated to verifying MySQL replication integrity, nothing else, which it does extremely well.

In Percona Toolkit 2.0.1 we also fixed various small bugs and forked ioprofile and align (as pt-ioprofile and pt-align) from Aspersa.

If you still need functionalities in the original pt-table-checksum, the latest Percona Toolkit 1.0 release remains available for download. Otherwise, all new development in Percona Toolkit will happen in 2.0.

Download the latest release of Percona Toolkit 2.0 from http://www.percona.com/software/percona-toolkit/ or the Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).

Changelog

• Completely redesigned pt-table-checksum
• Fixed bug 856065: pt-trend does not work
• Fixed bug 887688: Prepared statements crash pt-query-digest
• Fixed bug 888286: align not part of percona-toolkit
• Fixed bug 897961: ptc 2.0 replicate-check error does not include hostname
• Fixed bug 898318: ptc 2.0 --resume with --tables does not always work
• Fixed bug 903513: MKDEBUG should be PTDEBUG
• Fixed bug 908256: Percona Toolkit should include pt-ioprofile
• Fixed bug 821717: pt-tcp-model --type=requests crashes
• Fixed bug 844038: pt-online-schema-change documentation example w/drop-tmp-table does not work
• Fixed bug 864205: Remove the query to reset @crc from pt-table-checksum
• Fixed bug 898663: Typo in pt-log-player documentation

4.5.6 v1.0.1 released 2011-09-01

Percona Toolkit 1.0.1 has been released. In July, Baron announced planned changes to Maatkit and Aspersa development;[1] Percona Toolkit is the result. In brief, Percona Toolkit is the combined fork of Maatkit and Aspersa, so although the toolkit is new, the programs are not. That means Percona Toolkit 1.0.1 is mature, stable, and production-ready. In fact, it’s even a little more stable because we fixed a few bugs in this release.

Percona Toolkit packages can be downloaded from http://www.percona.com/downloads/percona-toolkit/ or the Percona Software Repositories (http://www.percona.com/docs/wiki/repositories:start).

Although Maatkit and Aspersa development use Google Code, Percona Toolkit uses Launchpad: https://launchpad.net/percona-toolkit

[1] http://www.xaprb.com/blog/2011/07/06/planned-change-in-maatkit-aspersa-development/

Changelog

• Fixed bug 819421: MasterSlave::is_replication_thread() doesn’t match all
• Fixed bug 821673: pt-table-checksum doesn’t include --where in min max queries
• Fixed bug 821688: pt-table-checksum SELECT MIN MAX for char chunking is wrong
• Fixed bug 838211: pt-collect: line 24: [: : integer expression expected
• Fixed bug 838248: pt-collect creates a “5.1” file

4.5.7 v0.9.5 released 2011-08-04

Percona Toolkit 0.9.5 represents the completed transition from Maatkit and Aspersa. There are no bug fixes or new features, but some features have been removed (like --save-results from pt-query-digest). This release is the starting point for the 1.0 series where new development will happen, and no more changes will be made to the 0.9 series.

Changelog

• Forked, combined, and rebranded Maatkit and Aspersa as Percona Toolkit.
  • 320. Percona Toolkit Documentation, Release 2.1.1 pt-archiver command line option, 10 –[no]replicate-check –[no]check-charset pt-table-checksum command line option, 223 pt-archiver command line option, 11 –[no]report –[no]check-columns pt-config-diff command line option, 26 pt-archiver command line option, 11 pt-index-usage command line option, 85 –[no]check-master pt-query-digest command line option, 161 pt-table-sync command line option, 234 –[no]results –[no]check-privileges pt-log-player command line option, 111 pt-table-sync command line option, 234 –[no]safe-auto-increment –[no]check-relay-log pt-archiver command line option, 16 pt-slave-restart command line option, 196 –[no]show-create-table –[no]check-replication-filters pt-query-advisor command line option, 144 pt-online-schema-change command line option, 129 –[no]sql pt-table-checksum command line option, 218 pt-duplicate-key-checker command line option, 49 –[no]check-slave –[no]strip-comments pt-table-sync command line option, 234 pt-kill command line option, 99 –[no]check-triggers –[no]summary pt-table-sync command line option, 235 pt-duplicate-key-checker command line option, 49 –[no]clear-warnings –[no]swap-tables pt-upgrade command line option, 263 pt-online-schema-change command line option, 133 –[no]clustered –[no]timestamp pt-duplicate-key-checker command line option, 47 pt-show-grants command line option, 177 –[no]collapse –[no]transaction pt-deadlock-logger command line option, 32 pt-table-sync command line option, 241 –[no]continue –[no]unique-checks pt-slave-delay command line option, 184 pt-table-sync command line option, 241 –[no]continue-on-error –[no]warnings pt-query-advisor command line option, 142 pt-log-player command line option, 112 pt-query-digest command line option, 154 –[no]zero-admin pt-table-usage command line option, 248 pt-query-digest command line option, 172 –[no]create-replicate-table –[no]zero-bool pt-table-checksum command line option, 219 pt-query-digest command line option, 172 –[no]create-views –[no]zero-chunk pt-index-usage command line option, 84 pt-table-sync command line option, 242 –[no]drop-old-table pt-online-schema-change command line option, 130 P –[no]empty-replicate-table pt-archiver command line option pt-table-checksum command line option, 220 –analyze, 9 –[no]for-explain –ascend-first, 9 pt-query-digest command line option, 157 –ask-pass, 10 –[no]foreign-key-checks –buffer, 10 pt-table-sync command line option, 237 –bulk-delete, 10 –[no]header –bulk-insert, 10 pt-show-grants command line option, 176 –charset, 10 –[no]hex-blob –check-interval, 11 pt-table-sync command line option, 237 –check-slave-lag, 11 –[no]ignore-self –columns, 11 pt-kill command line option, 101 –commit-each, 11 –[no]index-hint –config, 11 pt-table-sync command line option, 238 –delayed-insert, 12 –[no]insert-heartbeat-row –dest, 12 pt-heartbeat command line option, 77 –dry-run, 12 –[no]quote –file, 12 pt-find command line option, 57 –for-update, 12 316 Index
  • 321. Percona Toolkit Documentation, Release 2.1.1 –header, 13 –pid, 26 –help, 13 –port, 26 –high-priority-select, 13 –report-width, 27 –host, 13 –set-vars, 27 –ignore, 13 –socket, 27 –limit, 13 –user, 27 –local, 13 –version, 27 –low-priority-delete, 13 –[no]report, 26 –low-priority-insert, 13 pt-deadlock-logger command line option –max-lag, 13 –ask-pass, 32 –no-ascend, 14 –charset, 32 –no-delete, 14 –clear-deadlocks, 32 –optimize, 14 –columns, 33 –password, 14 –config, 33 –pid, 14 –create-dest-table, 33 –plugin, 14 –daemonize, 33 –port, 15 –defaults-file, 33 –primary-key-only, 15 –dest, 33 –progress, 15 –help, 34 –purge, 15 –host, 34 –quick-delete, 15 –interval, 34 –quiet, 15 –log, 34 –replace, 15 –numeric-ip, 34 –retries, 15 –password, 34 –run-time, 16 –pid, 34 –sentinel, 16 –port, 34 –set-vars, 16 –print, 34 –share-lock, 16 –run-time, 34 –skip-foreign-key-checks, 16 –set-vars, 34 –sleep, 16 –socket, 34 –sleep-coef, 16 –tab, 35 –socket, 17 –user, 35 –source, 17 –version, 35 –statistics, 17 –[no]collapse, 32 –stop, 18 pt-diskstats command line option –txn-size, 18 –columns-regex, 43 –user, 18 –config, 43 –version, 18 –devices-regex, 43 –where, 18 –group-by, 43 –why-quit, 19 –headers, 43 –[no]bulk-delete-limit, 10 –help, 44 –[no]check-charset, 11 –interval, 44 –[no]check-columns, 11 –iterations, 44 –[no]safe-auto-increment, 16 –sample-time, 44 pt-config-diff command line option –save-samples, 44 –ask-pass, 26 –show-inactive, 44 –charset, 26 –show-timestamps, 44 –config, 26 –version, 44 –daemonize, 26 pt-duplicate-key-checker command line option –defaults-file, 26 –all-structs, 47 –help, 26 –ask-pass, 47 –host, 26 –charset, 47 –ignore-variables, 26 –config, 47 –password, 26 –databases, 48 Index 317
  • 322. Percona Toolkit Documentation, Release 2.1.1 –defaults-file, 48 –engine, 59 –engines, 48 –exec, 61 –help, 48 –exec-dsn, 61 –host, 48 –exec-plus, 61 –ignore-databases, 48 –function, 60 –ignore-engines, 48 –help, 56 –ignore-order, 48 –host, 57 –ignore-tables, 48 –indexsize, 60 –key-types, 48 –kmin, 60 –password, 48 –ktime, 60 –pid, 48 –mmin, 60 –port, 48 –mtime, 60 –set-vars, 49 –or, 57 –socket, 49 –password, 57 –tables, 49 –pid, 57 –user, 49 –port, 57 –verbose, 49 –print, 62 –version, 49 –printf, 62 –[no]clustered, 47 –procedure, 60 –[no]sql, 49 –rowformat, 60 –[no]summary, 49 –rows, 60 pt-fifo-split command line option –server-id, 60 –config, 52 –set-vars, 57 –fifo, 52 –socket, 57 –force, 53 –tablesize, 60 –help, 53 –tbllike, 61 –lines, 53 –tblregex, 61 –offset, 53 –tblversion, 61 –pid, 53 –trigger, 61 –statistics, 53 –trigger-table, 61 –version, 53 –user, 57 pt-find command line option –version, 57 –ask-pass, 56 –view, 61 –autoinc, 58 –[no]quote, 57 –avgrowlen, 58 pt-fingerprint command line option –case-insensitive, 56 –config, 66 –charset, 56 –help, 66 –checksum, 58 –match-embedded-numbers, 66 –cmin, 58 –match-md5-checksums, 66 –collation, 58 –query, 66 –column-name, 58 –version, 66 –column-type, 58 pt-fk-error-logger command line option –comment, 58 –ask-pass, 69 –config, 56 –charset, 69 –connection-id, 58 –config, 69 –createopts, 59 –daemonize, 69 –ctime, 59 –defaults-file, 69 –datafree, 59 –dest, 69 –datasize, 59 –help, 69 –day-start, 56 –host, 70 –dblike, 59 –interval, 70 –dbregex, 59 –log, 70 –defaults-file, 56 –password, 70 –empty, 59 –pid, 70 318 Index
  • 323. Percona Toolkit Documentation, Release 2.1.1 –port, 70 –help, 85 –print, 70 –host, 85 –run-time, 70 –ignore-databases, 85 –set-vars, 70 –ignore-databases-regex, 85 –socket, 70 –ignore-tables, 85 –user, 70 –ignore-tables-regex, 85 –version, 70 –password, 85 pt-heartbeat command line option –port, 85 –ask-pass, 75 –progress, 85 –charset, 75 –quiet, 85 –check, 75 –report-format, 86 –config, 75 –save-results-database, 86 –create-table, 75 –set-vars, 88 –daemonize, 76 –socket, 88 –database, 76 –tables, 89 –dbi-driver, 76 –tables-regex, 89 –defaults-file, 76 –user, 89 –file, 76 –version, 89 –frames, 76 –[no]create-views, 84 –help, 77 –[no]report, 85 –host, 77 pt-ioprofile command line option –interval, 77 –aggregate, 92 –log, 77 –cell, 92 –master-server-id, 77 –group-by, 92 –monitor, 77 –help, 93 –password, 78 –profile-pid, 93 –pid, 78 –profile-process, 93 –port, 78 –run-time, 93 –print-master-server-id, 78 –save-samples, 93 –recurse, 78 –version, 93 –recursion-method, 78 pt-kill command line option –replace, 78 –any-busy-time, 103 –run-time, 78 –ask-pass, 97 –sentinel, 79 –busy-time, 100 –set-vars, 79 –charset, 97 –skew, 79 –config, 97 –socket, 79 –daemonize, 97 –stop, 79 –defaults-file, 97 –table, 79 –each-busy-time, 103 –update, 79 –execute-command, 103 –user, 79 –filter, 97 –version, 80 –group-by, 98 –[no]insert-heartbeat-row, 77 –help, 98 pt-index-usage command line option –host, 98 –ask-pass, 84 –idle-time, 100 –charset, 84 –ignore-command, 100 –config, 84 –ignore-db, 100 –create-save-results-database, 84 –ignore-host, 100 –database, 84 –ignore-info, 101 –databases, 84 –ignore-state, 101 –databases-regex, 84 –ignore-user, 101 –defaults-file, 84 –interval, 98 –drop, 84 –kill, 103 –empty-save-results-tables, 85 –kill-query, 104 Index 319
  • 324. Percona Toolkit Documentation, Release 2.1.1 –log, 98 –type, 111 –match-all, 101 –user, 112 –match-command, 101 –verbose, 112 –match-db, 101 –version, 112 –match-host, 102 –[no]results, 111 –match-info, 102 –[no]warnings, 112 –match-state, 102 pt-mysql-summary command line option –match-user, 102 –config, 124 –password, 98 –databases, 124 –pid, 98 –help, 124 –port, 99 –read-samples, 124 –print, 104 –save-samples, 124 –query-count, 103 –sleep, 124 –replication-threads, 102 –version, 124 –run-time, 99 pt-online-schema-change command line option –sentinel, 99 –alter, 127 –set-vars, 99 –alter-foreign-keys-method, 127 –socket, 99 –ask-pass, 128 –stop, 99 –charset, 129 –test-matching, 102 –check-interval, 129 –user, 99 –check-slave-lag, 129 –verbose, 103 –chunk-index, 129 –version, 99 –chunk-size, 129 –victims, 99 –chunk-size-limit, 129 –wait-after-kill, 100 –chunk-time, 130 –wait-before-kill, 100 –config, 130 –[no]ignore-self, 101 –critical-load, 130 –[no]strip-comments, 99 –defaults-file, 130 pt-log-player command line option –dry-run, 130 –ask-pass, 108 –execute, 130 –base-dir, 108 –help, 131 –base-file-name, 108 –host, 131 –charset, 108 –lock-wait-timeout, 131 –config, 108 –max-lag, 131 –defaults-file, 108 –max-load, 131 –dry-run, 108 –password, 131 –filter, 108 –pid, 131 –help, 110 –port, 132 –host, 110 –print, 132 –iterations, 110 –progress, 132 –max-sessions, 110 –quiet, 132 –only-select, 110 –recurse, 132 –password, 110 –recursion-method, 132 –pid, 110 –retries, 133 –play, 110 –set-vars, 133 –port, 110 –socket, 133 –print, 110 –user, 133 –quiet, 110 –version, 133 –session-files, 111 –[no]check-replication-filters, 129 –set-vars, 111 –[no]drop-old-table, 130 –socket, 111 –[no]swap-tables, 133 –split, 111 pt-query-advisor command line option –split-random, 111 –ask-pass, 142 –threads, 111 –charset, 142 320 Index
  • 325. Percona Toolkit Documentation, Release 2.1.1 –config, 142 –mirror, 159 –daemonize, 142 –order-by, 159 –database, 142 –outliers, 160 –defaults-file, 142 –password, 160 –group-by, 143 –pid, 160 –help, 143 –pipeline-profile, 160 –host, 143 –port, 160 –ignore-rules, 143 –print, 160 –password, 143 –print-iterations, 160 –pid, 143 –processlist, 161 –port, 143 –progress, 161 –print-all, 143 –read-timeout, 161 –query, 143 –report-all, 161 –report-format, 143 –report-format, 161 –review, 144 –report-histogram, 162 –sample, 144 –review, 162 –set-vars, 144 –review-history, 163 –socket, 144 –run-time, 165 –type, 144 –run-time-mode, 165 –user, 144 –sample, 166 –verbose, 144 –select, 166 –version, 144 –set-vars, 166 –where, 144 –shorten, 167 –[no]continue-on-error, 142 –show-all, 167 –[no]show-create-table, 144 –since, 167 pt-query-digest command line option –socket, 167 –apdex-threshold, 153 –statistics, 168 –ask-pass, 153 –table-access, 168 –attribute-aliases, 153 –tcpdump-errors, 168 –attribute-value-limit, 153 –timeline, 168 –aux-dsn, 153 –type, 169 –charset, 154 –until, 171 –check-attributes-limit, 154 –user, 171 –config, 154 –variations, 171 –create-review-history-table, 154 –version, 171 –create-review-table, 154 –watch-server, 171 –daemonize, 154 –[no]continue-on-error, 154 –defaults-file, 154 –[no]for-explain, 157 –embedded-attributes, 154 –[no]report, 161 –execute, 155 –[no]zero-admin, 172 –execute-throttle, 155 –[no]zero-bool, 172 –expected-range, 155 pt-show-grants command line option –explain, 156 –ask-pass, 175 –filter, 156 –charset, 176 –fingerprints, 157 –config, 176 –group-by, 158 –database, 176 –help, 158 –defaults-file, 176 –host, 158 –drop, 176 –ignore-attributes, 158 –flush, 176 –inherit-attributes, 158 –help, 176 –interval, 159 –host, 176 –iterations, 159 –ignore, 176 –limit, 159 –only, 176 –log, 159 –password, 176 Index 321
  • 326. Percona Toolkit Documentation, Release 2.1.1 –pid, 177 –config, 196 –port, 177 –daemonize, 196 –revoke, 177 –database, 196 –separate, 177 –defaults-file, 196 –set-vars, 177 –error-length, 196 –socket, 177 –error-numbers, 196 –user, 177 –error-text, 196 –version, 177 –help, 196 –[no]header, 176 –host, 196 –[no]timestamp, 177 –log, 197 pt-slave-delay command line option –max-sleep, 197 –ask-pass, 184 –min-sleep, 197 –charset, 184 –monitor, 197 –config, 184 –password, 197 –daemonize, 184 –pid, 197 –defaults-file, 184 –port, 197 –delay, 184 –quiet, 197 –help, 184 –recurse, 197 –host, 184 –recursion-method, 197 –interval, 185 –run-time, 198 –log, 185 –sentinel, 198 –password, 185 –set-vars, 198 –pid, 185 –skip-count, 198 –port, 185 –sleep, 198 –quiet, 185 –socket, 198 –run-time, 185 –stop, 198 –set-vars, 185 –until-master, 199 –socket, 185 –until-relay, 199 –use-master, 185 –user, 199 –user, 185 –verbose, 199 –version, 186 –version, 199 –[no]continue, 184 –[no]check-relay-log, 196 pt-slave-find command line option pt-stalk command line option –ask-pass, 189 –collect, 203 –charset, 189 –collect-gdb, 203 –config, 189 –collect-oprofile, 204 –database, 190 –collect-strace, 204 –defaults-file, 190 –collect-tcpdump, 204 –help, 190 –config, 204 –host, 190 –cycles, 204 –password, 190 –daemonize, 204 –pid, 190 –dest, 204 –port, 190 –disk-bytes-free, 204 –recurse, 190 –disk-pct-free, 204 –recursion-method, 190 –function, 204 –report-format, 190 –help, 205 –set-vars, 191 –interval, 205 –socket, 191 –iterations, 205 –user, 191 –log, 205 –version, 191 –match, 206 pt-slave-restart command line option –notify-by-email, 206 –always, 195 –pid, 206 –ask-pass, 195 –prefix, 206 –charset, 195 –retention-time, 206 322 Index
  • 327. Percona Toolkit Documentation, Release 2.1.1 –run-time, 206 –resume, 224 –sleep, 206 –retries, 224 –stalk, 206 –separator, 224 –threshold, 206 –set-vars, 224 –variable, 206 –socket, 224 –version, 206 –tables, 224 pt-summary command line option –tables-regex, 224 –config, 212 –trim, 224 –help, 212 –user, 225 –read-samples, 212 –version, 225 –save-samples, 212 –where, 225 –sleep, 212 –[no]check-replication-filters, 218 –summarize-mounts, 212 –[no]create-replicate-table, 219 –summarize-network, 212 –[no]empty-replicate-table, 220 –summarize-processes, 212 –[no]replicate-check, 223 –version, 212 pt-table-sync command line option pt-table-checksum command line option –algorithms, 233 –ask-pass, 218 –ask-pass, 233 –check-interval, 218 –bidirectional, 234 –check-slave-lag, 218 –buffer-in-mysql, 234 –chunk-index, 218 –charset, 234 –chunk-size, 218 –chunk-column, 235 –chunk-size-limit, 219 –chunk-index, 235 –chunk-time, 219 –chunk-size, 235 –columns, 219 –columns, 235 –config, 219 –config, 235 –databases, 219 –conflict-column, 235 –databases-regex, 219 –conflict-comparison, 235 –defaults-file, 219 –conflict-error, 236 –engines, 220 –conflict-threshold, 236 –explain, 220 –conflict-value, 236 –float-precision, 220 –databases, 236 –function, 220 –defaults-file, 236 –help, 220 –dry-run, 237 –host, 220 –engines, 237 –ignore-columns, 220 –execute, 237 –ignore-databases, 221 –explain-hosts, 237 –ignore-databases-regex, 221 –float-precision, 237 –ignore-engines, 221 –function, 237 –ignore-tables, 221 –help, 237 –ignore-tables-regex, 221 –host, 238 –lock-wait-timeout, 221 –ignore-columns, 238 –max-lag, 221 –ignore-databases, 238 –max-load, 221 –ignore-engines, 238 –password, 222 –ignore-tables, 238 –pid, 222 –lock, 238 –port, 222 –lock-and-rename, 239 –progress, 222 –password, 239 –quiet, 222 –pid, 239 –recurse, 222 –port, 239 –recursion-method, 222 –print, 239 –replicate, 223 –recursion-method, 239 –replicate-check-only, 224 –replace, 240 –replicate-database, 224 –replicate, 240 Index 323
  • 328. Percona Toolkit Documentation, Release 2.1.1 –set-vars, 240 –quantile, 255 –socket, 240 –run-time, 255 –sync-to-master, 240 –start-end, 255 –tables, 240 –type, 256 –timeout-ok, 240 –version, 256 –trim, 241 –watch-server, 256 –user, 241 pt-trend command line option –verbose, 241 –config, 259 –version, 241 –help, 259 –wait, 241 –pid, 259 –where, 242 –progress, 259 –[no]bin-log, 234 –quiet, 259 –[no]buffer-to-client, 234 –version, 259 –[no]check-master, 234 pt-upgrade command line option –[no]check-privileges, 234 –ask-pass, 263 –[no]check-slave, 234 –base-dir, 263 –[no]check-triggers, 235 –charset, 263 –[no]foreign-key-checks, 237 –clear-warnings-table, 263 –[no]hex-blob, 237 –compare, 263 –[no]index-hint, 238 –compare-results-method, 264 –[no]transaction, 241 –config, 264 –[no]unique-checks, 241 –continue-on-error, 264 –[no]zero-chunk, 242 –convert-to-select, 264 pt-table-usage command line option –daemonize, 264 –ask-pass, 248 –explain-hosts, 264 –charset, 248 –filter, 265 –config, 248 –fingerprints, 265 –constant-data-value, 248 –float-precision, 265 –create-table-definitions, 248 –help, 265 –daemonize, 249 –host, 265 –database, 249 –iterations, 266 –defaults-file, 249 –limit, 266 –explain-extended, 249 –log, 266 –filter, 249 –max-different-rows, 266 –help, 249 –order-by, 266 –host, 249 –password, 266 –id-attribute, 249 –pid, 266 –log, 249 –port, 266 –password, 249 –query, 266 –pid, 249 –reports, 266 –port, 249 –run-time, 266 –progress, 250 –set-vars, 267 –query, 250 –shorten, 267 –read-timeout, 250 –socket, 267 –run-time, 250 –temp-database, 267 –set-vars, 250 –temp-table, 267 –socket, 250 –user, 267 –user, 250 –version, 267 –version, 250 –zero-query-times, 267 –[no]continue-on-error, 248 –[no]clear-warnings, 263 pt-tcp-model command line option pt-variable-advisor command line option –config, 255 –ask-pass, 277 –help, 255 –charset, 277 –progress, 255 –config, 278 324 Index
  • 329. Percona Toolkit Documentation, Release 2.1.1 –daemonize, 278 –defaults-file, 278 –help, 278 –host, 278 –ignore-rules, 278 –password, 278 –pid, 278 –port, 278 –set-vars, 278 –socket, 278 –source-of-variables, 278 –user, 278 –verbose, 279 –version, 279 pt-visual-explain command line option –ask-pass, 288 –charset, 288 –clustered-pk, 288 –config, 288 –connect, 288 –database, 288 –defaults-file, 288 –format, 288 –help, 288 –host, 288 –password, 288 –pid, 289 –port, 289 –set-vars, 289 –socket, 289 –user, 289 –version, 289 Index 325
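The option names collected in the index can also be checked directly from the shell, since every tool accepts --help and prints its usage together with the options it recognizes. A minimal sketch, assuming pt-archiver is installed and on the PATH; the grep filter and the choice of --retries are only illustrative:

   # Print pt-archiver's usage message and the list of options it accepts.
   pt-archiver --help

   # Narrow the --help output to a single option from the index (here --retries).
   pt-archiver --help | grep -- '--retries'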